3.3 MVUE or BUE
Imagine repeatedly sampling and computing the estimate \(\hat{\theta}\) of the parameter \(\theta\) for each sample. In this thought experiment, \(\hat{\theta}\) is a random variable. We refer to \(E \left[ (\hat{\theta} - \theta)^2 \right]\) as the **mean-squared error** (MSE) of \(\hat{\theta}\). Some people use the square root of the MSE, which we refer to as the root-mean-squared error (RMSE).
In general, we prefer estimators with a smaller MSE to estimators with a larger MSE.
Notice that an estimator can have a larger MSE because (1) it is more variable or (2) it is more biased. To see this, we can decompose the MSE into two components.
\[ \begin{aligned} E \left[ (\hat{\theta} - \theta)^2 \right] &= E \left[ \left( \hat{\theta} - E(\hat{\theta}) \right)^2 \right] + \left( E(\hat{\theta}) - \theta \right)^2\\ \text{MSE}(\hat{\theta}) &= \text{Var}(\hat{\theta}) + \text{Bias}(\hat{\theta})^2 \end{aligned} \] When designing an estimator, we usually follow this process:
- Eliminate biased estimators, if an unbiased estimator exists.
- Among the remaining unbiased estimators, select the one with the smallest variance. (The variance equals the MSE for unbiased estimators.)
This process does not necessarily result in the estimator with the smallest MSE, but it does give us the estimator with the smallest MSE among the unbiased estimators.
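The decomposition MSE = variance + bias² can be verified numerically. The sketch below uses a hypothetical setup (Normal data and a deliberately biased "shrinkage" estimator \(0.9 \cdot \text{avg}(x)\), both chosen here for illustration) and checks that the two sides of the decomposition agree:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: estimate theta = E(X) for X ~ Normal(theta, sigma^2)
# using a deliberately biased "shrinkage" estimator, 0.9 * avg(x).
theta, sigma, n, reps = 5.0, 2.0, 25, 200_000

samples = rng.normal(theta, sigma, size=(reps, n))
theta_hat = 0.9 * samples.mean(axis=1)  # one estimate per simulated sample

mse = np.mean((theta_hat - theta) ** 2)          # E[(theta_hat - theta)^2]
var = np.var(theta_hat)                          # Var(theta_hat)
bias_sq = (np.mean(theta_hat) - theta) ** 2      # Bias(theta_hat)^2

print(mse, var + bias_sq)  # the two quantities match
```

With the empirical mean and variance computed this way, the identity holds exactly (up to floating-point error), not just approximately.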
This seems tricky: how do we know we’ve got the estimator with the smallest MSE among the unbiased estimators? Couldn’t there always be another, better unbiased estimator that we haven’t considered?
It turns out that we have a theoretical lower bound on the variance of an estimator. No unbiased estimator can have a variance below the Cramér-Rao Lower Bound. If an unbiased estimator’s variance equals the Cramér-Rao Lower Bound, then we say that estimator attains the Cramér-Rao Lower Bound.
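As a concrete illustration (a standard textbook case, not derived in this section): for \(X \sim \text{Normal}(\mu, \sigma^2)\) with \(\sigma\) known, the Cramér-Rao Lower Bound for unbiased estimators of \(\mu\) works out to \(\sigma^2 / n\), and the sample average attains it. The simulation below checks this numerically:

```python
import numpy as np

rng = np.random.default_rng(1)

# Standard example: X ~ Normal(mu, sigma^2) with sigma known.
# The Cramer-Rao Lower Bound for unbiased estimators of mu is sigma^2 / n.
mu, sigma, n, reps = 0.0, 3.0, 50, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
crlb = sigma**2 / n                      # theoretical lower bound: 0.18
var_mean = np.var(samples.mean(axis=1))  # simulated Var(avg(x))

print(crlb, var_mean)  # the sample average's variance sits at the bound
```

The simulated variance of the sample average matches the bound up to simulation error, so no unbiased estimator can improve on it here.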
We refer to an estimator that attains the Cramér-Rao Lower Bound as the minimum-variance unbiased estimator (MVUE) or the best unbiased estimator (BUE).
An MVUE is the gold standard. It is possible, though, that a biased alternative estimator has a smaller MSE.
It’s beyond our scope to establish whether particular estimators are the MVUE. However, in general, the sample average is an MVUE of the expected value of a distribution. Equivalently, the sample average is the MVUE of the population average. If you are using \(\hat{\theta} = \text{avg}(x)\) to estimate \(\theta = E(X)\), then \(\hat{\theta}\) is an MVUE.
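The claim above can be illustrated (though not proven) by comparing the sample average against another unbiased estimator of the center. For Normal data, the sample median is also unbiased for \(E(X)\) by symmetry, yet its variance is larger, consistent with \(\text{avg}(x)\) being the MVUE. This is a simulation sketch, with the Normal setup chosen for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Compare two unbiased estimators of theta = E(X) for Normal data:
# the sample average and the sample median.
theta, n, reps = 10.0, 101, 100_000

samples = rng.normal(theta, 1.0, size=(reps, n))
avg = samples.mean(axis=1)
med = np.median(samples, axis=1)

print(np.var(avg), np.var(med))  # the median's variance is noticeably larger
```

Both estimators center on \(\theta\), but the median pays a variance penalty (for large \(n\), roughly a factor of \(\pi/2\) for Normal data), so it has a larger MSE despite being unbiased.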