Chapter 1 Inference

King, Keohane, and Verba (1996, p. 46) define inference as “the process of using the facts we know to learn about facts we do not know.”

Consider the following three targets of inference:

  1. The Average Treatment Effect. Suppose you conduct an experiment in which you assign \(N\) subjects to either treatment or control. For each subject \(n\), you observe either the outcome under the treatment condition \(Y^{T}_n\) or the outcome under control \(Y^{C}_n\). We define the average treatment effect or ATE as \(\displaystyle \frac{1}{N}\sum_{n = 1}^{N}\left( Y^{T}_n - Y^{C}_n \right)\). However, because we cannot place subject \(n\) in both treatment and control, we cannot observe the ATE; we can only estimate it.
  2. Features of a Population, Using a Sample. Suppose we have a random sample of \(N\) observations from a much larger population. Then we can use the sample to estimate features of the population, such as the average of a variable or correlation between two variables. However, because we cannot (or perhaps choose to not) observe the each member of population, we cannot observe the features of the population directly.
  3. Parameters of a Stochastic Model. Suppose the outcome variable \(y\) is a collection of samples from a distribution \(f(\theta)\). Then we cannot observe \(\theta\) directly, but we can use the observed samples \(y\) to estimate \(\theta\). For example, suppose you have a binary outcome variable \(y\) that you model as draws from a Bernoulli distribution. Then \(y_i \sim \text{Bernoulli}(\pi)\). Your inferential target would not be the proportion of ones in the sample, but the value of \(\pi\).

Modeling these three targets of inference identical in some situations and very similar in many. For simplicity, I focus on estimating the parameters of a stochastic model. In this situation, we observe a sample \(y\) from a particular distribution and use the sample to estimate the parameters of the distribution.

We consider two types of estimates:

  1. Point Estimates: using the observed data to calculate a single value or best-guess for the unobservable quantity of interest.
  2. Interval Estimates: using the observed data to calculate a range of values for the unobservable quantity of interest.

For each type of estimate, we consider:

  1. How to develop an estimator.
  2. How to evaluate an estimator.