15. Comparing two means

Motivating scenario: We often want to quantify the difference between sample means but cannot design an experiment that allows a paired t-test (e.g., how do pink- and white-flowered RILs differ in their pollinator visitation?). We want to estimate the difference between group means, quantify the uncertainty in that estimate, describe the size of the effect, and test the null hypothesis that the two samples come from the same population. It’s time for a two-sample t-test!

Learning goals: By the end of this chapter you should be able to:

  1. Describe the assumptions underlying a two-sample t-test, and know what to do when our data don’t meet them.
  2. Visualize two samples and their uncertainty using ggplot2.
  3. Calculate and interpret a t-value from a two-sample comparison.
  4. Interpret Cohen’s d as a measure of effect size in a two-sample comparison.
  5. Use the \(t\) distribution to calculate a 95% confidence interval for the difference in sample means.
  6. Test the null hypothesis that two samples come from the same population with
    • Standard calculations and the pt() function.
    • The t.test() function.
    • The lm() pipeline.

One of the most common questions is “What’s the difference?”

Remember, correlation is not causation! We usually want to know the cause (e.g., “Will a vaccine help or hurt?”), but we’re often stuck with associations (e.g., “What’s the difference between vaccinated and unvaccinated people?”). Causal claims require careful experiments or other formal approaches to causal inference.

Comparing two means with the t-distribution

We can use the t-distribution to evaluate the null hypothesis that two samples come from the same statistical population.

  • Paired comparisons: When data are naturally paired, we can use a one-sample t-test on the differences between members of each pair. This design has more statistical power because pairs “soak up” variability unrelated to the treatment, decreasing the noise in our estimate of the between-group difference.

  • Unpaired comparisons: In many cases, natural pairing is impractical or impossible. For example, there is no natural “pairing” in comparisons between pink and white parviflora RILs. In these situations, we test whether the means of two independent samples are different.
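To make the contrast between the two designs concrete, here is a minimal sketch using simulated data. The numbers, and the names `baseline`, `control`, and `treated`, are made up for illustration and have nothing to do with the Clarkia data. It shows that a paired t-test is the same calculation as a one-sample t-test on the within-pair differences, and what happens when the same values are analyzed as if they were unpaired.

```r
# Simulated (made-up) data: 12 hypothetical pairs. These values are
# illustrative assumptions, not measurements from the Clarkia study.
set.seed(1)
n_pairs  <- 12
baseline <- rnorm(n_pairs, mean = 10, sd = 3)            # variation shared within a pair
control  <- baseline + rnorm(n_pairs, mean = 0, sd = 1)
treated  <- baseline + rnorm(n_pairs, mean = 1, sd = 1)  # true shift of 1

# A paired t-test ...
paired_test <- t.test(treated, control, paired = TRUE)

# ... is a one-sample t-test on the within-pair differences.
diff_test <- t.test(treated - control, mu = 0)

# Ignoring the pairing treats the two columns as independent samples,
# so the pair-level variation stays in the noise term.
unpaired_test <- t.test(treated, control, var.equal = TRUE)

# Compare the three t statistics side by side
c(paired              = unname(paired_test$statistic),
  one_sample_on_diffs = unname(diff_test$statistic),
  unpaired            = unname(unpaired_test$statistic))
```

The first two t statistics are identical by construction. When pairs share a lot of variation, as in this simulation, the unpaired analysis will typically have a larger standard error, which is the sense in which pairing increases power.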


Two photos of Clarkia xantiana flowers above two photos of Clarkia parviflora flowers, illustrating the visible differences between the species.
Figure 1: A visual comparison of Clarkia petals. The top two are xantiana, and the bottom two are parviflora. Picture by Chris Winchell posted on CalPhotos.

Our Path

We will return to our Clarkia RILs and compare pollinator visitation to pink- and white-flowered RILs at the Sawmill Road site. We might think that pink flowers attract more pollinator visits than white flowers (our scientific hypothesis). We can recast this idea as a null hypothesis and a (two-tailed) alternative statistical hypothesis:

  • Null hypothesis: The mean number of pollinator visits is the same for pink- and white-flowered parviflora RILs.
  • Alternative hypothesis: The mean number of pollinator visits differs between pink- and white-flowered parviflora RILs.
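As a rough preview of how these hypotheses can be tested, the sketch below runs the three routes named in the learning goals on simulated data. Everything about the data is an assumption made for illustration: the data frame `sim`, its columns `petal_color` and `visits`, the sample sizes, and the means are all invented, and the real RIL data are not used. The sketch also assumes equal variance in the two groups (a pooled-variance test), which is not the default in `t.test()`.

```r
# Simulated (made-up) visitation data; names and values are hypothetical.
set.seed(42)
sim <- data.frame(
  petal_color = rep(c("pink", "white"), times = c(20, 25)),
  visits      = c(rnorm(20, mean = 1.5, sd = 1), rnorm(25, mean = 0.8, sd = 1))
)

pink  <- sim$visits[sim$petal_color == "pink"]
white <- sim$visits[sim$petal_color == "white"]

# 1. Standard calculations: pooled variance, standard error, t, then pt()
n1 <- length(pink)
n2 <- length(white)
df <- n1 + n2 - 2
pooled_var <- ((n1 - 1) * var(pink) + (n2 - 1) * var(white)) / df
se_diff    <- sqrt(pooled_var * (1 / n1 + 1 / n2))
t_manual   <- (mean(pink) - mean(white)) / se_diff
p_manual   <- 2 * pt(abs(t_manual), df = df, lower.tail = FALSE)  # two-tailed

# 2. t.test(), with var.equal = TRUE to match the pooled calculation
#    (without it, t.test() runs Welch's unequal-variance test)
tt <- t.test(visits ~ petal_color, data = sim, var.equal = TRUE)

# 3. lm(): the coefficient for petal_color is the difference in group means
#    (white minus pink, because "pink" is the alphabetical reference level)
fit <- summary(lm(visits ~ petal_color, data = sim))

# The three routes give the same two-tailed p-value
c(manual = p_manual,
  t.test = tt$p.value,
  lm     = fit$coefficients["petal_colorwhite", "Pr(>|t|)"])
```

The t statistic from `lm()` has the opposite sign to the other two because it estimates white minus pink rather than pink minus white. The two-tailed p-value is unaffected.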