Chapter 11. Interval Estimation and Hypothesis Testing

Chapter Purpose

Chapter 10 estimated a simple regression model relating milk prices to package volume. The slope coefficient was 417.0, but any sample estimate contains uncertainty. This chapter introduces standard errors, confidence intervals, and hypothesis tests using the actual Milk Data results.

Applied Question

In Chapter 10 we estimated:

\[ \widehat{Price}=467.9+417.0\times Volume1000 \]

Is this relationship statistically significant?

Estimating the Model

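import statsmodels.api as sm

# milk_data is assumed to be already loaded as a pandas DataFrame
X = sm.add_constant(milk_data["Volume1000"])  # add the intercept column
y = milk_data["Price"]
model = sm.OLS(y, X).fit()  # ordinary least squares fit
print(model.summary())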
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                  Price   R-squared:                       0.274
Model:                            OLS   Adj. R-squared:                  0.271
Method:                 Least Squares   F-statistic:                     96.56
Date:                Thu, 11 Jun 2026   Prob (F-statistic):           1.53e-19
Time:                        06:53:14   Log-Likelihood:                -2081.5
No. Observations:                 258   AIC:                             4167.
Df Residuals:                     256   BIC:                             4174.
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const        467.9297     74.683      6.266      0.000     320.858     615.002
Volume1000   417.0261     42.439      9.826      0.000     333.451     500.601
==============================================================================
Omnibus:                      221.760   Durbin-Watson:                   1.652
Prob(Omnibus):                  0.000   Jarque-Bera (JB):             2757.298
Skew:                           3.610   Prob(JB):                         0.00
Kurtosis:                      17.295   Cond. No.                         3.30
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

Point Estimate

The point estimate for the volume coefficient is:

\[ \hat{\beta}_1 = 417.0 \]

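This is our best estimate of the average relationship between package volume and price: each additional 1,000 ml of volume is associated with a price that is about 417 units higher on average.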

Confidence Interval

The 95% confidence interval for the volume coefficient is:

\[ 333.45 \leq \beta_1 \leq 500.60 \]

Interpretation:

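With 95% confidence, the population slope lies between approximately 333 and 501: a 1,000 ml increase in volume is associated with a price increase somewhere in that range. Intervals constructed this way cover the true coefficient in about 95% of repeated samples.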

model.conf_int()
                     0           1
const       320.857636  615.001780
Volume1000  333.451316  500.600835
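
The interval reported by `conf_int()` can be reproduced by hand as estimate ± critical value × standard error. The following is a minimal sketch, assuming SciPy is installed; it hard-codes the coefficient, standard error, and residual degrees of freedom from the summary table rather than reading them from the fitted model, and the variable names are illustrative.

```python
from scipy import stats

# Values copied from the regression summary (Volume1000 row)
beta_hat = 417.0261   # point estimate
std_err = 42.439379   # standard error
df_resid = 256        # residual degrees of freedom (258 observations - 2 parameters)

# Two-sided 95% critical value from the t distribution
t_crit = stats.t.ppf(0.975, df_resid)

# Interval = estimate +/- critical value * standard error
margin = t_crit * std_err
ci_lower = beta_hat - margin
ci_upper = beta_hat + margin

print(f"t critical value: {t_crit:.4f}")
print(f"95% CI: [{ci_lower:.2f}, {ci_upper:.2f}]")
```

Up to rounding of the inputs, the result should match the Volume1000 row of `model.conf_int()`.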

Standard Errors

A standard error measures the precision of an estimate: it is the estimated standard deviation of the coefficient across repeated samples. Smaller standard errors imply greater precision. Confidence intervals and hypothesis tests are both built from standard errors.

model.bse
const         74.683401
Volume1000    42.439379
dtype: float64
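
One way to see how standard errors feed into hypothesis tests is the `t` column of the summary table, where each entry is the coefficient divided by its standard error. The sketch below redoes that arithmetic with values copied from the output above; the dictionary names are illustrative and not part of statsmodels.

```python
# Values copied from the regression summary
coefs = {"const": 467.9297, "Volume1000": 417.0261}
std_errs = {"const": 74.683401, "Volume1000": 42.439379}

# t-statistic for H0: coefficient = 0 is estimate / standard error
t_stats = {name: coefs[name] / std_errs[name] for name in coefs}

for name, t in t_stats.items():
    print(f"{name}: t = {t:.3f}")
```

The printed values should agree with the `t` column of the summary table to three decimals.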

Hypothesis Testing

The null hypothesis is:

\[ H_0:\beta_1=0 \]

The alternative hypothesis is:

\[ H_1:\beta_1\neq 0 \]

The p-value for the volume coefficient is:

\[ p < 0.001 \]

model.pvalues
const         1.564357e-09
Volume1000    1.525668e-19
dtype: float64

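Because the p-value is far below 0.05, we reject the null hypothesis at the 5% significance level. A slope this far from zero would be very unlikely if the true coefficient were zero.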
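
The p-value can be traced back to the t-statistic. The following sketch, assuming SciPy is installed, computes the two-sided p-value for the volume coefficient from the estimate, standard error, and residual degrees of freedom reported in the summary table. The 0.05 threshold is the conventional significance level used in this chapter, and the variable names are illustrative.

```python
from scipy import stats

# Values copied from the regression summary (Volume1000 row)
beta_hat = 417.0261
std_err = 42.439379
df_resid = 256

# t-statistic for H0: beta_1 = 0
t_stat = (beta_hat - 0.0) / std_err

# Two-sided p-value: probability of |T| at least this large under H0
p_value = 2 * stats.t.sf(abs(t_stat), df_resid)

alpha = 0.05
reject_null = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.3e}, reject H0: {reject_null}")
```

Because the inputs are rounded, the result may differ slightly from `model.pvalues` in the last digits.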

Confidence Intervals and Significance

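The 95% confidence interval does not include zero. This is equivalent to rejecting the null hypothesis at the 5% level in a two-sided test, so the interval and the p-value necessarily agree. The relationship between volume and price is statistically significant.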
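
The link between the interval and the test applies to any hypothesized value, not only zero: a two-sided 5% t-test rejects a value exactly when that value lies outside the 95% confidence interval. The sketch below illustrates this with the interval reported above. The helper `rejects_at_5pct` is hypothetical, and 400 is an arbitrary value chosen only for illustration.

```python
# 95% interval for the Volume1000 coefficient, copied from model.conf_int()
ci_lower, ci_upper = 333.451316, 500.600835

def rejects_at_5pct(hypothesized_value, lower=ci_lower, upper=ci_upper):
    """Return True if the hypothesized slope lies outside the 95% interval,
    which is when a two-sided 5% t-test would reject it."""
    return not (lower <= hypothesized_value <= upper)

print(rejects_at_5pct(0.0))    # zero lies outside the interval
print(rejects_at_5pct(400.0))  # 400 lies inside the interval
```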

Statistical Significance Versus Economic Significance

Statistical significance asks whether the estimated relationship could plausibly be explained by sampling variation alone. Economic significance asks whether the magnitude is large enough to matter.

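The Milk Data coefficient of 417.0 is economically meaningful as well: a 1,000 ml increase in package volume is associated with a price about 417 units higher, which is large relative to the intercept of about 468 reported in the regression output.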
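
One way to judge the magnitude is to compare fitted prices for two package sizes. The sketch below uses the coefficients from the regression output. The 1,000 ml and 2,000 ml packages are hypothetical examples, and it is an assumption that both sizes fall within the range of the data.

```python
# Coefficients copied from the regression summary
intercept = 467.9297
slope = 417.0261

def predicted_price(volume_1000):
    """Fitted price for a package of the given volume (in units of 1,000 ml)."""
    return intercept + slope * volume_1000

price_small = predicted_price(1.0)   # hypothetical 1,000 ml package
price_large = predicted_price(2.0)   # hypothetical 2,000 ml package

difference = price_large - price_small
relative_increase = difference / price_small

print(f"Fitted price at 1,000 ml: {price_small:.1f}")
print(f"Fitted price at 2,000 ml: {price_large:.1f}")
print(f"Difference: {difference:.1f} ({relative_increase:.1%} of the smaller package's fitted price)")
```

Expressing the slope relative to a fitted price level is only one possible yardstick; the appropriate benchmark depends on the economic question being asked.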

Warning

A statistically significant coefficient is not automatically a causal effect.

What We Learned From the Milk Data

  • Estimated coefficient: 417.0
  • 95% confidence interval: [333.45, 500.60]
  • p-value: < 0.001

The evidence supporting a positive relationship between volume and price is very strong.

Common Mistakes

Common Mistake 1

Treating p-values as measures of economic importance.

Common Mistake 2

Ignoring confidence intervals.

Common Mistake 3

Confusing statistical significance with causality.

Key Takeaways

  • Regression estimates contain uncertainty.
  • Confidence intervals provide a range of plausible values.
  • The estimated volume coefficient is 417.0.
  • The 95% confidence interval is [333.45, 500.60].
  • The p-value is below 0.001.
  • Economists should interpret coefficients before focusing on p-values.