Assumptions and ANOVA
👉 For ANOVA, both the homogeneity of variances and the normality assumptions concern the errors of the model, so they should be assessed on the residuals.
Below is the precise reasoning, with practical nuances.
1. What ANOVA actually assumes
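The classical ANOVA model, written in its one-way form, is:
$$Y_{ij} = \mu_i + \varepsilon_{ij}$$
where $Y_{ij}$ is the $j$-th observation in group $i$, $\mu_i$ is the mean of group $i$, and $\varepsilon_{ij}$ is the error term, with the assumptions:
Normality: $\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)$
Homoscedasticity: $\operatorname{Var}(\varepsilon_{ij}) = \sigma^2$ for all groups $i$
Independence of the $\varepsilon_{ij}$
So both assumptions apply to the errors $\varepsilon_{ij}$, not to the raw response $Y_{ij}$.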
2. Consequences for diagnostics
✅ Normality
Should be assessed on residuals, not on original data.
Raw data can be non-normal simply because group means differ.
Correct tools:
Q–Q plot of residuals
Histogram of residuals
Shapiro–Wilk test on residuals (with caution)
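To illustrate why the raw response can look non-normal even when the errors are perfectly normal, here is a minimal simulation sketch. The group labels, means, sample size, and seed are all made up for the illustration; only NumPy and `scipy.stats.shapiro` are used.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Three hypothetical groups: normal errors (sd = 1) around very different means.
group_means = {"A": 0.0, "B": 6.0, "C": 12.0}
n_per_group = 50
y = {g: m + rng.normal(0.0, 1.0, n_per_group) for g, m in group_means.items()}

# Pooled raw response: a mixture of three normals, so it is multimodal.
raw = np.concatenate(list(y.values()))

# One-way ANOVA residuals: each observation minus its own group mean.
residuals = np.concatenate([v - v.mean() for v in y.values()])

p_raw = stats.shapiro(raw).pvalue
p_res = stats.shapiro(residuals).pvalue
print(f"Shapiro-Wilk on raw response: p = {p_raw:.3g}")
print(f"Shapiro-Wilk on residuals:    p = {p_res:.3g}")
```

With group means this far apart, the test on the pooled raw response should reject normality decisively, while the residuals only reflect the error distribution.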
✅ Homogeneity of variances
Also concerns residual variance.
Therefore should be checked on residuals.
3. Important nuance: one-way ANOVA
For a simple one-way ANOVA:
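Testing variance homogeneity on raw data by group ≈ testing it on residuals, because each residual is just the observation minus its own group mean, and that shift leaves the within-group spread unchanged.
That’s why Levene / Bartlett tests are usually presented on the original response.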
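The one-way equivalence can be checked numerically. The sketch below uses simulated groups (the means, spreads, sizes, and seed are assumptions for the illustration) and compares `scipy.stats.levene` and `scipy.stats.bartlett` on the raw data and on the residuals.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical one-way layout: three groups with different means and spreads.
groups = [
    rng.normal(10.0, 1.0, 25),
    rng.normal(15.0, 2.0, 25),
    rng.normal(20.0, 1.5, 25),
]

# Residuals of the one-way model: subtract each group's own mean.
resid_groups = [g - g.mean() for g in groups]

# Levene (median-centered, i.e. the Brown-Forsythe variant) and Bartlett
# only look at within-group spread, so a per-group shift changes nothing.
lev_raw = stats.levene(*groups, center="median")
lev_res = stats.levene(*resid_groups, center="median")
bart_raw = stats.bartlett(*groups)
bart_res = stats.bartlett(*resid_groups)

print("Levene   raw vs residuals:", lev_raw.statistic, lev_res.statistic)
print("Bartlett raw vs residuals:", bart_raw.statistic, bart_res.statistic)
```

Both pairs of statistics should agree up to floating-point error, which is the sense in which the two approaches coincide in a one-way design.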
⚠️ This equivalence breaks immediately when you have:
multiple factors,
interactions,
covariates (ANCOVA),
unbalanced designs with more than one factor.
In those cases, the spread of the raw data within a group also reflects the other terms of the model, so testing on raw data is incorrect.
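As a rough illustration of the covariate case, the following sketch simulates ANCOVA-type data in which the error variance is identical in both groups but the covariate is far more spread out in one of them. All numbers (coefficients, spreads, sample size, seed) are hypothetical, and the model is fitted with plain least squares via `numpy.linalg.lstsq`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 100  # observations per group (hypothetical)

# Covariate x has a much wider spread in group 1 than in group 0.
x = np.concatenate([rng.normal(0.0, 0.5, n), rng.normal(0.0, 3.0, n)])
group = np.repeat([0, 1], n)

# Same error variance (sd = 1) in both groups: homoscedasticity holds.
y = 2.0 + 1.5 * group + 2.0 * x + rng.normal(0.0, 1.0, 2 * n)

# Fit y ~ intercept + group + x by ordinary least squares.
X = np.column_stack([np.ones(2 * n), group, x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

p_raw = stats.levene(y[group == 0], y[group == 1]).pvalue
p_res = stats.levene(resid[group == 0], resid[group == 1]).pvalue
print(f"Levene on raw response by group: p = {p_raw:.3g}")
print(f"Levene on model residuals:       p = {p_res:.3g}")
```

The test on the raw response picks up the spread of the covariate rather than the error variance, so it flags heteroscedasticity that the model does not actually have.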
4. What you should do in practice
Best practice workflow
Fit the ANOVA model
Examine residuals
Residuals vs fitted values → homoscedasticity
Q–Q plot → normality
Use formal tests sparingly (they are over-sensitive in large samples and underpowered in small ones)
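The workflow above can be sketched for a one-way design as follows. The helper name `one_way_diagnostics` and the simulated data are assumptions made for this example; the function returns the numbers one would plot (residuals vs fitted values, Q-Q pairs) rather than drawing the figures itself.

```python
import numpy as np
from scipy import stats


def one_way_diagnostics(groups):
    """Fit a one-way ANOVA by group means and return residual diagnostics."""
    y = np.concatenate(groups)
    # Fitted value of each observation = mean of its own group.
    fitted = np.concatenate([np.full(len(g), np.mean(g)) for g in groups])
    resid = y - fitted
    # Q-Q data: theoretical normal quantiles vs ordered residuals, plus the
    # correlation r of the Q-Q line (values close to 1 suggest normality).
    (theo_q, ordered_resid), (_slope, _intercept, r) = stats.probplot(
        resid, dist="norm"
    )
    # Residual spread at each fitted value (one fitted value per group here).
    spread = {float(np.mean(g)): float(np.std(g, ddof=1)) for g in groups}
    return {
        "fitted": fitted,
        "resid": resid,
        "qq": (theo_q, ordered_resid),
        "qq_r": r,
        "spread_by_fitted": spread,
    }


# Hypothetical data: three groups, common error sd of 1.
rng = np.random.default_rng(3)
groups = [rng.normal(m, 1.0, 30) for m in (5.0, 7.0, 9.0)]

f_stat, p_value = stats.f_oneway(*groups)  # step 1: fit the ANOVA
diag = one_way_diagnostics(groups)         # step 2: examine residuals
print(f"F = {f_stat:.2f}, p = {p_value:.3g}")
print("Residual sd by fitted value:", diag["spread_by_fitted"])
print(f"Q-Q correlation: {diag['qq_r']:.3f}")
```

Plotting `diag["resid"]` against `diag["fitted"]` gives the homoscedasticity check, and plotting the two arrays in `diag["qq"]` against each other gives the Q–Q plot. For designs with several factors or covariates, the residuals would have to come from the full fitted model instead of simple group means.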
Why formal tests are often discouraged
Large samples → reject trivial deviations
Small samples → very low power
ANOVA is quite robust to moderate non-normality if variances are homogeneous
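The sample-size effect can be made concrete with a small simulation. This is only a sketch under stated assumptions: the mildly skewed gamma distribution (shape 50), the sample sizes, the number of replications, and the function name `shapiro_rejection_rate` are all choices made for the illustration.

```python
import numpy as np
from scipy import stats


def shapiro_rejection_rate(n, n_sim=200, alpha=0.05, seed=0):
    """Share of simulated samples of size n for which Shapiro-Wilk rejects."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        # Gamma with shape 50: close to normal, with only a slight right skew.
        sample = rng.gamma(shape=50.0, scale=1.0, size=n)
        if stats.shapiro(sample).pvalue < alpha:
            rejections += 1
    return rejections / n_sim


rate_small = shapiro_rejection_rate(n=10)
rate_large = shapiro_rejection_rate(n=4000)
print(f"Rejection rate at n = 10:   {rate_small:.2f}")
print(f"Rejection rate at n = 4000: {rate_large:.2f}")
```

The same mild deviation is rarely detected in small samples and almost always detected in large ones, so the test result says more about the sample size than about whether the deviation matters for the ANOVA.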
5. Summary table
| Assumption | What it applies to | Should be tested on |
|---|---|---|
| Normality | Errors | Residuals |
| Homoscedasticity | Error variance | Residuals |
| Independence | Errors | Study design |