Hypotheses and ANOVA

👉 For ANOVA, both the homogeneity of variances and the normality assumptions concern the errors of the model, so they should be assessed on the residuals.

Below is the precise reasoning, with practical nuances.


1. What ANOVA actually assumes

The classical ANOVA model is:

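$$Y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$$

with the assumptions:

  • Normality:
    $\varepsilon_{ij} \sim N(0, \sigma^2)$

  • Homoscedasticity:
    $\mathrm{Var}(\varepsilon_{ij}) = \sigma^2$ for all groups

  • Independence of the $\varepsilon_{ij}$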

So both assumptions apply to the errors, not to the raw response Y.
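The errors themselves are never observed, so in practice the diagnostics work with their estimates, the residuals. For the one-way model above, a sketch of that definition follows. The symbols $e_{ij}$, $\hat{Y}_{ij}$, $\hat{\mu}$, $\hat{\alpha}_i$ and $\bar{Y}_{i\cdot}$ (the mean of group $i$) are notation introduced here for illustration, not taken from the original post.

```latex
e_{ij} = Y_{ij} - \hat{Y}_{ij}
       = Y_{ij} - (\hat{\mu} + \hat{\alpha}_i)
       = Y_{ij} - \bar{Y}_{i\cdot}
```

In other words, in a one-way layout each residual is simply the observation minus the mean of its own group.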


2. Consequences for diagnostics

✅ Normality

  • Should be assessed on residuals, not on original data.

  • Raw data can be non-normal simply because group means differ.

  • Correct tools:

    • Q–Q plot of residuals

    • Histogram of residuals

    • Shapiro–Wilk test on residuals (with caution)
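The point about group means can be illustrated with a small simulation. This is only a sketch under assumed values: the three group means, the error standard deviation of 1, the sample size and the random seed are all hypothetical choices, and NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical setup: three groups with very different means
# but identical normal errors (sigma = 1).
group_means = [0.0, 5.0, 10.0]
n_per_group = 50
groups = [m + rng.normal(0.0, 1.0, n_per_group) for m in group_means]

# Pooled raw response: a mixture of three well-separated groups.
raw = np.concatenate(groups)

# One-way ANOVA residuals: each observation minus its own group mean.
residuals = np.concatenate([g - g.mean() for g in groups])

_, p_raw = stats.shapiro(raw)
_, p_res = stats.shapiro(residuals)
print(f"Shapiro-Wilk on raw data:  p = {p_raw:.4g}")
print(f"Shapiro-Wilk on residuals: p = {p_res:.4g}")
```

The pooled raw response is strongly multimodal even though every error is normal, so a normality test on the raw data says nothing about the ANOVA assumption.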

✅ Homogeneity of variances

  • Also concerns residual variance.

  • Therefore should be checked on residuals.


3. Important nuance: one-way ANOVA

For a simple one-way ANOVA:

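  • Testing variance homogeneity on raw data by group
    is equivalent to testing it on residuals: each residual is the raw value minus its group mean, and Levene / Bartlett statistics are unchanged by such a within-group shift.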

  • That’s why Levene / Bartlett tests are usually presented on the original response.
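A quick numerical check of this one-way case, as a sketch only: the simulated group means, standard deviations, group size and seed are hypothetical, and `scipy.stats.levene` is used with `center="median"` (the Brown-Forsythe variant) as one possible choice.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical one-way layout: three groups of 25 observations each.
settings = [(10.0, 1.0), (12.0, 1.5), (15.0, 2.0)]  # (mean, sd) per group
groups = [rng.normal(loc=m, scale=s, size=25) for m, s in settings]

# Residuals of the one-way ANOVA, kept grouped.
residual_groups = [g - g.mean() for g in groups]

# Levene's test on the raw response by group ...
stat_raw, p_raw = stats.levene(*groups, center="median")
# ... and on the residuals by group.
stat_res, p_res = stats.levene(*residual_groups, center="median")

print(f"raw:       W = {stat_raw:.6f}, p = {p_raw:.6f}")
print(f"residuals: W = {stat_res:.6f}, p = {p_res:.6f}")
```

Both calls give the same statistic up to floating-point error, because subtracting a constant from every value in a group does not change the absolute deviations from that group's centre.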

⚠️ This equivalence breaks immediately when you have:

  • multiple factors,

  • interactions,

  • covariates (ANCOVA),

  • unbalanced designs with more than one factor.

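In those cases, the residuals are no longer just the raw values centred on one factor's group means, so testing on raw data is incorrect; test the residuals of the fitted model instead.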
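The covariate case can be sketched with a simulated ANCOVA. Everything here is an assumption made for illustration: two groups, a single covariate `x` that is much more spread out in the second group, a common slope of 2, a constant error standard deviation of 1, and a plain least-squares fit with `numpy.linalg.lstsq` standing in for whatever modelling software is actually used.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
n = 100

# Hypothetical ANCOVA data: the error sd is 1 in both groups,
# but the covariate has a much wider range in group 2.
x1 = rng.uniform(0.0, 1.0, n)
x2 = rng.uniform(0.0, 10.0, n)
y1 = 5.0 + 2.0 * x1 + rng.normal(0.0, 1.0, n)
y2 = 6.0 + 2.0 * x2 + rng.normal(0.0, 1.0, n)

# Levene's test on the raw response, grouped by the factor only.
_, p_raw = stats.levene(y1, y2, center="median")

# Fit y ~ intercept + group + x by least squares and extract residuals.
y = np.concatenate([y1, y2])
x = np.concatenate([x1, x2])
is_group2 = np.concatenate([np.zeros(n), np.ones(n)])
design = np.column_stack([np.ones(2 * n), is_group2, x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
residuals = y - design @ coef

# Levene's test on the residuals of the fitted model.
_, p_res = stats.levene(residuals[:n], residuals[n:], center="median")

print(f"Levene on raw response: p = {p_raw:.4g}")
print(f"Levene on residuals:    p = {p_res:.4g}")
```

The raw response looks heteroscedastic only because the covariate varies more in one group; once the covariate is in the model, the residuals reflect the error variance that the assumption is actually about.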


4. What you should do in practice

Best practice workflow

  1. Fit the ANOVA model

  2. Examine residuals

    • Residuals vs fitted values → homoscedasticity

    • Q–Q plot → normality

  3. Use formal tests sparingly (they are over-sensitive in large samples and underpowered in small ones)
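The workflow above can be sketched numerically for a one-way design. The data, group labels and seed are hypothetical, and the two plots are replaced here by numeric summaries (residual spread per group, and the Q-Q correlation returned by `scipy.stats.probplot`) so that the example needs no plotting library.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical data: three groups of 30 observations, equal error variance.
groups = {
    "A": rng.normal(10.0, 2.0, 30),
    "B": rng.normal(12.0, 2.0, 30),
    "C": rng.normal(15.0, 2.0, 30),
}

# Step 1: fit the one-way ANOVA (F test for equal group means).
f_stat, p_value = stats.f_oneway(*groups.values())

# Step 2: fitted values are the group means; residuals are what is left.
fitted = np.concatenate([np.full(len(y), y.mean()) for y in groups.values()])
residuals = np.concatenate([y - y.mean() for y in groups.values()])

# Residuals vs fitted, summarised: residual sd at each fitted value.
# Similar values across groups are consistent with homoscedasticity.
residual_sd = {k: float(np.std(y - y.mean(), ddof=1)) for k, y in groups.items()}

# Q-Q plot, summarised: correlation between ordered residuals and
# theoretical normal quantiles; close to 1 is consistent with normality.
(theo_q, ordered_res), (slope, intercept, qq_r) = stats.probplot(residuals, dist="norm")

print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.3g}")
print("Residual sd by group:", residual_sd)
print(f"Q-Q correlation: {qq_r:.4f}")
```

Plotting `fitted` against `residuals`, and `theo_q` against `ordered_res`, gives the two graphical diagnostics named in step 2.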

Why formal tests are often discouraged

  • Large samples → reject trivial deviations

  • Small samples → very low power

  • ANOVA is quite robust to moderate non-normality if variances are homogeneous
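The first two points can be illustrated by simulation. This is a sketch under assumptions: the mildly skewed distribution (a gamma with shape 50, skewness about 0.28), the sample sizes 20 and 2000, the 200 replications and the helper name `shapiro_rejection_rate` are all hypothetical choices made for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(123)

def shapiro_rejection_rate(n, n_sim=200, alpha=0.05):
    """Share of simulated samples of size n where Shapiro-Wilk rejects normality."""
    rejections = 0
    for _ in range(n_sim):
        # Gamma with shape 50: slightly skewed, visually close to normal.
        sample = rng.gamma(shape=50.0, scale=1.0, size=n)
        _, p = stats.shapiro(sample)
        rejections += int(p < alpha)
    return rejections / n_sim

# Same mild departure from normality, two very different sample sizes.
rate_small = shapiro_rejection_rate(20)
rate_large = shapiro_rejection_rate(2000)

print(f"Rejection rate at n = 20:   {rate_small:.2f}")
print(f"Rejection rate at n = 2000: {rate_large:.2f}")
```

The departure from normality is identical in both settings; only the sample size changes how often the test flags it, which is why the result of a formal test alone is a poor guide to whether the assumption matters in practice.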


5. Summary table

Assumption        | What it applies to | Should be tested on
Normality         | Errors             | Residuals
Homoscedasticity  | Error variance     | Residuals
Independence      | Errors             | Study design
