Comparison of models by AIC with or without log transformation on Y
You cannot compare AIC or BIC values between models fitted to two different response variables, e.g. Y and log(Y). A comparison based on AIC or BIC is only valid when the models are fitted to the same data set. Have a look at Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (Burnham and Anderson, 2004). They make this point on page 81 (section 2.11.3, Transformations of the Response Variable):
Investigators should be sure that all hypotheses are modeled using the same response variable (e.g., if the whole set of models were based on log(y), no problem would be created; it is the mixing of response variables that is incorrect).
Akaike (1978, p. 224) describes how the AIC can be adjusted in the presence of a transformed outcome variable to enable model comparison. He states: “the effect of transforming the variable is represented simply by the multiplication of the likelihood by the corresponding Jacobian to the AIC ... for the case of log{y(n)+1}, it is −2 ⋅ ∑ log{y(n)+1}, where the summation extends over n = 1, 2, ..., N.”
Akaike, H. 1978. "On the likelihood of a time series model," Journal of the Royal Statistical Society, Series D (The Statistician), 27(3/4), pp. 217–235.
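For the plain log transformation discussed in this question, the size of the adjustment follows from the change-of-variables formula. The derivation below is only a sketch: the symbols g (the density the model assigns to log y), k (the number of estimated parameters) and N (the sample size) are notation introduced here for illustration, not Akaike's.

```latex
% z_i = \log y_i is the transformed response; g is the model density for z_i.
\begin{aligned}
f_Y(y_i) &= g(\log y_i)\,\left|\frac{d \log y_i}{d y_i}\right| = \frac{g(\log y_i)}{y_i}, \\
\log L_Y &= \log L_{\log Y} - \sum_{i=1}^{N} \log y_i, \\
\mathrm{AIC}_Y &= -2 \log L_Y + 2k = \mathrm{AIC}_{\log Y} + 2 \sum_{i=1}^{N} \log y_i .
\end{aligned}
```

In words, an AIC computed for a model of log(y) is put on the scale of y by adding twice the sum of the log responses.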
Let's do a toy example:
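# rate values are assumed (DAAG seedrates data); that part of the line was lost
seedrates <- data.frame(rate = c(50, 75, 100, 125, 150),
                        grain = c(21.2, 19.9, 19.2, 18.4, 17.9))
quad.lm <- lm(grain ~ poly(rate, 2), data = seedrates)      # response: grain
loglin.lm <- lm(log(grain) ~ log(rate), data = seedrates)   # response: log(grain)
AIC(quad.lm, loglin.lm)  # raw AICs, not yet comparable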
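The two models have different response variables, so the raw AICs are not comparable. We need to add sum(2*log(seedrates$grain)) = 29.6 to the AIC for the log-linear model (or, equivalently, subtract it from the AIC for the quadratic model). With the adjustment applied to the log-linear model, the comparable values are:
          df  AIC
quad.lm    4 -4.1
loglin.lm  3 -7.6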
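The adjustment itself can be written out in a few lines of R. This is a minimal sketch, assuming the rate values 50, 75, ..., 150 from the DAAG seedrates data set; the names jacobian_adj and aic_comparable are hypothetical and introduced only for illustration.

```r
# Toy data; the rate values are an assumption (DAAG seedrates data set)
seedrates <- data.frame(rate = c(50, 75, 100, 125, 150),
                        grain = c(21.2, 19.9, 19.2, 18.4, 17.9))

quad.lm <- lm(grain ~ poly(rate, 2), data = seedrates)
loglin.lm <- lm(log(grain) ~ log(rate), data = seedrates)

# Jacobian term for a log-transformed response: 2 * sum(log(y))
jacobian_adj <- sum(2 * log(seedrates$grain))

# Both AICs expressed on the scale of the untransformed response
aic_comparable <- c(quad.lm = AIC(quad.lm),
                    loglin.lm = AIC(loglin.lm) + jacobian_adj)
print(round(aic_comparable, 1))
```

Under these assumptions the printed values should come out close to those in the table above, and the lower adjusted AIC favours the log-linear model.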
See also the related discussion of prerequisites for AIC model comparison: https://stats.stackexchange.com/questions/48714/prerequisites-for-aic-model-comparison