Partial correlation coefficient
Let us define three variables, x, y and a:
x <- rnorm(100, mean=10, sd=2)
y <- x+rnorm(100, mean=20, sd=5)
a <- rnorm(100, mean=10, sd=2)
df <- data.frame(x=x, y=y, a=a)
The partial correlation coefficient between x and y given a is the correlation between the residuals of two linear regressions: x on a, and y on a:
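pcor <- function(df) {
  # Regress x and y separately on the control variable a
  ax <- lm(x ~ a, data=df)
  ay <- lm(y ~ a, data=df)
  # Keep the part of each variable that a does not explain
  rx <- residuals(ax)
  ry <- residuals(ay)
  # The partial correlation is the correlation of these residuals
  return(cor(rx, ry))
}
pcor(df)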
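As a cross-check that does not appear in the original post, the residual-based value can be compared with the textbook closed-form expression for a first-order partial correlation, (r_xy - r_xa * r_ya) / sqrt((1 - r_xa^2) * (1 - r_ya^2)). The sketch below is self-contained: it regenerates the data with an arbitrary seed, and the names closed_form and residual_based are chosen here for illustration only.

```r
# Illustrative check; the seed and variable names are assumptions
set.seed(1)
x <- rnorm(100, mean=10, sd=2)
y <- x + rnorm(100, mean=20, sd=5)
a <- rnorm(100, mean=10, sd=2)

# Pairwise Pearson correlations
rxy <- cor(x, y)
rxa <- cor(x, a)
rya <- cor(y, a)

# Closed-form first-order partial correlation of x and y given a
closed_form <- (rxy - rxa * rya) / sqrt((1 - rxa^2) * (1 - rya^2))

# Residual-based value, computed as in pcor() above
residual_based <- cor(residuals(lm(x ~ a)), residuals(lm(y ~ a)))

# The two values should agree up to floating-point error
all.equal(closed_form, residual_based)
```

Both routes describe the same quantity, so the comparison is only a sanity check on the definition, not an additional result.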
Let us confirm this with the ppcor package:
library("ppcor")
ppcor:::pcor(df)$estimate[2, 1]
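Note that I use ppcor:::pcor() to be sure that the package's version of pcor is called rather than the function of the same name defined above. Since pcor is exported by the package, ppcor::pcor() would work as well.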
It works: we obtained the same result.
Note that ppcor:::pcor() is about seven times faster than my version of pcor() in the timings below, even though it estimates three partial correlations and mine only one!
> system.time(
+ for (i in 1:10000) ppcor:::pcor(df)$estimate[2, 1]
+ )
   user  system elapsed
  2.629   0.097   2.777
> system.time(
+ for (i in 1:10000) pcor(df)
+ )
   user  system elapsed
 18.385   0.733  19.403
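One plausible reason for the gap, stated here as an assumption rather than as a description of ppcor's internals, is that lm() builds two complete model objects on every call, whereas all partial correlations can be read directly from the inverse of the correlation matrix. A minimal sketch of that matrix route follows; the function names pcor_matrix and pcor_resid are hypothetical and the seed is arbitrary.

```r
# Illustrative sketch; names and seed are assumptions, not from ppcor
set.seed(1)
x <- rnorm(100, mean=10, sd=2)
y <- x + rnorm(100, mean=20, sd=5)
a <- rnorm(100, mean=10, sd=2)
df <- data.frame(x=x, y=y, a=a)

# Partial correlation of columns 1 and 2 given all remaining columns,
# read from the inverse P of the correlation matrix:
#   -P[i, j] / sqrt(P[i, i] * P[j, j])
pcor_matrix <- function(df) {
  P <- solve(cor(df))
  -P[1, 2] / sqrt(P[1, 1] * P[2, 2])
}

# Residual-based version, as in the post, for comparison
pcor_resid <- function(df) {
  cor(residuals(lm(x ~ a, data=df)), residuals(lm(y ~ a, data=df)))
}

# The two values should agree up to floating-point error
all.equal(pcor_matrix(df), pcor_resid(df))
```

Wrapping each function in the same system.time() loop as above would show whether the matrix route closes the gap on a given machine; no timing is claimed here.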