Standard error from Hessian Matrix... what can be done when problem occurs
An Hessian matrix is a square matrix of partial second order derivative. From the square-root of the inverse of the diagonal, it is possible to estimate the standard error of parameters. Let take an example:
val=rnorm(100, mean=20, sd=5)
# Return -ln L of values in val in Gaussian distribution with mean and sd in par
fitnorm<-function(par, val) {
-sum(dnorm(val, par["mean"], par["sd"], log = TRUE))
}
# Initial values for search
p<-c(mean=20, sd=5)
# fit the model
result <- optim(par=p, fn=fitnorm, val=val, method="BFGS", hessian=TRUE)
# Inverse the hessian matrix to get SE for each parameters
mathessian <- result$hessian
inversemathessian <- solve(mathessian)
res <- sqrt(diag(inversemathessian))
# results
data.frame(Mean=result$par, SE=res)
library(MASS)
(r<-fitdistr(val, "normal"))
It works well.
However, the inverse of the matrix cannot be calculated if the second order derivative for some parameter is null or very low:
p<-c(mean=20, sd=5, k=12)
# fit the model
result <- optim(par=p, fn=fitnorm, val=val, method="BFGS", hessian=TRUE)
# Inverse the hessian matrix to get SE for each parameters
mathessian <- result$hessian
# Here an error occurs
solve(mathessian)
The solution is to remove the parameters with null second order derivatives. Here is a recursive version:
repeat {
sigma <- try(solve(mathessian), silent=TRUE)
if (class(sigma) == "try-error") {
ar <- which.min(apply(mathessian, 1, sum))
mathessian <- mathessian[-ar, -ar]
} else break
}
SE <- sqrt(diag(sigma))
SEInf <- colnames(result$hessian)[!colnames(result$hessian) %in% names(SE)]
SE <- c(SE, structure(rep(+Inf, length(SEInf)), .Names=SEInf))
Rather than to use the diagonal, an alternative solution is to use a Singular Value Decomposition of a Matrix:
Note that if s. <- svd(a), then
s.$u %*% diag(s.$d) %*% t(s.$v) is equal to a
sigma <- try(solve(mathessian), silent=TRUE)
s. <- svd(sigma)
R <- t(s.$v %*% (t(s.$u) * sqrt(pmax(s.$d, 0))))
SE <- sqrt(diag(nearPD(sigma)$mat))
If the diagonal of the inverse Hessian is negative, it is not possible to use square root.
In this case, you can calculate a pseudo-variance matrix based on:
GILL J. AND G. KING 2004. What to do when your Hessian is not invertible: Alternatives to model respecification in nonlinear estimation. Sociological Methods & Research 33: 54-87.
The pseudo-variance matrix is LL' with L=cholesky(H-1) with H being the Hessian matrix.
The function chol() from base package compute the Choleski factorization of a real symmetric positive-definite square matrix.
However, in many cases the H-1 matrix is not positive-definite but negative.
Here is a little script to perform Choleski factorization of any square symmetric matrix:
val=rnorm(100, mean=20, sd=5)
# Return -ln L of values in val in Gaussian distribution with mean and sd in par
fitnorm<-function(par, val) {
-sum(dnorm(val, par["mean"], par["sd"], log = TRUE))
}
# Initial values for search
p<-c(mean=20, sd=5)
# fit the model
result <- optim(par=p, fn=fitnorm, val=val, method="BFGS", hessian=TRUE)
# Inverse the hessian matrix to get SE for each parameters
mathessian <- result$hessian
inversemathessian <- solve(mathessian)
res <- sqrt(diag(inversemathessian))
# results
data.frame(Mean=result$par, SE=res)
library(MASS)
(r<-fitdistr(val, "normal"))
It works well.
However, the inverse of the matrix cannot be calculated if the second order derivative for some parameter is null or very low:
p<-c(mean=20, sd=5, k=12)
# fit the model
result <- optim(par=p, fn=fitnorm, val=val, method="BFGS", hessian=TRUE)
# Inverse the hessian matrix to get SE for each parameters
mathessian <- result$hessian
# Here an error occurs
solve(mathessian)
The solution is to remove the parameters with null second order derivatives. Here is a recursive version:
repeat {
sigma <- try(solve(mathessian), silent=TRUE)
if (class(sigma) == "try-error") {
ar <- which.min(apply(mathessian, 1, sum))
mathessian <- mathessian[-ar, -ar]
} else break
}
SE <- sqrt(diag(sigma))
SEInf <- colnames(result$hessian)[!colnames(result$hessian) %in% names(SE)]
SE <- c(SE, structure(rep(+Inf, length(SEInf)), .Names=SEInf))
Rather than to use the diagonal, an alternative solution is to use a Singular Value Decomposition of a Matrix:
Note that if s. <- svd(a), then
s.$u %*% diag(s.$d) %*% t(s.$v) is equal to a
sigma <- try(solve(mathessian), silent=TRUE)
s. <- svd(sigma)
R <- t(s.$v %*% (t(s.$u) * sqrt(pmax(s.$d, 0))))
SE <- structure((matrix(rep(1, nrow(R)), nrow = 1, byrow = TRUE) %*% R)[1, ], .Names=rownames(mathessian))
In the Matrix package, the function nearPD() can be used to make the matrix positive-definite:SE <- sqrt(diag(nearPD(sigma)$mat))
If the diagonal of the inverse Hessian is negative, it is not possible to use square root.
In this case, you can calculate a pseudo-variance matrix based on:
GILL J. AND G. KING 2004. What to do when your Hessian is not invertible: Alternatives to model respecification in nonlinear estimation. Sociological Methods & Research 33: 54-87.
The pseudo-variance matrix is LL' with L=cholesky(H-1) with H being the Hessian matrix.
The function chol() from base package compute the Choleski factorization of a real symmetric positive-definite square matrix.
However, in many cases the H-1 matrix is not positive-definite but negative.
Here is a little script to perform Choleski factorization of any square symmetric matrix:
a <- solve(A)
n = dim(a)[1];
root = matrix(0,n,n);
for (i in 1:n){
sum = 0;
if (i>1){
sum = sum(root[i,1:(i-1)]^2);
}
x = a[i,i] - sum;
if (x<0){
x = 0;
}
root[i,i] = sqrt(x);
if (i < n){
for (j in (i+1):n){
if (root[i,i] == 0){
x=0;
}
else{
sum = 0;
if (i>1) {
sum = root[i,1:(i-1)] %*% t(t(root[j,1:(i-1)]))
}
x = (a[i,j] - sum)/root[i,i];
}
root[j,i] = x;
}
}
}
SE <- sqrt(diag(root %*% t(root)))
In some cases, the Choleski factorization fails. Then it is possible to use these alternatives:
REBONATO R. AND P. JäCKEL 2000. The most general methodology to create a valid correlation matrix for risk management and option pricing purposes. J. Risk 2: 17-26.
BRISSETTE F.P., M. KHALILI AND R. LECONTE 2007. Efficient stochastic generation of multi-site synthetic precipitation data. Journal of Hydrology 345: 121-133.
In some cases, the Choleski factorization fails. Then it is possible to use these alternatives:
REBONATO R. AND P. JäCKEL 2000. The most general methodology to create a valid correlation matrix for risk management and option pricing purposes. J. Risk 2: 17-26.
BRISSETTE F.P., M. KHALILI AND R. LECONTE 2007. Efficient stochastic generation of multi-site synthetic precipitation data. Journal of Hydrology 345: 121-133.
These different methods have been included in the function SEfromHessian() in the package HelpersMG.
If all failed, use Metropolis-Hastings algorithm with a Markov-Chain and Monte-Carlo sampling.
If all failed, use Metropolis-Hastings algorithm with a Markov-Chain and Monte-Carlo sampling.
Commentaires
Enregistrer un commentaire