Binomial confidence limit

Hmisc: version 5.1-1

binom: version 1.1-1.1

binomCI: version 1.0

DescTools: version 0.99.50

I just noticed that the different implementations of the Wilson method give different results when x=1 or x=n-1 is observed:

n <- 2
x <- 1
sr_binom <- binom::binom.confint(x=x, n=n, methods = "wilson", conf.level = 0.95)
sr_Hmisc <- Hmisc::binconf(x=x, n=n, method = "wilson", alpha = 0.05)
sr_binomCI <- binomCI::binomCI(x=x, n=n, a = 0.05)
# Warning message:
# In sqrt(n - z^2 - 2 * z/sqrt(n) - 1/n) : NaNs produced
sr_binom
#   method x n mean      lower     upper
# 1 wilson 1 2  0.5 0.09453121 0.9054688
sr_Hmisc
# PointEst      Lower     Upper
#     0.5 0.02564665 0.9743534
sr_binomCI$ci["Wilson", ]
#    0.025%     0.975% 
# 0.09453121 0.90546879 

No difference between the packages is observed in other cases, for example with n <- 4 and x <- 2. The discrepancy also does not appear with the exact method (exact binomial).

Checking the code, I see that the Hmisc package corrects the lower limit of the CI when x==1 and the upper limit when x==(n-1):

if (x == 1) cl[1] <- -log(1 - alpha)/n
if (x == (n - 1)) cl[2] <- 1 + log(1 - alpha)/n

This correction does not appear in other packages. It comes from the paper:

Agresti, A.; Coull, B.A. Approximate is better than "exact" for interval estimation of binomial proportions. The American Statistician 1998, 52, 119-126.

Page 125:
In deciding whether to use the score interval, some may be bothered by its poor coverage for values of p just below the lower boundary of the interval when X = 1 and just above the upper boundary of the interval when X = n - 1. One could then use an adapted version that replaces the lower endpoint by -log(1 - α)/n when X = 1 and the upper endpoint by 1 + log(1 - α)/n when X = n - 1. This adaptation improves the minimum coverage considerably.
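To make the adaptation concrete, here is a minimal base R sketch of the Wilson score interval with and without the replacement quoted above. The function names wilson_ci and wilson_ci_adapted are hypothetical helpers written for this post, not code taken from any of the packages; they only assume the standard closed form of the score interval.

```r
# Plain Wilson score interval, base R only (hypothetical helper).
wilson_ci <- function(x, n, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  p <- x / n
  centre <- (p + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z * sqrt((p * (1 - p) + z^2 / (4 * n)) / n) / (1 + z^2 / n)
  c(lower = centre - half, upper = centre + half)
}

# Same interval with the adaptation from the quote:
# replace the lower limit when x == 1 and the upper limit when x == n - 1.
wilson_ci_adapted <- function(x, n, alpha = 0.05) {
  ci <- wilson_ci(x, n, alpha)
  if (x == 1) ci["lower"] <- -log(1 - alpha) / n
  if (x == n - 1) ci["upper"] <- 1 + log(1 - alpha) / n
  ci
}

wilson_ci(1, 2)          # compare with the binom and binomCI results
wilson_ci_adapted(1, 2)  # compare with the Hmisc result
wilson_ci_adapted(2, 4)  # x is neither 1 nor n - 1, so nothing is replaced
```

With n <- 2 and x <- 1 both conditions hold at once, which is why both limits differ between the packages in the example at the top of this post.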

In my opinion, this adjustment deserves to be documented in the Hmisc package because it was not part of the original Wilson method, and its absence from binom and binomCI leaves their Wilson intervals with poor coverage in this situation.
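The coverage problem can be checked directly by computing the exact coverage at a given p, that is, the total binomial probability of the outcomes x whose interval contains p. The sketch below is a rough illustration in base R: the function wilson_coverage is a hypothetical helper, and n = 10 with p = 0.017 are illustrative values chosen because the unadapted Wilson lower limit for x = 1 and n = 10 is about 0.018, so p sits just below it.

```r
# Exact coverage of the Wilson interval at a true proportion p.
# adapted = TRUE applies the x == 1 / x == n - 1 replacement.
wilson_coverage <- function(p, n, alpha = 0.05, adapted = FALSE) {
  x <- 0:n
  z <- qnorm(1 - alpha / 2)
  phat <- x / n
  centre <- (phat + z^2 / (2 * n)) / (1 + z^2 / n)
  half <- z * sqrt((phat * (1 - phat) + z^2 / (4 * n)) / n) / (1 + z^2 / n)
  lower <- centre - half
  upper <- centre + half
  if (adapted) {
    lower[x == 1] <- -log(1 - alpha) / n
    upper[x == n - 1] <- 1 + log(1 - alpha) / n
  }
  # Sum the probabilities of the outcomes whose interval covers p.
  sum(dbinom(x[lower <= p & p <= upper], n, p))
}

n <- 10
p <- 0.017  # illustrative value just below the plain Wilson lower limit for x = 1
wilson_coverage(p, n)                  # plain Wilson
wilson_coverage(p, n, adapted = TRUE)  # with the adaptation
```

At this p the plain interval for x = 1 excludes the true value, so only x = 0 contributes to the coverage; the adapted lower limit brings x = 1 back in.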

Here are other implementations:

sr_DescTools <- DescTools::BinomCI(x=x, n=n, method = "wilson", conf.level = 0.95)
sr_DescTools_2 <- DescTools::BinomCI(x=x, n=n, method = "agresti-coull", conf.level = 0.95)
sr_DescTools_3 <- DescTools::BinomCI(x=x, n=n, method = "modified wilson", conf.level = 0.95)
sr_DescTools_4 <- DescTools::BinomCI(x=x, n=n, method = "wilsoncc", conf.level = 0.95)

Wilson is based on:
Wilson, E.B. Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association 1927, 22, 209-212.

Agresti and Coull is based on:
Agresti, A.; Coull, B.A. Approximate is better than "exact" for interval estimation of binomial proportions. The American Statistician 1998, 52, 119-126.

Modified Wilson is based on:
Brown, L.D.; Cai, T.T.; DasGupta, A. Interval estimation for a binomial proportion. Statistical Science 2001, 16, 101-117.

Wilson cc is the Wilson interval with an added continuity correction term and is based on:
