Bayesian or frequentist test
This note is based on a discussion with Jean-Michel Guillon.
The idea was to test an idea from this this paper:
Sterne, J.A.C., Smith, G.D., 2001. Sifting the evidence-what’s wrong with significance tests? British Medical Journal 322, 226-231.
This it is indeed again correct.
The idea was to test an idea from this this paper:
Sterne, J.A.C., Smith, G.D., 2001. Sifting the evidence-what’s wrong with significance tests? British Medical Journal 322, 226-231.
If our prior opinion about the risk ratio is vague (we consider a wide range of values to be equally likely) then the results of a frequentist analysis are similar to the results of a bayesian analysis; both are based on what statisticians call the likelihood for the data:
• The 95% confidence interval is the same as the 95% credible interval, except that the latter has the meaning often incorrectly ascribed to a confidence interval;
• The (one sided) P value is the same as the bayesian posterior probability that the drug increases the risk of death (assuming that we found a protective effect of the drug).
The two approaches, however, will give different results if our prior opinion is not vague, relative to the amount of information contained in the data.
Let test the first idea:
set.seed(1)
set.seed(1)
val1 <- rnorm(100, 10, 2)
# 95% credibility interval
library(HelpersMG)
dnormx <- function(data, x, fixed) {
data <- unlist(data)
x <- c(x, fixed)
return(-sum(dnorm(data, mean=x['mean'], sd=x['sd'], log=TRUE)))
}
parameters_mcmc <- data.frame(Density = "dunif",
Prior1 = -100,
Prior2 = 100,
SDProp = 0.35,
Min = -100,
Max = 100,
Init = 10,
row.names = "mean", stringsAsFactors = FALSE)
set.seed(1)
mcmc_run <- MHalgoGen(n.iter=500000, parameters=parameters_mcmc, fixed=c(sd=2),
data=val1,
likelihood=dnormx, n.chains=1, n.adapt=100, thin=1, trace=100)
> quantile(mcmc_run$resultMCMC[[1]], probs=c(0.025, 0.975))
2.5% 97.5%
9.812352 10.058331
> # 95% confidence interval is
> c(mean(val1)-2*sd(val1)/sqrt(length(val1)),
+ mean(val1)+2*sd(val1)/sqrt(length(val1)))
[1] 9.807548 10.063363
2.5% 97.5%
9.812352 10.058331
> # 95% confidence interval is
> c(mean(val1)-2*sd(val1)/sqrt(length(val1)),
+ mean(val1)+2*sd(val1)/sqrt(length(val1)))
[1] 9.807548 10.063363
This is indeed correct.
Let test the second idea. We will used 3 ways of testing. A classical t-test, a mcmc posterior distribution using HelpersMG package and same using BayesFirstAid package. This package is no more maintained but is available on GitHub. See http://sumsar.net/files/academia/baath_user14_abstract.pdf
This is possible because in bayes.t.test help, it is indicated that:
The constants M[μ], S[μ], L[σ] and H[σ] are set so that the priors on μ and σ are essentially flat.
This is possible because in bayes.t.test help, it is indicated that:
The constants M[μ], S[μ], L[σ] and H[σ] are set so that the priors on μ and σ are essentially flat.
> set.seed(1)
> val1 <- rnorm(10000, 10, 2)
>
> set.seed(1)
> # credibility interval
> library(HelpersMG)
>
> dnormx <- function(data, x, fixed) {
+ data <- unlist(data)
+ x <- c(x, fixed)
+ return(-sum(dnorm(data, mean=x['mean'], sd=x['sd'], log=TRUE)))
+ }
>
> parameters_mcmc <- data.frame(Density = "dunif",
+ Prior1 = -100,
+ Prior2 = 100,
+ SDProp = 0.35,
+ Min = -100,
+ Max = 100,
+ Init = 10,
+ row.names = "mean", stringsAsFactors = FALSE)
>
> mcmc_run <- MHalgoGen(n.iter=500000, parameters=parameters_mcmc, fixed=c(sd=2),
+ data=val1,
+ likelihood=dnormx, n.chains=1, n.adapt=100, thin=1, trace=1)
Chain 1: [1] -21244.8863059415
Best likelihood for:
mean = 9.98692609373803
> plot(mcmc_run, xlim=c(-110, 110))
> hist(mcmc_run$resultMCMC[[1]])
> 1-sum(mcmc_run$resultMCMC[[1]]<10)/length(mcmc_run$resultMCMC[[1]])
[1] 0.25648
> t.test(x=val1, mu=10, alternative = c("less"))
One Sample t-test
data: val1
t = -0.64573, df = 9999, p-value = 0.2592
alternative hypothesis: true mean is less than 10
95 percent confidence interval:
-Inf 10.02023
sample estimates:
mean of x
9.986926
>
> library(BayesianFirstAid)
> bayes.t.test(x=val1, mu=10, alternative = c("less"))
|**************************************************| 100%
Bayesian estimation supersedes the t test (BEST) - one sample
data: val1, n = 10000
Estimates [95% credible interval]
mean of val1: 10 [9.9, 10]
sd of val1: 2.0 [2.0, 2.0]
The mean is more than 10 by a probability of 0.263
and less than 10 by a probability of 0.737
Warning message:
In bayes.t.test.default(x = val1, mu = 10, alternative = c("less")) :
The argument 'alternative' is ignored by bayes.binom.test
> val1 <- rnorm(10000, 10, 2)
>
> set.seed(1)
> # credibility interval
> library(HelpersMG)
>
> dnormx <- function(data, x, fixed) {
+ data <- unlist(data)
+ x <- c(x, fixed)
+ return(-sum(dnorm(data, mean=x['mean'], sd=x['sd'], log=TRUE)))
+ }
>
> parameters_mcmc <- data.frame(Density = "dunif",
+ Prior1 = -100,
+ Prior2 = 100,
+ SDProp = 0.35,
+ Min = -100,
+ Max = 100,
+ Init = 10,
+ row.names = "mean", stringsAsFactors = FALSE)
>
> mcmc_run <- MHalgoGen(n.iter=500000, parameters=parameters_mcmc, fixed=c(sd=2),
+ data=val1,
+ likelihood=dnormx, n.chains=1, n.adapt=100, thin=1, trace=1)
Chain 1: [1] -21244.8863059415
Best likelihood for:
mean = 9.98692609373803
> plot(mcmc_run, xlim=c(-110, 110))
> hist(mcmc_run$resultMCMC[[1]])
> 1-sum(mcmc_run$resultMCMC[[1]]<10)/length(mcmc_run$resultMCMC[[1]])
[1] 0.25648
> t.test(x=val1, mu=10, alternative = c("less"))
One Sample t-test
data: val1
t = -0.64573, df = 9999, p-value = 0.2592
alternative hypothesis: true mean is less than 10
95 percent confidence interval:
-Inf 10.02023
sample estimates:
mean of x
9.986926
>
> library(BayesianFirstAid)
> bayes.t.test(x=val1, mu=10, alternative = c("less"))
|**************************************************| 100%
Bayesian estimation supersedes the t test (BEST) - one sample
data: val1, n = 10000
Estimates [95% credible interval]
mean of val1: 10 [9.9, 10]
sd of val1: 2.0 [2.0, 2.0]
The mean is more than 10 by a probability of 0.263
and less than 10 by a probability of 0.737
Warning message:
In bayes.t.test.default(x = val1, mu = 10, alternative = c("less")) :
The argument 'alternative' is ignored by bayes.binom.test
This it is indeed again correct.
Commentaires
Enregistrer un commentaire