6.2 Null Hypothesis Testing on the Correlation
The null hypothesis is that the correlation \(\rho\) is 0.
set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)
cor.test(wood,heat)
##
## Pearson's product-moment correlation
##
## data: wood and heat
## t = -0.29513, df = 22, p-value = 0.7707
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4546764 0.3494514
## sample estimates:
## cor
## -0.06279774
The t-value is -0.2951; with 22 degrees of freedom, a t-value whose absolute value is less than about 2 is not significant at the 0.05 level. The p-value is 0.7707, which is greater than 0.05, so we fail to reject the null hypothesis that the correlation is 0. The 95% confidence interval for \(\rho\) ranges from -0.45 to 0.35. The CI straddles 0, so a correlation of 0 remains plausible, which is consistent with failing to reject the null hypothesis. In contrast, in the second example below, one variable is constructed partly from the other (1.41 is approximately \(\sqrt{2}\)), the p-value is less than 0.05, and we reject the null hypothesis that the correlation is 0.
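As a rough check on the first test, the t-value, p-value, and confidence interval can be reproduced by hand from the sample correlation. The sketch below assumes the standard formulas \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) and a Fisher z-transformed interval; the names `r`, `n`, `t_stat`, `p_val`, and `ci` are introduced here for illustration only.

```r
# Recreate the data used in the first cor.test() call
set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)

r <- cor(wood, heat)   # sample Pearson correlation
n <- length(wood)      # number of paired observations

# t statistic for H0: rho = 0, with n - 2 degrees of freedom
t_stat <- r * sqrt(n - 2) / sqrt(1 - r^2)

# Two-sided p-value from the t distribution
p_val <- 2 * pt(-abs(t_stat), df = n - 2)

# 95% confidence interval via the Fisher z-transformation
z  <- atanh(r)
se <- 1 / sqrt(n - 3)
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

print(c(t = t_stat, p = p_val, lower = ci[1], upper = ci[2]))
```

These values should agree with the `cor.test(wood, heat)` output shown above. The second example, which follows, reuses the same `wood` and `heat` vectors.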
cor.test(wood,(wood/1.41 + heat/1.41))
##
## Pearson's product-moment correlation
##
## data: wood and (wood/1.41 + heat/1.41)
## t = 3.0604, df = 22, p-value = 0.005731
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1834949 0.7782808
## sample estimates:
## cor
## 0.5464432
6.2.1 Bayesian Reasoning About Correlation
The Bayesian approach to inference about the correlation is based on the posterior distribution of \(\rho\). We write a small helper function for the Bayesian test of the correlation, built on generalTestBF() and posterior() from the BayesFactor package.
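The helper rests on a simple fact: when both variables are z-scored, the least-squares slope of one on the other equals Pearson's r. A minimal base-R check of that fact, using the same simulated data (the names `zx`, `zy`, `slope`, and `r` are illustrative):

```r
# Same simulated data as in the frequentist example
set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)

# z-score both variables (scale() returns a one-column matrix)
zx <- as.numeric(scale(wood))
zy <- as.numeric(scale(heat))

# Slope from regressing one standardized variable on the other
slope <- unname(coef(lm(zx ~ zy))[2])

# Pearson correlation of the raw variables
r <- cor(wood, heat)

print(all.equal(slope, r))  # the two quantities should coincide
```

The function `bfCorTest` defined next applies the same standardization before fitting the model, which is why the coefficient labelled `rhoNot0` can be read as an estimate of the correlation.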
library(BayesFactor) # provides generalTestBF() and posterior()

bfCorTest <- function(x, y) # Bayes factor and posterior for the correlation
{
  zx <- scale(x) # z-score of x
  zy <- scale(y) # z-score of y
  zData <- data.frame(x = zx, rhoNot0 = zy)
  bfOut <- generalTestBF(x ~ rhoNot0, data = zData) # slope on z-scores estimates the correlation
  mcmcOut <- posterior(bfOut, iterations = 10000) # posterior samples
  print(summary(mcmcOut[, "rhoNot0"])) # posterior mean and quantiles for the slope
  return(bfOut)
}
set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)
bfCorTest(wood,heat)
##
## Iterations = 1:10000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## -0.043339 0.184223 0.001842 0.001842
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## -0.41867 -0.15726 -0.04104 0.07156 0.32879
## Bayes factor analysis
## --------------
## [1] rhoNot0 : 0.385294 ±0%
##
## Against denominator:
## Intercept only
## ---
## Bayes factor type: BFlinearModel, JZS
The point estimate for \(\rho\) is -0.043 (the posterior mean). The 95% HDI for \(\rho\), approximated here by the 2.5% and 97.5% posterior quantiles, is (-0.42, 0.33). The HDI straddles 0, so a correlation of 0 remains a credible value. The Bayes factor of 0.385 is less than 1, which indicates that the odds are weakly in favor of the null hypothesis.
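A Bayes factor below 1 is often easier to read after inverting it, so that it expresses the evidence for the null rather than against it. The sketch below uses the value printed in the output above; the equal prior odds are an assumption made purely for illustration, and the variable names are hypothetical.

```r
# Bayes factor for "rhoNot0" against the intercept-only model,
# copied from the output above
bf10 <- 0.385294

# Invert to express evidence for the null over the alternative
bf01 <- 1 / bf10            # about 2.6

# Assumption: both hypotheses are equally likely a priori
prior_odds_null <- 1
post_odds_null  <- bf01 * prior_odds_null

# Convert posterior odds to a posterior probability for the null
post_prob_null <- post_odds_null / (1 + post_odds_null)  # about 0.72

print(round(c(BF01 = bf01, P_null = post_prob_null), 3))
```

Under these assumptions the data are only about 2.6 times more likely under the null than under the alternative, which is why the evidence is described as weak.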