6.2 Null Hypothesis Testing On the Correlation
The null hypothesis is that the correlation \(\rho\) is 0.
set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)
cor.test(wood,heat)
##
## Pearson's product-moment correlation
##
## data: wood and heat
## t = -0.29513, df = 22, p-value = 0.7707
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.4546764 0.3494514
## sample estimates:
## cor
## -0.06279774
The t-value is -0.2951 (any t-value with an absolute value less than about 2 is unlikely to be significant). The p-value is 0.7707, which is greater than 0.05. Therefore, we fail to reject the null hypothesis that the correlation is 0. The 95% confidence interval for \(\rho\) ranges from -0.45 to 0.35. The CI straddles 0, so a correlation of 0 remains plausible, which is consistent with failing to reject the null hypothesis. In contrast, in the following case the p-value (0.0057) is less than 0.05 and the 95% CI (0.18 to 0.78) excludes 0, so we reject the null hypothesis that the correlation is 0.
cor.test(wood,(wood/1.41 + heat/1.41))
##
## Pearson's product-moment correlation
##
## data: wood and (wood/1.41 + heat/1.41)
## t = 3.0604, df = 22, p-value = 0.005731
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.1834949 0.7782808
## sample estimates:
## cor
## 0.5464432
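The numbers reported by cor.test can be reproduced by hand, which makes the link between the sample correlation, the t statistic, and the confidence interval explicit. The following is a minimal sketch: the helper name `cor_test_by_hand` is hypothetical (not part of any package), and it assumes the standard formulas, namely \(t = r\sqrt{n-2}/\sqrt{1-r^2}\) with \(n-2\) degrees of freedom and a confidence interval built on the Fisher z-transform.

```r
# Sketch: reproduce the main cor.test() results with base R.
# cor_test_by_hand is a hypothetical helper name used for illustration only.
cor_test_by_hand <- function(x, y, conf_level = 0.95) {
  n      <- length(x)
  r      <- cor(x, y)                           # sample Pearson correlation
  df     <- n - 2                               # degrees of freedom
  t_stat <- r * sqrt(df) / sqrt(1 - r^2)        # t statistic under H0: rho = 0
  p_val  <- 2 * pt(-abs(t_stat), df)            # two-sided p-value
  z      <- atanh(r)                            # Fisher z-transform of r
  half   <- qnorm(1 - (1 - conf_level) / 2) / sqrt(n - 3)
  ci     <- tanh(c(z - half, z + half))         # back-transform to the r scale
  list(r = r, t = t_stat, df = df, p.value = p_val, conf.int = ci)
}

set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)

by_hand <- cor_test_by_hand(wood, heat)
builtin <- cor.test(wood, heat)

# Compare the hand computation with cor.test()
all.equal(by_hand$t, unname(builtin$statistic))
all.equal(by_hand$p.value, builtin$p.value)
all.equal(by_hand$conf.int, as.numeric(builtin$conf.int))
```

Because the t statistic depends only on r and n, a larger sample correlation with the same n = 24 gives a larger t and a smaller p-value, as in the second example above.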
6.2.1 Bayesian Reasoning About Correlation
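The Bayesian approach to inference about the correlation uses the posterior distribution of \(\rho\). Rather than relying on a ready-made function, we write a custom one built on the BayesFactor package. It relies on the fact that, once both variables are converted to z-scores, the slope of the regression of one on the other equals the correlation.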
library(BayesFactor) # provides generalTestBF() and posterior()
bfCorTest <- function (x,y) # Get r from BayesFactor
{
  zx <- scale(x) # z-score of x
  zy <- scale(y) # z-score of y
  zData <- data.frame(x=zx,rhoNot0=zy)
  bfOut <- generalTestBF(x ~ rhoNot0, data=zData) # linear coefficient
  mcmcOut <- posterior(bfOut,iterations=10000) # posterior samples
  print(summary(mcmcOut[,"rhoNot0"])) # Show the HDI for r
  return(bfOut)
}
set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)
bfCorTest(wood,heat)
##
## Iterations = 1:10000
## Thinning interval = 1
## Number of chains = 1
## Sample size per chain = 10000
##
## 1. Empirical mean and standard deviation for each variable,
## plus standard error of the mean:
##
## Mean SD Naive SE Time-series SE
## -0.043339 0.184223 0.001842 0.001842
##
## 2. Quantiles for each variable:
##
## 2.5% 25% 50% 75% 97.5%
## -0.41867 -0.15726 -0.04104 0.07156 0.32879
## Bayes factor analysis
## --------------
## [1] rhoNot0 : 0.385294 ±0%
##
## Against denominator:
## Intercept only
## ---
## Bayes factor type: BFlinearModel, JZS
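The point estimate for \(\rho\) is -0.043 (the posterior mean). The 95% HDI for \(\rho\), approximated here by the 2.5% and 97.5% posterior quantiles, is (-0.42, 0.33). The HDI straddles 0, so a correlation of 0 remains credible. The Bayes factor of 0.385 means that the data are about 2.6 times (1/0.385) more likely under the null model than under the alternative, which is weak evidence in favor of the null hypothesis.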
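The summary output reports equal-tailed quantiles, which coincide with a highest-density interval only when the posterior is symmetric. As a rough sketch of the difference, the code below computes the shortest interval containing 95% of a set of draws and compares it with the quantile interval. The helper name `hdi_of_samples` is hypothetical, and the draws are simulated stand-ins with roughly the reported mean and SD, not the actual posterior samples produced by bfCorTest. The last lines invert the reported Bayes factor to express the evidence in favor of the null.

```r
# Hypothetical helper: shortest interval containing a fraction `mass` of the samples
hdi_of_samples <- function(samples, mass = 0.95) {
  sorted <- sort(samples)
  n      <- length(sorted)
  k      <- ceiling(mass * n)                   # number of draws inside the interval
  starts <- seq_len(n - k + 1)                  # candidate lower endpoints
  widths <- sorted[starts + k - 1] - sorted[starts]
  best   <- which.min(widths)                   # narrowest candidate interval
  c(lower = sorted[best], upper = sorted[best + k - 1])
}

set.seed(12345)
# Stand-in draws (assumption): NOT the real posterior from bfCorTest()
draws <- rnorm(10000, mean = -0.04, sd = 0.18)

hdi <- hdi_of_samples(draws)                    # highest-density interval
eti <- quantile(draws, c(0.025, 0.975))         # equal-tailed interval
hdi
eti

bf10 <- 0.385294    # Bayes factor for rhoNot0 against intercept only (from the output)
bf01 <- 1 / bf10    # Bayes factor in favor of the null
bf01
```

For a roughly symmetric posterior like this one the two intervals should be close, which is why reading the quantiles as an approximate HDI is reasonable here; for a skewed posterior they can differ noticeably.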