6.2 Null Hypothesis Testing On the Correlation

The null hypothesis is that the correlation \(\rho\) is 0.

set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)
cor.test(wood,heat)
## 
##  Pearson's product-moment correlation
## 
## data:  wood and heat
## t = -0.29513, df = 22, p-value = 0.7707
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  -0.4546764  0.3494514
## sample estimates:
##         cor 
## -0.06279774
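The t statistic, p-value, and confidence interval in this output depend only on the sample correlation r and the sample size n. As a minimal sketch in base R (the variable names `tStat`, `pValue`, `ci`, and `byHand` are chosen here for illustration), the t statistic is r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom, and the interval comes from the Fisher z-transformation. The values should agree with the cor.test output above.

```r
# Recreate the simulated data used in the text
set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)

n <- length(wood)
r <- cor(wood, heat)                        # sample Pearson correlation

# t statistic with n - 2 degrees of freedom
tStat <- r * sqrt(n - 2) / sqrt(1 - r^2)
pValue <- 2 * pt(-abs(tStat), df = n - 2)   # two-sided p-value

# 95% confidence interval via the Fisher z-transformation
z <- atanh(r)                               # transform r to the z scale
se <- 1 / sqrt(n - 3)                       # approximate standard error of z
ci <- tanh(z + c(-1, 1) * qnorm(0.975) * se)

byHand <- c(t = tStat, p = pValue, lower = ci[1], upper = ci[2])
print(round(byHand, 4))
```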

The t-value is -0.2951; with 22 degrees of freedom, a t-value with an absolute value below about 2 is not significant at the 0.05 level. The p-value is 0.7707, which is greater than 0.05, so we fail to reject the null hypothesis that the correlation is 0. The 95% confidence interval for \(\rho\) ranges from -0.45 to 0.35. The CI straddles 0, so a correlation of 0 remains plausible, which is consistent with the nonsignificant test. In contrast, in the following case the p-value is less than 0.05, so we reject the null hypothesis that the correlation is 0.

cor.test(wood,(wood/1.41 + heat/1.41))
## 
##  Pearson's product-moment correlation
## 
## data:  wood and (wood/1.41 + heat/1.41)
## t = 3.0604, df = 22, p-value = 0.005731
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
##  0.1834949 0.7782808
## sample estimates:
##       cor 
## 0.5464432
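The second variable in this test is (wood + heat)/1.41, and 1.41 is approximately \(\sqrt{2}\). For two independent variables with equal variance, the population correlation between the first variable and their scaled sum is \(1/\sqrt{2} \approx 0.707\), so a clearly nonzero sample correlation is expected here. The simulation below is a rough check of that value; the seed, the large sample size, and the names `x`, `y`, `z`, and `simCor` are assumptions made for this illustration and are not part of the original analysis.

```r
set.seed(2024)              # arbitrary seed for this illustration
n <- 100000                 # large n so the sample value is close to the population value
x <- rnorm(n)
y <- rnorm(n)               # independent of x
z <- x / 1.41 + y / 1.41    # same construction as in the text

simCor <- cor(x, z)
print(c(simulated = simCor, theoretical = 1 / sqrt(2)))
```

With only 24 observations, as in the text, the sample correlation can fall well away from 0.707, which is why the observed value of 0.546 is not surprising.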

6.2.1 Bayesian Reasoning About Correlation

The Bayesian approach to inference about the correlation is based on the posterior distribution of \(\rho\). Here we write a small custom function, built on the BayesFactor package, to carry out a Bayesian test of the correlation.

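library(BayesFactor) # provides generalTestBF() and posterior()

bfCorTest <- function(x, y) # Bayesian test of the correlation between x and y
{
  zx <- as.numeric(scale(x)) # z-scores of x
  zy <- as.numeric(scale(y)) # z-scores of y
  zData <- data.frame(x = zx, rhoNot0 = zy)
  # With standardized variables, the regression slope estimates the correlation
  bfOut <- generalTestBF(x ~ rhoNot0, data = zData)
  mcmcOut <- posterior(bfOut, iterations = 10000) # posterior samples
  print(summary(mcmcOut[, "rhoNot0"])) # posterior mean and quantiles for the slope
  return(bfOut) # Bayes factor against the intercept-only model
}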

set.seed(12345)
wood <- rnorm(24)
heat <- rnorm(24)
bfCorTest(wood,heat)
## 
## Iterations = 1:10000
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 10000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##           Mean             SD       Naive SE Time-series SE 
##      -0.043339       0.184223       0.001842       0.001842 
## 
## 2. Quantiles for each variable:
## 
##     2.5%      25%      50%      75%    97.5% 
## -0.41867 -0.15726 -0.04104  0.07156  0.32879
## Bayes factor analysis
## --------------
## [1] rhoNot0 : 0.385294 ±0%
## 
## Against denominator:
##   Intercept only 
## ---
## Bayes factor type: BFlinearModel, JZS

The posterior mean of \(\rho\) is -0.043. The 95% interval for \(\rho\), taken from the 2.5% and 97.5% posterior quantiles as an approximation to the HDI, is (-0.42, 0.33). The interval straddles 0, so a correlation of 0 remains plausible. The Bayes factor of 0.385 is below 1, which indicates that the odds are weakly in favor of the null hypothesis (about 1/0.385 ≈ 2.6 to 1).
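The quantiles printed by summary() give an equal-tailed interval. A highest density interval in the strict sense is the narrowest interval that contains 95% of the posterior samples. The sketch below shows one common way to compute it from a vector of samples in base R. The helper name `hdiFromSamples` is hypothetical, and the draws are simulated stand-ins rather than real BayesFactor output; in practice one would pass the sampled values of the `rhoNot0` column instead. The last lines convert the reported Bayes factor into odds in favor of the null.

```r
# Hypothetical helper: narrowest interval containing `mass` of the samples
hdiFromSamples <- function(samples, mass = 0.95) {
  sorted <- sort(samples)
  n <- length(sorted)
  k <- ceiling(mass * n)                    # number of samples inside the interval
  nCand <- n - k + 1                        # number of candidate intervals
  widths <- sorted[k:n] - sorted[1:nCand]   # width of each candidate
  i <- which.min(widths)                    # index of the narrowest candidate
  c(lower = sorted[i], upper = sorted[i + k - 1])
}

set.seed(1)
# Stand-in for posterior samples (an assumption; NOT actual BayesFactor output)
draws <- rnorm(10000, mean = -0.04, sd = 0.18)
hdi <- hdiFromSamples(draws)
print(hdi)

# Bayes factor for the alternative over the null, as reported in the text
bf10 <- 0.385294
bf01 <- 1 / bf10                            # odds in favor of the null
print(bf01)
```

For a roughly symmetric posterior such as this one, the HDI and the equal-tailed interval are nearly identical; they differ more when the posterior is skewed.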