# Introduction

When I was first exposed to statistics, I kept hearing about p-values, null hypotheses, and confidence intervals. These concepts are difficult to understand without examples. Luckily, Stanton (2017) offers a nice introduction to statistics using hands-on examples. This notebook is a summary of the notes I took while reading the book. I will also update it with other statistical concepts that I learn in the future.

Currently, this notebook is structured as follows:

- Statistical Vocabulary: This chapter introduces basic statistical terminology, such as descriptive vs. inferential statistics, parametric vs. nonparametric methods, levels of measurement, and the p-value.
- Probability: This chapter touches on sampling and probability distributions, from which we can estimate population parameters such as the mean and variance.
- Inference: Once we have sample data, inference tells us how confidently we can draw conclusions from it. This chapter also includes the first inferential statistics example, the t-test.
- Bayesian and Frequentist Statistics: This chapter introduces the differences between Bayesian and frequentist statistics. In the chapters that follow, the statistical analyses are conducted using both approaches.
- Comparing Groups: While the t-test compares the means of two groups, this chapter focuses on methods that simultaneously compare the means of any number of groups, such as analysis of variance (ANOVA).
- Associations between Variables: This chapter introduces associations between variables, which describe how multiple variables are related to each other. It discusses measures and tests of association such as the Pearson product-moment correlation, the chi-square test, Kendall's tau, and Spearman's rank-order correlation.
- Linear Multiple Regression: This chapter introduces linear multiple regression, which is a method to predict a continuous variable based on multiple independent variables.
- Interactions in ANOVA and Regression: This chapter introduces interactions between two variables in ANOVA and regression. An interaction is present when the effect of one variable on the response variable depends on the value of the other variable.
- Logistic Regression: Unlike linear regression, which predicts a continuous variable, logistic regression predicts a **categorical** variable.
- Analyzing Change over Time: In this chapter, we consider the dependency among data points collected at different points in time. We examine two configurations of data: repeated measures and time series.
- Dealing with Too Many Variables: This chapter introduces methods to deal with too many variables, such as principal component analysis (PCA) and factor analysis.

At the end of this notebook, I also include a summary diagram of the statistical analysis methods, organized by statistics family, number of groups, variable types, and assumptions.

### References

Stanton, Jeffrey M. 2017. *Reasoning with Data: An Introduction to Traditional and Bayesian Statistics Using R*. Guilford Publications.