Statistical inference
Statistical inference means drawing conclusions about a population from sample data. There are two main types of inference:
- estimation
- hypothesis testing
Hypothesis
A statement about a population, claiming that a parameter takes a particular numerical value or falls in a certain range of values.
General Procedure (5 Steps)
- Look at assumptions
- State the hypothesis
- Find the test statistic, and its null distribution
- Find the $p$-value and interpret it
- Make a conclusion
1. Assumptions
If the assumptions do not hold, the test does not have the properties it relies on, and its conclusions may be invalid.
Generally, the most important assumption is randomisation but there might be other assumptions, such as sample size and distribution shape.
2. Stating Hypotheses
Null hypothesis
A statement that the parameter takes a particular value, denoted $H_0$.
Alternative hypothesis
A statement that the parameter falls in some alternative range of values, denoted $H_1$.
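The alternative hypothesis determines the side of the test. Writing $\theta$ for the parameter and $\theta_0$ for its value under $H_0$:
- $H_1: \theta \neq \theta_0$ (parameter not equal to the value under $H_0$): two-sided test
- $H_1: \theta > \theta_0$ (parameter larger than the value under $H_0$): right-sided test
- $H_1: \theta < \theta_0$ (parameter smaller than the value under $H_0$): left-sided test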
3. Test Statistic and Null Distribution
The value of the test statistic requires
- the value of the point estimate, and its sampling distribution
- the parameter value specified under $H_0$
The distribution of a test statistic under $H_0$ is called the null distribution.
4. p-value
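The test begins with the assumption that $H_0$ is true. The $p$-value is the probability, computed under $H_0$, of obtaining a test statistic at least as extreme as the one observed.
If the test statistic calculated in step 3 is far out in the tail of the null distribution, it is too far from what $H_0$ predicts: the $p$-value is small, which is evidence against $H_0$.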
5. Conclude
Given a significance level $\alpha$:
- if the $p$-value $< \alpha$, reject $H_0$
- otherwise, do NOT reject $H_0$
Errors
Type I
When $H_0$ is rejected but it is true. The probability of this error is denoted $\alpha$.
Type II
When $H_0$ is not rejected but it is false. The probability of this error is denoted $\beta$.
Power
The power of the test is defined to be $1 - \beta$, which is the probability of correctly rejecting $H_0$ when it is false.
For a fixed sample size, the two error probabilities cannot be reduced simultaneously.
- If $\alpha$ is smaller, $H_0$ is rejected less often.
- When $H_0$ is retained more often, the probability of retaining it when it is not correct increases.
- So the probability of a Type II error increases (while the probability of a Type I error decreases).
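The trade-off between the two error probabilities can be illustrated numerically. The sketch below assumes a right-sided z-test of a mean with known standard deviation; the values of `mu0`, `mu1`, `sigma` and `n` are hypothetical and chosen only for illustration.

```r
# Right-sided z-test of H0: mu = mu0 against H1: mu > mu0, with known sigma.
# All numbers below are hypothetical.
mu0   <- 0      # value under H0
mu1   <- 0.5    # one specific value under H1, used to evaluate the Type II error
sigma <- 1
n     <- 25
se    <- sigma / sqrt(n)

type2_error <- function(alpha) {
  # H0 is rejected when the sample mean exceeds this cutoff
  cutoff <- mu0 + qnorm(1 - alpha) * se
  # P(do not reject H0) when the true mean is mu1
  pnorm(cutoff, mean = mu1, sd = se)
}

alphas <- c(0.10, 0.05, 0.01)
betas  <- sapply(alphas, type2_error)
print(data.frame(alpha = alphas, beta = betas, power = 1 - betas))
```

In the printed table, `beta` grows and `power` shrinks as `alpha` is made smaller, with the sample size held fixed.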
Relation to Confidence Intervals
A confidence interval and a significance test give consistent conclusions when
- the confidence level of the confidence interval is $1 - \alpha$ and the significance level of the test is $\alpha$,
- the test is two-sided, and
- both the CI and the test use the same standard error.
When these hold, if the parameter value specified under $H_0$ lies within the confidence interval, we do not reject $H_0$; if it lies outside, we reject $H_0$.
Example
If the $(1 - \alpha)$ confidence interval is $(l, u)$:
- when the value specified under $H_0$ lies inside $(l, u)$, we retain $H_0$
- when the value specified under $H_0$ lies outside $(l, u)$, we reject $H_0$.
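The consistency can be checked numerically. The following sketch uses simulated (hypothetical) data and R's `t.test`, which reports both a confidence interval and a two-sided $p$-value based on the same standard error; the grid of candidate null values `mu0_grid` is an assumption made for illustration.

```r
set.seed(1)
x     <- rnorm(30, mean = 10, sd = 2)   # hypothetical sample
alpha <- 0.05

# Confidence interval at level 1 - alpha
ci <- t.test(x, conf.level = 1 - alpha)$conf.int

# For each candidate null value, compare "inside the CI"
# with "H0 retained by the two-sided test at level alpha"
mu0_grid  <- seq(8, 12, by = 0.25)
inside_ci <- mu0_grid > ci[1] & mu0_grid < ci[2]
retained  <- sapply(mu0_grid, function(mu0) t.test(x, mu = mu0)$p.value >= alpha)

print(all(inside_ci == retained))
```

Every null value inside the interval is retained and every value outside it is rejected, so the two columns of decisions agree.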
Hypothesis Testing for Population Proportion
1. Assumptions
- Variable is categorical
- Data is obtained using randomisation
- Sample size $n$ is sufficiently large that the sampling distribution of the sample proportion is approximately normal when the null is true. Checked using $np_0$ and $n(1 - p_0)$, the expected numbers of successes and failures, where $p_0$ is the value specified in $H_0$.
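2. Hypothesis
$H_0: p = p_0$ against one of $H_1: p \neq p_0$, $H_1: p > p_0$ or $H_1: p < p_0$.
3. Test Statistic
With the sample proportion $\hat{p}$ and sample size $n$, the test statistic is
$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0) / n}}$$
Note that $Z$ approximately follows the standard normal distribution $N(0, 1)$ when $H_0$ is true.
4. Calculate p-value
Compute in R using `pnorm`, the standard normal CDF:
- left-sided test: `pnorm(Z)`
- right-sided test: `1 - pnorm(Z)`
- two-sided test: `2 * (1 - pnorm(abs(Z)))`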
5. Interpret
Reject or retain $H_0$ based on the $p$-value:
- If it is small (less than $\alpha$), reject $H_0$
- Otherwise, do not reject $H_0$
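The five steps for a proportion can be sketched in R as follows. The data (`x` successes out of `n` trials), the null value `p0` and the significance level `alpha` are hypothetical, and the two-sided alternative is assumed.

```r
# Hypothetical data: 60 successes out of n = 100 trials
x     <- 60
n     <- 100
p0    <- 0.5      # value specified under H0
alpha <- 0.05

# Step 1: the large-sample check uses the null value p0
expected <- c(successes = n * p0, failures = n * (1 - p0))

# Step 3: test statistic, with the standard error computed under H0
p_hat <- x / n
z     <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Step 4: p-value for the two-sided alternative H1: p != p0
p_value <- 2 * (1 - pnorm(abs(z)))

# Step 5: decision at significance level alpha
reject <- p_value < alpha

print(expected)
print(c(z = z, p_value = p_value))
print(reject)
```

For comparison, `prop.test(x, n, p = p0, correct = FALSE)` should report the same two-sided $p$-value, since its chi-squared statistic without continuity correction is the square of `z`.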
Hypothesis Testing for Mean
1. Assumptions
- Variable is quantitative
- Data is obtained using randomisation
- Population is approximately normal (crucial when $n$ is small).
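2. Hypothesis
$H_0: \mu = \mu_0$ against one of $H_1: \mu \neq \mu_0$, $H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$.
3. Test Statistic
With the sample mean $\bar{x}$, sample standard deviation $s$ and sample size $n$, the test statistic is
$$T = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
Note that if $H_0$ is true, $T$ follows a $t$ distribution with $n - 1$ degrees of freedom.
4. Calculate p-value
Compute the $p$-value from the $t$ distribution with $n - 1$ degrees of freedom, for example in R using `pt`, with `t_obs` denoting the observed value of $T$:
- left-sided test: `pt(t_obs, df = n - 1)`
- right-sided test: `1 - pt(t_obs, df = n - 1)`
- two-sided test: `2 * (1 - pt(abs(t_obs), df = n - 1))`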
5. Interpret
Reject or retain $H_0$ based on the $p$-value:
- If it is small (less than $\alpha$), reject $H_0$
- Otherwise, do not reject $H_0$
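As a minimal sketch of the test for a mean, the code below computes the statistic and two-sided $p$-value by hand and compares them with R's built-in `t.test`. The sample `x`, the null value `mu0` and the level `alpha` are hypothetical.

```r
x     <- c(5.1, 4.9, 5.6, 5.8, 5.2, 5.4, 5.0, 5.5)   # hypothetical sample
mu0   <- 5                                           # value specified under H0
alpha <- 0.05

# Step 3: test statistic and its degrees of freedom
n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))
df     <- n - 1

# Step 4: two-sided p-value from the t distribution
p_value <- 2 * pt(-abs(t_stat), df = df)

# Step 5: decision
reject <- p_value < alpha

# The built-in test should agree with the manual calculation
builtin <- t.test(x, mu = mu0)
print(c(manual = p_value, builtin = builtin$p.value))
```

For a one-sided test, `t.test` accepts `alternative = "greater"` or `alternative = "less"`.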
Hypothesis Testing for Two Independent Samples with Equal Variance
1. Assumptions
- Variable is quantitative
- Samples are independent
- Population distribution of each group is approximately normal
- Variances are the same
The equal-variance assumption can be checked using the equal variance test.
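2. Hypothesis
$H_0: \mu_1 = \mu_2$ against one of $H_1: \mu_1 \neq \mu_2$, $H_1: \mu_1 > \mu_2$ or $H_1: \mu_1 < \mu_2$.
3. Test Statistic
With the sample means $\bar{x}_1, \bar{x}_2$, sample standard deviations $s_1, s_2$ and sample sizes $n_1, n_2$, the test statistic is
$$T = \frac{\bar{x}_1 - \bar{x}_2}{se}, \qquad se = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}, \qquad s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$
In this formula:
- $s_p^2$ refers to the pooled estimate of the common variance
- $se$ is the standard error.
Note that if $H_0$ is true, $T$ follows a $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom.
4. Calculate p-value
Compute the $p$-value from the $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom, for example in R using `pt`.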
5. Interpret
Reject or retain $H_0$ based on the $p$-value:
- If it is small (less than $\alpha$), reject $H_0$
- Otherwise, do not reject $H_0$
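The pooled two-sample test can be sketched in R as below. The two samples are hypothetical, the two-sided alternative is assumed, and `var.test` (an F test from base R) is used here as one possible equal variance check.

```r
# Hypothetical independent samples
x1 <- c(20.1, 22.4, 19.8, 21.5, 23.0, 20.7)
x2 <- c(18.9, 19.5, 21.0, 18.2, 19.9, 20.3, 18.7)
n1 <- length(x1)
n2 <- length(x2)

# Step 1: one way to check the equal-variance assumption (F test)
equal_var_p <- var.test(x1, x2)$p.value

# Step 3: pooled variance, standard error and test statistic
sp2    <- ((n1 - 1) * var(x1) + (n2 - 1) * var(x2)) / (n1 + n2 - 2)
se     <- sqrt(sp2 * (1 / n1 + 1 / n2))
t_stat <- (mean(x1) - mean(x2)) / se
df     <- n1 + n2 - 2

# Step 4: two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df = df)

# The built-in pooled test should agree with the manual calculation
builtin <- t.test(x1, x2, var.equal = TRUE)
print(c(manual = p_value, builtin = builtin$p.value))
```

Setting `var.equal = TRUE` is what requests the pooled version; without it, `t.test` does not assume equal variances.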
Hypothesis Testing for Two Independent Samples with Unequal Variance
1. Assumptions
- Variable is quantitative
- Samples are independent
- Population distribution of each group is approximately normal
Whether the variances are equal can be checked using the equal variance test.
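2. Hypothesis
$H_0: \mu_1 = \mu_2$ against one of $H_1: \mu_1 \neq \mu_2$, $H_1: \mu_1 > \mu_2$ or $H_1: \mu_1 < \mu_2$.
3. Test Statistic
With the sample means $\bar{x}_1, \bar{x}_2$, sample standard deviations $s_1, s_2$ and sample sizes $n_1, n_2$, the test statistic is
$$T = \frac{\bar{x}_1 - \bar{x}_2}{se}, \qquad se = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
In this formula:
- $se$ is the standard error.
Note that if $H_0$ is true, $T$ approximately follows a $t$ distribution. The degrees of freedom are usually obtained from software; R's `t.test` uses the Welch-Satterthwaite approximation.
4. Calculate p-value
Compute the $p$-value from the $t$ distribution with the approximate degrees of freedom, for example in R using `pt`.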
5. Interpret
Reject or retain $H_0$ based on the $p$-value:
- If it is small (less than $\alpha$), reject $H_0$
- Otherwise, do not reject $H_0$
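A sketch of the unequal-variance version follows, again with hypothetical samples and a two-sided alternative. The degrees of freedom use the Welch-Satterthwaite approximation, which is an assumption here about which approximation is intended; it is the one R's `t.test` applies by default.

```r
# Hypothetical independent samples with visibly different spreads
x1 <- c(20.1, 22.4, 19.8, 21.5, 23.0, 20.7)
x2 <- c(15.2, 24.8, 18.1, 27.5, 12.9, 21.4, 19.0)
n1 <- length(x1)
n2 <- length(x2)
v1 <- var(x1)
v2 <- var(x2)

# Step 3: unpooled standard error and test statistic
se     <- sqrt(v1 / n1 + v2 / n2)
t_stat <- (mean(x1) - mean(x2)) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df <- (v1 / n1 + v2 / n2)^2 /
  ((v1 / n1)^2 / (n1 - 1) + (v2 / n2)^2 / (n2 - 1))

# Step 4: two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df = df)

# t.test does not assume equal variances unless var.equal = TRUE is given
builtin <- t.test(x1, x2)
print(c(manual = p_value, builtin = builtin$p.value))
```

The approximate degrees of freedom are generally not an integer, which `pt` handles without any extra work.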
Hypothesis Testing for Two Dependent Samples
Also known as the paired t-test
1. Assumptions
- Variable is quantitative
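- Samples are dependent: every observation has a matched value in the other sample
- Population distribution of the differences between matched values is approximately normal
2. Hypothesis
Let the difference for the $i$-th matched pair be $d_i = x_{1i} - x_{2i}$, with population mean $\mu_d$. The hypotheses are $H_0: \mu_d = 0$ against one of $H_1: \mu_d \neq 0$, $H_1: \mu_d > 0$ or $H_1: \mu_d < 0$.
With the given hypothesis, the procedure in Hypothesis Testing for Mean can be applied to the differences $d_i$.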
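The reduction to a one-sample test can be seen directly in R. In this sketch the `before` and `after` measurements are hypothetical matched pairs; the paired test and the one-sample test on the differences should give the same result.

```r
# Hypothetical matched measurements on the same 8 subjects
before <- c(72, 80, 65, 90, 77, 84, 69, 75)
after  <- c(70, 78, 66, 85, 75, 80, 70, 71)

# Differences for each matched pair
d <- after - before

# One-sample t-test of H0: mu_d = 0 applied to the differences
one_sample <- t.test(d, mu = 0)

# Paired t-test on the two samples
paired <- t.test(after, before, paired = TRUE)

print(c(one_sample = one_sample$p.value, paired = paired$p.value))
```

Both calls test the same hypothesis with the same statistic and $n - 1$ degrees of freedom, where $n$ is the number of pairs.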