Statistics and Probability - Hypothesis Testing (Chi-squared test, t-tests)

Grade 12 IB

Review the key concepts, formulae, and examples before starting your quiz.

🔑Concepts

Null and Alternative Hypotheses ($H_0$ and $H_1$): The Null Hypothesis ($H_0$) represents the status quo or a statement of 'no effect' or 'independence'. The Alternative Hypothesis ($H_1$) is the claim we are testing for. Visually, $H_0$ is represented by the central portion of a probability distribution curve, while $H_1$ points toward the 'rejection regions' or tails of the distribution.

Significance Level ($\alpha$) and P-value: The significance level $\alpha$ (typically $0.05$ or $5\%$) is the probability threshold for rejecting $H_0$. The p-value is the probability of observing results at least as extreme as the sample data, assuming $H_0$ is true. Visually, the p-value is the area under the curve in the tail(s) beyond the test statistic; if this area is smaller than the area defined by $\alpha$, we reject $H_0$.

Chi-squared ($\chi^2$) Test for Independence: This test evaluates whether two categorical variables are independent based on a contingency table. Visually, imagine a table comparing 'Observed' ($O$) frequencies against 'Expected' ($E$) frequencies. The $\chi^2$ distribution curve is positively skewed (starting at zero and trailing to the right), and a high $\chi^2$ value moves the result into the shaded rejection region in the right tail.

Degrees of Freedom ($df$): This parameter defines the specific shape of the $\chi^2$ or t-distribution curve. For a $\chi^2$ test of independence, $df = (r - 1)(c - 1)$, where $r$ is the number of rows and $c$ is the number of columns in the contingency table. Visually, as degrees of freedom increase, the peak of the $\chi^2$ curve shifts to the right and becomes more symmetrical, resembling a normal distribution.
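
As a rough illustration of how the expected frequencies, the $\chi^2$ statistic, and the degrees of freedom fit together for a test of independence, here is a minimal Python sketch. The 2x3 table of observed counts is made up purely for illustration, and the variable names are not from any particular syllabus or library.

```python
# Hypothetical 2x3 contingency table of observed counts (illustrative data only).
observed = [
    [20, 30, 50],
    [30, 30, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared statistic: sum of (O - E)^2 / E over every cell.
chi_sq = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom for an r x c table: (r - 1)(c - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(expected)              # expected counts, row by row
print(round(chi_sq, 3), df)  # test statistic and degrees of freedom
```

For this made-up table the statistic is about $3.11$ with $df = 2$, which is below the $5\%$ critical value of $5.991$ from a standard $\chi^2$ table, so $H_0$ (independence) would not be rejected.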

One-sample and Two-sample t-tests: These tests compare means. A one-sample t-test compares a sample mean to a hypothesized population mean, while a two-sample t-test compares the means of two independent groups. Visually, the t-distribution looks like a bell-shaped Normal curve but has 'fatter tails' to account for greater uncertainty in smaller samples.

Critical Values and Rejection Regions: The critical value is the 'cut-off' point on the x-axis of a distribution. Visually, this value separates the 'fail to reject' region (the main body of the curve) from the 'rejection region' (the tails). If your calculated test statistic ($t$ or $\chi^2$) falls beyond this value into the tail, the result is statistically significant.

One-tailed vs. Two-tailed Tests: A one-tailed test looks for a change in a specific direction ($>$ or $<$), while a two-tailed test looks for any difference ($\neq$). Visually, a one-tailed test places the entire significance level $\alpha$ in one tail, whereas a two-tailed test splits $\alpha$ into two equal areas of $\frac{\alpha}{2}$ at both ends of the distribution.
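
To see how splitting $\alpha$ moves the cut-off, the sketch below computes one-tailed and two-tailed critical values. It uses the standard normal distribution as a stand-in, because Python's standard library has no t-distribution; the same idea applies to $t$, with somewhat larger cut-offs for small samples.

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # standard normal, used here as a stand-in for the t-distribution

# One-tailed (upper) test: all of alpha sits in the right tail.
one_tailed_cut = z.inv_cdf(1 - alpha)

# Two-tailed test: alpha/2 in each tail, so the cut-offs are +/- this value.
two_tailed_cut = z.inv_cdf(1 - alpha / 2)

print(round(one_tailed_cut, 3), round(two_tailed_cut, 3))  # 1.645 1.96
```

The two-tailed cut-off is further from the centre because each tail only holds half of $\alpha$.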

📐Formulae

$\chi^2_{calc} = \sum \frac{(f_o - f_e)^2}{f_e}$

$f_e = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$

$df = (r - 1)(c - 1)$ (for Independence tests)

$df = n - 1$ (for Goodness of Fit, where $n$ is the number of categories, or for a One-sample t-test, where $n$ is the sample size)

$t = \frac{\bar{x} - \mu}{\frac{s_{n-1}}{\sqrt{n}}}$ (One-sample t-test statistic)

$s_{n-1} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$ (Unbiased sample standard deviation)
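
The one-sample t statistic above can be sketched in a few lines of Python. The sample values and the hypothesized mean `mu_0` below are invented for illustration; `statistics.stdev` divides by $n - 1$, so it matches $s_{n-1}$.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample and hypothesized population mean (illustrative only).
sample = [52, 48, 55, 51, 49, 53, 50, 54]
mu_0 = 50

n = len(sample)
x_bar = mean(sample)
s = stdev(sample)  # unbiased sample standard deviation (divides by n - 1)

# One-sample t statistic: (sample mean - hypothesized mean) / standard error.
t = (x_bar - mu_0) / (s / sqrt(n))
df = n - 1

print(round(t, 3), df)  # 1.732 7
```

The statistic would then be compared with a critical value (or converted to a p-value on a GDC) using $df = n - 1 = 7$.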

💡Examples

Problem 1:

A researcher wants to test if a die is fair. It is rolled $60$ times with the following frequencies: 1 (7), 2 (12), 3 (8), 4 (11), 5 (9), 6 (13). Perform a $\chi^2$ Goodness of Fit test at the $5\%$ significance level.

Solution:

  1. Hypotheses: $H_0$: The die is fair (all probabilities are $\frac{1}{6}$). $H_1$: The die is not fair.
  2. Expected Frequencies: Since the total $n = 60$, for a fair die, each face should appear $E = \frac{1}{6} \times 60 = 10$ times.
  3. Calculate $\chi^2$: $\chi^2 = \frac{(7-10)^2}{10} + \frac{(12-10)^2}{10} + \frac{(8-10)^2}{10} + \frac{(11-10)^2}{10} + \frac{(9-10)^2}{10} + \frac{(13-10)^2}{10}$, so $\chi^2 = \frac{9+4+4+1+1+9}{10} = \frac{28}{10} = 2.8$.
  4. Degrees of Freedom: $df = k - 1 = 6 - 1 = 5$, where $k$ is the number of categories.
  5. P-value/Critical Value: Using a GDC or table, for $df = 5$ and $\chi^2 = 2.8$, the $p$-value is $\approx 0.731$.
  6. Conclusion: Since $p > 0.05$ (or $2.8 < 11.07$, the critical value), we fail to reject $H_0$. There is no significant evidence that the die is unfair.
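
The steps above can be checked with a short Python sketch. It uses the observed counts from the problem and the $5\%$ critical value of $11.07$ quoted in the conclusion; it does not compute the p-value, which would normally come from a GDC.

```python
# Observed counts for faces 1..6, taken from the problem statement.
observed = [7, 12, 8, 11, 9, 13]

n = sum(observed)               # total number of rolls
expected = n / len(observed)    # a fair die gives the same expected count per face

# Chi-squared statistic: sum of (O - E)^2 / E over the six faces.
chi_sq = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1

critical_value = 11.07          # 5% critical value for df = 5, as quoted in the solution
reject_h0 = chi_sq > critical_value

print(round(chi_sq, 2), df, reject_h0)  # 2.8 5 False
```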

Explanation:

This is a Goodness of Fit test because we are comparing observed data against a theoretical distribution (uniform distribution). We calculate the 'squared difference' for each outcome, scale it by the expected frequency, and sum them up.

Problem 2:

A study compares the exam scores of two independent groups. Group A ($n_1 = 10$) has a mean of $72$ and $s_1 = 5$. Group B ($n_2 = 12$) has a mean of $68$ and $s_2 = 6$. Test if Group A performed significantly better than Group B at a $5\%$ level (assume equal variances).

Solution:

  1. Hypotheses: $H_0: \mu_1 = \mu_2$, $H_1: \mu_1 > \mu_2$ (One-tailed test).
  2. Parameters: Group A: $\bar{x}_1 = 72, s_1 = 5, n_1 = 10$. Group B: $\bar{x}_2 = 68, s_2 = 6, n_2 = 12$.
  3. Test Statistic: Enter data into GDC for a 2-sample t-test with the pooled (equal variances) option, which uses $df = n_1 + n_2 - 2 = 20$.
  4. Results: The GDC calculates $t \approx 1.677$ and $p \approx 0.055$.
  5. Conclusion: Since $p \approx 0.055 > 0.05$, we fail to reject $H_0$ at the $5\%$ significance level.
  6. Interpretation: While Group A had a higher average, the difference is not statistically significant at the $5\%$ level.
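
For readers who want to see what the GDC is doing, here is a minimal sketch of the pooled two-sample t statistic computed from the summary statistics in the problem. It assumes $s_1$ and $s_2$ are the unbiased sample standard deviations ($s_{n-1}$), and it compares $t$ with the one-tailed $5\%$ critical value for $df = 20$ from a standard t-table ($1.725$) instead of computing a p-value.

```python
from math import sqrt

# Summary statistics from the problem (s1, s2 assumed to be s_{n-1}).
n1, mean1, s1 = 10, 72, 5
n2, mean2, s2 = 12, 68, 6

# Pooled variance: weighted average of the two sample variances.
df = n1 + n2 - 2
pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df

# Standard error of the difference in means under equal variances.
std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
t = (mean1 - mean2) / std_error

critical_value = 1.725  # one-tailed 5% value for df = 20, from a standard t-table
reject_h0 = t > critical_value

print(round(t, 2), df, reject_h0)  # 1.68 20 False
```

The statistic comes out at roughly $1.68$, just short of the critical value, which agrees with the decision to not reject $H_0$.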

Explanation:

This is a two-sample t-test for independent means. Because we are testing if one group is specifically 'better' than the other, it is a one-tailed test. The p-value is the probability, assuming $H_0$ is true, of obtaining a difference at least as large as the one observed; it is not the probability that the difference is due to random variation.