Statistics and Probability - Correlation and linear regression

Grade 10 IB

Review the key concepts, formulae, and examples before starting your quiz.

🔑Concepts

•

x

•

Correlation (Direction and Strength): Correlation describes the linear relationship between two variables. On a scatter diagram, a positive correlation shows points trending upwards from left to right, while a negative correlation shows points trending downwards. The 'strength' refers to how closely the points cluster around a straight line: 'strong' if they lie close to the line and 'weak' if they are widely scattered.

•

r

•

y

•

y = mx + c

•

Interpolation and Extrapolation: Interpolation is predicting a value within the range of the given data set, which is generally reliable. Extrapolation is predicting a value outside that range, shown on a graph by extending the line beyond the plotted points; it is often unreliable because the linear trend may not continue.

•

Causation vs. Correlation: It is crucial to remember that a strong correlation between two variables does not necessarily mean that one causes the other. There may be a 'lurking variable' influencing both, or the relationship may be coincidental.
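The direction and strength described above can also be checked numerically. The sketch below computes Pearson's correlation coefficient for a small data set and labels the result. The data values, the function names `pearson_r` and `describe`, and the 0.7 cut-off between 'weak' and 'strong' are all assumptions made for illustration; textbooks differ on where that boundary lies.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of products and squares of the deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

def describe(r, strong_cutoff=0.7):
    """Label r by direction and strength (strong_cutoff is an assumed convention)."""
    if r == 0:
        return "no linear correlation"
    direction = "positive" if r > 0 else "negative"
    strength = "strong" if abs(r) >= strong_cutoff else "weak"
    return f"{strength} {direction}"

# Made-up illustrative data: hours studied and test scores
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
r = pearson_r(hours, scores)
print(round(r, 3), describe(r))
```

A label such as "strong positive" still says nothing about cause: it only summarises how closely the points follow a straight line.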

📐Formulae

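Mean of x: \bar{x} = \frac{\sum x}{n}

Mean of y: \bar{y} = \frac{\sum y}{n}

Mean point: (\bar{x}, \bar{y})

Equation of the regression line (gradient m, y-intercept c): y = mx + c

Range of the correlation coefficient: -1 \leq r \leq 1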
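To see how these formulae fit together, here is a minimal sketch that finds the mean point and a line y = mx + c for a small data set. It assumes the line is fitted by least squares, which may differ from a line of best fit drawn by eye; the data values and the function names `mean_point` and `regression_line` are hypothetical.

```python
def mean_point(xs, ys):
    """Return (x_bar, y_bar), the mean point of the data."""
    n = len(xs)
    return sum(xs) / n, sum(ys) / n

def regression_line(xs, ys):
    """Least-squares gradient m and intercept c for y = mx + c."""
    x_bar, y_bar = mean_point(xs, ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    # Choosing c this way forces the line through the mean point
    c = y_bar - m * x_bar
    return m, c

# Made-up illustrative data
xs = [2, 4, 6, 8, 10]
ys = [5, 9, 12, 17, 22]

x_bar, y_bar = mean_point(xs, ys)
m, c = regression_line(xs, ys)
print("mean point:", (x_bar, y_bar))
print("gradient m:", round(m, 3), "intercept c:", round(c, 3))
# Substituting x_bar into y = mx + c should give back y_bar
print("line at x_bar:", round(m * x_bar + c, 3))
```

The last line is a useful self-check: a least-squares regression line always passes through (\bar{x}, \bar{y}).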

💡Examples

Problem 1:

x

Solution:

(\bar{x}, \bar{y})

Explanation:

x = 7

Problem 2:

r = -0.85

Solution:

r = -0.85

Explanation:

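The correlation coefficient r gives both the direction of a linear relationship (its sign) and the strength (how close its size is to 1), so r = -0.85 indicates a strong negative correlation. Extrapolation is risky because it assumes the linear model still holds far beyond the observed data points.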
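The difference between interpolation and extrapolation can be made explicit when using a regression line to predict. The sketch below is illustrative only: the gradient 2.1, the intercept 0.4, the x-values, and the function name `predict` are assumptions, not taken from the problems above.

```python
def predict(m, c, x, x_data):
    """Predict y = m*x + c and report whether x lies inside the data range."""
    y = m * x + c
    inside = min(x_data) <= x <= max(x_data)
    kind = "interpolation" if inside else "extrapolation"
    return y, kind

# Hypothetical regression line y = 2.1x + 0.4 fitted to data with x from 2 to 10
x_data = [2, 4, 6, 8, 10]
for x in (5, 20):
    y, kind = predict(2.1, 0.4, x, x_data)
    print(f"x = {x}: predicted y = {y:.1f} ({kind})")
```

A prediction flagged as extrapolation is not necessarily wrong, but it deserves less trust the further x lies from the observed range.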