Review the key concepts, formulae, and examples before starting your quiz.
🔑Concepts
Bivariate Data: Data made up of paired values of two variables, collected to investigate whether there is a relationship between them.
Scatter Diagram: A graph where individual data points are plotted to visualize the relationship between two variables.
Positive Correlation: As one variable increases, the other variable also tends to increase.
Negative Correlation: As one variable increases, the other variable tends to decrease.
Zero/No Correlation: No apparent relationship between the two variables; points are scattered randomly.
Strength of Correlation: Described as 'Strong' if points are close to a straight line, or 'Weak' if they are widely spread.
Line of Best Fit: A straight line drawn through the middle of the data points, used to make predictions.
Mean Point: The point $(\bar{x}, \bar{y})$, whose coordinates are the mean of the x-values and the mean of the y-values. The line of best fit must always pass through it.
Interpolation: Predicting a value within the range of the given data (usually reliable).
Extrapolation: Predicting a value outside the range of the given data (often unreliable).
Correlation vs Causation: A correlation between two variables does not necessarily mean that one causes the other.
📐Formulae
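Mean of x: $\bar{x} = \frac{\sum x}{n}$
Mean of y: $\bar{y} = \frac{\sum y}{n}$
Equation of the line of best fit: $y = mx + c$, where $m$ is the gradient and $c$ is the y-intercept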
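As a rough sketch of how the mean point ties into the equation of a line, the snippet below assumes the line has the form y = mx + c and that a gradient has already been estimated (for example, read off a hand-drawn line). The function name `line_through_mean` and the sample data are hypothetical and only for illustration.

```python
def line_through_mean(points, gradient):
    """Return (gradient, intercept) of the line with the given gradient
    that passes through the mean point of the data."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n  # mean of the x-values
    y_bar = sum(y for _, y in points) / n  # mean of the y-values
    # Rearranging y_bar = gradient * x_bar + c gives the intercept c
    intercept = y_bar - gradient * x_bar
    return gradient, intercept


# Hypothetical data, not taken from the examples in this guide
sample_points = [(1, 3), (2, 5), (3, 7)]
m, c = line_through_mean(sample_points, 2.0)
print(m, c)  # 2.0 1.0
```

Fixing the intercept this way guarantees the line passes through the mean point, whatever gradient is chosen.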
💡Examples
Problem 1:
A student collects data on the temperature ($x$, in °C) and the number of ice creams sold ($y$). The data points are: (20, 50), (22, 60), (24, 75), (26, 90), (28, 105). Describe the correlation and calculate the mean point.
Solution:
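Correlation: Strong Positive Correlation. Mean point: $\bar{x} = \frac{20 + 22 + 24 + 26 + 28}{5} = 24$, $\bar{y} = \frac{50 + 60 + 75 + 90 + 105}{5} = 76$. Mean point = $(24, 76)$.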
Explanation:
The correlation is positive because as temperature increases, ice cream sales increase. It is strong because the points follow a clear linear path. The mean point is found by averaging the x-values and y-values separately.
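The arithmetic in Problem 1 can be checked with a short script. This is only a sketch: the variable names are illustrative, and the "always rising" test is a simple check of direction for this particular data set, not a general measure of correlation.

```python
# Data from Problem 1: (temperature in °C, ice creams sold)
points = [(20, 50), (22, 60), (24, 75), (26, 90), (28, 105)]

xs = [x for x, _ in points]
ys = [y for _, y in points]

# Average the x-values and the y-values separately
mean_point = (sum(xs) / len(xs), sum(ys) / len(ys))

# With the points ordered by temperature, do sales rise at every step?
ordered = sorted(points)
always_rising = all(b[1] > a[1] for a, b in zip(ordered, ordered[1:]))

print(mean_point)     # (24.0, 76.0)
print(always_rising)  # True
```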
Problem 2:
Using a scatter diagram, a line of best fit is drawn for the relationship between hours spent gaming and exam scores. The equation is . If a student games for 10 hours, what is their predicted score? Is this interpolation if the data range was 0 to 8 hours?
Solution:
Predicted score: . This is Extrapolation.
Explanation:
Substitute 10 hours into the linear equation. Since 10 hours is outside the original data range (0 to 8 hours), the prediction is an extrapolation and may not be accurate.
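The range check behind this answer can be sketched as a small helper. The function name `classify_prediction` is hypothetical, and values on the boundary of the data range are treated here as interpolation.

```python
def classify_prediction(x, data_min, data_max):
    """Label a prediction at x as interpolation or extrapolation,
    given the smallest and largest x-values in the data."""
    if data_min <= x <= data_max:
        return "interpolation"
    return "extrapolation"


# Problem 2: data collected for 0 to 8 hours of gaming
print(classify_prediction(10, 0, 8))  # extrapolation
print(classify_prediction(5, 0, 8))   # interpolation
```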
Problem 3:
Explain why a line of best fit should pass through the mean point $(\bar{x}, \bar{y})$.
Solution:
The mean point represents the 'center' of the bivariate data set.
Explanation:
Mathematically, the line of best fit (specifically the least squares regression line) always passes through $(\bar{x}, \bar{y})$, the average values of the two variables. Drawing the line through the mean point ensures it is balanced among the data points.
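To see this numerically, the sketch below fits a least squares line to the Problem 1 data and checks that it passes through the mean point. It assumes the standard least squares formulas (gradient = S_xy / S_xx, intercept = ȳ − gradient × x̄); the function name `least_squares_line` is illustrative.

```python
def least_squares_line(points):
    """Return (gradient, intercept, mean_point) of the least squares line."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    # Sums of products and squares of deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    gradient = s_xy / s_xx
    intercept = y_bar - gradient * x_bar
    return gradient, intercept, (x_bar, y_bar)


# Data from Problem 1: (temperature in °C, ice creams sold)
ice_cream = [(20, 50), (22, 60), (24, 75), (26, 90), (28, 105)]
m, c, (x_bar, y_bar) = least_squares_line(ice_cream)

print(m, c)                               # 7.0 -92.0
print(abs(m * x_bar + c - y_bar) < 1e-9)  # True: the line passes through (24, 76)
```

Because the intercept is defined as ȳ − gradient × x̄, substituting x = x̄ into the fitted line always returns ȳ.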