Review the key concepts, formulae, and examples before starting your quiz.
Concepts
Scatter Diagrams: A scatter diagram plots each pair of bivariate data as a single point on a Cartesian plane. If the cluster of points rises from bottom-left to top-right, it indicates a positive correlation. If the cluster slopes downward from top-left to bottom-right, it indicates a negative correlation. A random spread of points suggests zero correlation.
Pearson's Coefficient of Correlation ($r$): This numerical value measures the strength and direction of a linear relationship between two variables. It ranges from $-1$ to $+1$. A value of $+1$ represents a perfect positive linear relationship (all points on a straight line rising), $-1$ represents a perfect negative linear relationship (all points on a straight line falling), and $0$ indicates no linear correlation.
Regression Lines: These are 'lines of best fit' that minimize the sum of the squared distances between the actual data points and the line. There are two lines: the regression line of $y$ on $x$ (used to predict $y$ when $x$ is known) and the regression line of $x$ on $y$ (used to predict $x$ when $y$ is known). The two lines intersect at the point of the means $(\bar{x}, \bar{y})$.
Regression Coefficients ($b_{yx}$ and $b_{xy}$): These represent the slopes of the regression lines. $b_{yx}$ is the slope of the line of $y$ on $x$, indicating the change in $y$ for a unit change in $x$. Similarly, $b_{xy}$ is the slope for $x$ on $y$. Both coefficients always have the same algebraic sign, which is also the sign of the correlation coefficient $r$.
Spearman's Rank Correlation ($\rho$): This method is used when variables are qualitative (like beauty or intelligence) or when the data is ranked. It measures the degree of similarity between two sets of rankings. If $\rho = 1$, the ranks are identical; if $\rho = -1$, the ranks are in exactly opposite order.
Geometric Property of $r$: The correlation coefficient is the geometric mean of the two regression coefficients, expressed as $r = \pm\sqrt{b_{yx} \cdot b_{xy}}$. The sign of $r$ is chosen based on the sign of the coefficients. If both $b_{yx}$ and $b_{xy}$ are positive, $r$ is positive; if both are negative, $r$ is negative.
Angle Between Regression Lines: If $r = \pm 1$, the two regression lines coincide, forming an angle of $0^\circ$, indicating a perfect linear relationship. If $r = 0$, the lines are perpendicular to each other, intersecting at right angles at the point $(\bar{x}, \bar{y})$.
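The relationships among the regression coefficients, the correlation coefficient, and the angle between the two lines can be checked numerically. The following is a minimal Python sketch, not a reference implementation: the function names and the sample data are made up for illustration, and the moments are computed by dividing by $n$ (population form).

```python
import math

def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) using divide-by-n moments."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    return mean_x, mean_y, cov / var_x, cov / var_y

def r_from_coefficients(b_yx, b_xy):
    """Geometric mean of the two coefficients, carrying their common sign."""
    if b_yx * b_xy < 0:
        raise ValueError("regression coefficients must share a sign")
    magnitude = math.sqrt(b_yx * b_xy)
    return -magnitude if b_yx < 0 else magnitude

def angle_between_lines_deg(b_yx, b_xy):
    """Acute angle (degrees) between the two regression lines."""
    # Direction vectors in the (x, y) plane:
    #   line of y on x: y - mean_y = b_yx * (x - mean_x)  ->  (1, b_yx)
    #   line of x on y: x - mean_x = b_xy * (y - mean_y)  ->  (b_xy, 1)
    dot = abs(b_xy + b_yx)
    norms = math.hypot(1.0, b_yx) * math.hypot(b_xy, 1.0)
    # Clamp to 1.0 to guard against rounding just above 1.
    return math.degrees(math.acos(min(1.0, dot / norms)))

# Made-up illustrative data (not taken from the examples in this guide).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mean_x, mean_y, b_yx, b_xy = regression_coefficients(xs, ys)
r = r_from_coefficients(b_yx, b_xy)
angle = angle_between_lines_deg(b_yx, b_xy)
print(mean_x, mean_y, b_yx, b_xy, r, angle)
```

Working with direction vectors rather than slopes avoids dividing by zero when $r = 0$, where one line is horizontal and the other vertical.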
Formulae
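Arithmetic Mean: $\bar{x} = \dfrac{\sum x}{n}$, $\bar{y} = \dfrac{\sum y}{n}$
Covariance: $\mathrm{Cov}(x, y) = \dfrac{1}{n}\sum (x - \bar{x})(y - \bar{y})$
Pearson's Correlation Coefficient: $r = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x \sigma_y}$
Computational formula for $r$: $r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
Spearman's Rank Correlation: $\rho = 1 - \dfrac{6\sum d^2}{n(n^2 - 1)}$, where $d$ is the difference in ranks.
Regression Coefficient $b_{yx}$: $b_{yx} = r\,\dfrac{\sigma_y}{\sigma_x} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_x^2}$
Regression Coefficient $b_{xy}$: $b_{xy} = r\,\dfrac{\sigma_x}{\sigma_y} = \dfrac{\mathrm{Cov}(x, y)}{\sigma_y^2}$
Regression Line of $y$ on $x$: $y - \bar{y} = b_{yx}(x - \bar{x})$
Regression Line of $x$ on $y$: $x - \bar{x} = b_{xy}(y - \bar{y})$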
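As an illustration of the rank correlation formula, here is a short Python sketch. It assumes the ranks are already assigned and contain no ties (tied ranks need a correction that is not covered here); the function name and the two judges' rankings are invented for the example.

```python
def spearman_rank_correlation(ranks_a, ranks_b):
    """Spearman's coefficient: 1 - 6 * sum(d^2) / (n * (n^2 - 1)), no ties."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of equal length, at least 2 items")
    # d is the difference between the two ranks given to the same item.
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))

# Hypothetical rankings of five contestants by two judges.
judge_1 = [1, 2, 3, 4, 5]
judge_2 = [2, 1, 4, 3, 5]
rho = spearman_rank_correlation(judge_1, judge_2)
print(rho)
```

Passing the same ranking twice gives $1$, and passing a ranking together with its exact reverse gives $-1$, matching the two extreme cases described in the Concepts section.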
Examples
Problem 1:
Given the values of $n$, $\sum x$, $\sum y$, $\sum x^2$, $\sum y^2$, and $\sum xy$, calculate the Pearson correlation coefficient $r$.
Solution:
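Step 1: Use the computational formula for $r$: $r = \dfrac{n\sum xy - \sum x \sum y}{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}$
Step 2: Substitute the given values into the formula.
Step 3: Simplify the numerator, $n\sum xy - \sum x \sum y$.
Step 4: Simplify the denominator, $\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}$.
Step 5: Divide the numerator by the denominator to obtain $r$.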
Explanation:
The value of $r$ is close to $-1$, which indicates a very strong negative linear correlation between the variables $x$ and $y$.
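The five steps of Problem 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name is hypothetical, and the data set is made up (it is not the data of Problem 1), chosen so that the result is also a strong negative correlation.

```python
import math

def pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2):
    """Pearson's r from raw sums, using the computational formula."""
    numerator = n * sum_xy - sum_x * sum_y
    spread_x = n * sum_x2 - sum_x ** 2
    spread_y = n * sum_y2 - sum_y ** 2
    if spread_x <= 0 or spread_y <= 0:
        raise ValueError("r is undefined when a variable has no spread")
    return numerator / math.sqrt(spread_x * spread_y)

# Made-up illustrative data: y falls as x rises.
xs = [1, 2, 3, 4, 5]
ys = [10, 8, 7, 4, 3]

n = len(xs)
sum_x = sum(xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)

r = pearson_r_from_sums(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 4))
```

Taking the sums as inputs mirrors how the problem is posed, where the totals are given directly rather than the raw observations.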
Problem 2:
Two regression lines are given as equations (1) and (2). Find the mean values of $x$ and $y$ ($\bar{x}$ and $\bar{y}$) and the correlation coefficient $r$.
Solution:
Step 1: To find the means, solve the equations simultaneously, since the lines intersect at $(\bar{x}, \bar{y})$:
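- Multiply (1) by 2 and subtract (2) from the result to eliminate one variable. Substitute the value obtained into (1) to find the other. The solution is $(\bar{x}, \bar{y})$.
Step 2: Find $b_{yx}$ and $b_{xy}$. From (1), assume it is the line of $y$ on $x$: solve for $y$, and the coefficient of $x$ is $b_{yx}$. From (2), assume it is the line of $x$ on $y$: solve for $x$, and the coefficient of $y$ is $b_{xy}$.
Step 3: Check validity: compute $b_{yx} \cdot b_{xy} = r^2$. Since $0 \le b_{yx} \cdot b_{xy} \le 1$, the assumptions are correct.
Step 4: Calculate $r = -\sqrt{b_{yx} \cdot b_{xy}}$ (negative because both coefficients are negative).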
Explanation:
We find the means as the intersection of the two regression lines, then use the property that $r$ is the geometric mean of the two regression coefficients, $r = \pm\sqrt{b_{yx} \cdot b_{xy}}$, taking the sign that the coefficients share.
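The method of Problem 2 can be sketched in Python. The two equations used here, $3x + 2y = 26$ and $6x + y = 31$, are hypothetical and chosen only for illustration; they should not be read as the equations of Problem 2. The function names are likewise invented, and the sketch assumes every coefficient of $x$ and $y$ is non-zero.

```python
import math

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or coincident")
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det

def means_and_r_from_lines(line_1, line_2):
    """Each line is (a, b, c) for a*x + b*y = c. Returns (mean_x, mean_y, r)."""
    (a1, b1, c1), (a2, b2, c2) = line_1, line_2
    # The regression lines intersect at the point of the means.
    mean_x, mean_y = solve_2x2(a1, b1, c1, a2, b2, c2)
    # Trial assumption: line_1 is y on x, line_2 is x on y.
    b_yx = -a1 / b1  # from y = -(a1/b1) x + c1/b1
    b_xy = -b2 / a2  # from x = -(b2/a2) y + c2/a2
    if b_yx * b_xy > 1:
        # r^2 cannot exceed 1, so the roles must be the other way round.
        b_yx = -a2 / b2
        b_xy = -b1 / a1
    # r is the geometric mean, carrying the common sign of the coefficients.
    r = math.copysign(math.sqrt(b_yx * b_xy), b_yx)
    return mean_x, mean_y, r

# Hypothetical equations: (1) 3x + 2y = 26, (2) 6x + y = 31.
mean_x, mean_y, r = means_and_r_from_lines((3, 2, 26), (6, 1, 31))
print(mean_x, mean_y, r)
```

The validity check in Step 3 is what makes the assignment of lines unambiguous: if the product of the trial coefficients exceeds 1, swapping the roles gives the reciprocal product, which is then below 1.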