Review the key concepts, formulae, and examples before starting your quiz.
🔑Concepts
Measures of Dispersion: While the mean (central tendency) tells us where the center of the data is, dispersion tells us how spread out the data points are around that center. Visually, if you plot data on a number line, a high dispersion means the points are widely scattered, while low dispersion means they are clustered tightly around the mean.
Variance ($\sigma^2$): This is the arithmetic mean of the squares of the deviations of all observations from their mean. Squaring the differences ensures that negative and positive deviations do not cancel each other out. Conceptually, imagine a square drawn for every data point where the side length is its distance from the mean; the variance represents the average area of these squares.
Standard Deviation ($\sigma$): Defined as the positive square root of the variance. It is the most widely used measure of dispersion because it is expressed in the same units as the original data. On a normal distribution curve (bell curve), the standard deviation determines the 'width' of the bell; a larger $\sigma$ creates a flatter, wider curve.
Discrete Frequency Distribution: When data is given with frequencies ($f_i$), the variance is calculated by weighting each squared deviation by its corresponding frequency. Visually, this is like looking at a bar chart where the distance of each bar from the central mean line is weighted by the height of that bar.
Continuous Frequency Distribution: For grouped data (classes), we use the mid-point (class mark) as the representative value ($x_i$) for each interval. The distribution is visualized using a histogram, and the variance calculation assumes all values within a class are concentrated at the mid-point.
Step-Deviation Method: This is a computational shortcut used to simplify calculations by shifting the origin to an assumed mean ($A$) and scaling the data by the class width ($h$). Visually, this is equivalent to a linear transformation (shifting and shrinking) of the x-axis to make the numbers smaller and easier to manage.
Coefficient of Variation (C.V.): A relative measure of dispersion calculated as the ratio of standard deviation to the mean, expressed as a percentage. It is used to compare the variability or consistency of two different data sets, even if they have different units or means. In a visual comparison of two frequency polygons, the one with the higher C.V. is considered more variable or less consistent.
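The step-deviation method described above works because shifting the origin leaves the variance unchanged, while scaling by the class width multiplies the standard deviation by that width. A short derivation, writing $A$ for the assumed mean, $h$ for the class width, and $d_i = (x_i - A)/h$ for the scaled deviations (the symbol $d_i$ is a notation chosen here for illustration):

```latex
\begin{align*}
x_i &= A + h\,d_i, \qquad \bar{x} = A + h\,\bar{d} \\
x_i - \bar{x} &= h\,(d_i - \bar{d}) \\
\sigma_x^2 &= \frac{1}{N}\sum f_i\,(x_i - \bar{x})^2
            = \frac{h^2}{N}\sum f_i\,(d_i - \bar{d})^2
            = h^2\,\sigma_d^2 \\
\sigma_x &= h\,\sigma_d \qquad (h > 0)
\end{align*}
```

So the standard deviation can be computed on the smaller numbers $d_i$ and then multiplied by $h$ to return to the original scale.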
📐Formulae
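Variance for Ungrouped Data: $\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2$
Standard Deviation for Ungrouped Data: $\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$
Variance for Discrete/Grouped Data: $\sigma^2 = \frac{1}{N}\sum f_i (x_i - \bar{x})^2$, where $N = \sum f_i$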
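Shortcut formula for Variance: $\sigma^2 = \frac{1}{N}\sum f_i x_i^2 - \left(\frac{1}{N}\sum f_i x_i\right)^2$
Step-Deviation Method for S.D.: $\sigma = \frac{h}{N}\sqrt{N\sum f_i d_i^2 - \left(\sum f_i d_i\right)^2}$, where $d_i = \frac{x_i - A}{h}$, $A$ is the assumed mean, and $h$ is the class width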
Coefficient of Variation: $\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$
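As a rough check that the definition, the shortcut formula, and the step-deviation method agree, the following Python sketch evaluates all three on a small grouped data set. The class marks, frequencies, assumed mean `A`, and class width `h` are hypothetical values chosen for illustration and do not come from these notes.

```python
from math import sqrt

# Hypothetical grouped data: class marks x_i and their frequencies f_i.
class_marks = [5, 15, 25, 35, 45]
frequencies = [2, 3, 5, 3, 2]

N = sum(frequencies)  # total frequency, N = sum of f_i
mean = sum(f * x for f, x in zip(frequencies, class_marks)) / N

# Definition: frequency-weighted mean of squared deviations from the mean.
var_direct = sum(f * (x - mean) ** 2 for f, x in zip(frequencies, class_marks)) / N

# Shortcut: mean of the squares minus the square of the mean.
var_shortcut = sum(f * x * x for f, x in zip(frequencies, class_marks)) / N - mean ** 2

# Step-deviation: shift by an assumed mean A and scale by the class width h.
A = 25  # assumed mean (hypothetical choice)
h = 10  # class width (hypothetical choice)
d = [(x - A) / h for x in class_marks]
sum_fd = sum(f * di for f, di in zip(frequencies, d))
sum_fd2 = sum(f * di * di for f, di in zip(frequencies, d))
sd_step = (h / N) * sqrt(N * sum_fd2 - sum_fd ** 2)

print("variance (definition):", var_direct)
print("variance (shortcut):  ", var_shortcut)
print("S.D. (definition):    ", sqrt(var_direct))
print("S.D. (step-deviation):", sd_step)
```

The three routes should agree up to floating-point rounding; the step-deviation route simply works with smaller intermediate numbers.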
💡Examples
Problem 1:
Find the variance and standard deviation for the following data set: .
Solution:
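1. Calculate the mean $\bar{x}$ of the observations.
2. Calculate the deviations $x_i - \bar{x}$.
3. Calculate the squared deviations $(x_i - \bar{x})^2$.
4. Sum the squared deviations: $\sum (x_i - \bar{x})^2$.
5. Variance ($\sigma^2$): divide this sum by the number of observations $n$, so $\sigma^2 = \frac{1}{n}\sum (x_i - \bar{x})^2$.
6. Standard Deviation ($\sigma$): take the positive square root, $\sigma = \sqrt{\sigma^2}$.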
Explanation:
We first find the central point (mean), then determine how far each point is from that center. Squaring these distances eliminates negative signs, and averaging them gives the variance. Taking the square root returns the measure to the original units.
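The six steps of this solution can be sketched in Python. The data set below is hypothetical, chosen only so that the arithmetic comes out in whole numbers; it is not the data from the problem statement.

```python
from math import sqrt

# Hypothetical observations (not the data from Problem 1).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

# Step 1: mean of the observations.
mean = sum(data) / n

# Step 2: deviation of each observation from the mean.
deviations = [x - mean for x in data]

# Step 3: squared deviations.
squared_deviations = [dev ** 2 for dev in deviations]

# Step 4: sum of the squared deviations.
sum_squared = sum(squared_deviations)

# Step 5: variance is the mean of the squared deviations (divide by n).
variance = sum_squared / n

# Step 6: standard deviation is the positive square root of the variance.
std_dev = sqrt(variance)

print("mean =", mean)
print("deviations =", deviations)
print("squared deviations =", squared_deviations)
print("sum of squared deviations =", sum_squared)
print("variance =", variance)
print("standard deviation =", std_dev)
```

For this hypothetical data the mean is 5, the squared deviations sum to 32, the variance is 4, and the standard deviation is 2. Python's standard library offers `statistics.pvariance` and `statistics.pstdev` for the same divide-by-$n$ definitions.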
Problem 2:
The mean of a distribution is and the standard deviation is . Find the Coefficient of Variation.
Solution:
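1. Identify the given values: the mean $\bar{x}$ and the standard deviation $\sigma$.
2. Use the formula: $\text{C.V.} = \frac{\sigma}{\bar{x}} \times 100$.
3. Substitute the given values into the formula and simplify.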
Explanation:
The Coefficient of Variation expresses the standard deviation as a percentage of the mean. This allows for comparison between different data sets regardless of their scale.
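To illustrate how the C.V. supports comparison across scales, here is a minimal Python sketch. The function name `coefficient_of_variation` and the two sets of summary values are hypothetical, introduced only for this example.

```python
def coefficient_of_variation(mean: float, std_dev: float) -> float:
    """Return the C.V. as a percentage: (std_dev / mean) * 100."""
    if mean == 0:
        # The ratio is undefined when the mean is zero.
        raise ValueError("C.V. is undefined for a mean of zero")
    return std_dev * 100 / mean

# Hypothetical summary statistics for two data sets on different scales.
cv_a = coefficient_of_variation(mean=50, std_dev=5)
cv_b = coefficient_of_variation(mean=200, std_dev=10)

print("C.V. of A:", cv_a, "%")
print("C.V. of B:", cv_b, "%")
print("More consistent:", "A" if cv_a < cv_b else "B")
```

Here set B has the larger standard deviation (10 against 5) but the smaller C.V. (5% against 10%), so B is the more consistent of the two relative to its mean.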