Statistics and Probability - Measures of Dispersion (Mean Deviation, Variance, Standard Deviation)
Review the key concepts, formulae, and examples before starting your quiz.
🔑Concepts
Dispersion refers to the extent to which numerical data is spread out or scattered around a central value (like the mean or median). Visually, if you plot data points on a number line, a low dispersion results in a dense cluster of points near the center, while high dispersion shows points widely distributed across the line.
Mean Deviation (M.D.) is the arithmetic mean of the absolute values of the deviations of individual observations from a central value. By taking the absolute value $|x_i - a|$, where $a$ is the chosen central value, we ignore the sign of the deviation. Visually, this is equivalent to measuring the distance of each point from the center and treating all distances as positive, regardless of whether the point lies to the left or right of the center.
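As a minimal sketch of this definition, the Python snippet below averages the absolute distances from a chosen center. The helper name `mean_deviation` and the sample values are hypothetical, chosen only for illustration.

```python
from statistics import mean, median


def mean_deviation(data, center):
    """Average absolute distance of each observation from `center`."""
    return sum(abs(x - center) for x in data) / len(data)


# Hypothetical sample (not taken from the problems in these notes).
sample = [1, 2, 3, 4, 10]

md_about_mean = mean_deviation(sample, mean(sample))      # center is the mean, 4
md_about_median = mean_deviation(sample, median(sample))  # center is the median, 3

print(md_about_mean, md_about_median)
```

For this sample the deviation about the median is the smaller of the two, which is the general pattern: the median minimizes the sum of absolute deviations.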
Variance ($\sigma^2$) is the average of the squares of the deviations from the mean. Squaring the deviations ensures all terms are non-negative and gives greater weight to outliers (points far from the mean). In a graph of squared deviations, larger distances from the mean result in disproportionately larger areas, contributing to a higher variance.
Standard Deviation ($\sigma$) is the positive square root of the variance. It is preferred over variance because it is expressed in the same units as the original data. On a Normal Distribution curve (bell curve), the standard deviation defines the 'width' of the curve; a smaller $\sigma$ creates a steep, narrow peak, whereas a larger $\sigma$ creates a flatter, wider curve.
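The two definitions above can be sketched in a few lines of Python. The sample values are hypothetical. Note that these notes divide by the number of observations, which corresponds to `statistics.pvariance` and `statistics.pstdev` in the standard library, not `statistics.variance`, which divides by one less than the number of observations.

```python
import math
import statistics

# Hypothetical sample, used only to illustrate the definitions.
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
sample_mean = sum(sample) / n

# Variance: average of the squared deviations from the mean.
variance = sum((x - sample_mean) ** 2 for x in sample) / n

# Standard deviation: positive square root of the variance.
std_dev = math.sqrt(variance)

print(variance, std_dev)

# Cross-check against the standard library's population versions.
print(statistics.pvariance(sample), statistics.pstdev(sample))
```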
Coefficient of Variation (C.V.) is a relative measure of dispersion used to compare the variability of two or more sets of data, even if they have different units or means. It is calculated as the ratio of standard deviation to the mean, expressed as a percentage. A distribution with a lower C.V. is considered more 'consistent' or 'stable'.
Short-cut and Step-deviation Methods are techniques used to simplify calculations for Variance and Standard Deviation, especially when dealing with large numbers or class intervals. These methods involve shifting the origin (Assumed Mean) and scaling the data (Class Width), which visually shifts and compresses the data distribution on the x-axis to make the arithmetic easier.
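A short sketch can confirm that shifting by an assumed mean and scaling by the class width leaves the final answer unchanged. The class midpoints, frequencies, assumed mean `A`, and class width `h` below are hypothetical values chosen for illustration.

```python
import math

# Hypothetical grouped data: class midpoints x_i and frequencies f_i.
midpoints = [35, 45, 55, 65, 75]
freqs = [3, 7, 12, 15, 8]
A, h = 55, 10  # assumed mean and class width (illustrative choices)

N = sum(freqs)

# Direct method: work with the original values.
mean_direct = sum(f * x for f, x in zip(freqs, midpoints)) / N
sigma_direct = math.sqrt(
    sum(f * (x - mean_direct) ** 2 for f, x in zip(freqs, midpoints)) / N
)

# Step-deviation method: work with small values u_i = (x_i - A) / h.
u = [(x - A) / h for x in midpoints]
sum_fu = sum(f * ui for f, ui in zip(freqs, u))
sum_fu2 = sum(f * ui ** 2 for f, ui in zip(freqs, u))

mean_step = A + h * sum_fu / N
sigma_step = h * math.sqrt(sum_fu2 / N - (sum_fu / N) ** 2)

print(math.isclose(mean_direct, mean_step), math.isclose(sigma_direct, sigma_step))
```

The values of `u` here are -2, -1, 0, 1, 2, which is what makes the method convenient for hand calculation.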
📐Formulae
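Mean: $\bar{x} = \frac{1}{n}\sum x_i$ (for raw data) or $\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$ (for frequency distribution)
Mean Deviation about Mean: $M.D.(\bar{x}) = \frac{1}{N}\sum f_i |x_i - \bar{x}|$, where $N = \sum f_i$
Mean Deviation about Median: $M.D.(M) = \frac{1}{N}\sum f_i |x_i - M|$, where $M$ is the median
Variance for Raw Data: $\sigma^2 = \frac{1}{n}\sum (x_i - \bar{x})^2$
Standard Deviation (Discrete Frequency): $\sigma = \sqrt{\frac{1}{N}\sum f_i (x_i - \bar{x})^2}$
Standard Deviation (Short-cut method): $\sigma = \sqrt{\frac{\sum f_i d_i^2}{N} - \left(\frac{\sum f_i d_i}{N}\right)^2}$, where $d_i = x_i - A$ and $A$ is the assumed mean
Standard Deviation (Step-deviation method): $\sigma = h\sqrt{\frac{\sum f_i u_i^2}{N} - \left(\frac{\sum f_i u_i}{N}\right)^2}$, where $u_i = \frac{x_i - A}{h}$ and $h$ is the class width
Coefficient of Variation: $C.V. = \frac{\sigma}{\bar{x}} \times 100$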
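Expanding the definition of variance for a frequency distribution gives an equivalent computational form, which is often easier to use by hand. The derivation below is a sketch using $N = \sum f_i$ and $\bar{x} = \frac{1}{N}\sum f_i x_i$.

```latex
\begin{align*}
\sigma^2 &= \frac{1}{N}\sum f_i (x_i - \bar{x})^2 \\
         &= \frac{1}{N}\sum f_i x_i^2 \;-\; 2\bar{x}\cdot\frac{1}{N}\sum f_i x_i \;+\; \bar{x}^2\cdot\frac{1}{N}\sum f_i \\
         &= \frac{1}{N}\sum f_i x_i^2 \;-\; 2\bar{x}^2 \;+\; \bar{x}^2 \\
         &= \frac{1}{N}\sum f_i x_i^2 \;-\; \bar{x}^2
\end{align*}
```

Replacing $x_i$ with the shifted values $x_i - A$ in the same expansion leads to the short-cut form, since shifting every observation by a constant does not change the spread.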
💡Examples
Problem 1:
Calculate the Mean Deviation about the mean for the following data: .
Solution:
Step 1: Find the mean ($\bar{x}$).
Step 2: Find the absolute deviations $|x_i - \bar{x}|$.
Step 3: Sum the absolute deviations.
Step 4: Calculate M.D.: $M.D.(\bar{x}) = \frac{\sum |x_i - \bar{x}|}{n}$.
Explanation:
To find Mean Deviation, we first determine the central point (mean), calculate how far each point is from that center using absolute values to remove negative signs, and then find the average of those distances.
Problem 2:
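Find the variance and standard deviation for the following frequency distribution:
| $x_i$ | 4 | 8 | 11 | 17 | 20 | 24 | 32 |
|---|---|---|---|---|---|---|---|
| $f_i$ | 3 | 5 | 9 | 5 | 4 | 3 | 1 |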
Solution:
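Step 1: Create a table for calculations, where $N = \sum f_i$.
| $x_i$ | $f_i$ | $f_i x_i$ | $x_i^2$ | $f_i x_i^2$ |
|---|---|---|---|---|
| 4 | 3 | 12 | 16 | 48 |
| 8 | 5 | 40 | 64 | 320 |
| 11 | 9 | 99 | 121 | 1089 |
| 17 | 5 | 85 | 289 | 1445 |
| 20 | 4 | 80 | 400 | 1600 |
| 24 | 3 | 72 | 576 | 1728 |
| 32 | 1 | 32 | 1024 | 1024 |
| Total | 30 | 420 | | 7254 |
Step 2: Find the mean: $\bar{x} = \frac{\sum f_i x_i}{N} = \frac{420}{30} = 14$.
Step 3: Calculate Variance ($\sigma^2$): $\sigma^2 = \frac{\sum f_i x_i^2}{N} - \bar{x}^2 = \frac{7254}{30} - 14^2 = 241.8 - 196 = 45.8$.
Step 4: Calculate Standard Deviation ($\sigma$): $\sigma = \sqrt{45.8} \approx 6.77$.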
Explanation:
This solution uses the computational formula for variance. We calculate the sum of frequencies ($N = \sum f_i$), the sum of products ($\sum f_i x_i$) to find the mean, and the sum of $f_i x_i^2$ to find the raw second moment. Variance is the difference between the average of the squares and the square of the average.
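The arithmetic in Problem 2 can be checked with a short Python sketch that recomputes the column totals directly from the values and frequencies in the table. The variable names are illustrative.

```python
import math

# Values and frequencies from the Problem 2 table.
x = [4, 8, 11, 17, 20, 24, 32]
f = [3, 5, 9, 5, 4, 3, 1]

N = sum(f)                                             # sum of frequencies
sum_fx = sum(fi * xi for fi, xi in zip(f, x))          # sum of f_i * x_i
sum_fx2 = sum(fi * xi ** 2 for fi, xi in zip(f, x))    # sum of f_i * x_i^2

mean = sum_fx / N
variance = sum_fx2 / N - mean ** 2   # average of squares minus square of average
std_dev = math.sqrt(variance)

print(N, sum_fx, sum_fx2)
print(mean, round(variance, 2), round(std_dev, 2))
```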