Review the key concepts, formulae, and examples before starting your quiz.
🔑Concepts
Measures of Dispersion: Dispersion refers to the extent to which data values are spread out or scattered around a central value like the mean or median. Visually, if you plot two frequency distributions on the same graph, a distribution with high dispersion will appear wide and flat (like a shallow hill), while one with low dispersion will appear narrow and tall (like a steep peak).
Range and Interquartile Range: The range is the simplest measure of dispersion, calculated as the difference between the maximum and minimum values. Visually, it represents the total horizontal length covered by the data on a number line. The Interquartile Range (IQR) is the difference between the third and first quartiles, $Q_3 - Q_1$, so it covers the middle 50% of the data and is unaffected by extreme values. It is represented visually by the width of the 'box' in a box-and-whisker plot.
Mean Deviation: This is the arithmetic mean of the absolute deviations of the observations from a central value (mean or median). In a scatter plot, if you draw a horizontal line at the mean, the mean deviation is the average of the vertical distances from each point to that line, ignoring whether the point is above or below it.
Variance: Variance is the average of the squares of the deviations from the arithmetic mean. Squaring the deviations makes every term non-negative and gives more weight to outliers. Visually, a larger variance indicates that data points are likely to be found further away from the center of the distribution curve.
Standard Deviation (SD): The square root of the variance is the standard deviation, which is the most widely used measure of dispersion because it has the same units as the data. On a normal bell-shaped curve, approximately 68% of the data falls within one standard deviation of the mean, representing the 'typical' spread of the data.
Coefficient of Variation (CV): This is a relative measure of dispersion, expressed as a percentage, used to compare the variability of two or more series even if they have different units or means. When comparing two frequency polygons, the one with the higher CV is considered more 'variable' or 'unstable,' while the one with the lower CV is more 'consistent' or 'homogeneous.'
Comparison of Distributions: For two frequency distributions with the same mean, the distribution with the smaller standard deviation (and thus smaller CV) is more consistent. Visually, this distribution will have a higher concentration of frequencies near the mean, resulting in a more 'peaked' frequency curve compared to the other.
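The measures described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the `data` list is a hypothetical sample, and the quartiles use the `"inclusive"` method of `statistics.quantiles`, which is only one of several quartile conventions and may differ slightly from the one used in your textbook.

```python
import statistics

# Hypothetical sample, chosen only for illustration (not from the notes).
data = [4, 7, 8, 9, 10, 12, 13, 17]

# Range: maximum minus minimum.
data_range = max(data) - min(data)

# Interquartile range: Q3 - Q1. Quartile conventions vary between
# textbooks; the "inclusive" method is an assumption made here.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Mean deviation about the mean: average absolute distance from the mean.
mean = statistics.fmean(data)
mean_deviation = sum(abs(x - mean) for x in data) / len(data)

# Population variance and standard deviation (divide by n, not n - 1).
variance = statistics.pvariance(data)
std_dev = statistics.pstdev(data)

print("range:", data_range)
print("IQR:", iqr)
print("mean deviation:", mean_deviation)
print("variance:", variance)
print("standard deviation:", std_dev)
```

Note that `pvariance` and `pstdev` divide by the number of observations, which matches the definition of variance given above; `statistics.variance` and `statistics.stdev` divide by one less and would give slightly larger values.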
📐Formulae
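Mean (Grouped Data): $\bar{x} = \frac{1}{N}\sum_{i=1}^{n} f_i x_i$, where $N = \sum_{i=1}^{n} f_i$
Mean Deviation about Mean: $MD(\bar{x}) = \frac{1}{N}\sum_{i=1}^{n} f_i \lvert x_i - \bar{x} \rvert$
Mean Deviation about Median: $MD(M) = \frac{1}{N}\sum_{i=1}^{n} f_i \lvert x_i - M \rvert$
Variance (Discrete): $\sigma^2 = \frac{1}{N}\sum_{i=1}^{n} f_i (x_i - \bar{x})^2$
Standard Deviation (Shortcut Method): $\sigma = \frac{1}{N}\sqrt{N\sum f_i x_i^2 - \left(\sum f_i x_i\right)^2}$, where $N = \sum f_i$
Standard Deviation (Step Deviation Method): $\sigma = \frac{h}{N}\sqrt{N\sum f_i y_i^2 - \left(\sum f_i y_i\right)^2}$, where $y_i = \frac{x_i - A}{h}$, $A$ is the assumed mean and $h$ is the class width
Coefficient of Variation: $CV = \frac{\sigma}{\bar{x}} \times 100$, with $\bar{x} \neq 0$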
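As a rough check that the direct, shortcut, and step deviation formulae give the same standard deviation, the sketch below applies all three to one frequency distribution. The values `x`, the frequencies `f`, the assumed mean `A`, and the width `h` are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical discrete frequency distribution (illustrative values only).
x = [10, 20, 30, 40, 50]   # observed values x_i
f = [2, 3, 5, 3, 2]        # frequencies f_i

N = sum(f)
mean = sum(fi * xi for fi, xi in zip(f, x)) / N

# Direct method: variance is the frequency-weighted mean squared deviation.
variance = sum(fi * (xi - mean) ** 2 for fi, xi in zip(f, x)) / N
sd_direct = math.sqrt(variance)

# Shortcut method: works from the sums of f*x and f*x^2, no deviations needed.
sum_fx = sum(fi * xi for fi, xi in zip(f, x))
sum_fx2 = sum(fi * xi ** 2 for fi, xi in zip(f, x))
sd_shortcut = math.sqrt(N * sum_fx2 - sum_fx ** 2) / N

# Step deviation method: y_i = (x_i - A) / h with assumed mean A and width h.
A, h = 30, 10              # assumed values for this sketch
y = [(xi - A) / h for xi in x]
sum_fy = sum(fi * yi for fi, yi in zip(f, y))
sum_fy2 = sum(fi * yi ** 2 for fi, yi in zip(f, y))
sd_step = (h / N) * math.sqrt(N * sum_fy2 - sum_fy ** 2)

# Coefficient of variation, as a percentage of the mean.
cv = sd_direct / mean * 100

print(sd_direct, sd_shortcut, sd_step, cv)
```

All three standard deviations should agree up to floating-point rounding. The step deviation method simply rescales the data so the arithmetic is easier by hand.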
💡Examples
Problem 1:
Calculate the mean deviation about the mean for the following data: .
Solution:
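- Find the mean ($\bar{x}$): $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
- Find the absolute deviations $\lvert x_i - \bar{x} \rvert$ of each observation from the mean.
- Sum of absolute deviations: $\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert$
- Mean Deviation: $MD(\bar{x}) = \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert$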
Explanation:
To find the mean deviation, we first determine the central point (the mean), then measure the average absolute distance of all data points from that center.
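The same steps (mean, absolute deviations, their sum, then the average) can be written as a short Python function. This is a sketch only: the function name `mean_deviation_about_mean` and the `sample` list are hypothetical, and the sample is not the data set from the problem above.

```python
def mean_deviation_about_mean(values):
    """Return the mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = sum(values) / n                          # step 1: the mean
    abs_devs = [abs(x - mean) for x in values]      # step 2: |x_i - mean|
    total = sum(abs_devs)                           # step 3: sum of deviations
    return total / n                                # step 4: average deviation


# Hypothetical sample for illustration (not the data from the problem).
sample = [6, 7, 10, 12, 13, 4, 8, 12]
md = mean_deviation_about_mean(sample)
print(md)
```

For this sample the mean is 9 and the absolute deviations sum to 22, so the mean deviation is 22 / 8 = 2.75.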
Problem 2:
Two series A and B have the following characteristics: Series A: Mean = , Standard Deviation = Series B: Mean = , Standard Deviation = Which series is more consistent?
Solution:
- Calculate Coefficient of Variation for Series A: $CV_A = \frac{\sigma_A}{\bar{x}_A} \times 100$
- Calculate Coefficient of Variation for Series B: $CV_B = \frac{\sigma_B}{\bar{x}_B} \times 100$
- Compare the CVs: Since $CV_A < CV_B$, Series A is more consistent.
Explanation:
The Coefficient of Variation (CV) is used to compare consistency. A lower CV indicates less relative variability and higher consistency.
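A comparison of this kind can be sketched in Python as follows. The helper name `coefficient_of_variation` and the means and standard deviations passed to it are hypothetical values chosen for illustration; they are not the figures from the problem above.

```python
def coefficient_of_variation(mean, std_dev):
    """Return the CV as a percentage: (standard deviation / mean) * 100."""
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return std_dev / mean * 100


# Hypothetical series (illustrative values only).
cv_a = coefficient_of_variation(mean=50, std_dev=5)
cv_b = coefficient_of_variation(mean=40, std_dev=6)

# The series with the lower CV has less relative variability.
more_consistent = "A" if cv_a < cv_b else "B"
print(cv_a, cv_b, more_consistent)
```

With these assumed inputs, Series A has the lower CV (about 10% against about 15%) and is therefore the more consistent one. The comparison uses relative spread, so it remains meaningful even when the two means differ.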