Statistics and Probability - Measures of dispersion: range, interquartile range, and box-and-whisker plots

Grade 9 IB

Review the key concepts, formulae, and examples before starting your quiz.

🔑Concepts

Range: The simplest measure of dispersion, calculated as the difference between the maximum and minimum values in a dataset. Visually, it represents the total horizontal distance covered by the data points on a number line, from the lowest dot to the highest.

Quartiles: These are values that divide an ordered dataset into four equal parts. The First Quartile ($Q_1$) marks the 25th percentile, the Median ($Q_2$) marks the 50th percentile, and the Third Quartile ($Q_3$) marks the 75th percentile.

Interquartile Range (IQR): The difference between the third and first quartiles ($Q_3 - Q_1$). It represents the spread of the middle $50\%$ of the data. Because it ignores the top and bottom $25\%$, it is a more 'robust' measure than the range, meaning it is not heavily influenced by extreme outliers.

Five-Number Summary: A descriptive set of five values used to summarize the distribution of data: Minimum, $Q_1$, Median, $Q_3$, and Maximum. This summary provides a snapshot of both the central tendency and the spread of the data.
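As a rough sketch of how the five-number summary can be computed, the Python function below finds each quartile as the median of the lower or upper half of the ordered data, leaving out the middle value when the count is odd. The function name `five_number_summary` and the sample list are made up for illustration, and other quartile conventions can give slightly different values.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumption: Q1 and Q3 are the medians of the lower and upper halves,
    and the middle value is left out of both halves when the count is odd.
    """
    data = sorted(values)
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = data[:half]            # values before the median position
    upper = data[half + n % 2:]    # values after the median position
    return (data[0], median(lower), median(data), median(upper), data[-1])

# Illustrative (made-up) dataset with an even number of values
sample = [13, 4, 10, 7, 16, 8]
summary = five_number_summary(sample)
print(summary)  # (4, 7, 9.0, 13, 16)
```

Subtracting the first entry from the last gives the Range, and subtracting the second from the fourth gives the IQR.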

Box-and-Whisker Plot (Visual Representation): A diagram constructed using the five-number summary. It features a central 'box' extending from $Q_1$ to $Q_3$, with a vertical line drawn at the Median ($Q_2$). Two horizontal lines, called 'whiskers', extend from the box to the Minimum and Maximum values. The length of the box represents the IQR, and the total length of the plot represents the Range.

Outliers and Boundaries: An outlier is an extreme value that lies far from the rest of the data. Mathematically, any value smaller than $Q_1 - 1.5 \times IQR$ or larger than $Q_3 + 1.5 \times IQR$ is considered an outlier. On a box plot, these are often marked as individual points (dots or crosses) beyond the whiskers.

Data Distribution and Skewness: The shape of the box-and-whisker plot indicates how data is distributed. If the median line is closer to the left side ($Q_1$) of the box, the data may be positively skewed. If the whiskers are of significantly different lengths, it indicates that the spread of the extreme values is higher on one side than the other.
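The median-position idea can be sketched in a few lines of Python. The helper name `skew_hint` is hypothetical, and the result is only a hint read from the box, not a formal skewness test. The values passed in are the quartiles given in Problem 2.

```python
def skew_hint(q1, median, q3):
    """Suggest a skew direction from where the median sits inside the box."""
    left = median - q1    # box width to the left of the median line
    right = q3 - median   # box width to the right of the median line
    if left < right:
        return "possibly positively skewed"
    if left > right:
        return "possibly negatively skewed"
    return "roughly symmetric box"

# Quartiles from Problem 2: the median is 5 units from Q1 and 7 units from Q3
print(skew_hint(20, 25, 32))  # possibly positively skewed
```

The negative case is the mirror image: a median line closer to $Q_3$ suggests a longer spread on the left.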

📐Formulae

$\text{Range} = x_{\max} - x_{\min}$

$IQR = Q_3 - Q_1$

$\text{Lower Outlier Bound} = Q_1 - 1.5 \times IQR$

$\text{Upper Outlier Bound} = Q_3 + 1.5 \times IQR$

$\text{Position of Median} = \frac{n + 1}{2}$
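The median position formula gives a whole number when $n$ is odd and a value ending in $.5$ when $n$ is even, in which case the median is the mean of the two values on either side of that position. A minimal Python sketch of this, with the hypothetical name `median_by_position`:

```python
def median_by_position(values):
    """Find the median using the (n + 1) / 2 position rule."""
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    position = (n + 1) / 2              # 1-based position in the ordered data
    if position.is_integer():
        return data[int(position) - 1]  # odd n: a single middle value
    low = data[int(position) - 1]       # value just below the .5 position
    high = data[int(position)]          # value just above the .5 position
    return (low + high) / 2

print(median_by_position([9, 10, 12, 15, 15, 17, 18, 20, 21]))  # 15
print(median_by_position([9, 10, 12, 15]))                      # 11.0
```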

💡Examples

Problem 1:

The marks obtained by 9 students in a quiz are: $15, 12, 18, 10, 20, 15, 17, 9, 21$. Calculate the Range and the Interquartile Range (IQR).

Solution:

  1. Order the data from least to greatest: $9, 10, 12, 15, 15, 17, 18, 20, 21$
  2. Find the Range: $\text{Max} - \text{Min} = 21 - 9 = 12$
  3. Find the Median ($Q_2$): The middle value of 9 numbers is in the $5^{th}$ position: $\text{Median} = 15$
  4. Find $Q_1$: The median of the lower half ($9, 10, 12, 15$) is $\frac{10 + 12}{2} = 11$
  5. Find $Q_3$: The median of the upper half ($17, 18, 20, 21$) is $\frac{18 + 20}{2} = 19$
  6. Calculate $IQR$: $Q_3 - Q_1 = 19 - 11 = 8$

Explanation:

To find measures of dispersion, the data must first be ordered. The range gives the total spread ($12$), while the IQR ($8$) focuses on the spread of the middle half of the marks, which is less affected by the highest and lowest scores.
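These results can be cross-checked with Python's standard `statistics` module. This is only a sketch: `quantiles` with `method="exclusive"` interpolates between positions rather than taking medians of halves, and it happens to agree with the hand method for this dataset. For other dataset sizes the two approaches may give different quartiles.

```python
from statistics import quantiles

marks = [15, 12, 18, 10, 20, 15, 17, 9, 21]

data_range = max(marks) - min(marks)

# quantiles() sorts the data itself; n=4 asks for the three quartile cut points
q1, q2, q3 = quantiles(marks, n=4, method="exclusive")
iqr = q3 - q1

print(data_range)   # 12
print(q1, q2, q3)   # 11.0 15.0 19.0
print(iqr)          # 8.0
```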

Problem 2:

A dataset has $Q_1 = 20$, $\text{Median} = 25$, and $Q_3 = 32$. Determine if a value of $55$ would be considered an outlier in this dataset.

Solution:

  1. Calculate the IQR: $IQR = Q_3 - Q_1 = 32 - 20 = 12$
  2. Calculate the outlier step (1.5 times the IQR): $1.5 \times IQR = 1.5 \times 12 = 18$
  3. Calculate the Upper Bound: $Q_3 + 18 = 32 + 18 = 50$
  4. Compare the value to the bound: $55 > 50$

Explanation:

Since the value $55$ is greater than the upper boundary of $50$, it is statistically classified as an outlier. In a box-and-whisker plot, the whisker would stop at the last data point within the $50$ limit, and $55$ would be plotted as a separate dot.
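The same check can be written as a short Python sketch. The function names `outlier_fences` and `is_outlier` are made up for illustration. The comparison is strict, so a value exactly on a boundary is not counted as an outlier.

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier boundaries for given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def is_outlier(value, q1, q3):
    """True if value lies strictly outside the 1.5 x IQR boundaries."""
    lower, upper = outlier_fences(q1, q3)
    return value < lower or value > upper

lower, upper = outlier_fences(20, 32)
print(lower, upper)             # 2.0 50.0
print(is_outlier(55, 20, 32))   # True
print(is_outlier(50, 20, 32))   # False
```

The lower boundary of $2$ is not needed to answer Problem 2, but it shows that a value below $2$ would also count as an outlier.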