Statistics - Cumulative frequency and box plots

Grade 11 IGCSE

Review the key concepts, formulae, and examples before starting your quiz.

🔑Concepts

Cumulative Frequency: The running total of frequencies. It is calculated by adding each frequency to the sum of the preceding frequencies.
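As a minimal sketch of the running total, the snippet below uses the class frequencies implied by the heights table in Problem 1 (the values 10, 20, 35, 15 are an assumption obtained by differencing that table's cumulative totals; the variable names are illustrative only).

```python
from itertools import accumulate

# Class frequencies implied by the cumulative totals 10, 30, 65, 80
# (assumed: each is the difference between successive totals).
frequencies = [10, 20, 35, 15]

# Running total: each entry adds the current frequency to all earlier ones.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [10, 30, 65, 80]
```

The last cumulative frequency always equals the total frequency $n$, which is a quick check on the arithmetic.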

Cumulative Frequency Graph (Ogive): A plot where the upper class boundary is on the x-axis and the cumulative frequency is on the y-axis. Points are joined with a smooth curve.

Median ($Q_2$): The middle value of the data set, found at the $50^{th}$ percentile (halfway up the cumulative frequency axis).

Quartiles: $Q_1$ (Lower Quartile) is at $25\%$ of the total frequency; $Q_3$ (Upper Quartile) is at $75\%$ of the total frequency.

Interquartile Range (IQR): A measure of spread that calculates the range of the middle $50\%$ of the data. It is less affected by outliers than the range.
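The claim about outliers can be checked with a small made-up data set. This sketch uses Python's `statistics.quantiles` with its default "exclusive" method, which is only one of several quartile conventions, so other data may give slightly different quartiles than a hand method. The data values and the helper name `spread` are illustrative assumptions.

```python
from statistics import quantiles

def spread(data):
    """Return (range, IQR) for a list of numbers."""
    q1, _, q3 = quantiles(data, n=4)  # default method: "exclusive"
    return max(data) - min(data), q3 - q1

typical = [2, 4, 6, 8, 10, 12, 14]
with_outlier = [2, 4, 6, 8, 10, 12, 140]  # largest value replaced

print(spread(typical))       # (12, 8.0)
print(spread(with_outlier))  # (138, 8.0)
```

The single extreme value changes the range from 12 to 138, while the IQR stays at 8 because the quartiles depend only on the middle of the ordered data.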

Box Plot (Box-and-Whisker): A diagram representing the five-number summary: Minimum, Lower Quartile ($Q_1$), Median ($Q_2$), Upper Quartile ($Q_3$), and Maximum.

Percentiles: Values that divide the data into 100 equal parts (e.g., the $90^{th}$ percentile is the value below which $90\%$ of the data falls).

📐Formulae

$\text{Position of Median} \approx \frac{n}{2}$

$\text{Position of Lower Quartile } (Q_1) \approx \frac{n}{4}$

$\text{Position of Upper Quartile } (Q_3) \approx \frac{3n}{4}$

$\text{Interquartile Range (IQR)} = Q_3 - Q_1$

$\text{Outlier Boundaries: Lower} = Q_1 - 1.5 \times \text{IQR}, \quad \text{Upper} = Q_3 + 1.5 \times \text{IQR}$
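The outlier boundaries can be sketched as a short function. The name `outlier_fences` and the parameter `k` are hypothetical, and the quartiles used are the test-score values from Problem 2 ($Q_1 = 45$, $Q_3 = 70$).

```python
def outlier_fences(q1, q3, k=1.5):
    """Return the (lower, upper) outlier boundaries for given quartiles."""
    iqr = q3 - q1                      # interquartile range
    return q1 - k * iqr, q3 + k * iqr  # values outside these are outliers

lower, upper = outlier_fences(45, 70)
print(lower, upper)  # 7.5 107.5
```

With these boundaries, the minimum of 20 and the maximum of 95 in Problem 2 both lie inside the interval, so neither would be flagged as an outlier.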

💡Examples

Problem 1:

In a survey of 80 students, their heights ($h$ cm) were recorded. The cumulative frequency table shows: $h \le 150$: 10, $h \le 160$: 30, $h \le 170$: 65, $h \le 180$: 80. Estimate the Median and the Interquartile Range.

Solution:

  1. Total frequency ($n$) = 80.
  2. Median position = $80 / 2 = 40$. Rank 40 lies between the cumulative frequencies 30 (at $h = 160$) and 65 (at $h = 170$), so the median is in the $160 < h \le 170$ class. By linear interpolation, $160 + \frac{40 - 30}{65 - 30} \times 10 \approx 162.9$, so Median $\approx 163$ cm.
  3. $Q_1$ position = $80 / 4 = 20$. Rank 20 lies halfway between the cumulative frequencies 10 and 30, in the $150 < h \le 160$ class, so $Q_1 \approx 155$ cm.
  4. $Q_3$ position = $3/4 \times 80 = 60$. Rank 60 lies in the $160 < h \le 170$ class. By linear interpolation, $160 + \frac{60 - 30}{65 - 30} \times 10 \approx 168.6$, so $Q_3 \approx 169$ cm.
  5. $IQR = Q_3 - Q_1 \approx 169 - 155 = 14$ cm.

Explanation:

To estimate these values, we locate the specific rank (20th, 40th, 60th) on the cumulative frequency (y-axis), move horizontally to the curve, and then vertically down to the height (x-axis).
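The reading-off procedure for Problem 1 can be sketched in code by joining the plotted points with straight lines instead of a smooth curve. The function name `value_at_rank` is hypothetical, and because the table gives no lower boundary for the first class, ranks below 10 are treated as outside the table. Values read from a hand-drawn smooth curve may differ from these by about a centimetre.

```python
def value_at_rank(rank, boundaries, cumulative):
    """Estimate the data value at a cumulative-frequency rank by
    linear interpolation between neighbouring plotted points."""
    points = list(zip(boundaries, cumulative))
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= rank <= c1 and c1 > c0:
            # Fraction of the way through the class, scaled to its width.
            return x0 + (rank - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("rank lies outside the table")

boundaries = [150, 160, 170, 180]  # upper class boundaries (cm)
cumulative = [10, 30, 65, 80]      # cumulative frequencies
n = cumulative[-1]                 # total frequency

q1 = value_at_rank(n / 4, boundaries, cumulative)
median = value_at_rank(n / 2, boundaries, cumulative)
q3 = value_at_rank(3 * n / 4, boundaries, cumulative)
print(round(q1, 1), round(median, 1), round(q3, 1))  # 155.0 162.9 168.6
```

The same function gives any percentile: the $90^{th}$ percentile, for instance, is `value_at_rank(0.9 * n, boundaries, cumulative)`.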

Problem 2:

Given the following summary for a set of test scores: Min = 20, $Q_1 = 45$, Median = 55, $Q_3 = 70$, Max = 95. How is this represented on a Box Plot?

Solution:

  1. Draw a horizontal scale from 20 to 100.
  2. Draw a rectangular box from 45 to 70.
  3. Draw a vertical line inside the box at 55.
  4. Draw 'whiskers' (lines) extending from the box at 45 down to 20, and from 70 up to 95.

Explanation:

The box spans the Interquartile Range (the central $50\%$ of the data), the line inside the box marks the median, and the whiskers extend to the minimum and maximum, so together they show the full extent (range) of the data set.
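As a rough illustration of the drawing steps, the sketch below builds a one-line text version of the box plot for the test scores. The function `ascii_box_plot`, its parameters, and the choice of symbols are all assumptions made for this example; each character covers two marks on a scale from 20 to 100.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum, lo, hi, width):
    """Return a one-line text box plot on a scale from lo to hi."""
    def col(value):
        # Map a data value to a character column (rounded down).
        return int((value - lo) / (hi - lo) * (width - 1))

    row = [" "] * width
    for i in range(col(minimum), col(maximum) + 1):
        row[i] = "-"             # whiskers span minimum to maximum
    for i in range(col(q1), col(q3) + 1):
        row[i] = "="             # box spans Q1 to Q3
    row[col(minimum)] = "|"      # end of lower whisker
    row[col(maximum)] = "|"      # end of upper whisker
    row[col(q1)] = "["           # left edge of box
    row[col(q3)] = "]"           # right edge of box
    row[col(median)] = "M"       # median line inside the box
    return "".join(row)

plot = ascii_box_plot(20, 45, 55, 70, 95, lo=20, hi=100, width=41)
print(plot)
```

The `M` sits left of the centre of the box because the median (55) is closer to $Q_1$ (45) than to $Q_3$ (70).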