
Data Handling - Organising and Grouping Data

Grade 8 CBSE

Review the key concepts, formulae, and examples before starting your quiz.

🔑Concepts

Raw Data and Observations: Data collected in its original, unorganized form is called raw data. Each individual entry in the data, numerical or otherwise, is called an observation. To make raw data useful, it is organized into a frequency distribution table.

Frequency and Tally Marks: Frequency is the number of times a specific observation occurs in a dataset. In a frequency distribution table, tally marks are used to count occurrences: four vertical bars are drawn, and the fifth mark is a diagonal line crossing the first four, forming a group of 5.
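As a rough illustration of counting frequencies and writing tally marks, here is a minimal Python sketch. The sample data and the helper name `tally_marks` are made up for this example, and a slash stands in for the diagonal fifth stroke.

```python
from collections import Counter

def tally_marks(count):
    """Return tally marks: '||||/' for each full group of 5, then single bars."""
    groups, rest = divmod(count, 5)
    parts = ["||||/"] * groups
    if rest:
        parts.append("|" * rest)
    return " ".join(parts)

# Hypothetical raw data: favourite colours of 12 students
observations = ["red", "blue", "red", "green", "red", "blue",
                "red", "red", "green", "red", "blue", "red"]

# Map each distinct observation to the number of times it occurs
frequency = Counter(observations)

for value, count in frequency.most_common():
    print(f"{value:<6} {tally_marks(count):<8} {count}")
```

Here "red" occurs 7 times, so its tally is one full group of 5 followed by two single bars.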

Grouped Frequency Distribution: When the number of observations is large, data is organized into groups called class intervals, such as 0-10, 10-20, 20-30. This is known as a grouped frequency distribution. By convention, an observation equal to the upper limit of a class is included in the next higher class (e.g., 20 belongs to 20-30, not 10-20).
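The upper-limit convention can be sketched in Python as follows. The function name `class_interval` and its parameters `width` and `start` are assumptions for this illustration; it assumes equal-width classes beginning at `start`.

```python
def class_interval(value, width=10, start=0):
    """Return (lower, upper) limits of the class that contains value.

    The lower limit is included and the upper limit is excluded, so a
    value equal to an upper limit falls into the next higher class.
    """
    index = (value - start) // width  # floor division gives the class number
    lower = start + index * width
    return lower, lower + width

for v in (9, 10, 19, 20, 30):
    lower, upper = class_interval(v)
    print(f"{v} belongs to {lower}-{upper}")
```

With the default width of 10, the value 20 is reported as belonging to 20-30, matching the convention above.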

Class Limits and Class Size: In a class interval like 20-30, the smaller number (20) is called the lower class limit and the greater number (30) is called the upper class limit. The difference between the upper limit and the lower limit is the class size or class width.

Class Mark: The mid-value of a class interval is called the class mark. It is calculated by taking the average of the upper and lower limits of that specific class.

Histogram: A histogram is a visual representation of grouped data using vertical rectangles. The class intervals are represented on the horizontal x-axis and the frequencies on the vertical y-axis. Visually, the bars are placed adjacent to each other with no gaps between them because the class intervals are continuous.

Kink or Broken Line: In a histogram, if the first class interval does not start from zero, a 'kink' or a zigzag line is drawn on the horizontal axis near the origin. This visual indicator shows that the scale along the horizontal axis does not show the values between zero and the lower limit of the first class.

📐Formulae

$\text{Range} = \text{Maximum Value} - \text{Minimum Value}$

$\text{Class Size} = \text{Upper Class Limit} - \text{Lower Class Limit}$

$\text{Class Mark} = \frac{\text{Upper Limit} + \text{Lower Limit}}{2}$
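The three formulae translate directly into small Python functions. This is only a sketch: the function names are chosen for this example, and the list passed to `data_range` is hypothetical.

```python
def data_range(values):
    """Range = maximum value - minimum value."""
    return max(values) - min(values)

def class_size(lower, upper):
    """Class size = upper class limit - lower class limit."""
    return upper - lower

def class_mark(lower, upper):
    """Class mark = average of the upper and lower limits."""
    return (upper + lower) / 2

print(data_range([4, 9, 2, 7]))  # hypothetical data: 9 - 2 = 7
print(class_size(20, 30))        # 30 - 20 = 10
print(class_mark(20, 30))        # (30 + 20) / 2 = 25.0
```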

💡Examples

Problem 1:

The marks obtained by 20 students in a math test (out of 50) are: 21, 10, 30, 22, 33, 5, 37, 12, 25, 42, 15, 39, 26, 32, 18, 27, 28, 19, 29, 35. Organise this data into a grouped frequency distribution table using class intervals of 0-10, 10-20, etc.

Solution:

  1. Identify the range: Min = 5, Max = 42, so Range = 42 - 5 = 37.
  2. Create intervals: 0-10, 10-20, 20-30, 30-40, 40-50.
  3. Tally the data:
  • 0-10: 5 (Frequency = 1)
  • 10-20: 10, 12, 15, 18, 19 (Frequency = 5)
  • 20-30: 21, 22, 25, 26, 27, 28, 29 (Frequency = 7)
  • 30-40: 30, 33, 37, 39, 32, 35 (Frequency = 6)
  • 40-50: 42 (Frequency = 1)
  4. Total Frequency = 1 + 5 + 7 + 6 + 1 = 20.

Explanation:

We group the raw marks into continuous classes. Note that a value like 30 is placed in the 30-40 group because the upper limit is excluded from the current group and included in the next.
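The grouping in Problem 1 can be checked with a short Python sketch. The variable names are chosen for this example, and the approach relies on the classes having equal width 10 and starting at 0.

```python
from collections import Counter

marks = [21, 10, 30, 22, 33, 5, 37, 12, 25, 42,
         15, 39, 26, 32, 18, 27, 28, 19, 29, 35]
WIDTH = 10  # class size used in the problem

# Key each mark by the lower limit of its class; floor division sends a
# mark equal to an upper limit (such as 30) into the next higher class.
table = Counter((m // WIDTH) * WIDTH for m in marks)

for lower in range(0, 50, WIDTH):
    print(f"{lower}-{lower + WIDTH}: {table[lower]}")

print("Total frequency:", sum(table.values()))
```

The total frequency should equal the number of students, which is a quick way to confirm that no observation was missed or counted twice.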

Problem 2:

Calculate the class mark and class size for the class interval 150-175.

Solution:

  1. Upper Limit = 175, Lower Limit = 150
  2. $\text{Class Size} = 175 - 150 = 25$
  3. $\text{Class Mark} = \frac{175 + 150}{2} = \frac{325}{2} = 162.5$

Explanation:

Class size tells us the width of the interval, while the class mark gives us the central value used in certain types of statistical calculations.