Lesson Summary
Objective: find the mean, median and mode of a set of data, including grouped data.
This lesson explores the use of frequency tables in representing data, distinguishing between ungrouped data and grouped data. We’ll cover how to organize data efficiently and compute key summary statistics like the mean, median, and mode to analyze data distributions effectively.
Frequency Tables for Ungrouped Data
Ungrouped data consists of individual, distinct values, often discrete in nature. A frequency table for ungrouped data lists each unique value alongside its frequency, which is the number of times it appears in the dataset.
Benefits of Using Frequency Tables for Ungrouped Data
Organizing raw data into a frequency table offers several advantages, particularly when handling substantial datasets:
- Streamlines data collection, since tally marks can be recorded before the counts are finalized.
- Suits discrete data, such as counts or categories.
- Simplifies computations for averages, ranges, and other summary measures compared to scattered raw values.
- Reveals data patterns quickly, highlighting concentrations and outliers.
- Retains all original values, allowing precise calculations of statistics like the exact mean or median.
Computing Mean, Median, and Mode from Ungrouped Frequency Tables
Assuming familiarity with these measures from raw data, here’s how they adapt to frequency tables:
- Mode: The data value with the highest frequency in the table.
- Median: The middle value in the ordered dataset. For a table, calculate the position using \frac{n+1}{2}, where n is the total frequency. Locate this position via cumulative frequencies to identify the corresponding value.
- Mean: The average, found by weighting each value by its frequency. Use the formula:
\bar{x} = \frac{\sum xf}{\sum f}
where x is the data value, f is its frequency, and sums are over all entries.
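The three rules above can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function name `summarise` and the goals-per-match table are hypothetical, and the code assumes the values are listed in ascending order with a single mode.

```python
from itertools import accumulate

def summarise(values, freqs):
    """Return (mode, median, mean) for an ungrouped frequency table.

    Assumes values are sorted ascending and there is a single mode.
    """
    n = sum(freqs)

    # Mode: the value with the highest frequency.
    mode = values[freqs.index(max(freqs))]

    # Median: locate the positions around (n + 1) / 2 via cumulative frequencies.
    cumulative = list(accumulate(freqs))

    def value_at(position):
        # First value whose cumulative frequency reaches the position.
        for value, running_total in zip(values, cumulative):
            if running_total >= position:
                return value

    lower = value_at((n + 1) // 2)  # e.g. the 25th value when n = 50
    upper = value_at((n + 2) // 2)  # e.g. the 26th value when n = 50
    median = (lower + upper) / 2

    # Mean: sum of x * f divided by sum of f.
    mean = sum(x * f for x, f in zip(values, freqs)) / n
    return mode, median, mean

# Hypothetical table: goals scored per match over 10 matches.
goals = [0, 1, 2, 3]
matches = [2, 4, 3, 1]
print(summarise(goals, matches))
```

When n is odd, the two positions coincide, so the median is a single table value; when n is even, it is the average of the two middle values.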
Example: Analyzing Shoe Sizes with Ungrouped Frequency Table
Consider a frequency table for the shoe sizes (in UK units) of 50 athletes:
| Shoe Size | 5 | 5.5 | 6 | 6.5 | 7 | 7.5 | 8 | 8.5 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Frequency | 2 | 4 | 8 | 6 | 12 | 9 | 5 | 3 | 1 |
(i) Identify the modal shoe size.
(ii) Determine the median shoe size and explain your reasoning.
(iii) Compute the mean shoe size to 3 significant figures.
Solutions:
(i) The modal shoe size is the value with the maximum frequency of 12, which is 7.
(ii) Total frequency n = 2 + 4 + 8 + 6 + 12 + 9 + 5 + 3 + 1 = 50.
Position: \frac{50 + 1}{2} = 25.5th value.
Cumulative frequencies:
| Shoe Size | 5 | 5.5 | 6 | 6.5 | 7 | 7.5 | 8 | 8.5 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Frequency | 2 | 4 | 8 | 6 | 12 | 9 | 5 | 3 | 1 |
| Cumulative | 2 | 6 | 14 | 20 | 32 | 41 | 46 | 49 | 50 |
The 25.5th value lies midway between the 25th and 26th values. The cumulative frequency is 20 up to size 6.5 and 32 up to size 7, so the 21st to 32nd values are all size 7. Thus, median = 7.
(iii) Calculate xf products:
| Shoe Size x | 5 | 5.5 | 6 | 6.5 | 7 | 7.5 | 8 | 8.5 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| Frequency f | 2 | 4 | 8 | 6 | 12 | 9 | 5 | 3 | 1 |
| xf | 10 | 22 | 48 | 39 | 84 | 67.5 | 40 | 25.5 | 9 |
\sum xf = 10 + 22 + 48 + 39 + 84 + 67.5 + 40 + 25.5 + 9 = 345
\bar{x} = \frac{345}{50} = 6.9
Mean shoe size = 6.90 (3 s.f.).
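Because an ungrouped table retains every original value, the answers above can be cross-checked by rebuilding the raw list and using Python's standard `statistics` module. This is only a sketch of one way to check the working; the variable names are illustrative.

```python
from statistics import mean, median, mode

sizes = [5, 5.5, 6, 6.5, 7, 7.5, 8, 8.5, 9]
freqs = [2, 4, 8, 6, 12, 9, 5, 3, 1]

# Rebuild the 50 individual shoe sizes from the table.
raw = [size for size, f in zip(sizes, freqs) for _ in range(f)]

modal_size = mode(raw)      # most common value
median_size = median(raw)   # average of the 25th and 26th values
mean_size = mean(raw)       # sum of all values divided by 50

print(len(raw), modal_size, median_size, mean_size)
```

The results should agree with the worked solutions: a mode of 7, a median of 7 and a mean of 6.9.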
Tips for Exam Success with Ungrouped Data
- Verify your mean falls within the data range; an implausible value (e.g., mean age of 50 for school students) signals an error.
- Always double-check total frequency and cumulative sums for median accuracy.
Frequency Tables for Grouped Data
Grouped data involves continuous variables spanning wide ranges, where individual values are binned into classes or intervals. A frequency table here records the frequency for each class.
Pros and Cons of Grouping Data
Grouping is beneficial for:
- Managing datasets with extensive value ranges.
- Identifying trends and distributions visually.
- Speeding up statistical computations.
However, drawbacks include:
- Loss of precise individual values.
- Averages and other statistics can only be estimated, not calculated exactly.
Key Notation for Grouped Frequency Tables
Clear class boundaries prevent overlap:
- For discrete data, use integer ranges like 10–19, 20–29.
- For continuous data, employ inequalities, e.g. 10 \leq x \lt 20, 20 \leq x \lt 30.
For continuous data, ensure no gaps between classes; adjust boundaries if needed (e.g., from 10 \leq x \leq 19 and 20 \leq x \leq 29 to 9.5 \leq x < 19.5 and 19.5 \leq x < 29.5).
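The no-gaps rule can be checked mechanically. The sketch below assumes each class is stored as a hypothetical `(lower, upper)` pair meaning lower ≤ x < upper; the helper names `classes_are_contiguous` and `find_class` are illustrative, not part of any library.

```python
def classes_are_contiguous(classes):
    """True if each upper boundary equals the next class's lower boundary."""
    return all(prev[1] == nxt[0] for prev, nxt in zip(classes, classes[1:]))

def find_class(x, classes):
    """Return the (lower, upper) class with lower <= x < upper, or None."""
    for lower, upper in classes:
        if lower <= x < upper:
            return (lower, upper)
    return None

discrete_style = [(10, 19), (20, 29)]    # leaves a gap between 19 and 20
adjusted = [(9.5, 19.5), (19.5, 29.5)]   # boundaries adjusted as described above

print(classes_are_contiguous(discrete_style))  # False
print(classes_are_contiguous(adjusted))        # True
print(find_class(19.5, adjusted))              # (19.5, 29.5)
```

A value on a shared boundary, such as 19.5, falls in the upper class because each upper boundary is a strict inequality.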
Estimating Averages from Grouped Frequency Tables
- Modal Class: The interval with the highest frequency.
- Estimated Mean: Use class midpoints (average of boundaries) instead of exact values. Formula remains:
\bar{x} = \frac{\sum xf}{\sum f}
where x is the midpoint.
- Estimated Median: Typically, identify the class containing the median position; detailed interpolation is beyond basic requirements here.
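Locating the median class works like the cumulative-frequency search for ungrouped data. The sketch below assumes the same \frac{n+1}{2} position rule is carried over to grouped tables and that classes are listed in ascending order; the function name `median_class` and the waiting-time table are hypothetical.

```python
from itertools import accumulate

def median_class(classes, freqs):
    """Return the class containing the (n + 1) / 2-th value.

    Assumes classes are listed in ascending order.
    """
    n = sum(freqs)
    position = (n + 1) / 2
    # Return the first class whose cumulative frequency reaches the position.
    for cls, running_total in zip(classes, accumulate(freqs)):
        if running_total >= position:
            return cls

# Hypothetical grouped table: waiting times in minutes for 20 customers.
waits = [(0, 5), (5, 10), (10, 15), (15, 20)]
counts = [3, 8, 6, 3]
print(median_class(waits, counts))
```

Here the position is 10.5 and the cumulative frequencies are 3, 11, 17 and 20, so the median lies in the second class. If the position straddles two classes, this sketch returns the upper one.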
Example: Heights Using Grouped Frequency Table
The table shows heights (in cm) of 30 adults:
| Height h (cm) | Frequency |
|---|---|
| 160 ≤ h < 165 | 4 |
| 165 ≤ h < 170 | 7 |
| 170 ≤ h < 175 | 12 |
| 175 ≤ h < 180 | 5 |
| 180 ≤ h < 185 | 2 |
(i) Specify the class for a person of height 170 cm.
(ii) State the modal class.
(iii) Estimate the mean height to 3 significant figures.
Solutions:
(i) 170 cm belongs to the class 170 ≤ h < 175. The previous class, 165 ≤ h < 170, excludes 170 because its upper boundary is a strict inequality.
(ii) Modal class = 170 \le h \lt 175 (frequency 12).
(iii) Compute midpoints and xf:
| Height h (cm) | Midpoint x | Frequency f | xf |
|---|---|---|---|
| 160 ≤ h < 165 | 162.5 | 4 | 650 |
| 165 ≤ h < 170 | 167.5 | 7 | 1172.5 |
| 170 ≤ h < 175 | 172.5 | 12 | 2070 |
| 175 ≤ h < 180 | 177.5 | 5 | 887.5 |
| 180 ≤ h < 185 | 182.5 | 2 | 365 |
\sum xf = 650 + 1172.5 + 2070 + 887.5 + 365 = 5145
\sum f = 4 + 7 + 12 + 5 + 2 = 30
\bar{x} = \frac{5145}{30} = 171.5
Estimated mean height = 172 cm (3 s.f.).
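The midpoint method in part (iii) can be reproduced in a few lines of Python as a check on the arithmetic. This is a sketch with illustrative variable names; each class is stored as a `(lower, upper)` pair of boundaries.

```python
classes = [(160, 165), (165, 170), (170, 175), (175, 180), (180, 185)]
freqs = [4, 7, 12, 5, 2]

# Midpoint of each class: average of its lower and upper boundaries.
midpoints = [(lower + upper) / 2 for lower, upper in classes]

# Estimated mean: sum of midpoint * frequency divided by total frequency.
sum_xf = sum(x * f for x, f in zip(midpoints, freqs))
sum_f = sum(freqs)
estimated_mean = sum_xf / sum_f

# Modal class: the class with the highest frequency.
modal_class = classes[freqs.index(max(freqs))]

print(sum_xf, sum_f, estimated_mean, modal_class)
```

The totals should match the table: a sum of xf of 5145, a total frequency of 30 and an estimated mean of 171.5 cm before rounding.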
Tips for Exam Success with Grouped Data
- Perform calculations methodically, adding columns to your table for midpoints and products; shown working often earns partial credit.
- Watch for calculator errors in sums; recount totals.
- Confirm class boundaries include all data without overlaps or gaps.
In summary, mastering frequency tables for both ungrouped and grouped data is essential for efficiently representing datasets and deriving reliable mean, median, and mode in A-level statistics. Practice with varied examples to build confidence in spotting patterns and estimating measures accurately.