Web Lesson: Statistics
#11 › Cumulative Frequency Curves
x │ 5-15    15-20   20-23   23-26   26-29   29-33   33-38
──┼───────────────────────────────────────────────────────
f │ 13      18      35      42      34      13      5
U.C.B. │ up to 5    up to 15   up to 20   up to 23   up to 26   up to 29   up to 33   up to 38
───────┼──────────────────────────────────────────────────────────────────────────────────────
F      │ 0          13         31         66         108        142        155        160
F = 0     ◄── use the L.C.B. of the 1st class and F = 0
F = 13    ◄── 1st freq
F = 31    ◄── 1st + 2nd freqs: 13 + 18 = 31
F = 66    ◄── 1st + 2nd + 3rd freqs: 13 + 18 + 35 = 66
etc.
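The ‘running-total’ idea can also be checked with a few lines of Python. This is only a small sketch using the frequencies from the x / f table above; the variable names are made up for illustration.

```python
from itertools import accumulate

# Frequencies from the x / f table above (classes 5-15, 15-20, ..., 33-38).
freqs = [13, 18, 35, 42, 34, 13, 5]

# F starts at 0 (paired with the L.C.B. of the 1st class); after that,
# each entry is the previous total plus the next frequency.
cumulative = [0] + list(accumulate(freqs))

print(cumulative)  # [0, 13, 31, 66, 108, 142, 155, 160]
```

The last running total is always n, the total number in the sample.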
Amount (£)       │ 5-20    20-30   30-40   40-50   50-70   70-120
─────────────────┼────────────────────────────────────────────────
Number of pupils │ 8       15      18      11      6       2
U.C.B.               │ up to 5    up to 20   up to ⋯    up to ⋯    up to ⋯    up to ⋯    up to 120
─────────────────────┼─────────────────────────────────────────────────────────────────────────────
Cumulative Frequency │ 0          8          ⋯          ⋯          ⋯          ⋯          60
Hints:
Step 1: Always CHECK the data is written using CLASS BOUNDARIES (not class-limits)
        ✔ IT IS (i.e. there is no GAP between the end of any class and the start of the next class)
Step 2: To populate the TOP-ROW of the Cumulative Frequency Table, look at the LIMITS of each class:
        The first class is SPECIAL: ‘5 - 20’
          ‘5’   This is the only class where its L.C.B. ISN'T deleted: it's the 1st № in our CF table (the F-value below it is just ‘0’)
          ‘20’  This is the 2nd № in the top-row of our CF-table
        After that, fill in the U.C.B.s of the rest of the classes to complete the top-row
Step 3: To populate the BOTTOM-ROW, you need to work out the ‘running-totals’ of the frequency
        We saw already that the first number in the F-row is ‘0’
        After that, the values are:
                                  8 = 8
                             8 + 15 = ⋯
                        8 + 15 + 18 = ⋯
                   8 + 15 + 18 + …… = ⋯
              8 + …… + …… + …… + … = ⋯
          8 + …… + …… + …… + … + … = ⋯
        These are our CUMULATIVE FREQUENCIES
Step 4: PRINT off the grid provided (see the link in the question), plot the points and connect them with a smooth curve (smooth out the curve, even if it means you slightly miss one of your points)
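Steps 1 to 3 can be sketched as a short Python function. Treat this as a rough illustration rather than the official method: the name `cf_table` is made up, and the demo uses the x / f table from the start of this lesson (not the data in this question), so the blanks above are still yours to fill in.

```python
from itertools import accumulate

def cf_table(classes, freqs):
    """Return (top_row, bottom_row) of a cumulative frequency table.

    `classes` is a list of (lower, upper) class boundaries, in order.
    """
    # Step 1: check there is no gap between the end of one class
    # and the start of the next.
    for (_, upper), (lower, _) in zip(classes, classes[1:]):
        if upper != lower:
            raise ValueError(f"gap between {upper} and {lower}: use class boundaries")
    # Step 2: the L.C.B. of the first class, then every U.C.B.
    top_row = [classes[0][0]] + [upper for _, upper in classes]
    # Step 3: 0, then the running totals of the frequencies.
    bottom_row = [0] + list(accumulate(freqs))
    return top_row, bottom_row

classes = [(5, 15), (15, 20), (20, 23), (23, 26), (26, 29), (29, 33), (33, 38)]
freqs = [13, 18, 35, 42, 34, 13, 5]

top_row, bottom_row = cf_table(classes, freqs)
points = list(zip(top_row, bottom_row))  # Step 4: the points to plot
print(points)
```

The first point is always (L.C.B. of the first class, 0), which is why the curve does not have to start at the origin.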
Time (mins) │ 10-15   15-30   30-40   40-50   50-65   65-90
────────────┼───────────────────────────────────────────────
Frequency   │ 4       25      36      27      7       1
Hints:
Step 1: Always CHECK the data is written using CLASS BOUNDARIES (not class-limits)
        ✔ IT IS (i.e. there is no GAP between the end of any class and the start of the next class)
Step 2: To populate the TOP-ROW of the Cumulative Frequency Table, look at the LIMITS of each class:
        The first class is SPECIAL: ‘10 - 15’
          ‘10’  This is the only class where its L.C.B. ISN'T deleted: it's the 1st № in our CF table (the F-value below it is just ‘0’)
          ‘15’  This is the 2nd № in the top-row of our CF-table
        After that, fill in the U.C.B.s of the rest of the classes to complete the top-row

        TIME (UCB) │ 10    15    30    40    50    65    90
        ───────────┼────────────────────────────────────────
             F     │ 0     ⋯     ⋯     ⋯     ⋯     ⋯     ⋯

Step 3: To populate the BOTTOM-ROW, you need to work out the ‘running-totals’ of the frequency
Step 4: PRINT off the grid provided (see the link in the question), plot the points and connect them with a smooth curve (smooth out the curve, even if it means you slightly miss one of your points)
Mass (kg) │ 10-16   16-19   19-22   22-26   26-40
──────────┼───────────────────────────────────────
frequency │ 4       14      18      9       3
Hints: Don't start your cumulative frequency curve from the point (0, 0). As in the questions so far, the first point uses the L.C.B. of the first class with F = 0, so this time you have to start from the point (10, 0)
x │ 5-15    15-20   20-23   23-26   26-29   29-33   33-38
──┼───────────────────────────────────────────────────────
f │ 13      18      35      42      34      13      5
Note: The number on the y-axis is always the answer to the question, "how many got LESS THAN x=..."
Hints:
Part (a): Firstly - PRINT OFF THE FULL PAGE CUMULATIVE FREQUENCY CURVE for this question…
          We want to know how many people waited LESS THAN 30 minutes
          So, we go to "30" on the x-axis
          Add in a vertical line up to meet the curve and then a horizontal line across to the y-axis
          Reading the number from the y-axis tells us how many people waited less than 30 minutes
Part (b): When we read a y-value from the GRID, it always tells us the NUMBER that were LESS THAN the corresponding x-value…
          If we wanna know how many were MORE than that x-value, we need to subtract the y-value from ‘n’
          (remember, ‘n’ is the total number in the sample, which is the LAST number in the cumulative frequencies and also the y-value of the TOP of the ogive…)
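The ‘go up to the curve, then across’ method can be imitated in Python. This is only a rough sketch: it joins the plotted points with straight lines (your hand-drawn curve is smooth, so your readings may differ a little), and it uses the x / f table from this lesson as stand-in data, because the waiting-time curve itself isn't reproduced here. The function name `less_than` is made up for illustration.

```python
def less_than(x, ucbs, cum_freqs):
    """Estimate how many values are LESS THAN x.

    Joins the plotted points (U.C.B., F) with straight lines and
    reads off the height at x.
    """
    if x <= ucbs[0]:
        return 0.0
    if x >= ucbs[-1]:
        return float(cum_freqs[-1])
    points = list(zip(ucbs, cum_freqs))
    for (x0, f0), (x1, f1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return f0 + (f1 - f0) * (x - x0) / (x1 - x0)

# Stand-in data: the cumulative frequency table built from the x / f table.
ucbs = [5, 15, 20, 23, 26, 29, 33, 38]
cum_freqs = [0, 13, 31, 66, 108, 142, 155, 160]
n = cum_freqs[-1]  # total number in the sample (top of the ogive)

below = less_than(20, ucbs, cum_freqs)  # how many were LESS THAN x = 20
above = n - below                       # how many were MORE THAN x = 20
print(below, above)
```

Subtracting from n is the whole trick for the "more than" questions: the curve itself only ever answers "less than".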
Hints:
Part (a): If we draw a vertical line at x = 30 minutes and read across to the y-value - that will tell us how many passengers waited LESS THAN 30 minutes
          If we draw a vertical line at x = 15 minutes and read across to the y-value - that will tell us how many passengers waited LESS THAN 15 minutes
          SUBTRACT these two values, and we'll get the number that waited BETWEEN 15-30 mins
          How do we then turn that into a percentage?
Part (b): Since the SAMPLE-SIZES are different, it's wrong to just compare the NUMBER that waited 30+ mins
          It makes much more sense to compare the PERCENTAGE that waited 30+ minutes
Part (c): Again, we just need to compare the PERCENTAGES!
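Here is a small Python sketch of why percentages are the fair comparison. The two samples and all of the readings below are hypothetical numbers invented for illustration (they are NOT the answers to this question), and the function names are made up too.

```python
def percent_between(f_low, f_high, n):
    """Percentage of the sample between two x-values.

    f_low and f_high are the y-values read off the curve at the
    lower and upper x-values; n is the total number in the sample.
    """
    return 100 * (f_high - f_low) / n

def percent_more_than(f, n):
    """Percentage of the sample ABOVE the x-value whose reading is f."""
    return 100 * (n - f) / n

# Hypothetical sample A: n = 200, with 140 below the chosen x-value.
# Hypothetical sample B: n = 80, with 50 below the same x-value.
more_a = 200 - 140  # 60 values above
more_b = 80 - 50    # 30 values above

pct_a = percent_more_than(140, 200)
pct_b = percent_more_than(50, 80)
print(more_a, more_b)  # 60 30
print(pct_a, pct_b)    # 30.0 37.5
```

Sample A has more values above the cut-off (60 against 30), yet sample B has the bigger percentage (37.5% against 30%), so comparing the raw NUMBERS would point the wrong way.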
Hints:
Part (a): Start by finding the NUMBER of students that were 1.4 - 1.7 m tall:
          By drawing vertical lines at x=1.7 and x=1.4, reading across to the y-values and subtracting…
          Then turn it into a percentage (by dividing by 4000 and multiplying by 100)
Part (b): This is kinda a ‘trick’ question - but understanding the answer is gonna be crucial to your understanding of continuous data…
          It's not impossible to deduce: If you follow the logic of how you answered Part (a), then it's kind-o-bvious - a ‘prize’ to anyone that gets it right!
Part (c): The TALLEST 10% refers to the 10% at the TOP of the curve…
          …10% of 4000 is 400
          The 400 at the ‘TOP’ of the curve are the 400 between y = 3600 and y = 4000
Amount (£)       │ 5-20    20-30   30-40   40-50   50-70   70-120
─────────────────┼────────────────────────────────────────────────
Number of pupils │ 8       15      18      11      6       2
Hints: We already know that n = 60 (i.e. the total number in the sample was 60) The MEDIAN is the 30th value:
WARNING: ››› Students sometimes write: MEDIAN = 30 = 34 ‹‹‹ which you should obviously recognise as gobbledegook (how can 30 = 34 ???) - it is BETTER to write: MEDIAN (30th value) = 34
Similarly, the 3rd QUARTILE is the 45th value, and the 1st QUARTILE is the 15th value
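The positions of the median and quartiles come straight from n, as this small Python sketch shows. The variable names are made up for illustration; the value 34 is the median reading quoted in the warning above.

```python
n = 60  # total number in the sample (the last cumulative frequency)

q1_position = n / 4          # 1st QUARTILE: the 15th value
median_position = n / 2      # MEDIAN: the 30th value
q3_position = 3 * n / 4      # 3rd QUARTILE: the 45th value

# Keep the POSITION (a y-axis number) apart from the VALUE (an x-axis
# reading) when writing the answer down.
median_value = 34  # read off the curve at y = 30
print(f"MEDIAN ({median_position:g}th value) = {median_value}")
# prints: MEDIAN (30th value) = 34
```

Writing the position in brackets keeps the y-axis number and the x-axis reading from being confused with each other.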
Time (mins) │ 10-15   15-30   30-40   40-50   50-65   65-90
────────────┼───────────────────────────────────────────────
Frequency   │ 4       25      36      27      7       1
Hints: Sorry - no help for you here!
Mass (kg) │ 10-16   16-19   19-22   22-26   26-40
──────────┼───────────────────────────────────────
frequency │ 4       14      18      9       3
Hints:
Firstly, before we can find median/quartiles etc. we need a CUMULATIVE FREQUENCY TABLE and a CUMULATIVE FREQUENCY CURVE (remembering to show n = 48 on our curve)
Americans call the MEDIAN the 50th percentile (we use: ½n, they use: ⁵⁰⁄₁₀₀n - same difference!)
To find the 90th Percentile, start by working out ⁹⁰⁄₁₀₀n
Then look up this value on the y-axis, read across to the curve and then down to the x-axis
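Reading a percentile off the curve can be imitated in Python. This is a rough sketch under two assumptions: the plotted points are joined with straight lines (your smooth curve may give slightly different readings), and the demo uses the x / f table from the start of this lesson (n = 160) rather than the mass data, so this question is still yours to do. The names `read_across` and `percentile` are made up for illustration.

```python
def read_across(target_f, ucbs, cum_freqs):
    """Find the x-value where the cumulative frequency reaches target_f."""
    points = list(zip(ucbs, cum_freqs))
    for (x0, f0), (x1, f1) in zip(points, points[1:]):
        if f0 <= target_f <= f1 and f1 > f0:
            # Straight-line estimate between two neighbouring plotted points.
            return x0 + (x1 - x0) * (target_f - f0) / (f1 - f0)
    raise ValueError("target_f must lie between 0 and n")

def percentile(p, ucbs, cum_freqs):
    """The p-th percentile: look up (p/100) x n on the y-axis."""
    n = cum_freqs[-1]
    return read_across(p * n / 100, ucbs, cum_freqs)

# Stand-in data: the cumulative frequency table from the start of the lesson.
ucbs = [5, 15, 20, 23, 26, 29, 33, 38]
cum_freqs = [0, 13, 31, 66, 108, 142, 155, 160]

median = percentile(50, ucbs, cum_freqs)  # looks up y = 80
p90 = percentile(90, ucbs, cum_freqs)     # looks up y = 144
print(median, round(p90, 1))
```

Calling `percentile(50, ...)` and looking up ½n are the same thing, which is the "same difference" mentioned above.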
Weight (kg) │ 10-19   20-29   30-34   35-39   40-49   50-59
────────────┼───────────────────────────────────────────────
frequency   │ 7       21      28      32      23      9
U.C.B. │ up to …   up to …   up to …   up to …   up to …   up to …   up to …
───────┼──────────────────────────────────────────────────────────────────────
F      │ 0         7         28        ...       ...       ...       120
Hints:
This is NOT WRITTEN using CLASS BOUNDARIES:

          these should be the same!
                  ┌┬────┬┐
                  ▼▼    ▼▼
 WEIGHT (kg) │ 10-19    20-29    30-34    35-39    40-49    50-59
─────────────┼────────────────────────────────────────────────────
      f      │ 7        21       28       32       23       9

The WEB LESSON showed you that, when that happens - you need to consider some values between x = 19 and x = 20 and see which CLASS each value should be placed in

 WHICH CLASS:  ‘10 < x ≤ 19’  OR  ‘20 < x ≤ 29’

   19   19.1   19.2   19.3   19.4          ──►  ‘10 < x ≤ 19’
   19.5   19.6   19.7   19.8   19.9   20   ──►  ‘20 < x ≤ 29’

   19.5 IS THE CUT-OFF VALUE

So we change:

 WEIGHT (kg) │ ⋯-19.5   19.5-⋯   ⋯-⋯      ⋯-⋯      ⋯-⋯      ⋯-⋯
─────────────┼────────────────────────────────────────────────────
      f      │ 7        21       28       32       23       9

Next, do the same to every UPPER-LIMIT (i.e. add ½)
Then, do the same to all the LOWER-LIMITS (i.e. subtract ½)
There - now we've got it written using CLASS-BOUNDARIES, we can carry on as usual!
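The ‘add ½ / subtract ½’ rule can be sketched in Python. This is only an illustration: the function name `to_class_boundaries` is made up, it assumes every gap between neighbouring classes is the same size, and the class limits in the demo are hypothetical (not the weight data above, so the blanks are still yours to fill in).

```python
def to_class_boundaries(class_limits):
    """Convert (lower_limit, upper_limit) pairs into class boundaries.

    Assumes the gap between neighbouring classes is the same everywhere
    (1 unit when the limits are whole numbers), so every limit moves by
    half of that gap.
    """
    gap = class_limits[1][0] - class_limits[0][1]
    half = gap / 2
    # Subtract half a gap from each lower limit, add it to each upper limit.
    return [(lower - half, upper + half) for lower, upper in class_limits]

# Hypothetical class limits, chosen only to illustrate the rule.
limits = [(1, 5), (6, 10), (11, 20)]
boundaries = to_class_boundaries(limits)
print(boundaries)  # [(0.5, 5.5), (5.5, 10.5), (10.5, 20.5)]
```

After the change, the end of each class equals the start of the next one, which is exactly the ‘no GAP’ check from Step 1 of the earlier hints.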