Workshop Statistics: Discovery with Data and Fathom

Topic 6: Comparing Distributions I: Quantitative Variables

Activity 6-1: Birth and Death Rates

(a) See Activity 6-12, where the table is broken down into eastern and western states.
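(c) Leaf unit: 0.1 births per 1000 residents. West leaves read outward (right to left) from the stem.

        West |    | East
             | 10 | 9
           9 | 11 | 3488
           9 | 12 | 358
     8874430 | 13 | 01125566799
    98765420 | 14 | 112
          94 | 15 | 2668
         331 | 16 |
             | 17 |
           9 | 18 |
             | 19 |
             | 20 |
           2 | 21 |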
(d) The median birth rate for the eastern states is 13.5 births per 1000 residents. This median for western states is 14.45 births per 1000 residents.
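The medians in (d) can be checked directly from the stemplot in (c). The following is a minimal sketch, assuming the leaves are transcribed correctly from that stemplot and that each leaf is a tenth of a birth per 1000 residents; the names EAST, WEST, and to_tenths are illustrative and not part of the original activity.

```python
from statistics import median

# Leaves copied from the stemplot in (c), written in increasing order.
# Stems are whole births per 1000 residents; leaves are tenths (assumed).
EAST = {10: "9", 11: "3488", 12: "358", 13: "01125566799", 14: "112", 15: "2668"}
WEST = {11: "9", 12: "9", 13: "0344788", 14: "02456789",
        15: "49", 16: "133", 18: "9", 21: "2"}


def to_tenths(stemplot):
    """Expand {stem: leaves} into a sorted list of rates in tenths (integers)."""
    return sorted(10 * stem + int(leaf)
                  for stem, leaves in stemplot.items()
                  for leaf in leaves)


east = to_tenths(EAST)
west = to_tenths(WEST)

# Work in integer tenths and divide once at the end to avoid rounding noise.
east_median = median(east) / 10
west_median = median(west) / 10

print(len(east), "eastern states, median", east_median)
print(len(west), "western states, median", west_median)
```

With the leaves as transcribed here, the sketch counts 26 eastern and 24 western states and reproduces the medians 13.5 and 14.45 stated in (d).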
(e) Answers will vary from student to student.
(f) The western states tend to have higher birth rates than eastern states. The median is higher in the west, and the distribution of birth rates in the west generally covers higher values than in the east.
(g) No, there are many cases where an eastern state has a higher birth rate than a western state. One extreme example of such a pair is Georgia and Montana.
(h) It's likely that the western state would have a higher birth rate than the eastern state, because the stemplot shows the western states' birth rates concentrated at higher values than the eastern states'.
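The reasoning in (g) and (h) can be made concrete by comparing every possible west/east pair of states. This is only a sketch under the assumption that the leaves below match the stemplot in (c); the variable names are hypothetical and the calculation is not part of the original solution.

```python
from itertools import product

# Leaves copied from the stemplot in (c); leaf unit assumed to be 0.1.
EAST = {10: "9", 11: "3488", 12: "358", 13: "01125566799", 14: "112", 15: "2668"}
WEST = {11: "9", 12: "9", 13: "0344788", 14: "02456789",
        15: "49", 16: "133", 18: "9", 21: "2"}


def to_tenths(stemplot):
    """Expand {stem: leaves} into birth rates expressed in integer tenths."""
    return [10 * stem + int(leaf)
            for stem, leaves in stemplot.items()
            for leaf in leaves]


# Every (western state, eastern state) pairing.
pairs = list(product(to_tenths(WEST), to_tenths(EAST)))

west_higher = sum(w > e for w, e in pairs)
east_higher = sum(w < e for w, e in pairs)
ties = len(pairs) - west_higher - east_higher

print(f"west higher: {west_higher} of {len(pairs)} pairs "
      f"({west_higher / len(pairs):.0%})")
print(f"east higher: {east_higher}, ties: {ties}")
```

Under these assumptions the western state has the higher rate in a clear majority of pairs, but far from all of them, which is consistent with both (g) and (h): a tendency, not a rule.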
 

Activity 6-2: Professional Golfers' Winnings

(a)
             min     Q1      median  Q3      max
    PGA      1259    1491    1675    2021    6617
    LPGA     296     367     491.5   584     1592
    Seniors  631     738     970     1118    2516
(b) Note that the tours are in a different order below:

(c) No, the boxplot would not look different because the five-number summary would not change.
(d) LPGA: IQR = 584 – 367 = 217, so 1.5 × IQR = 325.5. Subtracting this from the lower quartile (367) gives 41.5. Since no woman on the list made less than this, there are no low outliers. Adding 325.5 to the upper quartile (584) gives 909.5, so the high outliers are Webb (1592), Inkster (1337), and Pak (957).
    Seniors: IQR = 1118 – 738 = 380, so 1.5 × IQR = 570. Subtracting this from the lower quartile (738) gives 168, and no senior made less than that. Adding 570 to the upper quartile (1118) gives 1688, so the high outliers are Fleisher (2516), Irwin (2025), and Doyle (1912).
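The 1.5 × IQR arithmetic in (d) follows the same pattern for every tour, so it can be sketched as a small function. The quartiles below are taken from the five-number summaries in (a); the function name outlier_fences is an illustrative choice, and the PGA row is included only because its quartiles appear in the same table, not because (d) works it out.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) cutoffs given by the 1.5 x IQR rule."""
    iqr = q3 - q1
    step = 1.5 * iqr
    return q1 - step, q3 + step


# (Q1, Q3) for each tour, from the five-number summaries in (a).
quartiles = {
    "PGA": (1491, 2021),
    "LPGA": (367, 584),
    "Seniors": (738, 1118),
}

fences = {tour: outlier_fences(q1, q3) for tour, (q1, q3) in quartiles.items()}

for tour, (low, high) in fences.items():
    # Any winnings below `low` or above `high` would be flagged as outliers.
    print(f"{tour}: low outliers below {low}, high outliers above {high}")
```

For the LPGA and Seniors tours this reproduces the cutoffs 41.5 and 909.5, and 168 and 1688, used in (d).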
(e)

(f) Yes, the modified boxplots show the outliers and also how close the non-outliers come to them (or how far they are in the PGA case!).
(g) The boxplots reveal that the PGA golfers tend to make the most, followed in order by the Seniors and then the LPGA golfers. This difference is illustrated by the fact that the lowest money winner among the PGA top 30 would be a high outlier among the LPGA top 30. Also, the lowest money winner among the Seniors' top 30 is higher than the upper quartile among the women's top 30. The PGA winnings show the most variability (widest box). All three distributions of earnings are skewed to the right (in each, the maximum lies much farther above the median than the minimum lies below it).
 

Activity 6-3: Variables of State (cont.)

Region 1 is the West, 2 is the Northeast, 3 is the Midwest, and 4 is the South. Some clues are the high rates of college graduates and single persons in region 2 (Northeast), the high rate of teen mothers in region 4 (South), and the high rate of people of Mexican descent in region 1 (West).
 

Activity 6-4: Top American Films

(a)
               number  mean    std. dev.  min  Q1      median  Q3      max
    won Oscar  32      34.13   18.23      6    22.25   33      46.75   70
    did not    68      43.56   17.99      4    30.25   44.5    59.75   85
(b)

(c) Both distributions look fairly symmetric with maybe a slight skew to the right for movies that won the Oscar. The movies that won an Oscar tend to have lower age values than movies that did not win the Oscar. Both distributions show a large spread with slightly more variability among the ages of movies that did not win the Oscar.
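The comparison of spread in (c) can be checked against the summary statistics in (a). The sketch below uses only the values from that table; the dictionary layout and names are assumptions made for illustration.

```python
# Summary statistics for film ages, copied from the table in (a).
summary = {
    "won Oscar": {"sd": 18.23, "min": 6, "q1": 22.25, "q3": 46.75, "max": 70},
    "did not": {"sd": 17.99, "min": 4, "q1": 30.25, "q3": 59.75, "max": 85},
}

# Three common measures of spread for each group.
spread = {
    group: {
        "iqr": s["q3"] - s["q1"],
        "range": s["max"] - s["min"],
        "sd": s["sd"],
    }
    for group, s in summary.items()
}

for group, measures in spread.items():
    print(f"{group}: IQR = {measures['iqr']}, "
          f"range = {measures['range']}, std. dev. = {measures['sd']}")
```

By IQR (24.5 versus 29.5) and range (64 versus 81), the films that did not win show somewhat more variability, while the two standard deviations are nearly equal, so "slightly more" is an appropriately cautious description.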

(d)-(f) Answers vary from student to student.