Workshop Statistics: Discovery with Data and Fathom
Topic 12: Sampling
Activity 12-1: States and SATs
(a) yes
(b) No, we didn't collect data from all students at the school. To
judge whether the estimate is reasonable, you have to decide whether
the travel behavior of your class is representative of that of the rest
of the school.
Activity 12-2: Elvis Presley and Alf Landon
(a)
- population: Americans
- sample: those Americans listening to those radio stations who cared to
call in
(b) No, it is probably not an accurate reflection of the beliefs of all Americans.
People who choose to call in (taking the time and spending the money) probably
feel differently about the issue than other Americans. The poll was also run
by only a Dallas radio station, which may not be broadcast nationwide, and the
opinions of Southerners may differ since Elvis was from the South.
(c)
- population: Americans
- sample: those Americans who were listed in telephone books and vehicle
registration lists who cared to return the poll
(d) Their prediction was in error because their sampling technique was
biased. By sampling people who owned vehicles, they were looking
at people with some money. In 1936, people with money tended to vote Republican
(conservative). Those without money tended to vote Democratic (social
change). Thus, the pollsters heard mostly from Republicans, but then lots
of Democrats turned out on election day.
Activity 12-3: Sampling Senators
(a)
- Sex: categorical, binary
- Party: categorical, binary
- State: categorical
- Years of service: quantitative
(b) Answers will vary from student to student.
(c) sample
(d)-(h) Answers will vary from student to student.
(i) Answers will vary from class to class.
(j) This sampling method is biased because people will tend to choose
senators from their own state, as well as senators who have served for
many years. The latter has a direct impact on our measurement of "years
of service."
(k) No, this would not be likely to produce more representative samples.
Taking more data doesn't make up for the fact that the sampling is biased.
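
The point in (k) can be checked with a small simulation. This is only a sketch under stated assumptions: the years-of-service values are made up (not the actual Senate data), and the bias is modeled, hypothetically, by picking each senator with probability proportional to years of service. Under those assumptions the average sample mean stays above the population mean whether the sample size is 5 or 20.

```python
import random
from statistics import mean

rng = random.Random(12)  # fixed seed so the run is repeatable

# ASSUMPTION: made-up years of service for 100 senators,
# not the real Senate data used in the activity.
population = [rng.randint(1, 40) for _ in range(100)]
true_mean = mean(population)

def biased_sample(n):
    """Pick n senators with probability proportional to years of service.

    This is a hypothetical model of the tendency to name long-serving
    senators. random.choices draws with replacement, which is a
    simplification.
    """
    return rng.choices(population, weights=population, k=n)

def average_sample_mean(n, reps=2000):
    """Average of the sample means over many biased samples of size n."""
    return mean(mean(biased_sample(n)) for _ in range(reps))

avg_n5 = average_sample_mean(5)
avg_n20 = average_sample_mean(20)

print("population mean:        ", round(true_mean, 1))
print("avg biased mean, n = 5: ", round(avg_n5, 1))
print("avg biased mean, n = 20:", round(avg_n20, 1))
```

Both averages overshoot the population mean by about the same amount, so the larger sample does not remove the bias.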
Activity 12-4: Sampling Senators (cont.)
Students' answers to (a)-(h) may differ since the
data is chosen randomly. These are meant to be sample answers using
row 1 of the random number table.
(a)
ID # | Name      | Party | Years | Your state?
17   | Byrd      | Dem   | 40    | no
13   | Brownback | Rep   | 2     | no
92   | Stevens   | Rep   | 31    | no
78   | Reid      | Dem   | 12    | no
38   | Frist     | Rep   | 4     | no
(b) no
(c)
- No. of Democrats: 2
- No. of Republicans: 3

Years of service
Sample Mean | Min | Max
17.8        | 2   | 40
(d) The proportional breakdown is Democrats .4, Republicans .6.
This is not equal to the proportional breakdown of the entire population
of Senators, though it is close. The mean years of service of the
sample is higher than that of the population.
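
The sample summaries in (c) and (d) can be reproduced from the five senators listed in (a). The following is a minimal sketch; the variable names are illustrative only.

```python
from statistics import mean

# The five sampled senators from part (a): (name, party, years of service).
sample = [
    ("Byrd", "Dem", 40),
    ("Brownback", "Rep", 2),
    ("Stevens", "Rep", 31),
    ("Reid", "Dem", 12),
    ("Frist", "Rep", 4),
]

years = [y for _, _, y in sample]
parties = [p for _, p, _ in sample]

# Counts and proportions by party.
n_dem = parties.count("Dem")
n_rep = parties.count("Rep")
prop_dem = n_dem / len(sample)
prop_rep = n_rep / len(sample)

# Summary of years of service.
sample_mean = mean(years)

print("Democrats:", n_dem, "Republicans:", n_rep)
print("Proportions:", prop_dem, prop_rep)
print("Mean:", sample_mean, "Min:", min(years), "Max:", max(years))
```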
(e) This does not mean that the sampling method is biased; we may simply
have gotten an unlucky sample.
(f)-(h) Answers will vary from class to class.
(i)
- The 56% of callers who believed that Elvis was alive: statistic
- The 57% of voters who indicated they would vote for Alf Landon: statistic
- The 63% of voters who voted for Franklin Roosevelt: parameter
- The mean years of service among the 100 Senators: parameter
- The mean years of service among your five Senators: statistic
- The proportion of men in the entire 1999 Senate: parameter
Activity 12-5: Sampling Senators (cont.)
(a)-(f) Answers will vary from student to student.
(g) The distribution using a sample size of 20 should have less variability.
(h) A sample of size 20 is more likely to have a mean close to the
true mean of the population.
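
The claims in (g) and (h) can be illustrated with a short simulation. This is a sketch, assuming a made-up population of 100 years-of-service values rather than the actual Senate data; it draws many simple random samples of size 5 and of size 20 and compares how much the sample means vary.

```python
import random
from statistics import mean, stdev

rng = random.Random(125)  # fixed seed so the run is repeatable

# ASSUMPTION: made-up years of service for 100 senators,
# not the real data used in the activity.
population = [rng.randint(1, 40) for _ in range(100)]

def sample_means(n, reps=1000):
    """Means of `reps` simple random samples (without replacement) of size n."""
    return [mean(rng.sample(population, n)) for _ in range(reps)]

sd_n5 = stdev(sample_means(5))
sd_n20 = stdev(sample_means(20))

print("population mean:            ", round(mean(population), 1))
print("SD of sample means, n = 5:  ", round(sd_n5, 2))
print("SD of sample means, n = 20: ", round(sd_n20, 2))
```

The standard deviation of the sample means is smaller for n = 20, which is why a sample of that size is more likely to land near the population mean.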
Activity 12-6: Sampling Senators (cont.)
Students' answers may differ since the data is chosen
randomly. These are meant to be sample answers.
(e) The distribution is roughly symmetric, with a center around .45.
There is a high level of granularity.
(f)
(g) Yes, both distributions seem to be centered at the population value
of .45. The mean of the first distribution is .43 and the mean of
the second distribution is .46. Both of these are pretty close to
.45, which suggests that the sample proportion is unbiased.
(h) Yes, these distributions seem to have similar variability.
(i) Nothing significant changed when we sampled from the larger population.
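
The observations in (g)-(i) can be sketched in a simulation. The population proportion of .45 comes from the answers above; the population sizes (100 and 1000), the sample size (10), and the number of repetitions are assumptions chosen only for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(126)  # fixed seed so the run is repeatable

def make_population(size, p=0.45):
    """Population of 0/1 values in which a proportion p are 1."""
    ones = round(size * p)
    return [1] * ones + [0] * (size - ones)

def sample_proportions(population, n=10, reps=2000):
    """Sample proportions from `reps` simple random samples of size n.

    ASSUMPTION: n = 10 is a hypothetical sample size, not taken from the text.
    """
    return [mean(rng.sample(population, n)) for _ in range(reps)]

small = sample_proportions(make_population(100))    # hypothetical size
large = sample_proportions(make_population(1000))   # hypothetical size

mean_small, sd_small = mean(small), stdev(small)
mean_large, sd_large = mean(large), stdev(large)

print("population of 100:  mean", round(mean_small, 3), "SD", round(sd_small, 3))
print("population of 1000: mean", round(mean_large, 3), "SD", round(sd_large, 3))
```

Under these assumptions both sets of sample proportions center near .45 and have similar spread, consistent with the answer that sampling from the larger population changes little.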