Workshop Statistics: Discovery with Data, Second Edition
Topic 12: Sampling
Activity 12-1: States and SATs
(a) yes
(b) No, we didn't collect data from all students at the school. To
decide if you have a reasonable estimate, you have to decide if you think
the travel behavior of your class is representative of that of the rest
of the school.
Activity 12-2: Elvis Presley and Alf Landon
(a)
- population: Americans
- sample: those Americans listening to those radio stations who cared to
  call in
(b) No, it is probably not an accurate reflection of the beliefs of all
Americans. People who choose to call in (taking the time and spending the
money) probably feel differently about the issue than other Americans. The
poll also came from only a Dallas radio station, which may not be broadcast
nationwide, and the opinions of Southerners may differ since Elvis was from
the South.
(c)
- population: Americans
- sample: those Americans who were listed in telephone books and vehicle
  registration lists who cared to return the poll
(d) Their prediction was in error because their sampling technique was
biased. By sampling people who owned vehicles, they were looking
at people with some money. In 1936, people with money tended to vote Republican
(conservative). Those without money tended to vote Democratic (social
change). Thus, the pollsters heard mostly from Republicans, but then lots
of Democrats turned out on election day.
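The effect described in (d) can be sketched with a small simulation. This is only an illustration: the electorate size, the share of people on the phone and vehicle lists, and the voting tendencies are all made-up assumptions, not 1936 figures, and the sketch ignores the additional non-response problem (people who did not return the poll).

```python
import random

rng = random.Random(1936)  # fixed seed so the run is repeatable

# Hypothetical electorate: every number here is an illustrative assumption.
N = 100_000
population = []
for _ in range(N):
    listed = rng.random() < 0.35            # assumed share on phone/vehicle lists
    p_republican = 0.60 if listed else 0.30  # assumed voting tendencies
    population.append((listed, rng.random() < p_republican))

true_share = sum(vote for _, vote in population) / N

# Biased frame: only listed people can end up in the poll.
frame = [vote for listed, vote in population if listed]
biased_estimate = sum(rng.sample(frame, 2000)) / 2000

# Simple random sample from the whole electorate, for comparison.
everyone = [vote for _, vote in population]
srs_estimate = sum(rng.sample(everyone, 2000)) / 2000

print(f"true Republican share:      {true_share:.3f}")
print(f"estimate from biased frame: {biased_estimate:.3f}")
print(f"estimate from random sample: {srs_estimate:.3f}")
```

Under these assumptions the biased frame overstates the Republican share, while the simple random sample lands close to the true value.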
Activity 12-3: Sampling Senators
(a)
- Sex: categorical, binary
- Party: categorical, binary
- State: categorical
- Years of service: quantitative
(b) Answers will vary from student to student.
(c) sample
(d)-(h) Answers will vary from student to student.
(i) Answers will vary from class to class.
(j) This sampling method is biased because people will tend to choose
senators from their own state, as well as senators who have served for
many years. The latter has a direct impact on our measurement of "years
of service."
(k) No, this would not be likely to produce more representative samples.
Taking more data doesn't make up for the fact that the sampling is biased.
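The point in (k) can be checked with a rough simulation. The years-of-service values below are invented (not the real 1999 Senate data), and the bias model, in which a senator's chance of being named is proportional to years served, is an assumption chosen only to mimic "people remember long-serving senators." Sampling is done with replacement to keep the sketch short.

```python
import random

rng = random.Random(12)

# Hypothetical years of service for 100 senators (NOT the real data).
years = [(7 * i) % 40 + 1 for i in range(100)]
true_mean = sum(years) / len(years)

def average_biased_mean(sample_size, repetitions=2000):
    """Average of many sample means when long-serving senators are favored."""
    total = 0.0
    for _ in range(repetitions):
        # Assumed bias: selection chance proportional to years served.
        sample = rng.choices(years, weights=years, k=sample_size)
        total += sum(sample) / sample_size
    return total / repetitions

small = average_biased_mean(5)
large = average_biased_mean(50)

print(f"population mean:              {true_mean:.1f}")
print(f"typical biased mean, n = 5:   {small:.1f}")
print(f"typical biased mean, n = 50:  {large:.1f}")
```

In this sketch both sample sizes overestimate the population mean by about the same amount; the larger sample is only more consistently wrong.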
Activity 12-4: Sampling Senators (cont.)
Students' answers to (a)-(h) may differ since the
data is chosen randomly. These are meant to be sample answers using
row 1 of the random number table.
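If a random number table is not at hand, software can play the same role. A minimal sketch, assuming the senators carry ID numbers 1 through 100 (the numbering scheme is an assumption here):

```python
import random

rng = random.Random()  # unseeded, so each run (like each student) gets a different sample

# Draw five distinct ID numbers from 1..100, i.e. a simple random sample
# without replacement.
ids = rng.sample(range(1, 101), 5)
print(sorted(ids))
```

Every set of five IDs is equally likely to be chosen, which is what makes this a simple random sample.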
(a)
ID # | Name      | Party | Years | Your state?
17   | Byrd      | Dem   | 40    | no
13   | Brownback | Rep   | 2     | no
92   | Stevens   | Rep   | 31    | no
78   | Reid      | Dem   | 12    | no
38   | Frist     | Rep   | 4     | no
(b) no
(c)
- No. of Democrats: 2
- No. of Republicans: 3
Years of service
Sample Mean | Min | Max
17.8        | 2   | 40
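The counts and summary values in (c) can be reproduced from the five sampled senators in (a). A short sketch (the variable names are just for illustration):

```python
# The five senators from the sample answer to (a): (ID, name, party, years).
sample = [
    (17, "Byrd", "Dem", 40),
    (13, "Brownback", "Rep", 2),
    (92, "Stevens", "Rep", 31),
    (78, "Reid", "Dem", 12),
    (38, "Frist", "Rep", 4),
]

parties = [party for _, _, party, _ in sample]
years = [yrs for _, _, _, yrs in sample]

n_dem = parties.count("Dem")
n_rep = parties.count("Rep")
prop_dem = n_dem / len(sample)          # sample proportion of Democrats
mean_years = sum(years) / len(years)    # sample mean years of service

print(n_dem, n_rep, prop_dem, mean_years, min(years), max(years))
```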
(d) The proportional breakdown is Democrats 0.4, Republicans 0.6.
This is not equal to the proportional breakdown of the entire population
of Senators, though it is close. The mean years of service of the
sample is higher than that of the population.
(e) This does not mean that the sampling method is biased; we may simply
have been unlucky with this particular sample.
(f)-(h) Answers will vary from class to class.
(i)
- The 56% of callers who believed that Elvis was alive: statistic
- The 57% of voters who indicated they would vote for Alf Landon: statistic
- The 63% of voters who voted for Franklin Roosevelt: parameter
- The mean years of service among the 100 Senators: parameter
- The mean years of service among your five Senators: statistic
- The proportion of men in the entire 1999 Senate: parameter
Activity 12-5: Sampling Senators (cont.)
Note: The answer to (a) below is the answer to (a)
and (b) in the Minitab version. The answers to (b)-(f) below are
the answers to (c)-(g) respectively in the Minitab version.
Students' answers to (a)-(c) may differ since
the data is chosen randomly. These are meant to be sample answers.
(a)
sample         | 1    | 2    | 3    | 4    | 5   | 6    | 7    | 8   | 9    | 10
proportion Dem | 0.7  | 0.5  | 0.5  | 0.4  | 0.4 | 0.4  | 0.6  | 0.6 | 0.7  | 0.7
mean years     | 12.9 | 12.3 | 11.6 | 13.2 | 8.6 | 10.5 | 19.1 | 6.3 | 12.0 | 13.5
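To see how much the statistics vary from sample to sample, the ten sample results in the table can be summarized directly. A small sketch using only the values from the sample answer above:

```python
from statistics import mean, stdev

# Results of the ten samples from the table in (a).
prop_dem = [0.7, 0.5, 0.5, 0.4, 0.4, 0.4, 0.6, 0.6, 0.7, 0.7]
mean_years = [12.9, 12.3, 11.6, 13.2, 8.6, 10.5, 19.1, 6.3, 12.0, 13.5]

for label, values in (("proportion Dem", prop_dem), ("mean years", mean_years)):
    print(f"{label}: min={min(values)}, max={max(values)}, "
          f"average={mean(values):.2f}, sd={stdev(values):.2f}")
```

The spread between the smallest and largest values is the sampling variability: the same sampling method gives a different statistic each time.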
(b) no and no
(c) Answers will vary from student to student.
(d) yes
(e) sample size of 20
(f) sample size of 20
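The advantage of the larger sample size can be illustrated by simulation. The sketch below assumes a population of 100 senators with 45 Democrats (a split chosen for illustration) and compares how much the sample proportion varies for samples of 10 and of 20.

```python
import random
from statistics import pstdev

rng = random.Random(20)

# Assumed population: 100 senators, 45 of them Democrats.
population = ["Dem"] * 45 + ["Rep"] * 55

def sample_proportions(n, repetitions=5000):
    """Proportion of Democrats in each of many simple random samples of size n."""
    return [rng.sample(population, n).count("Dem") / n for _ in range(repetitions)]

sd_10 = pstdev(sample_proportions(10))
sd_20 = pstdev(sample_proportions(20))

print(f"spread of sample proportions, n = 10: {sd_10:.3f}")
print(f"spread of sample proportions, n = 20: {sd_20:.3f}")
```

Under these assumptions the proportions from samples of 20 cluster more tightly around the population value than those from samples of 10.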
Activity 12-6: Sampling Senators (cont.)
Note: The second half of the answer to (b) below
is the answer to (c) with the Minitab version. The answers to (c)
and (d) are the answers to (d) and (e) in the Minitab version.
Students' answers to (a)-(c) may differ since
the data is chosen randomly. These are meant to be sample answers.
(a) This distribution is roughly symmetric, centered at roughly 0.45.
There is a high level of granularity.
(b) This distribution is roughly symmetric, centered at roughly 0.45.
There is a high level of granularity. The sample proportion is unbiased in
both of these situations.
(c) These distributions seem to have similar variability. Their
level of granularity is the same. The proportions of data are about
the same.
(d) Not much changed when we sampled from the much larger population.
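The observation in (d) can be checked with a rough simulation. The population sizes (100 and 100,000), the 45% share of Democrats, and the sample size of 10 are all assumptions made for this sketch, not values taken from the activity.

```python
import random
from statistics import mean, pstdev

rng = random.Random(126)

def simulate(pop_size, n=10, repetitions=5000):
    """Center and spread of the sample proportion for a population of pop_size."""
    dems = pop_size * 45 // 100                      # assumed 45% Democrats
    population = [1] * dems + [0] * (pop_size - dems)
    props = [sum(rng.sample(population, n)) / n for _ in range(repetitions)]
    return mean(props), pstdev(props)

center_small, spread_small = simulate(100)
center_large, spread_large = simulate(100_000)

print(f"population 100:     center {center_small:.3f}, spread {spread_small:.3f}")
print(f"population 100,000: center {center_large:.3f}, spread {spread_large:.3f}")
```

In this sketch both sampling distributions are centered near 0.45 and have nearly the same spread, so the size of the population matters far less than the size of the sample.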