Stat 301 - HW 8
Due noon, Friday, March 13
Please remember to submit a separate file for each problem (with the problem number in the file name and your name inside the file) and to integrate all relevant computer output. Also remember that the Extensions assignments have their own submission page in Canvas.
JMP Note: To run a “matched pairs analysis” in JMP (using the two columns of data), the option is now under Analyze > Specialized Modeling > Matched Pairs.
0) Confirm with me if you requested a switch of final exam days (2pm section: Friday 1:10-4pm, 3pm section: Monday 1:10-4pm in 10-215)
Interest in review sessions? 3/15, 3/19
Two online course evaluations
1) Suppose you want to compare students’ performances on the first two exams in a course to see whether there is convincing evidence that students (in general) tend to perform better on the first exam.
(a) Would you recommend a paired design or an independent samples design? Justify your recommendation and clarify how each design would differ in this context.
(b) Consider the following two classes. In one class, pairing appears to be useful, but not in the other class. Explain how you can tell, and what this implies about students’ exam performances in the two classes.
Class A
            | Sample size | Mean       | Std dev   |
Exam 1      | n1 = 12     | x̄1 = 86.4 | s1 = 9.5  |
Exam 2      | n2 = 12     | x̄2 = 83.3 | s2 = 12.3 |
Differences | nd = 12     | x̄d = 3.2  | sd = 4.5  |

Class B
            | Sample size | Mean       | Std dev   |
Exam 3      | n3 = 12     | x̄3 = 86.4 | s3 = 9.5  |
Exam 4      | n4 = 12     | x̄4 = 83.3 | s4 = 12.3 |
Differences | nd = 12     | x̄d = 3.2  | sd = 18.0 |
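One way to see why the two classes differ is to compare the standard error each design would use. The following is a minimal sketch, not a required method: it takes the summary statistics from the table above, and the function names are made up for illustration. It also backs out the correlation implied by the identity Var(X − Y) = Var(X) + Var(Y) − 2 Cov(X, Y); because the table values are rounded, the implied correlations are approximate.

```python
import math

def se_independent(s1, n1, s2, n2):
    """Unpooled standard error of the difference between two independent sample means."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)

def se_paired(s_d, n_d):
    """Standard error of the mean of the paired differences."""
    return s_d / math.sqrt(n_d)

def implied_correlation(s1, s2, s_d):
    """Correlation between the two exams implied by the three standard deviations."""
    return (s1 ** 2 + s2 ** 2 - s_d ** 2) / (2 * s1 * s2)

# Summary statistics copied from the table (identical exam SDs in both classes).
se_indep = se_independent(9.5, 12, 12.3, 12)
se_class_a = se_paired(4.5, 12)
se_class_b = se_paired(18.0, 12)
r_class_a = implied_correlation(9.5, 12.3, 4.5)
r_class_b = implied_correlation(9.5, 12.3, 18.0)

print(f"Independent-samples SE: {se_indep:.2f}")
print(f"Paired SE, Class A: {se_class_a:.2f} (implied r = {r_class_a:.2f})")
print(f"Paired SE, Class B: {se_class_b:.2f} (implied r = {r_class_b:.2f})")
```

The comparison of each paired standard error with the independent-samples standard error is what indicates whether pairing helped.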
(c) Below is output of a power analysis in JMP for an independent samples design. Explain the output to a non-statistician. Your explanation should include
· Definitions of Alpha, Std Dev, Difference to detect, Sample Size
· Interpretation of the calculated power value in context
· Sketches (probably by hand) of the null and alternative sampling distributions, showing the area under the curve representing power
Hint: If you are having trouble visualizing this process, perhaps check out this applet: http://www.rossmanchance.com/applets/TwoPopulations.html?population=model
Also keep in mind that calculation of power is a two-step process.
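The two-step process can be sketched in code. This is only an illustration under stated assumptions: a one-sided test, a normal approximation in place of the t distribution that software such as JMP typically uses (so the result will not match the JMP output exactly), a common standard deviation in both groups, and made-up input values rather than the ones in the output. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(alpha, sd, diff, n_per_group):
    """Approximate power of a one-sided, two-sample test (normal approximation)."""
    # Standard error of the difference in sample means, equal SDs and group sizes.
    se = sd * sqrt(2 / n_per_group)
    # Step 1: find the rejection region under the null distribution (centered at 0).
    cutoff = NormalDist(0, se).inv_cdf(1 - alpha)
    # Step 2: find the probability of landing in that region under the
    # alternative distribution (centered at the difference to detect).
    power = 1 - NormalDist(diff, se).cdf(cutoff)
    return cutoff, power

# Hypothetical inputs, chosen only for illustration.
cutoff, power = power_two_sample(alpha=0.05, sd=10, diff=5, n_per_group=12)
print(f"Reject when the observed difference exceeds {cutoff:.2f}")
print(f"Approximate power: {power:.3f}")
```

The shaded area in the hand sketch corresponds to step 2: the part of the alternative distribution that lies beyond the cutoff found in step 1.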
(d) Below is output of a power analysis in JMP for a paired samples design. Explain to a non-statistician why the setup of the calculation has changed, whether this is reasonable, and why and how the power has changed so much.
2) hw8problem2.Rmd new 3/11
Does the path that you take to “round” first base make much of a difference? Hollander and Wolfe (1999) report on a Master’s Thesis by W. F. Woodward (1970) that investigated different base running strategies. For example, you could take a “narrow angle” or a “wide angle” around first base. In Woodward’s study, he used a stopwatch to time 22 different runners going from a spot 35 feet past home to a spot 15 feet before second. He had each runner use each method, with a rest period in between, randomizing which method they used first. The data in BaseRunning.txt show the time (in seconds) for each runner using the narrow angle and the wide angle.
(a) Compute the differences in times (wide – narrow).
Produce, include, and comment on relevant graphical displays and numerical
summaries for investigating the question of whether there is an advantage for
taking wide angles or narrow angles.
(b) Define the parameter of interest in this study and
write the null and alternative hypotheses for testing whether there is an
advantage for taking wide angles or narrow angles.
(c) Conduct a paired t-test or use the Matched Pairs applet to determine whether the data suggest a genuine difference in times for wide angles and narrow angles. If you use the t-test, make sure to comment on whether you believe the test procedure is valid and how you are deciding. (Remember to include your output.)
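For reference, the arithmetic behind a paired t statistic can be sketched as follows. The times in the example call are invented for illustration and are not the BaseRunning.txt data, and the function name is hypothetical. The sketch stops at the test statistic and degrees of freedom; the p-value still has to come from a t distribution in JMP, R, or the applet.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(wide, narrow):
    """Return mean difference, SD of differences, t statistic, and df (wide - narrow)."""
    diffs = [w - n for w, n in zip(wide, narrow)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)               # sample SD of the differences
    t_stat = d_bar / (s_d / sqrt(n)) # null hypothesis: mean difference is 0
    return d_bar, s_d, t_stat, n - 1

# Made-up times in seconds for five hypothetical runners.
wide_times = [5.50, 5.70, 5.60, 5.50, 5.85]
narrow_times = [5.55, 5.75, 5.50, 5.60, 5.90]
d_bar, s_d, t_stat, df = paired_t(wide_times, narrow_times)
print(f"mean diff = {d_bar:.3f}, sd = {s_d:.3f}, t = {t_stat:.2f}, df = {df}")
```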
(d) Construct, include, and interpret a 95% confidence
interval for estimating the population mean difference in base running
times. (Be sure it’s clear in your
interpretation which method is faster.)
(e) Summarize the conclusions you would draw from this
study. Make sure you comment on significance, estimation, generalizability, and
causation.
(f) Use a sign test (e.g., Investigation 2.7, Example 3.4). State the corresponding hypotheses (in symbols and in words) and the p-value (exact binomial or normal approximation). Compare your conclusions to those in question (c).
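An exact binomial sign test can be sketched in a few lines. This is one possible version, assuming a two-sided alternative, dropping any zero differences, and doubling the smaller tail probability; the function name and the example differences are made up for illustration.

```python
from math import comb

def sign_test(diffs):
    """Exact two-sided sign test: are positive and negative differences equally likely?"""
    nonzero = [d for d in diffs if d != 0]     # ties carry no sign information
    n = len(nonzero)
    k = sum(1 for d in nonzero if d > 0)       # number of positive differences
    # Tail probabilities under Binomial(n, 0.5).
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n
    p_two_sided = min(1.0, 2 * min(lower, upper))
    return n, k, p_two_sided

# Hypothetical differences (wide - narrow), not the real data.
n, k, p = sign_test([-0.05, -0.05, 0.10, -0.10, -0.05, 0.0])
print(f"{k} positive differences out of {n} non-zero; two-sided p = {p:.4f}")
```

Because the sign test uses only the direction of each difference and ignores its size, its p-value can differ noticeably from the paired t-test's.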
3) Researchers investigated a possible link between having a tonsillectomy and developing Hodgkin’s disease. They studied a sample of 85 Hodgkin’s patients who had a sibling of the same sex within 5 years of age who was free of the disease (170 individuals total, 85 pairs). Taking into account the paired nature of these data produces the following table:
                                    | Hodgkin’s patient, had tonsillectomy | Hodgkin’s patient, did not have tonsillectomy | Total |
Control, had tonsillectomy          | 26                                   | 7                                             | 33    |
Control, did not have tonsillectomy | 15                                   | 37                                            | 52    |
Total                               | 41                                   | 44                                            | 85    |
(a) Identify
the observational units in this study. Clearly identify the population of
interest.
(b) For how many pairs did the tonsillectomy outcome differ? How many “successes” and how many “failures” were there? (Be clear about how you are defining each of these.)
(c) Identify
the parameter of interest in this study.
(d) State appropriate null and alternative hypotheses for determining whether the sibling with Hodgkin’s disease was more likely to have had a tonsillectomy, in symbols and in words.
(e) Apply McNemar’s Test to calculate an appropriate p-value. Name the probability distribution you use and specify its input values. Report the p-value (include output) and summarize your conclusion in context.
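As a check on software output, the exact version of McNemar’s test reduces to a binomial calculation on the discordant pairs. The sketch below assumes a one-sided alternative and defines a “success” as a pair in which only the Hodgkin’s patient had a tonsillectomy; both choices, and the function name, are assumptions for illustration. The counts 15 and 7 come from the table above.

```python
from math import comb

def mcnemar_exact_one_sided(only_case, only_control):
    """P(X >= only_case) for X ~ Binomial(n, 0.5), n = number of discordant pairs."""
    n = only_case + only_control
    return sum(comb(n, i) for i in range(only_case, n + 1)) / 2 ** n

# Discordant pairs from the table: 15 where only the patient had a
# tonsillectomy, 7 where only the control sibling did.
p_value = mcnemar_exact_one_sided(15, 7)
print(f"Exact one-sided p-value: {p_value:.4f}")
```

The 26 + 37 concordant pairs do not enter the calculation, because they carry no information about which sibling was more likely to have had a tonsillectomy.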
(f) Calculate
and interpret a 95% confidence interval for the parameter in (c). Do you think your interval procedure is
valid?
Possible Extension Assignments
· Use the Two Populations applet to verify/approximate the power calculation in 1c. First, determine the rejection region, then estimate the power. Be very clear about your process and include some screen captures. (Hint: Are you using a one-sided or two-sided test? How do you use the rejection region in step 1 to estimate the power in step 2?)
· Find the “Ask Marilyn” column in Parade Magazine from Feb. 21, 2016. Review the discussion and how it relates to this class.
· Read, summarize, and critique Kwok et al. (2015) “Face touching: A frequent habit that has implications for hand hygiene,” American Journal of Infection Control, 43(112-4).
· If you performed poorly on a HW or Exam problem earlier in the quarter, include your reworked solution.
· Review and reflect on the ASA Press Release on p-values!
https://www.causeweb.org/cause/caption-contest/march/2020/submissions