Stat 301 - HW 3

Due noon Friday, Jan. 24

You can bring a hard copy to class Thursday or to my office Friday, or upload it to Canvas by noon Friday. No late assignments will be accepted. If electronic, remember to upload separate files for each problem and to put your name inside each file. All computer output should be integrated into the body of your write-up. Your computer output and write-up should be done individually. Remember to show your work/calculations/computer details.

Technology note: I’m fine with you spelling out symbols (e.g., “pi”) on homework assignments, or click here for instructions on creating symbols in your Word/Google Doc files.

 

1) Continuation from the Day 9 (Tuesday) handout.

(j) Suppose the researchers had conjectured that there would be “interference” and the subject would actually perform worse than guessing in the long run.  So they hypothesized

            H0: π = 0.50

            Ha: π < 0.50

Calculate the power (use the Power Simulation applet) assuming an alternative value of 0.45 and a sample size of 150.  Include your output.

(k) Suppose the researchers didn’t know in advance whether the subject would do better or worse than guessing, so they hypothesized

            H0: π = 0.50

            Ha: π ≠ 0.50

Calculate the power assuming an alternative value of 0.45 and a sample size of 150.  Include your output.

(l) How does the power in (k) compare to (j)?  Give an intuitive explanation for why the power has changed this way. (This question won’t be graded, give it your best shot :)

(m) Verify your calculations in (j) and (k) in JMP or R (see p. 60).  Include relevant output.

[Hint: In JMP, DOE > Design Diagnostics > Sample Size and Power. If you don’t see DOE, try View > JMP Starter]
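If you want a third check alongside the applet and JMP/R, the exact binomial power in (j) and (k) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the significance level of 0.05 is not given in the problem and is assumed here, the two-sided test is assumed to put half of that level in each tail, and the helper names (`binom_cdf`, `lower_cutoff`) are made up for this sketch. Your applet or JMP output may differ slightly if it uses a different level or a normal approximation.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

def lower_cutoff(n, p0, tail_alpha):
    """Largest k with P(X <= k | p0) <= tail_alpha (returns -1 if there is none)."""
    k = -1
    while binom_cdf(k + 1, n, p0) <= tail_alpha:
        k += 1
    return k

# alpha = 0.05 is an ASSUMPTION; the problem does not state a level.
n, p0, p_alt, alpha = 150, 0.50, 0.45, 0.05

# (j) One-sided test, Ha: pi < 0.50. Reject when X <= k1.
k1 = lower_cutoff(n, p0, alpha)
power_one_sided = binom_cdf(k1, n, p_alt)

# (k) Two-sided test, Ha: pi != 0.50. Reject when X <= k2 or X >= n - k2
# (the null distribution is symmetric because p0 = 0.50).
k2 = lower_cutoff(n, p0, alpha / 2)
power_two_sided = binom_cdf(k2, n, p_alt) + (1 - binom_cdf(n - k2 - 1, n, p_alt))

print("one-sided cutoff:", k1, "power:", round(power_one_sided, 4))
print("two-sided cutoff:", k2, "power:", round(power_two_sided, 4))
```

Comparing the two cutoffs printed by this sketch is one way to start thinking about (l).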

 

 

2) I once saw a claim on a “trivia coaster” from the Rock n’ Roll Hall of Fame that 30% of all songs with a color in the title are about the color blue.

(a) Describe the parameter of interest.

(b) Suppose we decide to take a random sample of 75 songs to test this claim. Does 75 appear to be a large enough sample size for the normal approximation to the binomial to be valid?  Explain your reasoning.


(c) If the normal approximation suggested in (b) is valid, what shape, mean, and standard deviation do you predict for the distribution of sample proportions?

(d) Use the normal distribution suggested in (c) to approximate the probability that between 24% and 36% (inclusive) of the songs in the sample will be “blue.” [Use the normal probability calculator applet and include documentation of your inputs and output, including a good label on the horizontal axis.]
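The calculation in (b) through (d) can also be sketched in Python as a cross-check on the applet (you still need to include the applet output). The sketch assumes the claimed value of 0.30 is the true probability, uses the common "at least 10 expected successes and 10 expected failures" rule of thumb for (b), which may not be the exact rule in your text, and applies no continuity correction. The function name `normal_cdf` is just a helper defined here.

```python
from math import erf, sqrt

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution with the given mean and SD."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

pi0, n = 0.30, 75  # claimed proportion and sample size from the problem

# (b) Rule-of-thumb check (ASSUMED rule: both expected counts at least 10).
expected_successes = n * pi0
expected_failures = n * (1 - pi0)
approx_ok = expected_successes >= 10 and expected_failures >= 10

# (c) Predicted mean and standard deviation of the sample proportions.
mean = pi0
sd = sqrt(pi0 * (1 - pi0) / n)

# (d) Normal approximation to P(0.24 <= p-hat <= 0.36), no continuity correction.
prob = normal_cdf(0.36, mean, sd) - normal_cdf(0.24, mean, sd)

print("expected counts:", expected_successes, expected_failures, "ok:", approx_ok)
print("mean:", mean, "sd:", round(sd, 4))
print("approximate probability:", round(prob, 4))
```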

Further exploration of confidence intervals.

(e) Open the Simulating Confidence Intervals applet. Specify π = 0.30 and n = 75. Press Sample. The applet generates a random sample of 75 observations from a process with π = 0.30. Click on the interval to reveal the endpoints. Include a screen capture showing these endpoints.  Report the sample proportion for this interval (look at the bottom) and verify that the endpoints correspond to p̂ ± 1.96 × sqrt(p̂(1 − p̂)/n). (Show the numbers substituted in.)
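To see what "substituting the numbers in" looks like, here is a small Python sketch of the usual 95% Wald form, sample proportion plus or minus 1.96 standard errors. The count of 24 successes out of 75 is a made-up example, not a value from the applet; your own sample will almost certainly differ, so use your own sample proportion in the write-up.

```python
from math import sqrt

n = 75
x = 24          # HYPOTHETICAL number of successes, for illustration only
z_star = 1.96   # multiplier for 95% confidence

p_hat = x / n
std_error = sqrt(p_hat * (1 - p_hat) / n)
margin = z_star * std_error

lower = p_hat - margin
upper = p_hat + margin

print("p-hat:", round(p_hat, 4))
print("standard error:", round(std_error, 4))
print("interval:", round(lower, 4), "to", round(upper, 4))
```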

(f) Change the number of Intervals to 100 and press Sample. The applet generates 100 random samples from the process you specified and calculates a confidence interval for each sample.  Include a screen capture of the distribution of sample proportions and of the intervals.  What do the red and green colors tell you?

(g) Continue to press Sample until you have generated at least 1,000 intervals.  Include a screen capture of the Running Total output.

(h) Based on what you found in (g), explain to a friend what “95% confidence” means. [Hint: Don’t use the words probability, confidence, chance, odds, likelihood etc. in your explanation. See p. 80?]

(i) Use the pull-down menu to change the Method from Wald to Adjusted Wald. Press Sample 10 times to generate at least 1,000 intervals in the running total.  How does the Running Total percentage with this method compare to that in (g)? [Include a screen capture.]
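The long-run behavior explored in (g) and (i) can be imitated with a short simulation. The Python sketch below is an illustration, not a replacement for the applet output: it assumes "Adjusted Wald" means the "plus four" version (add 2 successes and 2 failures before computing the interval), which may not match the applet's exact definition, and the seed, the number of repetitions, and the function names are arbitrary choices made here.

```python
import random
from math import sqrt

def wald(x, n, z=1.96):
    """95% Wald interval from x successes in n trials."""
    p = x / n
    m = z * sqrt(p * (1 - p) / n)
    return p - m, p + m

def adjusted_wald(x, n, z=1.96):
    """'Plus four' interval (ASSUMED definition): add 2 successes and 2 failures."""
    p = (x + 2) / (n + 4)
    m = z * sqrt(p * (1 - p) / (n + 4))
    return p - m, p + m

def coverage(method, pi, n, reps, rng):
    """Fraction of simulated intervals that capture the true value pi."""
    hits = 0
    for _ in range(reps):
        x = sum(rng.random() < pi for _ in range(n))  # one random sample
        low, high = method(x, n)
        hits += low <= pi <= high
    return hits / reps

rng = random.Random(301)        # arbitrary seed so the run is repeatable
pi, n, reps = 0.30, 75, 5000    # reps is an arbitrary choice

wald_cov = coverage(wald, pi, n, reps, rng)
adj_cov = coverage(adjusted_wald, pi, n, reps, rng)

print("Wald coverage:", wald_cov)
print("Adjusted Wald coverage:", adj_cov)
```

Each printed value plays the role of the applet's Running Total percentage for that method.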

 

 

3) Patients with kidney disease often undergo a procedure called dialysis to clean their blood when their kidneys can’t do this properly. This procedure is often done three days per week, commonly on Monday, Wednesday, and Friday. In terms of the dialysis treatments, these days are all the same except for the gap of two off days before the Monday treatment. Does this gap make a difference? Cardiac arrest and sudden death are possibilities for patients undergoing dialysis for kidney disease. In recent years, it has been noted that these patients have cardiac arrests on Mondays more often than would be expected. A study published in Kidney International (Karnik et al., 2001) looked at 205 dialysis patients who had cardiac arrests on Monday, Wednesday, or Friday, the same days they had dialysis. They found that 93 of these happened on a Monday. Do we have convincing evidence of a larger or smaller probability of cardiac arrests occurring on Mondays compared to the other two days?

(a) Define the parameter of interest in words.

(b) State the null and alternative hypotheses for this research question.

(c) Calculate the sample proportion (to four decimal places).

(d) Does the normal approximation to the binomial distribution appear to be valid for these data? Explain your reasoning.

(e) Carry out a one-sample z-test using an applet or R or JMP, including your output. Report the test statistic (z-score) and p-value.  Provide a one-sentence interpretation of each in context.

(f) Use technology to compute a 95% confidence interval (one proportion z-interval) for the parameter. Include your output and provide a one-sentence interpretation of the interval in context.

(g) Does 1/3 appear to be a reasonable value of π? Explain how your results in (e) and (f) lead to the same conclusion.
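As a cross-check on the applet, R, or JMP output for (c), (e), and (f), the arithmetic can be sketched in Python. This sketch assumes a null value of 1/3 (Monday no different from the other two days), a two-sided alternative as the research question suggests, the null value in the standard error for the test, and the sample proportion in the standard error for the interval. Technology that uses a continuity correction or a different interval method will give slightly different numbers.

```python
from math import erfc, sqrt

x, n = 93, 205    # Monday cardiac arrests and total, from the problem
pi0 = 1 / 3       # ASSUMED null value: Monday is like the other two days

# (c) Sample proportion
p_hat = x / n

# (e) One-sample z-test, standard error computed under the null hypothesis
se_null = sqrt(pi0 * (1 - pi0) / n)
z = (p_hat - pi0) / se_null
p_value = erfc(abs(z) / sqrt(2))   # two-sided normal tail probability

# (f) 95% one-proportion z-interval, standard error uses the sample proportion
se_hat = sqrt(p_hat * (1 - p_hat) / n)
lower = p_hat - 1.96 * se_hat
upper = p_hat + 1.96 * se_hat

print("p-hat:", round(p_hat, 4))
print("z:", round(z, 2), "two-sided p-value:", p_value)
print("95% interval:", round(lower, 4), "to", round(upper, 4))
```

For (g), compare where 1/3 falls relative to the printed interval with what the p-value says about the same value.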

 

Project 1

·       Form a group of 3 people

·       Brainstorm ideas (I highly encourage you to think of a claim you recently overheard and how you could gather data to test the claim)

·       Submit your idea in Canvas (by Monday, but the sooner the better). Follow the link for Project 1 under Assignments and type the information in the text box (one person per group).

 

Possible Extension Ideas:

·       Last set of job candidates!

o   Research presentation: Tuesday, 1/21 at 4:10 PM in 33-285

o   Teaching demonstration: Wednesday, 1/22 at 11:10 AM in 38-121

o   Research presentation: Thursday, 1/23 at 4:10 PM in 38-135

o   Teaching demonstration: Friday, 1/24 at 11:10 AM in 38-121

·       Find an article that talks about using a power calculation to select the sample size for a study.

·       Look into some of the controversy and debate over confidence intervals for a single proportion!

·       Look into some of the controversy and debate over the use of p-value in scientific studies!

·       Watch this inappropriate critique of scientific studies by John Oliver (19:27), identify 2-3 points you agree with and 2-3 points you don’t agree with

·       Watch this wonderful commentary on the Census from the old Daily Show. How would you explain statistical sampling?