Example 4.2: Pushing On

 

Like many states, California mandates physical fitness testing at different grade levels.  The recommended number of push-ups for 12-year-old males is 10-20 and for 13-year-old males is 12-25.  A sample of 80 7th grade males was obtained at a rural high school in Central California (Wetzel and Hernandez, 2004).  Data were gathered using the measurement techniques defined by the state. (The feet are together, the hands are shoulder-width apart, the subject's back is straight, and the eyes look toward the horizon.  The arms must bend to a 90-degree angle while the back stays flat.  The push-ups are counted at a set tempo without stopping to rest.)

 

A histogram and summary statistics of the sample data are below:

(a) Describe the distribution of the results in this sample.

(b) Suppose we take random samples of 80 males from a very large population.  According to the Central Limit Theorem, what can you say about the behavior of the sampling distribution of the sample means calculated from these samples?

(c) If this were a random sample from a population, would the sample data provide strong evidence that the population mean differs from 20 push-ups?  Conduct a significance test to address this question.  Also calculate confidence intervals to estimate the population mean with various levels of confidence.

(d) What precautions should we take in analyzing these results?

(e) Would it be reasonable to use this sample to calculate a prediction interval for the number of push-ups by a 7th grade male in California? 

 

 

Analysis:

(a) The sample distribution has a slight skew to the right.  The average number of push-ups by the 80 males in the sample was 15.49, with standard deviation 7.74 push-ups.  Most students completed about 10-25 push-ups, with the maximum around 40 or so.

(b) Since the sample size is large (80 > 30), the sampling distribution of sample means should be approximately normal, regardless of the population shape, with mean μ and standard deviation equal to σ/√n = σ/√80.  The exact values of μ (the population mean number of push-ups done by 7th grade males) and σ (the population standard deviation) are unknown, but they should be in the ballpark of 15.49 and 7.74, the sample statistics.
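
The Central Limit Theorem claim in (b) can be explored with a small simulation. The sketch below is only an illustration: the true population is unknown, so it assumes a hypothetical right-skewed (gamma) population whose mean and standard deviation are set equal to the sample values 15.49 and 7.74. The names MEAN, SD, N, and sample_mean are made up for this sketch.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical right-skewed population: a gamma distribution whose mean and
# standard deviation are set to the sample values (15.49 and 7.74).
# This is an assumption for illustration, not the real population.
MEAN, SD, N = 15.49, 7.74, 80
shape = (MEAN / SD) ** 2   # gamma shape = mean^2 / variance
scale = SD ** 2 / MEAN     # gamma scale = variance / mean

def sample_mean(n):
    """Mean of one random sample of size n from the hypothetical population."""
    return statistics.fmean(random.gammavariate(shape, scale) for _ in range(n))

# Draw many samples of 80 and record each sample mean
means = [sample_mean(N) for _ in range(5000)]

center = statistics.fmean(means)
spread = statistics.stdev(means)
theory_spread = SD / N ** 0.5   # sigma / sqrt(n)

print(f"mean of sample means: {center:.2f} (population mean {MEAN})")
print(f"SD of sample means:   {spread:.3f} (sigma/sqrt(n) = {theory_spread:.3f})")
```

Under these assumptions, the simulated sample means should center near the population mean and vary by roughly σ/√80, even though the individual values come from a skewed distribution.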

(c) Test of significance:  If this were a random sample from a larger population of 7th graders, let μ represent the mean number of push-ups that would be completed in this population. We want to decide whether the sample data provide convincing evidence that μ differs from 20.

            H0: μ = 20 (the population mean number of push-ups is 20)

            Ha: μ ≠ 20 (the population mean differs from 20)

           

 
Since we are working with a quantitative response variable and the sample size is large, we will model the sampling distribution of the sample means with the t distribution with 80-1=79 degrees of freedom.

 

            t = (x̄ - μ0)/(s/√n) = (15.49 - 20)/(7.74/√80) = -5.21

 

            p-value = 2P(T79 <-5.21)  = 2(.0000007) = .0000014

 

Student's t distribution with 79 DF

 

    x  P( X <= x )

-5.21    0.0000007

 

With such a small p-value, we easily reject the null hypothesis and conclude that the population mean number of push-ups differs from 20.  While the test tells us that the sample data provide overwhelmingly strong evidence that the population mean is not 20, it does not tell us what values are in fact plausible for the population mean.
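
The test statistic and p-value can be checked from the summary statistics alone. The sketch below uses only the Python standard library and approximates the tail area of the t distribution by numerical integration (Simpson's rule), which is one possible approach rather than what Minitab does internally. The helper names t_pdf and t_upper_tail are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, width=60.0, steps=20000):
    """Approximate P(T > t) for t >= 0 by Simpson's rule on [t, t + width].

    The density beyond t + width is negligible for the df used here.
    """
    h = width / steps  # steps must be even for Simpson's rule
    total = t_pdf(t, df) + t_pdf(t + width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return total * h / 3

# Summary statistics reported in the example
n, xbar, s, mu0 = 80, 15.49, 7.74, 20

se = s / math.sqrt(n)                 # standard error of the sample mean
t_stat = (xbar - mu0) / se            # one-sample t statistic
p_value = 2 * t_upper_tail(abs(t_stat), n - 1)   # two-sided p-value

print(f"SE = {se:.4f}, t = {t_stat:.2f}, two-sided p-value = {p_value:.7f}")
```

The standard error and t statistic should agree with the values above; the p-value may differ slightly in the last digit because the software output rounds t to -5.21 before looking up the tail probability.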

 

Confidence Intervals:

Again let μ represent the mean number of push-ups that would be completed in the hypothetical population of 7th graders.

Since we are working with a quantitative response variable and the sample size is large, we will model the sampling distribution of the sample means with the t distribution with 80-1=79 degrees of freedom.

 

To construct a 95% confidence interval for μ, we will use t*79 = 1.990:

Inverse Cumulative Distribution Function

Student's t distribution with 79 DF

 

P( X <= x )         x

      0.025  -1.99045

     

       x̄ ± t*79(s/√n) = 15.49 ± 1.990(7.74/√80) = 15.49 ± 1.72 = (13.77, 17.21)

 

Verifying these calculations in Minitab, we would find:

Test of mu = 20 vs not = 20

 

 N     Mean   StDev  SE Mean        95% CI            T      P

80  15.4900  7.7400   0.8654  (13.7675, 17.2125)  -5.21  0.000

 

Based on this sample, assuming it represents the larger population, we are 95% confident that the mean number of push-ups completed by all 7th graders in the population is between 13.77 and 17.21, so 20 is rejected as a plausible value at the .05 level of significance.  We could also find 99% and 99.9% confidence intervals for μ to be:

            99%:  15.49 ± t*79(7.74/√80) = 15.49 ± 2.28 = (13.21, 17.77)

            99.9%:  15.49 ± t*79(7.74/√80) = 15.49 ± 2.96 = (12.53, 18.45)

Thus, even with these stricter standards of 99% and 99.9% confidence, we still have reason to believe that the population mean is less than 20 push-ups.  These results are consistent with the extremely small p-value from the significance test above.
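
The three intervals can be reproduced from the summary statistics. The sketch below is one way to do so with only the Python standard library: it finds each critical value t* by bisection on a numerically integrated t distribution, which is an assumption of this sketch and not the method used by Minitab. The helper names t_pdf, t_cdf, and t_critical are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t) for t >= 0: 0.5 plus Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(conf, df):
    """t* such that P(-t* <= T <= t*) = conf, found by bisection."""
    target = 1 - (1 - conf) / 2
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Summary statistics reported in the example
n, xbar, s = 80, 15.49, 7.74
se = s / math.sqrt(n)

intervals = {}
for conf in (0.95, 0.99, 0.999):
    t_star = t_critical(conf, n - 1)
    margin = t_star * se
    intervals[conf] = (xbar - margin, xbar + margin)
    print(f"{conf:.1%}: t* = {t_star:.3f}, margin = {margin:.2f}, "
          f"CI = ({xbar - margin:.2f}, {xbar + margin:.2f})")
```

As the confidence level rises, t* and the margin of error grow, so the interval widens; the value 20 stays outside all three intervals.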

 

(d) We should be very cautious in generalizing these results, as we don’t know whether the push-up performance of the students sampled at this rural high school in Central California is representative of a larger population of 7th graders in the state.  The margin of error calculated only takes into account the expected amount of random sampling error, not any biases from our sampling methods.

 

(e) If we wanted to calculate a prediction interval, we would have the same concern that this sample may not be representative of 7th graders across the state, and the additional concern that the population distribution may not follow a normal distribution.  We have reason to doubt that it does since the sample shows some skewness to the right.  So while we may want to predict the number of push-ups by an individual in this population, it would be risky to do so with these data.
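
To see what is at stake, the sketch below computes, purely for illustration, the normal-theory 95% prediction interval x̄ ± t*·s·√(1 + 1/n) from the summary statistics, using the critical value 1.99045 from the inverse cumulative distribution output above. It assumes a normally distributed population, which is exactly the assumption questioned in (e), so the result should not be read as a recommended interval.

```python
import math

# Summary statistics reported in the example
n, xbar, s = 80, 15.49, 7.74
t_star = 1.99045  # 97.5th percentile of t with 79 df, from the output above

# Normal-theory 95% prediction interval for one new observation:
# xbar +/- t* * s * sqrt(1 + 1/n)
margin = t_star * s * math.sqrt(1 + 1 / n)
lower, upper = xbar - margin, xbar + margin

print(f"95% prediction interval: ({lower:.2f}, {upper:.2f})")
```

The lower endpoint comes out slightly below zero, which is impossible for a count of push-ups. This is a symptom of forcing a symmetric normal model onto a right-skewed distribution, and it reinforces the caution expressed in (e).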