Example 5

Example 5.2: Speed Limit Changes

In 1995, the National Highway System Designation Act abolished the federal mandate of a 55-miles-per-hour maximum speed limit and allowed states to establish their own limits. Of the 50 states (plus the District of Columbia), 32 increased their speed limits in 1996. The data in trafficFatalities.mtw show the percentage change in interstate highway traffic fatalities from 1995 to 1996 and whether or not each state increased its speed limit. (Data from the National Highway Traffic Safety Administration as reported in Ramsey and Schafer, 2002.)

(a) Identify the observational units and response variable of interest. Is this a randomized experiment or an observational study?

(b) Produce numerical and graphical summaries of these data and describe how the two groups compare.

(c) Are the technical conditions for a two-sample t-test satisfied for these data?

(d) Carry out a two-sample t-test to determine whether the average percentage change in interstate highway traffic fatalities is significantly higher in states that increased their speed limit. If you find a significant difference, estimate its magnitude with a confidence interval.

(e) Discuss what the p-value in (d) measures.

Analysis:

(a) The observational units are the 50 states (and the District of Columbia). The response variable of interest is the percentage change in traffic fatalities from 1995 to 1996. This is an observational study because the researchers did not randomly assign which states would increase their speed limits.

(b) The following dotplots display the percentage change in traffic fatalities for each state (and D.C.) in the two groups on the same scale:

Since the distributions are reasonably symmetric, it makes sense to report the means and standard deviations as the numerical summaries:

No increase: $\bar{x}_{no} = -8.56\%$, $s_{no} = 31\%$

Increase: $\bar{x}_{yes} = 13.75\%$, $s_{yes} = 21\%$

These results indicate that the percentage change in traffic fatalities tends to be higher in the states that increased their speed limits. This tendency is also seen in stacked boxplots:

The boxplots also reveal an outlier, the District of Columbia, which did not change its speed limit and had an unusually large percentage decrease in traffic fatalities.

These summaries also reveal that the two sample distributions are reasonably similar in shape and spread.

(c) In considering the technical conditions, we see that the sample sizes (19 and 32) are reasonably large. Coupled with the roughly normal shapes of the sample distributions, the normality/large sample size condition appears to be satisfied, so we can use the t distribution.

The other technical condition is that we have independent random samples or randomization. We do not have either in this study, because we are examining the population of all states (and D.C.) and the states self-selected whether they changed their speed limit. Thus, any p-value we calculate would be hypothetical. Since we have all the states here, we might ask the question: would the two groups look this different if the decision to increase the speed limit had been assigned at random? Thus, we will proceed as if this were a hypothetical experiment.
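One way to make this question concrete is to reshuffle the group labels many times and see how often a difference at least as extreme as the observed one arises. The following is a minimal sketch of that idea, not the analysis carried out in this example: the function name `randomization_p_value` and the two short lists are hypothetical placeholders (they are NOT the NHTSA data, which would have to be read from trafficFatalities.mtw).

```python
import random

def randomization_p_value(group_no, group_yes, reps=10000, seed=0):
    """Approximate one-sided p-value for (mean of 'no' - mean of 'yes') <= observed.

    Labels are reshuffled at random, mimicking a hypothetical random
    assignment of states to the two groups under no treatment effect.
    """
    rng = random.Random(seed)
    n_no = len(group_no)
    observed = sum(group_no) / n_no - sum(group_yes) / len(group_yes)
    pooled = list(group_no) + list(group_yes)
    n_yes = len(pooled) - n_no
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # one hypothetical random assignment
        diff = sum(pooled[:n_no]) / n_no - sum(pooled[n_no:]) / n_yes
        if diff <= observed + 1e-12:  # at least as extreme (small float tolerance)
            count += 1
    return observed, count / reps

# Made-up illustrative values only, NOT the actual state data.
demo_no = [-30, -12, -5, 0, 4, -20]
demo_yes = [10, 15, 2, 25, 8, 18, 30]

observed_diff, p_value = randomization_p_value(demo_no, demo_yes)
print(f"observed difference = {observed_diff:.2f}, approximate p-value = {p_value:.4f}")
```

With the real data in place of the placeholder lists, the proportion returned would approximate the hypothetical p-value that the two-sample t procedure is used to model below.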

(d) Let d represent the true “effect” of increasing the speed limit on the traffic fatality rate (states that didn’t change their speed limit – states that did change their speed limit).

H₀: d = 0 there is no true effect from increasing the speed limit

Ha: d < 0 increasing the speed limit leads to an increase in traffic fatalities (higher average percentage change with increase in speed limit)

In theory, we can apply the two-sample t procedure to model the hypothetical randomization distribution. In this case, the test statistic will be

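$$t = \frac{\bar{x}_{no} - \bar{x}_{yes}}{\sqrt{\dfrac{s_{no}^2}{n_{no}} + \dfrac{s_{yes}^2}{n_{yes}}}} = \frac{-8.56 - 13.75}{\sqrt{\dfrac{31^2}{19} + \dfrac{21^2}{32}}} \approx -2.78$$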
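As a check on the arithmetic, the test statistic can be recomputed from the rounded summary statistics reported in (b). This is only a sketch: the helper name `unpooled_t` is an illustrative choice, and the inputs are the rounded values from the text rather than the raw data.

```python
import math

# Rounded summary statistics from part (b) of the text.
mean_no, sd_no, n_no = -8.56, 31.0, 19      # states that did not increase
mean_yes, sd_yes, n_yes = 13.75, 21.0, 32   # states that increased

def unpooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (difference in means, standard error, t statistic), unpooled."""
    diff = mean1 - mean2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return diff, se, diff / se

diff, se, t_stat = unpooled_t(mean_no, sd_no, n_no, mean_yes, sd_yes, n_yes)
print(f"difference = {diff:.2f}, SE = {se:.2f}, t = {t_stat:.2f}")
# prints: difference = -22.31, SE = 8.02, t = -2.78
```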

If we approximate the degrees of freedom by min(19-1, 32-1) = 18, then we find the one-sided p-value in Minitab to be:

These calculations are confirmed by the Test of Significance Calculator applet and by Minitab:

Note: Minitab uses a more exact method for determining the degrees of freedom. Our “by hand” method (also used in the applet) is conservative in that the p-value found will be larger than the actual p-value as seen here.
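To see how the choice of degrees of freedom affects the p-value, the sketch below compares the conservative df = 18 with the Welch-Satterthwaite approximation. Treating Welch-Satterthwaite as the "more exact method" is an assumption here, since the text does not name it. The t distribution's tail area is obtained by simple numerical integration so that only the standard library is needed; the helper names are illustrative, and the inputs are the rounded summaries from (b).

```python
import math

mean_no, sd_no, n_no = -8.56, 31.0, 19      # rounded values from the text
mean_yes, sd_yes, n_yes = 13.75, 21.0, 32

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_lower_tail(t, df, steps=2000):
    """P(T <= t), using symmetry about 0 and Simpson's rule on [t, 0]."""
    h = (0.0 - t) / steps                   # steps must be even
    total = t_pdf(t, df) + t_pdf(0.0, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return 0.5 - total * h / 3

def welch_df(sd1, n1, sd2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed method)."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

t_stat = (mean_no - mean_yes) / math.sqrt(sd_no**2 / n_no + sd_yes**2 / n_yes)
df_conservative = min(n_no - 1, n_yes - 1)
df_welch = welch_df(sd_no, n_no, sd_yes, n_yes)

p_conservative = t_lower_tail(t_stat, df_conservative)
p_welch = t_lower_tail(t_stat, df_welch)
print(f"df = {df_conservative}: one-sided p-value = {p_conservative:.4f}")
print(f"df = {df_welch:.1f}: one-sided p-value = {p_welch:.4f}")
```

Under these assumptions the Welch degrees of freedom come out near 28, and the corresponding p-value is smaller than the one based on 18 degrees of freedom, which is the sense in which the by-hand method is conservative.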

Such a small p-value (.005 < .01) indicates that, if there were no treatment effect, random assignment alone would produce a difference in group means at least this large only about 5 times in 1000. This convinces us that the observed difference in the group means is larger than what we would expect just from randomization. We have strong evidence that something other than “random chance” led to this difference. However, we cannot attribute the difference solely to the speed limit change since this was not actually a randomized experiment. Since the states self-selected, there could be confounding variables that help to explain the larger increase in fatality rates in states that increased their speed limit.

Since we rejected the null hypothesis, we are also interested in examining a confidence interval to estimate the size of the treatment effect. We first approximate the t* critical value for, say, 95% confidence, again using min(19-1, 32-1) = 18 as the degrees of freedom.

Then the 95% confidence interval can be calculated,

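$$(\bar{x}_{no} - \bar{x}_{yes}) \pm t^{*}\sqrt{\frac{s_{no}^2}{n_{no}} + \frac{s_{yes}^2}{n_{yes}}} = -22.4 \pm 16.90, \text{ or } (-39.3, -5.5)$$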

We are 95% confident that the true “treatment effect” is in this interval, that is, that the mean percentage change in traffic fatalities is between 5.5 and 39.3 percentage points higher in states that increased their speed limit than in states that did not (continuing to be careful not to state this as a cause-and-effect relationship).
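The interval can be roughly reproduced from the rounded summary statistics in (b). In this sketch the critical value 2.101 is the standard t-table value for 95% confidence with 18 degrees of freedom, which the text does not print and is therefore an assumption. Because the inputs are rounded, the result agrees with the interval above only approximately.

```python
import math

mean_no, sd_no, n_no = -8.56, 31.0, 19      # rounded values from the text
mean_yes, sd_yes, n_yes = 13.75, 21.0, 32

t_star = 2.101  # assumed: t-table critical value, 95% confidence, 18 df

diff = mean_no - mean_yes
se = math.sqrt(sd_no**2 / n_no + sd_yes**2 / n_yes)
margin = t_star * se
lower, upper = diff - margin, diff + margin

print(f"{diff:.2f} +/- {margin:.2f}  ->  ({lower:.1f}, {upper:.1f})")
```

The endpoints computed this way differ slightly from those in the text, presumably because the text's calculation used unrounded means and standard deviations. The conclusion is the same: the whole interval lies below zero.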

Before we complete this analysis, it is worthwhile to investigate the amount of influence that the outlier (the District of Columbia) has on the results, especially since D.C. does have different characteristics from the states in general. The updated Minitab output is below:

As we might have guessed, the mean percentage change in fatalities for the “No” group has increased, so the difference in the group means is less extreme. This leads to a less extreme test statistic and a larger p-value. While we would still reject the null hypothesis at the 5% level of significance, we would not at the 1% level of significance. Thus, we would now say we have “moderate” evidence against the null hypothesis.

(e) The above p-value measures how often we would see a difference in group means at least this large based on random assignment to the two groups if there were no true treatment effect. However, since this was not a randomized study, this p-value should be considered hypothetical. Still, we have some sense that the difference observed between the groups is larger than we would expect to see “by chance” even in a situation like this where it is not feasible to carry out a true randomized experiment. This gives some information that can be used in policy decisions, but we must be careful not to overstate the attribution to the speed limit change.