Workshop Statistics: Discovery with Data, Second Edition
Topic 10: Least Squares Regression I
Activity 10-1: Airfares
(a) Answers will vary from student to student. One reasonable choice is the mean airfare, $166.92.
(b) Answers will vary from student to student, but some examples would
be which airline you choose to fly with, the distance to the destination
city, or how far in advance you book.
(c) Based on the scatterplot, knowing the distance would be useful
because there appears to be a fairly strong association between distance
and airfare.
(e) Answers will vary from student to student, but $130 is a good estimate.
(f) Answers will vary from student to student, but $260 is a good estimate.
(g) Answers will vary from student to student, but using our answers
from (e) and (f), slope = (260-130)/(1500-300)=13/120=.10833.
(h) Answers will vary from student to student, but using our answers
from (e) and (g), intercept = 130 - .10833(300) = 97.5.
(i) Answers will vary from student to student, but using our answers
from (g) and (h), airfare = 97.5 + (.108) * distance.
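The arithmetic in (g) through (i) can be checked with a short Python sketch. It assumes the two eyeballed points from (e) and (f), namely (300 miles, $130) and (1500 miles, $260); the function name `guess_airfare` is made up for illustration, and a student with different estimates would get a different line.

```python
# Line through two points read off the scatterplot (estimates, not data):
# (300 miles, $130) from part (e) and (1500 miles, $260) from part (f).
x1, y1 = 300, 130
x2, y2 = 1500, 260

slope = (y2 - y1) / (x2 - x1)   # rise over run, as in part (g)
intercept = y1 - slope * x1     # solve y1 = intercept + slope * x1, as in part (h)


def guess_airfare(distance):
    """Airfare in dollars predicted by the hand-drawn line (hypothetical helper)."""
    return intercept + slope * distance


print(round(slope, 5), round(intercept, 1))
```

By construction the line passes through both chosen points, so `guess_airfare(300)` and `guess_airfare(1500)` return the estimates from (e) and (f).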
Activity 10-2: Airfares (cont.)
(a)
             | mean  | std. dev.
airfare (y)  | 166.9 | 59.5
distance (x) | 713   | 403
r = .795
(b) b =.795(59.5/403) = .117; a = 166.9-.117(713) = 83.479
(c) airfare = 83.479 + .117 * distance
(d) 83.479+.117(300) = $118.58
(e) $258.98
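Parts (b) through (e) follow from the summary statistics alone, using slope b = r(s_y/s_x) and intercept a = (mean of y) - b(mean of x). The sketch below is one way to reproduce them in Python; it rounds b and a to three decimals, which is an assumption made to match the rounded values used in this answer key, and `predict_airfare` is an illustrative name.

```python
# Summary statistics from part (a).
mean_x, sd_x = 713, 403      # distance (miles)
mean_y, sd_y = 166.9, 59.5   # airfare (dollars)
r = 0.795                    # correlation between distance and airfare

# Least squares slope and intercept, rounded to three decimals
# (assumption: this mirrors the rounding in the answer key).
b = round(r * sd_y / sd_x, 3)
a = round(mean_y - b * mean_x, 3)


def predict_airfare(distance):
    """Predicted airfare in dollars for a distance in miles."""
    return a + b * distance


for d in (300, 1500):
    print(d, round(predict_airfare(d), 2))
```

Carrying more decimal places for b and a would shift the predictions by a few cents.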
(f)
(g) Answers will vary from student to student, but a good estimate
would be $190.
(h) $188.78
(i) $415.99; this is probably not a reliable estimate, since a
distance of 2,842 miles is well beyond the range of distances in our
data set (extrapolation).
(j)
distance          | 900     | 901     | 902     | 903
predicted airfare | $188.78 | $188.90 | $189.01 | $189.13
(k) Each additional mile raises the predicted airfare by about $0.12,
which agrees with the slope of our least squares regression line, .117.
(l) $11.70
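The slope interpretation in (j) through (l) can be illustrated with a few lines of Python. This sketch assumes the rounded equation from part (c), airfare = 83.479 + .117 * distance; `predict_airfare` is an illustrative name.

```python
# Intercept and slope of the least squares line from part (c).
a, b = 83.479, 0.117


def predict_airfare(distance):
    """Predicted airfare in dollars for a distance in miles."""
    return a + b * distance


# Predictions for consecutive distances, as in the table for part (j).
for d in (900, 901, 902, 903):
    print(d, round(predict_airfare(d), 2))

# The change in prediction is the slope times the change in distance.
one_mile = predict_airfare(901) - predict_airfare(900)
hundred_miles = predict_airfare(1000) - predict_airfare(900)
print(round(one_mile, 3), round(hundred_miles, 2))
```

Because the model is a straight line, the increase per mile is the same no matter which starting distance is used.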
Activity 10-3: Airfares (cont.)
(a) $150.87
(b) 178-150.87 = $27.13
(c) Atlanta: fitted value $150.87, residual $27.13; Boston:
residual $11.30
(d) St. Louis: distance 737 miles, airfare $98, error: overestimate
of $71.77
(e) greater
(f) below
(g) $111.08
(h) 4: Atlanta, Detroit, Pittsburgh, St. Louis
(i) Most cities have a smaller residual than their deviation from the
mean. This suggests that predictions from the regression line are
generally better than the airfare mean because least squares regression
takes the explanatory variable into account.
(j) sum of squared residuals: 14,308.09; sum of squared deviations
from overall mean: 38,882.92 (both in squared dollars)
(k) .632
(l) .632; this is the same as the proportion of variability in
the response variable that is explained by the regression model.
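The agreement between (k) and (l) can be verified numerically. The sketch below assumes the two sums of squares reported in (j) and the correlation r = .795 from Activity 10-2; the variable names are illustrative.

```python
# Sums of squares from part (j).
sse = 14308.09   # sum of squared residuals from the regression line
sst = 38882.92   # sum of squared deviations from the mean airfare

# Proportion of variability in airfare explained by the regression line.
r_squared = 1 - sse / sst

# Square of the correlation coefficient from Activity 10-2.
r = 0.795

print(round(r_squared, 3), round(r ** 2, 3))
```

Both quantities round to the same three-decimal value, which is the point of part (l).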
Activity 10-4: College Tuitions (cont.)
(a)-(b) public: tuition = -13,138 + 9.59 * founded; r² = .257
private: tuition = 84,719 - 37.1 * founded; r² = .255
The two values for r² are very similar.
(c) The line on the private college scatterplot appears to do a better
job of summarizing the relationship between tuition and founding year.
The points follow the linear relationship much more closely.
(d) public: $5,083; private: $14,229. Judging from the
scatterplot, the private school prediction seems more reasonable because
the points fall closer to the line in the area of 1900 on this scatterplot.
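The two predictions in (d) come from substituting a founding year of 1900 into the fitted equations from (a)-(b). A minimal Python sketch, with illustrative function names:

```python
# Fitted lines from parts (a)-(b); "founded" is the founding year.
def public_tuition(founded):
    """Predicted tuition in dollars for a public college."""
    return -13138 + 9.59 * founded


def private_tuition(founded):
    """Predicted tuition in dollars for a private college."""
    return 84719 - 37.1 * founded


year = 1900
print(round(public_tuition(year)), round(private_tuition(year)))
```

The opposite signs of the two slopes mean that predicted tuition rises with founding year for public colleges and falls for private ones.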