This site is designed solely for the use of Mr. Habib's MAT 120 classes. All rights reserved. No part of this site or its contents from MAT textbook (STATISTICS Informed Decisions Using Data by: Michael Sullivan, III) may be reproduced by any process without written permission by the author or publisher and/or Mr. Habib.
More on Scatter Diagrams, Least-Squares Regression Lines, and Residuals
Example 1: The time it takes for a planet to complete its orbit around the sun is called the planet’s sidereal year. In 1618, Johannes Kepler discovered that the sidereal year of a planet is related to the planet's distance from the sun. The following data show the distances of the planets from the sun and their sidereal years.
| Planet | Distance from Sun, x (millions of miles) | Sidereal Year, y |
|---|---|---|
| Mercury | 36 | 0.24 |
| Venus | 67 | 0.62 |
| Earth | 93 | 1.00 |
| Mars | 142 | 1.88 |
| Jupiter | 483 | 11.9 |
| Saturn | 887 | 29.5 |
| Uranus | 1,785 | 84.0 |
| Neptune | 2,797 | 165.0 |
| Pluto | 3,675 | 248.0 |
(a) Draw a scatter diagram of the data, treating distance from the sun as the predictor variable.
(b) Compute the least-squares regression line.
(c) Plot the residuals against the distance from the Sun.
(d) Do you think the least-squares regression line is a good model? Why?
Enter the X-values (distance from the sun) into L1 and the Y-values (sidereal year) into L2.
(a.) First make sure that there is nothing stored in the Y-registers. Press Y= and check the Y-registers. If any of them contain a function, move the cursor to that Y-register and press CLEAR.
Press 2nd, then Y= (STAT PLOT). Select 1:Plot1, turn Plot1 On, and press ENTER. For Type, select the scatter plot, which is the first selection, and press ENTER. Enter L1 for Xlist and L2 for Ylist. For Mark, highlight the first selection, the small square, and press ENTER. Press ZOOM and then 9 to select ZoomStat.
(Note: It is difficult to see all nine points on the graph. It looks as if there are only 6 data points. In fact, there are nine points, but the first three are so close together that they are indistinguishable from one another.)
(b.) Press STAT, highlight CALC, select 4:LinReg(ax+b), and press ENTER.
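For readers who want to check the calculator's answer another way, the same line can be computed in a few lines of Python. This is only a sketch: the helper name `least_squares_line` is made up for illustration, and it uses the standard formulas (slope = Sxy / Sxx, intercept = mean of y minus slope times mean of x), which should agree with LinReg(ax+b) up to rounding.

```python
# Distances (millions of miles) and sidereal years from the table in Example 1.
distance = [36, 67, 93, 142, 483, 887, 1785, 2797, 3675]
sidereal_year = [0.24, 0.62, 1.00, 1.88, 11.9, 29.5, 84.0, 165.0, 248.0]


def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a*x + b (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of cross-deviations and squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx          # slope
    b = y_bar - a * x_bar    # intercept
    return a, b


a, b = least_squares_line(distance, sidereal_year)
print(f"y = {a:.4f}x + ({b:.4f})")
```

The values of `a` and `b` correspond to the a and b shown on the calculator's LinReg screen.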
(c.) Press 2nd, then Y= (STAT PLOT). Select 1:Plot1, turn Plot1 On, and press ENTER. For Type, select the scatter plot, which is the first selection, and press ENTER. Enter L1 for Xlist. Move the cursor to Ylist, press 2nd, then STAT (LIST), and select 7:RESID. (RESID holds the residuals from the most recent regression, so complete part (b) first.) For Mark, highlight the first selection, the small square, and press ENTER. Press ZOOM and then 9 to select ZoomStat.
(d.) This graph of the residuals vs. the x-variable shows a U-shaped pattern, which indicates that the linear model is not appropriate.
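The U shape can also be seen in the numbers themselves. The sketch below refits the line with the standard least-squares formulas and lists each residual (observed y minus predicted y). The variable names are illustrative only and are not calculator commands.

```python
# Data from the table in Example 1.
planets = ["Mercury", "Venus", "Earth", "Mars", "Jupiter",
           "Saturn", "Uranus", "Neptune", "Pluto"]
distance = [36, 67, 93, 142, 483, 887, 1785, 2797, 3675]
sidereal_year = [0.24, 0.62, 1.00, 1.88, 11.9, 29.5, 84.0, 165.0, 248.0]

# Fit y = a*x + b by least squares.
n = len(distance)
x_bar = sum(distance) / n
y_bar = sum(sidereal_year) / n
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(distance, sidereal_year))
s_xx = sum((x - x_bar) ** 2 for x in distance)
a = s_xy / s_xx
b = y_bar - a * x_bar

# Residual = observed sidereal year minus the value predicted by the line.
residuals = [y - (a * x + b) for x, y in zip(distance, sidereal_year)]

for planet, x, e in zip(planets, distance, residuals):
    print(f"{planet:8s} x={x:5d}  residual={e:8.2f}")
```

In this sketch the residuals are positive for Mercury through Mars, negative for Jupiter through Neptune, and positive again for Pluto, which is the same up-down-up pattern the residual plot displays.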
*****************************************************************
Example 2: The following data represent the heights and weights of various professional baseball players.
| Player | Height (inches) | Weight (pounds) |
|---|---|---|
| Albert Belle | 74 | 225 |
| Alex Rodriguez | 75 | 210 |
| Derek Jeter | 75 | 195 |
| Greg Maddux | 72 | 185 |
| Randy Johnson | 82 | 230 |
| David Justice | 75 | 200 |
| Al Leiter | 75 | 220 |
| Barry Bonds | 74 | 210 |
| Mike Bordick | 71 | 175 |
| Ron Grant | 72 | 196 |
| Pete Harnisch | 72 | 228 |
| Randy Velarde | 72 | 200 |
| Ray Lankford | 71 | 200 |
| Jason Isringhausen | 75 | 210 |
(a) Draw a scatter diagram of the data, treating height as the predictor variable and weight as the response variable.
(b) Compute the least-squares regression line and the correlation coefficient.
(c) Remove the value corresponding to Randy Johnson and re-compute the least-squares regression line and the correlation coefficient. What effect does Randy Johnson have on the regression line and correlation coefficient?
(d) Do you think that Randy Johnson is an influential observation? Why?
Enter the X-values (heights) into L1 and Y-values (weights) into L2.
(a.) First make sure that there is nothing stored in the Y-registers. Press Y= and check the Y-registers. If any of them contain a function, move the cursor to that Y-register and press CLEAR.
Press 2nd, then Y= (STAT PLOT). Select 1:Plot1, turn Plot1 On, and press ENTER. For Type, select the scatter plot, which is the first selection, and press ENTER. Enter L1 for Xlist and L2 for Ylist. For Mark, highlight the first selection, the small square, and press ENTER. Press ZOOM and then 9 to select ZoomStat.
(b.) Press STAT, highlight CALC, select 4:LinReg(ax+b), and press ENTER.
(c.) Press STAT and select 1:Edit. Move the cursor so that it is flashing on Randy Johnson’s height of ‘82’ in L1 and press DEL. Move the cursor so that it is flashing on Randy Johnson’s weight of ‘230’ in L2 and press DEL. Press STAT, highlight CALC, select 4:LinReg(ax+b), and press ENTER.
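The before-and-after comparison in parts (b) and (c) can be sketched in Python as well. The helper `fit` below is a hypothetical name, not a calculator function. It returns the slope, intercept, and correlation coefficient from the standard formulas, once for all 14 players and once with Randy Johnson removed.

```python
from math import sqrt

# (player, height in inches, weight in pounds) from the table in Example 2.
players = [
    ("Albert Belle", 74, 225), ("Alex Rodriguez", 75, 210),
    ("Derek Jeter", 75, 195), ("Greg Maddux", 72, 185),
    ("Randy Johnson", 82, 230), ("David Justice", 75, 200),
    ("Al Leiter", 75, 220), ("Barry Bonds", 74, 210),
    ("Mike Bordick", 71, 175), ("Ron Grant", 72, 196),
    ("Pete Harnisch", 72, 228), ("Randy Velarde", 72, 200),
    ("Ray Lankford", 71, 200), ("Jason Isringhausen", 75, 210),
]


def fit(pairs):
    """Return (slope, intercept, r) for (x, y) pairs (hypothetical helper)."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    s_xx = sum((x - x_bar) ** 2 for x, _ in pairs)
    s_yy = sum((y - y_bar) ** 2 for _, y in pairs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r = s_xy / sqrt(s_xx * s_yy)
    return slope, intercept, r


with_johnson = fit([(h, w) for _, h, w in players])
without_johnson = fit([(h, w) for name, h, w in players
                       if name != "Randy Johnson"])

for label, (slope, intercept, r) in [("All 14 players", with_johnson),
                                     ("Without Johnson", without_johnson)]:
    print(f"{label}: slope={slope:.4f}, intercept={intercept:.4f}, r={r:.4f}")
```

In this sketch, dropping Randy Johnson makes the slope steeper (roughly 3.28 to 3.95) and the correlation coefficient smaller (roughly 0.57 to 0.43), which is the kind of shift to look for when deciding in part (d) whether he is an influential observation.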