Line of best fit concept
Regression line: A line that best represents the relationship between x (independent) and y (dependent) variables. Minimizes distance from all points.
| Component | Meaning |
|---|---|
| a | y-intercept (value when x=0) |
| b | slope (change in y per unit x) |
| x | independent variable (predictor) |
| y | dependent variable (response) |
Key idea: Regression finds the line that best fits the data pattern. Used for prediction.
Least squares method
What is minimized?: Least squares minimizes sum of squared vertical distances (residuals) from points to line.
Worked example
Points (1,2), (2,3), (3,5). Find regression line using least squares concept.
Concept
- Least squares finds line where sum of (observed y - predicted y)2 is smallest
- Formula for slope b: involves correlation r, SDx, SDy
- Formula for intercept a: involves means of x and y
- Calculator or software usually does this
Final answer
Slope measures how much y changes per unit x. Intercept is y-value at x=0.
Memorize terms 3x faster
Smart flashcards show you cards right before you forget them. Perfect for definitions and key concepts.
Using regression for prediction
Worked example
Regression line: y = 1.5 + 0.8x. Predict y when x=10.
Solution
- Substitute x=10 into equation
- y = 1.5 + 0.8(10)
- y = 1.5 + 8
- y = 9.5
Final answer
When x=10, predicted y=9.5.
Extrapolation warning: Predictions are only reliable within the range of data used. Predicting far outside this range is risky (extrapolation).
Assessing regression fit
Residual: Difference between observed y and predicted y: residual = observed - predicted.
Worked example
At x=2: observed y=4, predicted y=3.1. What is residual?
Solution
- Residual = 4 - 3.1 = 0.9
- Positive residual: actual point above line
- Negative residual: actual point below line
- Small residuals mean line fits well
Final answer
Residual=0.9. Point is 0.9 units above the regression line.
R-squared: R2 measures proportion of variation explained by regression. Closer to 1 means better fit.