Five steps: enter → choose → run → read → use: Every GDC regression question follows the same five-step workflow.
Practice this until it is automatic — in an exam you have no time to experiment.
GDC regression workflow (TI-84)
- STAT → EDIT → enter x-values in L1, y-values in L2.
- STAT → CALC → choose the correct regression type (LinReg, QuadReg, ExpReg, PwrReg, SinReg…).
- Confirm lists L1 and L2, then press ENTER.
- Read the model coefficients (a, b, c, r or R²).
- Store the equation in Y1 (VARS → Statistics → EQ → RegEQ, or type it in manually) so you can evaluate and graph it.
| TI-84 menu option | Casio equivalent | Model type |
|---|---|---|
| LinReg(ax+b) | Reg → Linear | y = ax + b (linear) |
| QuadReg | Reg → Quadratic | y = ax² + bx + c |
| CubicReg | Reg → Cubic | y = ax³ + bx² + cx + d |
| ExpReg | Reg → Exponential | y = a·b^x |
| PwrReg | Reg → Power | y = a·x^b |
| SinReg | Reg → Sinusoidal | y = a sin(bx + c) + d |
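To see what the ExpReg row means away from the calculator, here is a minimal sketch of fitting y = a·b^x. It assumes the fit is done by taking ln y and fitting a straight line, which is one common approach and may differ in detail from what your GDC does internally. The function names and the data are made up for illustration.

```python
import math

def linear_fit(xs, ys):
    """Least-squares gradient m and intercept k for y = m*x + k."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    m = sxy / sxx
    return m, y_bar - m * x_bar

def exp_fit(xs, ys):
    """Fit y = a * b**x by fitting a line to ln(y) against x (assumed method)."""
    m, k = linear_fit(xs, [math.log(y) for y in ys])
    return math.exp(k), math.exp(m)  # a = e^k, b = e^m

# Made-up illustration data (not from the text): starts at 3 and doubles each step.
xs = [0, 1, 2, 3, 4]
ys = [3, 6, 12, 24, 48]
a, b = exp_fit(xs, ys)
print(round(a, 3), round(b, 3))
```

Because the made-up data double exactly, the fit should recover a ≈ 3 and b ≈ 2, matching the form y = a·b^x in the table.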
IB-style question — run regression and predict [5 marks]
A shop records the price p ($) it charges for a phone case and the number of cases q sold per week for five different prices:
p: 6, 8, 10, 12, 14
q: 92, 79, 66, 53, 40
(a) Use your GDC to find the equation of the regression line of q on p, giving each coefficient to 3 significant figures.
(b) Use your equation to estimate the number of cases sold per week when the price is $11.
Step by step
- (a) Enter the prices in L1 and the sales in L2, then run LinReg(ax+b) on the GDC. The calculator returns the gradient a = −6.5 and intercept b = 131.
- Write the line in the form q = ap + b, rounding each coefficient to 3 s.f.: q = −6.50p + 131.
- (b) Substitute p = 11 into the regression equation: q = −6.50(11) + 131 = 59.5.
- Sales are whole cases, and p = 11 lies inside the data range (interpolation), so round 59.5 to about 60 cases.
Final answer
(a) q = −6.50p + 131. (b) About 60 cases per week.
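If you want to check the GDC output for this question by hand, the least-squares formulas can be sketched in a few lines. This is only a cross-check under the usual least-squares definitions; the function name `linear_regression` is made up for illustration, and in the exam you still use the calculator.

```python
def linear_regression(xs, ys):
    """Least-squares gradient a and intercept b for y = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b

prices = [6, 8, 10, 12, 14]   # p, from the question
sales = [92, 79, 66, 53, 40]  # q, from the question

a, b = linear_regression(prices, sales)
estimate = a * 11 + b          # part (b): predicted sales at p = 11

print(f"q = {a:.3g}p + {b:.3g}")  # q = -6.5p + 131
print(estimate)                    # 59.5
```

The value 59.5 is then rounded to about 60 whole cases, as in the final answer.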
Look at the scatter plot shape first: Before running any regression, plot the data (STAT PLOT on TI; StatGraph on Casio) and look at the shape.
The shape tells you which model to try.
| Scatter plot shape | Model to try | Key signal in question |
|---|---|---|
| Straight line | Linear (LinReg) | "constant rate", "per unit", r close to ±1 |
| Single peak or valley | Quadratic (QuadReg) | "maximum", "minimum", projectile |
| Increases ever more steeply, or decays towards zero | Exponential (ExpReg) | "percentage growth/decay", "doubles every..." |
| Curve through origin, increasing | Power (PwrReg) | "proportional to square/cube", "directly proportional to" |
| Repeating up-down pattern | Sinusoidal (SinReg) | "tide", "temperature cycle", "Ferris wheel" |
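Alongside the plot, a quick numerical check can back up the "constant rate" and "doubles every..." signals in the table. This sketch is a supplement, not part of the GDC workflow, and it assumes the x-values are equally spaced: roughly constant differences between successive y-values point to a linear model, and roughly constant ratios point to an exponential one. The helper names and the doubling data are made up for illustration.

```python
def first_differences(ys):
    """Differences between successive y-values (x assumed equally spaced)."""
    return [later - earlier for earlier, later in zip(ys, ys[1:])]

def ratios(ys):
    """Ratios between successive y-values (x assumed equally spaced)."""
    return [later / earlier for earlier, later in zip(ys, ys[1:])]

# Phone-case sales from the worked question (prices go up in equal steps of 2).
sales = [92, 79, 66, 53, 40]
print(first_differences(sales))  # [-13, -13, -13, -13] -> constant rate, try linear

# Made-up illustration data (not from the text): doubles each step.
doubling = [3, 6, 12, 24, 48]
print(ratios(doubling))          # [2.0, 2.0, 2.0, 2.0] -> constant ratio, try exponential
```

Real data will rarely give exactly constant values, so treat this as a hint to confirm with the scatter plot.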
The question often tells you the model type: IB questions usually say "The data can be modelled by y = ae^(bx)" or "use a quadratic regression".
If the type is given, just run that regression — no need to guess.
When not given, use the scatter plot and context clues.
Copy coefficients exactly — then write the equation: After running regression, the GDC displays coefficients.
Write the full equation immediately — do not rely on memory.
Round coefficients to 3 significant figures unless the question specifies otherwise.
Xavie collects apartment prices y (in millions of dollars) and their distances x (in km) from the city centre.
GDC LinReg gives: a = −0.0693, b = 3.10, r = −0.998.
Write the regression equation and use it to predict y when x = 15.
Step by step
- Write the linear regression equation in the form y = ax + b: y = −0.0693x + 3.10.
- Comment on r: r = −0.998 is very close to −1 → strong negative linear correlation. The linear model is appropriate.
- Predict y when x = 15 (within data range, so interpolation): y = −0.0693(15) + 3.10 = 2.0605 ≈ 2.06.
Final answer
y = −0.0693x + 3.10. Predicted price at 15 km from centre ≈ $2.06 million.
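The substitution can be checked with a couple of lines of arithmetic. This sketch uses the coefficients quoted in the question; the function name `predict` is made up for illustration.

```python
def predict(a, b, x):
    """Evaluate the regression line y = a*x + b at a given x."""
    return a * x + b

a = -0.0693  # gradient from the GDC output in the question
b = 3.10     # intercept from the GDC output in the question

y = predict(a, b, 15)
print(round(y, 2))  # 2.06 (millions of dollars)
```

Storing the equation in Y1 on the GDC and evaluating Y1(15) does the same job in the exam.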
r (Pearson) for linear; R² for all other models: The Pearson correlation coefficient r measures how well a LINEAR model fits.
For non-linear models, use R² (coefficient of determination).
R² close to 1 means the model explains the data well.
| Statistic | Range | Interpretation |
|---|---|---|
| r | −1 to +1 | Strong linear relationship. r = +1: perfect positive line. r = −1: perfect negative line. |
| R² | 0 to 1 | Proportion of variation explained by the model. R² = 0.97 → 97% of variation explained. |
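To make the two statistics concrete, here is a minimal sketch that computes them from their usual textbook definitions: r from the sums of squares, and R² as 1 minus (residual variation / total variation). These definitions are an assumption about what sits behind the GDC display; for some non-linear models a calculator may compute R² on transformed data. The function names are made up for illustration, and the data are the phone-case figures from the earlier question, which happen to lie exactly on a line.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_squared(ys, predicted):
    """R² = 1 - (residual sum of squares) / (total sum of squares)."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, predicted))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

prices = [6, 8, 10, 12, 14]
sales = [92, 79, 66, 53, 40]

r = pearson_r(prices, sales)
predicted = [-6.5 * p + 131 for p in prices]  # regression line from that question
r2 = r_squared(sales, predicted)
print(r, r2)
```

Because every point lies on the line q = −6.5p + 131, r comes out at −1 and R² at 1, the extreme values in the table. Real data sets give values strictly between the limits.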
What to write when commenting on r: The IB mark scheme expects you to (1) state the value of r, (2) describe the strength (strong / moderate / weak), and (3) state the direction (positive / negative).
Example: "r = −0.998 indicates a strong negative linear correlation between distance and apartment price."
IB-style question — comment on the correlation coefficient [3 marks]
For two sets of bivariate data, a GDC reports the following correlation coefficients for a linear regression.
Data set A: r = −0.96
Data set B: r = 0.38
(a) Describe the correlation in data set A.
(b) State, with a reason, whether a linear model is appropriate for data set B.
Step by step
- (a) When you comment on r, give three things: the strength, the direction, and that it is linear. Here r = −0.96 is very close to −1.
- So data set A shows a strong negative linear correlation — as one variable increases, the other tends to decrease, and the points lie close to a straight line.
- (b) For data set B, r = 0.38 is close to 0, which means the points are widely scattered about any straight line. A linear model is therefore a poor fit and not appropriate.
Final answer
(a) Strong negative linear correlation. (b) No — r = 0.38 is close to 0, so the data is weakly correlated and a linear model is not appropriate.
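The value, strength and direction pattern for commenting on r can be sketched as a small helper. The cut-offs used here (0.75 for strong, 0.5 for moderate) are assumptions chosen for illustration, not official IB boundaries, and the function name `describe_r` is made up. Always check the wording your teacher or mark scheme uses.

```python
def describe_r(r, strong=0.75, moderate=0.5):
    """Build a comment on r: value, strength, direction, and 'linear'.

    The strong/moderate cut-offs are illustrative assumptions only.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "r = 0 indicates no linear correlation"
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= moderate:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"r = {r} indicates a {strength} {direction} linear correlation"

print(describe_r(-0.96))  # data set A
print(describe_r(0.38))   # data set B
```

With these assumed cut-offs, data set A is described as strong negative and data set B as weak positive, in line with the answers above.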