Key Idea: Topic 4.4 is about understanding the relationship between two variables. A scatter diagram shows whether they move together. Pearson's correlation coefficient r measures the strength and direction of a linear relationship. Linear regression gives you the equation of the line of best fit so you can make predictions.
✅ Pearson's r — interpreting strength
✅ Linear regression: y = ax + b
Example: GDC output: LinReg on the data gives a = 3.14, b = 7.20, r = 0.92.
Line of best fit: y = 3.14x + 7.20.
Interpret: strong positive linear correlation. For each unit increase in x, the predicted value of y increases by 3.14.
Predict y when x = 10: y = 3.14(10) + 7.20 = 38.6.
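The substitution and the gradient interpretation in this example can be checked with a short Python sketch. The function name `predict` is purely illustrative; the values of a and b are the GDC outputs quoted above.

```python
def predict(a: float, b: float, x: float) -> float:
    """Return the regression estimate y = a*x + b."""
    return a * x + b


# GDC output from the example above.
a, b = 3.14, 7.20

# Prediction at x = 10.
y_hat = predict(a, b, 10)
print(round(y_hat, 1))  # 38.6

# Increasing x by one unit changes the predicted y by the gradient a.
step = predict(a, b, 11) - predict(a, b, 10)
print(round(step, 2))  # 3.14
```

The second printout is the gradient itself, which is why the gradient is read as "the change in predicted y per unit increase in x".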
Always state the value of r and describe its meaning in context. Just writing 'r = 0.92' is not enough; 'strong positive linear correlation between height and weight' earns the mark.
The regression line of y on x always passes through the mean point (x̄, ȳ). To check your equation, substitute x = x̄ and verify that you get y = ȳ.
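The mean-point property follows directly from the least-squares formulas for a and b. A brief sketch, where S_xy and S_xx are standard shorthand that these notes do not otherwise use:

```latex
S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}), \qquad
S_{xx} = \sum (x_i - \bar{x})^2

a = \frac{S_{xy}}{S_{xx}}, \qquad b = \bar{y} - a\bar{x}
\quad\Longrightarrow\quad
a\bar{x} + b = a\bar{x} + (\bar{y} - a\bar{x}) = \bar{y}
```

Because b is defined as ȳ − ax̄, substituting x = x̄ must return ȳ, up to rounding of a and b.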
Paper 2 (GDC allowed): Enter the data, run LinReg, and record a, b and r. Then use the equation to predict as required. Show the substitution step.
Paper 1: You may be given r and asked to describe the correlation, or given the equation and asked to interpret the gradient in context. Always link numbers to real-world meaning.
IB-style question [7 marks]
A café records the daily maximum temperature in °C (x) and the number of iced drinks sold (y) on 6 days.
x: 16, 19, 22, 25, 28, 31
y: 40, 58, 70, 88, 104, 118
(a) Find the Pearson product-moment correlation coefficient r.
(b) Write down the equation of the regression line of y on x.
(c) Use your equation to estimate the number of iced drinks sold when the temperature is 24 °C.
Step by step:
(a) Enter the temperatures in list L1 and the sales in list L2 on the GDC, then run LinReg(ax+b). The GDC reports r.
(b) The same calculation gives the gradient a and the intercept b. Write the equation with both coefficients to 3 s.f.
(c) Substitute x = 24, which is inside the data range 16–31, so the estimate is reliable.
Sales must be a whole number, so round to the nearest drink.
(a) r = 0.999 (very strong positive linear correlation).
(b) y = 5.20x − 42.5.
(c) y = 5.20(24) − 42.5 = 82.3 ≈ 82 drinks (24 °C is within the data range, so the estimate is reliable).
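If you want to confirm the GDC values without a calculator, the minimal Python sketch below recomputes r, a and b from the café data using the standard least-squares formulas. The variable names (`s_xx`, `s_xy`, and so on) are illustrative choices, and this is not the GDC's internal method.

```python
from math import sqrt

# Café data from the question: temperature (°C) and iced drinks sold.
x = [16, 19, 22, 25, 28, 31]
y = [40, 58, 70, 88, 104, 118]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

a = s_xy / s_xx               # gradient of the regression line of y on x
b = y_bar - a * x_bar         # y-intercept
r = s_xy / sqrt(s_xx * s_yy)  # Pearson's correlation coefficient

estimate = a * 24 + b         # predicted sales at 24 °C

print(f"r = {r:.3f}")                        # r = 0.999
print(f"a = {a:.2f}, b = {b:.1f}")           # a = 5.20, b = -42.5
print(f"estimate = {estimate:.0f} drinks")   # estimate = 82 drinks

# The line passes through the mean point (x_bar, y_bar).
print(abs((a * x_bar + b) - y_bar) < 1e-9)   # True
```

The unrounded intercept is about −42.53, so using the full-precision values gives 82.27 rather than 82.3; both round to 82 drinks.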