Variance and standard deviation
Variance: Var(X)=E(X²)-[E(X)]².
Standard deviation: SD(X)=√Var(X).
Both measure spread around the mean.
SD has the same units as X; variance is in squared units.
Worked example
RV X: values 1,2,3 with probs 0.2,0.5,0.3.
Find Var(X).
Solution
- E(X)=1(0.2)+2(0.5)+3(0.3)=2.1
- E(X²)=1²(0.2)+2²(0.5)+3²(0.3)=0.2+2+2.7=4.9
- Var(X)=E(X²)-[E(X)]²=4.9-2.1²=4.9-4.41=0.49
- SD(X)=√0.49=0.7
Final answer
Variance=0.49, SD=0.7.
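The arithmetic for this example can be checked with a short script. This is a minimal sketch, not a prescribed method: the helper name `discrete_variance` is made up for illustration, and `Fraction` is used only to avoid floating-point rounding in the probabilities.

```python
from fractions import Fraction
from math import sqrt

def discrete_variance(values, probs):
    """Return (E(X), E(X^2), Var(X)) for a discrete random variable."""
    if sum(probs) != 1:
        raise ValueError("probabilities must sum to 1")
    mean = sum(x * p for x, p in zip(values, probs))
    mean_sq = sum(x**2 * p for x, p in zip(values, probs))
    # Var(X) = E(X^2) - [E(X)]^2
    return mean, mean_sq, mean_sq - mean**2

values = [1, 2, 3]
probs = [Fraction("0.2"), Fraction("0.5"), Fraction("0.3")]

mean, mean_sq, var = discrete_variance(values, probs)
sd = sqrt(var)
print(float(mean), float(mean_sq), float(var), round(sd, 2))
```

For the distribution above this gives E(X)=2.1, E(X²)=4.9, Var(X)=0.49 and SD(X)=0.7.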
Variance properties
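Independence: if X and Y are independent, Var(X+Y)=Var(X)+Var(Y).
Var(X-Y)=Var(X)+Var(Y) as well: variances add even when the variables are subtracted.
Scaling: Var(aX+b)=a²Var(X) for constants a and b.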
Worked example
Var(X)=4, Var(Y)=9, independent.
Find Var(X+Y) and Var(2X).
Solution
- Var(X+Y)=4+9=13
- Var(2X)=2²(4)=16
- SD(X+Y)=√13≈3.6, SD(2X)=4
Final answer
Var(X+Y)=13, Var(2X)=16.
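These rules can be verified exactly on a small case. The sketch below assumes two hypothetical distributions, X taking ±2 and Y taking ±3 with probability 1/2 each, chosen only because they give Var(X)=4 and Var(Y)=9. The helpers `variance` and `combine` are illustrative names, and `combine` builds the joint distribution by multiplying probabilities, which is valid only under independence.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
# Hypothetical distributions chosen so that Var(X)=4 and Var(Y)=9.
X = {-2: half, 2: half}
Y = {-3: half, 3: half}

def variance(dist):
    """Var = E(V^2) - [E(V)]^2 for a {value: probability} mapping."""
    mean = sum(v * p for v, p in dist.items())
    return sum(v**2 * p for v, p in dist.items()) - mean**2

def combine(dx, dy, op):
    """Distribution of op(X, Y), assuming X and Y are independent."""
    out = {}
    for (x, px), (y, py) in product(dx.items(), dy.items()):
        key = op(x, y)
        out[key] = out.get(key, 0) + px * py
    return out

var_sum = variance(combine(X, Y, lambda x, y: x + y))
var_diff = variance(combine(X, Y, lambda x, y: x - y))
var_2x = variance({2 * x: p for x, p in X.items()})
print(var_sum, var_diff, var_2x)
```

Both the sum and the difference come out as 13, and doubling X gives 16, matching the rules for independent variables and for scaling.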
Interpreting spread
Larger SD means: More spread around mean.
Values more likely to be far from mean.
Less predictable outcome.
| SD value | Interpretation |
|---|---|
| Small (near 0) | Values cluster near mean |
| Large (relative to the scale of the data) | Values spread out, high variability |
Comparison example
Distribution A: SD=0.1.
Distribution B: SD=2.
Which is more predictable?
Answer
- A has SD=0.1 (tiny spread)
- B has SD=2 (large spread)
- A is highly predictable: values stay near mean
- B is unpredictable: values vary widely
Final answer
A is more predictable. A smaller SD means values cluster closer to the mean (assuming both distributions are measured on the same scale).
Variance for grouped data
From frequency table: Use class midpoints as x values.
Apply same variance formula.
Worked example
Classes 0 ≤ x < 10 freq 5, 10 ≤ x < 20 freq 8, 20 ≤ x < 30 freq 7.
Find variance.
Solution
- Midpoints: 5,15,25. Total n=20
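- E(X)=(5×5+15×8+25×7)/20=320/20=16
- E(X²)=(25×5+225×8+625×7)/20=6300/20=315
- Var(X)=E(X²)-[E(X)]²=315-16²=315-256=59
- SD(X)=√59≈7.68
Final answer
Variance≈59, SD≈7.68. These are estimates, because the midpoints stand in for the actual data values.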
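The midpoint method is easy to script. The following is a minimal sketch under the assumption that each class is given as a (lower bound, upper bound, frequency) triple; `grouped_variance` is an illustrative name, not a library function.

```python
def grouped_variance(classes):
    """Estimate (mean, variance) from (lower, upper, frequency) triples.

    Each class is represented by its midpoint, weighted by its frequency.
    """
    n = sum(f for _, _, f in classes)
    mids = [((lo + hi) / 2, f) for lo, hi, f in classes]
    mean = sum(m * f for m, f in mids) / n
    mean_sq = sum(m**2 * f for m, f in mids) / n
    return mean, mean_sq - mean**2

table = [(0, 10, 5), (10, 20, 8), (20, 30, 7)]
mean, var = grouped_variance(table)
print(mean, var)
```

For the table in this example the estimates are a mean of 16 and a variance of 59.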
IB-style question — standard deviation of grouped data [6 marks]
The waiting times, t minutes, of 40 customers at a service desk are grouped in the frequency table below.
| Time t (min) | 0 ≤ t < 20 | 20 ≤ t < 40 | 40 ≤ t < 60 | 60 ≤ t < 80 |
|---|---|---|---|---|
| Frequency | 4 | 10 | 14 | 12 |
(a) Write down the mid-interval value of each class.
(b) Use your GDC to find an estimate for the mean waiting time.
(c) Find the standard deviation of the waiting times.
Step by step
- (a) The mid-interval value is the midpoint of each class, e.g. (0+20)/2=10 for the first class.
- (b) Enter the midpoints in L1 and the frequencies in L2, then run 1-Var Stats. The mean is the reported x̄, which equals (10×4+30×10+50×14+70×12)/40=1880/40.
- (c) The standard deviation is the σx (population) value, which the GDC returns directly.
Final answer
(a) Midpoints 10, 30, 50, 70. (b) Mean ≈ 47 minutes. (c) Standard deviation σx ≈ 19.3 minutes (use the σx value, not Sx).
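The GDC output can be cross-checked without a calculator. This sketch assumes the standard correspondence between the GDC's σx and Python's `statistics.pstdev` (population SD) and between Sx and `statistics.stdev` (sample SD). The variable names are illustrative.

```python
import statistics

midpoints = [10, 30, 50, 70]
frequencies = [4, 10, 14, 12]

# Expand the table: each midpoint is repeated according to its frequency.
data = [m for m, f in zip(midpoints, frequencies) for _ in range(f)]

mean_t = statistics.mean(data)
sigma_x = statistics.pstdev(data)  # population SD, divides by n
s_x = statistics.stdev(data)       # sample SD, divides by n-1
print(mean_t, round(sigma_x, 1), round(s_x, 1))
```

The population value rounds to 19.3 and the sample value to about 19.5, which shows why reading the wrong line of the GDC output changes the answer.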