What is an outlier?
Big Idea: An outlier is a data value that is much larger or much smaller than the other values in the dataset; it is unusual or extreme.
| Example data | Outlier? | Why? |
|---|---|---|
| 85, 87, 89, 91, 93 | No | All close together |
| 85, 87, 89, 91, 150 | Yes | 150 much larger |
| 2, 3, 4, 5, 6 | No | All consecutive |
| 2, 3, 4, 5, −50 | Yes | −50 much smaller |
Causes of outliers
| Type | Description | Action |
|---|---|---|
| Legitimate | Real extreme values: world records, rare events | Keep |
| Error | Measurement mistakes, wrong units | Investigate and consider removing |

Key point: Never blindly remove outliers. Investigate first.
Worked example — investigate before removing
A class of 25 students records their daily screen-time hours: most values are between 2 and 6, but one student records 50 hours.
Is this an outlier, and what should you do?
Step by step
- Identify: 50 is far above the other values (2 to 6), so it IS an outlier numerically.
- Investigate the cause. A real day has only 24 hours, so 50 is impossible — this is a measurement / data-entry ERROR, not a true extreme value.
- Decide: remove the impossible value (or correct it if the true entry can be recovered). Do NOT silently drop true extreme values.
- Report: state in your write-up that the value 50 was excluded as an obvious data-entry error (>24 hours).
Final answer
Yes, 50 is an outlier. It is a data-entry error (impossible value) and should be removed or corrected, with a note in the report explaining the exclusion.
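The investigate-then-decide steps above can be sketched in Python. This is only an illustration: the readings are hypothetical (the example gives just the range 2 to 6 and the single value 50), and the function name `split_impossible` is made up for this sketch. The 24-hour limit is the plausibility check used in the worked example.

```python
# Hypothetical screen-time readings in hours; only the value 50 comes from the example.
readings = [3, 4.5, 2, 6, 5, 50, 3.5, 4]

MAX_HOURS_PER_DAY = 24  # a day cannot contain more hours than this


def split_impossible(values, upper_limit=MAX_HOURS_PER_DAY):
    """Separate physically possible values from impossible ones."""
    kept = [v for v in values if 0 <= v <= upper_limit]
    excluded = [v for v in values if not (0 <= v <= upper_limit)]
    return kept, excluded


kept, excluded = split_impossible(readings)
print("Kept:", kept)
print("Excluded as data-entry errors (> 24 hours):", excluded)
```

The excluded list is returned rather than discarded so that the exclusion can be stated in the report.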
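IQR method for identifying outliers
Rule: a value is an outlier if it is below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR, where IQR = Q3 − Q1.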
Worked example
Data: 12, 14, 15, 16, 18, 19, 20, 21, 25, 45. Identify any outliers.
Solution
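- Q1 = 15 and Q3 = 21 (the medians of the lower half 12, 14, 15, 16, 18 and the upper half 19, 20, 21, 25, 45), so IQR = 21 − 15 = 6
- Lower bound = 15 − 1.5(6) = 6
- Upper bound = 21 + 1.5(6) = 30
- 45 > 30, so 45 is an outlier; no value is below 6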
Final answer
Outlier: 45
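The IQR method can be checked with a short Python sketch. It assumes the median-of-halves convention for quartiles (Q1 and Q3 are the medians of the lower and upper halves of the sorted data); other conventions, such as interpolation, give slightly different fences, but for this data 45 lies well above the upper fence under the common ones. The function names are illustrative only.

```python
from statistics import median


def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data.

    When the number of values is odd, the middle value is left out of both halves.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    return median(data[:half]), median(data[len(data) - half:])


def iqr_outliers(values, k=1.5):
    """Return the outliers and the (lower, upper) fences for the k x IQR rule."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outliers = [v for v in values if v < lower or v > upper]
    return outliers, (lower, upper)


data = [12, 14, 15, 16, 18, 19, 20, 21, 25, 45]
outliers, fences = iqr_outliers(data)
print("Fences:", fences)      # (6.0, 30.0)
print("Outliers:", outliers)  # [45]
```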
IB-style question — largest non-outlier
A data set has Q1 = 20 and Q3 = 32.
Find the largest value that would NOT be classed as an outlier.
Step by step
- IQR = Q3 − Q1 = 32 − 20 = 12.
- Upper fence = Q3 + 1.5 × IQR = 32 + 1.5(12) = 50.
- Anything at or below the upper fence is not an outlier, so the largest non-outlier is 50.
Final answer
50 — any value above 50 is an outlier.
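As a quick check, the fence calculation can be written as a small Python helper. The names `fences` and `is_outlier` are hypothetical, chosen for this sketch; the rule itself is the 1.5 × IQR rule used in the question.

```python
def fences(q1, q3, k=1.5):
    """Return the (lower, upper) fences for the k x IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(x, q1, q3):
    """A value is an outlier only if it lies strictly outside the fences."""
    lower, upper = fences(q1, q3)
    return x < lower or x > upper


lower, upper = fences(20, 32)
print(upper)                     # 50.0
print(is_outlier(50, 20, 32))    # False: a value on the fence is not an outlier
print(is_outlier(50.1, 20, 32))  # True
```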
[Diagram: box plot]
How outliers affect statistics
| Statistic | Outlier effect |
|---|---|
| Mean | Highly affected |
| Median | Resistant |
| Range | Highly affected |
| IQR | Resistant |
| SD | Highly affected |
Strategy: If the data contain outliers, summarise them with the median and IQR. If there are no outliers, the mean and SD are appropriate.
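The pattern in the table can be seen numerically with a short Python sketch. It reuses the data set 12, 14, 15, 16, 18, 19, 20, 21, 25, 45 from the IQR example, with and without the value 45. Two assumptions are made here: quartiles are taken as medians of the lower and upper halves, and SD is the population standard deviation (`statistics.pstdev`); the sample SD shows the same pattern.

```python
from statistics import mean, median, pstdev


def summary(values):
    """Five summary statistics, with quartiles taken as medians of the halves."""
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[len(data) - half:])
    return {
        "mean": mean(data),
        "median": median(data),
        "range": data[-1] - data[0],
        "IQR": q3 - q1,
        "SD": pstdev(data),
    }


without_outlier = summary([12, 14, 15, 16, 18, 19, 20, 21, 25])
with_outlier = summary([12, 14, 15, 16, 18, 19, 20, 21, 25, 45])

for name in without_outlier:
    print(f"{name:>6}: {without_outlier[name]:6.2f} -> {with_outlier[name]:6.2f}")
```

For this data the mean, range and SD all change substantially when 45 is included, while the median moves by only 0.5 and the IQR stays at 6.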