Key Idea: Before you can analyse data, you need to understand what kind of data you have and how it was collected. Topic 4.1 covers the vocabulary of statistics: population vs sample, types of data, and sampling methods. Getting these right matters because the method of collection affects the validity of any conclusions you draw.
Types of data
Population and sampling
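Reliability: Random selection keeps bias low, and a larger sample reduces the effect of chance variation, so a sufficiently large random sample tends to produce reliable results. Non-random methods such as convenience sampling are faster but more prone to bias.
Outlier impact: A single extreme value (outlier) can distort the mean significantly, while the median is much less affected. Always identify outliers before drawing conclusions.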
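To see how much one extreme value can move the mean, here is a minimal Python sketch. The scores are invented purely for illustration and do not come from any exam question; the variable names are likewise assumptions.

```python
from statistics import mean, median

# Hypothetical test scores, made up for this illustration.
scores = [12, 14, 15, 15, 16, 18]

# The same data with one extreme value (a possible outlier) added.
with_outlier = scores + [95]

mean_before = mean(scores)
mean_after = mean(with_outlier)
median_before = median(scores)
median_after = median(with_outlier)

print("Mean without outlier:  ", mean_before)
print("Mean with outlier:     ", round(mean_after, 2))
print("Median without outlier:", median_before)
print("Median with outlier:   ", median_after)
```

In this made-up data set the mean rises sharply once the extreme value is included, while the median stays the same, which is why it helps to check for outliers before quoting a mean.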
Paper 1: Questions often ask you to identify the data type or explain why a sampling method is biased. Write a specific reason: 'convenience sampling means people who are easy to reach are over-represented' earns the mark; vague answers do not.
Paper 2: You may need to calculate the sample size for each stratum. Use: n_stratum = (stratum size / population size) × total sample size.
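The stratum formula can be checked with a short Python sketch. The function name stratum_sample_sizes and the school year-group figures are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def stratum_sample_sizes(stratum_sizes, total_sample):
    """Apply n_stratum = (stratum size / population size) * total sample size."""
    population = sum(stratum_sizes.values())
    # Multiply before dividing so that exact results stay exact.
    return {
        name: size * total_sample / population
        for name, size in stratum_sizes.items()
    }

# Hypothetical school: 400 students in three year groups, sample of 40.
school = {"Year 10": 120, "Year 11": 180, "Year 12": 100}
raw = stratum_sample_sizes(school, 40)

# In general the results are not whole numbers, so round each one
# and check that the rounded values still add up to the sample size.
rounded = {name: round(n) for name, n in raw.items()}

for name, n in rounded.items():
    print(name, n)
print("Total:", sum(rounded.values()))
```

With these figures each stratum works out to a whole number (12, 18 and 10). With less convenient numbers the rounded values may not sum to the intended total, so it is worth adding them up as a final check.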