The big idea: A mutation is a random change to the base sequence of DNA.
DNA is a code spelled out in four bases (A, T, C and G). If that spelling is changed — even by a single base — the gene now carries different information.
Mutations are random: they are not made on purpose and cannot be aimed at a useful result. They are also the original source of all new alleles (new versions of a gene), which is why they matter so much for variation and evolution.
- Mutation: A random change in the base sequence of DNA (the genetic material).
- Gene: A length of DNA that carries the code for a particular characteristic.
- Allele: A particular version of a gene. A new allele is produced when a mutation changes the base sequence of an existing gene.
- Base sequence: The order of the four bases (A, T, C, G) along a strand of DNA; this order is the genetic code.
- Mutagen: Anything that increases the rate of mutation, such as ultraviolet (UV) light, X-rays or certain chemicals.
Mutations make new alleles: Where do the different versions of a gene come from in the first place?
They come from mutation. A change to the base sequence of an existing gene creates a new allele — a slightly different version of that gene.
So mutation is the ultimate source of all the genetic variation in living things.
Most gene mutations you need at SL involve a change to a single base. There are three kinds, and they differ in what happens to the bases.
The key to telling them apart is whether a base is swapped, added or removed.
- Substitution: One base is swapped for a different base. The total number of bases stays the same.
- Insertion: One (or more) extra base is added into the sequence, increasing the total number of bases.
- Deletion: One (or more) base is removed from the sequence, decreasing the total number of bases.
- Frameshift: The shift in the reading frame caused by an insertion or a deletion, so that every codon after the change is read differently.
Why insertion and deletion hit harder: DNA is read in triplets (groups of three bases called codons).
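A substitution changes only the one base it touches, so only the codon containing that base can be affected.
An insertion or a deletion changes the total number of bases, so the cell reads every triplet after the change out of step. This is a frameshift, and it usually changes all of the codons downstream, which is a much bigger effect from a single base change. (The one exception is when the number of bases added or removed is a multiple of three: whole codons are gained or lost, but the reading frame stays in step.)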
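The difference is easy to see by splitting a sequence into triplets before and after each kind of change. The short Python sketch below is only an illustration: the 12-base sequence is made up, and the helper name `codons` is an assumption, not part of any syllabus or library.

```python
def codons(seq):
    """Split a base sequence into triplets, reading from the first base."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

# Made-up 12-base sequence, chosen so each codon is easy to spot.
original = "AAATTTCCCGGG"

# Each change is made at the fifth base (index 4 in Python).
substituted = original[:4] + "G" + original[5:]   # swap the T for a G
inserted = original[:4] + "G" + original[4:]      # add an extra G
deleted = original[:4] + original[5:]             # remove one T

for label, seq in [("original", original), ("substitution", substituted),
                   ("insertion", inserted), ("deletion", deleted)]:
    print(f"{label:<13}{' '.join(codons(seq))}")
```

The substitution line reads AAA TGT CCC GGG, so only the second codon differs from the original. The insertion (AAA TGT TCC CGG G) and the deletion (AAA TTC CCG GG) change every codon from the second one onwards, which is the frameshift.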
| Mutation type | What changes in the DNA | Effect on the reading frame |
|---|---|---|
| Substitution | One base is swapped for a different base (e.g. A becomes G) | No shift — only the one codon containing that base can change |
| Insertion | One (or more) extra base is added into the sequence | Shifts the reading frame from that point on (a frameshift) |
| Deletion | One (or more) base is removed from the sequence | Shifts the reading frame from that point on (a frameshift) |
Substitution
- One base is swapped for another
- Total number of bases stays the same
- Reading frame is not shifted
- Usually affects one codon only
Insertion / Deletion
- A base is added (insertion) or removed (deletion)
- Total number of bases changes
- Reading frame is shifted (a frameshift)
- Usually affects every codon downstream
Mutagens can cause these changes: Mutations happen by chance, but their rate goes up when DNA is exposed to a mutagen.
Ultraviolet (UV) light, X-rays and some chemicals can damage DNA and cause a substitution, an insertion or a deletion.
If asked to name a mutation type a mutagen could cause, any of substitution, insertion or deletion is a valid answer.
A memory hook: Sub = substitute (swap one for another). Insertion = put one in. Deletion = take one out.
Only insertion and deletion change the number of bases — and changing the number is what causes a frameshift.
How this is tested: A favourite Paper 1B task gives you a DNA base sequence before and after a change and asks you to classify the mutation — read the two sequences, then decide whether a base was swapped (substitution), added (insertion) or removed (deletion).
Paper 1A multiple-choice items often ask you to identify a true general feature of mutations (random; a change to the base sequence; the source of new alleles), or to name a mutation type a mutagen such as UV light can cause.
On Paper 1B, a 2-mark "Compare and contrast" question can ask for a similarity and a difference between two named types. Give one of each, clearly labelled.
IB-style question — classify the mutation from two base sequences
A short length of DNA normally reads T A C G G A C T T (read in triplets: TAC, GGA, CTT). After a mutation it reads T A C G T A C T T. Classify the type of mutation that has occurred, and justify your answer. [2]
How to score both marks
- Compare the two sequences base by base. The first triplet (T A C) and the last (C T T) are unchanged. In the middle triplet, the original G G A has become G T A: the second base of that triplet (the fifth base overall) has changed from G to T.
- Decide what happened to the bases. One base has been swapped for a different base, and the total number of bases is unchanged (still 9). No base was added or removed.
- Classify and justify. This is a substitution mutation, because one base (G) has been replaced by another base (T) while the number of bases stays the same. (Mark 1: identifies substitution. Mark 2: justifies — one base replaced by another / number of bases unchanged.)
Final answer
Substitution — one base (G) has been replaced by another base (T); the total number of bases is unchanged, so the reading frame is not shifted.
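The base-by-base comparison in the first step can also be written as a short Python sketch. The function name `differences` is hypothetical, and the sketch assumes both sequences have the same length, as they do in this question.

```python
def differences(before, after):
    """Return (position, old_base, new_base) for every mismatch.

    Positions count from 1, as you would when reading the sequence by eye.
    Spaces between bases are ignored.
    """
    a = before.replace(" ", "")
    b = after.replace(" ", "")
    if len(a) != len(b):
        raise ValueError("lengths differ, so this is not a simple substitution")
    return [(pos, x, y)
            for pos, (x, y) in enumerate(zip(a, b), start=1)
            if x != y]

print(differences("T A C G G A C T T", "T A C G T A C T T"))
# prints [(5, 'G', 'T')]
```

A single mismatch with the same number of bases matches the mark scheme: one base (G) replaced by another (T) at the fifth position.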
How to tell the types apart in data: Count the bases before and after.
Same number of bases, one letter different → substitution.
One more base → insertion. One fewer base → deletion — and in both of these, every triplet after the change is read differently (a frameshift).
| DNA base sequence | What happened | Mutation type |
|---|---|---|
| Original: T A C G G A C T T | the reference sequence (read in triplets) | (none) |
| T A C G T A C T T | the fifth base (G) was swapped for a T | Substitution |
| T A C G A G A C T T | an extra A was added — every triplet after it changes | Insertion (frameshift) |
| T A C G A C T T | a base (G) was removed — every triplet after it changes | Deletion (frameshift) |
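The counting rule can be sketched in Python and checked against the rows of the table. This is a minimal illustration that assumes a single mutation has occurred; the function name `classify` is hypothetical.

```python
def classify(before, after):
    """Classify a single mutation by counting bases before and after."""
    a = before.replace(" ", "")
    b = after.replace(" ", "")
    if len(b) > len(a):
        return "insertion"       # more bases than before
    if len(b) < len(a):
        return "deletion"        # fewer bases than before
    # Same number of bases: count how many letters differ.
    mismatches = sum(x != y for x, y in zip(a, b))
    if mismatches == 0:
        return "no mutation"
    if mismatches == 1:
        return "substitution"
    return "more than one base changed"

original = "T A C G G A C T T"
for mutated in ["T A C G T A C T T", "T A C G A G A C T T", "T A C G A C T T"]:
    print(mutated, "->", classify(original, mutated))
```

The three sequences are classified as substitution, insertion and deletion, in the same order as the table rows.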