The big idea: A cladogram (an evolutionary tree) used to be drawn from what organisms look like — their shape and structure.
Today, trees are built mainly from molecular evidence: we compare the base sequence of a gene (such as the gene for rRNA) or the amino-acid sequence of a protein (such as haemoglobin) in different species.
The key rule is simple: the more differences there are between two species' sequences, the longer ago they split from a common ancestor. Fewer differences mean a more recent split, so the species are more closely related.
- Molecular evidence: data from the sequences of molecules (the bases in DNA/RNA or the amino acids in proteins), used to work out how closely species are related.
- Base sequence: the order of the bases (A, T, C, G) along a gene; comparing the same gene in different species shows how much their DNA has changed.
- Amino-acid sequence: the order of amino acids in a protein; because the protein is built from the gene, this also reflects differences in the DNA.
- Mutation: a change in the base sequence of DNA, which may in turn change an amino acid in the protein. Mutations build up gradually over time, so they act as a record of how long two lines have been separate.
| Type of molecule | What is compared | Why it works |
|---|---|---|
| A gene's base sequence (e.g. the gene for rRNA) | The order of bases (A, T, C, G) in the same gene in different species | rRNA is found in every cell and changes very slowly, so it can be lined up even between very distantly related groups (bacteria vs animals) |
| A protein's amino-acid sequence (e.g. haemoglobin, cytochrome c) | The order of amino acids in the same protein in different species | The protein is the gene's product, so differences in amino acids reflect differences in the underlying DNA |
Why molecules beat appearances: Looks can mislead. Distantly related species can evolve to look alike (for example, a dolphin and a shark), and close relatives can look very different.
A sequence is just a string of letters, so it can be counted and compared objectively — even between organisms that share no obvious features at all.
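To make "counted and compared objectively" concrete, here is a minimal Python sketch that counts the positions at which two sequences differ. It assumes the sequences are already aligned and of equal length; the two 12-base strings and the function name `count_differences` are made up for illustration and are not real gene data.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count the positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


# Hypothetical 12-base stretches of the same gene in two species.
species_1 = "ATGCCGTTAGCA"
species_2 = "ATGCAGTTGGCA"

print(count_differences(species_1, species_2))  # prints 2
```

The same function works on amino-acid sequences written as one-letter codes, because it only compares characters position by position.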
Read the logic as a chain of cause and effect.
When two species share a common ancestor, they start with identical sequences. After they split, each line accumulates its own mutations independently. The longer they have been separate, the more mutations pile up — so the more differences you find when you compare their sequences today.
The cause-and-effect chain
- Two species split from a common ancestor — at that moment their sequences are the same.
- After the split, mutations occur independently in each separate line.
- Mutations accumulate over time — the more time passes, the more pile up.
- So more sequence differences today = the split happened longer ago = more distantly related.
- And fewer differences = a more recent split = more closely related.
| Number of sequence differences | How long ago they diverged | How closely related |
|---|---|---|
| FEW differences | Their lines split RECENTLY (little time for mutations to build up) | CLOSELY related |
| MANY differences | Their lines split LONG AGO (lots of time for mutations to build up) | DISTANTLY related |
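The chain above can be sketched as a toy simulation. This is a simplified illustration, not a realistic model: it assumes exactly one random base change per lineage per generation, and the ancestor sequence, the seed and the names `mutate` and `simulate_split` are all hypothetical.

```python
import random

BASES = "ATCG"


def mutate(seq: list, rng: random.Random) -> None:
    """Change one randomly chosen base to a different base, in place."""
    i = rng.randrange(len(seq))
    seq[i] = rng.choice([b for b in BASES if b != seq[i]])


def simulate_split(ancestor: str, generations: int, seed: int = 0) -> list:
    """Return the number of differences between two lineages after each generation."""
    rng = random.Random(seed)
    line_a, line_b = list(ancestor), list(ancestor)
    history = [0]  # at the moment of the split the sequences are identical
    for _ in range(generations):
        mutate(line_a, rng)  # each lineage mutates independently
        mutate(line_b, rng)
        history.append(sum(a != b for a, b in zip(line_a, line_b)))
    return history


# Hypothetical 200-base ancestral sequence.
history = simulate_split("ATGC" * 50, generations=40)
print(history[0], history[10], history[40])
```

The count starts at zero and tends to rise with the number of generations. It can occasionally stall or dip when a later mutation hits a site that has already changed, which is one reason real difference counts slightly underestimate the true number of mutations.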
The molecular clock: Many mutations are neutral, meaning they do not affect how well the organism survives or reproduces, so natural selection neither keeps nor removes them. These neutral changes build up at a roughly constant rate over millions of years.
That steady ticking is the molecular clock. Because the rate is roughly constant, the number of differences between two species can be used to estimate the time since they shared a common ancestor — more differences means more time.
It is like growth rings on a tree: count the changes, and you can read off how long the two lines have been separate.
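As a rough sketch of how the clock turns a count into a time: if each lineage changes at a rate r per site per year, the two lineages together drift apart at 2r, so the time since the split is approximately t = d / (2r), where d is the proportion of sites that differ. This simple relation is an assumption added here (it ignores repeated changes at the same site), and the rate and counts below are hypothetical values chosen only to make the arithmetic easy.

```python
def estimate_divergence_time(differences: int, sites: int,
                             rate_per_site_per_year: float) -> float:
    """Estimate years since two lineages split, assuming a constant rate.

    Both lineages accumulate changes, so they diverge at twice the
    per-lineage rate: t = d / (2 * r).
    """
    proportion_different = differences / sites
    return proportion_different / (2 * rate_per_site_per_year)


# Hypothetical values: 20 differences in 1000 sites, rate of 1e-9 per site per year.
years = estimate_divergence_time(differences=20, sites=1000,
                                 rate_per_site_per_year=1e-9)
print(f"{years:,.0f} years")
```

With these made-up numbers the estimate comes out at about 10 million years. Doubling the number of differences doubles the estimated time, which is the "more differences means more time" rule in numerical form.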
Worked example (ranking relatives): Suppose we compare the same protein (protein X) in three species with that of a test fish and count the amino-acid differences.
Species P = 3 differences, Species Q = 11, Species R = 27.
Fewest differences = most recent split, so P is the closest relative, then Q, and R is the most distant. The bigger the count, the further back the common ancestor.
| Species compared with the test fish | Amino-acid differences in protein X | Interpretation |
|---|---|---|
| Species P | 3 | Fewest differences → split most recently → closest relative |
| Species Q | 11 | Intermediate → split earlier than P |
| Species R | 27 | Most differences → split longest ago → most distant relative |
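The ranking in the table can be reproduced with a short Python sketch that sorts the species by their difference counts. The counts are the ones given in the table; the variable names are illustrative.

```python
# Amino-acid differences from the test fish, as listed in the table
# (entered in a scrambled order to show that the sort does the work).
differences = {"Species R": 27, "Species P": 3, "Species Q": 11}

# Fewest differences first = most recent split = closest relative.
ranking = sorted(differences, key=differences.get)

for rank, species in enumerate(ranking, start=1):
    print(rank, species, differences[species])
```

The first name printed is the closest relative of the test fish and the last is the most distant.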
Strengths — and one big caution: Strengths: molecular evidence is objective and quantitative (you count differences rather than judge by eye), and it works even when there are no fossils.
Caution: the clock does not tick at the same rate for every gene or every lineage — some genes mutate faster than others, and some lines speed up or slow down. So a molecular clock must be calibrated against the fossil record before it can give real dates.
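One way to picture calibration is as a two-step calculation: use a pair of species whose split has been dated from fossils to work out the rate, then apply that rate to a pair with no fossil date. The sketch below assumes the simple relation t = d / (2r) (d is the proportion of differing sites, r the rate per lineage) and that the same gene is compared in both pairs. All numbers and function names are hypothetical.

```python
def calibrate_rate(differences: int, sites: int, fossil_split_years: float) -> float:
    """Work out the per-lineage rate from a split dated by fossils."""
    return (differences / sites) / (2 * fossil_split_years)


def estimate_time(differences: int, sites: int, rate: float) -> float:
    """Apply a calibrated rate to a pair with no fossil date."""
    return (differences / sites) / (2 * rate)


# Hypothetical calibration pair: fossils date the split to 50 million years ago,
# and the gene differs at 40 of 1000 sites.
rate = calibrate_rate(differences=40, sites=1000, fossil_split_years=50_000_000)

# Hypothetical undated pair: the same gene differs at 16 of 1000 sites.
years = estimate_time(differences=16, sites=1000, rate=rate)
print(f"{years:,.0f} years")
```

With these made-up numbers the undated pair has 16/40 of the differences, so the estimate is 16/40 of 50 million years, about 20 million years. The estimate is only as good as the assumption that this gene changes at the same rate in both pairs, which is exactly the caution above.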
How this is tested: You may be asked to outline how molecular evidence is used to build cladograms, or to explain why more sequence differences mean an earlier divergence.
A common data task gives you a small table of sequence differences and asks you to deduce which species are most closely related (fewest differences) or to rank them.
Higher-mark answers bring in the molecular clock (neutral mutations at a roughly constant rate → estimate divergence time) and note that it must be calibrated against fossils.
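For the data task described above, the deduction can be sketched in Python: look for the pair with the fewest differences. The pairwise table below is hypothetical and the species labels A, B and C are placeholders.

```python
# Hypothetical pairwise table: number of sequence differences for each pair.
pairwise = {
    ("A", "B"): 4,
    ("A", "C"): 15,
    ("B", "C"): 17,
}

# The pair with the fewest differences shares the most recent common ancestor.
closest_pair = min(pairwise, key=pairwise.get)
print(closest_pair, pairwise[closest_pair])
```

In an exam answer, state the pair and quote the numbers from the table to justify it.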
IB-style question — why more differences mean a longer time apart
Explain how the number of differences between the DNA base sequences of two species can be used to estimate how long ago they diverged from a common ancestor. [4]
How to score all four marks
- Same start. When the two species split from a common ancestor, their base sequences were identical.
- Independent mutation. After the split, mutations accumulate independently in each line, so the sequences gradually become different.
- More differences = more time. The more differences there are between the sequences, the longer ago they diverged (more time for mutations to build up); fewer differences means a more recent split.
- Molecular clock. Because (neutral) mutations accumulate at a roughly constant rate, the number of differences can estimate the divergence time — but the clock must be calibrated against fossils, as rates vary. (Award 1 mark per distinct point, up to 4.)
Final answer
The species started with identical sequences from a common ancestor; after the split each line accumulated mutations independently, so the number of differences grows with time. More differences therefore mean an earlier divergence, and because mutations accumulate at a roughly constant rate (the molecular clock, calibrated against fossils) the count can estimate the time since divergence.
✓ Why this scores full marks: It gives the cause (identical start → independent mutations accumulate), states the rule (more differences = longer ago / fewer = more recent), and brings in the molecular clock with the calibration caution.
A common way to lose marks is to only state the rule ('more differences = less related') without explaining that mutations accumulate over time.