Base Rate Fallacy

Summary

The cognitive bias of ignoring base rate information (prior probability) when evaluating the likelihood of an event, focusing instead on specific evidence or stereotypical information.

Classic Examples

Steve the Librarian/Farmer

Given a description of Steve as "meek and tidy," most people say he's more likely to be a librarian — ignoring the 20:1 farmer-to-librarian ratio in the population.

Correct reasoning: Even if a librarian is 4× more likely to fit the description, the base rate means a person fitting it is only 16.7% likely to be a librarian.
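
The arithmetic behind that figure can be checked with the odds form of Bayes' rule. The following is a minimal sketch; the function name posterior_from_odds is an illustrative assumption, not something from the original example.

```python
from fractions import Fraction

def posterior_from_odds(prior_odds, likelihood_ratio):
    """Probability of hypothesis A, given prior odds A:B and a likelihood ratio.

    posterior odds = prior odds * likelihood ratio
    probability    = odds / (1 + odds)
    """
    posterior_odds = Fraction(prior_odds) * Fraction(likelihood_ratio)
    return posterior_odds / (1 + posterior_odds)

# Prior odds librarian:farmer are 1:20, and the description is assumed
# to be 4x more likely to fit a librarian than a farmer.
p_librarian = posterior_from_odds(Fraction(1, 20), 4)

print(p_librarian)                  # 1/6
print(f"{float(p_librarian):.1%}")  # 16.7%
```

Exact fractions are used here so the result reads directly as "1 in 6", which is the same as 4 out of 24.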

Breast Cancer Screening

  • Base rate: 1% of women have breast cancer
  • Test accuracy: 90% sensitivity (detection rate), 3% false positive rate
  • After a positive test: about 23% chance of cancer (not 90%)
  • About 77% of positive results are false positives

The base rate (1%) is crucial — a rare condition means most positive tests will be false positives even with an accurate test.
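
As a rough check, Bayes' rule can be applied directly to the screening figures listed above (1% prevalence, 90% detection, 3% false positives). This is a minimal sketch; the function and parameter names are illustrative assumptions.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)

ppv = positive_predictive_value(
    prevalence=0.01,           # base rate: 1% of women
    sensitivity=0.90,          # 90% of cancers are detected
    false_positive_rate=0.03,  # 3% of healthy women test positive
)

print(f"Chance of cancer after a positive test: {ppv:.1%}")  # 23.3%
print(f"Share of positives that are false:      {1 - ppv:.1%}")  # 76.7%
```

With these inputs the posterior comes out a little under a quarter, far from the 90% that intuition suggests.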

Why People Fall for It

  • Representativeness heuristic — People judge likelihood by how well something matches a stereotype
  • Specific information feels more relevant — A vivid description seems more diagnostic than dry statistics
  • Base rates feel abstract — "1 in 100" is harder to grasp than "Steve is shy"

How to Avoid It

  • Picture a representative sample: reframing the question as "Out of 100 people like this..." reportedly dropped error rates from 85% to 0% (Kahneman & Tversky)
  • Think in frequencies, not percentages — "4 out of 24" is more intuitive than "16.7%"
  • Always ask: "How common is this in the general population?"
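
One way to practice the frequency framing is to convert probabilities into head counts for an imagined population. The sketch below reuses the screening figures from the breast cancer example; the function name natural_frequencies and the population size of 1,000 are assumptions chosen for illustration.

```python
def natural_frequencies(population, prevalence, sensitivity, false_positive_rate):
    """Restate test statistics as whole-number counts in a sample population."""
    affected = round(population * prevalence)
    unaffected = population - affected
    true_positives = round(affected * sensitivity)
    false_positives = round(unaffected * false_positive_rate)
    return {
        "affected": affected,
        "unaffected": unaffected,
        "true_positives": true_positives,
        "false_positives": false_positives,
    }

counts = natural_frequencies(1000, 0.01, 0.90, 0.03)
total_positives = counts["true_positives"] + counts["false_positives"]

print(f"Out of 1000 women, {counts['affected']} have cancer "
      f"and {counts['true_positives']} of them test positive.")
print(f"Of the {counts['unaffected']} without cancer, "
      f"{counts['false_positives']} also test positive.")
print(f"So {counts['true_positives']} of {total_positives} positive results are real.")
```

Rounding to whole people shifts the result slightly from the exact percentage, but the counts make the role of the base rate visible at a glance.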

In AI and Machine Learning

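  • Model evaluation: A "99% accurate" disease detector (say, 99% sensitivity and 99% specificity) is close to useless on its own when prevalence is 0.01%: only about 1% of its positive results are true cases, and a model that always predicts "healthy" would score 99.99% accuracy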
  • Spam filtering: Even a good spam filter produces false positives for rare types of legitimate email
  • Security screening: When real threats are rare, false alarms vastly outnumber true detections, even at a low false positive rate
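
The model evaluation point can be illustrated by comparing accuracy with precision at a very low prevalence. This is a sketch under stated assumptions: "99% accurate" is read as 99% sensitivity and 99% specificity, the population of one million is arbitrary, and the function name confusion_counts is hypothetical.

```python
def confusion_counts(population, prevalence, sensitivity, specificity):
    """Expected confusion-matrix counts for a classifier applied to a population."""
    positives = population * prevalence
    negatives = population - positives
    tp = positives * sensitivity   # sick, flagged
    fn = positives - tp            # sick, missed
    tn = negatives * specificity   # healthy, cleared
    fp = negatives - tn            # healthy, flagged
    return tp, fp, tn, fn

# Assumption: 99% sensitivity and 99% specificity, 0.01% prevalence.
tp, fp, tn, fn = confusion_counts(1_000_000, 0.0001, 0.99, 0.99)
total = tp + fp + tn + fn

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
# A trivial model that always predicts "negative" is right for every healthy case.
baseline_accuracy = (tn + fp) / total

print(f"accuracy:          {accuracy:.2%}")
print(f"precision:         {precision:.2%}")
print(f"always-negative:   {baseline_accuracy:.2%}")
```

Under these assumptions the detector's accuracy looks strong while fewer than 1 in 100 of its alarms is a true case, and the trivial baseline beats it on accuracy. That is why precision and recall are usually reported alongside accuracy for rare classes.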

See Also