Bayesian Reasoning
Summary
A framework for quantifying and updating beliefs based on evidence, first articulated by Reverend Thomas Bayes in the 18th century. The core principle: new evidence should update prior beliefs, not determine them in a vacuum.
Bayes' Theorem
P(H|E) = P(H) × P(E|H) / P(E)
| Term | Name | Meaning |
|---|---|---|
| P(H) | Prior | Belief before seeing evidence |
| P(E\|H) | Likelihood | Probability of the evidence given that the hypothesis is true |
| P(E) | Total evidence | P(E\|H) × P(H) + P(E\|¬H) × P(¬H) |
| P(H\|E) | Posterior | Belief after seeing evidence |
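The terms in the table can be combined in a few lines of code. The following is a minimal sketch, not a reference implementation; the function name `posterior` and its parameter names are chosen here for illustration, and the numbers in the usage line are hypothetical.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(H|E) from the prior P(H) and the two likelihoods."""
    if not 0.0 <= prior <= 1.0:
        raise ValueError("prior must be a probability in [0, 1]")
    # Total evidence: P(E) = P(E|H) * P(H) + P(E|not H) * P(not H)
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    if p_e == 0.0:
        raise ValueError("the evidence has zero probability under both hypotheses")
    # Bayes' theorem: P(H|E) = P(H) * P(E|H) / P(E)
    return prior * p_e_given_h / p_e


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the call.
    print(round(posterior(prior=0.5, p_e_given_h=0.8, p_e_given_not_h=0.2), 3))
```

Note that the function needs P(E|¬H) as well as P(E|H): the likelihood under the hypothesis alone is not enough to compute the total evidence.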
The Key Mantra
"New evidence does not completely determine your beliefs in a vacuum. It should update prior beliefs."
The Geometric Interpretation
Think of all possibilities as a 1×1 square:
- Hypothesis occupies the left portion (width = prior)
- Evidence restricts the space, but not evenly
- Posterior = proportion of hypothesis in the restricted shape
- Irrelevant evidence doesn't change beliefs (equal likelihoods → no change)
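The square picture can be sketched as plain area arithmetic. This is an illustrative sketch under the assumption that the hypothesis strip has width equal to the prior and that the evidence keeps a fraction of each strip's height equal to the corresponding likelihood; the function names and the numbers are hypothetical.

```python
def surviving_areas(prior, p_e_given_h, p_e_given_not_h):
    """Areas of the unit square that remain once the evidence is observed."""
    # Left strip: width = prior, surviving height = P(E|H)
    h_area = prior * p_e_given_h
    # Right strip: width = 1 - prior, surviving height = P(E|not H)
    not_h_area = (1.0 - prior) * p_e_given_not_h
    return h_area, not_h_area


def posterior_from_areas(h_area, not_h_area):
    """Proportion of the surviving shape that belongs to the hypothesis."""
    return h_area / (h_area + not_h_area)


if __name__ == "__main__":
    # Evidence that restricts the two strips unevenly shifts the belief.
    print(posterior_from_areas(*surviving_areas(0.3, 0.8, 0.2)))
    # Equal likelihoods shrink both strips by the same factor: no change.
    print(posterior_from_areas(*surviving_areas(0.3, 0.5, 0.5)))
```

With equal likelihoods both strips are scaled by the same factor, so the ratio of areas, and therefore the belief, stays at the prior.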
Classic Examples
Steve the Librarian/Farmer
Even if a librarian is 4× as likely as a farmer to fit the "meek and tidy" description, the 20:1 farmer-to-librarian ratio means a person fitting the description has only a 16.7% (1 in 6) chance of being a librarian.
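The arithmetic behind the 16.7% figure is easiest to follow in odds form: prior odds times likelihood ratio gives posterior odds. The sketch below uses exact fractions; the helper names `update_odds` and `odds_to_probability` are introduced here for illustration.

```python
from fractions import Fraction


def update_odds(prior_odds: Fraction, likelihood_ratio: Fraction) -> Fraction:
    """Posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: Fraction) -> Fraction:
    """Convert odds in favour (a:b written as a/b) to a probability a/(a+b)."""
    return odds / (1 + odds)


if __name__ == "__main__":
    prior_odds = Fraction(1, 20)    # 1 librarian for every 20 farmers
    likelihood_ratio = Fraction(4)  # description is 4x as likely for a librarian
    posterior_odds = update_odds(prior_odds, likelihood_ratio)  # 4:20, i.e. 1:5
    probability = odds_to_probability(posterior_odds)           # 1/6
    print(posterior_odds, probability, f"{float(probability):.1%}")
```

The posterior odds of 1:5 correspond to a probability of 1/6, which is where the 16.7% comes from.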
Breast Cancer Screening
- Prior: 1% chance of cancer
- Test: 90% detection rate, 3% false positive rate
- After positive test: about 23% chance (not 90%)
- Roughly 3 out of 4 positive results are false positives
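The screening numbers can be checked by counting people in a representative sample. The sketch below assumes a hypothetical population of 10,000 and uses the prior, detection rate, and false positive rate listed above; the function name is illustrative. With these inputs the chance of cancer after a positive test works out to roughly 23%, close to the one-in-four shorthand.

```python
def screening_counts(population, prevalence, sensitivity, false_positive_rate):
    """Split a population into true-positive and false-positive counts."""
    with_cancer = population * prevalence
    without_cancer = population - with_cancer
    true_positives = with_cancer * sensitivity
    false_positives = without_cancer * false_positive_rate
    return true_positives, false_positives


if __name__ == "__main__":
    # Population size is an assumption made for illustration.
    tp, fp = screening_counts(
        population=10_000,
        prevalence=0.01,           # prior: 1% have cancer
        sensitivity=0.90,          # 90% detection rate
        false_positive_rate=0.03,  # 3% of healthy people test positive
    )
    chance_given_positive = tp / (tp + fp)
    print(f"{tp:.0f} true positives, {fp:.0f} false positives")
    print(f"P(cancer | positive) = {chance_given_positive:.1%}")
```

Out of 10,000 people, about 90 true positives are outnumbered by about 297 false positives, which is why a positive result is far less alarming than the 90% detection rate suggests.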
Enigma Code Cracking
Alan Turing used Bayesian ideas at Bletchley Park, updating his beliefs about Enigma machine settings as new patterns were found.
Bayesian vs. Frequentist Science
| Aspect | Traditional (Frequentist) | Bayesian |
|---|---|---|
| Starting point | Blank slate | Existing knowledge + judgments |
| Data analysis | Results speak for themselves | Each datum adds to existing knowledge |
| Objectivity | Completely objective | Acknowledges judgments underlying analysis |
| Clinical trials | Designed in ignorance | Use historical evidence to prioritize |
Making Probability Intuitive
- Representative samples work better than percentages ("40 out of 100" vs "40%")
- Area diagrams are more flexible and easier to sketch than drawings of individual samples
- Both sides of Bayes' theorem say the same thing: "look at cases where evidence is true, consider proportion where hypothesis is also true"
Applications
- Scientific discovery — validating/invalidating models
- Machine learning and AI — explicitly modeling beliefs
- Medical testing — interpreting screening results
- Spam filters — updating spam probability per feature
- Treasure hunting — Bayesian search found $700M gold ship
- "Bayesian ideas reflect what it means to be human" — we always have prior expectations and revise them as we learn
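As an illustration of the spam filter item in the list above, the sketch below updates a spam probability one feature at a time. It assumes the features are conditionally independent given the class (the naive Bayes simplification), and every feature name and likelihood value is hypothetical rather than taken from any real filter.

```python
import math

# Hypothetical likelihoods: feature -> (P(feature | spam), P(feature | not spam)).
FEATURE_LIKELIHOODS = {
    "contains 'free'": (0.60, 0.05),
    "has unsubscribe link": (0.70, 0.30),
    "sender in contacts": (0.02, 0.40),
}


def update_spam_probability(prior, observed_features, likelihoods=FEATURE_LIKELIHOODS):
    """Fold each observed feature into the spam belief, working in log-odds."""
    log_odds = math.log(prior / (1.0 - prior))
    for feature in observed_features:
        p_given_spam, p_given_ham = likelihoods[feature]
        # Each feature contributes the log of its likelihood ratio.
        log_odds += math.log(p_given_spam / p_given_ham)
    # Convert log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


if __name__ == "__main__":
    belief = 0.5  # assumed prior: half of incoming mail is spam
    for feature in ["contains 'free'", "sender in contacts"]:
        belief = update_spam_probability(belief, [feature])
        print(f"after {feature!r}: {belief:.3f}")
```

Working in log-odds turns each update into an addition, so the order in which features are observed does not affect the final belief under the independence assumption.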