LLM Wiki by Andrej Karpathy: Build a Compounding Knowledge Base (Tutorial)

Summary

A step-by-step tutorial from Data Science Dojo on building an LLM Wiki following Karpathy's pattern, with five foundational AI research papers as starting material.

Key Takeaways

  • An LLM wiki is a structured, AI-maintained knowledge base that grows smarter with every source added, unlike RAG, which rediscovers knowledge from scratch on every query
  • Karpathy introduced the pattern in a GitHub Gist in April 2026, which went viral among developers
  • The tutorial uses five papers: Attention Is All You Need (2017), BERT (2018), GPT-3 (2020), Foundation Models (2021), and RLHF (2022)
  • Core workflow: drop sources into raw/, run the compilation prompt, and let the LLM create entity pages cross-referenced with wiki-links, flagging contradictions as it compiles
  • At 100+ pages, the wiki can answer questions where the answer doesn't exist in any single source — the answer lives in the relationships between pages
  • Recommends Obsidian for its graph-view visualization and Obsidian Web Clipper for article ingestion
  • Karpathy's own wiki reached ~100 articles and 400,000 words while remaining navigable by the LLM
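The compile step in the workflow above can be sketched as follows. This is an illustrative scaffold, not Karpathy's actual tooling: the helper names, the `raw/*.md` glob, and the prompt wording are all assumptions, and the LLM call itself is left out.

```python
# Sketch of the compilation step (illustrative assumptions throughout):
# gather newly dropped sources from raw/, build a single compilation
# prompt, and extract the [[wiki-links]] from compiled pages so they can
# be cross-checked against the rest of the wiki.
import re
from pathlib import Path

# Matches [[Target]] and [[Target|display text]] style wiki-links.
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]+)?\]\]")

def load_sources(raw_dir: Path) -> list[str]:
    """Read every source file dropped into raw/ (assumed to be .md)."""
    return [p.read_text(encoding="utf-8") for p in sorted(raw_dir.glob("*.md"))]

def build_compilation_prompt(sources: list[str]) -> str:
    """Assemble one prompt asking the LLM to update entity pages."""
    joined = "\n\n---\n\n".join(sources)
    return (
        "Compile the sources below into one entity page per concept. "
        "Cross-reference related pages with [[wiki-links]] and flag any "
        "claims that contradict existing pages.\n\n" + joined
    )

def extract_links(page_text: str) -> set[str]:
    """Collect every [[wiki-link]] target on a compiled page."""
    return {m.group(1).strip() for m in WIKILINK.finditer(page_text)}
```

The prompt would be sent to whatever model maintains the wiki; `extract_links` then feeds the link graph that tools like Obsidian visualize.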

LLM Wiki vs RAG Comparison

| Dimension | RAG | LLM Wiki |
| --- | --- | --- |
| Knowledge persistence | None — stateless | Full — builds over time |
| Multi-document synthesis | Per query, from scratch | Pre-compiled into pages |
| Contradiction detection | No | Yes — flagged during compilation |
| Source traceability | High | Moderate (page-level) |
| Best for | Quick Q&A on documents | Deep, growing research topics |

Common Mistakes Warned Against

  • Putting too much in one page (each entity page should cover exactly one concept)
  • Never running linting (errors propagate fast)
  • Adding too many unrelated topics at once (wiki compounds best when sources are topically related)
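A lint pass of the kind the second bullet recommends might, at minimum, verify that every wiki-link resolves to an existing page, so broken references are caught before they propagate. This is a minimal sketch under that assumption; `lint_links` is a hypothetical helper, not part of any published tooling.

```python
# Minimal link linter (an assumption, not Karpathy's tooling): report every
# [[wiki-link]] whose target has no matching page in the wiki.
import re

WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|[^\]]+)?\]\]")

def lint_links(pages: dict[str, str]) -> list[tuple[str, str]]:
    """Return (page_title, missing_target) pairs for unresolved links.

    `pages` maps page titles to page bodies; matching is case-insensitive.
    """
    titles = {t.lower() for t in pages}
    problems = []
    for title, body in pages.items():
        for m in WIKILINK.finditer(body):
            target = m.group(1).strip()
            if target.lower() not in titles:
                problems.append((title, target))
    return problems
```

Running this after each compilation pass keeps the link graph consistent as the wiki compounds.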