# LLM Wiki by Andrej Karpathy: Build a Compounding Knowledge Base (Tutorial)

## Summary
A step-by-step tutorial from Data Science Dojo on building an LLM Wiki following Karpathy's pattern, with five foundational AI research papers as starting material.
## Key Takeaways
- An LLM wiki is a structured, AI-maintained knowledge base that grows smarter with every source added, unlike RAG, which rediscovers knowledge from scratch on every query
- Karpathy introduced the pattern in a GitHub Gist in April 2026, which went viral among developers
- The tutorial uses five papers: Attention Is All You Need (2017), BERT (2018), GPT-3 (2020), Foundation Models (2021), and RLHF (2022)
- Core workflow: drop sources into raw/, run the compilation prompt, and let the LLM create entity pages connected by wiki-links, flagging contradictions as it goes
- At 100+ pages, the wiki can answer questions whose answer doesn't exist in any single source; the answer lives in the relationships between pages
- Recommends Obsidian for graph-view visualization and Obsidian Web Clipper for article ingestion
- Karpathy's own wiki reached ~100 articles and 400,000 words while remaining navigable by the LLM
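The core workflow above can be sketched in code. This is a minimal, illustrative sketch, not Karpathy's actual tooling: the directory layout (raw/ for sources, a set of already-compiled names) and the function names `pending_sources` and `build_compilation_prompt` are assumptions for illustration, and the actual LLM call is left out.

```python
from pathlib import Path


def pending_sources(raw_dir: Path, compiled: set[str]) -> list[Path]:
    """Sources dropped into raw/ that have not been compiled into the wiki yet."""
    return [p for p in sorted(raw_dir.glob("*.md")) if p.stem not in compiled]


def build_compilation_prompt(source_text: str, existing_pages: list[str]) -> str:
    """Assemble a compilation prompt: the LLM should split the source into
    one-concept entity pages, [[wiki-link]] them to existing pages, and flag
    contradictions instead of silently overwriting prior claims."""
    page_list = "\n".join(f"- {name}" for name in sorted(existing_pages))
    return (
        "You maintain a wiki of entity pages, one concept per page.\n"
        "Existing pages:\n" + page_list + "\n\n"
        "Compile the source below into new or updated pages. Connect related "
        "concepts with [[wiki-links]]. If a claim contradicts an existing "
        "page, flag it with a CONTRADICTION note rather than merging it.\n\n"
        "SOURCE:\n" + source_text
    )
```

In a real loop you would send the prompt to your LLM of choice, write the returned pages into the wiki folder, and record the source as compiled so the next run only picks up new material.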
## LLM Wiki vs RAG Comparison
| Aspect | RAG | LLM Wiki |
|---|---|---|
| Knowledge persistence | None — stateless | Full — builds over time |
| Multi-document synthesis | Per query, from scratch | Pre-compiled into pages |
| Contradiction detection | No | Yes — flagged during compilation |
| Source traceability | High | Moderate (page-level) |
| Best for | Quick Q&A on documents | Deep, growing research topics |
## Common Mistakes Warned Against
- Putting too much in one page (each entity page should cover exactly one concept)
- Skipping the lint step entirely (errors propagate fast)
- Adding too many unrelated topics at once (wiki compounds best when sources are topically related)
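The lint step warned about above can be as simple as checking for broken wiki-links. The sketch below is an assumption about what such a linter might look like, not the tutorial's actual script: `lint_wiki`, the `pages/` folder of one .md file per entity, and the `[[Target]]` / `[[Target|alias]]` link syntax (Obsidian-style) are all illustrative choices.

```python
import re
from pathlib import Path

# Capture the target of [[Target]], [[Target|alias]], or [[Target#section]].
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)")


def lint_wiki(pages_dir: Path) -> dict[str, list[str]]:
    """Report broken wiki-links: links whose target page does not exist.

    Returns a mapping of page name -> list of missing link targets.
    """
    pages = {p.stem: p.read_text(encoding="utf-8") for p in pages_dir.glob("*.md")}
    broken: dict[str, list[str]] = {}
    for name, text in pages.items():
        missing = [t.strip() for t in WIKI_LINK.findall(text) if t.strip() not in pages]
        if missing:
            broken[name] = missing
    return broken
```

Run after every compilation pass; a non-empty report means the LLM linked to a page it never created, which is exactly the kind of error that compounds if left unchecked.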