# RAG vs LLM Wiki

## Summary
A comparison between Retrieval-Augmented Generation (RAG) and the LLM Wiki pattern, highlighting the fundamental architectural difference: stateless retrieval versus stateful, compounding knowledge.
## Core Difference
RAG retrieves from raw documents at query time. A retriever finds relevant chunks, the LLM synthesizes an answer from them, and the result is discarded. Nothing is learned from one query to the next; every query starts from zero.
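The query-time flow can be sketched in a few lines. This is a minimal sketch: the toy bag-of-words `embed` and the in-memory `index` are stand-ins for a real embedding model and vector database.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words stand-in for a real embedding model (an assumption;
    # production RAG would use a learned embedding and a vector DB).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

chunks = [
    "RAG retrieves chunks from raw documents at query time",
    "An LLM wiki compiles sources into interlinked markdown pages",
    "Vector databases store embeddings for similarity search",
]
index = [(c, embed(c)) for c in chunks]  # built once; queries never change it

def retrieve(query, k=2):
    # Every query starts from zero: rank all chunks by similarity, take top-k.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

print(retrieve("how does rag retrieve documents")[0])
```

Note that `index` is read-only: the retrieved chunks feed the answer, but nothing flows back into the store.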
LLM Wiki pre-compiles knowledge into structured, interlinked pages at ingest time. The LLM reads sources once, synthesizes them into wiki pages, and queries run against this compiled artifact. Knowledge accumulates and cross-references grow denser over time.
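By contrast, the ingest-time flow might look like this. A toy sketch under stated assumptions: `synthesize` is a placeholder for the actual LLM call, and the `[[wiki link]]` cross-reference syntax and slug scheme are illustrative choices, not a prescribed format.

```python
import pathlib
import re

WIKI = pathlib.Path("wiki")

def synthesize(page_title, source_text, existing):
    # Placeholder for an LLM call that merges new source material
    # into an existing page instead of starting from scratch.
    header = existing if existing else f"# {page_title}\n"
    return header + f"- {source_text}\n"

def slug(title):
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

def ingest(source_text, page_title, links=()):
    # The expensive synthesis step happens once, at ingest, not per query.
    WIKI.mkdir(exist_ok=True)
    page = WIKI / f"{slug(page_title)}.md"
    existing = page.read_text() if page.exists() else ""
    body = synthesize(page_title, source_text, existing)
    for other in links:
        body += f"\nSee also: [[{other}]]\n"  # cross-references accumulate
    page.write_text(body)
    return page

p = ingest("RAG is stateless at query time", "Retrieval-Augmented Generation",
           links=["LLM Wiki"])
print(p.read_text())
```

Each call to `ingest` rereads the page it targets, so later sources land on top of earlier synthesis rather than beside it.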
## Comparison Table
| Dimension | RAG (Semantic Search) | LLM Wiki |
|---|---|---|
| Discovery | Similarity search over vectors | Reads indexes, follows links |
| Understanding | Chunk similarity | Deep relationships via links |
| Knowledge persistence | None — stateless | Full — builds over time |
| Synthesis timing | Per query, from scratch | Pre-compiled at ingest |
| Multi-document answers | Retrieved chunks pieced together at query time | Pre-synthesized encyclopedia entries |
| Contradiction detection | No | Yes — flagged during compilation |
| Source traceability | High (chunk-level) | Moderate (page-level) |
| Infrastructure | Embedding model, vector DB, chunking pipeline | Just markdown files |
| Cost | Ongoing compute and storage | Basically free (tokens only) |
| Maintenance | Re-embed when things change | Lint, clean up, add articles |
| Setup complexity | Low | Low–Medium |
| Query speed | Consistent (retrieval cost each time) | Improves over time (pre-organized material) |
| Ingest cost | Low (chunk and embed) | High (routing + synthesis per page) |
| Long-term quality | Stays the same | Improves with each source |
| Scale limit | Millions of documents | Hundreds of pages (with good indexes) |
| Best for | Quick Q&A on documents, rapidly changing data, enterprise scale | Deep, growing research topics over weeks/months, personal scale |
One user reported turning 383 scattered files and 100+ meeting transcripts into a wiki, dropping token usage by 95% when querying with Claude.
## When to Use Each
Use RAG when:

- Data changes daily or frequently
- Exact source traceability matters for every claim
- You need quick answers without schema design
- Bulk document ingestion is the priority
Use LLM Wiki when:

- Building expertise on a topic over weeks or months
- You want the model to reason across your knowledge base
- You value synthesis and connection-making over retrieval
- You want knowledge to compound, not evaporate between sessions
## The Tradeoff
RAG sidesteps maintenance overhead by doing all synthesis at query time. It's cheaper to build but never gets smarter. The same query on day one and day one thousand produces the same quality answer.
LLM Wiki inverts this: ingest is expensive, schema design takes thought, maintenance requires periodic linting. But a well-maintained wiki becomes qualitatively different over time — dense with cross-references, drawing on synthesized knowledge from dozens of sources.
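That periodic linting can be as simple as a script over the markdown files. A minimal sketch, assuming the hypothetical `[[wiki link]]` syntax and slug scheme used for illustration here; it flags links pointing at pages that don't exist yet.

```python
import pathlib
import re

def lint_wiki(wiki_dir="wiki"):
    # One possible lint pass: find [[wiki links]] whose target page
    # hasn't been written yet, so dangling references can be filled in.
    root = pathlib.Path(wiki_dir)
    pages = {p.stem for p in root.glob("*.md")}
    dangling = []
    for page in root.glob("*.md"):
        for target in re.findall(r"\[\[([^\]]+)\]\]", page.read_text()):
            slug = re.sub(r"[^a-z0-9]+", "-", target.lower()).strip("-")
            if slug not in pages:
                dangling.append((page.name, target))
    return dangling

# Demo: a page linking to a not-yet-written article shows up as dangling.
pathlib.Path("wiki").mkdir(exist_ok=True)
(pathlib.Path("wiki") / "rag.md").write_text("See also: [[LLM Wiki]]\n")
print(lint_wiki())  # → [('rag.md', 'LLM Wiki')]
```

Dangling links are a feature as much as a bug: each one is a prompt for the next article, which is how the wiki grows denser over time.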