Three-Layer Architecture
Summary
The LLM Wiki pattern organizes knowledge into three cleanly separated layers, each with a distinct role and strict ownership model: Raw Sources (human-owned, immutable), The Wiki (LLM-maintained), and The Schema (human-defined governance).
Layer 1: Raw Sources (Immutable)
Directory: raw/ or sources/
This is the intake layer: research papers, YouTube transcripts, web articles, meeting notes, blog posts — any text the system should learn from. Sources are treated as immutable — the LLM reads them but never modifies them. They are the ground truth from which everything is derived.
Immutability is a deliberate design choice: you can always re-derive the wiki from scratch if needed. Sources serve as your audit trail.
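One way to make the audit-trail idea concrete is to fingerprint every file in the intake directory and compare fingerprints over time. The sketch below is illustrative only: `source_manifest` and `changed_sources` are hypothetical helper names, not part of any implementation described here, and the code only ever opens sources for reading.

```python
import hashlib
from pathlib import Path


def source_manifest(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir to the SHA-256 hex digest of its bytes.

    Hypothetical helper: files are read, never written.
    """
    manifest = {}
    for path in sorted(raw_dir.rglob("*")):
        if path.is_file():
            key = path.relative_to(raw_dir).as_posix()
            manifest[key] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def changed_sources(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return paths from `before` that were modified or removed in `after`."""
    return sorted(p for p, digest in before.items() if after.get(p) != digest)
```

Storing the manifest alongside the wiki would let a rebuild confirm that it is deriving from the same sources as before; newly added files are expected and are therefore not reported as changes.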
Layer 2: The Wiki (LLM-Maintained)
Directory: wiki/
A collection of *.md files, one per concept, topic, or entity, maintained entirely by the LLM. Humans read the wiki but don't write to it directly. Key components:
- Entity pages: *.md files with YAML frontmatter (title, tags, provenance, updated timestamp)
- Cross-references: [[slug]] notation linking pages, making the knowledge graph explicit in plain text
- index.md: Auto-updated catalog of all pages, summaries, and tags
- log.md: Append-only chronological record of every operation
- .meta/embeddings.json: Vector index enabling semantic search (optional)
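Because cross-references are plain [[slug]] text, the knowledge graph can be recovered with a regular expression. The following is a minimal sketch under assumptions: the function names, the example page body, and its frontmatter values are invented for illustration, and the pattern deliberately ignores alias or anchor syntax such as [[slug|label]].

```python
import re

# Matches [[slug]] where the slug contains no brackets, pipes, or '#'.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)\]\]")


def extract_links(markdown: str) -> list[str]:
    """Return the distinct slugs referenced as [[slug]], in first-seen order."""
    seen: dict[str, None] = {}
    for match in WIKILINK.finditer(markdown):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


def link_graph(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each page slug to the slugs its body links to."""
    return {slug: extract_links(body) for slug, body in pages.items()}


# Hypothetical entity page: frontmatter keys follow the list above,
# the values are made up.
page = """---
title: Raw Sources
tags: [architecture]
---
Raw sources feed [[the-wiki]], governed by [[the-schema]]. See [[the-wiki]] again.
"""

print(extract_links(page))  # ['the-wiki', 'the-schema']
```

The single-bracket `tags: [architecture]` line in the frontmatter is not picked up, since the pattern requires doubled brackets.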
Layer 3: The Schema (Governance)
File: wiki/.meta/schema.json or encoded in AGENTS.md
Defines the page universe: which concepts the wiki tracks, their slugs, titles, and one-line descriptions. It is the contract between human intent and LLM execution.
- When you want to track a new concept → add a PageSpec to the schema
- When you want to stop tracking something → remove it from the schema
- Apart from adding files to raw/, the schema is the only thing humans actively manage; everything downstream is automated
Separation of concerns: humans define what knowledge should exist; the LLM handles how that knowledge is organized and kept current.
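To illustrate how a schema of this kind could drive the wiki, here is a hedged sketch. The PageSpec fields mirror the slug, title, and one-line description mentioned above, but the exact JSON layout (a top-level "pages" list) and the helper names `load_schema`, `dump_schema`, and `plan_sync` are assumptions made for this example, not the documented format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PageSpec:
    slug: str
    title: str
    description: str  # one-line description of the concept


def load_schema(text: str) -> dict[str, PageSpec]:
    """Parse schema JSON (assumed shape: {"pages": [...]}) keyed by slug."""
    return {p["slug"]: PageSpec(**p) for p in json.loads(text)["pages"]}


def dump_schema(specs: dict[str, PageSpec]) -> str:
    """Serialise specs back to the same assumed JSON shape."""
    return json.dumps({"pages": [asdict(s) for s in specs.values()]}, indent=2)


def plan_sync(specs: dict[str, PageSpec], existing: set[str]) -> tuple[list[str], list[str]]:
    """Return (slugs to create, slugs no longer tracked by the schema)."""
    wanted = set(specs)
    return sorted(wanted - existing), sorted(existing - wanted)
```

Under this reading, adding or removing a PageSpec changes only the inputs to `plan_sync`; the LLM-side maintenance step would then act on the two resulting lists.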
Ownership Model
| Layer | Directory | Owner |
|---|---|---|
| Raw Sources | raw/ | Human (add files here) |
| Wiki | wiki/*.md + .meta/ | LLM (auto-maintained) |
| Schema | wiki/.meta/schema.json | Human (defines page universe) |
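The ownership table can be read as a write-permission rule. The sketch below encodes it as a small path check; `owner_of` and `may_write` are hypothetical names, the actor labels "human" and "llm" are assumptions, and the schema file is treated as the one human-owned exception inside the otherwise LLM-owned wiki/ tree.

```python
from pathlib import PurePosixPath
from typing import Optional

SCHEMA_PATH = PurePosixPath("wiki/.meta/schema.json")


def owner_of(path: str) -> Optional[str]:
    """Return "human" or "llm" for a repo-relative path, or None if outside the layers."""
    p = PurePosixPath(path)
    if p == SCHEMA_PATH:
        return "human"  # schema: human-defined governance
    if p.parts[:1] == ("raw",):
        return "human"  # raw sources: humans add files, the LLM only reads
    if p.parts[:1] == ("wiki",):
        return "llm"    # pages, index.md, log.md, other .meta/ files
    return None


def may_write(actor: str, path: str) -> bool:
    """True only when the actor owns the layer the path belongs to."""
    return owner_of(path) == actor
```

A guard like this could sit in front of any file-writing tool given to the LLM, so that a write to raw/ or to the schema is rejected before it happens.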
Implementation Notes
In the Python implementation by Plaban Nayak, each concern can be upgraded independently:
- Swapping embedding models → touches only embeddings.py
- Adding BM25 hybrid retrieval → touches only index.py
- Adding new source types → touches only ingest.py
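The kind of isolation described above is commonly achieved by having the rest of the code depend on a narrow interface instead of a concrete model. The following sketch shows that general idea only; it is not taken from the implementation named above, and `Embedder`, `CharCountEmbedder`, and `most_similar` are invented for illustration, with a toy letter-count vector standing in for a real embedding model.

```python
import math
from typing import Protocol


class Embedder(Protocol):
    """Anything with an embed(text) method returning a vector."""

    def embed(self, text: str) -> list[float]: ...


class CharCountEmbedder:
    """Toy stand-in for a real model: counts of letters a-z, L2-normalised."""

    def embed(self, text: str) -> list[float]:
        counts = [0.0] * 26
        for ch in text.lower():
            if "a" <= ch <= "z":
                counts[ord(ch) - ord("a")] += 1.0
        norm = math.sqrt(sum(c * c for c in counts)) or 1.0
        return [c / norm for c in counts]


def most_similar(query: str, pages: dict[str, str], embedder: Embedder) -> str:
    """Return the slug whose text has the highest cosine similarity to the query."""
    q = embedder.embed(query)

    def score(slug: str) -> float:
        # Vectors are unit length, so the dot product is the cosine similarity.
        return sum(a * b for a, b in zip(q, embedder.embed(pages[slug])))

    return max(pages, key=score)
```

Since `most_similar` sees only the `Embedder` protocol, replacing `CharCountEmbedder` with a class that wraps a real model would leave the search code unchanged.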