
Three-Layer Architecture

Summary

The LLM Wiki pattern organizes knowledge into three cleanly separated layers, each with a distinct role and strict ownership model: Raw Sources (human-owned, immutable), The Wiki (LLM-maintained), and The Schema (human-defined governance).

Layer 1: Raw Sources (Immutable)

Directory: raw/ or sources/

This is the intake layer: research papers, YouTube transcripts, web articles, meeting notes, blog posts — any text the system should learn from. Sources are treated as immutable — the LLM reads them but never modifies them. They are the ground truth from which everything is derived.

Immutability is a deliberate design choice: you can always re-derive the wiki from scratch if needed. Sources serve as your audit trail.
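One way to make the audit-trail property checkable is to record a content hash for every source and compare snapshots over time. The following is a minimal sketch, not part of the described implementation: the function names `snapshot` and `changed_sources` are hypothetical, and it assumes sources live as ordinary files under a single directory.

```python
import hashlib
from pathlib import Path


def snapshot(raw_dir: Path) -> dict[str, str]:
    """Map each file under raw_dir (by relative path) to its SHA-256 hex digest."""
    return {
        path.relative_to(raw_dir).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(raw_dir.rglob("*"))
        if path.is_file()
    }


def changed_sources(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Return sources that were modified or deleted between two snapshots.

    Newly added files are not reported: adding sources is allowed,
    changing or removing existing ones is what immutability forbids.
    """
    return sorted(name for name, digest in before.items() if after.get(name) != digest)
```

Taking a snapshot before and after an LLM run and asserting that `changed_sources` returns an empty list would confirm that the run only read from the intake layer.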

Layer 2: The Wiki (LLM-Maintained)

Directory: wiki/

A collection of *.md files, one per concept, topic, or entity, maintained entirely by the LLM. Humans read the wiki but don't write to it directly. Key components:

  • Entity pages: *.md files with YAML frontmatter (title, tags, provenance, updated timestamp)
  • Cross-references: [[slug]] notation linking pages, making the knowledge graph explicit in plain text
  • index.md: Auto-updated catalog of all pages, summaries, and tags
  • log.md: Append-only chronological record of every operation
  • .meta/embeddings.json: Vector index enabling semantic search (optional)
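To make the page format concrete, here is a minimal sketch of rendering an entity page and recovering its outgoing links. The helper names `render_page` and `extract_links`, the exact frontmatter layout, and the example values are assumptions for illustration. The frontmatter is written by hand to keep the sketch dependency-free; a real implementation would more likely use a YAML library to handle quoting correctly.

```python
import re
from datetime import date

# Matches [[slug]] cross-references; aliases or anchors are not handled here.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)\]\]")


def render_page(title: str, tags: list[str], provenance: list[str],
                updated: date, body: str) -> str:
    """Render an entity page: YAML frontmatter followed by the Markdown body."""
    frontmatter = [
        "---",
        f"title: {title}",
        f"tags: [{', '.join(tags)}]",
        f"provenance: [{', '.join(provenance)}]",
        f"updated: {updated.isoformat()}",
        "---",
    ]
    return "\n".join(frontmatter) + "\n\n" + body.strip() + "\n"


def extract_links(page_text: str) -> list[str]:
    """Return the unique [[slug]] targets in a page, in order of first appearance."""
    seen: dict[str, None] = {}
    for match in WIKILINK.finditer(page_text):
        seen.setdefault(match.group(1).strip(), None)
    return list(seen)


page = render_page(
    title="Three-Layer Architecture",
    tags=["architecture"],
    provenance=["raw/article.md"],  # hypothetical source path
    updated=date(2025, 1, 1),       # hypothetical timestamp
    body="Builds on [[raw-sources]] and [[schema]]; see [[raw-sources]] again.",
)
print(extract_links(page))  # ['raw-sources', 'schema']
```

Because links are plain text, the whole knowledge graph can be rebuilt by running `extract_links` over every page in wiki/.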

Layer 3: The Schema (Governance)

File: wiki/.meta/schema.json or encoded in AGENTS.md

Defines the page universe: which concepts the wiki tracks, their slugs, titles, and one-line descriptions. It is the contract between human intent and LLM execution.

  • When you want to track a new concept → add a PageSpec to the schema
  • When you want to stop tracking something → remove it from the schema
  • Apart from adding raw sources, the schema is the only thing humans actively manage; everything downstream is automated
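The add/remove workflow above can be sketched as a comparison between the schema and the pages that currently exist. This is an illustration under stated assumptions: the `PageSpec` fields follow the slug, title, and description named in the text, but the JSON shape (`{"pages": [...]}`) and the helper names `load_schema` and `plan_changes` are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PageSpec:
    slug: str
    title: str
    description: str  # one-line description of the concept


def load_schema(schema_json: str) -> dict[str, PageSpec]:
    """Parse schema JSON, assumed to look like {"pages": [{slug, title, description}, ...]}."""
    specs = [PageSpec(**entry) for entry in json.loads(schema_json)["pages"]]
    return {spec.slug: spec for spec in specs}


def plan_changes(schema: dict[str, PageSpec],
                 existing_slugs: set[str]) -> tuple[list[str], list[str]]:
    """Return (slugs to create, slugs to retire) so the wiki matches the schema."""
    to_create = sorted(set(schema) - existing_slugs)
    to_retire = sorted(existing_slugs - set(schema))
    return to_create, to_retire
```

Under this sketch, adding a PageSpec makes its slug appear in the first list on the next run, and removing one moves its slug to the second; the human never touches the pages themselves.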

Separation of concerns: humans define what knowledge should exist; the LLM handles how that knowledge is organized and kept current.

Ownership Model

| Layer | Directory | Owner |
| --- | --- | --- |
| Raw Sources | raw/ | Human (add files here) |
| Wiki | wiki/*.md + .meta/ | LLM (auto-maintained) |
| Schema | wiki/.meta/schema.json | Human (defines page universe) |
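The ownership rules could be enforced with a small path check before any write. The sketch below is an assumption about how such a guard might look, not something the source describes: `owner_of` and `llm_may_write` are hypothetical names, paths are taken to be repository-relative, and the AGENTS.md variant of the schema is not covered.

```python
from pathlib import PurePosixPath


def owner_of(path: str) -> str:
    """Return "human" or "llm" for a repository-relative path, per the ownership table."""
    parts = PurePosixPath(path).parts
    if not parts:
        raise ValueError("empty path")
    if parts[0] in ("raw", "sources"):
        return "human"  # Layer 1: immutable intake
    if parts[0] == "wiki":
        # The schema lives inside wiki/.meta/ but is human-owned.
        if parts[1:] == (".meta", "schema.json"):
            return "human"
        return "llm"  # Layer 2: pages, index.md, log.md, embeddings
    raise ValueError(f"path outside the three layers: {path}")


def llm_may_write(path: str) -> bool:
    """True only for paths the LLM is allowed to modify."""
    return owner_of(path) == "llm"
```

Note the one exception the table implies: schema.json sits under wiki/.meta/ yet belongs to the human, so the guard has to special-case it rather than assign ownership by top-level directory alone.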

Implementation Notes

In the Python implementation by Plaban Nayak, each concern can be upgraded independently:

  • Swapping embedding models → touches only embeddings.py
  • Adding BM25 hybrid retrieval → touches only index.py
  • Adding new source types → touches only ingest.py

See Also