
I Built Self-Evolving Claude Code Memory w/ Karpathy's LLM Knowledge Bases

Summary

Cole Medin adapts Karpathy's LLM Wiki pattern from external data (articles, papers) to internal data — giving Claude Code a memory that evolves with a codebase. Instead of ingesting web content, the system captures conversation logs and extracts structured knowledge articles from them.

Key Insights

Compiler Analogy Explained

Cole maps the LLM Wiki to software compilation:

| Compilation stage | LLM Wiki equivalent |
| --- | --- |
| Source code | Raw articles, papers, transcripts dropped into `raw/` |
| Compiler | LLM that processes the raw material: writes summaries, links documents, structures knowledge |
| Executable | The wiki itself: compiled articles with backlinks, what we query |
| Test suite | Linting: finding gaps, stale data, and broken links; ensuring data integrity |
| Runtime | Running queries: agents searching the wiki for information |

Karpathy's Token Shift

"I'm spending more of my tokens with my agents manipulating knowledge like Markdown and Obsidian instead of manipulating code."

Karpathy works with knowledge the same way we work with code — the compiler analogy makes this explicit.

No Vector Database Needed

Karpathy noted: "I thought I had to reach for fancy RAG, but the LLM has been pretty good about automaintaining index files." The agent navigates files starting from the index — no semantic search or vector database required at small scale.
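The "index instead of RAG" idea is simple enough to sketch in a few lines: the agent opens `index.md`, follows plain Markdown links, and never touches an embedding. A minimal sketch, assuming a hypothetical `knowledge/` wiki folder and standard `[Title](page.md)` links:

```python
import re
from pathlib import Path

WIKI = Path("knowledge")  # hypothetical wiki root

def follow_index(query_terms: list[str]) -> list[Path]:
    """Navigate from index.md to candidate articles by plain link-following,
    with no semantic search or vector store involved."""
    index = (WIKI / "index.md").read_text(encoding="utf-8")
    # Markdown links look like [Title](page.md); keep those whose
    # title mentions any query term.
    links = re.findall(r"\[([^\]]+)\]\(([^)]+\.md)\)", index)
    return [WIKI / target for title, target in links
            if any(t.lower() in title.lower() for t in query_terms)]
```

In a real agent the LLM reads the index and decides which links to follow; this illustrates why a well-maintained index file is enough at small scale.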

Cole's Adaptation: Internal Data for Claude Code Memory

Instead of external data ingestion, Cole built a system that:

  1. Automatically captures session logs using Claude Code hooks (session start, pre-compact, session end)
  2. Summarizes every conversation via the Claude Agent SDK running behind the scenes
  3. Stores the summaries in daily log files (the equivalent of the raw/ folder)
  4. Flushes the logs once a day: extracts concepts and connections, then populates the wiki
  5. Searches primarily in the wiki (knowledge/), but can also look through the daily logs
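Step 3 of this pipeline is just appending to a date-stamped file. A minimal sketch, assuming a hypothetical `raw/` daily-log folder; in the real system the summary string would come from the Claude Agent SDK call:

```python
from datetime import date
from pathlib import Path

RAW = Path("raw")  # daily-log folder, analogous to the raw/ source stage

def append_daily_log(summary: str) -> Path:
    """Append one conversation summary to today's log file."""
    RAW.mkdir(exist_ok=True)
    log = RAW / f"{date.today().isoformat()}.md"
    with log.open("a", encoding="utf-8") as f:
        f.write(f"\n## Session summary\n\n{summary}\n")
    return log
```

Keeping one file per day makes the once-a-day flush trivial: the flusher reads yesterday's file, extracts concepts, and writes wiki articles.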

Claude Code Hooks Architecture

| Hook | When it fires | What it does |
| --- | --- | --- |
| Session start | New Claude Code session | Loads AGENTS.md and index.md so the agent understands the system |
| Pre-compact | Before context compaction | Sends the latest messages to an LLM for summarization; writes the result to the daily log |
| Session end | Session closes | Same as pre-compact: captures final takeaways, lessons, and action items |
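Wiring these three hooks up is a matter of registering commands in Claude Code's settings. A rough sketch of what this can look like in `.claude/settings.json`; the script paths are hypothetical stand-ins for Cole's actual capture scripts:

```json
{
  "hooks": {
    "SessionStart": [
      { "hooks": [ { "type": "command", "command": "python .claude/hooks/load_context.py" } ] }
    ],
    "PreCompact": [
      { "hooks": [ { "type": "command", "command": "python .claude/hooks/summarize_to_daily_log.py" } ] }
    ],
    "SessionEnd": [
      { "hooks": [ { "type": "command", "command": "python .claude/hooks/summarize_to_daily_log.py" } ] }
    ]
  }
}
```

Note that pre-compact and session end can share one summarization script, since both do the same job of capturing the conversation before it disappears.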

Hot Cache

Cole introduced a hot.md file, a ~500-character cache of the most recent conversation. It's useful for executive-assistant-style agents that need quick context without crawling full wiki pages, and isn't needed for every wiki type (e.g., the YouTube transcript project doesn't use it).
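The hot cache is just an overwrite-on-update file kept under a size budget. A minimal sketch, assuming the `hot.md` name from the video and a hypothetical 500-character limit constant:

```python
from pathlib import Path

HOT = Path("hot.md")   # hot-cache file, name from the video
HOT_LIMIT = 500        # ~500 characters, per the video

def update_hot_cache(latest_summary: str) -> None:
    """Overwrite hot.md with the newest conversation summary, truncated
    to the cache limit so an agent can read it in one quick pass."""
    HOT.write_text(latest_summary[:HOT_LIMIT], encoding="utf-8")
```

Overwriting rather than appending is the point: the hot cache always holds only the latest context, while history lives in the daily logs and wiki.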

Compounding Knowledge Loop

  1. Ask a question → the agent searches across wiki articles
  2. The agent synthesizes an answer by connecting information from multiple pages
  3. The answer is filed back into the wiki, linking information across conversations
  4. The wiki grows over time from both past conversations and future sessions
  5. The agent gives better answers over time, with no manual maintenance required
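The file-back step (step 3) is what makes the loop compound: each synthesized answer becomes a new wiki article linked to its sources. A minimal sketch, assuming a hypothetical `knowledge/` folder and Obsidian-style `[[...]]` backlinks:

```python
import re
from datetime import date
from pathlib import Path

KNOWLEDGE = Path("knowledge")  # compiled-wiki folder, name assumed from the video

def file_back(question: str, answer: str, sources: list[str]) -> Path:
    """Write a synthesized answer back into the wiki so the next
    session can build on it instead of re-deriving it."""
    KNOWLEDGE.mkdir(exist_ok=True)
    # Turn the question into a filesystem-safe slug for the article name.
    slug = re.sub(r"[^a-z0-9]+", "-", question.lower()).strip("-")[:40]
    article = KNOWLEDGE / f"{slug}.md"
    backlinks = "\n".join(f"- [[{s}]]" for s in sources)  # Obsidian-style links
    article.write_text(
        f"# {question}\n\n{answer}\n\n_Sources ({date.today().isoformat()}):_\n\n{backlinks}\n",
        encoding="utf-8",
    )
    return article
```

The backlinks are what let later queries hop from a filed answer to the original articles, which is how answers "connect information between conversations."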

Customization Advantage

Unlike Claude Code's built-in memory system, this system is fully customizable:

  - Prompts for compilation and flushing can be edited
  - Claude Code can walk you through making customizations because it has access to AGENTS.md
  - It's self-contained and can improve itself