
I Built Self-Evolving Claude Code Memory w/ Karpathy's LLM Knowledge Bases

Summary

Cole Medin adapts Karpathy's LLM Wiki pattern from external data (articles, papers) to internal data — giving Claude Code a memory that evolves with a codebase. Instead of ingesting web content, the system captures conversation logs and extracts structured knowledge articles from them.

Key Insights

Compiler Analogy Explained

Cole maps the LLM Wiki to software compilation:

| Compilation stage | LLM Wiki equivalent |
| --- | --- |
| Source code | Raw articles, papers, transcripts dropped into `raw/` |
| Compiler | LLM that processes the raw material: writes summaries, links documents, structures knowledge |
| Executable | The wiki itself: compiled articles with backlinks, what we query |
| Test suite | Linting: finding gaps, stale data, and broken links; ensuring data integrity |
| Runtime | Running queries: agents searching the wiki for information |

Karpathy's Token Shift

"I'm spending more of my tokens with my agents manipulating knowledge like Markdown and Obsidian instead of manipulating code."

Karpathy works with knowledge the same way we work with code — the compiler analogy makes this explicit.

No Vector Database Needed

Karpathy noted: "I thought I had to reach for fancy RAG, but the LLM has been pretty good about automaintaining index files." The agent navigates files starting from the index — no semantic search or vector database required at small scale.
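The "index instead of RAG" idea is simple enough to sketch in a few lines: the agent opens `index.md`, follows plain Markdown links, and never touches an embedding. A minimal sketch, assuming a hypothetical `knowledge/` wiki folder and standard `[Title](page.md)` links:

```python
import re
from pathlib import Path

WIKI = Path("knowledge")  # hypothetical wiki root

def follow_index(query_terms: list[str]) -> list[Path]:
    """Navigate from index.md to candidate articles by plain link-following,
    with no semantic search or vector store involved."""
    index = (WIKI / "index.md").read_text(encoding="utf-8")
    # Markdown links look like [Title](page.md); keep those whose
    # title mentions any query term.
    links = re.findall(r"\[([^\]]+)\]\(([^)]+\.md)\)", index)
    return [WIKI / target for title, target in links
            if any(t.lower() in title.lower() for t in query_terms)]
```

In a real agent the LLM reads the index and decides which links to follow; this illustrates why a well-maintained index file is enough at small scale.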

Cole's Adaptation: Internal Data for Claude Code Memory

Instead of external data ingestion, Cole built a system that:

  1. Automatically captures session logs using Claude Code hooks (session start, pre-compact, session end)
  2. Summarizes every conversation via the Claude Agent SDK running behind the scenes
  3. Stores the summaries in daily log files (the equivalent of the raw/ folder)
  4. Flushes the logs once a day: extracts concepts and connections, then populates the wiki
  5. Searches primarily in the wiki (knowledge/), but can also look through the daily logs
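Step 3 of this pipeline is just appending to a date-stamped file. A minimal sketch, assuming a hypothetical `raw/` daily-log folder; in the real system the summary string would come from the Claude Agent SDK call:

```python
from datetime import date
from pathlib import Path

RAW = Path("raw")  # daily-log folder, analogous to the raw/ source stage

def append_daily_log(summary: str) -> Path:
    """Append one conversation summary to today's log file."""
    RAW.mkdir(exist_ok=True)
    log = RAW / f"{date.today().isoformat()}.md"
    with log.open("a", encoding="utf-8") as f:
        f.write(f"\n## Session summary\n\n{summary}\n")
    return log
```

Keeping one file per day makes the once-a-day flush trivial: the flusher reads yesterday's file, extracts concepts, and writes wiki articles.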

Claude Code Hooks Architecture

| Hook | When it fires | What it does |
| --- | --- | --- |
| Session start | New Claude Code session | Loads AGENTS.md and index.md so the agent understands the system |
| Pre-compact | Before context compaction | Sends the latest messages to an LLM for summarization; writes the result to the daily log |
| Session end | Session closes | Same as pre-compact: captures final takeaways, lessons, and action items |
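Wiring these three hooks up is a matter of registering commands in Claude Code's settings. A rough sketch of what this can look like in `.claude/settings.json`; the script paths are hypothetical stand-ins for Cole's actual capture scripts:

```json
{
  "hooks": {
    "SessionStart": [
      { "hooks": [ { "type": "command", "command": "python .claude/hooks/load_context.py" } ] }
    ],
    "PreCompact": [
      { "hooks": [ { "type": "command", "command": "python .claude/hooks/summarize_to_daily_log.py" } ] }
    ],
    "SessionEnd": [
      { "hooks": [ { "type": "command", "command": "python .claude/hooks/summarize_to_daily_log.py" } ] }
    ]
  }
}
```

Note that pre-compact and session end can share one summarization script, since both do the same job of capturing the conversation before it disappears.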

Hot Cache

Cole introduced a hot.md file, a ~500-character cache of the most recent conversation. It's useful for executive-assistant-style agents that need quick context without crawling full wiki pages, and isn't needed for every wiki type (e.g., the YouTube transcript project doesn't use it).
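The hot cache is just an overwrite-on-update file kept under a size budget. A minimal sketch, assuming the `hot.md` name from the video and a hypothetical 500-character limit constant:

```python
from pathlib import Path

HOT = Path("hot.md")   # hot-cache file, name from the video
HOT_LIMIT = 500        # ~500 characters, per the video

def update_hot_cache(latest_summary: str) -> None:
    """Overwrite hot.md with the newest conversation summary, truncated
    to the cache limit so an agent can read it in one quick pass."""
    HOT.write_text(latest_summary[:HOT_LIMIT], encoding="utf-8")
```

Overwriting rather than appending is the point: the hot cache always holds only the latest context, while history lives in the daily logs and wiki.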

Compounding Knowledge Loop

  1. Ask a question → the agent searches across wiki articles
  2. The agent synthesizes an answer by connecting information from multiple pages
  3. The answer is filed back into the wiki, linking information across conversations
  4. The wiki grows over time from both past conversations and future sessions
  5. The agent gives better answers over time, with no manual maintenance required
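The file-back step (step 3) is what makes the loop compound: each synthesized answer becomes a new wiki article linked to its sources. A minimal sketch, assuming a hypothetical `knowledge/` folder and Obsidian-style `[[...]]` backlinks:

```python
import re
from datetime import date
from pathlib import Path

KNOWLEDGE = Path("knowledge")  # compiled-wiki folder, name assumed from the video

def file_back(question: str, answer: str, sources: list[str]) -> Path:
    """Write a synthesized answer back into the wiki so the next
    session can build on it instead of re-deriving it."""
    KNOWLEDGE.mkdir(exist_ok=True)
    # Turn the question into a filesystem-safe slug for the article name.
    slug = re.sub(r"[^a-z0-9]+", "-", question.lower()).strip("-")[:40]
    article = KNOWLEDGE / f"{slug}.md"
    backlinks = "\n".join(f"- [[{s}]]" for s in sources)  # Obsidian-style links
    article.write_text(
        f"# {question}\n\n{answer}\n\n_Sources ({date.today().isoformat()}):_\n\n{backlinks}\n",
        encoding="utf-8",
    )
    return article
```

The backlinks are what let later queries hop from a filed answer to the original articles, which is how answers "connect information between conversations."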

Customization Advantage

Unlike Claude Code's built-in memory system, this system is fully customizable:

  - Prompts for compilation and flushing can be edited
  - Claude Code can walk you through making customizations because it has access to AGENTS.md
  - It's self-contained and can improve itself