I have been building RAG systems for over a year now. Fifteen production deployments across enterprise clients. Chunking strategies, embedding models, retrieval pipelines, rerankers. The whole stack. And the thing that has been quietly bothering me is that none of it actually learns.
Every query starts from zero. The system retrieves chunks, generates an answer, and then forgets everything about the interaction. Ask a subtle question that requires synthesizing five documents, and the LLM has to find and piece together the relevant fragments every single time. Nothing accumulates. Nothing compounds.
Then Andrej Karpathy published a gist that articulated what I had been feeling but could not quite name. He called it the LLM Wiki pattern. And it is, in my opinion, one of the most important ideas in personal knowledge management right now.

## Table of Contents
- The Problem with RAG as a Knowledge System
- What the LLM Wiki Pattern Actually Is
- The Three Layer Architecture
- The Three Operations That Make It Work
- Indexing and Navigation at Scale
- Where This Pattern Shines
- How This Compares to RAG
- The Real Insight: Maintenance Cost Is the Bottleneck
- How to Get Started
- FAQ
## The Problem with RAG as a Knowledge System
RAG works. I am not going to pretend otherwise. For many use cases, retrieval augmented generation is exactly the right tool. You have a corpus of documents, a user asks a question, you retrieve relevant chunks, and you generate an answer grounded in real data. It is solid engineering.
But RAG has a fundamental limitation that most teams do not talk about. It is stateless. Every query is independent. The system never builds up an understanding of the material it is working with. It never notices patterns across documents. It never flags contradictions between sources. It never synthesizes a position that evolves as new information arrives.
Think about how a human researcher works. When you read a paper, you do not just file it away. You connect it to things you already know. You notice that this finding contradicts something from last month. You update your mental model. You form opinions. Your understanding compounds.
RAG does none of that. It is a librarian that fetches books but never reads them. NotebookLM, ChatGPT file uploads, and most RAG systems all work this way. They rediscover knowledge from scratch on every question.
For simple Q&A over a document collection, this is fine. For building deep understanding of a domain over weeks or months, it is inadequate.
## What the LLM Wiki Pattern Actually Is
Karpathy's idea is deceptively simple. Instead of retrieving from raw documents at query time, you have the LLM incrementally build and maintain a persistent wiki. A structured, interlinked collection of markdown files that sits between you and the raw sources.
When you add a new source, the LLM does not just index it for later retrieval. It reads the source, extracts key information, and integrates it into the existing wiki. It updates entity pages, revises topic summaries, notes where new data contradicts old claims, and strengthens or challenges the evolving synthesis. The knowledge is compiled once and then kept current. Not re-derived on every query.
The wiki is a persistent, compounding artifact. The cross-references are already there. The contradictions have already been flagged. The synthesis already reflects everything you have read. Every source you add and every question you ask makes the wiki richer.
Here is what makes this different from just "ask the AI to take notes." You never write the wiki yourself. The LLM writes and maintains all of it. You are in charge of sourcing, exploration, and asking the right questions. The LLM does the grunt work. The summarizing, cross-referencing, filing, and bookkeeping that makes a knowledge base actually useful over time.
Karpathy describes the workflow like this: the LLM agent is open on one side and Obsidian is open on the other. The LLM makes edits based on the conversation, and you browse the results in real time. Obsidian is the IDE. The LLM is the programmer. The wiki is the codebase.
## The Three Layer Architecture
The pattern has three distinct layers, and the separation between them is important.
### Layer 1: Raw Sources
Your curated collection of source documents. Articles, papers, images, data files, meeting transcripts, journal entries. These are immutable. The LLM reads from them but never modifies them. This is your source of truth.
Think of this as your evidence locker. You want to keep the original material untouched so you can always trace back to where a claim came from. If the wiki says something surprising, you can go check the raw source.
### Layer 2: The Wiki
A directory of LLM-generated markdown files. Summaries, entity pages, concept pages, comparisons, an overview page, a running synthesis. The LLM owns this layer entirely. It creates pages, updates them when new sources arrive, maintains cross-references, and keeps everything consistent.
You read it. The LLM writes it. This is the knowledge layer where understanding accumulates.
### Layer 3: The Schema
A configuration document (like a CLAUDE.md for Claude Code or AGENTS.md for Codex) that tells the LLM how the wiki is structured, what conventions to follow, and what workflows to execute when ingesting sources, answering questions, or maintaining the wiki.
This is the meta-layer. It is what makes the LLM a disciplined wiki maintainer rather than a generic chatbot. You and the LLM co-evolve this document over time as you figure out what works for your domain. The schema is where you encode your preferences, your organizational logic, and the rules that keep the wiki coherent as it grows.
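As a concrete sketch, a starter schema file might look something like this. The layout, page conventions, and workflow steps below are illustrative assumptions, not prescriptions from Karpathy's gist:

```markdown
# Wiki Schema

## Layout
- raw/            immutable source documents (read, never edit)
- wiki/           LLM-maintained pages (edit freely)
- wiki/index.md   catalog of every page, updated on every ingest
- wiki/log.md     append-only activity log

## Page conventions
- One page per entity, concept, or source summary; kebab-case filenames.
- Every page opens with a one-line summary and cites its sources inline.
- Link related pages with [[wikilinks]].

## Ingest workflow
1. Read the new source in raw/.
2. Write or update its summary page in wiki/.
3. Update every entity and concept page it touches; flag contradictions.
4. Update index.md and append an entry to log.md.
```

The point is not these particular rules but that the rules live in one file the LLM reads before every operation, so the wiki stays consistent across sessions.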
## The Three Operations That Make It Work
The LLM Wiki pattern defines three core operations. Each one serves a different purpose, and together they create a feedback loop that keeps the wiki healthy and useful.
### Ingest
You drop a new source into the raw collection and tell the LLM to process it. The LLM reads the source, discusses key takeaways with you, writes a summary page in the wiki, updates the index, updates relevant entity and concept pages across the wiki, and appends an entry to the log.
A single source might touch 10 to 15 wiki pages. That is the compounding effect in action. Every new piece of information ripples through the existing knowledge structure.
Karpathy prefers to ingest sources one at a time and stay involved. Read the summaries, check the updates, guide the LLM on what to emphasize. But you could also batch-ingest many sources at once with less supervision. The right workflow depends on your style.
### Query
You ask questions against the wiki. The LLM searches for relevant pages, reads them, and synthesizes an answer with citations. Answers can take different forms depending on the question. A markdown page, a comparison table, a slide deck, a chart, a canvas.
Here is the important insight. Good answers can be filed back into the wiki as new pages. A comparison you asked for, an analysis, a connection you discovered. These are valuable and should not disappear into chat history. When your explorations compound in the knowledge base just like ingested sources do, the wiki gets richer from both external sources and your own thinking.
### Lint
Periodically, you ask the LLM to health-check the wiki. Look for contradictions between pages. Find stale claims that newer sources have superseded. Identify orphan pages with no inbound links. Spot important concepts mentioned but lacking their own page. Flag missing cross-references. Suggest data gaps that could be filled with a web search.
The LLM is surprisingly good at suggesting new questions to investigate and new sources to look for. This keeps the wiki healthy as it grows and prevents the kind of decay that kills most personal knowledge systems.
## Indexing and Navigation at Scale
Two special files help both the LLM and you navigate the wiki as it grows.
index.md is content-oriented. It is a catalog of everything in the wiki. Each page listed with a link, a one-line summary, and optionally metadata like date or source count. Organized by category. The LLM updates it on every ingest. When answering a query, the LLM reads the index first to find relevant pages, then drills into them. This works surprisingly well at moderate scale, around 100 sources and hundreds of pages, and avoids the need for embedding-based RAG infrastructure entirely.
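For illustration, an index.md might look like this. The categories, page names, and metadata here are invented, not part of the pattern:

```markdown
# Index

## Sources
- [scaling-laws-summary](sources/scaling-laws-summary.md) — key claims and caveats (added 2026-01-12)

## Concepts
- [in-context-learning](concepts/in-context-learning.md) — definition, evidence, open questions (3 sources)

## Syntheses
- [current-thesis](current-thesis.md) — running position, revised on every ingest
```

Because every page gets a one-line summary here, the LLM can scan one file and decide what to open instead of searching the whole wiki.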
log.md is chronological. It is an append-only record of what happened and when. Ingests, queries, lint passes. If each entry starts with a consistent prefix, the log becomes parseable with simple unix tools. The log gives you a timeline of the wiki's evolution and helps the LLM understand what has been done recently.
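If you adopt a typed prefix for each entry (the INGEST/QUERY prefixes below are an assumed convention, not something the pattern mandates), plain unix tools are enough to query the log:

```shell
# Build a tiny sample log using the assumed typed-prefix convention.
printf '%s\n' \
  'INGEST 2026-01-10 scaling-paper' \
  'QUERY  2026-01-11 compare-models' \
  'INGEST 2026-01-12 survey' > log.md

grep '^INGEST' log.md     # list every ingest event, in order
grep -c '^QUERY' log.md   # count how many questions have been asked
tail -n 1 log.md          # see the most recent activity
```

The append-only discipline is what makes this work: entries are never rewritten, so line-oriented tools always give a faithful timeline.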
At some point you may want proper search. Karpathy mentions qmd, a local search engine for markdown files with hybrid BM25/vector search and LLM re-ranking. But at small to moderate scale, the index file alone is enough.
## Where This Pattern Shines
The LLM Wiki pattern is not a one-size-fits-all solution. It is best suited for situations where you are accumulating knowledge over time and want it organized rather than scattered.
Research deep dives. Going deep on a topic over weeks or months. Reading papers, articles, reports, and incrementally building a comprehensive wiki with an evolving thesis. Each new source updates and challenges what came before.
Reading a book. Filing each chapter as you go, building out pages for characters, themes, plot threads, and how they connect. By the end you have a rich companion wiki. Think of fan wikis like Tolkien Gateway, with thousands of interlinked pages covering characters, places, events, and languages. You could build something like that personally as you read, with the LLM doing all the cross-referencing and maintenance.
Personal development. Tracking your own goals, health, psychology, self-improvement. Filing journal entries, articles, podcast notes. Building a structured picture of yourself over time. Patterns emerge that you would never notice in scattered notes.
Team knowledge management. An internal wiki maintained by LLMs, fed by Slack threads, meeting transcripts, project documents, customer calls. The wiki stays current because the LLM does the maintenance that no one on the team wants to do. This is the use case I find most compelling for enterprise teams.
Competitive analysis, due diligence, trip planning, course notes, hobby deep dives. Anything where knowledge accumulates and organization matters.
## How This Compares to RAG
Let me be direct about the trade-offs.
| Dimension | RAG | LLM Wiki |
|---|---|---|
| Knowledge state | Stateless, re-derived per query | Persistent, compounding |
| Cross-references | Discovered at query time | Pre-built and maintained |
| Contradictions | Not detected | Flagged during ingest |
| Setup complexity | Higher (embeddings, vector DB, retrieval pipeline) | Lower (just markdown files) |
| Scale | Handles millions of documents | Best at hundreds of sources |
| Latency | Retrieval + generation per query | Index lookup + generation |
| Maintenance | Automated (re-embed on change) | LLM-maintained (ingest workflow) |
| Human involvement | Low after setup | Higher, but more productive |
RAG is the right choice when you have a large corpus and need to answer questions against it with minimal human involvement. Customer support, documentation search, enterprise Q&A.
The LLM Wiki pattern is the right choice when you are building understanding over time, when cross-references and synthesis matter, and when you want your explorations to compound. Research, learning, personal development, strategic analysis.
They are not mutually exclusive. You could use RAG as a search layer within a wiki system. But the mental models are different. RAG optimizes for retrieval. The wiki pattern optimizes for understanding.
## The Real Insight: Maintenance Cost Is the Bottleneck
Here is why I think this pattern matters beyond the specific implementation details.
The tedious part of maintaining a knowledge base is not the reading or the thinking. It is the bookkeeping. Updating cross-references. Keeping summaries current. Noting when new data contradicts old claims. Maintaining consistency across dozens of pages.
Humans abandon wikis because the maintenance burden grows faster than the value. I have seen it happen in every team I have worked with. Someone starts a Notion workspace with great intentions, and six months later it is a graveyard of outdated pages that nobody trusts.
LLMs do not get bored. They do not forget to update a cross-reference. They can touch 15 files in one pass. The wiki stays maintained because the cost of maintenance drops to near zero.
The human's job is to curate sources, direct the analysis, ask good questions, and think about what it all means. The LLM's job is everything else.
Karpathy draws a connection to Vannevar Bush's Memex from 1945. A personal, curated knowledge store with associative trails between documents. Bush's vision was closer to this than to what the web became. Private, actively curated, with the connections between documents as valuable as the documents themselves. The part he could not solve was who does the maintenance. Now, 80 years later, the LLM handles that.
## How to Get Started
The pattern is intentionally abstract. There is no specific tool or framework required. Here is how I would start.
1. Pick your LLM agent. Claude Code, OpenAI Codex, or any agent that can read and write files. The key requirement is persistent file access.
2. Create the three directories. A `raw/` folder for sources, a `wiki/` folder for the LLM-generated pages, and a schema file (CLAUDE.md or equivalent) with your conventions.
3. Write a minimal schema. Tell the LLM what kind of pages to create, how to name them, what to include in each page, and what the ingest workflow looks like. Start simple. You will evolve this.
4. Ingest your first source. Drop in an article or paper and ask the LLM to process it. Watch what pages it creates. Adjust your schema based on what you like and do not like.
5. Build from there. Add sources one at a time at first. Get comfortable with the workflow. Then experiment with batch ingestion, query-filing, and lint passes.
6. Use Obsidian (or any markdown viewer) as your reading interface. The graph view is particularly useful for seeing the shape of your wiki and spotting orphan pages.
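The scaffold takes four commands. The `llm-wiki` directory name is my own choice for the example; the `raw/`, `wiki/`, and CLAUDE.md names follow the layer descriptions above:

```shell
mkdir -p llm-wiki/raw llm-wiki/wiki                # layer 1 (sources) and layer 2 (the wiki)
touch llm-wiki/CLAUDE.md                           # layer 3: the schema your agent reads
touch llm-wiki/wiki/index.md llm-wiki/wiki/log.md  # navigation files, maintained by the LLM
git init -q llm-wiki                               # version history and collaboration for free
```

From here, everything else is conversation: point the agent at the schema and start ingesting.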
The whole thing is just a git repo of markdown files. You get version history, branching, and collaboration for free. No database. No vector store. No infrastructure beyond a text editor and an LLM.
## FAQ
### Is this just Zettelkasten with AI?
There are similarities. Both systems emphasize atomic notes, cross-references, and emergent structure. But Zettelkasten requires the human to do all the linking and maintenance. The LLM Wiki pattern offloads that entirely. You focus on the thinking. The LLM focuses on the bookkeeping. In practice, this is the difference between a system you maintain for a month and one you maintain for a year.
### Does this work with Claude Code specifically?
Yes, and quite well. Claude Code has persistent file access, can read and write markdown files, and supports CLAUDE.md as a schema file natively. The agent can touch multiple files in a single pass, which is exactly what the ingest workflow requires. I have been experimenting with this pattern using Claude Code and the experience is smooth.
### What happens when the wiki gets too large for the context window?
This is where the index.md file becomes critical. The LLM reads the index first to find relevant pages, then reads only the pages it needs. At moderate scale, around a few hundred pages, this works well. Beyond that, you would want to add a search tool like qmd. The key insight is that you need far less context than a RAG system because the wiki pages are already synthesized and cross-referenced.
### Can I use this for a team, not just personal use?
Absolutely. Feed it Slack threads, meeting transcripts, project documents, customer call summaries. The LLM maintains the internal wiki that nobody on the team wants to maintain. The challenge is establishing review workflows so that the wiki stays trustworthy. You would want humans in the loop reviewing updates, at least initially.
### How is this different from NotebookLM?
NotebookLM is a RAG system. You upload documents and ask questions. It retrieves relevant chunks and generates answers. It does not build a persistent knowledge structure. It does not maintain cross-references. It does not flag contradictions. It re-derives everything on every query. The LLM Wiki pattern is architecturally different because the understanding is compiled once and kept current.
### What if the LLM gets something wrong in the wiki?
This is why the raw sources layer exists as an immutable source of truth. If a wiki page contains something surprising, you can trace it back to the original source. The lint operation also helps here, since it checks for contradictions and stale claims. And because the wiki is just markdown files in a git repo, you have full version history. You can see exactly what changed and when.
### Does this replace my existing note-taking system?
It does not have to. The LLM Wiki pattern can sit alongside your existing system. Your raw sources might include exports from Notion, Obsidian, or whatever you currently use. The wiki is an additional layer that provides synthesis and cross-referencing on top of your existing material. Think of it as a research assistant that maintains a companion wiki based on everything you feed it.