What Is a KEG? How I Built My Own Knowledge Graph

A KEG — Knowledge Exchange Graph — is a personal knowledge management system that organizes information as a network of interconnected nodes instead of folders and documents. I’ve been building and refining this system since late 2022, and it’s become the foundation for how I capture, connect, and retrieve everything I know.

This is the story of how it came together.

It Started with Lost Paper

I started jotting notes and random scribbles on scraps of paper and found it genuinely helpful. But paper has a problem: I’d vaguely remember writing something down, want to reference it for a refresher, and the paper would always be gone. I needed a system where I could dump whatever was on my mind and retrieve the thought later — even after I’d completely forgotten about it.

The goal was never well-organized notes for their own sake. I needed a system with minimal friction for capture and reliable retrieval.

The Search for the Right Tool (Summer 2022)

I started exploring note-taking systems around the summer of 2022. The list was long: Obsidian, Notion, Emacs org-mode, various phone apps. I even bought a tablet with a pen to try handwriting-to-text conversion. It was clunky and I didn’t adopt it.

Each system had its own friction:

  • Microsoft OneNote — programming is text-based, and OneNote wasn’t a good fit at all
  • Notion — tied to the internet and painfully slow
  • Obsidian — syncing drained my phone battery, Google Docs didn’t work as a sync method, and the vim mode broke copy and paste
  • Google Keep — nice in a pinch but too minimal; it gets cluttered and hard to search over time
  • Handwriting apps — forced you into their ecosystem and required internet access

Offline access mattered to me. I’d keep notes about family events so we wouldn’t repeat the same cooking mistakes year after year. We always remember that we screwed up, just not how we screwed up.

Every system either locked you into a proprietary ecosystem, added too much friction to capture, or made retrieval unreliable at scale.

Thinking About How Retrieval Actually Works

Around this time I started questioning how my brain actually retrieves information. I’d never remember an idea directly, but I could remember things adjacent to it. I’d also have divergent ideas that branched in multiple directions.

Most note-taking systems feel linear. You start with a flat directory of titled files. It gets hard to manage, so you create a hierarchy of folders. That works if information fits neatly into one bucket — but it usually doesn’t.

Take a physics class that uses ideas from math. You also have a math class. Does that math concept go in the math folder or the physics folder? It could go in both. How do you find it later? You might remember for a while, but not forever. And how do you make it retrievable by other people or external systems?

Information is graph-based, not linear. Knowledge management naturally fits into a messy graph.

Discovering Zettelkasten and KEG

I started looking into link-based systems, which led me to the zettelkasten method — a system of networked note-taking where ideas are captured as individual cards linked to each other.

I tried bolting zettelkasten onto Notion and Obsidian. It kind of worked, but the same platform issues remained. Then I stumbled on a Twitch streamer who had created the KEG specification, essentially a zettelkasten-like system designed to be radically simple. All it required was a Unix filesystem and standard utilities. It was CLI-based (I’m a CLI junkie, almost to a fault), centered on markdown notes, and reused the development toolchain I was already comfortable with. Think git for knowledge.

The creator built a simple program around the workflow, with convenient helpers and an indexer that rebuilt the indexes automatically. It was designed around progressive enhancement and was extensible. The “changed” index listed the most recently edited notes, which was useful because newer notes were the ones I accessed most often.

My first commit on a KEG was November 15, 2022.

The System Worked — Until It Didn’t

The file-based approach worked well for a while. But as I accumulated more and more notes, grep and the changed index started to break down. The original spec assumed that discovery would be handled by an external search engine, which turned out to be a hard problem to solve.

I came up with a tagging system. I’d put tags in a meta.yaml file alongside each node. The original KEG author wasn’t a fan of frontmatter — it made parsing harder. I actually leaned into the separation: keeping metadata in meta.yaml made it easier to build more indexes, particularly around tags. Tags became a way to dynamically generate indexes on the fly.

Today, tapper, the CLI I use to work with my KEGs, has a query system built around boolean logic over meta attributes and tags. It works well for discovery across 1,500+ nodes.
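A boolean query over tags can be sketched in a few lines of Python. This is only an illustration of the idea, not tapper’s actual implementation or query syntax: the `Node` class, the `matches` and `query` functions, and the `all_of` / `any_of` / `none_of` clause names are all hypothetical, and the tags are held in memory rather than read from meta.yaml.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """In-memory stand-in for one node's tags (hypothetical shape)."""
    node_id: int
    tags: set[str] = field(default_factory=set)


def matches(node: Node, all_of=(), any_of=(), none_of=()) -> bool:
    """True if the node has every tag in all_of, at least one tag in any_of
    (when any_of is non-empty), and no tag in none_of."""
    if not set(all_of) <= node.tags:
        return False
    if any_of and not (set(any_of) & node.tags):
        return False
    return not (set(none_of) & node.tags)


def query(nodes, **clauses):
    """Return the IDs of matching nodes in ascending order."""
    return sorted(n.node_id for n in nodes if matches(n, **clauses))


# Example data; the tag names are borrowed from this post.
nodes = [
    Node(1, {"homelab", "php"}),
    Node(2, {"homelab", "cooking"}),
    Node(3, {"tapper"}),
]

# "homelab AND NOT cooking"
print(query(nodes, all_of=["homelab"], none_of=["cooking"]))  # [1]
```

Keeping the three clause types separate makes each query a plain set operation, which is cheap even across a few thousand nodes.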

From Flat Notes to Typed Entities

Over time, I noticed a pattern. Lots of my ideas were never captured because there was still too much manual work involved. I only took notes on things I thought were high-value. I looked at templates to auto-generate common structures but never settled on a design.

Then AI came along. I was trying to figure out how to get it to write my notes. I also stumbled on proof-of-concept work about using graphs for information retrieval — this was before GraphRAG was a named concept.

I introduced entity notes: an extension to the KEG spec that makes nodes typed. An entity is an additional attribute in the metadata that specifies how the note itself should be structured. A project node has certain expected sections. A guide has a particular shape. A patch records a code change with a diff and QA checklist.

AI agents query the most recent notes of the right entity type and use them as self-created templates. This takes very little code, just clever use of AI. The self-learning aspect of this extension is what makes the system unique.
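The selection step, picking the newest notes of one entity type to serve as examples, might look something like this sketch. The dictionary keys (`id`, `entity`, `updated`), the function name `recent_examples`, and the sample data are assumptions made for illustration; they are not taken from the KEG spec or from tapper.

```python
from datetime import datetime


def recent_examples(nodes, entity, limit=3):
    """Return up to `limit` nodes of the given entity type, newest first."""
    typed = [n for n in nodes if n["entity"] == entity]
    typed.sort(key=lambda n: datetime.fromisoformat(n["updated"]), reverse=True)
    return typed[:limit]


# Hypothetical metadata, as it might look after loading meta.yaml and stats.json.
nodes = [
    {"id": 10, "entity": "patch", "updated": "2024-01-05T09:00:00"},
    {"id": 11, "entity": "guide", "updated": "2024-02-01T12:00:00"},
    {"id": 12, "entity": "patch", "updated": "2024-03-10T18:30:00"},
    {"id": 13, "entity": "patch", "updated": "2023-11-20T08:15:00"},
]

print([n["id"] for n in recent_examples(nodes, "patch", limit=2)])  # [12, 10]
```

The selected notes would then be handed to the agent as examples of the expected structure for a new note of that type.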

I’ve since refined the process. Many notes are stateful and go through phases, so the AI should only see the finished product. How to do this efficiently is still something I’m researching, but I have working prototypes.

How a KEG Works

A KEG is a directory of numbered nodes. Each node is a folder (docs/123/) containing:

  • README.md — the note content, with links to other nodes using stable IDs ([Related concept](../456))
  • meta.yaml — metadata: tags, entity type, and any entity-specific attributes
  • stats.json — machine-managed data: created/updated timestamps, access time, access count

A keg config file at the root defines indexes, aliases, and tag definitions. Indexes under dex/ are regenerated deterministically from node metadata — nodes.tsv, per-tag indexes, backlinks. This makes the graph queryable by both humans and tools.
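To illustrate what “regenerated deterministically” means here, the sketch below rebuilds a node listing and a per-tag index from metadata so that the same input always yields the same output, whatever order the nodes are read in. The column layout (ID, updated timestamp, title) and the function names are assumptions for this example; the real nodes.tsv format may differ.

```python
def build_nodes_tsv(nodes):
    """One tab-separated line per node (id, updated, title), sorted by ID."""
    rows = sorted(nodes, key=lambda n: n["id"])
    return "".join(f'{n["id"]}\t{n["updated"]}\t{n["title"]}\n' for n in rows)


def build_tag_index(nodes):
    """Map each tag to the sorted list of node IDs that carry it."""
    index = {}
    for n in nodes:
        for tag in n["tags"]:
            index.setdefault(tag, []).append(n["id"])
    # Sort tags and IDs so the result does not depend on input order.
    return {tag: sorted(ids) for tag, ids in sorted(index.items())}


# Hypothetical metadata for two nodes, deliberately listed out of order.
nodes = [
    {"id": 456, "updated": "2024-03-10T18:30:00Z", "title": "Related concept",
     "tags": ["php", "homelab"]},
    {"id": 123, "updated": "2024-01-05T09:00:00Z", "title": "Homelab overview",
     "tags": ["homelab"]},
]

print(build_nodes_tsv(nodes), end="")
print(build_tag_index(nodes))  # {'homelab': [123, 456], 'php': [456]}
```

Because the indexes are pure functions of the node metadata, they can be deleted and rebuilt at any time without losing information.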

Entities and tags work together

  • Entity describes what a node is: concept, project, guide, patch, task, recipe
  • Tags describe what areas it belongs to: homelab, php, tapper, cooking

Each KEG defines its own ontology. There’s no universal standard. My personal KEG emphasizes software projects and infrastructure. Someone else’s might focus on creative writing or cooking. The flexibility is the point.

Multiple retrieval paths

A KEG supports several ways to find knowledge:

  • Direct lookup — access a node by its ID
  • Tag-based discovery — “show me all nodes about homelab”
  • Content search — grep across all node content
  • Backlinks — see what other nodes reference a particular concept
  • Associative navigation — start at one node and follow links to related concepts

You don’t have to remember the perfect folder or filename. Multiple retrieval paths mean you rediscover knowledge in different ways — the same way your brain works.
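Backlinks are a good example of a retrieval path that falls out of the file format. Since links use the `../456` form shown earlier, a backlink index can be derived by scanning note text. The sketch below assumes the README contents have already been loaded into a dictionary; the regular expression and the `backlinks` function are illustrative, not the actual indexer.

```python
import re
from collections import defaultdict

# Matches the target of links such as [Related concept](../456) or (../456/).
LINK = re.compile(r"\]\(\.\./(\d+)/?\)")


def backlinks(contents):
    """contents maps node ID -> README text.
    Returns a dict of target ID -> sorted list of IDs that link to it."""
    incoming = defaultdict(set)
    for source, text in contents.items():
        for target in LINK.findall(text):
            incoming[int(target)].add(source)
    return {target: sorted(sources) for target, sources in incoming.items()}


# Hypothetical note contents.
contents = {
    123: "See [Related concept](../456) and [Other](../789).",
    456: "Back to [Overview](../123).",
    789: "Also relevant: [Related concept](../456/).",
}

print(backlinks(contents)[456])  # [123, 789]
```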

Why Not Just Use Obsidian / Notion / Roam?

KEGs trade polish for properties I care about more:

Aspect         | Platform apps                   | KEG
Format         | Proprietary or semi-open        | Plain markdown and YAML
Storage        | Cloud-dependent or app-specific | Local filesystem, git-friendly
Sync           | Platform-managed                | Git, rsync, whatever you want
Automation     | Plugin ecosystem                | Standard CLI tools
AI integration | API wrappers                    | Direct file access, MCP server
Offline        | Varies                          | Always works
Lock-in        | High                            | None

The key advantage: because a KEG is just files and folders, any tool that can read text can work with it. Including AI agents.

KEGs as AI Agent Memory

This turned out to be the most interesting evolution. Because each node is a plain-text file with structured metadata and stable IDs, an AI agent can query, create, and link nodes using the same CLI commands a human would. I now use my KEGs to pass notes and context between AI sessions — a way to give agents durable, cross-session memory without needing a database.

The entity system makes this especially powerful. An AI agent can look at recent patch nodes to understand what templates to follow, or query task nodes to understand what work is in progress. The graph structure means the agent can follow links to build context, just like a human would.
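Following links to build context amounts to a bounded graph traversal. A minimal sketch, assuming the outgoing links have already been extracted into an adjacency dictionary: the function name `gather_context` and the `max_depth` parameter are hypothetical, and a real agent would likely also cap the total amount of text it collects.

```python
from collections import deque


def gather_context(links, start, max_depth=2):
    """Breadth-first walk from `start`.
    Returns the node IDs reachable within max_depth hops, in visit order."""
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand nodes at the depth limit
        for neighbor in links.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append((neighbor, depth + 1))
    return order


# Hypothetical link graph: node ID -> IDs it links to.
links = {1: [2, 3], 2: [4], 3: [1], 4: [5]}

print(gather_context(links, 1, max_depth=2))  # [1, 2, 3, 4]
```

Breadth-first order means the closest, and usually most relevant, nodes come first, so the list can be truncated if the context budget runs out.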

tapper now includes an MCP server so AI agents can talk to the knowledge graph directly over the Model Context Protocol, rather than shelling out to the CLI.

What’s Next

The graph-based structure opens up interesting possibilities for search and discovery that I haven’t fully explored yet. The file-based backend works well for a single user, but sharing a KEG across a team is the next frontier — an API-backed backend that preserves the same node model and CLI interface while enabling collaboration without shared filesystems or git-pull workflows.

I wonder what the next iteration will be.

Getting Started

If you want to build your own KEG:

  1. Start simple — numbered folders with README.md files and links between them
  2. Add tags gradually — use meta.yaml for categorization as your collection grows
  3. Introduce entities when you need them — typed nodes emerge naturally as patterns repeat
  4. Link thoughtfully — connect related nodes to build the graph
  5. Query and discover — use CLI tools to explore your knowledge
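The first two steps need very little tooling. The sketch below creates numbered node folders with a README.md and a meta.yaml, then links one node to another by its ID. It is a starting point under stated assumptions, not the KEG reference tooling: the `create_node` helper, the choice to start numbering at 1, and the exact meta.yaml contents are all made up for this example.

```python
import tempfile
from pathlib import Path


def create_node(root: Path, title: str, tags=()) -> Path:
    """Create the next numbered node folder with a README.md and meta.yaml."""
    root.mkdir(parents=True, exist_ok=True)
    existing = [int(p.name) for p in root.iterdir()
                if p.is_dir() and p.name.isdigit()]
    node_dir = root / str(max(existing, default=0) + 1)  # assumed: IDs start at 1
    node_dir.mkdir()
    (node_dir / "README.md").write_text(f"# {title}\n", encoding="utf-8")
    tag_lines = "".join(f"  - {t}\n" for t in tags)
    (node_dir / "meta.yaml").write_text(
        f"tags:\n{tag_lines}" if tags else "tags: []\n", encoding="utf-8"
    )
    return node_dir


# Demo in a throwaway directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    first = create_node(docs, "Homelab overview", tags=["homelab"])
    second = create_node(docs, "Backup plan")
    # Link the second node to the first using its stable ID.
    with (second / "README.md").open("a", encoding="utf-8") as f:
        f.write(f"\nSee [Homelab overview](../{first.name}).\n")
    print(sorted(p.name for p in docs.iterdir()))  # ['1', '2']
```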

You don’t need to be perfect from day one. My ontology and tagging system have gone through multiple iterations and will keep evolving.

Your knowledge deserves to be more than a folder structure. It deserves to be a graph.