CodeGraph pre-indexes your repository into a local semantic knowledge graph so coding agents (Claude Code, Cursor, Codex, OpenCode, and others) spend fewer tokens on grep-and-read exploration. Dr. Alvaro Cintas’s LinkedIn post highlights up to 94% fewer tool calls and 77% faster codebase exploration; the project’s own multi-repo benchmarks report averages of about 71% fewer tool calls and 46% less wall time, with the largest wins on the TypeScript and Rust repositories.
```mermaid
%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
graph TD
REPO[Source files] --> INDEX[tree-sitter + SQLite graph]
INDEX --> MCP[CodeGraph MCP server]
MCP --> AGENT[Claude Code / Cursor / Codex]
AGENT --> OUT[Answer with fewer Read/Grep loops]
classDef agent fill:#8B0000,color:#fff
classDef hook fill:#189AB4,color:#fff
class AGENT agent
class INDEX hook
class MCP hook
```

Without an index, agents re-scan the repo; CodeGraph answers from a local map built once.
The problem: exploration burns tokens
When an agent lacks structural context, it often spawns Explore sub-agents that chain grep, glob, and Read across thousands of files—paying model tokens for every hop. Architecture questions on repos like VS Code or Excalidraw can balloon to dozens of tool calls and millions of tokens before the model reads the right module.

SQLite-backed graph keeps source intelligence on-device for Claude Code, Cursor, and Codex.
What CodeGraph does
CodeGraph (MIT, by Colby McHenry) builds a pre-indexed graph on your machine: symbols, call relationships, full-text search (SQLite FTS5), framework routes, and optional cross-language bridges (Swift↔ObjC, React Native, Expo). Agents reach it through an MCP server (codegraph serve --mcp)—no source upload, no API keys for indexing.
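To make the idea of a local symbol graph concrete, here is a minimal sketch of a call graph stored in SQLite and queried for the set of symbols a change could affect. The table names (`symbols`, `calls`), the sample symbols, and the `impact_radius` helper are hypothetical illustrations, not CodeGraph's actual schema or API.

```python
import sqlite3


def build_demo_graph() -> sqlite3.Connection:
    """Create an in-memory graph: symbols are nodes, calls are edges."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE symbols (id INTEGER PRIMARY KEY, name TEXT UNIQUE, file TEXT);
        CREATE TABLE calls (caller INTEGER REFERENCES symbols(id),
                            callee INTEGER REFERENCES symbols(id));
    """)
    # Hypothetical symbols, invented for this example only.
    db.executemany("INSERT INTO symbols VALUES (?, ?, ?)", [
        (1, "handle_request", "server.py"),
        (2, "parse_body", "parser.py"),
        (3, "validate", "parser.py"),
        (4, "render", "views.py"),
    ])
    # handle_request -> parse_body -> validate, and handle_request -> render
    db.executemany("INSERT INTO calls VALUES (?, ?)", [(1, 2), (2, 3), (1, 4)])
    return db


def impact_radius(db: sqlite3.Connection, name: str) -> list:
    """Return every symbol that directly or transitively calls `name`."""
    rows = db.execute("""
        WITH RECURSIVE affected(id) AS (
            SELECT id FROM symbols WHERE name = ?
            UNION
            SELECT c.caller FROM calls c JOIN affected a ON c.callee = a.id
        )
        SELECT s.name FROM affected a JOIN symbols s ON s.id = a.id
        WHERE s.name != ?
        ORDER BY s.name
    """, (name, name)).fetchall()
    return [row[0] for row in rows]


db = build_demo_graph()
print(impact_radius(db, "validate"))  # → ['handle_request', 'parse_body']
```

A single recursive query like this replaces the chain of grep and read calls an agent would otherwise need to walk the same call path by hand.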
- Smart context: tools like codegraph_context return entry points, related symbols, and snippets in one shot
- Traversal: explore callers, callees, and impact radius before refactors
- Routes: 14+ web frameworks (Django, FastAPI, Express, NestJS, Rails, Spring, Gin, Axum, etc.) link URL patterns to handlers
- Fresh index: native file watchers (FSEvents / inotify / ReadDirectoryChangesW) with debounced re-sync; staleness banners during pending updates
- 20+ languages: TypeScript, Python, Rust, Go, Java, Swift, Kotlin, C#, PHP, and more
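The "debounced re-sync" and "staleness banner" behavior in the list above can be sketched as follows. This is an illustration under stated assumptions, not CodeGraph's watcher: the `DebouncedSync` class and the 0.5-second quiet period are hypothetical, and timestamps are passed in explicitly so the example runs without a real file watcher.

```python
from __future__ import annotations


class DebouncedSync:
    """Collect changed paths and release them only after a quiet period."""

    def __init__(self, quiet_seconds: float):
        self.quiet_seconds = quiet_seconds
        self.pending: set[str] = set()
        self.last_event = 0.0

    def on_change(self, path: str, now: float) -> None:
        """Record a file-change event; each event restarts the quiet timer."""
        self.pending.add(path)
        self.last_event = now

    def is_stale(self) -> bool:
        """True while changes are waiting to be re-indexed."""
        return bool(self.pending)

    def poll(self, now: float) -> list[str]:
        """Return the batch to re-index once the quiet period has elapsed."""
        if not self.pending or now - self.last_event < self.quiet_seconds:
            return []
        batch = sorted(self.pending)
        self.pending.clear()
        return batch


sync = DebouncedSync(quiet_seconds=0.5)
sync.on_change("src/app.ts", now=0.0)
sync.on_change("src/db.ts", now=0.3)
print(sync.poll(now=0.6))  # → [] (only 0.3 s since the last event)
print(sync.is_stale())     # → True (this is when a staleness banner would show)
print(sync.poll(now=1.0))  # → ['src/app.ts', 'src/db.ts']
```

Batching this way means a burst of saves triggers one re-index instead of one per file, at the cost of a short window in which the index lags the working tree.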
Benchmarks (with vs without CodeGraph)
Official methodology (re-validated on v0.9.4, May 2026): headless Claude Code with Opus 4.7, one architecture question per repo, 4 runs per arm, median reported. WITH = CodeGraph MCP enabled; WITHOUT = empty MCP config but built-in Read/Grep/Bash still available.
| Codebase | Language | Tool calls saved | Tokens saved | Time saved | Cost saved |
|---|---|---|---|---|---|
| VS Code | TypeScript (~10k files) | 85% | 78% | 52% | 26% |
| Excalidraw | TypeScript (~640 files) | 96% | 90% | 73% | 52% |
| Tokio | Rust (~790 files) | 92% | 86% | 71% | 82% |
| Django | Python (~3k files) | 53% | 36% | 19% | 12% |
| Alamofire | Swift (~110 files) | 83% | 64% | 48% | 47% |
| Gin | Go (~110 files) | 40% | 34% | 27% | 21% |
| Average | 7 repos | 71% | 57% | 46% | 35% |
Example medians on VS Code (“How does the extension host communicate with the main process?”): 8 tool calls with CodeGraph vs 55 without; ~601k vs ~2.8M tokens. Cintas’s 94% / 77% figures are close to the best cells in the table (e.g. Excalidraw: 96% fewer calls, 73% faster), but not every project sees that peak; small repos like Gin show narrower margins because naive search is already cheap.
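As a quick check on how the "saved" percentages relate to the raw medians, the arithmetic for the VS Code example can be worked through in a few lines. The `percent_saved` helper is a hypothetical name; the inputs are the rounded medians quoted above, so the results are approximate.

```python
def percent_saved(with_index: float, without_index: float) -> float:
    """Percentage reduction relative to the no-index baseline."""
    return (1 - with_index / without_index) * 100


# Rounded medians quoted for the VS Code question.
calls_saved = percent_saved(8, 55)
tokens_saved = percent_saved(601_000, 2_800_000)

print(f"tool calls saved: {calls_saved:.1f}%")  # → tool calls saved: 85.5%
print(f"tokens saved: {tokens_saved:.1f}%")     # → tokens saved: 78.5%
```

Both values agree with the table's VS Code row (85% and 78%) to within the rounding of the quoted inputs.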
Supported agents
The interactive installer wires up MCP config for Claude Code, Cursor, Codex CLI, OpenCode, Hermes Agent, Gemini CLI, Antigravity, and Kiro. The LinkedIn post matches the README’s core quartet plus OpenCode.
Install and index a project
```shell
# macOS / Linux (bundled runtime, no Node required)
curl -fsSL https://raw.githubusercontent.com/colbymchenry/codegraph/main/install.sh | sh
# Or npm / npx
npm i -g @colbymchenry/codegraph
# Register MCP with your agent(s)
codegraph install
# Index the repo (interactive init)
cd your-project
codegraph init -i
# Optional: serve MCP manually
codegraph serve --mcp
```
Indexes live under .codegraph/ per project. Remove agent integration with codegraph uninstall; drop project data with codegraph uninit.
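If you want to see roughly what an MCP registration looks like, the sketch below builds the kind of server entry that several MCP clients read from a JSON config. Only the command `codegraph serve --mcp` comes from the text above. The `mcpServers` layout follows the convention used by clients such as Claude Code and Cursor, and the server name `codegraph` and the `mcp_entry` helper are assumptions. What `codegraph install` actually writes, and where, may differ per agent.

```python
from __future__ import annotations

import json


def mcp_entry(name: str, command: str, args: list[str]) -> dict:
    """Build a stdio MCP server entry in the common `mcpServers` layout."""
    return {"mcpServers": {name: {"command": command, "args": args}}}


# Assumed server name; the command and flag are the ones shown in this article.
config = mcp_entry("codegraph", "codegraph", ["serve", "--mcp"])
print(json.dumps(config, indent=2))
```

Checking your agent's config for an entry along these lines is a simple way to confirm that the MCP server is registered before you compare runs.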
Why it wins (and when it does not)
| Works well | Less benefit |
|---|---|
| Large monorepos and architecture / “how does X work?” questions | Tiny codebases where grep is already fast |
| Privacy-sensitive or air-gapped work (100% local SQLite) | Agents that ignore MCP and delegate everything to file-reading sub-agents |
| Impact analysis before wide refactors | Tasks that depend on files the watcher has not yet synced |
| Multi-language mobile (RN / Expo bridging) | One-off edits where the model already knows exact file paths |
Maintainers note CodeGraph only helps when the primary agent queries the graph directly; otherwise Explore sub-agents may still burn tokens on raw file reads. Project instructions steer agents toward codegraph_context first, then targeted exploration—mirroring the “don’t burn tokens exploring” message in Cintas’s post.
Performance snapshot
| Metric | Typical range (official medians) |
|---|---|
| Fewer tool calls | 40–96% per repo; ~71% average |
| Fewer tokens | 13–90%; ~57% average |
| Faster wall time | 19–73%; ~46% average |
| Lower run cost (Claude Opus 4.7) | 2–82%; ~35% average |
| Calls with index (VS Code example) | 8 vs 55 without |
| License / hosting | MIT tool; index stays local |
For teams running agents on big codebases daily, CodeGraph is a practical layer between “raw repository” and “model context”: pay indexing cost once, then replace repetitive discovery loops with graph queries. Start with codegraph init -i on your main app, confirm MCP is active in your agent, and compare tool-call counts on the same architecture prompt—with and without the index.