
docs: overhaul CLAUDE.md and add scripts/release.sh + Cursor rules file

Replaces the old Claude-only CLAUDE.md with a comprehensive guide covering
the full project architecture, multi-agent installer, test conventions,
NodeKind/EdgeKind reference, and release workflow. Key additions:

- Documents the layered pipeline, all module paths, and the multi-target
  installer (targets/, registry.ts, AgentTarget interface).
- Adds the Cursor `--path` quirk and the "update all three surfaces" rule
  when changing MCP tool guidance.
- Documents `npm run eval`, `test:eval`, and the full set of build/test
  commands including single-file patterns.
- `scripts/release.sh` — idempotent bash script that tags the current
  commit, pushes the tag, and creates a GitHub Release whose notes are
  extracted from the matching `## [X.Y.Z]` block in CHANGELOG.md. Safe
  to re-run after partial failure.
- `.cursor/rules/codegraph.mdc` — Cursor-specific agent instructions
  (tool decision table, rules of thumb, index-lag warning) written by
  the installer and kept in sync with server-instructions.ts and
  instructions-template.ts.
Colby McHenry, 1 month ago
parent
commit
82fc599d8d
3 changed files with 186 additions and 137 deletions
  1. 37 0
      .cursor/rules/codegraph.mdc
  2. 79 137
      CLAUDE.md
  3. 70 0
      scripts/release.sh

+ 37 - 0
.cursor/rules/codegraph.mdc

@@ -0,0 +1,37 @@
+---
+description: CodeGraph MCP usage guide — when to use which tool
+alwaysApply: true
+---
+<!-- CODEGRAPH_START -->
+## CodeGraph
+
+This project has a CodeGraph MCP server (`codegraph_*` tools) configured. CodeGraph is a tree-sitter-parsed knowledge graph of every symbol, edge, and file. Reads are sub-millisecond and return structural information grep cannot.
+
+### When to prefer codegraph over native search
+
+Use codegraph for **structural** questions — what calls what, what would break, where is X defined, what is X's signature. Use native grep/read only for **literal text** queries (string contents, comments, log messages) or after you already have a specific file open.
+
+| Question | Tool |
+|---|---|
+| "Where is X defined?" / "Find symbol named X" | `codegraph_search` |
+| "What calls function Y?" | `codegraph_callers` |
+| "What does Y call?" | `codegraph_callees` |
+| "What would break if I changed Z?" | `codegraph_impact` |
+| "Show me Y's signature / source / docstring" | `codegraph_node` |
+| "Give me focused context for a task/area" | `codegraph_context` |
+| "Survey an unfamiliar module/topic" | `codegraph_explore` |
+| "What files exist under path/" | `codegraph_files` |
+| "Is the index healthy?" | `codegraph_status` |
+
+### Rules of thumb
+
+- **Trust codegraph results.** They come from a full AST parse. Do NOT re-verify them with grep — that's slower, less accurate, and wastes context.
+- **Don't grep first** when looking up a symbol by name. `codegraph_search` is faster and returns kind + location + signature in one call.
+- **Don't chain `codegraph_search` + `codegraph_node`** when you just want context — `codegraph_context` is one call.
+- **`codegraph_explore` is the heavy hitter** for unfamiliar areas — it returns full source from all relevant files in one call, but is token-heavy. If your harness supports parallel subagents (e.g., Claude Code's Task tool), spawn one for explore-class questions to keep main session context clean.
+- **Index lag**: the file watcher debounces ~500ms behind writes; don't re-query immediately after editing a file in the same turn.
+
+### If `.codegraph/` doesn't exist
+
+The MCP server returns "not initialized." Ask the user: *"I notice this project doesn't have CodeGraph initialized. Want me to run `codegraph init -i` to build the index?"*
+<!-- CODEGRAPH_END -->
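
The `CODEGRAPH_START` / `CODEGRAPH_END` markers in this file suggest that the installer can rewrite just its own block and leave the rest of the file (such as the frontmatter above it) alone. The diff does not show how the installer actually does this, so the following is only a minimal sketch of the general marker-block technique. `upsertMarkedBlock` and `UpsertResult` are hypothetical names, and the `unchanged` status is borrowed from the contract tests described in CLAUDE.md.

```typescript
// Hypothetical helper for illustration; not taken from the CodeGraph source.
const START = "<!-- CODEGRAPH_START -->";
const END = "<!-- CODEGRAPH_END -->";

type UpsertResult = {
  content: string;
  status: "created" | "updated" | "unchanged";
};

/**
 * Insert or replace the marker-delimited block inside `existing`,
 * leaving everything outside the markers untouched.
 */
function upsertMarkedBlock(existing: string, body: string): UpsertResult {
  const block = `${START}\n${body.trim()}\n${END}`;
  const startIdx = existing.indexOf(START);
  const endIdx = existing.indexOf(END, startIdx + START.length);

  // No complete marker pair yet: append the block after any user content.
  if (startIdx === -1 || endIdx === -1) {
    const prefix =
      existing.trim().length === 0 ? "" : existing.replace(/\s*$/, "\n\n");
    return { content: `${prefix}${block}\n`, status: "created" };
  }

  // Replace only the span between (and including) the two markers.
  const next =
    existing.slice(0, startIdx) + block + existing.slice(endIdx + END.length);
  return { content: next, status: next === existing ? "unchanged" : "updated" };
}
```

Comparing the new content byte-for-byte with the old one is what lets a re-run report `unchanged` instead of rewriting the file.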

+ 79 - 137
CLAUDE.md

@@ -4,192 +4,134 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 
 ## Project Overview
 
-CodeGraph is a local-first code intelligence system that builds a semantic knowledge graph from any codebase. It provides structural understanding of code relationships using tree-sitter for AST parsing and SQLite for storage.
+CodeGraph is a local-first code intelligence library + CLI + MCP server. It parses any supported codebase with tree-sitter, stores symbols/edges/files in SQLite (FTS5), and exposes a knowledge graph to AI agents (Claude Code, Cursor, Codex CLI, opencode) over MCP. Per-project data lives in `.codegraph/`. Extraction is deterministic — derived from AST, not LLM-summarized.
 
-**Key characteristics:**
-- Headless library (no UI) - purely an API
-- Node.js runtime (works standalone, in Electron, or any Node environment)
-- Per-project data stored in `.codegraph/` directory
-- Deterministic extraction from AST, not AI-generated summaries
+Distributed as `@colbymchenry/codegraph` on npm; same binary serves as installer, indexer, and MCP server.
 
-## Build and Development Commands
+## Build, Test, Run
 
 ```bash
-# Build
-npm run build          # Compile TypeScript and copy assets
+npm run build           # tsc + copy schema.sql and *.wasm into dist/; chmods dist/bin/codegraph.js
+npm run dev             # tsc --watch
+npm run clean           # rm -rf dist
 
-# Test
-npm test               # Run all tests once
-npm run test:watch     # Run tests in watch mode
+npm test                # vitest run (all)
+npm run test:watch
+npm run test:eval       # only __tests__/evaluation/
+npm run eval            # build then run __tests__/evaluation/runner.ts via tsx
 
-# Clean
-npm run clean          # Remove dist/ directory
+npm run cli             # build then run the local dist binary
+
+# Single test file / pattern
+npx vitest run __tests__/installer-targets.test.ts
+npx vitest run __tests__/extraction.test.ts -t "TypeScript"
 ```
 
-## Running a Single Test
+`copy-assets` (called from `build`) copies `src/db/schema.sql` and all `src/extraction/wasm/*.wasm` files into `dist/`. **Any new SQL or grammar wasm must be copied or it won't ship.**
 
-```bash
-npx vitest run __tests__/extraction.test.ts           # Run specific test file
-npx vitest run __tests__/extraction.test.ts -t "TypeScript"  # Run tests matching pattern
-```
+Node engines: `>=18.0.0 <25.0.0`. There is a hard exit on Node 25.x (see `src/bin/node-version-check.ts`).
 
 ## Architecture
 
-### Core Module Structure
+### Layered pipeline
 
 ```
-src/
-├── index.ts              # Main CodeGraph class - public API entry point
-├── types.ts              # All TypeScript interfaces and types
-├── db/                   # SQLite database layer
-│   ├── index.ts          # DatabaseConnection class
-│   ├── queries.ts        # QueryBuilder with prepared statements
-│   └── schema.sql        # Table definitions with FTS5 search
-├── extraction/           # Tree-sitter AST parsing
-│   ├── index.ts          # ExtractionOrchestrator
-│   ├── tree-sitter.ts    # Universal parser wrapper
-│   └── grammars.ts       # Language detection and grammar loading
-├── resolution/           # Reference resolver
-│   ├── index.ts          # ReferenceResolver orchestrator
-│   ├── import-resolver.ts
-│   ├── name-matcher.ts
-│   └── frameworks/       # Framework-specific patterns (React, Express, Laravel, etc.)
-├── graph/                # Graph traversal and queries
-│   ├── index.ts          # GraphQueryManager
-│   ├── traversal.ts      # GraphTraverser (BFS/DFS, impact radius)
-│   └── queries.ts        # High-level graph queries
-├── context/              # Context building for AI assistants
-│   ├── index.ts          # ContextBuilder
-│   └── formatter.ts      # Markdown/JSON output formatting
-├── sync/                 # Incremental update system
-│   ├── index.ts
-│   └── git-hooks.ts      # Post-commit hook management
-├── installer/            # Interactive installer
-│   ├── index.ts          # Installer orchestrator
-│   ├── banner.ts         # ASCII art banner
-│   ├── claude-md-template.ts # CLAUDE.md template generator
-│   ├── config-writer.ts  # Configuration file writing
-│   └── prompts.ts        # User prompts
-├── mcp/                  # Model Context Protocol server
-│   ├── index.ts          # MCPServer class
-│   ├── tools.ts          # MCP tool definitions
-│   └── transport.ts      # Stdio transport
-└── bin/codegraph.ts      # CLI entry point
+files → ExtractionOrchestrator (tree-sitter) → DB (nodes/edges/files)
+              ↓
+       ReferenceResolver (imports, name-matching, framework patterns)
+              ↓
+       GraphQueryManager / GraphTraverser (callers, callees, impact)
+              ↓
+       ContextBuilder (markdown/JSON for AI consumption)
 ```
 
-### Key Classes
-
-- **CodeGraph** (`src/index.ts`): Main entry point. Lifecycle methods (`init`, `open`, `close`), indexing (`indexAll`, `sync`), graph queries (`traverse`, `getCallGraph`, `getImpactRadius`), context building (`buildContext`)
+The public API surface is `src/index.ts` — the `CodeGraph` class wires all the layers and re-exports types. Library users only touch this file; the MCP server and CLI also drive it.
 
-- **ExtractionOrchestrator** (`src/extraction/index.ts`): Coordinates file scanning, parsing, and storing. Uses tree-sitter native bindings for each supported language
+### Module layout
 
-- **GraphTraverser** (`src/graph/traversal.ts`): BFS/DFS traversal, call graph construction, impact radius calculation, path finding
+- `src/index.ts` — `CodeGraph` class: `init`/`open`/`close`, `indexAll`, `sync`, `searchNodes`, `getCallers`/`getCallees`, `getImpactRadius`, `buildContext`, `watch`/`unwatch`.
+- `src/db/` — `DatabaseConnection`, `QueryBuilder` (prepared statements), `schema.sql`. Backed by `better-sqlite3` (native) when available, transparently falls back to `node-sqlite3-wasm`. `codegraph status` surfaces which backend is live; wasm is the slow path.
+- `src/extraction/` — `ExtractionOrchestrator`, tree-sitter wrappers, per-language extractors under `languages/` (one file per language), plus standalone extractors for non-tree-sitter formats (`svelte-extractor.ts`, `vue-extractor.ts`, `liquid-extractor.ts`, `dfm-extractor.ts` for Delphi). `parse-worker.ts` runs heavy parsing off the main thread.
+- `src/resolution/` — `ReferenceResolver` orchestrates `import-resolver.ts` (with `path-aliases.ts` for tsconfig path aliases + cargo workspace member globs), `name-matcher.ts`, and `frameworks/` (Express, Laravel, Rails, FastAPI, Django, Flask, Spring, Gin, Axum, ASP.NET, Vapor, React Router, SvelteKit, Vue/Nuxt, Cargo workspaces). Frameworks emit `route` nodes and `references` edges.
+- `src/graph/` — `GraphTraverser` (BFS/DFS, impact radius, path finding) and `GraphQueryManager` (high-level queries).
+- `src/context/` — `ContextBuilder` + formatter for markdown/JSON output.
+- `src/search/` — full-text query parser and helpers for FTS5.
+- `src/sync/` — `FileWatcher` (native FSEvents/inotify/RDCW) with debounce + filter, and git-hook helpers.
+- `src/mcp/` — MCP server (`MCPServer`, `tools.ts`, `transport.ts`). `server-instructions.ts` is what the server returns in the MCP `initialize` response — keep it in sync with the user-facing tool guidance.
+- `src/installer/` — see below.
+- `src/bin/codegraph.ts` — CLI (commander). Subcommands: `install`, `init`, `uninit`, `index`, `sync`, `status`, `query`, `files`, `context`, `affected`, `serve --mcp`.
+- `src/ui/` — terminal UI (shimmer progress, worker).
 
-- **ReferenceResolver** (`src/resolution/index.ts`): Resolves unresolved references after full indexing using framework patterns, import resolution, and name matching
+### NodeKind / EdgeKind
 
-### Database Schema
+Defined in `src/types.ts`. Both extractors and resolvers must use these exact strings.
 
-SQLite database with:
-- `nodes`: Code symbols (functions, classes, methods, etc.)
-- `edges`: Relationships (calls, imports, extends, contains, etc.)
-- `files`: Tracked source files with content hashes
-- `unresolved_refs`: References pending resolution
-- `nodes_fts`: FTS5 virtual table for full-text search
+- **NodeKind**: `file`, `module`, `class`, `struct`, `interface`, `trait`, `protocol`, `function`, `method`, `property`, `field`, `variable`, `constant`, `enum`, `enum_member`, `type_alias`, `namespace`, `parameter`, `import`, `export`, `route`, `component`.
+- **EdgeKind**: `contains`, `calls`, `imports`, `exports`, `extends`, `implements`, `references`, `type_of`, `returns`, `instantiates`, `overrides`, `decorates`.
 
-### Supported Languages
+### Multi-agent installer
 
-TypeScript, JavaScript, TSX, JSX, Svelte, Python, Go, Rust, Java, C, C++, C#, PHP, Ruby, Swift, Kotlin, Dart, Liquid, Pascal
+`src/installer/` is the entry point for `codegraph install` (and the bare `codegraph`/`npx @colbymchenry/codegraph` invocation). Architecture:
 
-### Node and Edge Types
+- `targets/registry.ts` lists every supported agent.
+- `targets/types.ts` defines the `AgentTarget` interface — adding a 5th agent (Continue, Zed, Windsurf…) is **one new file in `targets/` + one entry in `registry.ts`**. Each target owns its config-file location, MCP-server JSON/TOML/JSONC writing, and instructions-file path.
+- Current targets: `claude.ts`, `cursor.ts`, `codex.ts`, `opencode.ts`.
+- `targets/toml.ts` is a hand-rolled TOML serializer scoped to `[mcp_servers.codegraph]` (used by Codex). Sibling tables and `[[array_of_tables]]` are preserved verbatim. No new dependency.
+- opencode reads `opencode.jsonc` by default; the installer prefers existing `.jsonc`, falls back to `.json`, and creates `.jsonc` for greenfield installs. Edits are surgical via `jsonc-parser` so user comments and formatting survive install/re-install/uninstall round-trips.
+- `instructions-template.ts` is the agent-agnostic instructions file written to each target (e.g. `CLAUDE.md`, `.cursor/rules/codegraph.mdc`, `~/.codex/AGENTS.md`, `~/.config/opencode/AGENTS.md`). It explicitly says "trust codegraph results, don't re-verify with grep" — earlier versions prescribed Claude-specific "spawn an Explore agent" and confused other agents.
+- `claude-md-template.ts` is the legacy Claude-only template, retained for compatibility paths.
+- All installer changes need matching coverage in `__tests__/installer-targets.test.ts` — there are ~47 parameterized contract tests covering install idempotency, sibling preservation, uninstall reverses install, byte-equal re-runs returning `unchanged`, and partial-state recovery for Codex.
 
-**NodeKind**: `file`, `module`, `class`, `struct`, `interface`, `trait`, `protocol`, `function`, `method`, `property`, `field`, `variable`, `constant`, `enum`, `enum_member`, `type_alias`, `namespace`, `parameter`, `import`, `export`, `route`, `component`
+### Cursor MCP working-directory quirk
 
-**EdgeKind**: `contains`, `calls`, `imports`, `exports`, `extends`, `implements`, `references`, `type_of`, `returns`, `instantiates`, `overrides`, `decorates`
+Cursor launches MCP subprocesses with the wrong cwd and doesn't pass `rootUri` in `initialize`. The installer injects `--path` into Cursor's MCP args — absolute path for local installs, `${workspaceFolder}` for global installs. If you touch Cursor wiring, preserve this.
 
-## CLI Usage
+### MCP server instructions
 
-```bash
-codegraph init [path]       # Initialize in project
-codegraph index [path]      # Full index
-codegraph sync [path]       # Incremental update
-codegraph status [path]     # Show statistics
-codegraph query <search>    # Search symbols
-codegraph context <task>    # Build context for AI
-codegraph hooks install     # Install git auto-sync
-codegraph serve --mcp       # Start MCP server
-```
+`src/mcp/server-instructions.ts` is sent back to the agent in the MCP `initialize` response. This is the *first* thing every agent sees about how to use the tools — treat it as the authoritative tool guidance and keep it in sync with `instructions-template.ts` and `.cursor/rules/codegraph.mdc`.
 
-## MCP Tools Best Practices
+## Tests
 
-Use these tools **directly in the main session** for fast code exploration (replaces the need for Explore agents in most cases):
+Tests live in `__tests__/` and mirror the module they cover. Notable ones beyond the obvious:
 
-| Tool | Use For |
-|------|---------|
-| `codegraph_explore` | **Deep exploration** — comprehensive context for a topic in ONE call |
-| `codegraph_context` | Quick context for a task (lighter than explore) |
-| `codegraph_search` | Find symbols by name (functions, classes, types) |
-| `codegraph_callers` | Find what calls a function |
-| `codegraph_callees` | Find what a function calls |
-| `codegraph_impact` | See what's affected by changing a symbol |
-| `codegraph_node` | Get details + source code for a symbol |
+- `installer-targets.test.ts` — parameterized contract suite across all 4 agent targets (see installer notes above).
+- `evaluation/` — `runner.ts` + `test-cases.ts` exercise codegraph against synthetic projects and score the results; run via `npm run eval` (builds first). Not part of `npm test`.
+- `sqlite-backend.test.ts` — covers native + wasm backend selection and fallback.
+- `pr19-improvements.test.ts`, `frameworks-integration.test.ts` — regression coverage for specific past PRs/incidents; don't rename these, the names anchor to git history.
 
-### Important
-CodeGraph provides **code context**, not product requirements. For new features, still ask the user about:
-- UX preferences and behavior
-- Edge cases and error handling
-- Acceptance criteria
+Tests create temp dirs with `fs.mkdtempSync` and clean up in `afterEach`. They write real files and exercise real SQLite — there is no DB mocking.
 
 ## Releases
 
-Releases are published to npm **and** mirrored as GitHub Releases on the
-[Releases page](https://github.com/colbymchenry/codegraph/releases), which is
-where most users look for change history. `CHANGELOG.md` at the repo root is
-the source of truth — each GitHub Release's notes are extracted from it.
+Released to npm and mirrored as [GitHub Releases](https://github.com/colbymchenry/codegraph/releases). `CHANGELOG.md` is the source of truth; GitHub Release notes are extracted from it.
 
 ### Writing changelog entries
 
-When the user asks for a changelog entry for a new version:
+When asked for an entry for a new version:
 
-1. Add a new `## [X.Y.Z] - YYYY-MM-DD` block at the **top** of `CHANGELOG.md`
-   (directly under the intro, above the previous version).
-2. Group changes under `### Added`, `### Changed`, `### Fixed`, `### Removed`,
-   `### Deprecated`, `### Security` — only include sections that have entries.
-3. Write entries from the **user's perspective**, not the implementation's.
-   Lead with the observable symptom or capability, then mention internals only
-   if a user needs them (e.g., to work around an existing bad install).
-4. Add the link reference at the bottom:
-   `[X.Y.Z]: https://github.com/colbymchenry/codegraph/releases/tag/vX.Y.Z`
+1. Add a new `## [X.Y.Z] - YYYY-MM-DD` block at the **top** of `CHANGELOG.md` (under the intro, above the previous version).
+2. Group under `### Added`, `### Changed`, `### Fixed`, `### Removed`, `### Deprecated`, `### Security` — omit empty sections.
+3. Write from the **user's perspective**, not the implementation's. Lead with the observable symptom or capability; mention internals only if a user needs them (e.g., to work around an existing bad install).
+4. Add the link reference at the bottom: `[X.Y.Z]: https://github.com/colbymchenry/codegraph/releases/tag/vX.Y.Z`.
 
-### Release commands (the user runs these)
+### Release flow (the user runs these)
 
-After the changelog entry is written and the version is bumped in `package.json`:
+After the changelog entry is written and `package.json` is bumped:
 
 ```bash
 git add package.json package-lock.json CHANGELOG.md
 git commit -m "release: X.Y.Z (<one-line summary>)"
 git push
-
 npm publish
-
-git tag vX.Y.Z
-git push origin vX.Y.Z
-gh release create vX.Y.Z \
-  --title "vX.Y.Z" \
-  --notes-file <(awk '/^## \[X.Y.Z\]/,/^## \[/{ if (/^## \[/ && !/X.Y.Z/) exit; print }' CHANGELOG.md)
+./scripts/release.sh   # idempotent: tags vX.Y.Z, pushes, creates GitHub Release with notes from CHANGELOG.md
 ```
 
-Do **not** run `npm publish`, `git tag`, `git push`, or `gh release create`
-yourself — these are publish actions that affect shared state. Write the file,
-hand the user the commands.
+`scripts/release.sh` is safe to re-run after a partial failure — it skips steps already done (tag exists locally, tag on origin, release published). It extracts release notes from `CHANGELOG.md` by matching the `## [X.Y.Z]` block.
 
-## Test Structure
+**Do not run `npm publish`, `git push`, `git tag`, or `./scripts/release.sh` yourself** — these are publish actions on shared state. Write the file, hand the user the commands.
 
-Tests are in `__tests__/` directory with files mirroring the module structure:
-- `foundation.test.ts` - Database, config, directory management
-- `extraction.test.ts` - Tree-sitter parsing for all languages
-- `resolution.test.ts` - Reference resolution
-- `graph.test.ts` - Traversal and graph queries
-- `context.test.ts` - Context building
-- `sync.test.ts` - Incremental updates and git hooks
+## House rules
 
-Tests use temporary directories created with `fs.mkdtempSync` and cleaned up after each test.
+- The `0.7.x` line is in active multi-agent rollout. Any change to `src/installer/` (especially `targets/`) needs corresponding test coverage and a CHANGELOG entry — installer regressions break every new install silently.
+- When changing what the MCP tools do or how agents should use them, update **all three** of `src/mcp/server-instructions.ts`, `src/installer/instructions-template.ts`, and `.cursor/rules/codegraph.mdc` — they're written to different places but say the same thing.
+- CodeGraph provides **code context**, not product requirements. For new features, ask the user about UX, edge cases, and acceptance criteria — the graph won't tell you.
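
The "one new file in `targets/` plus one entry in `registry.ts`" rule and the Cursor `--path` quirk described in this file can be pictured roughly as follows. This is a sketch under stated assumptions: the real `AgentTarget` interface lives in `src/installer/targets/types.ts` and is not shown in this diff, so `InstallScope`, `mcpArgs`, `findTarget`, and the exact argument order are hypothetical. Only the `serve --mcp` subcommand, the `--path` flag, and the `${workspaceFolder}` placeholder come from the text above.

```typescript
// Hypothetical shapes for illustration; the real interface may differ.
type InstallScope = "local" | "global";

interface AgentTarget {
  id: string;
  /** Arguments the agent passes to the codegraph binary to start the MCP server. */
  mcpArgs(scope: InstallScope, projectRoot: string): string[];
}

// Assumed: Claude Code launches the server in the project directory,
// so no extra flag is needed.
const claudeTarget: AgentTarget = {
  id: "claude",
  mcpArgs: () => ["serve", "--mcp"],
};

// Cursor starts MCP subprocesses with the wrong cwd, so the project path is
// passed explicitly: absolute for local installs, a literal placeholder that
// Cursor expands for global installs.
const cursorTarget: AgentTarget = {
  id: "cursor",
  mcpArgs: (scope, projectRoot) => [
    "serve",
    "--mcp",
    "--path",
    scope === "local" ? projectRoot : "${workspaceFolder}",
  ],
};

// Adding another agent would mean one more object like the ones above
// plus one more entry here.
const registry: AgentTarget[] = [claudeTarget, cursorTarget];

function findTarget(id: string): AgentTarget | undefined {
  return registry.find((target) => target.id === id);
}
```

Keeping the quirk inside the Cursor target means the rest of the installer can treat every agent the same way.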

+ 70 - 0
scripts/release.sh

@@ -0,0 +1,70 @@
+#!/usr/bin/env bash
+# Tag the current commit with the version in package.json and publish a
+# matching GitHub Release whose body is the corresponding CHANGELOG.md entry.
+#
+# Run AFTER you have:
+#   - bumped package.json
+#   - added a `## [X.Y.Z] - YYYY-MM-DD` block at the top of CHANGELOG.md
+#   - committed, pushed to origin, and run `npm publish`
+#
+# Idempotent: safe to re-run after a partial failure. Skips steps that are
+# already done (tag created, tag pushed, release published).
+#
+# Usage: ./scripts/release.sh
+
+set -euo pipefail
+
+cd "$(dirname "$0")/.."
+
+VERSION=$(node -p "require('./package.json').version")
+TAG="v${VERSION}"
+
+REPO=$(git remote get-url origin | sed -E 's|.*github\.com[:/]||; s|\.git$||')
+if [ -z "${REPO}" ]; then
+  echo "error: could not derive owner/repo from origin remote URL" >&2
+  exit 1
+fi
+
+if ! grep -q "^## \[${VERSION}\]" CHANGELOG.md; then
+  echo "error: no '## [${VERSION}]' entry found in CHANGELOG.md" >&2
+  exit 1
+fi
+
+NOTES=$(awk -v v="${VERSION}" '
+  /^## \[/ {
+    if (p) exit
+    if ($0 ~ "^## \\[" v "\\]") p = 1
+  }
+  p
+' CHANGELOG.md)
+
+if [ -z "${NOTES}" ]; then
+  echo "error: failed to extract changelog notes for ${VERSION}" >&2
+  exit 1
+fi
+
+if git rev-parse -q --verify "refs/tags/${TAG}" >/dev/null 2>&1; then
+  echo "✓ tag ${TAG} already exists locally"
+else
+  echo "→ tagging ${TAG}"
+  git tag "${TAG}"
+fi
+
+if git ls-remote --exit-code --tags origin "${TAG}" >/dev/null 2>&1; then
+  echo "✓ tag ${TAG} already on origin"
+else
+  echo "→ pushing ${TAG} to origin"
+  git push origin "${TAG}"
+fi
+
+if gh release view "${TAG}" --repo "${REPO}" >/dev/null 2>&1; then
+  echo "✓ release ${TAG} already published"
+else
+  echo "→ creating GitHub Release ${TAG} on ${REPO}"
+  gh release create "${TAG}" \
+    --repo "${REPO}" \
+    --title "${TAG}" \
+    --notes "${NOTES}"
+fi
+
+echo "done: https://github.com/${REPO}/releases/tag/${TAG}"
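
For readers less familiar with awk, the changelog extraction in `scripts/release.sh` can be restated in TypeScript. This is an illustrative port, not code from the repository: `extractReleaseNotes` and `escapeRegExp` are hypothetical names. One deliberate difference is that the sketch escapes the version string before building the pattern, whereas the awk program interpolates it as-is, so a `.` in the version there matches any character.

```typescript
/** Escape regex metacharacters so a version like "0.7.1" matches only itself. */
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

/**
 * Return the `## [version]` block of a changelog: the heading line and
 * everything up to, but not including, the next `## [` heading.
 * Returns an empty string when the version has no entry.
 */
function extractReleaseNotes(changelog: string, version: string): string {
  const heading = new RegExp(`^## \\[${escapeRegExp(version)}\\]`);
  const out: string[] = [];
  let inBlock = false;

  for (const line of changelog.split(/\r?\n/)) {
    if (line.startsWith("## [")) {
      if (inBlock) break; // reached the next version's heading
      if (heading.test(line)) inBlock = true;
    }
    if (inBlock) out.push(line);
  }

  // Shell command substitution drops trailing newlines; trimEnd is the
  // closest equivalent here.
  return out.join("\n").trimEnd();
}
```

As in the script, an empty result is the signal to abort before any tag or release is created.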