
feat(mcp): codegraph_explore as the sole primary tool + store coverage + overload disambiguation (#647)

## Summary

Completes the explore-overhaul arc: `codegraph_explore` becomes the single primary tool an agent reaches for, and its coverage + output shape are tuned so flow/architecture questions resolve with near-zero Read/Grep.

### What changed
- **explore is the sole primary tool** — removed `codegraph_context` (the fuzzy-input Read-trigger) and `codegraph_trace` (under-picked by agents); explore already surfaces the call flow among the symbols you name. A plain natural-language question now works as the query.
- **Store/handler coverage** — functions defined inside object literals (Zustand `create((set, get) => ({ … }))`, Redux/Pinia/MobX, exported handler/route maps) are indexed as real symbols, including calls through `useStore.getState().fn()` and destructured `const { fn } = useStore.getState()`. A general AST rule, not a per-lib hack.
- **Overload disambiguation** — explore leads with the *right* definition when a method name is overloaded across types (a PascalCase type token in the query biases to that type's own def); `codegraph_node` returns *every* overload's body in one call, with an optional `file`/`line` selector to pin one.
- **Method-atomic render** — explore never returns half a method; at the size budget it drops whole methods/files (and lists what it dropped) instead of truncating a body mid-method.
- **Native-read-shaped output** — per-call output is capped to ~24K with a 25K hard ceiling and concentrated into ~150–250-line flow windows, mirroring how the agent natively reads; repo size scales the *call* budget, not the per-call size (a larger response just gets externalized to a file the host Reads back).
- **Blast radius** folded into explore (dependents + covering tests, locations only).
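The method-atomic rule above can be sketched as a whole-unit packing loop. This is a minimal illustration, not CodeGraph's renderer: the `MethodBlock` shape, the `renderAtomic` name, and the character-based budget are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration only; not CodeGraph's real types.
interface MethodBlock {
  name: string;
  file: string;
  source: string; // the full method body, never a fragment
}

interface RenderResult {
  output: string;    // only complete method bodies
  dropped: string[]; // "file:name" for every method that did not fit
}

// Keep a method only if its whole body fits in the remaining budget.
// A method that does not fit is dropped entirely and reported, never cut.
function renderAtomic(blocks: MethodBlock[], budgetChars: number): RenderResult {
  const kept: string[] = [];
  const dropped: string[] = [];
  let used = 0;
  for (const block of blocks) {
    const cost = block.source.length + 1; // +1 for the joining newline
    if (used + cost <= budgetChars) {
      kept.push(block.source);
      used += cost;
    } else {
      dropped.push(`${block.file}:${block.name}`);
    }
  }
  return { output: kept.join("\n"), dropped };
}

const result = renderAtomic(
  [
    { name: "a", file: "x.ts", source: "a".repeat(10) },
    { name: "b", file: "x.ts", source: "b".repeat(30) },
    { name: "c", file: "y.ts", source: "c".repeat(5) },
  ],
  20,
);
console.log(result.dropped); // [ 'x.ts:b' ]
```

In this sketch the loop keeps scanning after a drop, so a later, smaller method can still be included. Whether the real implementation does that or stops at the first overflow is not stated in the source.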

### Benchmark (refreshed on this build)
Re-validated the 7-repo A/B on 2026-06-02 (Opus 4.8, effort=high, median of 4). The WITH arm was re-measured on this build; the WITHOUT arm reuses the earlier measurements:

**~16% cheaper · 47% fewer tokens · 22% faster · 58% fewer tool calls** — 0 file reads on 6 of 7 repos (Gin ~1).

The arc trades larger, cache-heavy explore responses for guaranteed near-zero reads, so cost/token margins soften vs the prior build (Excalidraw and Tokio land at cost break-even) while time and tool-calls stay clear wins everywhere — consistent with the project's stated optimization target (latency + tool-calls, not token cost).
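The headline figures can be cross-checked against the per-repo README table in this commit. The sketch below assumes the headline is an unweighted mean of the rounded per-repo savings, with "even" entered as 0. That reading is an assumption, although the arithmetic does reproduce the published numbers. The helper names are illustrative.

```typescript
// Per-repo savings in percent, copied from the README table in this commit.
// Order: VS Code, Excalidraw, Django, Tokio, OkHttp, Gin, Alamofire.
// "even" is entered as 0 (an assumption about how the average was taken).
const savings = {
  cost: [18, 0, 8, 0, 25, 19, 40],
  tokens: [64, 25, 60, 38, 54, 23, 64],
  time: [11, 27, 13, 18, 31, 24, 33],
  toolCalls: [81, 40, 77, 57, 50, 44, 58],
};

// Unweighted mean, rounded to the nearest whole percent.
function meanPercent(values: number[]): number {
  const total = values.reduce((sum, v) => sum + v, 0);
  return Math.round(total / values.length);
}

// Saving of the WITH arm relative to the WITHOUT arm, in whole percent.
function savingPercent(withCg: number, withoutCg: number): number {
  return Math.round(((withoutCg - withCg) / withoutCg) * 100);
}

console.log(meanPercent(savings.cost));      // 16
console.log(meanPercent(savings.tokens));    // 47
console.log(meanPercent(savings.time));      // 22
console.log(meanPercent(savings.toolCalls)); // 58

// Spot-check one row: VS Code cost $0.68 vs $0.83, tool calls 4 vs 21.
console.log(savingPercent(0.68, 0.83)); // 18
console.log(savingPercent(4, 21));      // 81
```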

### Validation
- Full suite green: **1112 passed, 2 skipped**.
- 28/28 plain WITH runs across the 7 README repos completed clean; reads median 0 on 6/7.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Colby Mchenry 3 weeks ago
parent
commit
68eaf0dbd8

+ 3 - 0
.gitignore

@@ -56,3 +56,6 @@ nul
 release/
 
 .antigravitycli/
+
+# Local-only: browser-based tmux session launcher (see tmux-web/README.md)
+tmux-web/

+ 12 - 0
CHANGELOG.md

@@ -9,6 +9,18 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ## [Unreleased]
 
+### New Features
+
+- `codegraph_explore` is now the primary tool, and one call is usually all an agent needs: it returns the verbatim source of the symbols relevant to your question (a plain question works as the query — you no longer need exact symbol names), grouped by file and Read-equivalent, so the agent answers without falling back to read/grep. The narrower `codegraph_context` and `codegraph_trace` tools were removed in favor of it — explore already surfaces the call flow among the symbols you name (the job trace did), so there's one obvious tool to reach for instead of three.
+- `codegraph_explore` now includes a compact "Blast radius" for the symbols you're looking at — who depends on each (just the locations, not their source) and which test files cover it — so before editing, the agent can see what else to update and which tests to run, without a separate impact lookup. Symbols nothing depends on are skipped, so it stays short.
+- Functions defined inside a store or handler object — the actions in a Zustand `create((set, get) => ({ … }))` store, and the same shape in Redux, Pinia, MobX, or any exported handler/route map — are now indexed as real symbols. Previously they existed only as object properties, so looking one up by name or asking who calls it returned "not found" and the agent had to read the whole store file to follow the flow; now `codegraph_node`, `codegraph_callers`, and `codegraph_explore` resolve them directly — including calls made through `useStore.getState().fetchUser()` or a destructured `const { fetchUser } = useStore.getState()`.
+- `codegraph_explore` now surfaces the *right* definition when a method name is overloaded across types. Asking about, say, `DataRequest`'s `task` and `validate` used to return a same-named method from an unrelated file (or an abstract base stub) and bury the one you meant; explore now recognizes the type you named in the query and leads with that type's own overloads, in full.
+
+### Fixes
+
+- Search ranking no longer lets a common word in your request hijack the results: asking about, say, a "flat object" screen used to surface an unrelated constant that merely happened to be named the same, because the exact-name match outweighed everything else. Ranking now weighs how well each result is corroborated by the rest of your request, so the symbols you actually meant come first (this improves `codegraph_explore`'s results).
+- `codegraph_node` now returns *every* definition when a name is ambiguous — an overloaded method, or the same method name on different types — instead of returning one (sometimes the wrong one) with a note listing the rest. Asking for such a symbol now hands back all of the matching definitions with their source in a single call, so the agent stops having to read the file by hand to find the specific overload it wanted (common in Swift, Go, Java, and C#). For a heavily-overloaded name (a `poll`/`validate` with dozens of definitions), pass `file` (and/or `line`) — e.g. the `file:line` shown in a trail — to get that exact definition's body. Large overload sets show the most relevant ones in full and list the remainder by location.
+- `codegraph_explore` never returns half a method anymore: when output runs up against its size budget it drops whole methods or whole files (and lists what it dropped, so you can ask for them in another call) instead of cutting off a method body partway. A truncated method was the one case that still sent the agent to read the file for the rest — so the source explore returns is now always complete and usable as-is.
 
 ## [0.9.8] - 2026-06-01
 

+ 7 - 7
CLAUDE.md

@@ -102,12 +102,12 @@ The lever that decides whether a retrieval change lands. **Test before building
 CodeGraph's only channels to influence the agent are low-salience: the MCP `initialize` instructions (`server-instructions.ts`) and the tool descriptions. Changing them does **not** reliably move the agent's tool _choice_ or query style — validated: trace-first steering ported into the server-instructions + tool descriptions (3 wording variants) never reproduced what a CLI `--append-system-prompt` achieved, and **regressed** wall-clock vs baseline. New tools fare worse (rarely chosen — the agent under-picks even `trace`); "better examples" is the same steering. The agent's tool-choice does improve on its own as host models get better at tool use — but that is not ours to force.
 
 What works is meeting the agent where it already is:
-- **Sufficiency** — `codegraph_trace` inlines each hop's body + the destination's own callees, so one trace call ends the flow investigation (no follow-up explore/node/Read).
-- **explore-flow** — `codegraph_explore`'s query is a precise bag of symbol names (incl. qualified `Class.method`) spanning the flow the agent is after; explore finds the call path _among those named symbols_ (riding synthesized edges) and leads its output with it — delivering trace-quality flow through the call the agent reliably makes. (`buildFlowFromNamedSymbols`: segment/co-naming disambiguation; ≤1 unnamed bridge so it never wanders a god-function's fan-out.)
+- **explore-flow** — `codegraph_explore` is the PRIMARY tool the agent reliably calls; its query is a precise bag of symbol names (incl. qualified `Class.method`) spanning the flow the agent is after; explore finds the call path _among those named symbols_ (riding synthesized edges) and leads its output with it. (`buildFlowFromNamedSymbols`: segment/co-naming disambiguation; ≤1 unnamed bridge so it never wanders a god-function's fan-out. Overload-aware: a PascalCase type token in the query biases an overloaded name to that type's own def — `DataRequest task` → DataRequest's `task`, not the abstract base; named-symbol files sort first.)
+- **Sufficiency** — make the tool's output complete enough that the agent stops. `codegraph_node` returns the full body + the caller/callee trail, and for an AMBIGUOUS name returns **every overload's body in one call** (so the agent never Reads a file to find the right overload — validated on Alamofire/gin). This is the after-explore depth tool (labeled SECONDARY).
 
-What fails is the inverse — folding a precise answer into a **fuzzy-input** tool. `codegraph_context` gets a description, not symbols, so it can't disambiguate a flow's endpoints and surfaces the _wrong feature_. Precise output needs precise input.
+What fails is the inverse — folding a precise answer into a **fuzzy-input** tool: the now-removed `codegraph_context` took a description, not symbols, so it couldn't disambiguate a flow's endpoints and surfaced the _wrong feature_ (which is why it was cut). Precise output needs precise input — explore takes a symbol bag for exactly this reason. (`codegraph_trace` was likewise removed: explore-flow does its job and the agent under-picked it.)
 
-The remaining lever under this axis is **coverage**: every flow made to connect statically (a new dynamic-dispatch synthesizer) is then surfaced automatically by explore-flow/`trace`, no agent change needed. Reactive/reconciler runtimes (Halo's `ReactiveExtensionClient`, MediatR, Vue Proxy) are the frontier — flows there have no static edges, so nothing surfaces (correctly — silent beats wrong). Full investigation + A/B record: `docs/benchmarks/call-sequence-analysis.md`.
+The remaining lever under this axis is **coverage**: every flow made to connect statically (a new dynamic-dispatch synthesizer, or extracting symbols static parsing skipped — e.g. object-literal store actions in `create((set,get)=>({...}))`) is then surfaced automatically by explore-flow, no agent change needed. Reactive/reconciler runtimes (Halo's `ReactiveExtensionClient`, MediatR, Vue Proxy) are the frontier — flows there have no static edges, so nothing surfaces (correctly — silent beats wrong). Full investigation + A/B record: `docs/benchmarks/call-sequence-analysis.md` + auto-memory `project_codegraph_read_displacement`.
 
 ### Explore budget — keep BOTH budgets monotonic with repo size
 
@@ -126,7 +126,7 @@ Two functions in `src/mcp/tools.ts` scale explore with indexed file count. This
 
 ### Dynamic-dispatch coverage — the flow must EXIST in the graph end-to-end
 
-Static tree-sitter extraction misses computed/indirect calls, so flows break at dynamic dispatch and the agent reads to reconstruct them. Synthesizers/resolvers bridge these so `trace`/`explore` connect end-to-end (`src/resolution/callback-synthesizer.ts`, `src/resolution/frameworks/`). Channels today: callback/observer, EventEmitter, **React re-render** (`setState`→`render`), **JSX child** (`render`→child component), django ORM descriptor. All synthesized edges are `provenance:'heuristic'` with `metadata.synthesizedBy` + `registeredAt` (the wiring site), surfaced inline in `trace`, the `node` trail, and `context` call-paths.
+Static tree-sitter extraction misses computed/indirect calls, so flows break at dynamic dispatch and the agent reads to reconstruct them. Synthesizers/resolvers bridge these so `codegraph_explore` connects them end-to-end (`src/resolution/callback-synthesizer.ts`, `src/resolution/frameworks/`). Channels today: callback/observer, EventEmitter, **React re-render** (`setState`→`render`), **JSX child** (`render`→child component), django ORM descriptor. All synthesized edges are `provenance:'heuristic'` with `metadata.synthesizedBy` + `registeredAt` (the wiring site), surfaced inline in `codegraph_explore`'s Flow section and the `codegraph_node` trail.
 
 **Principle: partial coverage is WORSE than none.** Bridging one boundary but not the next reveals a hop the agent then drills + reads to finish. Measured on excalidraw: react-render alone *raised* reads to 5–7; only completing the flow (adding the jsx-child hop) dropped it to 0–1. **Always close the flow end-to-end and re-measure** — never ship a half-bridged flow.
 
@@ -135,7 +135,7 @@ Static tree-sitter extraction misses computed/indirect calls, so flows break at
 For each **language × framework**, validate on **small, medium, and large** real repos with **≥3 different flow prompts** each:
 
 1. **Pick the canonical flow** for the framework ("how does X reach Y": state→render, request→handler→view, query→SQL, action→reducer→store…).
-2. **Deterministic probes** (`scripts/agent-eval/probe-{trace,node,context,explore}.mjs` against the built `dist/`): `trace(from,to)` connects end-to-end with no break; **no node explosion** (`select count(*) from nodes` stable before/after re-index); synthesized-edge **precision** spot-check (`select … where provenance='heuristic'`).
+2. **Deterministic probes** (`scripts/agent-eval/probe-{node,explore}.mjs` against the built `dist/`): `codegraph_explore` with the flow's symbol names connects from→to end-to-end with no break (its Flow section shows the path); **no node explosion** (`select count(*) from nodes` stable before/after re-index); synthesized-edge **precision** spot-check (`select … where provenance='heuristic'`).
 3. **Agent A/B** (`scripts/agent-eval/run-all.sh <repo> "<Q>"`): with vs without codegraph, **≥2 runs/arm** (run-to-run variance is large — never conclude from n=1). Record **duration, total tool calls, Read, Grep**. Optional forced-Read-0 sufficiency proof via the block-read hook (`scripts/agent-eval/hook-settings.json`).
 4. **Pass bar:** a normal flow question reaches **~0 Read/Grep within the repo's explore-call budget**, runs **faster** than without-codegraph, and shows **no regression on a control repo**. Record the numbers in `docs/design/dynamic-dispatch-coverage-playbook.md` (the coverage matrix).
 
@@ -209,7 +209,7 @@ Formatting rules for any entry (anywhere — `[Unreleased]` or otherwise):
 
 1. **Write friendly, user-facing notes — not engineer-facing ones.** Group under `### New Features` and `### Fixes` (sentence-case). Surface `### Breaking Changes` and `### Security` as their own sections **only when the release has them**; fold improvement-flavored changes into New Features. Omit empty sections. (This replaces the old Keep-a-Changelog `Added/Changed/Fixed/Removed/Deprecated` grouping: the GitHub Release page extracts each version block **verbatim** via `scripts/extract-release-notes.mjs`, and the old dense, implementation-focused entries rendered as an unreadable wall of text — so the whole CHANGELOG was rewritten to this format and every published release re-noted to match.)
 2. **One plain-language sentence per bullet:** what changed and why it matters to a user. Lead with the capability, or with the symptom that's now fixed.
-3. **Strip the internals.** No internal file paths (`src/...`), no internal symbol / function / class names, no benchmark numbers / percentages / node-or-edge counts. **Keep:** language & framework names (Go, Spring, NestJS, …), things a user types or sets (`codegraph install`, `codegraph_trace`, the `CODEGRAPH_*` env vars), agent / IDE names (Claude Code, Cursor, opencode, Kiro, …), and a brief `Thanks @user` when a contributor is credited.
+3. **Strip the internals.** No internal file paths (`src/...`), no internal symbol / function / class names, no benchmark numbers / percentages / node-or-edge counts. **Keep:** language & framework names (Go, Spring, NestJS, …), things a user types or sets (`codegraph install`, `codegraph_explore`, the `CODEGRAPH_*` env vars), agent / IDE names (Claude Code, Cursor, opencode, Kiro, …), and a brief `Thanks @user` when a contributor is credited.
 4. Issue / PR references in entries are by number (`(#403)` etc.); the GitHub renderer auto-links them in the published release notes.
 5. **Don't add a `[X.Y.Z]: https://...` link reference yourself** — `prepare-release.mjs` appends it automatically when it promotes the version (idempotent: a re-run is a no-op if it already exists).
 

+ 52 - 55
README.md

@@ -4,7 +4,7 @@
 
 ### Supercharge Claude Code, Cursor, Codex, OpenCode, Hermes Agent, Gemini, Antigravity, and Kiro with Semantic Code Intelligence
 
-**~25% cheaper · ~62% fewer tool calls · 100% local**
+**~16% cheaper · ~58% fewer tool calls · 100% local**
 
 ### [Documentation & Website →](https://colbymchenry.github.io/codegraph/)
 
@@ -83,21 +83,21 @@ When Claude Code explores a codebase, it spawns **Explore agents** that scan fil
 
 ### Benchmark Results
 
-Tested across **7 real-world open-source codebases** spanning 7 languages, comparing an agent (Claude Code, headless) answering one architecture question **with** and **without** CodeGraph. Each cell is the savings at the **median of 4 runs per arm**. _Re-validated on Opus 4.8 (2026-05-29), on the build with per-symbol adaptive `codegraph_explore` sizing._
+Tested across **7 real-world open-source codebases** spanning 7 languages, comparing an agent (Claude Code, headless) answering one architecture question **with** and **without** CodeGraph. Each cell is the savings at the **median of 4 runs per arm**. _Re-validated on Opus 4.8 (2026-06-02), on the current build (`codegraph_explore` as the primary tool)._
 
-> **Average: 25% cheaper · 57% fewer tokens · 23% faster · 62% fewer tool calls**
+> **Average: 16% cheaper · 47% fewer tokens · 22% faster · 58% fewer tool calls**
 
 | Codebase | Language | Cost | Tokens | Time | Tool calls |
 |----------|----------|------|--------|------|------------|
-| **VS Code** | TypeScript · ~10k files | 33% cheaper | 70% fewer | 27% faster | 80% fewer |
-| **Excalidraw** | TypeScript · ~640 | 27% cheaper | 61% fewer | 26% faster | 70% fewer |
-| **Django** | Python · ~3k | 23% cheaper | 70% fewer | 28% faster | 77% fewer |
-| **Tokio** | Rust · ~790 | 35% cheaper | 70% fewer | 37% faster | 79% fewer |
-| **OkHttp** | Java · ~645 | 11% cheaper | 48% fewer | 26% faster | 70% fewer |
-| **Gin** | Go · ~110 | 15% cheaper | 35% fewer | 9% faster | 47% fewer |
-| **Alamofire** | Swift · ~110 | 28% cheaper | 46% fewer | 7% faster | 13% fewer |
+| **VS Code** | TypeScript · ~10k files | 18% cheaper | 64% fewer | 11% faster | 81% fewer |
+| **Excalidraw** | TypeScript · ~640 | even | 25% fewer | 27% faster | 40% fewer |
+| **Django** | Python · ~3k | 8% cheaper | 60% fewer | 13% faster | 77% fewer |
+| **Tokio** | Rust · ~790 | even | 38% fewer | 18% faster | 57% fewer |
+| **OkHttp** | Java · ~645 | 25% cheaper | 54% fewer | 31% faster | 50% fewer |
+| **Gin** | Go · ~110 | 19% cheaper | 23% fewer | 24% faster | 44% fewer |
+| **Alamofire** | Swift · ~110 | 40% cheaper | 64% fewer | 33% faster | 58% fewer |
 
-CodeGraph cuts **cost, tokens, tool calls, and time on every repo** — across small, medium, and large codebases — and answers most of them with **zero file reads**, while the no-CodeGraph agent spends its budget on grep/find/Read discovery. `codegraph_explore` shows the answer in full — the mechanism plus the exact methods you asked about, even when they're buried in a multi-thousand-line file — while collapsing redundant interchangeable implementations to signatures, so the response is sized to the *answer* rather than the file count. The cost margin is narrowest on the smallest repos, where a modern model's native search is already cheap, but it stays solidly positive across the board.
+CodeGraph cuts **tokens, tool calls, and wall-clock time on every repo** — across small, medium, and large codebases — and answers them with **near-zero file reads**, while the no-CodeGraph agent spends its budget on grep/find/Read discovery. `codegraph_explore` shows the answer in full — the mechanism plus the exact methods you asked about, even when they're buried in a multi-thousand-line file — while collapsing redundant interchangeable implementations to signatures, so the response is sized to the *answer* rather than the file count. **Cost stays flat-to-cheaper everywhere** — largest on the small repos (Alamofire, OkHttp), roughly break-even on the most response-heavy ones (Excalidraw, Tokio), where CodeGraph trades the no-CodeGraph agent's many small grep/read round-trips for a few large, cache-heavy tool responses.
 
 <details>
 <summary><strong>Per-repo breakdown — WITH vs WITHOUT (median of 4)</strong></summary>
@@ -105,79 +105,79 @@ CodeGraph cuts **cost, tokens, tool calls, and time on every repo** — across s
 **VS Code** · ~10k files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 37s | 2m 13s | 27% faster |
+| Time | 1m 59s | 2m 13s | 11% faster |
 | File Reads | 0 | 9 | −9 |
 | Grep/Bash | 0 | 11 | −11 |
-| Tool calls | 4 | 21 | 80% fewer |
-| Total tokens | 545k | 1.79M | 70% fewer |
-| Cost | $0.55 | $0.83 | 33% cheaper |
+| Tool calls | 4 | 21 | 81% fewer |
+| Total tokens | 640k | 1.79M | 64% fewer |
+| Cost | $0.68 | $0.83 | 18% cheaper |
 
 **Excalidraw** · ~640 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 34s | 2m 6s | 26% faster |
+| Time | 1m 32s | 2m 6s | 27% faster |
 | File Reads | 0 | 7 | −7 |
-| Grep/Bash | 0 | 8 | −8 |
-| Tool calls | 5 | 15 | 70% fewer |
-| Total tokens | 651k | 1.69M | 61% fewer |
-| Cost | $0.57 | $0.78 | 27% cheaper |
+| Grep/Bash | 1 | 8 | −7 |
+| Tool calls | 9 | 15 | 40% fewer |
+| Total tokens | 1.27M | 1.69M | 25% fewer |
+| Cost | $0.78 | $0.78 | even |
 
 **Django** · ~3k files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 25s | 1m 58s | 28% faster |
+| Time | 1m 43s | 1m 58s | 13% faster |
 | File Reads | 0 | 9 | −9 |
 | Grep/Bash | 0 | 5 | −5 |
 | Tool calls | 3 | 13 | 77% fewer |
-| Total tokens | 419k | 1.41M | 70% fewer |
-| Cost | $0.48 | $0.62 | 23% cheaper |
+| Total tokens | 559k | 1.41M | 60% fewer |
+| Cost | $0.57 | $0.62 | 8% cheaper |
 
 **Tokio** · ~790 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 28s | 2m 20s | 37% faster |
+| Time | 1m 55s | 2m 20s | 18% faster |
 | File Reads | 0 | 8 | −8 |
 | Grep/Bash | 0 | 6 | −6 |
-| Tool calls | 3 | 14 | 79% fewer |
-| Total tokens | 522k | 1.73M | 70% fewer |
-| Cost | $0.53 | $0.82 | 35% cheaper |
+| Tool calls | 6 | 14 | 57% fewer |
+| Total tokens | 1.08M | 1.73M | 38% fewer |
+| Cost | $0.82 | $0.82 | even |
 
 **OkHttp** · ~645 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 6s | 1m 29s | 26% faster |
-| File Reads | 1 | 4 | −3 |
-| Grep/Bash | 0 | 6 | −6 |
-| Tool calls | 3 | 10 | 70% fewer |
-| Total tokens | 572k | 1.10M | 48% fewer |
-| Cost | $0.48 | $0.55 | 11% cheaper |
+| Time | 1m 1s | 1m 29s | 31% faster |
+| File Reads | 0 | 4 | −4 |
+| Grep/Bash | 2 | 6 | −4 |
+| Tool calls | 5 | 10 | 50% fewer |
+| Total tokens | 502k | 1.10M | 54% fewer |
+| Cost | $0.41 | $0.55 | 25% cheaper |
 
 **Gin** · ~110 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 28s | 1m 37s | 9% faster |
-| File Reads | 0 | 6 | −6 |
-| Grep/Bash | 0 | 2 | −2 |
-| Tool calls | 5 | 9 | 47% fewer |
-| Total tokens | 552k | 847k | 35% fewer |
-| Cost | $0.48 | $0.57 | 15% cheaper |
+| Time | 1m 14s | 1m 37s | 24% faster |
+| File Reads | 1 | 6 | −5 |
+| Grep/Bash | 1 | 2 | −1 |
+| Tool calls | 5 | 9 | 44% fewer |
+| Total tokens | 651k | 847k | 23% fewer |
+| Cost | $0.46 | $0.57 | 19% cheaper |
 
 **Alamofire** · ~110 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 2m 11s | 2m 21s | 7% faster |
-| File Reads | 3 | 9 | −6 |
-| Grep/Bash | 2 | 4 | −2 |
-| Tool calls | 11 | 12 | 13% fewer |
-| Total tokens | 1.13M | 2.10M | 46% fewer |
-| Cost | $0.69 | $0.95 | 28% cheaper |
+| Time | 1m 35s | 2m 21s | 33% faster |
+| File Reads | 0 | 9 | −9 |
+| Grep/Bash | 0 | 4 | −4 |
+| Tool calls | 5 | 12 | 58% fewer |
+| Total tokens | 766k | 2.10M | 64% fewer |
+| Cost | $0.57 | $0.95 | 40% cheaper |
 
 </details>
 
 <details>
 <summary><strong>Full benchmark details</strong></summary>
 
-**Methodology.** Each arm is `claude -p` (Claude Opus 4.8) run headlessly against the repo with `--strict-mcp-config`: **WITH** = CodeGraph's MCP server enabled, **WITHOUT** = an empty MCP config. Built-in Read/Grep/Bash stay available to both. Same question per repo, **4 runs per arm, median reported**. Cost = the run's `total_cost_usd`; Tokens = total tokens processed (input incl. cached + output); Time = wall-clock; Tool calls = every tool invocation, including those inside any sub-agents the model spawns. Repos cloned at `--depth 1` and indexed by the same CodeGraph build that served them. Re-validated 2026-05-29 on the build with per-symbol adaptive `codegraph_explore` sizing. These numbers are lower than the prior Opus 4.7 validation — not a CodeGraph regression but a stronger native baseline: Opus 4.8 greps/reads efficiently on the main thread instead of fanning out into large Explore-subagent sweeps, so the no-CodeGraph arm is leaner than it used to be. Per-repo numbers move run-to-run with how hard the without-arm thrashes (the median-of-4 smooths it, but tails remain — e.g. Django's without-arm hit $2.71/14m one batch).
+**Methodology.** Each arm is `claude -p` (Claude Opus 4.8) run headlessly against the repo with `--strict-mcp-config`: **WITH** = CodeGraph's MCP server enabled, **WITHOUT** = an empty MCP config. Built-in Read/Grep/Bash stay available to both. Same question per repo, **4 runs per arm, median reported**. Cost = the run's `total_cost_usd`; Tokens = total tokens processed (input incl. cached + output); Time = wall-clock; Tool calls = every tool invocation, including those inside any sub-agents the model spawns. Repos cloned at `--depth 1` and indexed by the same CodeGraph build that served them. Re-validated 2026-06-02 on the current build. These numbers are lower than the prior Opus 4.7 validation — not a CodeGraph regression but a stronger native baseline: Opus 4.8 greps/reads efficiently on the main thread instead of fanning out into large Explore-subagent sweeps, so the no-CodeGraph arm is leaner than it used to be. Per-repo numbers move run-to-run with how hard the without-arm thrashes (the median-of-4 smooths it, but tails remain — e.g. Django's without-arm hit $2.71/14m one batch).
 
 **Queries:**
 | Codebase | Query |
@@ -190,7 +190,7 @@ CodeGraph cuts **cost, tokens, tool calls, and time on every repo** — across s
 | Gin | "How does gin route requests through its middleware chain?" |
 | Alamofire | "How does Alamofire build, send, and validate a request?" |
 
-**Why CodeGraph wins:** with the index available, the agent answers directly — `codegraph_context` to map the area, then one `codegraph_explore` for the relevant source — and stops, usually with zero file reads. Without it, the agent spends most of its budget on discovery (find/ls/grep) before reading the right code. CodeGraph only helps when queried *directly*, so its instructions steer agents to answer directly rather than delegate exploration to file-reading sub-agents — otherwise a sub-agent reads files regardless and CodeGraph becomes overhead.
+**Why CodeGraph wins:** with the index available, the agent answers directly — usually one `codegraph_explore` returns the relevant source — and stops, usually with zero file reads. Without it, the agent spends most of its budget on discovery (find/ls/grep) before reading the right code. CodeGraph only helps when queried *directly*, so its instructions steer agents to answer directly rather than delegate exploration to file-reading sub-agents — otherwise a sub-agent reads files regardless and CodeGraph becomes overhead.
 
 </details>
 
@@ -365,7 +365,7 @@ npm install -g @colbymchenry/codegraph
   "permissions": {
     "allow": [
       "mcp__codegraph__codegraph_search",
-      "mcp__codegraph__codegraph_context",
+      "mcp__codegraph__codegraph_explore",
       "mcp__codegraph__codegraph_callers",
       "mcp__codegraph__codegraph_callees",
       "mcp__codegraph__codegraph_impact",
@@ -385,7 +385,7 @@ npm install -g @colbymchenry/codegraph
 CodeGraph's MCP server delivers its usage guidance to your agent **automatically**, in the MCP `initialize` response — there's no instructions file to manage and nothing is added to your `CLAUDE.md` / `AGENTS.md` / `GEMINI.md`. In short, it tells the agent to:
 
 - **Answer structural questions directly with CodeGraph** — it *is* the pre-built index, so a grep/read loop just repeats work it already did. Treat the returned source as already read.
-- **Pick the tool by intent:** `codegraph_context` to map an area, `codegraph_trace` for "how does X reach Y", `codegraph_explore` to survey several symbols, `codegraph_search` to find a symbol, `codegraph_callers`/`codegraph_callees` to walk call flow, `codegraph_impact` before editing, `codegraph_node` for one symbol's source.
+- **Pick the tool by intent:** `codegraph_explore` for almost anything — "how does X work", a flow/"how does X reach Y", or surveying an area (one call returns the relevant symbols' source grouped by file); `codegraph_search` to just locate a symbol; `codegraph_callers`/`codegraph_callees` to walk call flow; `codegraph_impact` before editing; `codegraph_node` for one specific symbol's full source (it returns every overload for an ambiguous name).
 - **Trust the results — don't re-verify with grep**, and check the staleness banner after edits.
 - If `.codegraph/` doesn't exist yet, offer to run `codegraph init -i`.
 
@@ -410,7 +410,7 @@ The exact text is `src/mcp/server-instructions.ts` — the single source of trut
 ┌───────────────────────────────────────────────────────────────────┐
 │                        CodeGraph MCP Server                       │
 │                                                                   │
-│       context · trace · explore · callers · callees · impact
+│       explore · search · callers · callees · impact · node  
 │                                 │                                 │
 │                                 ▼                                 │
 │                       SQLite knowledge graph                      │
@@ -441,7 +441,6 @@ codegraph sync [path]             # Incremental update
 codegraph status [path]           # Show statistics
 codegraph query <search>          # Search symbols (--kind, --limit, --json)
 codegraph files [path]            # Show file structure (--format, --filter, --max-depth, --json)
-codegraph context <task>          # Build context for AI (--format, --max-nodes)
 codegraph callers <symbol>        # Find what calls a function/method (--limit, --json)
 codegraph callees <symbol>        # Find what a function/method calls (--limit, --json)
 codegraph impact <symbol>         # Analyze what code is affected by changing a symbol (--depth, --json)
@@ -485,14 +484,12 @@ When running as an MCP server, CodeGraph exposes these tools to Claude Code:
 
 | Tool | Purpose |
 |------|---------|
+| `codegraph_explore` | **Primary.** Answer almost any question in one call — "how does X work", a flow ("how does X reach Y"), or surveying an area — returning the relevant symbols' verbatim source grouped by file, plus a relationship map and blast radius. Surfaces dynamic-dispatch hops (callbacks, React re-render, interface→impl) grep can't follow. |
 | `codegraph_search` | Find symbols by name across the codebase |
-| `codegraph_context` | Build relevant code context for a task |
-| `codegraph_trace` | Trace the call path between two symbols ("how does X reach Y") in one call — each hop with its body inline, following dynamic-dispatch hops (callbacks, React re-render, interface→impl) that grep can't |
 | `codegraph_callers` | Find what calls a function |
 | `codegraph_callees` | Find what a function calls |
 | `codegraph_impact` | Analyze what code is affected by changing a symbol |
-| `codegraph_node` | Get details about a specific symbol (optionally with source code) |
-| `codegraph_explore` | Return source for several related symbols grouped by file, plus a relationship map, in one call |
+| `codegraph_node` | Get one specific symbol's details + full source (returns every overload for an ambiguous name) |
 | `codegraph_files` | Get indexed file structure (faster than filesystem scanning) |
 | `codegraph_status` | Check index health and statistics |
 

+ 123 - 0
__tests__/context-ranking.test.ts

@@ -0,0 +1,123 @@
+/**
+ * Context ranking: common-word precision + low-confidence handoff.
+ *
+ * Regression coverage for the failure where a prose query
+ * ("capture intro onboarding screen flat object") surfaced an unrelated
+ * constant named `FLAT` (in a download script) as a top entry point — because
+ * the descriptive word "flat" exact-matched it and the +exact-name bonus was
+ * exempt from single-term dampening. The fix: only distinctive identifiers earn
+ * that exemption; an isolated common-word exact match is demoted, and a query
+ * that resolves only to such weak matches is flagged low-confidence so the
+ * response hands off to explore instead of bluffing.
+ */
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import * as fs from 'fs';
+import * as path from 'path';
+import * as os from 'os';
+import CodeGraph from '../src/index';
+import { LOW_CONFIDENCE_MARKER } from '../src/context';
+import { isDistinctiveIdentifier } from '../src/search/query-utils';
+
+describe('isDistinctiveIdentifier', () => {
+  it('treats plain dictionary words as non-distinctive', () => {
+    for (const word of ['flat', 'object', 'screen', 'standing', 'capture']) {
+      expect(isDistinctiveIdentifier(word)).toBe(false);
+    }
+  });
+
+  it('treats leading-capital-only words (proper nouns / sentence start) as non-distinctive', () => {
+    expect(isDistinctiveIdentifier('Screen')).toBe(false);
+    expect(isDistinctiveIdentifier('Zustand')).toBe(false);
+  });
+
+  it('treats camelCase / PascalCase / snake_case / acronyms / digits as distinctive', () => {
+    expect(isDistinctiveIdentifier('setLastEmail')).toBe(true);
+    expect(isDistinctiveIdentifier('OrgUserStore')).toBe(true);
+    expect(isDistinctiveIdentifier('user_store')).toBe(true);
+    expect(isDistinctiveIdentifier('REST')).toBe(true);
+    expect(isDistinctiveIdentifier('v2')).toBe(true);
+  });
+});
+
+describe('Context ranking — common-word precision & confidence', () => {
+  let testDir: string;
+  let cg: CodeGraph;
+
+  beforeEach(async () => {
+    testDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-ctxrank-'));
+
+    // The corroborated target: a capture-flow screen whose NAME alone matches
+    // three query terms (capture + intro + screen), and which lives under a
+    // matching directory.
+    const captureDir = path.join(testDir, 'src', 'app', 'capture');
+    fs.mkdirSync(captureDir, { recursive: true });
+    fs.writeFileSync(
+      path.join(captureDir, 'intro.tsx'),
+      `export function CaptureIntroScreen() {
+  // Onboarding screen shown before the user selects flat or standing object capture.
+  return null;
+}
+`
+    );
+
+    // The trap: an unrelated constant literally named FLAT, in a totally
+    // different area. "flat" in a prose query exact-matches it.
+    const scriptsDir = path.join(testDir, 'scripts', 'dataset');
+    fs.mkdirSync(scriptsDir, { recursive: true });
+    fs.writeFileSync(
+      path.join(scriptsDir, 'download.ts'),
+      `export const FLAT = 'freiburg_flat_dataset';
+export function downloadDataset(name: string): string { return name; }
+`
+    );
+
+    cg = CodeGraph.initSync(testDir, {
+      config: { include: ['**/*.ts', '**/*.tsx'], exclude: [] },
+    });
+    await cg.indexAll();
+  });
+
+  afterEach(() => {
+    if (cg) cg.destroy();
+    if (fs.existsSync(testDir)) fs.rmSync(testDir, { recursive: true, force: true });
+  });
+
+  it('does not let a common-word exact match (FLAT) outrank a corroborated symbol', async () => {
+    const sg = await cg.findRelevantContext(
+      'capture intro onboarding screen flat object'
+    );
+    const rootNames = sg.roots.map((id) => sg.nodes.get(id)?.name);
+
+    // The corroborated capture screen surfaces as an entry point...
+    expect(rootNames).toContain('CaptureIntroScreen');
+    // ...and the trap constant is never the lead result (the bug we fixed).
+    expect(rootNames[0]).not.toBe('FLAT');
+
+    const capIdx = rootNames.indexOf('CaptureIntroScreen');
+    const flatIdx = rootNames.indexOf('FLAT');
+    if (flatIdx >= 0) expect(capIdx).toBeLessThan(flatIdx);
+
+    // And it's confidently answered (we located a corroborated symbol).
+    expect(sg.confidence).toBe('high');
+  });
+
+  it('flags low confidence and emits the handoff when only common words match', async () => {
+    const query = 'flat object thing';
+    const sg = await cg.findRelevantContext(query);
+    expect(sg.confidence).toBe('low');
+
+    const md = await cg.buildContext(query, { format: 'markdown' });
+    expect(typeof md).toBe('string');
+    expect(md as string).toContain(LOW_CONFIDENCE_MARKER);
+    // The handoff routes to the precise tools rather than claiming completeness.
+    expect(md as string).toMatch(/codegraph_explore/);
+  });
+
+  it('does not emit the handoff for a precise, distinctive-symbol query', async () => {
+    const sg = await cg.findRelevantContext('CaptureIntroScreen');
+    expect(sg.confidence).toBe('high');
+
+    const md = await cg.buildContext('CaptureIntroScreen', { format: 'markdown' });
+    expect(md as string).not.toContain(LOW_CONFIDENCE_MARKER);
+  });
+});
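
The `isDistinctiveIdentifier` cases in this test file are consistent with a purely lexical rule. The block below is a minimal sketch of one heuristic that satisfies those cases. It is not the code in `src/search/query-utils`, which this diff does not show, and the name `isDistinctiveSketch` is hypothetical.

```typescript
// Hypothetical stand-in for isDistinctiveIdentifier; the real implementation
// is not part of this diff and may use different rules.
function isDistinctiveSketch(term: string): boolean {
  if (term.length === 0) return false;
  // Digits or underscores mark code-like tokens: "v2", "user_store".
  if (/[0-9_]/.test(term)) return true;
  // An uppercase letter after the first character covers camelCase,
  // PascalCase, and acronyms: "setLastEmail", "OrgUserStore", "REST".
  // A lone leading capital ("Screen", "Zustand") does not count.
  return /[A-Z]/.test(term.slice(1));
}
```

Under this rule a prose word such as `flat` earns no exact-name exemption, while a query token such as `CaptureIntroScreen` does.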

+ 73 - 0
__tests__/explore-blast-radius.test.ts

@@ -0,0 +1,73 @@
+/**
+ * codegraph_explore blast-radius section.
+ *
+ * explore now appends a compact, always-on "Blast radius" for the entry
+ * symbols: who depends on each (locations only — no source) and which test
+ * files cover it, so the agent knows what to update/verify before editing
+ * without a separate impact call. Symbols with no dependents are skipped, and
+ * the section is omitted entirely when nothing qualifies.
+ */
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import * as fs from 'fs';
+import * as path from 'path';
+import * as os from 'os';
+import CodeGraph from '../src/index';
+import { ToolHandler } from '../src/mcp/tools';
+
+describe('codegraph_explore — blast radius', () => {
+  let testDir: string;
+  let cg: CodeGraph;
+  let handler: ToolHandler;
+
+  beforeEach(async () => {
+    testDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-blast-'));
+    const src = path.join(testDir, 'src');
+    fs.mkdirSync(src, { recursive: true });
+
+    // `target` is depended on by a sibling (caller) and a test file.
+    fs.writeFileSync(
+      path.join(src, 'feature.ts'),
+      `export function target() { return 1; }\n` +
+      `export function caller() { return target(); }\n`,
+    );
+    fs.writeFileSync(
+      path.join(src, 'feature.test.ts'),
+      `import { target } from './feature';\n` +
+      `export function checkTarget() { return target(); }\n`,
+    );
+    // A leaf with no dependents — must NOT show up in the blast radius.
+    fs.writeFileSync(
+      path.join(src, 'leaf.ts'),
+      `export function lonelyLeaf() { return 42; }\n`,
+    );
+
+    cg = CodeGraph.initSync(testDir, { config: { include: ['**/*.ts'], exclude: [] } });
+    await cg.indexAll();
+    handler = new ToolHandler(cg);
+  });
+
+  afterEach(() => {
+    if (cg) cg.destroy();
+    if (fs.existsSync(testDir)) fs.rmSync(testDir, { recursive: true, force: true });
+  });
+
+  it('lists dependents (locations only) and covering tests for an entry symbol', async () => {
+    const res = await handler.execute('codegraph_explore', { query: 'target' });
+    const text = res.content[0].text;
+
+    expect(text).toContain('### Blast radius');
+    expect(text).toContain('`target`');
+    expect(text).toMatch(/caller/); // a caller count is reported
+    // It names WHERE (the caller file) — not the caller's source body.
+    expect(text).toContain('feature.ts');
+    // Test coverage is surfaced (either the covering test file, or the warning).
+    expect(text).toMatch(/tests:.*feature\.test\.ts|no covering tests/);
+  });
+
+  it('omits symbols that have no dependents from the blast radius', async () => {
+    const res = await handler.execute('codegraph_explore', { query: 'lonelyLeaf' });
+    const text = res.content[0].text;
+    // lonelyLeaf has zero callers — it must never appear under a blast-radius bullet.
+    expect(text).not.toMatch(/Blast radius[\s\S]*`lonelyLeaf`/);
+  });
+});

+ 14 - 6
__tests__/explore-output-budget.test.ts

@@ -27,9 +27,14 @@ describe('getExploreOutputBudget', () => {
     expect(small.maxOutputChars).toBeLessThanOrEqual(20000);
   });
 
-  it('keeps the historical 35k+ ceiling for medium-large projects so existing benchmarks do not regress', () => {
+  it('caps medium-large projects at the inline tool-result ceiling (~24k) so the result is never externalized', () => {
+    // A bigger single response gets externalized by the host to a file the agent
+    // Reads back (a 35k vscode explore did exactly that in the n=4 A/B) — adding a
+    // read AND cache-write cost. So large repos get MORE CALLS (getExploreBudget),
+    // not a fatter single response; the output cap stays under the inline limit.
     const large = getExploreOutputBudget(10000);
-    expect(large.maxOutputChars).toBeGreaterThanOrEqual(35000);
+    expect(large.maxOutputChars).toBeLessThanOrEqual(25000);
+    expect(large.maxOutputChars).toBeGreaterThanOrEqual(20000);
   });
 
   it('uses tier breakpoints matching getExploreBudget so call-count and output-budget agree on a project', () => {
@@ -54,10 +59,13 @@ describe('getExploreOutputBudget', () => {
     const tier3b = getExploreOutputBudget(14999);
     expect(tier3a.maxOutputChars).toBe(tier3b.maxOutputChars);
 
-    // And crossing a breakpoint changes the cap.
-    expect(tier0a.maxOutputChars).not.toBe(tier1a.maxOutputChars);
-    expect(tier1a.maxOutputChars).not.toBe(tier2a.maxOutputChars);
-    expect(tier2a.maxOutputChars).not.toBe(tier3a.maxOutputChars);
+    // Small tiers step up (13k → 18k → 24k); medium and large SHARE the ~24k
+    // inline ceiling — scaling with repo size now lives in the CALL budget
+    // (getExploreBudget), not in a fatter single response.
+    expect(tier0a.maxOutputChars).not.toBe(tier1a.maxOutputChars); // <150 vs <500
+    expect(tier1a.maxOutputChars).not.toBe(tier2a.maxOutputChars); // <500 vs <5000
+    expect(tier2a.maxOutputChars).toBe(tier3a.maxOutputChars);     // <5000 == <15000 (inline cap)
+    expect(getExploreBudget(5000)).toBeGreaterThan(getExploreBudget(4999)); // calls scale instead
   });
 
   it('gates off "Additional relevant files", completeness signal, and budget note on small projects', () => {
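
The tier comments in this hunk (13k → 18k → 24k, with medium and large projects sharing the inline cap) can be pictured as two lookup functions. This is an illustration only. The function names are hypothetical, the argument is assumed to be a project file count, and the call counts are made-up placeholders, since the test only asserts that the call budget grows across the 5000 breakpoint.

```typescript
// Hypothetical sketch; not the getExploreOutputBudget / getExploreBudget source.
// Breakpoints (150, 500, 5000) and caps (13k, 18k, 24k) come from the test
// comments above; everything else is an assumption.
function outputCapSketch(fileCount: number): number {
  if (fileCount < 150) return 13000;
  if (fileCount < 500) return 18000;
  // Medium and large tiers share one cap so a single response stays inline.
  return 24000;
}

// Placeholder call counts: only the ordering matters for the illustration.
function callBudgetSketch(fileCount: number): number {
  if (fileCount < 150) return 1;
  if (fileCount < 500) return 2;
  if (fileCount < 5000) return 3;
  if (fileCount < 15000) return 4;
  return 5;
}
```

The design point is that repo size changes how many calls are allowed, while the per-call size stops growing once it reaches the inline ceiling.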

+ 2 - 2
__tests__/integration/mcp-input-limits.test.ts

@@ -53,9 +53,9 @@ describe('MCP input size limits', () => {
     expect(result.content[0]!.text).toMatch(/maximum length/i);
   });
 
-  it('rejects an oversize task on codegraph_context', async () => {
+  it('rejects an oversize query on codegraph_explore', async () => {
     const huge = 'b'.repeat(50_000);
-    const result = await handler.execute('codegraph_context', { task: huge });
+    const result = await handler.execute('codegraph_explore', { query: huge });
     expect(result.isError).toBe(true);
     expect(result.content[0]!.text).toMatch(/maximum length/i);
   });

+ 9 - 9
__tests__/mcp-tool-allowlist.test.ts

@@ -21,28 +21,28 @@ describe('CODEGRAPH_MCP_TOOLS allowlist', () => {
     delete process.env[ENV];
     const all = listed();
     expect(all).toContain('codegraph_explore');
-    expect(all).toContain('codegraph_context');
-    expect(all).toContain('codegraph_trace');
-    expect(all.length).toBeGreaterThanOrEqual(10);
+    expect(all).not.toContain('codegraph_context');
+    expect(all).not.toContain('codegraph_trace');
+    expect(all.length).toBeGreaterThanOrEqual(8);
   });
 
   it('filters ListTools to the allowlisted short names', () => {
-    process.env[ENV] = 'trace,search,node';
-    expect(listed()).toEqual(['codegraph_node', 'codegraph_search', 'codegraph_trace']);
+    process.env[ENV] = 'explore,search,node';
+    expect(listed()).toEqual(['codegraph_explore', 'codegraph_node', 'codegraph_search']);
   });
 
   it('accepts fully-qualified codegraph_ names and ignores whitespace', () => {
-    process.env[ENV] = ' codegraph_trace , search ';
-    expect(listed()).toEqual(['codegraph_search', 'codegraph_trace']);
+    process.env[ENV] = ' codegraph_explore , search ';
+    expect(listed()).toEqual(['codegraph_explore', 'codegraph_search']);
   });
 
   it('treats an empty/whitespace value as unset (full surface)', () => {
     process.env[ENV] = '   ';
-    expect(listed().length).toBeGreaterThanOrEqual(10);
+    expect(listed().length).toBeGreaterThanOrEqual(8);
   });
 
   it('rejects a disabled tool on execute (defense in depth)', async () => {
-    process.env[ENV] = 'trace';
+    process.env[ENV] = 'node';
     const res = await new ToolHandler(null).execute('codegraph_explore', {});
     expect(res.isError).toBe(true);
     expect(res.content[0].text).toMatch(/disabled via CODEGRAPH_MCP_TOOLS/);
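
The allowlist behaviour these tests pin down (short or fully-qualified names, whitespace ignored, blank value meaning the full surface) could be parsed roughly as follows. This is a sketch under assumptions: `allowedToolsSketch` and `ALL_TOOLS_SKETCH` are hypothetical names, the tool list is copied from the README table in this PR, and sorting the result is a guess based on the order of the expected arrays in the tests.

```typescript
// Hypothetical tool list, taken from the README table in this diff.
const ALL_TOOLS_SKETCH: readonly string[] = [
  'codegraph_explore', 'codegraph_search', 'codegraph_callers',
  'codegraph_callees', 'codegraph_impact', 'codegraph_node',
  'codegraph_files', 'codegraph_status',
];

// Hypothetical parser for a CODEGRAPH_MCP_TOOLS-style value; not the server's code.
function allowedToolsSketch(raw: string | undefined, all: readonly string[]): string[] {
  const trimmed = (raw ?? '').trim();
  // Unset, empty, or whitespace-only: expose every tool.
  if (trimmed === '') return [...all];
  const wanted = new Set(
    trimmed
      .split(',')
      .map((name) => name.trim())
      .filter((name) => name.length > 0)
      // Accept both "explore" and "codegraph_explore".
      .map((name) => (name.startsWith('codegraph_') ? name : `codegraph_${name}`)),
  );
  // Unknown names are dropped; the result is sorted (an assumption).
  return all.filter((tool) => wanted.has(tool)).sort();
}
```

A call such as `allowedToolsSketch(' codegraph_explore , search ', ALL_TOOLS_SKETCH)` keeps only the explore and search tools.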

+ 176 - 0
__tests__/object-literal-methods.test.ts

@@ -0,0 +1,176 @@
+/**
+ * Object-literal method extraction (general AST rule).
+ *
+ * The extractor pulls function-valued properties out of an object literal that
+ * is the value of an exported const — either DIRECTLY
+ * (`export const actions = { foo: () => {} }`) or RETURNED by an initializer
+ * call (`export const useStore = create((set, get) => ({ foo: () => {} }))`,
+ * incl. middleware wrappers). This makes store actions (Zustand/Redux/Pinia/
+ * MobX/handler maps) real nodes, so `codegraph_node`/`callers` on them resolve
+ * instead of returning "not found" and forcing the agent to Read the store.
+ *
+ * Keyed purely on AST shape — no library names in the implementation — so any
+ * same-shaped store is covered. Resolution then falls out of the existing
+ * exact-name matcher: every call form (`const {foo}=useStore.getState(); foo()`,
+ * `useStore.getState().foo()`, in-store `get().foo()`) reduces to a bare `foo`
+ * call that resolves to the action node once it exists.
+ */
+import { describe, it, expect, beforeAll, afterEach } from 'vitest';
+import * as fs from 'fs';
+import * as path from 'path';
+import * as os from 'os';
+import { CodeGraph } from '../src';
+import { extractFromSource } from '../src/extraction';
+import { initGrammars, loadAllGrammars } from '../src/extraction/grammars';
+
+beforeAll(async () => {
+  await initGrammars();
+  await loadAllGrammars();
+});
+
+describe('object-literal method extraction', () => {
+  it('extracts Zustand store actions (object returned by create()) as function nodes', () => {
+    const code = `
+      import { create } from 'zustand'
+      interface Store {
+        count: number
+        fetchUser(): Promise<void>
+        switchOrganization(id: string): Promise<void>
+        reset(): void
+      }
+      export const useStore = create<Store>((set, get) => ({
+        count: 0,
+        fetchUser: async () => { await get().reset() },
+        switchOrganization: async (id: string) => { set({ count: 1 }) },
+        reset: () => set({ count: 0 }),
+      }))
+    `;
+    const result = extractFromSource('store.ts', code);
+    const fnNames = result.nodes.filter((n) => n.kind === 'function').map((n) => n.name);
+    expect(fnNames).toContain('fetchUser');
+    expect(fnNames).toContain('switchOrganization');
+    expect(fnNames).toContain('reset');
+
+    // Each action's body was walked: fetchUser references its sibling `reset`,
+    // so an in-store calls edge will resolve once the pipeline runs.
+    const fetchUser = result.nodes.find((n) => n.name === 'fetchUser')!;
+    const fetchUserRefs = result.unresolvedReferences.filter((r) => r.fromNodeId === fetchUser.id);
+    expect(fetchUserRefs.map((r) => r.referenceName)).toContain('reset');
+
+    // The action's body wasn't mis-attributed to the file scope (the reason we
+    // skip the generic body-visit for the store-factory call).
+    const fileNode = result.nodes.find((n) => n.kind === 'file')!;
+    const fileRefs = result.unresolvedReferences.filter((r) => r.fromNodeId === fileNode.id);
+    expect(fileRefs.map((r) => r.referenceName)).not.toContain('reset');
+  });
+
+  it('extracts actions through a middleware wrapper (create(persist(...)))', () => {
+    const code = `
+      import { create } from 'zustand'
+      import { persist } from 'zustand/middleware'
+      export const useCounter = create(
+        persist(
+          (set, get) => ({
+            value: 0,
+            increment: () => set({ value: get().value + 1 }),
+          }),
+          { name: 'counter' }
+        )
+      )
+    `;
+    const result = extractFromSource('counter.ts', code);
+    const fnNames = result.nodes.filter((n) => n.kind === 'function').map((n) => n.name);
+    expect(fnNames).toContain('increment');
+  });
+
+  it('extracts actions when the initializer returns via a block (=> { return {...} })', () => {
+    const code = `
+      import { create } from 'zustand'
+      export const useThing = create((set) => {
+        const initial = 0
+        return {
+          value: initial,
+          bump: () => set({ value: 1 }),
+        }
+      })
+    `;
+    const result = extractFromSource('thing.ts', code);
+    const fnNames = result.nodes.filter((n) => n.kind === 'function').map((n) => n.name);
+    expect(fnNames).toContain('bump');
+  });
+
+  it('does NOT extract methods from a non-exported call-wrapped object (noise gate)', () => {
+    const code = `
+      function wrap(f: any) { return f }
+      const local = wrap(() => ({ shouldNotExtract: () => {} }))
+    `;
+    const result = extractFromSource('inline.ts', code);
+    const names = result.nodes.map((n) => n.name);
+    expect(names).not.toContain('shouldNotExtract');
+  });
+
+  it('still extracts the existing direct-object shape (export const actions = {...})', () => {
+    const code = `
+      export const actions = {
+        load: async () => { helper() },
+      }
+      function helper() {}
+    `;
+    const result = extractFromSource('actions.ts', code);
+    const fnNames = result.nodes.filter((n) => n.kind === 'function').map((n) => n.name);
+    expect(fnNames).toContain('load');
+  });
+});
+
+describe('object-literal method resolution (end-to-end)', () => {
+  let tmpDir: string | undefined;
+  afterEach(() => {
+    if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true });
+    tmpDir = undefined;
+  });
+
+  it('resolves callers of store actions across files (destructured + chained getState())', async () => {
+    tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-store-'));
+    fs.writeFileSync(path.join(tmpDir, 'package.json'), '{"name":"t","dependencies":{"zustand":"^4"}}\n');
+    fs.writeFileSync(
+      path.join(tmpDir, 'store.ts'),
+      `import { create } from 'zustand'\n` +
+        `interface S { fetchUser(): Promise<void>; reset(): void }\n` +
+        `export const useStore = create<S>((set, get) => ({\n` +
+        `  fetchUser: async () => { get().reset() },\n` +
+        `  reset: () => set({}),\n` +
+        `}))\n`
+    );
+    fs.writeFileSync(
+      path.join(tmpDir, 'caller.ts'),
+      `import { useStore } from './store'\n` +
+        `export async function loginFlow() {\n` +
+        `  const { fetchUser } = useStore.getState()\n` +
+        `  await fetchUser()\n` +
+        `}\n` +
+        `export function hardReset() {\n` +
+        `  useStore.getState().reset()\n` +
+        `}\n`
+    );
+
+    const cg = CodeGraph.initSync(tmpDir);
+    await cg.indexAll();
+
+    const fns = cg.getNodesByKind('function');
+    const fetchUser = fns.find((n) => n.name === 'fetchUser' && n.filePath.endsWith('store.ts'));
+    const reset = fns.find((n) => n.name === 'reset' && n.filePath.endsWith('store.ts'));
+    expect(fetchUser).toBeDefined();
+    expect(reset).toBeDefined();
+
+    // Destructured-then-bare call: loginFlow -> fetchUser
+    const fetchUserCallers = cg.getCallers(fetchUser!.id).map((c) => c.node.name);
+    expect(fetchUserCallers).toContain('loginFlow');
+
+    // Chained getState() call: hardReset -> reset, AND in-store sibling: fetchUser -> reset
+    const resetCallers = cg.getCallers(reset!.id).map((c) => c.node.name);
+    expect(resetCallers).toContain('hardReset');
+    expect(resetCallers).toContain('fetchUser');
+
+    cg.close();
+  });
+});

+ 19 - 19
__tests__/pr19-improvements.test.ts

@@ -501,10 +501,10 @@ describe('MCP Tool Improvements', () => {
     expect(typeof ToolHandler).toBe('function');
   });
 
-  it.skipIf(!HAS_SQLITE)('should have findSymbol and truncateOutput as private methods', async () => {
+  it.skipIf(!HAS_SQLITE)('should have findSymbolMatches and truncateOutput as private methods', async () => {
     const { ToolHandler } = await import('../src/mcp/tools');
     const proto = ToolHandler.prototype;
-    expect(typeof (proto as any).findSymbol).toBe('function');
+    expect(typeof (proto as any).findSymbolMatches).toBe('function');
     expect(typeof (proto as any).truncateOutput).toBe('function');
   });
 
@@ -567,20 +567,19 @@ export function getValueFromCache(): number { return 2; }
       await cg.indexAll();
 
       const handler = new ToolHandler(cg);
-      const findSymbol = (handler as any).findSymbol.bind(handler);
+      const findSymbolMatches = (handler as any).findSymbolMatches.bind(handler);
 
-      const match = findSymbol(cg, 'getValue');
-      expect(match).not.toBeNull();
-      expect(match.node.name).toBe('getValue');
-      // Should not have a disambiguation note for single exact match
-      expect(match.note).toBe('');
+      const matches = findSymbolMatches(cg, 'getValue');
+      // Exact-name match wins — a single result, not the partial getValueFromCache.
+      expect(matches.length).toBe(1);
+      expect(matches[0].name).toBe('getValue');
 
       handler.closeAll();
       cg.destroy();
       cleanupTempDir(tmpDir);
     });
 
-    it.skipIf(!HAS_SQLITE)('should note when multiple symbols share the same name', async () => {
+    it.skipIf(!HAS_SQLITE)('should return all definitions when multiple symbols share the same name', async () => {
       const { ToolHandler } = await import('../src/mcp/tools');
       const CodeGraph = (await import('../src/index')).default;
 
@@ -602,20 +601,21 @@ export function handle(): void {}
       await cg.indexAll();
 
       const handler = new ToolHandler(cg);
-      const findSymbol = (handler as any).findSymbol.bind(handler);
+      const findSymbolMatches = (handler as any).findSymbolMatches.bind(handler);
 
-      const match = findSymbol(cg, 'handle');
-      expect(match).not.toBeNull();
-      expect(match.node.name).toBe('handle');
-      // Should have a disambiguation note
-      expect(match.note).toContain('2 symbols named "handle"');
+      // Both same-named definitions are returned (no longer one + a dead-end
+      // note) so codegraph_node can hand back every overload and the agent never
+      // Reads to find the one it wanted.
+      const matches = findSymbolMatches(cg, 'handle');
+      expect(matches.length).toBe(2);
+      expect(matches.every((n: any) => n.name === 'handle')).toBe(true);
 
       handler.closeAll();
       cg.destroy();
       cleanupTempDir(tmpDir);
     });
 
-    it.skipIf(!HAS_SQLITE)('should return null when symbol is not found', async () => {
+    it.skipIf(!HAS_SQLITE)('should return no matches when symbol is not found', async () => {
       const { ToolHandler } = await import('../src/mcp/tools');
       const CodeGraph = (await import('../src/index')).default;
 
@@ -630,10 +630,10 @@ export function handle(): void {}
       await cg.indexAll();
 
       const handler = new ToolHandler(cg);
-      const findSymbol = (handler as any).findSymbol.bind(handler);
+      const findSymbolMatches = (handler as any).findSymbolMatches.bind(handler);
 
-      const match = findSymbol(cg, 'nonExistentSymbol');
-      expect(match).toBeNull();
+      const matches = findSymbolMatches(cg, 'nonExistentSymbol');
+      expect(matches.length).toBe(0);
 
       handler.closeAll();
       cg.destroy();
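
The contract these three tests describe for `findSymbolMatches` (exact names win, every same-named definition is returned, no match yields an empty array) can be sketched over a plain node list. The names `SymbolNodeSketch` and `findMatchesSketch` are hypothetical, and the substring fallback is an assumption; the private method in `src/mcp/tools` is not shown in this diff and presumably ranks matches against the graph.

```typescript
// Hypothetical node shape; the real graph node carries more fields.
interface SymbolNodeSketch {
  name: string;
  filePath: string;
}

// Hypothetical sketch of "exact name wins, return every overload".
function findMatchesSketch(
  nodes: readonly SymbolNodeSketch[],
  symbol: string,
): SymbolNodeSketch[] {
  const exact = nodes.filter((n) => n.name === symbol);
  // All exact-name definitions are returned together, not one plus a note.
  if (exact.length > 0) return exact;
  // Assumption: with no exact match, fall back to partial name matches.
  return nodes.filter((n) => n.name.includes(symbol));
}
```

Returning an array instead of `{ node, note } | null` lets the caller decide whether to show one definition or all of them.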

+ 19 - 98
__tests__/security.test.ts

@@ -263,23 +263,35 @@ describe('MCP Input Validation', () => {
     expect(result.content[0].text).toContain('non-empty string');
   });
 
-  it('should reject non-string task in codegraph_context', async () => {
-    const result = await handler.execute('codegraph_context', { task: undefined });
+  it('should reject non-string query in codegraph_explore', async () => {
+    const result = await handler.execute('codegraph_explore', { query: undefined });
     expect(result.isError).toBe(true);
     expect(result.content[0].text).toContain('non-empty string');
   });
 
-  it('should truncate oversized codegraph_context output', async () => {
-    const oversizedContext = Array.from({ length: 400 }, (_, i) => `line-${i} ${'x'.repeat(80)}`).join('\n');
+  it('should truncate oversized tool output', async () => {
+    // Force a huge result set through codegraph_search; the response must be
+    // truncated with the sentinel rather than flooding the agent's context.
+    const many = Array.from({ length: 3000 }, (_, i) => ({
+      node: {
+        id: `n${i}`,
+        name: `symbol_${i}_${'x'.repeat(40)}`,
+        kind: 'function',
+        filePath: `src/very/deep/path/file_${i}.ts`,
+        startLine: 1,
+        endLine: 2,
+        language: 'typescript',
+      },
+      score: 1,
+    }));
     const fakeCg = {
-      buildContext: async () => oversizedContext,
+      searchNodes: () => many,
     };
     const fakeHandler = new ToolHandler(fakeCg as unknown as CodeGraph);
 
-    const result = await fakeHandler.execute('codegraph_context', { task: 'find example' });
+    const result = await fakeHandler.execute('codegraph_search', { query: 'x' });
 
     expect(result.isError).toBeFalsy();
-    expect(result.content[0].text.length).toBeLessThan(oversizedContext.length);
     expect(result.content[0].text).toContain('... (output truncated)');
   });
 
@@ -551,94 +563,3 @@ describe('Symlink Cycle Detection', () => {
     expect(files).toContain('src/valid.ts');
   });
 });
-
-describe('Session marker symlink resistance', () => {
-  // The marker write lives in src/mcp/tools.ts behind handleContext. We exercise
-  // it end-to-end via ToolHandler.execute so the test exercises the same code
-  // path Claude Code drives. The session id is per-test so other parallel test
-  // runs can't collide with the marker file we plant a symlink at.
-  const SESSION_ID = `cg-test-${process.pid}-${Date.now()}-${Math.random().toString(36).slice(2)}`;
-  const crypto = require('crypto') as typeof import('crypto');
-  const hash = crypto.createHash('md5').update(SESSION_ID).digest('hex').slice(0, 16);
-  const markerPath = path.join(os.tmpdir(), `codegraph-consulted-${hash}`);
-
-  let projectDir: string;
-  let victimDir: string;
-  let victimFile: string;
-
-  beforeEach(async () => {
-    projectDir = createTempDir();
-    victimDir = createTempDir();
-    victimFile = path.join(victimDir, 'private.txt');
-    fs.writeFileSync(victimFile, 'SECRET-DO-NOT-OVERWRITE\n');
-    if (fs.existsSync(markerPath)) fs.unlinkSync(markerPath);
-
-    // A real .codegraph/ has to exist for handleContext to get past the
-    // "not initialized" guard — index a tiny fixture so the call reaches the
-    // marker write step rather than short-circuiting on missing project state.
-    fs.writeFileSync(path.join(projectDir, 'a.ts'), 'export const x = 1;\n');
-    const cg = await CodeGraph.init(projectDir);
-    await cg.indexAll();
-    cg.close();
-  });
-
-  afterEach(() => {
-    if (fs.existsSync(markerPath)) fs.unlinkSync(markerPath);
-    cleanupTempDir(projectDir);
-    cleanupTempDir(victimDir);
-  });
-
-  it('does not follow a pre-planted symlink at the marker path', async () => {
-    // Skip on platforms where the user can't create symlinks (Windows without
-    // dev mode + admin). The CWE-59 risk we're guarding against doesn't apply
-    // when symlinks aren't creatable, so the skip is correct, not a gap.
-    try {
-      fs.symlinkSync(victimFile, markerPath);
-    } catch {
-      return;
-    }
-
-    const cg = await CodeGraph.open(projectDir);
-    const handler = new ToolHandler(cg);
-    process.env.CLAUDE_SESSION_ID = SESSION_ID;
-    try {
-      await handler.execute('codegraph_context', { task: 'find x' });
-    } finally {
-      delete process.env.CLAUDE_SESSION_ID;
-      cg.close();
-    }
-
-    // The victim file's contents must be untouched — the old writeFileSync
-    // path would have followed the symlink and written an ISO timestamp here.
-    expect(fs.readFileSync(victimFile, 'utf8')).toBe('SECRET-DO-NOT-OVERWRITE\n');
-
-    // And the marker path itself must still be the symlink we planted —
-    // no fallback path that quietly unlinked + recreated it (which would
-    // also work, but is a behavior we don't want to silently rely on).
-    expect(fs.lstatSync(markerPath).isSymbolicLink()).toBe(true);
-  });
-
-  it('writes the marker file with 0o600 perms on a clean path', async () => {
-    // No symlink planted — happy path. Verifies the new openSync(mode: 0o600)
-    // call is what actually lands on disk (regression guard for the perm
-    // tightening that came with the O_NOFOLLOW fix).
-    const cg = await CodeGraph.open(projectDir);
-    const handler = new ToolHandler(cg);
-    process.env.CLAUDE_SESSION_ID = SESSION_ID;
-    try {
-      await handler.execute('codegraph_context', { task: 'find x' });
-    } finally {
-      delete process.env.CLAUDE_SESSION_ID;
-      cg.close();
-    }
-
-    expect(fs.existsSync(markerPath)).toBe(true);
-    // chmod's low 9 bits — strip the file-type bits for a clean compare.
-    // Windows can't enforce 0o600 in the POSIX sense; skip the assertion
-    // there since the underlying OS will normalize the mode anyway.
-    if (process.platform !== 'win32') {
-      const mode = fs.statSync(markerPath).mode & 0o777;
-      expect(mode).toBe(0o600);
-    }
-  });
-});

+ 58 - 30
__tests__/symbol-lookup.test.ts

@@ -75,7 +75,8 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — module-qualified lookups (#173)'
   let projectRoot: string;
   let cg: any;
   let handler: any;
-  let findSymbol: (cg: any, s: string) => { node: any; note: string } | null;
+  // findSymbolMatches returns ALL ranked matches; [0] is the resolved/picked one.
+  let findSymbolMatches: (cg: any, s: string) => any[];
   let findAllSymbols: (cg: any, s: string) => { nodes: any[]; note: string };
 
   beforeEach(async () => {
@@ -87,7 +88,7 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — module-qualified lookups (#173)'
     });
     await cg.indexAll();
     handler = new ToolHandler(cg);
-    findSymbol = (handler as any).findSymbol.bind(handler);
+    findSymbolMatches = (handler as any).findSymbolMatches.bind(handler);
     findAllSymbols = (handler as any).findAllSymbols.bind(handler);
   });
 
@@ -98,10 +99,11 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — module-qualified lookups (#173)'
   });
 
   it('resolves `stage_apply::run` to the run in stage_apply.rs (not stage_detect.rs)', () => {
-    const match = findSymbol(cg, 'stage_apply::run');
-    expect(match).not.toBeNull();
-    expect(match!.node.name).toBe('run');
-    expect(match!.node.filePath).toMatch(/configurator\/stage_apply\.rs$/);
+    const matches = findSymbolMatches(cg, 'stage_apply::run');
+    expect(matches.length).toBeGreaterThan(0);
+    expect(matches[0]!.name).toBe('run');
+    // Every match must be in stage_apply.rs — never stage_detect.rs.
+    for (const n of matches) expect(n.filePath).toMatch(/configurator\/stage_apply\.rs$/);
   });
 
   it('rejects `stage_apply::run` for the same-named function in a different module', () => {
@@ -114,29 +116,29 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — module-qualified lookups (#173)'
   });
 
   it('resolves `configurator::stage_apply::run` (multi-level qualifier)', () => {
-    const match = findSymbol(cg, 'configurator::stage_apply::run');
-    expect(match).not.toBeNull();
-    expect(match!.node.name).toBe('run');
-    expect(match!.node.filePath).toMatch(/configurator\/stage_apply\.rs$/);
+    const matches = findSymbolMatches(cg, 'configurator::stage_apply::run');
+    expect(matches.length).toBeGreaterThan(0);
+    expect(matches[0]!.name).toBe('run');
+    expect(matches[0]!.filePath).toMatch(/configurator\/stage_apply\.rs$/);
   });
 
   it('resolves `crate::configurator::stage_apply::run` (Rust path prefix stripped)', () => {
-    const match = findSymbol(cg, 'crate::configurator::stage_apply::run');
-    expect(match).not.toBeNull();
-    expect(match!.node.filePath).toMatch(/configurator\/stage_apply\.rs$/);
+    const matches = findSymbolMatches(cg, 'crate::configurator::stage_apply::run');
+    expect(matches.length).toBeGreaterThan(0);
+    expect(matches[0]!.filePath).toMatch(/configurator\/stage_apply\.rs$/);
   });
 
   it('resolves `configurator/stage_apply` (slash qualifier)', () => {
-    const match = findSymbol(cg, 'configurator/stage_apply/run');
-    expect(match).not.toBeNull();
-    expect(match!.node.filePath).toMatch(/configurator\/stage_apply\.rs$/);
+    const matches = findSymbolMatches(cg, 'configurator/stage_apply/run');
+    expect(matches.length).toBeGreaterThan(0);
+    expect(matches[0]!.filePath).toMatch(/configurator\/stage_apply\.rs$/);
   });
 
   it('does not silently collide bare `run` with `run_due_tasks`', () => {
-    const match = findSymbol(cg, 'run');
-    expect(match).not.toBeNull();
-    // Whatever it picks, it must be an exact-name match, not a partial.
-    expect(match!.node.name).toBe('run');
+    const matches = findSymbolMatches(cg, 'run');
+    expect(matches.length).toBeGreaterThan(0);
+    // Every returned match must be an exact-name match, not a partial.
+    for (const n of matches) expect(n.name).toBe('run');
   });
 
   it('aggregates all bare-name `run` matches across modules', () => {
@@ -148,9 +150,22 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — module-qualified lookups (#173)'
     expect(all.note).toMatch(/Aggregated|symbols named "run"/);
   });
 
-  it('still returns null for genuinely unknown qualified lookups', () => {
-    const match = findSymbol(cg, 'stage_apply::nonexistent_fn');
-    expect(match).toBeNull();
+  it('still returns nothing for genuinely unknown qualified lookups', () => {
+    const matches = findSymbolMatches(cg, 'stage_apply::nonexistent_fn');
+    expect(matches.length).toBe(0);
+  });
+
+  it('codegraph_node with a `file` hint pins an overloaded name to that file', async () => {
+    // `run` is defined in BOTH stage_apply.rs and stage_detect.rs. A bare lookup
+    // returns both; the `file` hint narrows to the one the caller saw in a trail.
+    const res = await handler.execute('codegraph_node', {
+      symbol: 'run',
+      includeCode: true,
+      file: 'stage_detect.rs',
+    });
+    const text = res.content?.[0]?.text ?? '';
+    expect(text).toMatch(/stage_detect\.rs/);
+    expect(text).not.toMatch(/stage_apply\.rs/);
   });
 });
 
@@ -158,7 +173,7 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — dotted lookups (regression for #
   let projectRoot: string;
   let cg: any;
   let handler: any;
-  let findSymbol: (cg: any, s: string) => { node: any; note: string } | null;
+  let findSymbolMatches: (cg: any, s: string) => any[];
 
   beforeEach(async () => {
     projectRoot = tmpRoot();
@@ -166,7 +181,7 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — dotted lookups (regression for #
     fs.mkdirSync(src, { recursive: true });
     fs.writeFileSync(
       path.join(src, 'session.ts'),
-      `export class Session {\n  request(): void {}\n}\nexport function request(): void {}\n`
+      `export class Session {\n  request(): void { fetch('x'); }\n}\nexport function request(): void {}\n`
     );
 
     const CodeGraph = (await import('../src/index')).default;
@@ -176,7 +191,7 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — dotted lookups (regression for #
     });
     await cg.indexAll();
     handler = new ToolHandler(cg);
-    findSymbol = (handler as any).findSymbol.bind(handler);
+    findSymbolMatches = (handler as any).findSymbolMatches.bind(handler);
   });
 
   afterEach(() => {
@@ -186,9 +201,22 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — dotted lookups (regression for #
   });
 
   it('`Session.request` resolves to the method, not the bare function', () => {
-    const match = findSymbol(cg, 'Session.request');
-    expect(match).not.toBeNull();
-    expect(match!.node.kind).toBe('method');
-    expect(match!.node.qualifiedName).toContain('Session::request');
+    const matches = findSymbolMatches(cg, 'Session.request');
+    expect(matches.length).toBeGreaterThan(0);
+    expect(matches[0]!.kind).toBe('method');
+    expect(matches[0]!.qualifiedName).toContain('Session::request');
+  });
+
+  it('codegraph_node on an ambiguous bare name returns ALL overloads with bodies (no guess)', async () => {
+    // `request` is BOTH a method (Session.request) and a free function. The old
+    // behavior returned one + a dead-end "Others:" note, forcing a Read to get
+    // the other overload; now both bodies come back in one call.
+    const res = await handler.execute('codegraph_node', { symbol: 'request', includeCode: true });
+    const text = res.content?.[0]?.text ?? '';
+    expect(text).toContain('2 definitions named "request"');
+    // Both definitions are rendered (method + function), each with a Location.
+    expect(text).toMatch(/\(method\)/);
+    expect(text).toMatch(/\(function\)/);
+    expect((text.match(/\*\*Location:\*\*/g) || []).length).toBeGreaterThanOrEqual(2);
   });
 });
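
The `file` hint exercised by the test above can be pictured as a filter over the ranked candidates. The following is a minimal sketch, not the PR's implementation: `SymbolNode`, `pinByFile`, the suffix-match rule, and the fall-back-to-all behavior when nothing matches are all assumptions made for illustration.

```typescript
// Hypothetical, simplified node record; the real graph node has more fields.
interface SymbolNode {
  name: string;
  kind: string;
  filePath: string;
}

// Assumed narrowing rule: keep candidates whose path ends with the hint.
// With no hint, or a hint that matches nothing, every overload is kept
// (this fallback is an assumption, not something the diff shows).
function pinByFile(candidates: SymbolNode[], file?: string): SymbolNode[] {
  if (!file) return candidates;
  const pinned = candidates.filter((n) => n.filePath.endsWith(file));
  return pinned.length > 0 ? pinned : candidates;
}

// The two same-named definitions from the test fixture.
const runs: SymbolNode[] = [
  { name: 'run', kind: 'function', filePath: 'configurator/stage_apply.rs' },
  { name: 'run', kind: 'function', filePath: 'configurator/stage_detect.rs' },
];
```

Under these assumptions, `pinByFile(runs, 'stage_detect.rs')` keeps only the `stage_detect.rs` definition, while a bare lookup without a hint returns both.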

+ 1 - 1
__tests__/worktree-detection.test.ts

@@ -180,7 +180,7 @@ describe('worktree mismatch surfaces on hot read tools (issue #155)', () => {
     const savedPath = process.env.PATH;
     process.env.PATH = '';
     try {
-      const second = await handler.execute('codegraph_context', { task: 'mainOnly' });
+      const second = await handler.execute('codegraph_explore', { query: 'mainOnly' });
       expect(second.content[0].text).toContain('different git worktree');
     } finally {
       process.env.PATH = savedPath;

+ 15 - 7
docs/design/callback-edge-synthesis.md

@@ -1,12 +1,19 @@
 # Design + status: general callback / observer edge synthesis
 
-**Status:** Phases 1–3 implemented & validated as a **prototype, uncommitted on `main`**
-(as of 2026-05-22). This doc is the handoff for continuing the work.
+**Status:** SHIPPED (the synthesizer in `callback-synthesizer.ts` is merged and on
+`main`). This doc records the original design.
 **Motivation:** close the dynamic-dispatch hole that static extraction leaves for
 observer / event-emitter / signal patterns, where a *dispatcher* invokes callbacks
 registered elsewhere through a shared store — so flows like "how does an update
 reach the screen" actually exist in the graph.
 
+> **Update (2026-06-01):** the `codegraph_trace` and `codegraph_context` MCP tools
+> have since been **removed**; `codegraph_explore` is the single surfacing tool now. Its
+> "Flow" section (`buildFlowFromNamedSymbols`) and the `codegraph_node` trail surface
+> these synthesized edges; the `trace(a, b)` notation below means "the a→b flow,"
+> which you now verify with `codegraph_explore` / `probe-explore.mjs` (the
+> `probe-trace.mjs` / `probe-context.mjs` dev probes went away with the tools).
+
 ---
 
 ## TL;DR for a new session
@@ -35,11 +42,11 @@ rm -rf /tmp/codegraph-corpus/excalidraw/.codegraph
 sqlite3 /tmp/codegraph-corpus/excalidraw/.codegraph/codegraph.db \
   "select s.name||' → '||t.name||'  '||coalesce(e.metadata,'') from edges e \
    join nodes s on e.source=s.id join nodes t on e.target=t.id where e.provenance='heuristic';"
-# end-to-end trace (uses the dev probes):
-node scripts/agent-eval/probe-trace.mjs /tmp/codegraph-corpus/excalidraw triggerUpdate triggerRender
+# end-to-end flow (the synthesized edge shows up in explore's Flow section + node trail):
+node scripts/agent-eval/probe-explore.mjs /tmp/codegraph-corpus/excalidraw "triggerUpdate triggerRender"
 ```
 Probe scripts (dev-only, in `scripts/agent-eval/`): `probe-node.mjs` (symbol + trail),
-`probe-trace.mjs` (call path), `probe-context.mjs`, `probe-explore.mjs`. EventEmitter
+`probe-explore.mjs` (relevant source + the flow among named symbols). EventEmitter
 fixture lives at `/tmp/cb-fixture/bus.js` (ephemeral — recreate or move into `__tests__/`).
 
 ---
@@ -172,8 +179,9 @@ This is one half of closing dynamic-dispatch coverage. The other artifacts on `m
   pre-filter in `resolution/index.ts`) + django ORM resolver (`frameworks/python.ts`,
   `_iterable_class` → `ModelIterable.__iter__`).
 - **Retrieval/UX changes** (separate from coverage): `explore` whole-small-file + glue
-  fixes, `node`-with-trail, `codegraph_trace`, `context` call-paths — all in
-  `src/mcp/tools.ts` / `src/context/index.ts`.
+  fixes, the `explore` Flow section (`buildFlowFromNamedSymbols`), and `node`-with-trail
+  — all in `src/mcp/tools.ts`. (`codegraph_trace` / `codegraph_context` were later
+  removed; explore is the one surfacing tool.)
 - **Full investigation context + findings:** auto-memory
   `project_codegraph_read_displacement` (why coverage — not prompting/hooks/new-tools —
   is the lever for getting agents to use codegraph over Read).

+ 8 - 0
docs/design/dynamic-dispatch-coverage-playbook.md

@@ -9,6 +9,14 @@ each one the same way, so cross-symbol *flows* exist in the graph everywhere.
 > synthesizer) is in [`callback-edge-synthesis.md`](./callback-edge-synthesis.md).
 > Full investigation context + findings: auto-memory `project_codegraph_read_displacement`.
 
+> **Update (2026-06-01):** the `codegraph_trace` and `codegraph_context` MCP tools were
+> **removed** — `codegraph_explore` is the single surfacing tool now. Its "Flow" section
+> (`buildFlowFromNamedSymbols`) surfaces the synthesized edges this playbook is about, and
+> you validate coverage with `codegraph_explore` / `scripts/agent-eval/probe-explore.mjs`.
+> Where the text below writes `trace(a, b)` or lists `trace`/`context` among the tools,
+> read it as "the a→b flow, now surfaced and verified via explore." The synthesizers and
+> the coverage matrix are unchanged.
+
 ---
 
 ## 1. The goal (why this matters)

+ 22 - 0
scripts/agent-eval/bench-why-repo.sh

@@ -0,0 +1,22 @@
+#!/usr/bin/env bash
+# One README repo, WITH-codegraph only, N runs. Each run appends a why-Read
+# diagnostic so the agent explains any Read/Grep. (The WITHOUT baseline is
+# codegraph-independent and already in the README — no point re-running it.)
+# Output -> /tmp/ab-why/<repo>/with<n>.jsonl
+# Usage: bench-why-repo.sh <repo-path> "<query>" [N]
+set -uo pipefail
+REPO="$1"; Q="$2"; N="${3:-4}"
+NAME="$(basename "$REPO")"
+CG="/Users/colby/Development/Personal/codegraph/dist/bin/codegraph.js"
+OUT="/tmp/ab-why/$NAME"; mkdir -p "$OUT"
+WHY=$'\n\nIMPORTANT — diagnostic: if you use the Read or Grep tool at ANY point, for EACH such call explain why codegraph_explore / codegraph_node did not already give you what you needed. End your entire answer with a section titled exactly "## Why I read" listing every Read and Grep you made and the precise reason codegraph fell short for it. If you used neither, write "## Why I read" then "none — codegraph was sufficient."'
+printf '{"mcpServers":{"codegraph":{"command":"%s","args":["serve","--mcp","--path","%s"]}}}' "$CG" "$REPO" > "$OUT/cg.json"
+
+for i in $(seq 1 "$N"); do
+  pkill -f "serve --mcp" 2>/dev/null; sleep 1; rm -f "$REPO/.codegraph/daemon.sock"
+  ( cd "$REPO" && claude -p "$Q$WHY" --output-format stream-json --verbose \
+      --permission-mode bypassPermissions --model opus --effort "${EFFORT:-high}" --max-budget-usd 4 \
+      --strict-mcp-config --mcp-config "$OUT/cg.json" > "$OUT/with$i.jsonl" 2>"$OUT/with$i.err" )
+  echo "WITH run $i: exit $? ($(wc -l < "$OUT/with$i.jsonl" | tr -d ' ') lines)"
+done
+echo "DONE $NAME"

+ 1 - 47
src/bin/codegraph.ts

@@ -1073,52 +1073,6 @@ function printFileTree(
   renderNode(root, '', true, 0);
 }
 
-/**
- * codegraph context <task>
- */
-program
-  .command('context <task>')
-  .description('Build context for a task (outputs markdown)')
-  .option('-p, --path <path>', 'Project path')
-  .option('-n, --max-nodes <number>', 'Maximum nodes to include', '50')
-  .option('-c, --max-code <number>', 'Maximum code blocks', '10')
-  .option('--no-code', 'Exclude code blocks')
-  .option('-f, --format <format>', 'Output format (markdown, json)', 'markdown')
-  .action(async (task: string, options: {
-    path?: string;
-    maxNodes?: string;
-    maxCode?: string;
-    code?: boolean;
-    format?: string;
-  }) => {
-    const projectPath = resolveProjectPath(options.path);
-
-    try {
-      if (!isInitialized(projectPath)) {
-        error(`CodeGraph not initialized in ${projectPath}`);
-        process.exit(1);
-      }
-
-      const { default: CodeGraph } = await loadCodeGraph();
-      const cg = await CodeGraph.open(projectPath);
-
-      const context = await cg.buildContext(task, {
-        maxNodes: parseInt(options.maxNodes || '50', 10),
-        maxCodeBlocks: parseInt(options.maxCode || '10', 10),
-        includeCode: options.code !== false,
-        format: options.format as 'markdown' | 'json',
-      });
-
-      // Output the context
-      console.log(context);
-
-      cg.destroy();
-    } catch (err) {
-      error(`Failed to build context: ${err instanceof Error ? err.message : String(err)}`);
-      process.exit(1);
-    }
-  });
-
 /**
  * codegraph serve
  */
@@ -1161,8 +1115,8 @@ program
 }
 `));
         console.error('Available tools:');
+        console.error(chalk.cyan('  codegraph_explore') + '   - Primary: source of the relevant symbols for any question');
         console.error(chalk.cyan('  codegraph_search') + '    - Search for code symbols');
-        console.error(chalk.cyan('  codegraph_context') + '   - Build context for a task');
         console.error(chalk.cyan('  codegraph_callers') + '   - Find callers of a symbol');
         console.error(chalk.cyan('  codegraph_callees') + '   - Find what a symbol calls');
         console.error(chalk.cyan('  codegraph_impact') + '    - Analyze impact of changes');

+ 95 - 6
src/context/index.ts

@@ -25,7 +25,8 @@ import { GraphTraverser } from '../graph';
 import { formatContextAsMarkdown, formatContextAsJson } from './formatter';
 import { logDebug } from '../errors';
 import { validatePathWithinRoot } from '../utils';
-import { isTestFile, extractSearchTerms, scorePathRelevance, getStemVariants } from '../search/query-utils';
+import { isTestFile, extractSearchTerms, scorePathRelevance, getStemVariants, isDistinctiveIdentifier } from '../search/query-utils';
+import { LOW_CONFIDENCE_MARKER } from './markers';
 
 /**
  * Extract likely symbol names from a natural language query
@@ -172,6 +173,11 @@ const DEFAULT_FIND_OPTIONS: Required<FindRelevantContextOptions> = {
   nodeKinds: HIGH_VALUE_NODE_KINDS, // Filter out imports/exports by default
 };
 
+// Re-export the low-confidence sentinel (defined in a dependency-free leaf so
+// the MCP layer can import it without pulling this module's deps onto the
+// cold-start path). Builder code below uses the imported binding directly.
+export { LOW_CONFIDENCE_MARKER } from './markers';
+
 /**
  * Context Builder
  *
@@ -259,7 +265,9 @@ export class ContextBuilder {
 
     // Return formatted output or raw context
     if (opts.format === 'markdown') {
-      return formatContextAsMarkdown(context) + this.buildCallPathsSection(subgraph);
+      return formatContextAsMarkdown(context)
+        + this.buildCallPathsSection(subgraph)
+        + (subgraph.confidence === 'low' ? this.buildLowConfidenceNote(entryPoints) : '');
     } else if (opts.format === 'json') {
       return formatContextAsJson(context);
     }
@@ -267,6 +275,36 @@ export class ContextBuilder {
     return context;
   }
 
+  /**
+   * Honest handoff appended when retrieval confidence is low (the query matched
+   * mostly common words). Instead of the usual "this covers the surface" framing
+   * — which, when wrong, sends the agent off to Read/Grep — it admits the
+   * uncertainty and routes the agent to the precise tools (explore with real
+   * symbol names, search, or files to browse the closest areas we *did* surface).
+   */
+  private buildLowConfidenceNote(entryPoints: Node[]): string {
+    const dirs: string[] = [];
+    const seen = new Set<string>();
+    for (const n of entryPoints) {
+      const slash = n.filePath.lastIndexOf('/');
+      const dir = slash > 0 ? n.filePath.slice(0, slash) : n.filePath;
+      if (!seen.has(dir)) { seen.add(dir); dirs.push(dir); }
+      if (dirs.length >= 4) break;
+    }
+    const dirLine = dirs.length
+      ? `\n- \`codegraph_files\` a likely area: ${dirs.map(d => `\`${d}\``).join(', ')}`
+      : '';
+    return `\n\n${LOW_CONFIDENCE_MARKER}\n\n`
+      + 'This query matched mostly on common words, so the entry points above may '
+      + 'be off-target — treat them as a starting point, not a complete answer. '
+      + 'For a reliable result:\n'
+      + '- `codegraph_explore` with the **exact symbol names** you are after '
+      + '(class / function / method names), or\n'
+      + '- `codegraph_search <name>` for one specific symbol'
+      + dirLine
+      + '\n\nDo not assume the list above is comprehensive.';
+  }
+
   /**
    * Surface short call-paths among the symbols this context already found,
    * derived in-memory from the subgraph's `calls` edges (no extra queries).
@@ -653,6 +691,21 @@ export class ContextBuilder {
       // term group is counter-productive.
       const exactMatchIds = new Set(exactMatches.map(r => r.node.id));
 
+      // ...but only exempt exact matches the user *named as an identifier*
+      // (camelCase/snake_case/acronym). A plain dictionary word that happens to
+      // exact-match an unrelated symbol — query "flat object" → a constant named
+      // FLAT — must NOT be exempt, or the +exact-name bonus floats it to the top
+      // of a prose query with zero corroboration from any other term. Classify by
+      // the QUERY token (what the user typed), not the matched symbol's name.
+      const distinctiveTokens = new Set(
+        symbolsFromQuery.filter(isDistinctiveIdentifier).map(s => s.toLowerCase())
+      );
+      const distinctiveExactMatchIds = new Set(
+        exactMatches
+          .filter(r => distinctiveTokens.has(r.node.name.toLowerCase()))
+          .map(r => r.node.id)
+      );
+
       for (const result of searchResults) {
         // Check term matches in name (substring) and path DIRECTORIES (exact).
         // Directory segments must match exactly — "search" matches directory
@@ -672,10 +725,17 @@ export class ContextBuilder {
         if (matchCount >= 2) {
           // Multiplicative boost — 2 terms → 2x, 3 terms → 2.5x
           result.score *= 1 + matchCount * 0.5;
-        } else if (!exactMatchIds.has(result.node.id)) {
-          // Mild dampen for single-term matches — they might be generic
+        } else if (distinctiveExactMatchIds.has(result.node.id)) {
+          // Exact match on a distinctive identifier the user explicitly named —
+          // keep full score (e.g. "LiveEditMode DevServerPreview").
+        } else if (exactMatchIds.has(result.node.id)) {
+          // Exact match on a COMMON word (e.g. "flat" → FLAT): high-scoring noise
+          // inflated by the +exact-name bonus, corroborated by no other query
+          // term. Demote hard so corroborated matches win.
+          result.score *= 0.3;
+        } else {
+          // Mild dampen for other single-term matches: they might be generic
           // but could also be the right result (e.g., "Protocol" class for an IPC query).
-          // Exempt exact name matches: they are specific symbols the user queried for.
           result.score *= 0.6;
         }
       }
@@ -841,6 +901,35 @@ export class ContextBuilder {
       filteredResults = filteredResults.slice(0, opts.searchLimit);
     }
 
+    // Confidence signal for the honest-handoff footer (consumed in buildContext).
+    // A multi-term prose query that resolves only to isolated common-word matches
+    // — no entry point corroborated by 2+ distinct query terms, and none a
+    // distinctive identifier the user explicitly named — is LOW confidence: the
+    // results are best-effort, not a located answer, so the agent should be told
+    // to drill in with explore/search rather than trust the list as comprehensive.
+    // Single-keyword and symbol-name queries are exempt (their single match IS the
+    // answer), so the handoff never fires on them.
+    let confidence: 'high' | 'low' = 'high';
+    const confTerms = extractSearchTerms(query, { stems: false }).filter(t => t.length >= 3);
+    if (confTerms.length >= 2 && filteredResults.length > 0) {
+      const distinctive = new Set(
+        symbolsFromQuery.filter(isDistinctiveIdentifier).map(s => s.toLowerCase())
+      );
+      const anyStrong = filteredResults.some(r => {
+        if (distinctive.has(r.node.name.toLowerCase())) return true;
+        const nameLower = r.node.name.toLowerCase();
+        const dirSegs = path.dirname(r.node.filePath).toLowerCase().split('/');
+        let hits = 0;
+        for (const t of confTerms) {
+          if (nameLower.includes(t) || dirSegs.includes(t)) {
+            if (++hits >= 2) return true;
+          }
+        }
+        return false;
+      });
+      if (!anyStrong) confidence = 'low';
+    }
+
     // Add entry points to subgraph
     for (const result of filteredResults) {
       nodes.set(result.node.id, result.node);
@@ -1048,7 +1137,7 @@ export class ContextBuilder {
       }
     }
 
-    return { nodes: finalNodes, edges: finalEdges, roots };
+    return { nodes: finalNodes, edges: finalEdges, roots, confidence };
   }
 
   /**
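
The score adjustment in the hunks above amounts to a four-way branch on how well a result is corroborated. A minimal sketch, assuming a simplified `Scored` record and a hypothetical `adjustScore` helper (neither exists in the PR); the `1 + matchCount * 0.5` boost and the `0.3` / `0.6` multipliers are taken from the diff:

```typescript
// Hypothetical, simplified stand-in for a search result; not the PR's real type.
interface Scored {
  id: string;
  score: number;
  matchCount: number; // how many distinct query terms corroborate this result
}

// `exactIds` holds every exact-name match; `distinctiveExactIds` holds only
// those whose query token looks like an identifier the user deliberately typed.
function adjustScore(
  r: Scored,
  exactIds: Set<string>,
  distinctiveExactIds: Set<string>,
): number {
  if (r.matchCount >= 2) return r.score * (1 + r.matchCount * 0.5); // corroborated by 2+ terms
  if (distinctiveExactIds.has(r.id)) return r.score; // named identifier: keep full score
  if (exactIds.has(r.id)) return r.score * 0.3; // common-word exact match: demote hard
  return r.score * 0.6; // any other single-term match: mild dampen
}
```

The order of the branches matters: the distinctive check has to come before the general exact-match check, because every distinctive exact match is also a member of `exactIds`.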

+ 19 - 0
src/context/markers.ts

@@ -0,0 +1,19 @@
+/**
+ * Stable sentinel strings shared between the context builder (which emits them
+ * into its markdown) and the MCP layer (which detects them to adjust framing).
+ *
+ * Intentionally a dependency-free leaf module: the MCP tool layer imports this
+ * to recognise a low-confidence response, and routing that recognition through
+ * the full context module would drag its dependencies onto the cold-start path.
+ * Keep this file import-free.
+ */
+
+/**
+ * Heading that leads the honest low-confidence handoff appended to a context
+ * response when the query resolved only to weak/isolated matches. The MCP layer
+ * checks for it to suppress the contradictory "this is comprehensive, don't call
+ * explore" small-repo footer. Changing the text is a breaking sentinel change —
+ * both the emitter (`ContextBuilder`) and the detector (`src/mcp/tools.ts`)
+ * import this constant, so they stay in sync automatically.
+ */
+export const LOW_CONFIDENCE_MARKER = '### ⚠️ Low-confidence match';
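
The point of the shared sentinel is that the emitter and the detector never spell the heading out separately. A minimal sketch of that contract, assuming hypothetical `appendLowConfidenceNote` and `isLowConfidence` helpers (the real emitter is `ContextBuilder`, and the real detector in `src/mcp/tools.ts` is not shown in this section); the constant is inlined here only so the snippet is self-contained:

```typescript
// Sentinel text copied from src/context/markers.ts in the diff above.
// In the real code both sides import it instead of redeclaring it.
const LOW_CONFIDENCE_MARKER = '### ⚠️ Low-confidence match';

// Hypothetical emitter: appends the handoff heading and note to a markdown response.
function appendLowConfidenceNote(markdown: string, note: string): string {
  return `${markdown}\n\n${LOW_CONFIDENCE_MARKER}\n\n${note}`;
}

// Hypothetical detector: the consumer only needs a substring check.
function isLowConfidence(markdown: string): boolean {
  return markdown.includes(LOW_CONFIDENCE_MARKER);
}
```

Because the check is a plain substring match, any edit to the heading text has to happen in the constant itself, which is why the module comment calls it a breaking sentinel change.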

+ 129 - 18
src/extraction/tree-sitter.ts

@@ -1104,6 +1104,102 @@ export class TreeSitterExtractor {
     }
   }
 
+  /**
+   * Extract function-valued properties of an object literal as named function
+   * nodes (named by their property key). Shared by the two object-of-functions
+   * shapes in extractVariable: the object as a direct const value, and the
+   * object returned by a store-initializer call. Handles both `key: () => {}` /
+   * `key: function() {}` pairs and method shorthand `key() {}`.
+   */
+  private extractObjectLiteralFunctions(obj: SyntaxNode): void {
+    for (let i = 0; i < obj.namedChildCount; i++) {
+      const member = obj.namedChild(i);
+      if (!member) continue;
+      if (member.type === 'pair') {
+        const key = getChildByField(member, 'key');
+        const value = getChildByField(member, 'value');
+        if (key && value && (value.type === 'arrow_function' || value.type === 'function_expression')) {
+          this.extractFunction(value, this.objectKeyName(key));
+        }
+      } else if (member.type === 'method_definition') {
+        // Method shorthand: `{ fetchUser() {...} }`. extractMethod deliberately
+        // skips object-literal methods, so route through extractFunction with an
+        // explicit name (method_definition exposes a `body` field, so resolveBody
+        // falls through to it and the node spans the full method).
+        const key = getChildByField(member, 'name');
+        if (key) this.extractFunction(member, this.objectKeyName(key));
+      }
+    }
+  }
+
+  /** Property-key text with surrounding quotes stripped (`'foo'` → `foo`). */
+  private objectKeyName(key: SyntaxNode): string {
+    return getNodeText(key, this.source).replace(/^['"`]|['"`]$/g, '');
+  }
+
+  /**
+   * Given a `call_expression` initializer (`create((set, get) => ({...}))`),
+   * find the object literal RETURNED by a function argument — descending through
+   * nested call_expression arguments so middleware wrappers are unwrapped
+   * (`create(persist((set, get) => ({...}), {...}))`, devtools, immer,
+   * subscribeWithSelector). Returns null when no such object is found — the
+   * common case for ordinary call initializers — so this stays cheap and silent
+   * rather than guessing. Keyed purely on AST shape; no library names.
+   */
+  private findInitializerReturnedObject(callNode: SyntaxNode, depth = 0): SyntaxNode | null {
+    if (depth > 4) return null;
+    const args = getChildByField(callNode, 'arguments');
+    if (!args) return null;
+    for (let i = 0; i < args.namedChildCount; i++) {
+      const arg = args.namedChild(i);
+      if (!arg) continue;
+      if (arg.type === 'arrow_function' || arg.type === 'function_expression') {
+        const obj = this.functionReturnedObject(arg);
+        if (obj) return obj;
+      } else if (arg.type === 'call_expression') {
+        const obj = this.findInitializerReturnedObject(arg, depth + 1);
+        if (obj) return obj;
+      }
+    }
+    return null;
+  }
+
+  /**
+   * The object literal a function expression returns — either the `=> ({...})`
+   * arrow form (a parenthesized_expression wrapping an object) or a
+   * `=> { return {...} }` block. Returns null for any other body shape.
+   */
+  private functionReturnedObject(fnNode: SyntaxNode): SyntaxNode | null {
+    const body = getChildByField(fnNode, 'body');
+    if (!body) return null;
+    const asObject = (n: SyntaxNode | null): SyntaxNode | null => {
+      if (!n) return null;
+      if (n.type === 'object' || n.type === 'object_expression') return n;
+      if (n.type === 'parenthesized_expression') {
+        for (let i = 0; i < n.namedChildCount; i++) {
+          const inner = asObject(n.namedChild(i));
+          if (inner) return inner;
+        }
+      }
+      return null;
+    };
+    // `(set, get) => ({...})` — body is the (parenthesized) object directly.
+    const direct = asObject(body);
+    if (direct) return direct;
+    // `(set, get) => { return {...} }` — scan top-level return statements.
+    if (body.type === 'statement_block') {
+      for (let i = 0; i < body.namedChildCount; i++) {
+        const stmt = body.namedChild(i);
+        if (stmt?.type !== 'return_statement') continue;
+        for (let j = 0; j < stmt.namedChildCount; j++) {
+          const obj = asObject(stmt.namedChild(j));
+          if (obj) return obj;
+        }
+      }
+    }
+    return null;
+  }
+
   /**
    * Extract a variable declaration (const, let, var, etc.)
    *
@@ -1162,29 +1258,44 @@ export class TreeSitterExtractor {
               this.extractVariableTypeAnnotation(child, varNode.id);
             }
 
+            // Exported const object-of-functions — extract each function-valued
+            // property as a function named by its key + walk its body so its
+            // calls are captured. Two shapes, both keyed on AST shape (not on any
+            // library name):
+            //   `export const actions = { default: async () => {} }` — object is
+            //     the DIRECT value (SvelteKit form actions / handler maps / route
+            //     tables).
+            //   `export const useStore = create((set, get) => ({ fetchUser:
+            //     async () => {} }))` — object is RETURNED by an initializer call,
+            //     possibly through middleware wrappers (persist/devtools/immer).
+            //     Covers Zustand/Redux/Pinia/MobX stores generically. Without
+            //     this, store actions exist only as object-literal properties —
+            //     never nodes — so `node`/`callers` on `fetchUser` return "not
+            //     found" and the agent Reads the store to reconstruct the flow.
+            // Scoped to EXPORTED consts to exclude inline-object noise
+            // (`ctx.set({...})`) the object-method skip deliberately avoids.
+            const objectOfFns =
+              valueNode && (valueNode.type === 'object' || valueNode.type === 'object_expression')
+                ? valueNode
+                : valueNode?.type === 'call_expression'
+                  ? this.findInitializerReturnedObject(valueNode)
+                  : null;
+            const extractObjectMethods = isExported && !!objectOfFns;
+
+            // Visit the initializer body for calls — EXCEPT object literals (their
+            // function-valued properties are extracted below) and the store-factory
+            // call whose returned object we extract method-by-method below (walking
+            // the whole call would re-visit those method arrows and mis-attribute
+            // their inner calls to the file/module scope).
             if (valueNode &&
                 valueNode.type !== 'object' &&
-                valueNode.type !== 'object_expression') {
+                valueNode.type !== 'object_expression' &&
+                !(extractObjectMethods && valueNode.type === 'call_expression')) {
               this.visitFunctionBody(valueNode, '');
             }
 
-            // Exported const object-of-functions: `export const actions =
-            // { default: async () => {} }` (SvelteKit form actions / handler maps
-            // / route tables). Extract each function-valued property as a function
-            // named by its key + walk its body so its calls (e.g. api.post) are
-            // captured. Scoped to EXPORTED consts to exclude the inline-object
-            // noise (`ctx.set({...})`) the object-method skip deliberately avoids.
-            if (isExported && valueNode &&
-                (valueNode.type === 'object' || valueNode.type === 'object_expression')) {
-              for (let j = 0; j < valueNode.namedChildCount; j++) {
-                const pair = valueNode.namedChild(j);
-                if (pair?.type !== 'pair') continue;
-                const v = getChildByField(pair, 'value');
-                const k = getChildByField(pair, 'key');
-                if (k && v && (v.type === 'arrow_function' || v.type === 'function_expression')) {
-                  this.extractFunction(v, getNodeText(k, this.source).replace(/^['"`]|['"`]$/g, ''));
-                }
-              }
+            if (extractObjectMethods && objectOfFns) {
+              this.extractObjectLiteralFunctions(objectOfFns);
             }
           }
         }
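
As a rough illustration of the object-of-functions rule in this hunk, the sketch below walks the pairs of an object literal and keeps the keys whose values are arrow functions or function expressions, stripping quotes from string keys the way the removed loop did. `SketchNode` and `functionValuedKeys` are hypothetical stand-ins: the real extractor operates on tree-sitter nodes, and the body of `extractObjectLiteralFunctions` is not shown in this diff.

```typescript
// Simplified stand-in for a syntax-tree node (assumption: not the tree-sitter API).
interface SketchNode {
  type: string;
  text?: string;
  key?: SketchNode;
  value?: SketchNode;
  children?: SketchNode[];
}

const FUNCTION_TYPES = new Set(['arrow_function', 'function_expression']);

// Return the property names whose values are functions, in source order.
function functionValuedKeys(objectNode: SketchNode): string[] {
  const names: string[] = [];
  for (const pair of objectNode.children ?? []) {
    if (pair.type !== 'pair') continue; // skips spreads, shorthand, etc.
    const { key, value } = pair;
    if (!key || !value || !FUNCTION_TYPES.has(value.type)) continue;
    // Strip one leading and one trailing quote character from string keys.
    names.push((key.text ?? '').replace(/^['"`]|['"`]$/g, ''));
  }
  return names;
}

// Models: export const actions = { default: async () => {}, 'save-draft': function () {}, retries: 3, ...rest }
const actions: SketchNode = {
  type: 'object',
  children: [
    { type: 'pair', key: { type: 'property_identifier', text: 'default' }, value: { type: 'arrow_function' } },
    { type: 'pair', key: { type: 'string', text: "'save-draft'" }, value: { type: 'function_expression' } },
    { type: 'pair', key: { type: 'property_identifier', text: 'retries' }, value: { type: 'number', text: '3' } },
    { type: 'spread_element' },
  ],
};

console.log(functionValuedKeys(actions)); // [ 'default', 'save-draft' ]
```

Only the two function-valued properties survive; the numeric property and the spread are ignored, which matches the intent of indexing handlers rather than data.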

+ 10 - 1
src/index.ts

@@ -681,6 +681,15 @@ export class CodeGraph {
     return this.queries.getNodesByKind(kind);
   }
 
+  /**
+   * Get ALL nodes with an exact name (direct index lookup, not FTS-ranked/capped).
+   * Used to enumerate every overload of a heavily-overloaded name so the specific
+   * definition the caller wants is never dropped below a search cut.
+   */
+  getNodesByName(name: string): Node[] {
+    return this.queries.getNodesByName(name);
+  }
+
   /**
    * Search nodes by text
    */
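
To show why an exact-name lookup matters for overloads, here is a minimal in-memory sketch. `SketchIndex`, `searchCapped`, and `pickOverload` are hypothetical; the real `getNodesByName` delegates to `this.queries`, whose implementation is not part of this diff. The `file`/`line` selector mirrors the pin described for `codegraph_node` in the PR summary.

```typescript
interface SketchSymbol {
  name: string;
  file: string;
  line: number;
}

class SketchIndex {
  private readonly byName = new Map<string, SketchSymbol[]>();

  add(symbol: SketchSymbol): void {
    const bucket = this.byName.get(symbol.name) ?? [];
    bucket.push(symbol);
    this.byName.set(symbol.name, bucket);
  }

  // Exact-name lookup: every definition, no ranking, no cap.
  getNodesByName(name: string): SketchSymbol[] {
    return this.byName.get(name) ?? [];
  }

  // Stand-in for a capped search: only the first `limit` hits survive.
  searchCapped(name: string, limit: number): SketchSymbol[] {
    return this.getNodesByName(name).slice(0, limit);
  }
}

// Narrow a set of overloads with an optional file and/or line selector.
function pickOverload(
  nodes: SketchSymbol[],
  selector: { file?: string; line?: number },
): SketchSymbol[] {
  return nodes.filter(
    (n) =>
      (selector.file === undefined || n.file === selector.file) &&
      (selector.line === undefined || n.line === selector.line),
  );
}

const index = new SketchIndex();
index.add({ name: 'render', file: 'src/Scene.ts', line: 10 });
index.add({ name: 'render', file: 'src/Canvas.ts', line: 42 });
index.add({ name: 'render', file: 'src/Export.ts', line: 7 });

console.log(index.searchCapped('render', 2).length);   // 2
console.log(index.getNodesByName('render').length);    // 3
console.log(pickOverload(index.getNodesByName('render'), { file: 'src/Export.ts' }));
```

With a cap of 2 the `src/Export.ts` overload is lost, while the exact-name lookup keeps all three and the selector can then pin the one the caller wants.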
@@ -692,7 +701,7 @@ export class CodeGraph {
    * Find the project's "primary route file" — the file with the densest
    * concentration of framework-emitted `route` nodes (≥3 routes, ≥30%
    * of all non-test routes). Used to inline the routing config in
-   * `codegraph_context` responses on small realworld template repos
+   * `codegraph_explore` responses on small realworld template repos
    * (rails-realworld, laravel-realworld, drupal-admintoolbar, …) where
    * Glob+Read of `routes.rb`/`urls.py`/etc. otherwise beats codegraph.
    */
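
The density rule in the comment above (at least 3 routes and at least 30% of all non-test routes) can be sketched as follows. `SketchRoute` and `primaryRouteFile` are hypothetical names, and the tie-breaking and the way test routes are detected are assumptions; only the two thresholds come from the source.

```typescript
interface SketchRoute {
  file: string;
  isTest: boolean; // assumption: test routes are already flagged upstream
}

// Return the file holding the densest share of non-test routes, or null
// when no file reaches both thresholds.
function primaryRouteFile(routes: SketchRoute[]): string | null {
  const nonTest = routes.filter((r) => !r.isTest);
  if (nonTest.length === 0) return null;

  const counts = new Map<string, number>();
  for (const r of nonTest) {
    counts.set(r.file, (counts.get(r.file) ?? 0) + 1);
  }

  let best: string | null = null;
  let bestCount = 0;
  for (const [file, count] of Array.from(counts)) {
    if (count > bestCount) {
      best = file;
      bestCount = count;
    }
  }

  const dense = bestCount >= 3 && bestCount / nonTest.length >= 0.3;
  return dense ? best : null;
}

const routes: SketchRoute[] = [
  { file: 'config/routes.rb', isTest: false },
  { file: 'config/routes.rb', isTest: false },
  { file: 'config/routes.rb', isTest: false },
  { file: 'config/routes.rb', isTest: false },
  { file: 'engines/admin/routes.rb', isTest: false },
  { file: 'spec/routes_spec.rb', isTest: true },
];

console.log(primaryRouteFile(routes)); // config/routes.rb
```

Here `config/routes.rb` holds 4 of the 5 non-test routes, so it clears both thresholds; a repo whose routes are spread thinly across many files would return `null`.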

+ 3 - 2
src/installer/targets/shared.ts

@@ -31,12 +31,13 @@ export function getMcpServerConfig(): { type: string; command: string; args: str
  */
 export function getCodeGraphPermissions(): string[] {
   return [
+    'mcp__codegraph__codegraph_explore',
     'mcp__codegraph__codegraph_search',
-    'mcp__codegraph__codegraph_context',
+    'mcp__codegraph__codegraph_node',
     'mcp__codegraph__codegraph_callers',
     'mcp__codegraph__codegraph_callees',
     'mcp__codegraph__codegraph_impact',
-    'mcp__codegraph__codegraph_node',
+    'mcp__codegraph__codegraph_files',
     'mcp__codegraph__codegraph_status',
   ];
 }

+ 18 - 19
src/mcp/server-instructions.ts

@@ -25,32 +25,31 @@ editing code, not during.
 ## Answer directly — don't delegate exploration
 
 For "how does X work", architecture, trace, or where-is-X questions,
-answer DIRECTLY using 2-3 codegraph calls: \`codegraph_context\` first,
-then ONE \`codegraph_explore\` for the source of the symbols it surfaces.
-Codegraph IS the pre-built search index — so delegating the lookup to a
-separate file-reading sub-task/agent, or running your own grep + read
-loop, repeats work codegraph already did and costs more for the same
-answer. Reach for raw Read/Grep only to confirm a specific detail
-codegraph didn't cover. A direct codegraph answer is typically a handful
-of calls; a grep/read exploration is dozens.
+answer DIRECTLY — usually with ONE \`codegraph_explore\` call.
+\`codegraph_explore\` takes either a natural-language question or a bag of
+symbol/file names and returns the verbatim source of the relevant symbols
+grouped by file, so it is Read-equivalent and most often the ONLY
+codegraph call you need. Codegraph IS the pre-built search index — so
+delegating the lookup to a separate file-reading sub-task/agent, or
+running your own grep + read loop, repeats work codegraph already did and
+costs more for the same answer. Reach for raw Read/Grep only to confirm a
+specific detail codegraph didn't cover. A direct codegraph answer is
+typically one to a few calls; a grep/read exploration is dozens.
 
 ## Tool selection by intent
 
-- **"What is the symbol named X?"** → \`codegraph_search\`
-- **"What's the deal with this task / feature / area?"** → \`codegraph_context\` (PRIMARY — composes search + node + callers + callees in one call)
-- **"How does X reach/become Y? / trace the flow / the path from X to Y"** → \`codegraph_trace\` (ONE call returns the whole call path, including dynamic-dispatch hops — callbacks, React re-render, JSX children — that grep can't follow)
-- **"What calls this?"** → \`codegraph_callers\`
-- **"What does this call?"** → \`codegraph_callees\`
-- **"What would changing this break?"** → \`codegraph_impact\`
-- **"Show me this symbol's source / signature / docstring."** → \`codegraph_node\`
-- **"Show me several related symbols' source / survey an area."** → \`codegraph_explore\` (ONE capped call; prefer over many codegraph_node/Read)
+- **Almost any question — "how does X work", architecture, a bug, "what/where is X", or surveying an area** → \`codegraph_explore\` (PRIMARY — call FIRST; ONE capped call returns the verbatim source of the relevant symbols grouped by file; most often the ONLY call you need)
+- **"How does X reach/become Y? / the flow / the path from X to Y"** → \`codegraph_explore\`, naming the symbols that span the flow (e.g. \`mutateElement renderScene\`) — it surfaces the call path among them, including dynamic-dispatch hops (callbacks, React re-render, JSX children) grep can't follow
+- **"What is the symbol named X?" (just its location)** → \`codegraph_search\`
+- **"What calls this?" / "What does this call?" / "What would changing this break?"** → \`codegraph_callers\` / \`codegraph_callees\` / \`codegraph_impact\`
+- **One specific symbol's full source (esp. a body \`codegraph_explore\` trimmed), or an OVERLOADED name** → \`codegraph_node\` (with \`includeCode\`): for an ambiguous name it returns EVERY matching definition's body in one call, so you never Read a file to find the right overload
 - **"What's in directory X?"** → \`codegraph_files\`
 - **"Is the index ready / what's its size?"** → \`codegraph_status\`
 
 ## Common chains
 
-- **Flow / "how does X reach Y"**: \`codegraph_trace\` from→to FIRST — one call returns the entire path with dynamic-dispatch hops bridged. Then ONE \`codegraph_explore\` for the hop bodies if you need them. Do NOT reconstruct the path with \`codegraph_search\` + \`codegraph_callers\` — that's exactly what trace does in a single call.
-- **Onboarding**: \`codegraph_context\` first. If still unclear, \`codegraph_explore\` for breadth, then \`codegraph_node\` on specific symbols.
+- **Flow / "how does X reach Y"**: ONE \`codegraph_explore\` with the symbol names spanning the flow — it surfaces the call path among them (riding dynamic-dispatch hops) AND returns their source. No need to reconstruct the path with \`codegraph_search\` + \`codegraph_callers\`.
+- **Onboarding / understanding any area**: ONE \`codegraph_explore\` is usually the whole answer. Only follow up — \`codegraph_node\` for a specific symbol — if something is still unclear.
 - **Refactor planning**: \`codegraph_search\` → \`codegraph_callers\` → \`codegraph_impact\`. The blast-radius answer comes from impact, not from walking callers manually.
 - **Debugging a regression**: \`codegraph_callers\` of the suspected symbol; widen with \`codegraph_impact\` if an unexpected call appears.
 
@@ -58,7 +57,7 @@ of calls; a grep/read exploration is dozens.
 
 - **Trust codegraph's results — don't re-verify them with grep.** They come from a full AST parse; re-checking with grep is slower, less accurate, and wastes context.
 - **Don't grep first** when looking up a symbol by name — \`codegraph_search\` is faster and returns kind + location + signature.
-- **Don't chain \`codegraph_search\` + \`codegraph_node\`** when you just want context — \`codegraph_context\` is one round-trip.
+- **Don't chain \`codegraph_search\` + \`codegraph_node\`** to understand an area — ONE \`codegraph_explore\` returns the relevant symbols' source together in a single round-trip.
 - **Don't loop \`codegraph_node\` over many symbols** — one \`codegraph_explore\` call returns them all grouped by file, while each separate call re-reads the whole context and costs far more. Use \`codegraph_node\` for a single symbol.
 - **After editing, check the staleness banner.** When a tool response starts with "⚠️ Some files referenced below were edited since the last index sync…", the listed files are pending re-index — Read those specific files for accurate content. Every file NOT in that banner is fresh, so still trust codegraph. \`codegraph_status\` also lists pending files under "Pending sync".
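
For orientation, a single flow question as described above might be issued as one MCP `tools/call` request. The JSON-RPC envelope follows the MCP tool-call shape; the argument key `query` is an assumption, since the tool schema in `src/mcp/tools.ts` is not shown in this diff, and `exploreRequest` is a hypothetical helper.

```typescript
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Build one explore call. `query` may be a natural-language question or a
// bag of symbol/file names (argument name assumed, not confirmed).
function exploreRequest(id: number, query: string): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'codegraph_explore', arguments: { query } },
  };
}

// Flow question: name the symbols that span the path.
const request = exploreRequest(1, 'mutateElement renderScene');
console.log(JSON.stringify(request, null, 2));
```

The point of the sketch is the shape of the interaction: one request naming both ends of the flow, rather than a search call followed by several caller lookups.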
 

File diff suppressed because it is too large
+ 179 - 740
src/mcp/tools.ts


+ 27 - 0
src/search/query-utils.ts

@@ -340,3 +340,30 @@ export function kindBonus(kind: Node['kind']): number {
   };
   return bonuses[kind] ?? 0;
 }
+
+/**
+ * Whether a query token looks like a code identifier the user deliberately typed
+ * (camelCase / PascalCase-with-internal-caps / snake_case / has a digit) rather
+ * than a plain dictionary word ("flat", "object", "screen").
+ *
+ * Used to decide whether an EXACT name match earns the "the user named this
+ * symbol" exemption from single-term dampening. A common English word that
+ * happens to exact-match an unrelated symbol — the query "flat object" matching
+ * a constant named `FLAT` — must NOT get that exemption, or the +exact-name
+ * bonus floats it to the top of a prose query on its own.
+ *
+ * Classifies the token AS THE USER TYPED IT, not the matched symbol's name:
+ * "flat" (lowercase, descriptive) is non-distinctive even though it matches
+ * `FLAT`. A leading-capital-only word ("Screen", "Zustand") is also treated as
+ * a plain word — sentence-start capitalization and proper nouns aren't reliable
+ * identifier signals.
+ */
+export function isDistinctiveIdentifier(token: string): boolean {
+  if (!token) return false;
+  // snake_case / SCREAMING_SNAKE, or an embedded digit → a deliberate identifier.
+  if (/[_0-9]/.test(token)) return true;
+  // An uppercase letter anywhere AFTER the first char → a camelCase/PascalCase
+  // boundary (setLastEmail, OrgUserStore) or an acronym (REST, HTTP).
+  if (/[A-Z]/.test(token.slice(1))) return true;
+  return false;
+}
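
A short usage sketch of the classifier added above. The function body is copied from the diff so the example is self-contained; `distinctiveTokens` is a hypothetical helper that simply filters a whitespace-split query, and is not claimed to be how the ranking code consumes the result.

```typescript
// Copied from the diff above.
function isDistinctiveIdentifier(token: string): boolean {
  if (!token) return false;
  if (/[_0-9]/.test(token)) return true;
  if (/[A-Z]/.test(token.slice(1))) return true;
  return false;
}

// Hypothetical helper: keep only the tokens that read as deliberate identifiers.
function distinctiveTokens(query: string): string[] {
  return query.split(/\s+/).filter(isDistinctiveIdentifier);
}

console.log(distinctiveTokens('flat object'));
// [] (both are plain dictionary words)

console.log(
  distinctiveTokens('Screen state for setLastEmail in OrgUserStore via user_id over REST'),
);
// [ 'setLastEmail', 'OrgUserStore', 'user_id', 'REST' ]
```

Note that the token is classified as typed: `flat` is non-distinctive, while the same word typed as `FLAT` would pass the uppercase-after-first-character check.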

+ 9 - 0
src/types.ts

@@ -311,6 +311,15 @@ export interface Subgraph {
 
   /** Root node IDs (entry points) */
   roots: string[];
+
+  /**
+   * Retrieval confidence for context-style queries. `'low'` means the query
+   * resolved only to isolated common-word matches (no entry point corroborated
+   * by 2+ distinct query terms) — callers should surface an honest handoff to
+   * explore/trace rather than present the results as comprehensive. Undefined
+   * for graph traversals that don't run the search-ranking path.
+   */
+  confidence?: 'high' | 'low';
 }
 
 /**
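
One way to read the `confidence` rule documented above: a result is `'high'` only if at least one entry point is corroborated by two or more distinct query terms. The sketch below encodes just that sentence; `SketchEntryPoint` and `retrievalConfidence` are hypothetical, and the real ranking path may weigh additional signals.

```typescript
type Confidence = 'high' | 'low';

interface SketchEntryPoint {
  id: string;
  matchedTerms: string[]; // query terms that matched this entry point
}

// 'high' when any entry point is backed by 2+ distinct query terms,
// otherwise 'low' (isolated single-term matches only).
function retrievalConfidence(entries: SketchEntryPoint[]): Confidence {
  const corroborated = entries.some((e) => new Set(e.matchedTerms).size >= 2);
  return corroborated ? 'high' : 'low';
}

console.log(
  retrievalConfidence([
    { id: 'FLAT', matchedTerms: ['flat'] },
    { id: 'toObject', matchedTerms: ['object', 'object'] },
  ]),
); // low (the repeated term counts once)

console.log(
  retrievalConfidence([{ id: 'flattenObject', matchedTerms: ['flat', 'object'] }]),
); // high
```

A caller receiving `'low'` would then present the results as partial rather than comprehensive, as the field's comment recommends.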

Some files were not shown because too many files have changed in this diff