1 month ago · 25cba9ad7b
--- a/docs/design/callback-edge-synthesis.md
+++ b/docs/design/callback-edge-synthesis.md
@@ -0,0 +1,179 @@
 
				+# Design + status: general callback / observer edge synthesis
			
 
				+
			
 
				+**Status:** Phases 1–3 implemented & validated as a **prototype, uncommitted on `main`**
			
 
				+(as of 2026-05-22). This doc is the handoff for continuing the work.
			
 
				+**Motivation:** close the dynamic-dispatch hole that static extraction leaves for
			
 
				+observer / event-emitter / signal patterns, where a *dispatcher* invokes callbacks
			
 
				+registered elsewhere through a shared store — so flows like "how does an update
			
 
				+reach the screen" actually exist in the graph.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## TL;DR for a new session
			
 
				+
			
 
				+We synthesize `dispatcher → callback` edges that static parsing misses. It works:
			
 
				+
			
 
				+- **Field observer** (excalidraw `Scene.onUpdate`/`triggerUpdate`): synthesizes
			
 
				+  `triggerUpdate → triggerRender`. `trace(mutateElement, triggerRender)` now = 3 hops.
			
 
				+- **EventEmitter** (express `on('mount', …)`/`emit('mount')`): synthesizes `use → onmount`.
			
 
				+- Precision is high: excalidraw got **1** synthesized edge out of 27k (the correct one);
			
 
				+  node count moved +3 after Phase 3 (no explosion).
			
 
				+
			
 
				+**Files touched (all uncommitted on `main`):**
			
 
				+- `src/resolution/callback-synthesizer.ts` — the whole-graph synthesis pass (Phase 1 + 2).
			
 
				+- `src/resolution/index.ts` — calls `synthesizeCallbackEdges()` at the end of
			
 
				+  `resolveAndPersistBatched()` (after base edges are persisted) + the import.
			
 
				+- `src/extraction/tree-sitter.ts` — `visitFunctionBody` now extracts **named** nested
			
 
				+  functions (Phase 3), so inline named handlers become linkable nodes.
			
 
				+
			
 
				+**How to reproduce / test:**
			
 
				+```bash
			
 
				+npm run build
			
 
				+rm -rf /tmp/codegraph-corpus/excalidraw/.codegraph
			
 
				+( cd /tmp/codegraph-corpus/excalidraw && codegraph init -i )
			
 
				+# synthesized edges (provenance='heuristic', metadata.synthesizedBy in {callback,event-emitter}):
			
 
				+sqlite3 /tmp/codegraph-corpus/excalidraw/.codegraph/codegraph.db \
				+  "select s.name||' → '||t.name||'  '||coalesce(e.metadata,'') from edges e \
				+   join nodes s on e.source=s.id join nodes t on e.target=t.id where e.provenance='heuristic';"
			
 
				+# end-to-end trace (uses the dev probes):
			
 
				+node scripts/agent-eval/probe-trace.mjs /tmp/codegraph-corpus/excalidraw triggerUpdate triggerRender
			
 
				+```
			
 
				+Probe scripts (dev-only, in `scripts/agent-eval/`): `probe-node.mjs` (symbol + trail),
			
 
				+`probe-trace.mjs` (call path), `probe-context.mjs`, `probe-explore.mjs`. EventEmitter
			
 
				+fixture lives at `/tmp/cb-fixture/bus.js` (ephemeral — recreate or move into `__tests__/`).
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## The hole
			
 
				+
			
 
				+```ts
			
 
				+class Scene {
			
 
				+  private callbacks = new Set<Callback>();
			
 
				+  onUpdate(cb: Callback) { this.callbacks.add(cb); }          // REGISTRAR
			
 
				+  triggerUpdate() { for (const cb of this.callbacks) cb(); }  // DISPATCHER
			
 
				+}
			
 
				+this.scene.onUpdate(this.triggerRender);                      // REGISTRATION SITE
			
 
				+```
			
 
				+
			
 
				+The runtime edge `triggerUpdate → triggerRender` does not exist statically:
			
 
				+`triggerUpdate`'s only literal call is `cb()` (anonymous). Measured: `triggerUpdate`'s
			
 
				+only callee was `randomInteger`; `trace(triggerUpdate, triggerRender)` returned no path.
			
 
				+
			
 
				+## Why it's a whole-graph pass, not a `FrameworkResolver.resolve()`
			
 
				+
			
 
				+`resolve(ref)` answers "what does this **named** ref point to," one ref at a time. The
			
 
				+callback edge has **no ref to resolve** (`cb()` is anonymous) and needs **cross-file,
			
 
				+multi-site correlation** (registrar, registration, dispatcher). So it's a whole-graph
			
 
				+pass after base resolution, language-level (any OO observer), living in
			
 
				+`src/resolution/callback-synthesizer.ts` — **not** under `frameworks/`.
			
 
				+
			
 
				+> Sibling mechanism for the *other* dynamic-dispatch class — **named** attribute/
			
 
				+> descriptor dispatch (e.g. django `self._iterable_class(...)`) — is the
			
 
				+> `claimsReference` hook (`resolution/types.ts` + `resolution/index.ts` pre-filter)
			
 
				+> + a `FrameworkResolver.resolve()` (django ORM resolver in `frameworks/python.ts`).
			
 
				+> That one *does* fit `resolve()` because the ref is named. Both are part of the same
			
 
				+> coverage effort; see the "Related work" section.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## As-built algorithm (and where it diverged from the original design)
			
 
				+
			
 
				+### Field-observer channels (`fieldChannelEdges`, Phase 1)
			
 
				+1. **Candidates** by method/function **name** — registrar `^(on[A-Z]\w*|subscribe|
			
 
				+   addListener|addEventListener|register|watch|listen|addCallback)$`; dispatcher
			
 
				+   contains `(emit|trigger|notify|dispatch|fire|publish|flush)`.
			
 
				+2. **Confirm by body** (read via `ctx.readFile` + slice node lines): registrar has
			
 
				+   `this.<F>.add|push|set(`; dispatcher has `for (… of [Array.from(]this.<F>)` + a call,
			
 
				+   or `this.<F>.forEach(`.
			
 
				+3. **Pairing — DIVERGENCE:** the design said pair by *class*; the build pairs by
			
 
				+   **same file + same field `F`** (file as a class proxy — getting the containing class
			
 
				+   reliably was harder). Works for the common 1-class-per-file case; revisit for
			
 
				+   multi-class files.
			
 
				+4. **Registrations:** `queries.getIncomingEdges(registrar.id, ['calls'])` → for each,
			
 
				+   read the caller's source at the edge line and **regex-recover the arg**
			
 
				+   (`<registrarName>\s*\(\s*(?:this\.)?(\w+)`). DIVERGENCE: design preferred tree-sitter
			
 
				+   re-parse; build uses regex (named refs only — arrows/inline args are missed here).
			
 
				+5. **Synthesize** `dispatcher → fn` (`getNodesByName(arg)` → method|function). Capped at
			
 
				+   `MAX_CALLBACKS_PER_CHANNEL = 40`.
			
 
				+
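The name filter (step 1) and the regex-based argument recovery (step 4) can be sketched as below. This is a minimal illustration, not the code in `callback-synthesizer.ts`: `recoverCallbackArg` is a hypothetical helper name, the two patterns are transcribed from step 1 on the assumption that the wrapped regex joins without whitespace, and the registrar name is assumed to be a plain identifier with no regex metacharacters.

```typescript
// Name patterns transcribed from step 1 (assumption: no whitespace at the line wrap).
const REGISTRAR_NAME =
  /^(on[A-Z]\w*|subscribe|addListener|addEventListener|register|watch|listen|addCallback)$/;
const DISPATCHER_NAME = /(emit|trigger|notify|dispatch|fire|publish|flush)/;

// Hypothetical helper: recover the first argument of a registration call from
// the caller's source line. It is only meaningful for named references; an
// arrow written as `() => ...` yields null, and the caller still has to check
// that the recovered name resolves to a real method or function node.
function recoverCallbackArg(registrarName: string, line: string): string | null {
  const re = new RegExp(`${registrarName}\\s*\\(\\s*(?:this\\.)?(\\w+)`);
  const match = re.exec(line);
  return match?.[1] ?? null;
}

console.log(REGISTRAR_NAME.test("onUpdate")); // true
console.log(DISPATCHER_NAME.test("triggerUpdate")); // true
console.log(recoverCallbackArg("onUpdate", "this.scene.onUpdate(this.triggerRender);")); // triggerRender
console.log(recoverCallbackArg("onUpdate", "scene.onUpdate(() => render());")); // null
```

The `null` result for the arrow case is the recall gap that step 4 calls out: the registration is seen, but no named handler can be recovered from it.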
			
 
				+### EventEmitter channels (`eventEmitterEdges`, Phase 2)
			
 
				+- **File-oriented scan** (`ctx.getAllFiles()` + `readFile`, substring pre-filter on
			
 
				+  `.emit(`/`.on(`/etc). `ON_RE` = `\.(?:on|once|addListener)\(\s*['"]([^'"]+)['"]\s*,\s*
			
 
				+  (?:function\s+(\w+)|(?:this\.)?(\w+))`; `EMIT_RE` = `\.(?:emit|fire|dispatchEvent)\(\s*['"]([^'"]+)['"]`.
			
 
				+- Dispatcher = **enclosing function** of the `emit('e')` call (`enclosingFn` finds the
			
 
				+  tightest function/method/component node containing the line). Handler = `getNodesByName`
			
 
				+  of the on-handler name.
			
 
				+- Correlate by **event-name literal**; synthesize dispatcher → handler.
			
 
				+- **Precision — DIVERGENCE:** design proposed receiver-type matching; build uses an
			
 
				+  **event fan-out cap** (`EVENT_FANOUT_CAP = 6`) — skip events with >6 handlers or
			
 
				+  dispatchers (generic names like `error`/`change` would over-link without type info).
			
 
				+
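The event-name correlation and fan-out guard described above can be sketched as follows. The record shapes (`Registration`, `Emission`, `SynthEdge`) and the function names are hypothetical and exist only for this illustration; the cap value is the one quoted in the text. The sketch assumes the regex scan has already produced `(event, handler)` and `(event, dispatcher)` pairs.

```typescript
const EVENT_FANOUT_CAP = 6; // value quoted from the text above

// Hypothetical record shapes for this sketch only.
interface Registration { event: string; handler: string }
interface Emission { event: string; dispatcher: string }
interface SynthEdge { source: string; target: string; event: string }

// Group distinct names by event literal.
function groupByEvent<T extends { event: string }>(
  items: T[],
  pick: (item: T) => string,
): Map<string, Set<string>> {
  const groups = new Map<string, Set<string>>();
  for (const item of items) {
    const names = groups.get(item.event) ?? new Set<string>();
    names.add(pick(item));
    groups.set(item.event, names);
  }
  return groups;
}

function correlateByEvent(regs: Registration[], emits: Emission[]): SynthEdge[] {
  const handlers = groupByEvent(regs, (r) => r.handler);
  const dispatchers = groupByEvent(emits, (e) => e.dispatcher);
  const edges: SynthEdge[] = [];
  handlers.forEach((handlerNames, event) => {
    const dispatcherNames = dispatchers.get(event);
    if (!dispatcherNames) return; // registered but never emitted
    // Fan-out guard: skip events with too many handlers or dispatchers.
    if (handlerNames.size > EVENT_FANOUT_CAP || dispatcherNames.size > EVENT_FANOUT_CAP) return;
    dispatcherNames.forEach((source) => {
      handlerNames.forEach((target) => edges.push({ source, target, event }));
    });
  });
  return edges;
}

// One specific event plus a generic 'error' event with 7 handlers.
const regs: Registration[] = [
  { event: "mount", handler: "onmount" },
  ...Array.from({ length: 7 }, (_, i) => ({ event: "error", handler: `h${i}` })),
];
const emits: Emission[] = [
  { event: "mount", dispatcher: "use" },
  { event: "error", dispatcher: "fail" },
];
const edges = correlateByEvent(regs, emits);
console.log(edges.length, edges[0]?.source, edges[0]?.target); // 1 use onmount
```

The generic `error` event is dropped entirely rather than truncated to six handlers, which matches the "skip" wording above.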
			
 
				+### Provenance — DIVERGENCE
			
 
				+`Edge.provenance` is a fixed enum (`'tree-sitter'|'scip'|'heuristic'`), so synthesized
			
 
				+edges use **`provenance: 'heuristic'`** + `metadata: { synthesizedBy: 'callback'|
			
 
				+'event-emitter', via/event/field }`. The design's `'callback-synthesis'` provenance and
			
 
				+high/medium/low **confidence tiers were NOT implemented** — the fan-out cap +
			
 
				+registrar-name uniqueness + named-only handlers are the precision guards instead.
			
 
				+
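As a rough sketch of the shape this gives a synthesized edge: only the three provenance values and the `synthesizedBy` values come from the text above. The interface names, the `source`/`target` fields (borrowed from the SQL query earlier), and the example `field` value are assumptions for illustration, not the real `Edge` type.

```typescript
// Provenance values quoted from the text; every other name here is an assumption.
type Provenance = "tree-sitter" | "scip" | "heuristic";

interface SynthesisMetadata {
  synthesizedBy: "callback" | "event-emitter";
  via?: string;   // listed in the text; exact meaning not specified there
  event?: string; // event-name literal, for EventEmitter channels
  field?: string; // backing store field, for field-observer channels
}

interface EdgeSketch {
  source: string;
  target: string;
  provenance: Provenance;
  metadata?: SynthesisMetadata;
}

// Synthesized edges are heuristic edges that carry a synthesizedBy marker.
function isSynthesized(edge: EdgeSketch): boolean {
  return edge.provenance === "heuristic" && edge.metadata?.synthesizedBy !== undefined;
}

const sampleEdges: EdgeSketch[] = [
  { source: "mutateElement", target: "triggerUpdate", provenance: "tree-sitter" },
  {
    source: "triggerUpdate",
    target: "triggerRender",
    provenance: "heuristic",
    metadata: { synthesizedBy: "callback", field: "callbacks" },
  },
];
console.log(sampleEdges.filter(isSynthesized).map((e) => `${e.source} -> ${e.target}`));
```

Checking both the provenance and the marker keeps the filter correct if other heuristic edges ever share the `'heuristic'` value.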
			
 
				+### Phase 3 — inline callback extraction (`tree-sitter.ts`)
			
 
				+The real blocker for EventEmitter on real repos: inline handlers
			
 
				+(`on('mount', function onmount(){})`) weren't **nodes**, so nothing could link to them.
			
 
				+Root cause: `visitFunctionBody` walked *through* nested functions without extracting them.
			
 
				+Fix: in `visitForCallsAndStructure`, when a body node is a `functionType` and
			
 
				+`extractName` returns a real name, call `extractFunction` (which extracts it and walks
			
 
				+its own body) and return. **Named only** — anonymous arrows fall through to the existing
			
 
				+recursion (so their inner calls stay attributed to the enclosing fn). This bounded it:
			
 
				+excalidraw +3 nodes, no explosion, no regression.
			
 
				+
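The named-only rule can be illustrated on a toy tree. This is a sketch under stated assumptions, not the walker in `tree-sitter.ts`: the `Ast` and `FnNode` shapes are hypothetical, and `extractFunction` / `visitBody` only borrow their names from the functions mentioned above.

```typescript
// Hypothetical toy AST; the real walker operates on tree-sitter nodes.
interface Ast { kind: "function" | "call" | "block"; name?: string; children?: Ast[] }
interface FnNode { name: string; calls: string[] }

// Extract a function as its own node and walk its body.
function extractFunction(fn: Ast, out: FnNode[]): void {
  const node: FnNode = { name: fn.name ?? "<anonymous>", calls: [] };
  out.push(node);
  for (const child of fn.children ?? []) visitBody(child, node, out);
}

function visitBody(n: Ast, owner: FnNode, out: FnNode[]): void {
  if (n.kind === "function" && n.name) {
    // Named nested function: becomes its own node and owns its inner calls.
    extractFunction(n, out);
    return;
  }
  if (n.kind === "call" && n.name) owner.calls.push(n.name);
  // Anonymous functions fall through: their calls stay with the enclosing owner.
  for (const child of n.children ?? []) visitBody(child, owner, out);
}

const tree: Ast = {
  kind: "function",
  name: "use",
  children: [
    { kind: "call", name: "on", children: [
      { kind: "function", name: "onmount", children: [{ kind: "call", name: "setPrototypeOf" }] },
    ] },
    { kind: "call", name: "once", children: [
      { kind: "function", children: [{ kind: "call", name: "foo" }] },
    ] },
  ],
};

const nodes: FnNode[] = [];
extractFunction(tree, nodes);
console.log(JSON.stringify(nodes));
// [{"name":"use","calls":["on","once","foo"]},{"name":"onmount","calls":["setPrototypeOf"]}]
```

Only `onmount` becomes a new node. The anonymous handler adds none, and its `foo` call stays attributed to `use`, which is why the node count moves so little.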
			
 
				+---
			
 
				+
			
 
				+## Validation results (actual)
			
 
				+
			
 
				+| Repo | Result |
				+|---|---|
				+| excalidraw | 1 synthesized edge `triggerUpdate → triggerRender` (of 27,214); `trace(mutateElement, triggerRender)` = 3 hops; nodes 9,286 → 9,289 |
				+| express | after Phase 3: `use → onmount` `{event-emitter, event:"mount"}` (`onmount` now extracted at `application.js:109`) |
				+| `/tmp/cb-fixture/bus.js` | `tick → handleRefresh`, `persist → handleSave` (named-method EventEmitter handlers) |
				+| excalidraw / express | no Phase-1 regression; node counts stable |
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## Remaining work (prioritized for the next session)
			
 
				+
			
 
				+1. **Anonymous-arrow handlers** — `on('e', () => foo())` still produce no edge (no node,
			
 
				+   intentionally not extracted in Phase 3). The fix is **synthesizer link-through-body**:
			
 
				+   parse the arrow's body and link `dispatcher → (calls inside the arrow)`. Highest
			
 
				+   remaining recall win; handles the most common modern callback shape.
			
 
				+2. **Wire into `resolveAndPersist`** (incremental sync) — synthesis currently runs only
			
 
				+   in `resolveAndPersistBatched` (full index). Incremental re-index won't refresh
			
 
				+   synthesized edges.
			
 
				+3. **Receiver-type matching** for EventEmitter precision (replace/augment the fan-out
			
 
				+   cap) — use `type_of` edges so `x.emit('change')` only links to `y.on('change', fn)`
			
 
				+   when `x`,`y` are the same type. Lets the fan-out cap relax.
			
 
				+4. **Tree-sitter arg recovery** (replace the regex in field-channel step 4) — robust for
			
 
				+   arrows, multi-arg, line-wrapped calls.
			
 
				+5. **Single-callback fields** (`this.onChange = cb; … this.onChange()`) — scalar-store
			
 
				+   variant of the field observer; not built.
			
 
				+6. **Broad precision/recall audit** — run across the full corpus; tally synthesized edges
			
 
				+   per repo, spot-check, confirm no explosion on EventEmitter-heavy repos.
			
 
				+7. **Tests + CHANGELOG** — the fixture is a ready vitest case for the synthesizer; add
			
 
				+   extractor tests for Phase 3 (named-nested-fn extraction; confirm other languages
			
 
				+   unaffected — the change is in the shared walker), resolver tests for the django side.
			
 
				+
			
 
				+## Edge cases / model
			
 
				+- **Over-approximation across instances** is accepted (reachability, not instance
			
 
				+  precision). `unregister`/`off` ignored.
			
 
				+- Synthesized edges are **additive** — never replace static edges; tooling can filter on
			
 
				+  `provenance='heuristic'` + `metadata.synthesizedBy`.
			
 
				+
			
 
				+## Related work (same coverage effort)
			
 
				+This is one half of closing dynamic-dispatch coverage. The other artifacts on `main`:
			
 
				+- **Named attribute/descriptor resolver**: `claimsReference` (`resolution/types.ts`,
			
 
				+  pre-filter in `resolution/index.ts`) + django ORM resolver (`frameworks/python.ts`,
			
 
				+  `_iterable_class` → `ModelIterable.__iter__`).
			
 
				+- **Retrieval/UX changes** (separate from coverage): `explore` whole-small-file + glue
			
 
				+  fixes, `node`-with-trail, `codegraph_trace`, `context` call-paths — all in
			
 
				+  `src/mcp/tools.ts` / `src/context/index.ts`.
			
 
				+- **Full investigation context + findings:** auto-memory
			
 
				+  `project_codegraph_read_displacement` (why coverage — not prompting/hooks/new-tools —
			
 
				+  is the lever for getting agents to use codegraph over Read).
			
--- a/docs/design/dynamic-dispatch-coverage-playbook.md
+++ b/docs/design/dynamic-dispatch-coverage-playbook.md
@@ -0,0 +1,234 @@
 
				+# Dynamic-Dispatch Coverage Playbook
			
 
				+
			
 
				+**Audience:** a Claude agent continuing this work.
			
 
				+**Mission:** systematically close static-extraction coverage holes for **dynamic
			
 
				+dispatch** across **every language and framework codegraph supports**, and validate
			
 
				+each one the same way, so cross-symbol *flows* exist in the graph everywhere.
			
 
				+
			
 
				+> This is the top-level playbook. The deep design for one mechanism (the callback
			
 
				+> synthesizer) is in [`callback-edge-synthesis.md`](./callback-edge-synthesis.md).
			
 
				+> Full investigation context + findings: auto-memory `project_codegraph_read_displacement`.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 1. The goal (why this matters)
			
 
				+
			
 
				+codegraph's value is being **the map** — answering structural/flow questions
			
 
				+(`trace`, `impact`, callers, "how does X reach Y") that grep/Read cannot. Agents
			
 
				+will use codegraph instead of Read **only when it is sufficient**. We proved
			
 
				+empirically (see memory) that the lever for sufficiency is **coverage**, not
			
 
				+prompting/hooks/new-tools: when a flow is missing from the graph, the agent reads
			
 
				+the files to reconstruct it; when the flow *is* in the graph, the agent can answer
			
 
				+completely without reading.
			
 
				+
			
 
				+**Validated end-to-end on excalidraw:** after closing the update-flow hole, 2/3
			
 
				+headless agent runs answered the "how does an update reach the screen" question with
			
 
				+**Read 0 and a complete answer** — impossible before, because the key edge wasn't in
			
 
				+the graph. (Caveat: coverage *enables* the no-read path; agent confirm-by-reading
			
 
				+variance means it doesn't *force* it. Completeness improves unconditionally.)
			
 
				+
			
 
				+The mission is to make that true for **all** languages/frameworks.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 2. The problem class: dynamic dispatch
			
 
				+
			
 
				+Static tree-sitter extraction captures explicit calls (`foo()`, `this.bar()`). It
			
 
				+**misses** any call whose target is computed/indirect. Four recurring shapes, with a
			
 
				+**difficulty gradient** (do the cheap ones first):
			
 
				+
			
 
				+| # | Shape | Example | Fix mechanism | Cost |
				+|---|---|---|---|---|
				+| 1 | **Named attribute / descriptor** | django `self._iterable_class(self)` | framework resolver (`claimsReference` + `resolve()`) | **cheap** |
				+| 2 | **Field-backed observer** | `onUpdate(cb)` + `for(cb of cbs)cb()` | callback synthesizer (whole-graph pass) | medium |
				+| 3 | **String-keyed EventEmitter** | `on('e',fn)` / `emit('e')` | callback synthesizer (event-keyed) | medium |
				+| 4 | **Inline callback handler** | `on('e', function h(){})` / `() => {}` | extraction (named) + synthesizer link-through-body (anon) | named: cheap · anon: hard |
			
 
				+
			
 
				+Key distinction driving the mechanism choice:
			
 
				+- **A named ref exists** to resolve (`_iterable_class` is an attribute name) → **resolver**.
			
 
				+- **No ref exists** (`cb()` is anonymous; needs registrar↔dispatcher correlation) → **synthesizer**.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 3. Worked examples (the two mechanisms, end to end)
			
 
				+
			
 
				+### 3a. Django ORM descriptor — the **resolver** pattern (Python)
			
 
				+- **Hole:** `QuerySet._fetch_all` calls `self._iterable_class(self)` (a runtime-chosen
			
 
				+  iterable, default `ModelIterable`), whose `__iter__` runs the SQL compiler. Static
			
 
				+  parsing can't resolve the attribute-as-callable → `_fetch_all`'s only callee was
			
 
				+  `_prefetch_related_objects`; `trace(_fetch_all, execute_sql)` returned no path.
			
 
				+- **Fix:** `djangoResolver` claims the unresolved `_iterable_class` ref through the
			
 
				+  name-exists pre-filter, then resolves it to `ModelIterable.__iter__`.
			
 
				+- **Files:** `src/resolution/types.ts` (`claimsReference?` on `FrameworkResolver`),
			
 
				+  `src/resolution/index.ts` (pre-filter in `resolveOne` consults `claimsReference`),
			
 
				+  `src/resolution/frameworks/python.ts` (`djangoResolver.resolve` + `claimsReference` +
			
 
				+  `resolveModelIterableIter`).
			
 
				+- **Result:** `trace(_fetch_all, execute_sql)` → `_fetch_all → __iter__ → execute_sql` (3 hops).
			
 
				+
			
 
				+### 3b. Excalidraw observer + EventEmitter — the **synthesizer** (TS)
			
 
				+- **Hole:** `Scene.triggerUpdate` does `for (cb of this.callbacks) cb()`; `triggerRender`
			
 
				+  is registered via `scene.onUpdate(this.triggerRender)`. The `triggerUpdate →
			
 
				+  triggerRender` edge is dynamic → `trace` returned no path; the whole update flow broke.
			
 
				+- **Fix:** a whole-graph pass that detects registrar/dispatcher channels, correlates
			
 
				+  registration sites, and synthesizes `dispatcher → callback` edges. Plus extraction of
			
 
				+  **named** inline callbacks so handlers like express's `function onmount(){}` are nodes.
			
 
				+- **Files:** `src/resolution/callback-synthesizer.ts` (the pass — field observers +
			
 
				+  EventEmitter), `src/resolution/index.ts` (calls `synthesizeCallbackEdges()` at the end
			
 
				+  of `resolveAndPersistBatched`), `src/extraction/tree-sitter.ts` (`visitFunctionBody`
			
 
				+  extracts named nested functions).
			
 
				+- **Result:** `trace(mutateElement, triggerRender)` → 3 hops; express `use → onmount`.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 4. The repeatable methodology (run this per language/framework)
			
 
				+
			
 
				+### Step 1 — Pick the framework's canonical *flow* question
			
 
				+Every framework has a signature data/control flow. Pick the "how does X reach/become Y"
			
 
				+question and a real repo (add to `.claude/skills/agent-eval/corpus.json`). Examples:
			
 
				+- React state→DOM, Vue reactive→render, Svelte store→update
			
 
				+- Rails request→controller→view, Spring request→`@Controller`→service
			
 
				+- Express/Koa request→middleware→handler, FastAPI request→route→dependency
			
 
				+- Redux action→reducer→store, RxJS subscribe→operator→observer
			
 
				+- Any ORM: query builder → SQL execution (django pattern)
			
 
				+
			
 
				+### Step 2 — Measure the hole (deterministic, no agent)
			
 
				+```bash
			
 
				+rm -rf <repo>/.codegraph && ( cd <repo> && codegraph init -i )
			
 
				+node scripts/agent-eval/probe-trace.mjs <repo> <from-symbol> <to-symbol>   # does the flow break? where?
			
 
				+node scripts/agent-eval/probe-node.mjs  <repo> <break-symbol>              # trail: is the next hop missing?
			
 
				+```
			
 
				+A "No direct call path … breaks at dynamic dispatch" + a sparse trail at the break
			
 
				+point **locates the hole** (this is exactly how `_iterable_class` and `triggerUpdate`
			
 
				+were found). Confirm it's dynamic by reading the break symbol's body.
			
 
				+
			
 
				+### Step 3 — Classify → choose the mechanism (use the §2 table)
			
 
				+- `self.<attr>(...)` / descriptor / metaclass → **resolver** (§3a).
			
 
				+- `for(cb of store)cb()` / `store.forEach(cb=>cb())` → **field-observer synthesizer** (§3b).
			
 
				+- `on('e',fn)` + `emit('e')` → **EventEmitter synthesizer** (§3b).
			
 
				+- Inline handler not a node → **named:** extraction (already done generically in
			
 
				+  `tree-sitter.ts`); **anonymous:** synthesizer link-through-body (not yet built).
			
 
				+
			
 
				+### Step 4 — Implement
			
 
				+- **Resolver:** add to `src/resolution/frameworks/<lang>.ts` — a `resolve()` branch +
			
 
				+  `claimsReference(name)` if the ref name isn't a declared symbol. Copy `djangoResolver`.
			
 
				+- **Synthesizer channel:** extend `src/resolution/callback-synthesizer.ts` — add the
			
 
				+  framework's registrar/dispatcher **name patterns** and **body patterns** (e.g. signals
			
 
				+  use `.connect()`/`.emit()`; Rx uses `.subscribe()`/`.next()`).
			
 
				+- Reindex (Step 2 command) and re-run `probe-trace` — the flow should now connect.
			
 
				+
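The resolver path in Step 4 can be sketched as below. This is a simplified guess at the control flow, not the code in `src/resolution/`: the `UnresolvedRef` and `ResolverSketch` shapes, the string return type, and the body of `resolveOne` are all assumptions. Only the names `claimsReference`, `resolve`, `resolveOne`, `_iterable_class`, and `ModelIterable.__iter__` come from this playbook.

```typescript
// Hypothetical shapes for illustration only.
interface UnresolvedRef { name: string; fromSymbol: string }
interface ResolverSketch {
  claimsReference?(name: string): boolean;
  resolve(ref: UnresolvedRef): string | null;
}

// Modeled on the django example: claim an attribute name that is not a
// declared symbol, then map it to the method that actually runs.
const djangoLikeResolver: ResolverSketch = {
  claimsReference: (name) => name === "_iterable_class",
  resolve: (ref) => (ref.name === "_iterable_class" ? "ModelIterable.__iter__" : null),
};

function resolveOne(
  ref: UnresolvedRef,
  declared: Set<string>,
  resolvers: ResolverSketch[],
): string | null {
  // Pre-filter: a name matching no declared symbol is dropped unless a resolver claims it.
  const claimed = resolvers.some((r) => r.claimsReference?.(ref.name) ?? false);
  if (!declared.has(ref.name) && !claimed) return null;
  for (const r of resolvers) {
    const target = r.resolve(ref);
    if (target !== null) return target;
  }
  return null;
}

const ref: UnresolvedRef = { name: "_iterable_class", fromSymbol: "_fetch_all" };
const declared = new Set(["_fetch_all", "execute_sql"]);
console.log(resolveOne(ref, declared, [djangoLikeResolver])); // ModelIterable.__iter__
console.log(resolveOne(ref, declared, [])); // null
```

Without `claimsReference`, the ref would be discarded by the pre-filter before any `resolve()` branch ran, which is why the playbook pairs the two.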
			
 
				+### Step 5 — Validate (the same way every time)
			
 
				+1. **Deterministic:** `probe-trace(from,to)` finds the path; `probe-node` shows the
			
 
				+   bridged hop. The previously-broken hop is closed.
			
 
				+2. **Precision:** count + spot-check synthesized/resolved edges — no explosion, correct targets:
			
 
				+   ```bash
			
 
				+   sqlite3 <repo>/.codegraph/codegraph.db \
				+     "select s.name||' → '||t.name||'  '||coalesce(e.metadata,'') from edges e \
				+      join nodes s on e.source=s.id join nodes t on e.target=t.id where e.provenance='heuristic';"
			
 
				+   ```
			
 
				+   (Resolver edges aren't `heuristic`; verify via the trace + callees instead.)
			
 
				+3. **Regression:** node count stable (`select count(*) from nodes;` before/after — a big
			
 
				+   jump means an extraction change over-fired); existing traces on a control repo intact.
			
 
				+4. **End-to-end agent eval:** run the flow question with codegraph and measure
			
 
				+   **reads / answer-completeness / cost** vs a pre-fix baseline:
			
 
				+   ```bash
			
 
				+   # headless (exact cost + clean tool sequence)
			
 
				+   bash scripts/agent-eval/run-agent.sh <repo> with "<flow question>"
			
 
				+   # or the full A/B + interactive Explore-subagent path:
			
 
				+   scripts/agent-eval/audit.sh local <name> <url> "<flow question>" all
			
 
				+   ```
			
 
				+   Then parse: `Read` count, codegraph-tool count, cost, and whether the answer now
			
 
				+   contains the glue symbols (the ones that previously required a read).
			
 
				+
			
 
				+### Success criteria (per language/framework)
			
 
				+- `trace` finds the canonical flow end-to-end (no dynamic-dispatch break).
			
 
				+- Agent can answer the flow question with **Read 0** (achievable in at least some runs) and the
			
 
				+  glue symbols appear in the answer.
			
 
				+- **No node explosion** and no regression on a control repo.
			
 
				+- Synthesized edges are precise on a spot-check (no generic-name over-linking).
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 5. Validation toolkit (reference)
			
 
				+
			
 
				+| Tool | Purpose |
				+|---|---|
				+| `scripts/agent-eval/probe-trace.mjs <repo> <from> <to>` | call-path between two symbols (the hole detector) |
				+| `scripts/agent-eval/probe-node.mjs <repo> <sym> [code]` | symbol + trail (callers/callees); `code` adds the body |
				+| `scripts/agent-eval/probe-context.mjs <repo> "<task>"` | context output incl. call-paths |
				+| `scripts/agent-eval/probe-explore.mjs <repo> "<query>"` | explore output |
				+| `scripts/agent-eval/{audit,run-agent,itrun}.sh` | agent A/B (headless + interactive); also the `/agent-eval` skill |
				+| `sqlite3 <repo>/.codegraph/codegraph.db` | direct edge/node inspection (provenance, metadata, counts) |
			
 
				+
			
 
				+Probe scripts use the built `dist/` — run `npm run build` first. Reindex after any
			
 
				+extraction or resolution change (`rm -rf <repo>/.codegraph && codegraph init -i`) — the
			
 
				+synthesizer/resolvers run at index time. Test fixtures: keep a tiny per-pattern fixture
			
 
				+(see `/tmp/cb-fixture/bus.js`; **move into `__tests__/`** when shipping).
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 6. Coverage matrix (fill in as you go)
			
 
				+
			
 
				+Status legend: ✅ done+validated · 🔬 hole identified · ⬜ not started.
			
 
				+`Mechanism`: R = resolver, S = synthesizer channel, X = extraction.
			
 
				+
			
 
				+| Language | Framework(s) | Canonical flow to test | Mechanism | Status |
				+|---|---|---|---|---|
				+| TypeScript/JS | React / observer / EventEmitter | state→render; dispatch→callback | S + X | ✅ (excalidraw) |
				+| TypeScript/JS | Vue / Nuxt | reactive dep → render | ? | ⬜ |
				+| TypeScript/JS | Svelte / SvelteKit | store → DOM update | ? | ⬜ |
				+| TypeScript/JS | Express / Koa | request → middleware → handler | ? | ⬜ |
				+| TypeScript/JS | NestJS | request → controller → provider | ? | ⬜ |
				+| TypeScript/JS | RxJS / signals | subscribe → operator → observer | S | ⬜ |
				+| Python | Django ORM | QuerySet → SQL compiler | R | ✅ |
				+| Python | Django (views/signals) | url → view; signal → receiver | R/S | 🔬 (routes done; signals ⬜) |
				+| Python | Flask / FastAPI | request → route → dependency | R | 🔬 (routes done) |
				+| Go | Gin / net/http | request → handler chain | ? | ⬜ |
				+| Rust | Axum / Cargo workspace | request → handler; trait dispatch | R | 🔬 (workspaces done) |
				+| Java | Spring | request → @Controller → service; DI | ? | ⬜ |
				+| Kotlin | (coroutines / DI) | flow/callback dispatch | ? | ⬜ |
				+| Swift | Vapor | request → route → controller | ? | ⬜ |
				+| C# | ASP.NET | request → controller; DI | ? | ⬜ |
				+| Ruby | Rails / Sinatra | request → controller → view; callbacks | ? | ⬜ |
				+| PHP | Laravel / Drupal | request → controller; events | ? | ⬜ |
				+| C/C++ | (callback structs / vtables) | function-pointer dispatch | ? | ⬜ |
				+| Dart | Flutter | setState → build | S | ⬜ |
				+| Lua / Luau | (Neovim / Roblox) | event/callback dispatch | S | ⬜ |
				+| Scala | (Akka / Play) | actor message → handler | ? | ⬜ |
			
 
				+
			
 
				+(Verify the exact supported set against `src/extraction/languages/` and
			
 
				+`src/resolution/frameworks/` before starting — this table is a starting point.)
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 7. Known limits & gotchas (from the excalidraw/django work)
			
 
				+
			
 
				+- **Coverage enables, doesn't force, the no-read path.** Agents still read to *confirm
			
 
				+  source* sometimes; cost stays ~flat (codegraph calls trade for reads). The reliable
			
 
				+  win is **completeness** + making Read-0 *possible*. Don't expect a guaranteed cost drop.
			
 
				+- **Difficulty gradient is real:** named-ref dispatch (resolver) is cheap; anonymous
			
 
				+  callback dispatch (synthesizer) is medium; **anonymous-arrow handlers are the hard
			
 
				+  remaining gap** (no identity → need synthesizer link-through-body, not yet built).
			
 
				+- **Extraction changes are high blast radius.** The Phase-3 named-inline-callback
			
 
				+  extraction is in the *shared* `tree-sitter.ts` walker — re-check **node counts across
			
 
				+  several languages** after any extraction change (it held at +3 on excalidraw because
			
 
				+  anonymous arrows are skipped).
			
 
				+- **Synthesizer precision guards:** registrar-name uniqueness, named-only handlers, and
			
 
				+  an event **fan-out cap** (skip generic events like `error`/`change`). Receiver-type
			
 
				+  matching (via `type_of` edges) is the planned precision upgrade — deferred.
			
 
				+- **As-built shortcuts** (callback synthesizer): pairs registrar/dispatcher by *file*+field
			
 
				+  (class proxy), regex arg-recovery (named refs only), `provenance:'heuristic'` +
			
 
				+  `metadata.synthesizedBy` (the enum has no `'callback-synthesis'`). See the design doc.
			
 
				+- **Synthesizer runs only in `resolveAndPersistBatched`** (full index) — wire into
			
 
				+  `resolveAndPersist` for incremental sync before shipping.
			
 
				+- **Symbol ambiguity in `trace`:** common names (`render`, `execute_sql`) match many
			
 
				+  nodes; trace picks among them and may start from the wrong one. Trace from the specific
			
 
				+  method, not a class name.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 8. Definition of done (the whole mission)
			
 
				+
			
 
				+For each language × framework: the canonical flow `trace`s end-to-end, an agent can
			
 
				+answer the flow question with Read 0 in at least some runs with the glue symbols present, no node
			
 
				+explosion, no regression — recorded in the matrix (§6) with the validating repo + numbers.
			
 
				+Then ship-prep: tests per mechanism, CHANGELOG, wire incremental, commit.
			
--- a/scripts/agent-eval/block-read-hook.sh
+++ b/scripts/agent-eval/block-read-hook.sh
@@ -0,0 +1,19 @@
 
				+#!/usr/bin/env bash
			
 
				+# PreToolUse hook (experiment): deny Read of codegraph-indexed source files and
			
 
				+# steer the agent to codegraph_explore/codegraph_node instead. Tests whether
			
 
				+# codegraph can FULLY replace Read for code-understanding once the escape hatch
			
 
				+# is removed. Non-source reads (config, .env, markdown, new files) pass through.
			
 
				+#
			
 
				+# Wire via:  claude ... --settings scripts/agent-eval/hook-settings.json
			
 
				+set -uo pipefail
			
 
				+input="$(cat)"
			
 
				+fp="$(printf '%s' "$input" | jq -r '.tool_input.file_path // empty' 2>/dev/null)"
			
 
				+
			
 
				+case "$fp" in
			
 
				+  *.ts|*.tsx|*.js|*.jsx|*.mjs|*.cjs|*.py|*.go|*.rs|*.java|*.rb|*.php|*.swift|*.kt|*.kts|*.c|*.cc|*.cpp|*.h|*.hpp|*.cs|*.lua|*.vue|*.svelte)
			
 
				+    msg="Read is disabled for source files in this session — codegraph already has this file indexed (with line numbers, kept in sync on every change). Use codegraph_explore (several related symbols at once) or codegraph_node (one symbol's full source). If a symbol you need wasn't in a prior explore, run ANOTHER codegraph_explore with its exact name instead of reading the file."
			
 
				+    jq -n --arg m "$msg" '{reason:$m, hookSpecificOutput:{hookEventName:"PreToolUse",permissionDecision:"deny",permissionDecisionReason:$m}}'
			
 
				+    exit 0
			
 
				+    ;;
			
 
				+esac
			
 
				+exit 0
			
--- a/scripts/agent-eval/hook-settings.json
+++ b/scripts/agent-eval/hook-settings.json
@@ -0,0 +1,15 @@
 
				+{
			
 
				+  "hooks": {
			
 
				+    "PreToolUse": [
			
 
				+      {
			
 
				+        "matcher": "Read",
			
 
				+        "hooks": [
			
 
				+          {
			
 
				+            "type": "command",
			
 
				+            "command": "bash /Users/colby/Development/Personal/codegraph/scripts/agent-eval/block-read-hook.sh"
			
 
				+          }
			
 
				+        ]
			
 
				+      }
			
 
				+    ]
			
 
				+  }
			
 
				+}
			
--- a/scripts/agent-eval/probe-context.mjs
+++ b/scripts/agent-eval/probe-context.mjs
@@ -0,0 +1,21 @@
 
				+#!/usr/bin/env node
			
 
				+// Probe codegraph_context (with call-paths) against an index using the built dist.
			
 
				+// Usage: node probe-context.mjs <repo-with-.codegraph> <task words...>
			
 
				+import { pathToFileURL } from 'node:url';
			
 
				+import { resolve } from 'node:path';
			
 
				+
			
 
				+const [, , repo, ...taskParts] = process.argv;
			
 
				+const task = taskParts.join(' ');
			
 
				+if (!repo || !task) { console.error('usage: probe-context.mjs <repo> <task...>'); process.exit(1); }
			
 
				+
			
 
				+const load = async (rel) => import(pathToFileURL(resolve(rel)).href);
			
 
				+const idx = await load('dist/index.js');
			
 
				+const tools = await load('dist/mcp/tools.js');
			
 
				+const CodeGraph = idx.default?.default ?? idx.default ?? idx.CodeGraph;
			
 
				+const ToolHandler = tools.ToolHandler ?? tools.default?.ToolHandler;
			
 
				+
			
 
				+const cg = CodeGraph.openSync(repo);
			
 
				+const h = new ToolHandler(cg);
			
 
				+const res = await h.execute('codegraph_context', { task });
			
 
				+console.log(res.content?.[0]?.text ?? '(no text)');
			
 
				+try { cg.close?.(); } catch {}
			
--- a/scripts/agent-eval/probe-explore.mjs
+++ b/scripts/agent-eval/probe-explore.mjs
@@ -0,0 +1,40 @@
 
				+#!/usr/bin/env node
			
 
				+// One-shot probe: run handleExplore against an existing index using the built
			
 
				+// dist, print the output + a few stats. Lets us verify explore's coverage fix
			
 
				+// without a full agent run. Usage: node probe-explore.mjs <repo-with-.codegraph> "<query>"
			
 
				+import { pathToFileURL } from 'node:url';
			
 
				+import { resolve } from 'node:path';
			
 
				+
			
 
				+const [, , repo, query] = process.argv;
			
 
				+if (!repo || !query) {
			
 
				+  console.error('usage: probe-explore.mjs <repo> "<query>"');
			
 
				+  process.exit(1);
			
 
				+}
			
 
				+
			
 
				+const load = async (rel) => import(pathToFileURL(resolve(rel)).href);
			
 
				+const idx = await load('dist/index.js');
			
 
				+const tools = await load('dist/mcp/tools.js');
			
 
				+
			
 
				+// CJS interop: dynamic import() of a CJS module yields { default: module.exports, ...named }, so a TS default export compiled to CJS sits at default.default
			
 
				+const CodeGraph = idx.default?.default ?? idx.default ?? idx.CodeGraph;
			
 
				+const ToolHandler = tools.ToolHandler ?? tools.default?.ToolHandler;
			
 
				+
			
 
				+if (typeof CodeGraph?.openSync !== 'function') {
			
 
				+  console.error('could not resolve CodeGraph.openSync; index keys:', Object.keys(idx), 'default keys:', idx.default && Object.keys(idx.default));
			
 
				+  process.exit(2);
			
 
				+}
			
 
				+if (typeof ToolHandler !== 'function') {
			
 
				+  console.error('could not resolve ToolHandler; tools keys:', Object.keys(tools));
			
 
				+  process.exit(2);
			
 
				+}
			
 
				+
			
 
				+const cg = CodeGraph.openSync(repo);
			
 
				+const h = new ToolHandler(cg);
			
 
				+const res = await h.execute('codegraph_explore', { query });
			
 
				+const text = res.content?.[0]?.text ?? '(no text)';
			
 
				+console.log(text);
			
 
				+console.error('\n--- PROBE STATS ---');
			
 
				+console.error('output chars:', text.length);
			
 
				+console.error('triggerRender body present (-> setState({})):', /triggerRender[\s\S]{0,400}setState\(\{\}\)/.test(text));
			
 
				+console.error('App.tsx in source section:', /#### .*App\.tsx —/.test(text));
			
 
				+try { cg.close?.(); } catch {}
			
--- a/scripts/agent-eval/probe-node.mjs
+++ b/scripts/agent-eval/probe-node.mjs
@@ -0,0 +1,20 @@
 
				+#!/usr/bin/env node
			
 
				+// Probe codegraph_node (with trail) against an index using the built dist.
			
 
				+// Usage: node probe-node.mjs <repo-with-.codegraph> <symbol> [code]
			
 
				+import { pathToFileURL } from 'node:url';
			
 
				+import { resolve } from 'node:path';
			
 
				+
			
 
				+const [, , repo, symbol, code] = process.argv;
			
 
				+if (!repo || !symbol) { console.error('usage: probe-node.mjs <repo> <symbol> [code]'); process.exit(1); }
			
 
				+
			
 
				+const load = async (rel) => import(pathToFileURL(resolve(rel)).href);
			
 
				+const idx = await load('dist/index.js');
			
 
				+const tools = await load('dist/mcp/tools.js');
			
 
				+const CodeGraph = idx.default?.default ?? idx.default ?? idx.CodeGraph; // CJS via import(): { default: module.exports, ...named }
			
 
				+const ToolHandler = tools.ToolHandler ?? tools.default?.ToolHandler;
			
 
				+
			
 
				+const cg = CodeGraph.openSync(repo);
			
 
				+const h = new ToolHandler(cg);
			
 
				+const res = await h.execute('codegraph_node', { symbol, includeCode: code === 'code' });
			
 
				+console.log(res.content?.[0]?.text ?? '(no text)');
			
 
				+try { cg.close?.(); } catch {}
			
--- a/scripts/agent-eval/probe-trace.mjs
+++ b/scripts/agent-eval/probe-trace.mjs
@@ -0,0 +1,20 @@
 
				+#!/usr/bin/env node
			
 
				+// Probe codegraph_trace against an index using the built dist.
			
 
				+// Usage: node probe-trace.mjs <repo-with-.codegraph> <from> <to>
			
 
				+import { pathToFileURL } from 'node:url';
			
 
				+import { resolve } from 'node:path';
			
 
				+
			
 
				+const [, , repo, from, to] = process.argv;
			
 
				+if (!repo || !from || !to) { console.error('usage: probe-trace.mjs <repo> <from> <to>'); process.exit(1); }
			
 
				+
			
 
				+const load = async (rel) => import(pathToFileURL(resolve(rel)).href);
			
 
				+const idx = await load('dist/index.js');
			
 
				+const tools = await load('dist/mcp/tools.js');
			
 
				+const CodeGraph = idx.default?.default ?? idx.default ?? idx.CodeGraph; // CJS via import(): { default: module.exports, ...named }
			
 
				+const ToolHandler = tools.ToolHandler ?? tools.default?.ToolHandler;
			
 
				+
			
 
				+const cg = CodeGraph.openSync(repo);
			
 
				+const h = new ToolHandler(cg);
			
 
				+const res = await h.execute('codegraph_trace', { from, to });
			
 
				+console.log(res.content?.[0]?.text ?? '(no text)');
			
 
				+try { cg.close?.(); } catch {}
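
All four probe scripts repeat the same fallback chain for pulling `CodeGraph` and `ToolHandler` out of a dynamically imported build. The sketch below shows one way that lookup could be factored into a single helper. It is an illustration only: `resolveExport`, its `allowDefault` option, and the `FakeGraph` stand-in are hypothetical names, not part of this diff, and the candidate order is an assumption that differs slightly from the scripts (named exports are tried before the default).

```javascript
// Hypothetical helper (not in the diff): pick a class or function out of a
// module namespace returned by dynamic import(), whatever shape the build has.
function resolveExport(ns, name, { allowDefault = false } = {}) {
  const candidates = [
    ns?.[name],          // ESM named export, or a CJS named export Node detected
    ns?.default?.[name], // CJS: module.exports[name]
  ];
  if (allowDefault) {
    candidates.push(ns?.default?.default); // TS `export default` compiled to CJS
    candidates.push(ns?.default);          // ESM default, or `module.exports = X`
  }
  // Classes are functions, so this skips plain module.exports objects.
  return candidates.find((c) => typeof c === 'function');
}

// Stand-in class used only to exercise the helper.
class FakeGraph {
  static openSync(repo) { return { repo }; }
}

// Namespace shapes a probe script might plausibly see.
const tsCompiledCjs = { default: { default: FakeGraph, __esModule: true } };
const namedOnlyCjs = { default: { CodeGraph: FakeGraph }, CodeGraph: FakeGraph };
const nativeEsm = { default: FakeGraph };

const fromTs = resolveExport(tsCompiledCjs, 'CodeGraph', { allowDefault: true });
const fromNamed = resolveExport(namedOnlyCjs, 'CodeGraph');
const fromEsm = resolveExport(nativeEsm, 'CodeGraph', { allowDefault: true });
const missing = resolveExport({}, 'ToolHandler');
```

Filtering on `typeof c === 'function'` means a bare `module.exports` object is never returned in place of the class, which is the failure mode the `typeof CodeGraph?.openSync !== 'function'` guard in `probe-explore.mjs` reports after the fact.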