Status: SHIPPED (the synthesizer in callback-synthesizer.ts is merged and on
main). This doc records the original design.
Motivation: close the dynamic-dispatch hole that static extraction leaves for
observer / event-emitter / signal patterns, where a dispatcher invokes callbacks
registered elsewhere through a shared store — so flows like "how does an update
reach the screen" actually exist in the graph.
Update (2026-06-01): the codegraph_trace and codegraph_context MCP tools have since been removed; codegraph_explore is now the single surfacing tool. Its "Flow" section (buildFlowFromNamedSymbols) and the codegraph_node trail surface these synthesized edges. The trace(a, b) notation below means "the a→b flow," which you now verify with codegraph_explore / probe-explore.mjs (the probe-trace.mjs / probe-context.mjs dev probes went away with the tools).
We synthesize dispatcher → callback edges that static parsing misses. It works:
- excalidraw (Scene.onUpdate/triggerUpdate): synthesizes triggerUpdate → triggerRender. trace(mutateElement, triggerRender) now = 3 hops.
- express (on('mount', …)/emit('mount')): synthesizes use → onmount.

Files touched (all uncommitted on main):
- src/resolution/callback-synthesizer.ts: the whole-graph synthesis pass (Phase 1 + 2).
- src/resolution/index.ts: calls synthesizeCallbackEdges() at the end of resolveAndPersistBatched() (after base edges are persisted) + the import.
- src/extraction/tree-sitter.ts: visitFunctionBody now extracts named nested functions (Phase 3), so inline named handlers become linkable nodes.

How to reproduce / test:
npm run build
rm -rf /tmp/codegraph-corpus/excalidraw/.codegraph
( cd /tmp/codegraph-corpus/excalidraw && codegraph init -i )
# synthesized edges (provenance='heuristic', metadata.synthesizedBy in {callback,event-emitter}):
sqlite3 /tmp/codegraph-corpus/excalidraw/.codegraph/codegraph.db \
"select s.name||' → '||t.name||' '||coalesce(e.metadata,'') from edges e \
join nodes s on e.source=s.id join nodes t on e.target=t.id where e.provenance='heuristic';"
# end-to-end flow (the synthesized edge shows up in explore's Flow section + node trail):
node scripts/agent-eval/probe-explore.mjs /tmp/codegraph-corpus/excalidraw "triggerUpdate triggerRender"
Probe scripts (dev-only, in scripts/agent-eval/): probe-node.mjs (symbol + trail),
probe-explore.mjs (relevant source + the flow among named symbols). EventEmitter
fixture lives at /tmp/cb-fixture/bus.js (ephemeral — recreate or move into __tests__/).
class Scene {
private callbacks = new Set<Callback>();
onUpdate(cb: Callback) { this.callbacks.add(cb); } // REGISTRAR
triggerUpdate() { for (const cb of this.callbacks) cb(); } // DISPATCHER
}
this.scene.onUpdate(this.triggerRender); // REGISTRATION SITE
The runtime edge triggerUpdate → triggerRender does not exist statically:
triggerUpdate's only literal call is cb() (anonymous). Measured: triggerUpdate's
only callee was randomInteger; trace(triggerUpdate, triggerRender) returned no path.

Why not FrameworkResolver.resolve():
resolve(ref) answers "what does this named ref point to," one ref at a time. The
callback edge has no ref to resolve (cb() is anonymous) and needs cross-file,
multi-site correlation (registrar, registration, dispatcher). So it's a whole-graph
pass after base resolution, language-level (any OO observer), living in
src/resolution/callback-synthesizer.ts — not under frameworks/.
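
The three-site correlation (registrar, dispatcher, registration) can be sketched on plain source strings. This is a minimal illustration, not the code in callback-synthesizer.ts: the FnNode shape, the synthesizeFieldChannel helper, and the in-memory inputs are hypothetical stand-ins for graph nodes and file reads, and the patterns are simplified forms of the name and field checks this document describes for Phase 1.

```typescript
// Hypothetical node shape: a method name plus the source text of its body.
interface FnNode { name: string; body: string }
interface SynthEdge { source: string; target: string; field: string }

// Name filters: registrars look like onX/subscribe/..., dispatchers contain a dispatch verb.
const REGISTRAR_NAME =
  /^(on[A-Z]\w*|subscribe|addListener|addEventListener|register|watch|listen|addCallback)$/;
const DISPATCHER_NAME = /(emit|trigger|notify|dispatch|fire|publish|flush)/;

// Field the registrar stores into: this.<F>.add( / push( / set(
const STORE_RE = /this\.(\w+)\.(?:add|push|set)\(/;

// Field the dispatcher iterates: for (... of this.<F>) or this.<F>.forEach(
function dispatchedField(body: string): string | null {
  const m =
    /for\s*\([^)]*\bof\s+(?:Array\.from\()?this\.(\w+)/.exec(body) ??
    /this\.(\w+)\.forEach\(/.exec(body);
  return m?.[1] ?? null;
}

// Recover the named callback passed at a registration site, e.g.
// "this.scene.onUpdate(this.triggerRender)" yields "triggerRender".
// Arrow functions and inline arguments do not match and yield null.
function registeredCallback(line: string, registrar: string): string | null {
  const re = new RegExp("\\b" + registrar + "\\s*\\(\\s*(?:this\\.)?(\\w+)");
  return re.exec(line)?.[1] ?? null;
}

function synthesizeFieldChannel(methods: FnNode[], callSites: string[]): SynthEdge[] {
  const edges: SynthEdge[] = [];
  for (const reg of methods) {
    if (!REGISTRAR_NAME.test(reg.name)) continue;
    const field = STORE_RE.exec(reg.body)?.[1];
    if (!field) continue;
    for (const disp of methods) {
      // Pair registrar and dispatcher only when they touch the same field.
      if (!DISPATCHER_NAME.test(disp.name) || dispatchedField(disp.body) !== field) continue;
      for (const line of callSites) {
        const cb = registeredCallback(line, reg.name);
        if (cb) edges.push({ source: disp.name, target: cb, field });
      }
    }
  }
  return edges;
}

// The Scene example from above, reduced to method bodies and one registration line.
const methods: FnNode[] = [
  { name: "onUpdate", body: "this.callbacks.add(cb);" },
  { name: "triggerUpdate", body: "for (const cb of this.callbacks) cb();" },
];
const edges = synthesizeFieldChannel(methods, ["this.scene.onUpdate(this.triggerRender);"]);
console.log(edges); // one edge: triggerUpdate -> triggerRender via field "callbacks"
```

None of the three sites is enough on its own: the edge only appears once the registrar's stored field, the dispatcher's iterated field, and the argument at the registration site are matched together, which is why a per-ref resolve() call has nothing to work with.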

Sibling mechanism for the other dynamic-dispatch class, named attribute/descriptor dispatch (e.g. django self._iterable_class(...)): the claimsReference hook (resolution/types.ts + resolution/index.ts pre-filter) plus a FrameworkResolver.resolve() (django ORM resolver in frameworks/python.ts). That one does fit resolve() because the ref is named. Both are part of the same coverage effort; see the "Related work" section.

Field observer (fieldChannelEdges, Phase 1):
- Name filter: registrar name matches ^(on[A-Z]\w*|subscribe|addListener|addEventListener|register|watch|listen|addCallback)$; dispatcher name contains (emit|trigger|notify|dispatch|fire|publish|flush).
- Field match (ctx.readFile + slice node lines): registrar has this.<F>.add|push|set(; dispatcher has for (… of [Array.from(]this.<F>) + a call, or this.<F>.forEach(.
- Pairing: registrar and dispatcher are paired by file + field F (file as a class proxy; getting the containing class reliably was harder). Works for the common 1-class-per-file case; revisit for multi-class files.
- Registration sites: queries.getIncomingEdges(registrar.id, ['calls']) → for each, read the caller's source at the edge line and regex-recover the arg (<registrarName>\s*\(\s*(?:this\.)?(\w+)). DIVERGENCE: design preferred tree-sitter re-parse; build uses regex (named refs only; arrows/inline args are missed here).
- Edge: dispatcher → fn (getNodesByName(arg) → method|function). Capped at MAX_CALLBACKS_PER_CHANNEL = 40.

EventEmitter (eventEmitterEdges, Phase 2):
- File scan (ctx.getAllFiles() + readFile, substring pre-filter on .emit(/.on(/etc). ON_RE = \.(?:on|once|addListener)\(\s*['"]([^'"]+)['"]\s*,\s*(?:function\s+(\w+)|(?:this\.)?(\w+)); EMIT_RE = \.(?:emit|fire|dispatchEvent)\(\s*['"]([^'"]+)['"].
- Dispatcher = the function enclosing the .emit('e') call (enclosingFn finds the tightest function/method/component node containing the line). Handler = getNodesByName of the on-handler name.
- Fan-out cap (EVENT_FANOUT_CAP = 6): skip events with >6 handlers or dispatchers (generic names like error/change would over-link without type info).

Provenance:
Edge.provenance is a fixed enum ('tree-sitter'|'scip'|'heuristic'), so synthesized edges use provenance: 'heuristic' + metadata: { synthesizedBy: 'callback'|'event-emitter', via/event/field }. The design's 'callback-synthesis' provenance and high/medium/low confidence tiers were NOT implemented; the fan-out cap + registrar-name uniqueness + named-only handlers are the precision guards instead.
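
The EventEmitter pairing and the edge shape described above can be sketched as follows. This is a hedged illustration rather than the shipped pass: ON_RE, EMIT_RE, and EVENT_FANOUT_CAP are copied from the text, while the SourceLine input, the knownFns set (standing in for getNodesByName), and the sample lines are hypothetical.

```typescript
// Patterns and cap as quoted in the text above.
const ON_RE =
  /\.(?:on|once|addListener)\(\s*['"]([^'"]+)['"]\s*,\s*(?:function\s+(\w+)|(?:this\.)?(\w+))/;
const EMIT_RE = /\.(?:emit|fire|dispatchEvent)\(\s*['"]([^'"]+)['"]/;
const EVENT_FANOUT_CAP = 6;

// Hypothetical input: one source line plus the name of its tightest enclosing function.
interface SourceLine { text: string; enclosingFn: string }

interface SynthEdge {
  source: string;
  target: string;
  provenance: "heuristic";
  metadata: { synthesizedBy: "event-emitter"; event: string };
}

function addTo(map: Map<string, Set<string>>, key: string, value: string): void {
  const set = map.get(key) ?? new Set<string>();
  set.add(value);
  map.set(key, set);
}

function eventEmitterEdges(lines: SourceLine[], knownFns: Set<string>): SynthEdge[] {
  const handlers = new Map<string, Set<string>>();    // event name -> handler names
  const dispatchers = new Map<string, Set<string>>(); // event name -> emitting functions

  for (const { text, enclosingFn } of lines) {
    const on = ON_RE.exec(text);
    const onEvent = on?.[1];
    const handler = on?.[2] ?? on?.[3];
    // Named-only guard: keep a handler only if it resolves to a known function node.
    if (onEvent && handler && knownFns.has(handler)) addTo(handlers, onEvent, handler);

    const emitEvent = EMIT_RE.exec(text)?.[1];
    if (emitEvent) addTo(dispatchers, emitEvent, enclosingFn);
  }

  const edges: SynthEdge[] = [];
  for (const [event, targets] of handlers) {
    const sources = dispatchers.get(event);
    if (!sources) continue;
    // Fan-out guard: skip generic events with too many handlers or dispatchers.
    if (targets.size > EVENT_FANOUT_CAP || sources.size > EVENT_FANOUT_CAP) continue;
    for (const source of sources) {
      for (const target of targets) {
        edges.push({
          source,
          target,
          provenance: "heuristic",
          metadata: { synthesizedBy: "event-emitter", event },
        });
      }
    }
  }
  return edges;
}

// Illustrative lines in the on('mount', ...)/emit('mount') shape; "setup" is a made-up name.
const sample: SourceLine[] = [
  { text: "this.on('mount', function onmount(parent) {", enclosingFn: "setup" },
  { text: "fn.emit('mount', this);", enclosingFn: "use" },
];
const edges = eventEmitterEdges(sample, new Set(["onmount", "use", "setup"]));
console.log(edges); // one edge: use -> onmount, event "mount"
```

Resolving handler names against known nodes also absorbs a quirk of ON_RE: an anonymous `function (…)` handler falls into the second alternative and captures the word "function", which matches no node and so yields no edge.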

Named nested functions (Phase 3, tree-sitter.ts):
The real blocker for EventEmitter on real repos: inline handlers
(on('mount', function onmount(){})) weren't nodes, so nothing could link to them.
Root cause: visitFunctionBody walked through nested functions without extracting them.
Fix: in visitForCallsAndStructure, when a body node is a functionType and
extractName returns a real name, call extractFunction (which extracts it and walks
its own body) and return. Named only — anonymous arrows fall through to the existing
recursion (so their inner calls stay attributed to the enclosing fn). This bounded it:
excalidraw +3 nodes, no explosion, no regression.
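
The named-only rule can be illustrated on a toy AST. This sketch assumes a hypothetical Ast type and simplified extractFunction / visitBody helpers; the real pass walks tree-sitter nodes and its signatures are not shown in this document.

```typescript
// Hypothetical toy AST: functions (possibly anonymous) and calls.
type Ast =
  | { kind: "function"; name: string | null; body: Ast[] }
  | { kind: "call"; callee: string };

interface Extracted {
  nodes: string[];                            // function names that became graph nodes
  calls: Array<{ from: string; to: string }>; // call edges attributed to an owner
}

// Extract a named function as its own node, then walk its body with itself as owner.
function extractFunction(fn: { name: string; body: Ast[] }, out: Extracted): void {
  out.nodes.push(fn.name);
  for (const child of fn.body) visitBody(child, fn.name, out);
}

function visitBody(node: Ast, owner: string, out: Extracted): void {
  if (node.kind === "call") {
    out.calls.push({ from: owner, to: node.callee });
    return;
  }
  if (node.name !== null) {
    // Named nested function: becomes a linkable node; stop here because
    // extractFunction walks the body itself.
    extractFunction({ name: node.name, body: node.body }, out);
    return;
  }
  // Anonymous function: keep recursing so inner calls stay attributed to the owner.
  for (const child of node.body) visitBody(child, owner, out);
}

// A made-up enclosing function with one named and one anonymous inline handler.
const setup: { name: string; body: Ast[] } = {
  name: "setup",
  body: [
    { kind: "function", name: "onmount", body: [{ kind: "call", callee: "inherit" }] },
    { kind: "function", name: null, body: [{ kind: "call", callee: "foo" }] },
  ],
};
const out: Extracted = { nodes: [], calls: [] };
extractFunction(setup, out);
console.log(out.nodes); // setup and onmount; the anonymous function adds no node
console.log(out.calls); // onmount -> inherit, setup -> foo
```

Restricting extraction to named functions is what keeps the node count bounded: anonymous callbacks add no nodes, and their calls remain on the enclosing function.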
| Repo | Result |
|---|---|
| excalidraw | 1 synthesized edge triggerUpdate → triggerRender (of 27,214); trace(mutateElement, triggerRender) = 3 hops; nodes 9,286 → 9,289 |
| express | after Phase 3: use → onmount {event-emitter, event:"mount"} (onmount now extracted at application.js:109) |
| /tmp/cb-fixture/bus.js | tick → handleRefresh, persist → handleSave (named-method EventEmitter handlers) |
| excalidraw / express | no Phase-1 regression; node counts stable |

Remaining gaps:
- Anonymous arrow handlers (on('e', () => foo())) still produce no edge (no node, intentionally not extracted in Phase 3). The fix is synthesizer link-through-body: parse the arrow's body and link dispatcher → (calls inside the arrow). Highest remaining recall win; handles the most common modern callback shape.
- resolveAndPersist (incremental sync): synthesis currently runs only in resolveAndPersistBatched (full index). Incremental re-index won't refresh synthesized edges.
- type_of edges: type-aware matching, so x.emit('change') only links to y.on('change', fn) when x and y are the same type. Lets the fan-out cap relax.
- Scalar-store variant of the field observer (this.onChange = cb; … this.onChange()): not built.
- unregister/off are ignored.
- Synthesized edges are identified by provenance='heuristic' + metadata.synthesizedBy.

Related work:
This is one half of closing dynamic-dispatch coverage. The other artifacts on main:
- Named attribute/descriptor dispatch: claimsReference (resolution/types.ts, pre-filter in resolution/index.ts) + django ORM resolver (frameworks/python.ts, _iterable_class → ModelIterable.__iter__).
- Surfacing: explore whole-small-file + glue fixes, the explore Flow section (buildFlowFromNamedSymbols), and node-with-trail, all in src/mcp/tools.ts. (codegraph_trace / codegraph_context were later removed; explore is the one surfacing tool.)
- project_codegraph_read_displacement (why coverage, not prompting/hooks/new-tools, is the lever for getting agents to use codegraph over Read).