
feat(go): generated-file down-rank + gRPC stub-impl bridge + trace-failure inlining

Multi-pronged fix to make codegraph competitive on Go multi-module repos
(cosmos-sdk, etcd) where it previously lost or tied. Driven by an 8-question
agent-eval audit across cobra, gin, prometheus, cosmos-sdk, and etcd: the
baseline had codegraph losing ~60% on cost on cosmos-sdk and mixed on etcd
deep cross-module flows, while winning cleanly on the single-module and
non-protobuf-heavy repos.

Diagnostics ruled OUT `go.work` parsing as the gap (prometheus crushes
without it). The actual failure modes were generated-file noise warping
disambiguation, missing gRPC interface→impl bridge in structural-typing Go,
and trace's failure path triggering 3-5 follow-up tool calls instead of
inlining the material the agent needed.

Changes:

- New `src/extraction/generated-detection.ts` — path-pattern classifier
  for `.pb.go`, `.pulsar.go`, `_grpc.pb.go`, `_mock.go`, `_mocks.go`,
  `mock_*.go`, `.generated.[jt]sx?`, `_pb2(_grpc)?.py`, `.pb.{cc,h}`,
  `.g.dart`, `.freezed.dart`. Applied as a stable sort tiebreaker in
  `findSymbol`, `findAllSymbols`, `codegraph_search` (MCP + CLI),
  `codegraph_explore` file ranking, and the context formatter's Entry
  Points / Code blocks; the formatter's Related Symbols list drops
  generated-file nodes outright. Cosmos's `msgServer.Send` now ranks #3
  instead of #9 on a `Send` search.

- New `goGrpcStubImplEdges` synthesizer in `callback-synthesizer.ts` —
  detects `UnimplementedXxxServer` structs in generated files, identifies
  their RPC methods (excluding `mustEmbed*` / `testEmbeddedByValue` gRPC
  markers), and emits `calls` edges to the matching methods on any
  non-generated struct whose method-name set is a superset. Closes Go's
  structural-typing gap that the existing `interfaceOverrideEdges` (Java /
  Kotlin only) couldn't bridge. 467 bridge edges on cosmos-sdk; bank's
  `UnimplementedMsgServer::Send` points to `x/bank/keeper/msg_server.go`
  only, not to `msgClient` siblings or mock files.

- Trace-failure rewrite (`handleTrace`) — when no static path connects
  endpoints, instead of telling the agent to call `codegraph_node` (a
  3-4-call fan-out), inline both endpoints' bodies (120 lines / 3600 chars
  per endpoint), their callers (≤6), and callees (≤8) in one response.

- Trace endpoint-pairing improvements — scores every `from`×`to`
  candidate combo by shared directory prefix and tries the best-paired
  pair first (the full candidate set, not just FTS top-5). A
  less-canonical-path penalty (`enterprise/`, `contrib/`, `examples/`,
  `vendor/`, `third_party/`, `deprecated/`, `legacy/`) ensures the
  canonical-module pair wins even when a side-experiment shares more of
  its directory prefix. Find-path probe budget capped at 20 pairs.

- Test-file deprioritization in `codegraph_explore` `isLowValue` — adds
  suffix patterns (`_test.go`, `_spec.rb`, `.test.ts`, `.spec.tsx`,
  `Test.java`, `Spec.kt`) alongside the existing directory-style patterns.
  Otherwise etcd's `watchable_store_test.go` consumes 5K chars of explore
  budget that should go to the hand-written flow source.
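
The suffix check in the last bullet can be sketched as follows. This is a minimal illustration, not the code in `codegraph_explore`: `TEST_FILE_SUFFIXES` and `isTestFileBySuffix` are hypothetical names, and the JS/TS regex generalizes the two listed examples (`.test.ts`, `.spec.tsx`) to all four extensions, which is an assumption.

```typescript
// Hypothetical sketch of suffix-style test-file detection.
// Names and exact regexes are illustrative, not taken from the commit.
const TEST_FILE_SUFFIXES: ReadonlyArray<RegExp> = [
  /_test\.go$/,              // Go
  /_spec\.rb$/,              // Ruby (RSpec)
  /\.(test|spec)\.[jt]sx?$/, // JS / TS (assumed generalization)
  /(Test|Spec)\.(java|kt)$/, // Java / Kotlin
];

// Path-only check: true when the filename ends in a known test suffix.
function isTestFileBySuffix(filePath: string): boolean {
  return TEST_FILE_SUFFIXES.some((p) => p.test(filePath));
}

console.log(isTestFileBySuffix("watchable_store_test.go"));
console.log(isTestFileBySuffix("x/bank/keeper/msg_server.go"));
```

Suffix patterns complement the directory-style patterns because Go keeps `_test.go` files next to the code they test, so a `/tests/` directory check never sees them.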

Tests:

- New `__tests__/generated-detection.test.ts` (4 unit tests) pins the
  suffix patterns.
- New "Go gRPC stub→impl synthesis" integration test suite in
  `frameworks-integration.test.ts` (2 tests): positive bridge from stub
  to hand-written impl, AND the precision case (don't bridge to a
  generated sibling like `msgClient` in the same .pb.go).
- Full suite: 1076/1076 pass.

Empirical (post-fix, n=2 average per question):

| Repo / Q                  | WITH  | WITHOUT | Reads (W/WO) | Time (W/WO) |
|---------------------------|-------|---------|--------------|-------------|
| cobra (parse cmds)        | $0.27 | $0.27   | 0 / 4        | 39s / 60s   |
| prometheus (scrape→TSDB)  | $0.63 | $0.70   | 0 / 6        | 106s / 143s |
| cosmos-sdk Q1 (MsgSend)   | $0.41 | $0.26   | 1 / 2        | 67s / 64s   |
| cosmos-sdk Q2 (Delegate)  | $0.47 | $0.46   | 0 / 5        | 50s / 73s   |
| cosmos-sdk Q3 (gov tally) | $0.34 | $0.31   | 1.5 / 3      | 54s / 76s   |
| etcd Q1 (Put→raft)        | $0.65 | $0.78   | 0 / 4        | 98s / 129s  |
| etcd Q2 (watch)           | $0.36 | $0.50   | 0 / 4+       | 58s / 89s   |

Codegraph wins on reads on every question, and on time on every question
except cosmos-sdk Q1 (67s vs 64s). Cost is mixed: 3 clean wins, 3 tied
(within 10%), 1 stubborn cost loss on the grep-favored cosmos-sdk Q1.
Compared to baseline, the cosmos-sdk cost-gap collapsed from -60% to -15%
on average, and Q3 went from a 75% loss to a tie. Raw run artifacts in
`/tmp/cg-finalv2-*/` and `/tmp/cg-final-*/`.

Memory written at `project_go_multi_module_audit.md` for the methodology
+ before/after numbers.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Colby McHenry 3 weeks ago
Parent
commit
2777bb8dae

+ 2 - 1
.claude/skills/agent-eval/corpus.json

@@ -11,7 +11,8 @@
   "Go": [
     { "name": "cobra", "repo": "https://github.com/spf13/cobra", "size": "Small", "files": "~50", "question": "How does cobra parse commands and flags?" },
     { "name": "gin", "repo": "https://github.com/gin-gonic/gin", "size": "Medium", "files": "~150", "question": "How does gin route requests through its middleware chain?" },
-    { "name": "terraform", "repo": "https://github.com/hashicorp/terraform", "size": "Large", "files": "~4000", "question": "How does Terraform build and walk the resource dependency graph?" }
+    { "name": "terraform", "repo": "https://github.com/hashicorp/terraform", "size": "Large", "files": "~4000", "question": "How does Terraform build and walk the resource dependency graph?" },
+    { "name": "cosmos-sdk", "repo": "https://github.com/cosmos/cosmos-sdk", "size": "Large", "files": "~5000", "question": "How does a bank module MsgSend message reach the account balance update? Trace the cross-module call path from the bank keeper's Send handler through to the account/balance store update." }
   ],
   "Python": [
     { "name": "click", "repo": "https://github.com/pallets/click", "size": "Small", "files": "~60", "question": "How does click parse command-line arguments into commands?" },

+ 51 - 0
CHANGELOG.md

@@ -10,6 +10,57 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## [Unreleased]
 
 ### Added
+- **Generated-file down-ranking across search, trace, and explore.** A new
+  filename-based classifier (`src/extraction/generated-detection.ts`) flags
+  protobuf / gRPC / mockgen / build-output files (`.pb.go`, `.pulsar.go`,
+  `_grpc.pb.go`, `_mock.go`, `_mocks.go`, `mock_*.go`, `.generated.[jt]sx?`,
+  `_pb2(_grpc)?.py`, `.pb.{cc,h}`, `.g.dart`, `.freezed.dart`) and pushes them
+  LAST in disambiguation. Before this, a `codegraph_search "Send"` on
+  cosmos-sdk returned the gRPC interface stub at `tx_grpc.pb.go:124` as the
+  first match — the trace landed on that empty stub, reported "no path", and
+  the agent fell back to Read. With the down-rank applied to `findSymbol`,
+  `findAllSymbols`, `codegraph_search`, the CLI `query` command, AND the
+  context Entry Points / Code blocks (Related Symbols drops generated
+  nodes outright), the bank keeper's `msgServer.Send` (the real
+  implementation) ranks #3 instead of #9 and trace lands on it directly.
+  Pure path-based classifier: no schema change, no index migration.
+- **gRPC interface→implementation bridge for Go.** New synthesizer
+  `goGrpcStubImplEdges` in `src/resolution/callback-synthesizer.ts` finds
+  `UnimplementedXxxServer` structs in `.pb.go` / `_grpc.pb.go` files,
+  identifies their RPC-method signatures (excluding the `mustEmbed*` /
+  `testEmbeddedByValue` gRPC markers), and links each stub method to the
+  hand-written impl method on any struct whose method-name set is a
+  superset. Closes Go's structural-typing gap that the Java/Kotlin-only
+  `interfaceOverrideEdges` couldn't bridge. Excludes other generated files
+  from candidate impls so a sibling `msgClient` in the same `.pb.go` doesn't
+  get falsely paired. Measured on cosmos-sdk: 467 stub→impl `calls` edges
+  synthesized, bank's `UnimplementedMsgServer::Send` now points only to
+  `x/bank/keeper/msg_server.go::msgServer::Send` — not to mocks, not to
+  client wrappers.
+- **Trace-failure response now inlines both endpoints' bodies + neighbors.**
+  When `codegraph_trace` can't find a static call path (typically a
+  dynamic-dispatch break), it used to return a one-liner telling the agent
+  to call `codegraph_node` next — which triggered 3-4 follow-up calls plus a
+  Read. The new failure response inlines each endpoint's source (capped at
+  120 lines / 3600 chars), callers, and callees in one response. On the
+  cosmos-Q3 / etcd-Q2 audits this eliminated the entire fan-out pattern
+  (5-11 codegraph calls collapsed into 1-2).
+- **Path-proximity pairing in trace endpoint selection.** In a multi-module
+  Go repo, a symbol like `EndBlocker` exists in 20+ modules; FTS picks one
+  almost arbitrarily. Trace now scores every `from` × `to` candidate pair by
+  shared directory prefix length (longest match wins) so
+  `x/gov/abci.go::EndBlocker` + `x/gov/keeper/tally.go::Tally` are paired
+  before `simapp/app.go`'s wrapper EndBlocker is even considered. A
+  less-canonical-path penalty (`enterprise/`, `contrib/`, `examples/`,
+  `vendor/`, `third_party/`, `deprecated/`, `legacy/`) ensures a side-module
+  with a longer shared prefix doesn't beat the canonical module with a
+  shorter one. FindPath probe budget capped at 20 pairs.
+- **Test-file deprioritization in `codegraph_explore`.** Existing
+  `isLowValue` only caught directory-style patterns (`/tests/`, `/spec/`);
+  now also catches Go's `_test.go`, Ruby's `_spec.rb`, JS/TS `.test.ts` /
+  `.spec.tsx`, and Java/Kotlin/Scala `*Test.java` / `*Spec.kt`. Without
+  this, etcd's `watchable_store_test.go` consumed 5K chars of explore
+  budget that should have gone to the hand-written flow source.
 - **Java / Kotlin imports now resolve by fully-qualified name.** Extraction
   wraps every top-level declaration of a `.kt` / `.java` file in a `namespace`
   node carrying the file's `package` (so a class `Bar` in
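
The path-proximity scoring described in the changelog entry above can be exercised on its own. The sketch below restates the three helpers from the `src/mcp/tools.ts` diff in this commit and runs them on the cosmos-sdk paths the entry mentions. `enterprise/group/x/group/module.go` is a hypothetical file name chosen only to show the penalty; the other paths come from the changelog text.

```typescript
// Standalone restatement of the scoring helpers from the trace diff.
const sharedDirPrefixLen = (a: string, b: string): number => {
  const aDir = a.replace(/[^/]+$/, ""); // strip filename, keep trailing "/"
  const bDir = b.replace(/[^/]+$/, "");
  let i = 0;
  while (i < aDir.length && i < bDir.length && aDir[i] === bDir[i]) i++;
  return i; // number of leading characters the two directories share
};

const isLessCanonicalPath = (p: string): boolean =>
  /^(enterprise|contrib|examples?|sample|playground|vendor|third[_-]?party|deprecated|legacy)\//i.test(p);

const LESS_CANONICAL_PENALTY = 100;

const scorePair = (a: string, b: string): number =>
  sharedDirPrefixLen(a, b) -
  (isLessCanonicalPath(a) ? LESS_CANONICAL_PENALTY : 0) -
  (isLessCanonicalPath(b) ? LESS_CANONICAL_PENALTY : 0);

// Candidate from/to pairs; the enterprise "from" path is hypothetical.
const candidates: Array<[string, string]> = [
  ["simapp/app.go", "x/gov/keeper/tally.go"],
  ["enterprise/group/x/group/module.go", "enterprise/group/x/group/keeper/tally.go"],
  ["x/gov/abci.go", "x/gov/keeper/tally.go"],
];

const ranked = candidates
  .map(([f, t]) => ({ f, t, score: scorePair(f, t) }))
  .sort((a, b) => b.score - a.score);

for (const r of ranked) console.log(r.score, r.f, "->", r.t);
```

Note that the prefix is compared character by character, not per directory segment, so `x/gov/` and `x/group/` share three characters (`x/g`) even though they are different modules.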

+ 103 - 0
__tests__/frameworks-integration.test.ts

@@ -805,3 +805,106 @@ describe('Java anonymous-class override synthesis — end-to-end', () => {
     cg.close();
   });
 });
+
+describe('Go gRPC stub→impl synthesis', () => {
+  let tmpDir: string | undefined;
+  afterEach(() => {
+    if (tmpDir) fs.rmSync(tmpDir, { recursive: true, force: true });
+    tmpDir = undefined;
+  });
+
+  it('bridges UnimplementedMsgServer methods to the hand-written keeper impl', async () => {
+    tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-go-grpc-'));
+    // Mimic protoc-gen-go-grpc output: `*_grpc.pb.go` carrying the
+    // UnimplementedMsgServer stub.
+    fs.writeFileSync(
+      path.join(tmpDir, 'tx_grpc.pb.go'),
+      'package banktypes\n\n' +
+        'type UnimplementedMsgServer struct{}\n\n' +
+        'func (UnimplementedMsgServer) Send(ctx context.Context, req *MsgSend) (*MsgSendResponse, error) { return nil, nil }\n' +
+        'func (UnimplementedMsgServer) MultiSend(ctx context.Context, req *MsgMultiSend) (*MsgMultiSendResponse, error) { return nil, nil }\n' +
+        'func (UnimplementedMsgServer) mustEmbedUnimplementedMsgServer() {}\n' +
+        'func (UnimplementedMsgServer) testEmbeddedByValue() {}\n'
+    );
+    // Hand-written impl in a non-generated file — what an agent actually
+    // wants the trace to land on.
+    fs.writeFileSync(
+      path.join(tmpDir, 'msg_server.go'),
+      'package keeper\n\n' +
+        'type msgServer struct{ k Keeper }\n\n' +
+        'func (m msgServer) Send(ctx context.Context, req *MsgSend) (*MsgSendResponse, error) {\n' +
+        '  return m.k.SendCoins(ctx, req.From, req.To, req.Amount)\n' +
+        '}\n' +
+        'func (m msgServer) MultiSend(ctx context.Context, req *MsgMultiSend) (*MsgMultiSendResponse, error) {\n' +
+        '  return nil, nil\n' +
+        '}\n'
+    );
+
+    let cg: CodeGraph | undefined;
+    try {
+      cg = CodeGraph.initSync(tmpDir);
+      await cg.indexAll();
+
+      const stubSend = cg
+        .getNodesByKind('method')
+        .find((n) => n.qualifiedName.endsWith('UnimplementedMsgServer::Send'));
+      const implSend = cg
+        .getNodesByKind('method')
+        .find((n) => n.qualifiedName.endsWith('msgServer::Send'));
+      expect(stubSend, 'UnimplementedMsgServer.Send should be indexed').toBeDefined();
+      expect(implSend, 'msgServer.Send should be indexed').toBeDefined();
+
+      const bridge = cg
+        .getOutgoingEdges(stubSend!.id)
+        .find((e) => e.target === implSend!.id && e.kind === 'calls');
+      expect(bridge, 'stub Send should bridge to impl Send').toBeDefined();
+      expect(bridge!.provenance).toBe('heuristic');
+      expect((bridge!.metadata as { synthesizedBy?: string } | undefined)?.synthesizedBy).toBe(
+        'go-grpc-stub-impl'
+      );
+    } finally {
+      cg?.close();
+    }
+  });
+
+  it('does not bridge to candidates living in another generated file', async () => {
+    tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-go-grpc-sib-'));
+    // `*_grpc.pb.go` also contains a sibling `msgClient` struct that
+    // happens to satisfy the same method set. We must NOT bridge to it —
+    // it's not the hand-written impl, just the gRPC client wrapper.
+    fs.writeFileSync(
+      path.join(tmpDir, 'tx_grpc.pb.go'),
+      'package banktypes\n\n' +
+        'type UnimplementedMsgServer struct{}\n' +
+        'func (UnimplementedMsgServer) Send() {}\n' +
+        'func (UnimplementedMsgServer) MultiSend() {}\n\n' +
+        'type msgClient struct{}\n' +
+        'func (m msgClient) Send() {}\n' +
+        'func (m msgClient) MultiSend() {}\n'
+    );
+
+    let cg: CodeGraph | undefined;
+    try {
+      cg = CodeGraph.initSync(tmpDir);
+      await cg.indexAll();
+
+      const stub = cg
+        .getNodesByKind('struct')
+        .find((n) => n.name === 'UnimplementedMsgServer');
+      expect(stub).toBeDefined();
+      const bridges = cg
+        .getNodesByKind('method')
+        .filter((n) => n.qualifiedName.endsWith('UnimplementedMsgServer::Send'))
+        .flatMap((stubSend) => cg!.getOutgoingEdges(stubSend.id))
+        .filter(
+          (e) =>
+            e.kind === 'calls' &&
+            (e.metadata as { synthesizedBy?: string } | undefined)?.synthesizedBy ===
+              'go-grpc-stub-impl',
+        );
+      expect(bridges, 'no bridge to msgClient (also generated)').toHaveLength(0);
+    } finally {
+      cg?.close();
+    }
+  });
+});
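
The two tests above pin the matching rule that `goGrpcStubImplEdges` is described as applying. Its source is not part of this excerpt, so the following is only a sketch of that rule under stated assumptions: `StructInfo`, `BridgeEdge`, `bridgeStubsToImpls`, and the `looksGenerated` stand-in predicate are all hypothetical, and the real synthesizer works on graph nodes rather than plain objects.

```typescript
// Hypothetical, simplified model of the stub→impl bridging rule.
interface StructInfo { name: string; filePath: string; methods: string[] }
interface BridgeEdge { from: string; to: string; kind: "calls" }

// gRPC marker methods that are not RPCs and must not take part in matching.
const isGrpcMarker = (m: string): boolean =>
  m.startsWith("mustEmbed") || m === "testEmbeddedByValue";

function bridgeStubsToImpls(
  structs: StructInfo[],
  isGenerated: (filePath: string) => boolean,
): BridgeEdge[] {
  const edges: BridgeEdge[] = [];
  const stubs = structs.filter(
    (s) => /^Unimplemented\w+Server$/.test(s.name) && isGenerated(s.filePath),
  );
  // Only hand-written structs are candidate implementations.
  const impls = structs.filter((s) => !isGenerated(s.filePath));
  for (const stub of stubs) {
    const rpc = stub.methods.filter((m) => !isGrpcMarker(m));
    if (rpc.length === 0) continue;
    for (const impl of impls) {
      const have = new Set(impl.methods);
      // Superset check: the impl must define every RPC method of the stub.
      if (!rpc.every((m) => have.has(m))) continue;
      for (const m of rpc) {
        edges.push({ from: `${stub.name}::${m}`, to: `${impl.name}::${m}`, kind: "calls" });
      }
    }
  }
  return edges;
}

// Stand-in predicate for this sketch only.
const looksGenerated = (p: string): boolean => /\.pb\.go$/.test(p);

const edges = bridgeStubsToImpls(
  [
    { name: "UnimplementedMsgServer", filePath: "tx_grpc.pb.go",
      methods: ["Send", "MultiSend", "mustEmbedUnimplementedMsgServer", "testEmbeddedByValue"] },
    { name: "msgClient", filePath: "tx_grpc.pb.go", methods: ["Send", "MultiSend"] },
    { name: "msgServer", filePath: "msg_server.go", methods: ["Send", "MultiSend"] },
  ],
  looksGenerated,
);
console.log(edges);
```

Matching on method names alone is what makes the generated-file exclusion necessary: `msgClient` has the same method-name set as the stub, and only its location in a `.pb.go` file keeps it out of the candidate list.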

+ 47 - 0
__tests__/generated-detection.test.ts

@@ -0,0 +1,47 @@
+/**
+ * Regression coverage for the generated-file detector that drives
+ * symbol-disambiguation down-ranking. Locked here because the suffix
+ * list is a contract: if a future edit drops `.pb.go`, the cosmos-sdk
+ * trace endpoint regresses to the gRPC stub (see
+ * `project_go_multi_module_audit` memory + the audit in #N/A).
+ */
+
+import { describe, it, expect } from 'vitest';
+import { isGeneratedFile } from '../src/extraction/generated-detection';
+
+describe('isGeneratedFile', () => {
+  it('classifies Go protobuf / gRPC / pulsar / mock outputs as generated', () => {
+    expect(isGeneratedFile('api/cosmos/bank/v1beta1/tx_grpc.pb.go')).toBe(true);
+    expect(isGeneratedFile('x/bank/types/tx.pb.go')).toBe(true);
+    expect(isGeneratedFile('api/cosmos/bank/v1beta1/tx.pulsar.go')).toBe(true);
+    // cosmos-sdk uses `<base>_mocks.go`; mockgen's default is `mock_<src>.go`;
+    // many projects use `<base>_mock.go`. All three are mockgen output.
+    expect(isGeneratedFile('x/auth/testutil/expected_keepers_mocks.go')).toBe(true);
+    expect(isGeneratedFile('internal/foo_mock.go')).toBe(true);
+    expect(isGeneratedFile('mock_keeper.go')).toBe(true);
+  });
+
+  it('does not flag the hand-written keeper as generated', () => {
+    expect(isGeneratedFile('x/bank/keeper/msg_server.go')).toBe(false);
+    expect(isGeneratedFile('x/bank/keeper/send.go')).toBe(false);
+  });
+
+  it('catches common cross-language codegen suffixes', () => {
+    expect(isGeneratedFile('app/foo.generated.ts')).toBe(true);
+    expect(isGeneratedFile('app/foo.generated.tsx')).toBe(true);
+    expect(isGeneratedFile('proto/bar_pb2.py')).toBe(true);
+    expect(isGeneratedFile('proto/bar_pb2_grpc.py')).toBe(true);
+    expect(isGeneratedFile('lib/baz.pb.cc')).toBe(true);
+    expect(isGeneratedFile('lib/baz.pb.h')).toBe(true);
+    expect(isGeneratedFile('lib/quux.g.dart')).toBe(true);
+    expect(isGeneratedFile('lib/quux.freezed.dart')).toBe(true);
+  });
+
+  it('leaves ordinary source files alone', () => {
+    expect(isGeneratedFile('src/index.ts')).toBe(false);
+    expect(isGeneratedFile('src/components/Foo.tsx')).toBe(false);
+    expect(isGeneratedFile('lib/main.dart')).toBe(false);
+    expect(isGeneratedFile('cmd/server/main.go')).toBe(false);
+    expect(isGeneratedFile('app/db.py')).toBe(false);
+  });
+});

+ 11 - 1
src/bin/codegraph.ts

@@ -843,11 +843,21 @@ program
       const cg = await CodeGraph.open(projectPath);
 
       const limit = parseInt(options.limit || '10', 10);
-      const results = cg.searchNodes(search, {
+      const rawResults = cg.searchNodes(search, {
         limit,
         kinds: options.kind ? [options.kind as any] : undefined,
       });
 
+      // Mirror the MCP search down-rank so the CLI also surfaces the
+      // hand-written implementation before protobuf/gRPC scaffolding
+      // when both share a name. See extraction/generated-detection.ts.
+      const { isGeneratedFile } = await import('../extraction/generated-detection');
+      const results = [...rawResults].sort((a, b) => {
+        const aGen = isGeneratedFile(a.node.filePath) ? 1 : 0;
+        const bGen = isGeneratedFile(b.node.filePath) ? 1 : 0;
+        return aGen - bGen;
+      });
+
       if (options.json) {
         console.log(JSON.stringify(results, null, 2));
       } else {

+ 25 - 6
src/context/formatter.ts

@@ -5,6 +5,7 @@
  */
 
 import { Node, Edge, TaskContext, Subgraph } from '../types';
+import { isGeneratedFile } from '../extraction/generated-detection';
 
 /**
  * Format context as markdown
@@ -21,10 +22,17 @@ export function formatContextAsMarkdown(context: TaskContext): string {
   lines.push('## Code Context\n');
   lines.push(`**Query:** ${context.query}\n`);
 
-  // Entry points - compact format
-  if (context.entryPoints.length > 0) {
+  // Entry points - compact format. Re-sort so generated files (.pb.go,
+  // .pulsar.go, mocks, …) rank LAST — a flow query should lead with the
+  // hand-written implementation, not protobuf scaffolding.
+  const orderedEntries = [...context.entryPoints].sort((a, b) => {
+    const aGen = isGeneratedFile(a.filePath) ? 1 : 0;
+    const bGen = isGeneratedFile(b.filePath) ? 1 : 0;
+    return aGen - bGen;
+  });
+  if (orderedEntries.length > 0) {
     lines.push('### Entry Points\n');
-    for (const node of context.entryPoints) {
+    for (const node of orderedEntries) {
       const location = node.startLine ? `:${node.startLine}` : '';
       lines.push(`- **${node.name}** (${node.kind}) - ${node.filePath}${location}`);
       if (node.signature) {
@@ -34,9 +42,14 @@ export function formatContextAsMarkdown(context: TaskContext): string {
     lines.push('');
   }
 
-  // Related symbols - compact list (skip verbose structure tree)
+  // Related symbols - compact list (skip verbose structure tree). Drop nodes
+  // in generated source files (`.pb.go` / `.pulsar.go` / mocks / …) — agents
+  // chasing a flow never want to land on protobuf scaffolding (cosmos-Q3 used
+  // to list `gov.pulsar.go::GetExpeditedThreshold` and `1.pulsar.go::Get` in
+  // Related Symbols, pure noise that displaced real-flow entries).
   const otherSymbols = Array.from(context.subgraph.nodes.values())
     .filter(n => !context.entryPoints.some(e => e.id === n.id))
+    .filter(n => !isGeneratedFile(n.filePath))
     .slice(0, 10); // Limit to 10 related symbols
 
   if (otherSymbols.length > 0) {
@@ -55,10 +68,16 @@ export function formatContextAsMarkdown(context: TaskContext): string {
     lines.push('');
   }
 
-  // Code blocks - only for key entry points
+  // Code blocks - only for key entry points. Re-sort so non-generated blocks
+  // show first (consistent with Entry Points reordering above).
   if (context.codeBlocks.length > 0) {
+    const orderedBlocks = [...context.codeBlocks].sort((a, b) => {
+      const aGen = isGeneratedFile(a.filePath) ? 1 : 0;
+      const bGen = isGeneratedFile(b.filePath) ? 1 : 0;
+      return aGen - bGen;
+    });
     lines.push('### Code\n');
-    for (const block of context.codeBlocks) {
+    for (const block of orderedBlocks) {
       const nodeName = block.node?.name ?? 'Unknown';
       lines.push(`#### ${nodeName} (${block.filePath}:${block.startLine})\n`);
       lines.push('```' + block.language);

+ 55 - 0
src/extraction/generated-detection.ts

@@ -0,0 +1,55 @@
+/**
+ * Generated-file detection for symbol-disambiguation down-ranking.
+ *
+ * When a query like "Send" matches 17 symbols across protobuf scaffolding,
+ * test mocks, and the hand-written implementation, the FTS ranker often
+ * surfaces the generated stubs first because their names are identical
+ * to the implementation's name (validated empirically on cosmos-sdk —
+ * see project_go_multi_module_audit memory). Generated stubs frequently
+ * have no body to trace from, so the agent ends up reading source anyway.
+ *
+ * This helper is a pure path-based classifier consulted at disambiguation
+ * time (findSymbol / findAllSymbols / codegraph_search formatting), NOT
+ * a hard filter — generated nodes are still in the graph and remain
+ * reachable; they just rank LAST when there's a real implementation
+ * with the same name.
+ *
+ * Scope: suffix patterns only. Most generated files follow the
+ * `<basename>.<tool>.<ext>` convention (`.pb.go`, `_grpc.pb.go`,
+ * `.g.dart`, `_pb2.py`), and that covers ~all of what we saw in the
+ * Go audit. A future addition would be scanning for the canonical
+ * `// Code generated by` header during extraction, for the rare files
+ * that defy the suffix convention.
+ */
+
+const GENERATED_PATTERNS: ReadonlyArray<RegExp> = [
+  // Go — protobuf / gRPC / pulsar
+  /\.pb\.go$/,
+  /\.pulsar\.go$/,
+  /_grpc\.pb\.go$/,
+  // Go — mockgen output. Default emits `mock_<src>.go`; many projects
+  // (cosmos-sdk uses `expected_*_mocks.go`) rename to `*_mock.go` /
+  // `*_mocks.go`. Matching either suffix catches both conventions
+  // without false-positive risk on hand-written sources.
+  /_mock\.go$/,
+  /_mocks\.go$/,
+  /(^|\/)mock_[^/]+\.go$/,
+  // TypeScript / JavaScript — common codegen suffix
+  /\.generated\.[jt]sx?$/,
+  // Python — protobuf
+  /_pb2(_grpc)?\.py$/,
+  // C++ — protobuf
+  /\.pb\.(cc|h)$/,
+  // Dart — build_runner / freezed
+  /\.g\.dart$/,
+  /\.freezed\.dart$/,
+];
+
+/**
+ * Whether `filePath` looks like a tool-generated source file based on
+ * its filename. Path-only — does not read content. The result is a
+ * relevance hint for disambiguation, not a hard claim.
+ */
+export function isGeneratedFile(filePath: string): boolean {
+  return GENERATED_PATTERNS.some((p) => p.test(filePath));
+}
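
The generated-vs-not comparator is repeated at several call sites in this commit (CLI search, MCP search, and the context formatter). A shared helper could look like the sketch below. `downRankGenerated` is a hypothetical name, not something this commit adds, and `looksGenerated` is a one-pattern stand-in for `isGeneratedFile` so the example runs by itself.

```typescript
// Hypothetical helper: move generated-file items to the end while keeping
// the incoming (e.g. FTS) order within each group. This relies on
// Array.prototype.sort being stable, which ECMAScript 2019 guarantees.
function downRankGenerated<T>(
  items: readonly T[],
  pathOf: (item: T) => string,
  isGenerated: (filePath: string) => boolean,
): T[] {
  return [...items].sort(
    (a, b) => Number(isGenerated(pathOf(a))) - Number(isGenerated(pathOf(b))),
  );
}

// Stand-in predicate for this sketch only.
const looksGenerated = (p: string): boolean => /\.pb\.go$/.test(p);

const hits = [
  { name: "Send", filePath: "api/cosmos/bank/v1beta1/tx_grpc.pb.go" },
  { name: "Send", filePath: "x/bank/keeper/msg_server.go" },
  { name: "Send", filePath: "x/bank/types/tx.pb.go" },
  { name: "Send", filePath: "x/bank/keeper/send.go" },
];

const ordered = downRankGenerated(hits, (h) => h.filePath, looksGenerated);
console.log(ordered.map((h) => h.filePath));
```

Because the comparator returns 0 for two items in the same group, hand-written results keep their original relative order, and so do generated ones.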

+ 175 - 33
src/mcp/tools.ts

@@ -24,6 +24,7 @@ import {
   writeSync,
 } from 'fs';
 import { clamp, validatePathWithinRoot, validateProjectPath } from '../utils';
+import { isGeneratedFile } from '../extraction/generated-detection';
 import { tmpdir } from 'os';
 import { join, resolve as resolvePath } from 'path';
 
@@ -1014,7 +1015,16 @@ export class ToolHandler {
       return this.textResult(`No results found for "${query}"`);
     }
 
-    const formatted = this.formatSearchResults(results);
+    // Down-rank generated files within the FTS-returned set so a search
+    // for "Send" surfaces the hand-written keeper before .pb.go stubs
+    // that share the name. Stable: only reorders generated vs. not.
+    const ranked = [...results].sort((a, b) => {
+      const aGen = isGeneratedFile(a.node.filePath) ? 1 : 0;
+      const bGen = isGeneratedFile(b.node.filePath) ? 1 : 0;
+      return aGen - bGen;
+    });
+
+    const formatted = this.formatSearchResults(ranked);
     return this.textResult(this.truncateOutput(formatted));
   }
 
@@ -1232,41 +1242,137 @@ export class ToolHandler {
     // (which, on real code, means the flow breaks at dynamic dispatch).
     const edgeKinds: Edge['kind'][] = ['calls'];
     const MAX_HOPS = 7;
-    const fromTry = fromMatches.nodes.slice(0, 3);
-    const toTry = toMatches.nodes.slice(0, 3);
+    // Path-proximity pairing: in a multi-module repo a symbol name like
+    // `EndBlocker` exists in 20+ modules. FTS picks one almost arbitrarily;
+    // the WRONG pair (e.g. simapp's wrapper EndBlocker paired with gov's Tally)
+    // has no static path, falls through to the dynamic-dispatch failure branch,
+    // and surfaces unrelated bodies — exactly the cosmos-Q3 trace failure mode.
+    // Score every from×to combo by shared file-path prefix length; try the
+    // most-co-located pair first (e.g. `x/gov/abci.go::EndBlocker` ×
+    // `x/gov/keeper/tally.go::Tally` share `x/gov/`).
+    //
+    // Consider the FULL candidate set, not just the FTS top-5: the right
+    // EndBlocker for a gov-module flow may rank 8th in FTS but share the
+    // entire `x/gov/` prefix with the destination. Path-proximity supersedes
+    // FTS for this disambiguation. Findpath trials are still capped by
+    // FINDPATH_PAIR_BUDGET below to bound graph traversal cost.
+    const sharedDirPrefixLen = (a: string, b: string): number => {
+      const aDir = a.replace(/[^/]+$/, '');
+      const bDir = b.replace(/[^/]+$/, '');
+      let i = 0;
+      while (i < aDir.length && i < bDir.length && aDir[i] === bDir[i]) i++;
+      return i;
+    };
+    // Cosmos-Q3 surfaced a second-order failure: `enterprise/group/x/group/`
+    // SHARES MORE of its path with `enterprise/group/x/group/keeper/tally.go`
+    // (25 chars) than `x/gov/abci.go` shares with `x/gov/keeper/tally.go`
+    // (6 chars), so pure shared-prefix prefers the side-experiment module
+    // over the canonical one — even though the user's question is clearly
+    // about the main gov module. Penalize candidates living under prefixes
+    // that conventionally hold extensions / experiments / vendored code, so
+    // the canonical-path pair wins even when its shared prefix is short.
+    const isLessCanonicalPath = (p: string): boolean =>
+      /^(enterprise|contrib|examples?|sample|playground|vendor|third[_-]?party|deprecated|legacy)\//i.test(p);
+    const LESS_CANONICAL_PENALTY = 100; // any canonical candidate beats any less-canonical one
+    const scorePair = (a: string, b: string): number =>
+      sharedDirPrefixLen(a, b)
+      - (isLessCanonicalPath(a) ? LESS_CANONICAL_PENALTY : 0)
+      - (isLessCanonicalPath(b) ? LESS_CANONICAL_PENALTY : 0);
+    const fromCands = fromMatches.nodes;
+    const toCands = toMatches.nodes;
+    const pairs: Array<{ f: Node; t: Node; score: number }> = [];
+    for (const f of fromCands) {
+      for (const t of toCands) {
+        pairs.push({ f, t, score: scorePair(f.filePath, t.filePath) });
+      }
+    }
+    // Sort by shared prefix desc, then by FTS order (already encoded in the
+    // pairs' insertion order — both for f and t). The tiebreaker preserves
+    // findAllSymbols' generated-file-last ranking.
+    pairs.sort((a, b) => b.score - a.score);
+    // Cap how many graph-path probes we attempt so a 50×50 cross-product
+    // doesn't blow up on a god-named symbol like `Get` (well-named flows have
+    // their good pair near the top of the sort anyway).
+    const FINDPATH_PAIR_BUDGET = 20;
+    const fromTry = fromCands;
+    const toTry = toCands;
     let path: Array<{ node: Node; edge: Edge | null }> | null = null;
     let overCap: Array<{ node: Node; edge: Edge | null }> | null = null;
-    for (const f of fromTry) {
-      for (const t of toTry) {
-        const p = cg.findPath(f.id, t.id, edgeKinds);
-        if (!p || p.length <= 1) continue;
-        if (p.length <= MAX_HOPS) { path = p; break; }
-        if (!overCap || p.length < overCap.length) overCap = p;
-      }
+    let bestPair: { f: Node; t: Node } | null = null;
+    let triedPairs = 0;
+    for (const { f, t } of pairs) {
       if (path) break;
+      if (triedPairs >= FINDPATH_PAIR_BUDGET) break;
+      triedPairs++;
+      const p = cg.findPath(f.id, t.id, edgeKinds);
+      if (p && p.length > 1) {
+        if (p.length <= MAX_HOPS) { path = p; bestPair = { f, t }; break; }
+        if (!overCap || p.length < overCap.length) { overCap = p; bestPair = { f, t }; }
+      } else if (!bestPair) {
+        // No path yet — remember the top-scored pair so the failure branch
+        // surfaces the most-co-located candidates' bodies, not whatever FTS
+        // happened to put first.
+        bestPair = { f, t };
+      }
     }
 
     if (!path) {
-      // No static path — almost always a dynamic-dispatch break. Surface the
-      // start symbol's outgoing calls so the agent can bridge the gap.
-      const start = fromTry[0]!;
-      const callees = cg.getCallees(start.id).slice(0, 10)
-        .map(c => `${c.node.name} (${c.node.filePath}:${c.node.startLine})`);
+      // No static path — almost always a dynamic-dispatch break. INSTEAD of
+      // telling the agent to chase the gap with codegraph_node/callers/callees
+      // (which fans out into 3-4 follow-up tool calls + a Read), inline the
+      // material those would have returned right here. Measured on cosmos-Q3:
+      // the failed-trace + subsequent fan-out used to cost ~2× a single
+      // sufficient trace call; this branch closes that gap.
+      // Prefer the path-proximity-best pair we identified above (e.g. gov's
+      // EndBlocker × gov's Tally) over the FTS top-pick (simapp's wrapper).
+      const start = bestPair?.f ?? fromTry[0]!;
+      const end = bestPair?.t ?? toTry[0]!;
+      const fileCache = new Map<string, string[]>();
       const lines = [
-        `No direct call path from "${from}" to "${to}".`,
+        `No direct static call path from "${from}" to "${to}" — the chain almost certainly breaks at dynamic dispatch (a callback / interface dispatch / framework hook / metaclass). Both endpoint bodies + their immediate neighbors are inlined below; answer from them — a follow-up codegraph_node/callers/callees on these would just return what is already here.`,
         '',
-        (overCap
-          ? `(Only a ${overCap.length}-hop indirect chain connects them — almost certainly a BFS wander through unrelated code, not the real flow.) `
-          : '') +
-        'The direct chain most likely breaks at **dynamic dispatch** (a callback, descriptor, ' +
-        'metaclass, or attribute-as-callable) that static parsing cannot resolve into an edge. ' +
-        `Inspect \`${start.name}\` (${start.filePath}:${start.startLine}) with codegraph_node ` +
-        '(includeCode=true) — its body usually shows the dynamic call to follow next.',
       ];
-      if (callees.length > 0) {
-        lines.push('', `**${start.name} statically calls:** ${callees.join(', ')}`);
+      if (overCap) {
+        lines.push(
+          `> Indirect chain of ${overCap.length} hops exists but is over the ${MAX_HOPS}-hop cap (usually a BFS wander through unrelated code, not the real execution flow).`,
+          '',
+        );
       }
-      return this.textResult(lines.join('\n') + fromMatches.note + toMatches.note);
+
+      const inlineEndpoint = (
+        label: 'FROM' | 'TO',
+        node: Node,
+        // calls/callers caps are tight on purpose — the full bodies are what
+        // displaces the Read; the lists are just enough hint to follow if needed.
+      ) => {
+        lines.push(`### ${label}: \`${node.name}\` (${node.filePath}:${node.startLine}-${node.endLine})`);
+        // Modest endpoint-source cap (120 lines / 3600 chars). Earlier bumped to
+        // 200/6000 to fit cosmos-gov's 261-line EndBlocker without truncation,
+        // but the n=2 audit showed the agent re-Reads regardless — so the extra
+        // characters were pure cost without payoff. 120/3600 captures most
+        // real-world endpoint bodies (the gRPC stubs / module Begin/EndBlocker
+        // wrappers we typically land on are short) at half the token weight.
+        const body = this.sourceRangeAt(cg, node.filePath, node.startLine, node.endLine, fileCache, 120, 3600);
+        if (body) lines.push(body);
+        const callers = cg.getCallers(node.id).slice(0, 6);
+        if (callers.length > 0) {
+          lines.push(`**Callers of \`${node.name}\`:** ` +
+            callers.map(c => `${c.node.name} (${c.node.filePath}:${c.node.startLine})`).join(', '));
+        }
+        const callees = cg.getCallees(node.id).slice(0, 8);
+        if (callees.length > 0) {
+          lines.push(`**\`${node.name}\` calls:** ` +
+            callees.map(c => `${c.node.name} (${c.node.filePath}:${c.node.startLine})`).join(', '));
+        }
+        lines.push('');
+      };
+      inlineEndpoint('FROM', start);
+      if (end.id !== start.id) inlineEndpoint('TO', end);
+
+      lines.push(
+        '> Both endpoint bodies, callers, and callees are inlined above. The dynamic-dispatch hop typically appears in one of them as: a callback registration, an interface method invoked on a field, a framework hook, or a generated stub. Identify the gap from the bodies — no further codegraph_node/Read is needed for these symbols.',
+      );
+      return this.textResult(this.truncateOutput(lines.join('\n') + fromMatches.note + toMatches.note));
     }
 
     const lines: string[] = [
@@ -1670,15 +1776,33 @@ export class ToolHandler {
       const bRelevant = hasQueryRelevance(bPath, b[1].nodes);
       if (aRelevant !== bRelevant) return aRelevant ? -1 : 1;
 
-      // Deprioritize test files, icon files, and i18n files
+      // Deprioritize test files, icon files, and i18n files. Covers both
+      // directory-style (`/tests/`, `/spec/`) AND suffix-style conventions
+      // (`*_test.go`, `*_spec.rb`, `*.test.ts`, `*.spec.tsx`, `*Test.java`,
+      // `*Spec.kt`) — without the suffix check, etcd's `watchable_store_test.go`
+      // displaced 5K chars of real-flow source in codegraph_explore for Q2.
       const isLowValue = (p: string) =>
         /\/(tests?|__tests?__|spec)\//i.test(p) ||
+        /_test\.(go|py|rb)$/i.test(p) ||
+        /_spec\.rb$/i.test(p) ||
+        /\.(test|spec)\.[jt]sx?$/i.test(p) ||
+        /(Test|Spec|Tests)\.(java|kt|scala)$/.test(p) ||
         /\bicons?\b/i.test(p) ||
         /\bi18n\b/i.test(p);
       const aLow = isLowValue(aPath);
       const bLow = isLowValue(bPath);
       if (aLow !== bLow) return aLow ? 1 : -1;
 
+      // Deprioritize generated source (.pb.go / .pulsar.go / _mocks.go / …) —
+      // the agent rarely needs to see the protobuf scaffold or gomock output
+      // when asking about the actual flow, and dumping their bodies inflates
+      // the response (the cosmos Q3 explore otherwise leads with
+      // `expected_keepers_mocks.go`, displacing the real `tally.go` content
+      // and forcing the agent to Read tally.go anyway).
+      const aGen = isGeneratedFile(a[0]);
+      const bGen = isGeneratedFile(b[0]);
+      if (aGen !== bGen) return aGen ? 1 : -1;
+
       if (a[1].score !== b[1].score) return b[1].score - a[1].score;
       return b[1].nodes.length - a[1].nodes.length;
     });
@@ -2519,12 +2643,21 @@ export class ToolHandler {
     }
 
     if (exactMatches.length > 1) {
+      // Down-rank generated files (.pb.go, .pulsar.go, _grpc.pb.go, …)
+      // so a query like "Send" prefers the keeper implementation over
+      // the protobuf-generated interface stub. Stable sort preserves
+      // FTS order within each group. See generated-detection.ts.
+      const ranked = [...exactMatches].sort((a, b) => {
+        const aGen = isGeneratedFile(a.node.filePath) ? 1 : 0;
+        const bGen = isGeneratedFile(b.node.filePath) ? 1 : 0;
+        return aGen - bGen;
+      });
       // Multiple exact matches - pick first, note the others
-      const picked = exactMatches[0]!.node;
-      const others = exactMatches.slice(1).map(r =>
+      const picked = ranked[0]!.node;
+      const others = ranked.slice(1).map(r =>
         `${r.node.name} (${r.node.kind}) at ${r.node.filePath}:${r.node.startLine}`
       );
-      const note = `\n\n> **Note:** ${exactMatches.length} symbols named "${symbol}". Showing results for \`${picked.filePath}:${picked.startLine}\`. Others: ${others.join(', ')}`;
+      const note = `\n\n> **Note:** ${ranked.length} symbols named "${symbol}". Showing results for \`${picked.filePath}:${picked.startLine}\`. Others: ${others.join(', ')}`;
       return { node: picked, note };
     }
 
@@ -2562,11 +2695,20 @@ export class ToolHandler {
       return { nodes: [node], note: '' };
     }
 
-    const locations = exactMatches.map(r =>
+    // Same generated-file down-rank as findSymbol. Keeps callers / callees /
+    // impact aggregation aligned (a query against "Send" returns the
+    // hand-written implementations before the protobuf scaffold).
+    const ranked = [...exactMatches].sort((a, b) => {
+      const aGen = isGeneratedFile(a.node.filePath) ? 1 : 0;
+      const bGen = isGeneratedFile(b.node.filePath) ? 1 : 0;
+      return aGen - bGen;
+    });
+
+    const locations = ranked.map(r =>
       `${r.node.kind} at ${r.node.filePath}:${r.node.startLine}`
     );
-    const note = `\n\n> **Note:** Aggregated results across ${exactMatches.length} symbols named "${symbol}": ${locations.join(', ')}`;
-    return { nodes: exactMatches.map(r => r.node), note };
+    const note = `\n\n> **Note:** Aggregated results across ${ranked.length} symbols named "${symbol}": ${locations.join(', ')}`;
+    return { nodes: ranked.map(r => r.node), note };
   }
 
   /**

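For reviewers who want to check the ranking behaviour in isolation, here is a minimal sketch of the down-rank used in `findSymbol` / `findAllSymbols` above. It is not the code from `generated-detection.ts`: `isGeneratedFileSketch` is a hypothetical stand-in that covers only a few of the Go patterns named in the commit message, the match objects are flattened (the real ones carry `node.filePath`), and every sample path except `x/bank/keeper/msg_server.go` is made up for illustration.

```typescript
// Hypothetical stand-in for isGeneratedFile; the real classifier lives in
// src/extraction/generated-detection.ts and covers more patterns.
const GENERATED_PATTERNS: RegExp[] = [
  /\.pb\.go$/, // also matches _grpc.pb.go
  /\.pulsar\.go$/,
  /_mocks?\.go$/,
  /(^|\/)mock_[^/]*\.go$/,
];

function isGeneratedFileSketch(filePath: string): boolean {
  return GENERATED_PATTERNS.some((re) => re.test(filePath));
}

// Comparator returns 0 for two files in the same group, so the (stable)
// Array.prototype.sort keeps the incoming FTS order within each group.
function downRankGenerated<T extends { filePath: string }>(matches: readonly T[]): T[] {
  return [...matches].sort(
    (a, b) =>
      Number(isGeneratedFileSketch(a.filePath)) - Number(isGeneratedFileSketch(b.filePath)),
  );
}

// Illustrative input in "FTS order"; paths are assumptions, not audit data.
const ranked = downRankGenerated([
  { name: 'Send', filePath: 'x/bank/types/tx_grpc.pb.go' },
  { name: 'Send', filePath: 'x/bank/keeper/msg_server.go' },
  { name: 'Send', filePath: 'x/bank/testutil/expected_keepers_mocks.go' },
  { name: 'Send', filePath: 'x/bank/keeper/send.go' },
]);

console.log(ranked.map((m) => m.filePath));
```

With this input the two hand-written files come first in their original relative order, followed by the two generated files in theirs. The sort is only a tiebreaker between groups and never reorders results inside a group.
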
+ 112 - 0
src/resolution/callback-synthesizer.ts

@@ -24,6 +24,7 @@
 import type { Edge, Node } from '../types';
 import type { QueryBuilder } from '../db/queries';
 import type { ResolutionContext } from './types';
+import { isGeneratedFile } from '../extraction/generated-detection';
 
 const REGISTRAR_NAME = /^(on[A-Z]\w*|subscribe|addListener|addEventListener|register|watch|listen|addCallback)$/;
 const DISPATCHER_NAME = /(emit|trigger|notify|dispatch|fire|publish|flush)/i;
@@ -386,6 +387,115 @@ function interfaceOverrideEdges(queries: QueryBuilder): Edge[] {
   return edges;
 }
 
+/**
+ * Go gRPC stub → impl bridge. The protoc-gen-go-grpc codegen emits an
+ * `UnimplementedXxxServer` struct in `*_grpc.pb.go` carrying one method
+ * per service RPC; the real handler is a hand-written struct in another
+ * file (`x/bank/keeper/msg_server.go::msgServer.Send` in cosmos-sdk).
+ * Go's structural typing means no `implements` edge exists for our
+ * resolver to follow, so `trace("Send","SendCoins")` lands on the
+ * empty stub and reports "no path" (validated empirically — the cosmos
+ * Q1 r1 trace failure that drove this work).
+ *
+ * Bridge: for each `UnimplementedXxxServer` whose RPC-method names are
+ * a SUBSET of some other Go struct's method names, emit `calls` edges
+ * `stub.method → impl.method` (paired by name). Excludes the gRPC
+ * internal markers `mustEmbedUnimplementedXxxServer` and
+ * `testEmbeddedByValue`, and skips candidate impls that themselves
+ * live in a generated file (their `xxxClient` / sibling stubs would
+ * otherwise look like impls).
+ *
+ * Multiple candidates are allowed; edges are capped at
+ * MAX_CALLBACKS_PER_CHANNEL per candidate. A service often has both a
+ * production impl and non-generated test mocks; linking to all avoids favoring one.
+ *
+ * Provenance: `heuristic`, `synthesizedBy: 'go-grpc-stub-impl'`. The
+ * stub's source line is the wiring site shown in the trace trail.
+ */
+function goGrpcStubImplEdges(queries: QueryBuilder): Edge[] {
+  const edges: Edge[] = [];
+  const seen = new Set<string>();
+
+  const STUB_RE = /^Unimplemented.*Server$/;
+  // gRPC internal-helper methods that appear on every Unimplemented*Server;
+  // not part of the service contract, so exclude when computing the RPC-method
+  // signature used to match impls.
+  const isInternalMarker = (n: string) => n.startsWith('mustEmbed') || n === 'testEmbeddedByValue';
+
+  // Methods directly contained by each Go struct (name set + nodes). Built once.
+  const methodNamesByStruct = new Map<string, Set<string>>();
+  const methodNodesByStruct = new Map<string, Node[]>();
+  const goStructs: Node[] = [];
+  for (const s of queries.getNodesByKind('struct')) {
+    if (s.language !== 'go') continue;
+    goStructs.push(s);
+    const ms = queries
+      .getOutgoingEdges(s.id, ['contains'])
+      .map((e) => queries.getNodeById(e.target))
+      .filter((n): n is Node => !!n && n.kind === 'method');
+    methodNodesByStruct.set(s.id, ms);
+    methodNamesByStruct.set(s.id, new Set(ms.map((m) => m.name)));
+  }
+
+  for (const stub of goStructs) {
+    if (!STUB_RE.test(stub.name)) continue;
+    // The stub MUST live in a generated file — that's what tells us this is
+    // a protoc-emitted scaffold rather than someone naming a struct
+    // `UnimplementedXxxServer` by hand. Without this gate we'd also bridge
+    // such hand-written structs and create misleading edges.
+    if (!isGeneratedFile(stub.filePath)) continue;
+
+    const stubMethods = (methodNodesByStruct.get(stub.id) ?? []).filter(
+      (m) => !isInternalMarker(m.name),
+    );
+    if (stubMethods.length === 0) continue;
+    const stubMethodNames = stubMethods.map((m) => m.name);
+
+    for (const cand of goStructs) {
+      if (cand.id === stub.id) continue;
+      // Skip generated-file candidates — they're siblings (msgClient,
+      // UnsafeMsgServer, …) whose method sets coincidentally match.
+      if (isGeneratedFile(cand.filePath)) continue;
+
+      const candNames = methodNamesByStruct.get(cand.id);
+      if (!candNames) continue;
+      // Subset: every RPC method must exist on the candidate by name.
+      // Signature-level match would tighten this further, but name-match
+      // alone already gives one-to-one pairing in real codebases because
+      // gRPC method-name sets are highly distinctive (Send + MultiSend +
+      // UpdateParams + SetSendEnabled is unique to bank's MsgServer).
+      if (!stubMethodNames.every((n) => candNames.has(n))) continue;
+
+      const candMethods = methodNodesByStruct.get(cand.id) ?? [];
+      let added = 0;
+      for (const sm of stubMethods) {
+        if (added >= MAX_CALLBACKS_PER_CHANNEL) break;
+        for (const cm of candMethods) {
+          if (added >= MAX_CALLBACKS_PER_CHANNEL) break;
+          if (cm.name !== sm.name) continue;
+          const key = `${sm.id}>${cm.id}`;
+          if (seen.has(key)) continue;
+          seen.add(key);
+          edges.push({
+            source: sm.id,
+            target: cm.id,
+            kind: 'calls',
+            line: sm.startLine,
+            provenance: 'heuristic',
+            metadata: {
+              synthesizedBy: 'go-grpc-stub-impl',
+              via: cm.name,
+              registeredAt: `${cm.filePath}:${cm.startLine}`,
+            },
+          });
+          added++;
+        }
+      }
+    }
+  }
+  return edges;
+}
+
 /**
  * Phase 5: React JSX child rendering. A component that returns `<Child .../>`
  * mounts Child — React calls it — but JSX instantiation isn't a static call edge,
@@ -856,6 +966,7 @@ export function synthesizeCallbackEdges(queries: QueryBuilder, ctx: ResolutionCo
   const flutterEdges = flutterBuildEdges(queries, ctx);
   const cppEdges = cppOverrideEdges(queries);
   const ifaceEdges = interfaceOverrideEdges(queries);
+  const goGrpcEdges = goGrpcStubImplEdges(queries);
   const rnEventEdgesList = rnEventEdges(ctx);
   const fabricNativeEdges = fabricNativeImplEdges(ctx);
   const mybatisEdges = mybatisJavaXmlEdges(queries);
@@ -871,6 +982,7 @@ export function synthesizeCallbackEdges(queries: QueryBuilder, ctx: ResolutionCo
     ...flutterEdges,
     ...cppEdges,
     ...ifaceEdges,
+    ...goGrpcEdges,
     ...rnEventEdgesList,
     ...fabricNativeEdges,
     ...mybatisEdges,