
feat(mcp): method-atomic explore render + codegraph_node file/line overload selector

Closes the last residual reads on the README's hardest repos (Alamofire, Tokio)
— both now hit 0 read / 0 grep in agent A/B.

- Method-atomic explore: codegraph_explore never returns half a method. The four
  raw .slice() points that could cut THROUGH a body (per-file cap, whole-file
  total-trim, clustering total-trim, final hard-ceiling) are gone — when the
  budget is hit it drops whole methods / whole files (and lists what it dropped
  for a follow-up call), and the hard ceiling cuts at a #### file-section
  boundary. A truncated method was the one case that still forced a Read.

- codegraph_node file/line selector: a heavily-overloaded name (poll = 50+,
  validate = 21) can now be pinned with `file` and/or `line` — the file:line a
  trail already showed — to fetch that exact definition. Requires enumerating
  ALL overloads, so findSymbolMatches now uses a direct name index
  (CodeGraph.getNodesByName → QueryBuilder, non-FTS) for bare names instead of
  the FTS top-N, which dropped low-ranked overloads like Harness::poll below the
  cut. The multi-overload output hints the selector and caps the "Other
  definitions" list.
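
A minimal sketch of the method-atomic idea from the first bullet, assuming a simplified model where each method is one block of text. `MethodBlock`, `RenderResult`, and `renderAtomic` are hypothetical names for illustration and are not taken from `src/mcp/tools.ts`. A block that does not fit the budget is dropped whole and reported, never sliced:

```typescript
// Hypothetical shapes for illustration; not the types used in src/mcp/tools.ts.
interface MethodBlock {
  name: string;
  source: string; // complete, verbatim method text
}

interface RenderResult {
  text: string;
  dropped: string[]; // names omitted whole, to request in a follow-up call
}

// Emit methods in order until the budget is hit. A method that does not fit
// is dropped whole and recorded; it is never cut partway through its body.
// The budget here counts method source only, not the separators between them.
function renderAtomic(methods: MethodBlock[], maxChars: number): RenderResult {
  const kept: string[] = [];
  const dropped: string[] = [];
  let used = 0;
  for (const m of methods) {
    if (used + m.source.length > maxChars) {
      dropped.push(m.name);
      continue; // keep scanning: a later, smaller method may still fit
    }
    kept.push(m.source);
    used += m.source.length;
  }
  return { text: kept.join('\n\n'), dropped };
}
```

The `dropped` list plays the role of the "lists what it dropped" note above: the caller can ask for exactly those names in another call instead of reading the file.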

Validated (agent A/B, why-Read, n=2 each, fresh daemon): OkHttp 0/0 both runs
(node all-overloads recovered explore-trimmed `proceed`); Tokio 0/0 both
(selector: `poll file:harness.rs line:153`); Alamofire 0/0 both — first time
both clean (atomic render + `task file:DataRequest.swift line:119`). The agent
feeds codegraph's own trail `file:line` back into the selector. Full suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby McHenry 3 weeks ago
parent
commit
5bf6ad8
4 changed files with 113 additions and 24 deletions
  1. CHANGELOG.md (+2, -1)
  2. __tests__/symbol-lookup.test.ts (+13, -0)
  3. src/index.ts (+9, -0)
  4. src/mcp/tools.ts (+89, -23)

+ 2 - 1
CHANGELOG.md

@@ -19,7 +19,8 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ### Fixes
 
 - Search ranking no longer lets a common word in your request hijack the results: asking about, say, a "flat object" screen used to surface an unrelated constant that merely happened to be named the same, because the exact-name match outweighed everything else. Ranking now weighs how well each result is corroborated by the rest of your request, so the symbols you actually meant come first (this improves `codegraph_explore`'s results).
-- `codegraph_node` now returns *every* definition when a name is ambiguous — an overloaded method, or the same method name on different types — instead of returning one (sometimes the wrong one) with a note listing the rest. Asking for such a symbol now hands back all of the matching definitions with their source in a single call, so the agent stops having to read the file by hand to find the specific overload it wanted (common in Swift, Go, Java, and C#). Large overload sets show the most relevant ones in full and list the remainder by location.
+- `codegraph_node` now returns *every* definition when a name is ambiguous — an overloaded method, or the same method name on different types — instead of returning one (sometimes the wrong one) with a note listing the rest. Asking for such a symbol now hands back all of the matching definitions with their source in a single call, so the agent stops having to read the file by hand to find the specific overload it wanted (common in Swift, Go, Java, and C#). For a heavily-overloaded name (a `poll`/`validate` with dozens of definitions), pass `file` (and/or `line`) — e.g. the `file:line` shown in a trail — to get that exact definition's body. Large overload sets show the most relevant ones in full and list the remainder by location.
+- `codegraph_explore` never returns half a method anymore: when output runs up against its size budget it drops whole methods or whole files (and lists what it dropped, so you can ask for them in another call) instead of cutting off a method body partway. A truncated method was the one case that still sent the agent to read the file for the rest — so the source explore returns is now always complete and usable as-is.
 
 ## [0.9.8] - 2026-06-01
 

+ 13 - 0
__tests__/symbol-lookup.test.ts

@@ -154,6 +154,19 @@ describe.skipIf(!HAS_SQLITE)('matchesSymbol — module-qualified lookups (#173)'
     const matches = findSymbolMatches(cg, 'stage_apply::nonexistent_fn');
     expect(matches.length).toBe(0);
   });
+
+  it('codegraph_node with a `file` hint pins an overloaded name to that file', async () => {
+    // `run` is defined in BOTH stage_apply.rs and stage_detect.rs. A bare lookup
+    // returns both; the `file` hint narrows to the one the caller saw in a trail.
+    const res = await handler.execute('codegraph_node', {
+      symbol: 'run',
+      includeCode: true,
+      file: 'stage_detect.rs',
+    });
+    const text = res.content?.[0]?.text ?? '';
+    expect(text).toMatch(/stage_detect\.rs/);
+    expect(text).not.toMatch(/stage_apply\.rs/);
+  });
 });
 
 describe.skipIf(!HAS_SQLITE)('matchesSymbol — dotted lookups (regression for #173 fix)', () => {

+ 9 - 0
src/index.ts

@@ -681,6 +681,15 @@ export class CodeGraph {
     return this.queries.getNodesByKind(kind);
   }
 
+  /**
+   * Get ALL nodes with an exact name (direct index lookup, not FTS-ranked/capped).
+   * Used to enumerate every overload of a heavily-overloaded name so the specific
+   * definition the caller wants is never dropped below a search cut.
+   */
+  getNodesByName(name: string): Node[] {
+    return this.queries.getNodesByName(name);
+  }
+
   /**
    * Search nodes by text
    */
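
To illustrate why the doc comment above stresses "not FTS-ranked/capped", here is a rough sketch contrasting a capped, ranked lookup with a direct name index. Everything in it is an assumption made for illustration: `SymbolNode`, its `rank` field, `topN`, and `buildNameIndex` are hypothetical and do not describe how `QueryBuilder` stores or queries nodes.

```typescript
// Hypothetical node shape; the real Node type in the codebase has more fields.
interface SymbolNode {
  name: string;
  filePath: string;
  startLine: number;
  rank: number; // stand-in for a search relevance score (higher is better)
}

// Ranked search with a cap: definitions below the cut are silently lost.
function topN(nodes: SymbolNode[], name: string, limit: number): SymbolNode[] {
  return nodes
    .filter((n) => n.name === name)
    .sort((a, b) => b.rank - a.rank)
    .slice(0, limit);
}

// Direct name index: every exact-name definition, with no ranking and no cap.
function buildNameIndex(nodes: SymbolNode[]): Map<string, SymbolNode[]> {
  const index = new Map<string, SymbolNode[]>();
  for (const n of nodes) {
    const bucket = index.get(n.name);
    if (bucket) bucket.push(n);
    else index.set(n.name, [n]);
  }
  return index;
}
```

With a capped lookup, a low-ranked overload can fall out of the result set before any file/line selector sees it. The index returns the full set, so the selector always has the wanted definition available to pick.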

+ 89 - 23
src/mcp/tools.ts

@@ -457,7 +457,7 @@ export const tools: ToolDefinition[] = [
   },
   {
     name: 'codegraph_node',
-    description: 'SECONDARY (after codegraph_explore): get ONE symbol in full — its location, signature, callers/callees trail, and verbatim body (includeCode=true). When the name is AMBIGUOUS (an overloaded method, or the same method name on different types), it returns EVERY matching definition\'s full body in a single call — so you never need to Read a file to find the specific overload you want. Reach for this when explore trimmed a body you need. Use codegraph_explore for several related symbols or the full flow.',
+    description: 'SECONDARY (after codegraph_explore): get ONE symbol in full — its location, signature, callers/callees trail, and verbatim body (includeCode=true). When the name is AMBIGUOUS (an overloaded method, or the same method name on different types), it returns EVERY matching definition\'s full body in a single call — so you never need to Read a file to find the specific overload you want. For a heavily-overloaded name, pass `file` (and/or `line`) to pin the exact definition — e.g. the `file:line` a trail or another tool already showed you. Reach for this when explore trimmed a body you need. Use codegraph_explore for several related symbols or the full flow.',
     inputSchema: {
       type: 'object',
       properties: {
@@ -470,6 +470,14 @@ export const tools: ToolDefinition[] = [
           description: 'Include full source code (default: false to minimize context)',
           default: false,
         },
+        file: {
+          type: 'string',
+          description: 'Optional: disambiguate an overloaded name to the definition in this file (path or basename, e.g. "harness.rs").',
+        },
+        line: {
+          type: 'number',
+          description: 'Optional: disambiguate to the definition at/around this line (use with the file:line a trail showed you).',
+        },
         projectPath: projectPathProperty,
       },
       required: ['symbol'],
@@ -2171,11 +2179,12 @@ export class ToolHandler {
         const omitted = uniqSymbols.length - headerNames.length;
         const wholeHeader = `#### ${filePath} — ${omitted > 0 ? `${headerNames.join(', ')}, +${omitted} more` : headerNames.join(', ')}`;
 
-        if (totalChars + wholeSection.length + 200 > budget.maxOutputChars) {
-          const remaining = budget.maxOutputChars - totalChars - 200;
-          if (remaining < 500) break;
-          wholeSection = wholeSection.slice(0, remaining) + '\n... (trimmed) ...';
+        if (!fileNecessary && totalChars + wholeSection.length + 200 > budget.maxOutputChars) {
+          // Don't slice a whole file mid-method: an incidental file that doesn't
+          // fit is skipped; a necessary one (below) renders in full. Half a file
+          // forces the Read this is meant to prevent.
           anyFileTrimmed = true;
+          continue;
         }
         lines.push(wholeHeader, '', '```' + lang, wholeSection, '```', '');
         totalChars += wholeSection.length + 200;
@@ -2350,7 +2359,6 @@ export class ToolHandler {
       // Emit chosen clusters in source order so the file reads top-to-bottom.
       let fileSection = '';
       const allSymbols: string[] = [];
-      let fileTrimmed = false;
       for (let i = 0; i < clusters.length; i++) {
         if (!chosenIndices.has(i)) continue;
         const cluster = clusters[i]!;
@@ -2360,13 +2368,12 @@ export class ToolHandler {
         allSymbols.push(...cluster.symbols);
       }
 
-      // If a single chosen cluster is still oversize (long monolithic
-      // function), tail-trim it. Better one trimmed view than nothing.
-      if (fileSection.length > budget.maxCharsPerFile) {
-        fileSection = fileSection.slice(0, budget.maxCharsPerFile) + '\n... (trimmed) ...';
-        fileTrimmed = true;
-      }
-      if (chosenIndices.size < clusters.length || fileTrimmed) {
+      // A chosen cluster is a COMPLETE method-range — we never cut through a body.
+      // An oversize single cluster (a long monolithic function) renders in FULL:
+      // half a method is useless (the agent just Reads the rest for the other half),
+      // which is the very fallback explore exists to prevent. A pathological file is
+      // bounded by the per-file cluster SELECTION above + the total hard ceiling.
+      if (chosenIndices.size < clusters.length) {
         anyFileTrimmed = true;
       }
 
@@ -2400,10 +2407,11 @@ export class ToolHandler {
       // (DataRequest/Validation) all render, instead of the cap dropping whichever
       // phase the file order happened to put last.
       if (!fileNecessary && totalChars + fileSection.length + 200 > budget.maxOutputChars) {
-        const remaining = budget.maxOutputChars - totalChars - 200;
-        if (remaining < 500) continue; // incidental file, no room — skip it, keep scanning for necessary ones
-        fileSection = fileSection.slice(0, remaining) + '\n... (trimmed) ...';
+        // Incidental file that doesn't fit: SKIP it whole — never slice mid-method.
+        // Keep scanning for necessary files (which bypass this cap and render in
+        // full, bounded by the hard ceiling).
         anyFileTrimmed = true;
+        continue;
       }
 
       lines.push(fileHeader);
@@ -2481,9 +2489,15 @@ export class ToolHandler {
     const output = flow.text + lines.join('\n');
     const hardCeiling = Math.round(budget.maxOutputChars * 1.5);
     if (output.length > hardCeiling) {
+      // Cut at a FILE-SECTION boundary (the last `#### ` header before the
+      // ceiling) so we drop whole trailing file-sections rather than slicing
+      // through a method body — a half-rendered method just forces the Read this
+      // tool exists to prevent. Fall back to a line boundary only if no section
+      // header sits in the back half (degenerate single-giant-section case).
       const cut = output.slice(0, hardCeiling);
-      const lastNewline = cut.lastIndexOf('\n');
-      const safe = lastNewline > hardCeiling * 0.8 ? cut.slice(0, lastNewline) : cut;
+      const lastSection = cut.lastIndexOf('\n#### ');
+      const boundary = lastSection > hardCeiling * 0.5 ? lastSection : cut.lastIndexOf('\n');
+      const safe = boundary > 0 ? cut.slice(0, boundary) : cut;
       return this.textResult(safe + '\n\n... (output truncated to budget; the source above is complete and verbatim — treat it as already Read. For any area not covered, run another codegraph_explore with the specific names — do NOT Read these files.)');
     }
     return this.textResult(output);
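
The boundary logic in this hunk can be read in isolation as a small pure function. The sketch below lifts the three new lines into a standalone helper; the name `cutAtSectionBoundary` and the early return for short outputs are additions for illustration, not part of the commit.

```typescript
// Standalone restatement of the hard-ceiling cut: prefer the last "#### "
// file-section header in the back half of the window, else the last newline,
// else the raw slice.
function cutAtSectionBoundary(output: string, hardCeiling: number): string {
  if (output.length <= hardCeiling) return output;
  const cut = output.slice(0, hardCeiling);
  const lastSection = cut.lastIndexOf('\n#### ');
  const boundary = lastSection > hardCeiling * 0.5 ? lastSection : cut.lastIndexOf('\n');
  return boundary > 0 ? cut.slice(0, boundary) : cut;
}
```

For an output of two file sections where the second header starts past the halfway point of the ceiling, the result ends just before that header, so the first section survives whole and the second is dropped whole. If no header sits in the back half, the cut falls back to the last newline, which is the degenerate case the comment above describes.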
@@ -2499,12 +2513,37 @@ export class ToolHandler {
     const cg = this.getCodeGraph(args.projectPath as string | undefined);
     // Default to false to minimize context usage
     const includeCode = args.includeCode === true;
+    const fileHint = typeof args.file === 'string' && args.file.trim() ? args.file.trim() : undefined;
+    const lineHint = typeof args.line === 'number' && args.line > 0 ? args.line : undefined;
 
-    const matches = this.findSymbolMatches(cg, symbol);
+    let matches = this.findSymbolMatches(cg, symbol);
     if (matches.length === 0) {
       return this.textResult(`Symbol "${symbol}" not found in the codebase`);
     }
 
+    // Disambiguate a heavily-overloaded name to a specific definition the caller
+    // pinned by file/line (the `file:line` a trail or another tool showed it) —
+    // so it can fetch e.g. `Harness::poll` at harness.rs:153 out of 50+ `poll`s
+    // instead of Reading. file matches by path suffix/substring; line prefers the
+    // def whose body contains it, else the nearest start. Only narrows (never
+    // empties — if a hint matches nothing it's ignored).
+    if (matches.length > 1 && (fileHint || lineHint !== undefined)) {
+      const norm = (p: string) => p.replace(/\\/g, '/').toLowerCase();
+      let narrowed = matches;
+      if (fileHint) {
+        const fh = norm(fileHint);
+        const byFile = narrowed.filter((n) => norm(n.filePath).endsWith(fh) || norm(n.filePath).includes(fh));
+        if (byFile.length > 0) narrowed = byFile;
+      }
+      if (lineHint !== undefined && narrowed.length > 1) {
+        const containing = narrowed.filter((n) => n.startLine <= lineHint && (n.endLine ?? n.startLine) >= lineHint);
+        narrowed = containing.length > 0
+          ? containing
+          : [...narrowed].sort((a, b) => Math.abs(a.startLine - lineHint) - Math.abs(b.startLine - lineHint)).slice(0, 1);
+      }
+      if (narrowed.length > 0) matches = narrowed;
+    }
+
     // Single definition — the common case.
     if (matches.length === 1) {
       return this.textResult(this.truncateOutput(await this.renderNodeSection(cg, matches[0]!, includeCode)));
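
The narrowing rules in this hunk (a file hint filters by path, a line hint prefers the definition whose range contains it and otherwise the nearest start, and a hint that matches nothing is ignored) can be sketched as a standalone function. `Def` and `narrowByHint` are hypothetical names for illustration. Since a suffix match is a special case of a substring match, this sketch uses `includes` alone.

```typescript
// Hypothetical minimal shape; the real Node type carries more fields.
interface Def {
  filePath: string;
  startLine: number;
  endLine?: number;
}

// Narrow a set of same-named definitions using optional file/line hints.
// Only ever narrows: a file hint that matches nothing leaves the set as-is.
function narrowByHint<T extends Def>(matches: T[], fileHint?: string, lineHint?: number): T[] {
  if (matches.length <= 1 || (!fileHint && lineHint === undefined)) return matches;
  const norm = (p: string) => p.replace(/\\/g, '/').toLowerCase();
  let narrowed = matches;
  if (fileHint) {
    const fh = norm(fileHint);
    const byFile = narrowed.filter((n) => norm(n.filePath).includes(fh));
    if (byFile.length > 0) narrowed = byFile;
  }
  if (lineHint !== undefined && narrowed.length > 1) {
    const line = lineHint; // local copy so the narrowed type holds inside callbacks
    const containing = narrowed.filter((n) => n.startLine <= line && (n.endLine ?? n.startLine) >= line);
    narrowed = containing.length > 0
      ? containing
      : [...narrowed].sort((a, b) => Math.abs(a.startLine - line) - Math.abs(b.startLine - line)).slice(0, 1);
  }
  return narrowed;
}
```

Given two definitions in the same file, passing the file name alone keeps both, and adding the line picks the one whose range contains it. That matches the intended flow where the agent feeds a trail's `file:line` straight back into the call.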
@@ -2554,7 +2593,18 @@ export class ToolHandler {
       rendered.join('\n\n---\n\n'),
     ];
     if (listed.length) {
-      out.push('', '### Other definitions', ...listed.map((n) => `- \`${n.name}\` (${n.kind}) — ${n.filePath}:${n.startLine}`));
+      const LIST_CAP = 20;
+      const shownList = listed.slice(0, LIST_CAP);
+      out.push(
+        '',
+        '### Other definitions',
+        ...shownList.map((n) => `- \`${n.name}\` (${n.kind}) — ${n.filePath}:${n.startLine}`),
+      );
+      if (listed.length > LIST_CAP) out.push(`- … +${listed.length - LIST_CAP} more`);
+      out.push(
+        '',
+        `> Need one of these in full? Call codegraph_node again with \`file\` (e.g. \`"${listed[0]!.filePath.split('/').pop()}"\`) or \`line\` — do NOT Read it.`,
+      );
     }
     return this.textResult(this.truncateOutput(out.join('\n')));
   }
@@ -2978,10 +3028,26 @@ export class ToolHandler {
    * bare name with no exact match falls back to the single top fuzzy result.
    */
   private findSymbolMatches(cg: CodeGraph, symbol: string): Node[] {
-    // Higher limit for qualified lookups (e.g., "Session.request") — the target
-    // may rank lower in FTS amid many partial matches across qualifier parts.
     const isQualified = /[.\/]|::/.test(symbol);
-    const limit = isQualified ? 50 : 10;
+
+    // For a bare name, enumerate EVERY exact-name definition via the direct index
+    // (not FTS, which caps + ranks): tokio's `poll` has 50+ defs and the one the
+    // caller wants (`Harness::poll` at harness.rs:153) ranks below any search cut,
+    // so it could be neither rendered nor pinned by the file/line disambiguator —
+    // and the agent Read it. With the full set, the multi-overload render + the
+    // file/line filter can both reach it.
+    if (!isQualified) {
+      const exact = cg.getNodesByName(symbol);
+      if (exact.length > 0) {
+        return [...exact].sort((a, b) => (isGeneratedFile(a.filePath) ? 1 : 0) - (isGeneratedFile(b.filePath) ? 1 : 0));
+      }
+      // No exact match — use the single top fuzzy result (e.g. a file basename).
+      const fuzzy = cg.searchNodes(symbol, { limit: 10 });
+      return fuzzy[0] ? [fuzzy[0].node] : [];
+    }
+
+    // Qualified lookup (`Session.request`, `stage_apply::run`): FTS + matchesSymbol.
+    const limit = 50;
     let results = cg.searchNodes(symbol, { limit });
 
     // FTS strips colons, so `stage_apply::run` searches the literal