
feat(mcp): adaptive codegraph_explore sizing — skeletonize redundant polymorphic siblings (#564)

codegraph_explore now skeletonizes off-spine, redundant members of a polymorphic
family (OkHttp's interceptor chain, Django's SQLCompiler family) to signatures
instead of shipping every full body, while keeping the dispatch mechanism, the
orchestrator/base, and any method the agent named in full. Sizes the response to
the answer rather than the budget cap, so interface-heavy flows stop costing more
than plain grep/read. Default on; CODEGRAPH_ADAPTIVE_EXPLORE=0 disables.

Gate: off-spine + >=3-impl sibling + not-spared, where spared = the agent named a
callable in the file UNLESS the file defines the family's supertype (a huge
base+subclasses file is Read-anyway, so skeletonizing frees explore budget).
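
The gate above can be read as a small predicate. The following is a minimal sketch under stated assumptions, not the code in `src/mcp/tools.ts`: the `FileFacts` shape and the function names are hypothetical, and the environment is passed in as a plain object so the snippet stays self-contained. Only `MIN_SIBLINGS` (3) and `CODEGRAPH_ADAPTIVE_EXPLORE` come from this commit.

```typescript
// Hypothetical per-file facts; the real implementation may model this differently.
interface FileFacts {
  onSpine: boolean;                // file holds a symbol on the synthesized flow spine
  siblingFamilySize: number;       // implementers of the supertype its classes implement/extend
  hasNamedCallable: boolean;       // the agent's query named a function/method defined here
  definesFamilySupertype: boolean; // file itself defines a supertype with >= MIN_SIBLINGS impls
}

const MIN_SIBLINGS = 3;

// Default on; only the literal value '0' disables the feature.
function adaptiveEnabled(env: Record<string, string | undefined>): boolean {
  return env.CODEGRAPH_ADAPTIVE_EXPLORE !== '0';
}

// Spared = the agent named a callable in the file, UNLESS the file defines the family's supertype.
function isSpared(f: FileFacts): boolean {
  return f.hasNamedCallable && !f.definesFamilySupertype;
}

// Gate: off-spine + >=3-impl sibling + not-spared.
function shouldSkeletonize(f: FileFacts, env: Record<string, string | undefined>): boolean {
  if (!adaptiveEnabled(env)) return false;
  const isSibling = f.siblingFamilySize >= MIN_SIBLINGS;
  return !f.onSpine && isSibling && !isSpared(f);
}

// Illustrative inputs modelled on the cases named in this commit.
const offSpineSibling: FileFacts = { onSpine: false, siblingFamilySize: 5, hasNamedCallable: false, definesFamilySupertype: false };
const namedOrchestrator: FileFacts = { onSpine: false, siblingFamilySize: 9, hasNamedCallable: true, definesFamilySupertype: false };
const namedFamilyFile: FileFacts = { onSpine: false, siblingFamilySize: 3, hasNamedCallable: true, definesFamilySupertype: true };
const distinctStep: FileFacts = { onSpine: false, siblingFamilySize: 1, hasNamedCallable: false, definesFamilySupertype: false };

console.log(shouldSkeletonize(offSpineSibling, {}));   // skeleton
console.log(shouldSkeletonize(namedOrchestrator, {})); // stays full (spared)
console.log(shouldSkeletonize(namedFamilyFile, {}));   // skeleton (family override)
console.log(shouldSkeletonize(distinctStep, {}));      // stays full (not a sibling)
```

Writing the spare as "named AND NOT family file" keeps the supertype signal as an override of the spare rather than a spare of its own, which is the ordering the design doc's refinement note describes.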

Validated headless A/B (Opus 4.8): both former README cost outliers flipped —
OkHttp and Django went from costlier-than-native to cheaper; full 7-repo average
22% cheaper / 47% fewer tokens / 20% faster / 50% fewer tool calls, every repo
cost-positive, inert repos unchanged. 7-case regression test.
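
As a sanity check on the averages quoted above, the sketch below assumes they are a plain arithmetic mean of the seven per-repo cells in the README table changed by this commit, rounded to whole percent. The source does not state that method; it is simply consistent with the published numbers. `savingsPct` and `meanPct` are hypothetical helper names, and the savings formula is likewise an assumption.

```typescript
// Percentage saved by the WITH arm relative to the WITHOUT arm, rounded to a whole percent.
function savingsPct(withCg: number, withoutCg: number): number {
  return Math.round(((withoutCg - withCg) / withoutCg) * 100);
}

// Arithmetic mean, rounded to a whole percent.
function meanPct(values: number[]): number {
  const sum = values.reduce((acc, v) => acc + v, 0);
  return Math.round(sum / values.length);
}

// Per-repo cells from the README summary table, in order:
// VS Code, Excalidraw, Django, Tokio, OkHttp, Gin, Alamofire.
const costSavings = [13, 40, 9, 31, 4, 28, 32];
const tokenSavings = [63, 71, 35, 59, 16, 40, 43];
const timeSavings = [11, 51, 7, 29, 11, 25, 6];
const toolCallSavings = [82, 82, 38, 61, 40, 35, 13];

console.log(meanPct(costSavings));     // 22
console.log(meanPct(tokenSavings));    // 47
console.log(meanPct(timeSavings));     // 20
console.log(meanPct(toolCallSavings)); // 50

// One cell recomputed from the displayed medians: VS Code cost, $0.66 WITH vs $0.76 WITHOUT.
console.log(savingsPct(0.66, 0.76));   // 13
```

Not every cell reproduces from the rounded medians shown in the per-repo breakdown (Excalidraw's tool calls, 4 vs 20, compute to 80% rather than the listed 82%), so treat the per-cell formula as approximate.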

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry, 3 weeks ago
commit f1b14f021b
5 files changed, 819 additions, 59 deletions
  1. CHANGELOG.md (+1, -0)
  2. README.md (+52, -52)
  3. __tests__/adaptive-explore-sizing.test.ts (+373, -0)
  4. docs/design/adaptive-explore-sizing.md (+258, -0)
  5. src/mcp/tools.ts (+135, -7)

+ 1 - 0
CHANGELOG.md

@@ -13,6 +13,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 
 - `codegraph init` now builds the initial index by default — you no longer need the `-i`/`--index` flag (it's still accepted, so existing commands and scripts keep working). (#483)
 - Go: Gin middleware chains now connect end-to-end in `codegraph_trace` and `codegraph_explore` — following a request reaches the middleware and route handlers registered via `.Use()` / `.GET()` instead of dead-ending where the framework dispatches the chain dynamically.
+- `codegraph_explore` is now leaner on interface-heavy flows: when a query spans many interchangeable implementations of one interface (an HTTP interceptor chain, say), it shows the rest as signatures instead of every full body, while keeping the dispatch mechanism and any specific method you asked about in full. Fewer tokens for the same answer, so questions like these stop costing more than plain grep/read — in testing, the two slowest-to-pay-off repos (a Java and a Python framework) went from slightly costlier than native search to clearly cheaper. Distinct, non-interchangeable code is shown in full as before. Disable with `CODEGRAPH_ADAPTIVE_EXPLORE=0`.
 
 
 ### Fixes
 
 

+ 52 - 52
README.md

@@ -4,7 +4,7 @@
 
 
 ### Supercharge Claude Code, Cursor, Codex, OpenCode, Hermes Agent, Gemini, Antigravity, and Kiro with Semantic Code Intelligence
 
 
-**~18% cheaper · ~57% fewer tool calls · 100% local**
+**~22% cheaper · ~50% fewer tool calls · 100% local**
 
 
 ### [Documentation & Website →](https://colbymchenry.github.io/codegraph/)
 
 
@@ -83,21 +83,21 @@ When Claude Code explores a codebase, it spawns **Explore agents** that scan fil
 
 
 ### Benchmark Results
 
 
-Tested across **7 real-world open-source codebases** spanning 7 languages, comparing an agent (Claude Code, headless) answering one architecture question **with** and **without** CodeGraph. Each cell is the savings at the **median of 4 runs per arm**. _Re-validated on **v0.9.7** + Opus 4.8 (2026-05-28)._
+Tested across **7 real-world open-source codebases** spanning 7 languages, comparing an agent (Claude Code, headless) answering one architecture question **with** and **without** CodeGraph. Each cell is the savings at the **median of 4 runs per arm**. _Re-validated on Opus 4.8 (2026-05-29), on the build with adaptive `codegraph_explore` sizing._
 
 
-> **Average: 18% cheaper · 51% fewer tokens · 16% faster · 57% fewer tool calls**
+> **Average: 22% cheaper · 47% fewer tokens · 20% faster · 50% fewer tool calls**
 
 
 | Codebase | Language | Cost | Tokens | Time | Tool calls |
 |----------|----------|------|--------|------|------------|
-| **VS Code** | TypeScript · ~10k files | 26% cheaper | 63% fewer | 20% faster | 69% fewer |
-| **Excalidraw** | TypeScript · ~640 | 40% cheaper | 71% fewer | 41% faster | 82% fewer |
-| **Django** | Python · ~3k | 10% costlier | 45% fewer | 3% slower | 64% fewer |
-| **Tokio** | Rust · ~790 | 30% cheaper | 69% fewer | 22% faster | 71% fewer |
-| **OkHttp** | Java · ~645 | 3% costlier | 32% fewer | 15% faster | 60% fewer |
-| **Gin** | Go · ~110 | 7% cheaper | 35% fewer | 8% faster | 38% fewer |
-| **Alamofire** | Swift · ~110 | 38% cheaper | 45% fewer | 6% faster | 8% fewer |
+| **VS Code** | TypeScript · ~10k files | 13% cheaper | 63% fewer | 11% faster | 82% fewer |
+| **Excalidraw** | TypeScript · ~640 | 40% cheaper | 71% fewer | 51% faster | 82% fewer |
+| **Django** | Python · ~3k | 9% cheaper | 35% fewer | 7% faster | 38% fewer |
+| **Tokio** | Rust · ~790 | 31% cheaper | 59% fewer | 29% faster | 61% fewer |
+| **OkHttp** | Java · ~645 | 4% cheaper | 16% fewer | 11% faster | 40% fewer |
+| **Gin** | Go · ~110 | 28% cheaper | 40% fewer | 25% faster | 35% fewer |
+| **Alamofire** | Swift · ~110 | 32% cheaper | 43% fewer | 6% faster | 13% fewer |
 
 
-CodeGraph cuts **tool calls and total tokens on every repo** and answers large repos with **zero file reads**, while the no-CodeGraph agent spends its budget on grep/find/Read discovery. The **cost** margin is narrower — and occasionally negative on smaller repos (Django, OkHttp) — because a modern model's native search is already cheap and CodeGraph's richer responses cost real input tokens; the consistent wins are fewer tool calls and faster answers.
+CodeGraph cuts **tool calls and total tokens on every repo** and answers large repos with **zero file reads**, while the no-CodeGraph agent spends its budget on grep/find/Read discovery. **Every repo is now cheaper, not just faster** — the two former cost outliers (Django and OkHttp, where the answer spans many interchangeable implementations of one interface) flipped from *costlier* than native search to cheaper once adaptive `codegraph_explore` sizing stopped shipping every sibling's full body. The margin is still narrowest on the smallest repos, where a modern model's native search is already cheap, but it stays positive across the board; the largest wins remain fewer tool calls and faster answers.
 
 
 <details>
 <details>
 <summary><strong>Per-repo breakdown — WITH vs WITHOUT (median of 4)</strong></summary>
 <summary><strong>Per-repo breakdown — WITH vs WITHOUT (median of 4)</strong></summary>
@@ -105,79 +105,79 @@ CodeGraph cuts **tool calls and total tokens on every repo** and answers large r
 **VS Code** · ~10k files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 49s | 2m 16s | 20% faster |
-| File Reads | 0 | 7 | −7 |
+| Time | 1m 58s | 2m 13s | 11% faster |
+| File Reads | 0 | 8 | −8 |
 | Grep/Bash | 0 | 9 | −9 |
-| Tool calls | 5 | 16 | 71% fewer |
-| Total tokens | 672k | 1.81M | 63% fewer |
-| Cost | $0.66 | $0.89 | 26% cheaper |
+| Tool calls | 3 | 17 | 82% fewer |
+| Total tokens | 607k | 1.65M | 63% fewer |
+| Cost | $0.66 | $0.76 | 13% cheaper |
 
 
 **Excalidraw** · ~640 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 41s | 2m 51s | 41% faster |
+| Time | 1m 23s | 2m 48s | 51% faster |
 | File Reads | 0 | 11 | −11 |
-| Grep/Bash | 0 | 11 | −11 |
-| Tool calls | 4 | 22 | 82% fewer |
-| Total tokens | 692k | 2.39M | 71% fewer |
-| Cost | $0.63 | $1.04 | 40% cheaper |
+| Grep/Bash | 0 | 9 | −9 |
+| Tool calls | 4 | 20 | 82% fewer |
+| Total tokens | 596k | 2.06M | 71% fewer |
+| Cost | $0.53 | $0.89 | 40% cheaper |
 
 
 **Django** · ~3k files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 2m 2s | 1m 58s | 4% slower |
-| File Reads | 2 | 10 | −8 |
-| Grep/Bash | 0 | 5 | −5 |
-| Tool calls | 5 | 14 | 64% fewer |
-| Total tokens | 720k | 1.30M | 45% fewer |
-| Cost | $0.70 | $0.64 | 10% costlier |
+| Time | 1m 43s | 1m 51s | 7% faster |
+| File Reads | 5 | 10 | −5 |
+| Grep/Bash | 0 | 4 | −4 |
+| Tool calls | 8 | 13 | 38% fewer |
+| Total tokens | 752k | 1.16M | 35% fewer |
+| Cost | $0.56 | $0.62 | 9% cheaper |
 
 
 **Tokio** · ~790 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 54s | 2m 26s | 22% faster |
-| File Reads | 0 | 11 | −11 |
-| Grep/Bash | 0 | 6 | −6 |
-| Tool calls | 5 | 17 | 73% fewer |
-| Total tokens | 657k | 2.10M | 69% fewer |
-| Cost | $0.61 | $0.86 | 30% cheaper |
+| Time | 2m 3s | 2m 53s | 29% faster |
+| File Reads | 3 | 9 | −6 |
+| Grep/Bash | 0 | 7 | −7 |
+| Tool calls | 7 | 17 | 61% fewer |
+| Total tokens | 869k | 2.14M | 59% fewer |
+| Cost | $0.63 | $0.92 | 31% cheaper |
 
 
 **OkHttp** · ~645 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 18s | 1m 32s | 15% faster |
-| File Reads | 1 | 5 | −4 |
-| Grep/Bash | 0 | 6 | −6 |
-| Tool calls | 4 | 10 | 58% fewer |
-| Total tokens | 713k | 1.05M | 32% fewer |
-| Cost | $0.59 | $0.57 | 3% costlier |
+| Time | 1m 18s | 1m 27s | 11% faster |
+| File Reads | 2 | 4 | −2 |
+| Grep/Bash | 0 | 4 | −4 |
+| Tool calls | 5 | 8 | 40% fewer |
+| Total tokens | 739k | 883k | 16% fewer |
+| Cost | $0.54 | $0.56 | 4% cheaper |
 
 
 **Gin** · ~110 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 1m 12s | 1m 18s | 9% faster |
-| File Reads | 0 | 4 | −4 |
-| Grep/Bash | 0 | 4 | −4 |
-| Tool calls | 5 | 8 | 40% fewer |
-| Total tokens | 533k | 815k | 35% fewer |
-| Cost | $0.44 | $0.47 | 7% cheaper |
+| Time | 1m 8s | 1m 30s | 25% faster |
+| File Reads | 0 | 3 | −3 |
+| Grep/Bash | 0 | 5 | −5 |
+| Tool calls | 6 | 9 | 35% fewer |
+| Total tokens | 532k | 887k | 40% fewer |
+| Cost | $0.36 | $0.50 | 28% cheaper |
 
 
 **Alamofire** · ~110 files
 | Metric | WITH cg | WITHOUT cg | Δ |
 |---|---|---|---|
-| Time | 2m 0s | 2m 7s | 6% faster |
-| File Reads | 6 | 8 | −2 |
-| Grep/Bash | 2 | 4 | −2 |
-| Tool calls | 11 | 12 | 9% fewer |
-| Total tokens | 1.09M | 1.98M | 45% fewer |
-| Cost | $0.63 | $1.01 | 38% cheaper |
+| Time | 2m 19s | 2m 28s | 6% faster |
+| File Reads | 5 | 9 | −4 |
+| Grep/Bash | 1 | 4 | −3 |
+| Tool calls | 11 | 12 | 13% fewer |
+| Total tokens | 1.22M | 2.14M | 43% fewer |
+| Cost | $0.71 | $1.04 | 32% cheaper |
 
 
 </details>
 </details>
 
 
 <details>
 <details>
 <summary><strong>Full benchmark details</strong></summary>
 <summary><strong>Full benchmark details</strong></summary>
 
 
-**Methodology.** Each arm is `claude -p` (Claude Opus 4.8) run headlessly against the repo with `--strict-mcp-config`: **WITH** = CodeGraph's MCP server enabled, **WITHOUT** = an empty MCP config. Built-in Read/Grep/Bash stay available to both. Same question per repo, **4 runs per arm, median reported**. Cost = the run's `total_cost_usd`; Tokens = total tokens processed (input incl. cached + output); Time = wall-clock; Tool calls = every tool invocation, including those inside any sub-agents the model spawns. Repos cloned at `--depth 1` and indexed by the same CodeGraph build that served them. Re-validated on codegraph **v0.9.7** (2026-05-28). These numbers are lower than the prior Opus 4.7 validation — not a CodeGraph regression but a stronger native baseline: Opus 4.8 greps/reads efficiently on the main thread instead of fanning out into large Explore-subagent sweeps, so the no-CodeGraph arm is leaner than it used to be. Per-repo numbers move run-to-run with how hard the without-arm thrashes (the median-of-4 smooths it, but tails remain — e.g. Django's without-arm hit $2.71/14m one batch).
+**Methodology.** Each arm is `claude -p` (Claude Opus 4.8) run headlessly against the repo with `--strict-mcp-config`: **WITH** = CodeGraph's MCP server enabled, **WITHOUT** = an empty MCP config. Built-in Read/Grep/Bash stay available to both. Same question per repo, **4 runs per arm, median reported**. Cost = the run's `total_cost_usd`; Tokens = total tokens processed (input incl. cached + output); Time = wall-clock; Tool calls = every tool invocation, including those inside any sub-agents the model spawns. Repos cloned at `--depth 1` and indexed by the same CodeGraph build that served them. Re-validated 2026-05-29 on the build with adaptive `codegraph_explore` sizing. These numbers are lower than the prior Opus 4.7 validation — not a CodeGraph regression but a stronger native baseline: Opus 4.8 greps/reads efficiently on the main thread instead of fanning out into large Explore-subagent sweeps, so the no-CodeGraph arm is leaner than it used to be. Per-repo numbers move run-to-run with how hard the without-arm thrashes (the median-of-4 smooths it, but tails remain — e.g. Django's without-arm hit $2.71/14m one batch).
 
 
 **Queries:**
 | Codebase | Query |

+ 373 - 0
__tests__/adaptive-explore-sizing.test.ts

@@ -0,0 +1,373 @@
+/**
+ * Regression test for adaptive `codegraph_explore` sizing — sibling
+ * skeletonization (branch `feat/adaptive-explore-sizing`, commit d6d059f).
+ *
+ * Feature: when a file is BOTH (1) off the synthesized flow spine AND (2) a
+ * polymorphic sibling — its class implements/extends a supertype shared by
+ * >= MIN_SIBLINGS (3) implementers — `codegraph_explore` renders it as a
+ * class + member *signature* skeleton (bodies elided) instead of full source,
+ * keeping the on-spine exemplar and the mechanism full. This sizes the
+ * response to the answer rather than the budget cap on sibling-heavy flows
+ * (OkHttp's interceptor chain) without starving diffuse ones (distinct
+ * pipeline steps stay full). Default ON; CODEGRAPH_ADAPTIVE_EXPLORE=0 disables.
+ *
+ * The fixture is OkHttp's interceptor chain in miniature:
+ *   - `Interceptor` interface with FIVE implementers, counting AuthInterceptor (>= 3 => a sibling family)
+ *   - a 3-hop call spine `dispatch -> proceed -> handleLogging` that passes
+ *     THROUGH LoggingInterceptor — so that file is the on-spine exemplar
+ *   - Bridge/Cache/RetryInterceptor: off-spine members of the sibling family
+ *     => skeletonize
+ *   - ResponseFormatter implements `Formatter`, which has only ONE impl (< 3)
+ *     => a distinct step: off-spine but NOT a sibling => stays full
+ *
+ * Guards the two ways the feature can silently regress: skeletonizing too much
+ * (a distinct step or the on-spine exemplar) or too little (the off-spine
+ * siblings), plus the escape hatch.
+ */
+import { describe, it, expect, beforeAll, afterAll, beforeEach } from 'vitest';
+import * as fs from 'fs';
+import * as path from 'path';
+import * as os from 'os';
+import { ToolHandler } from '../src/mcp/tools';
+import CodeGraph from '../src/index';
+
+const SKELETON_MARK = '· skeleton (signatures only; Read for a full body)';
+
+/** Return the `#### <path> ...` section for a file basename, header through the
+ *  line before the next `###`/`####` header (or end of output). */
+function sectionFor(text: string, basename: string): string {
+  const lines = text.split('\n');
+  const start = lines.findIndex((l) => l.startsWith('#### ') && l.includes(basename));
+  if (start < 0) return '';
+  let end = lines.length;
+  for (let i = start + 1; i < lines.length; i++) {
+    if (lines[i].startsWith('### ') || lines[i].startsWith('#### ')) {
+      end = i;
+      break;
+    }
+  }
+  return lines.slice(start, end).join('\n');
+}
+
+describe('adaptive codegraph_explore sizing — sibling skeletonization', () => {
+  let testDir: string;
+  let cg: CodeGraph;
+  let handler: ToolHandler;
+
+  // Names the spine (dispatch/proceed/handleLogging), the on-spine exemplar,
+  // the three off-spine siblings, and the distinct step — so every file we
+  // assert on is gathered as relevant. maxFiles overrides the very-tiny tier's
+  // 4-file default so all of them land in one call.
+  const QUERY =
+    'dispatch proceed handleLogging LoggingInterceptor BridgeInterceptor CacheInterceptor RetryInterceptor ResponseFormatter';
+
+  beforeAll(async () => {
+    testDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-adaptive-explore-'));
+    const srcDir = path.join(testDir, 'src');
+    fs.mkdirSync(srcDir);
+
+    const write = (name: string, body: string) =>
+      fs.writeFileSync(path.join(srcDir, name), body.trimStart());
+
+    // The interchangeable contract: 5 implementers below => sibling family.
+    write(
+      'interceptor.ts',
+      `
+export interface Interceptor {
+  intercept(request: string): string;
+}
+`
+    );
+
+    // The mechanism + the spine: dispatch -> proceed -> (LoggingInterceptor) handleLogging.
+    // Unique method names so the call edges resolve unambiguously.
+    write(
+      'dispatcher.ts',
+      `
+import { LoggingInterceptor } from './logging-interceptor';
+
+export class RequestDispatcher {
+  dispatch(): string {
+    const chain = new InterceptorChain();
+    return chain.proceed();
+  }
+}
+
+export class InterceptorChain {
+  proceed(): string {
+    const exemplar = new LoggingInterceptor();
+    return exemplar.handleLogging();
+  }
+}
+`
+    );
+
+    // On-spine exemplar: handleLogging is the spine's tail, so this whole file
+    // is on-spine and must stay FULL even though it's a sibling (implements Interceptor).
+    write(
+      'logging-interceptor.ts',
+      `
+import { Interceptor } from './interceptor';
+
+export class LoggingInterceptor implements Interceptor {
+  handleLogging(): string {
+    const tag = 'LOGGING_BODY_MARKER';
+    return this.intercept(tag);
+  }
+  intercept(request: string): string {
+    return 'logged:' + request;
+  }
+}
+`
+    );
+
+    // Off-spine siblings — interchangeable impls of Interceptor => SKELETONIZE.
+    // Each body carries a unique marker that must NOT survive skeletonization.
+    write(
+      'bridge-interceptor.ts',
+      `
+import { Interceptor } from './interceptor';
+
+export class BridgeInterceptor implements Interceptor {
+  intercept(request: string): string {
+    const detail = 'BRIDGE_BODY_MARKER';
+    return 'bridged:' + request + detail;
+  }
+}
+`
+    );
+    write(
+      'cache-interceptor.ts',
+      `
+import { Interceptor } from './interceptor';
+
+export class CacheInterceptor implements Interceptor {
+  intercept(request: string): string {
+    const detail = 'CACHE_BODY_MARKER';
+    return 'cached:' + request + detail;
+  }
+}
+`
+    );
+    write(
+      'retry-interceptor.ts',
+      `
+import { Interceptor } from './interceptor';
+
+export class RetryInterceptor implements Interceptor {
+  intercept(request: string): string {
+    const detail = 'RETRY_BODY_MARKER';
+    return 'retried:' + request + detail;
+  }
+}
+`
+    );
+
+    // A 1:1 interface->impl pair: off-spine, implements something, but the
+    // supertype has only ONE impl (< MIN_SIBLINGS) => a DISTINCT step => FULL.
+    write(
+      'formatter.ts',
+      `
+export interface Formatter {
+  format(input: string): string;
+}
+`
+    );
+    write(
+      'response-formatter.ts',
+      `
+import { Formatter } from './formatter';
+import { JsonCodec } from './codec';
+
+export class ResponseFormatter implements Formatter {
+  format(input: string): string {
+    const detail = 'FORMATTER_BODY_MARKER';
+    // Calls into the Codec family from OFF the dispatch spine, so codec.ts is
+    // gathered as relevant but stays off-spine (mirrors Django: compiler.py is
+    // referenced by the flow yet off the QuerySet-iteration spine).
+    return new JsonCodec().encode(input) + detail;
+  }
+}
+`
+    );
+
+    // An off-spine sibling (implements Interceptor) the agent would otherwise
+    // skeletonize — BUT it owns a uniquely-named method `authenticate` the agent
+    // names in the query. Mirrors OkHttp's RealCall (named getResponseWith-
+    // InterceptorChain): a named callable means "show me this", so it stays full.
+    write(
+      'auth-interceptor.ts',
+      `
+import { Interceptor } from './interceptor';
+
+export class AuthInterceptor implements Interceptor {
+  authenticate(token: string): string {
+    const detail = 'AUTH_BODY_MARKER';
+    return 'auth:' + token + detail;
+  }
+  intercept(request: string): string {
+    return this.authenticate(request);
+  }
+}
+`
+    );
+
+    // A base class that DEFINES a >=3-impl supertype AND co-locates its
+    // subclasses in the same file — mirrors Django's compiler.py (SQLCompiler +
+    // SQLInsertCompiler/SQLUpdateCompiler/...). The subclasses' `extends` edges
+    // make the file look like a sibling, but it's the family's base/mechanism,
+    // so it must stay full.
+    write(
+      'codec.ts',
+      `
+export class Codec {
+  encode(input: string): string {
+    const detail = 'CODEC_BASE_MARKER';
+    return input + detail;
+  }
+}
+export class JsonCodec extends Codec {
+  encode(input: string): string { return '{' + input + '}'; }
+}
+export class XmlCodec extends Codec {
+  encode(input: string): string { return '<' + input + '>'; }
+}
+export class YamlCodec extends Codec {
+  encode(input: string): string { return '- ' + input; }
+}
+`
+    );
+
+    cg = CodeGraph.initSync(testDir, { config: { include: ['**/*.ts'], exclude: [] } });
+    await cg.indexAll();
+    handler = new ToolHandler(cg);
+  });
+
+  afterAll(() => {
+    if (cg) cg.destroy();
+    if (testDir && fs.existsSync(testDir)) {
+      fs.rmSync(testDir, { recursive: true, force: true });
+    }
+  });
+
+  beforeEach(() => {
+    // Each test asserts against the default (ON) behaviour unless it opts out.
+    delete process.env.CODEGRAPH_ADAPTIVE_EXPLORE;
+  });
+
+  it('fixture sanity: Interceptor has >=3 implementers, Formatter has <3', () => {
+    const find = (name: string, kind: string) =>
+      cg.searchNodes(name).map((r) => r.node).find((n) => n.name === name && n.kind === kind);
+
+    const interceptor = find('Interceptor', 'interface');
+    const formatter = find('Formatter', 'interface');
+    expect(interceptor).toBeTruthy();
+    expect(formatter).toBeTruthy();
+
+    const implementers = (id: string) =>
+      cg.getIncomingEdges(id).filter((e) => e.kind === 'implements' || e.kind === 'extends').length;
+
+    // The whole gate hinges on this signal — assert the fixture actually
+    // produces the >=3 / <3 split, so a TS-extraction change fails here loudly
+    // rather than silently flipping the skeletonization downstream.
+    expect(implementers(interceptor!.id)).toBeGreaterThanOrEqual(3);
+    expect(implementers(formatter!.id)).toBeLessThan(3);
+  });
+
+  it('skeletonizes off-spine polymorphic siblings (bodies elided, signatures kept)', async () => {
+    const result = await handler.execute('codegraph_explore', { query: QUERY, maxFiles: 12 });
+    const text = result.content?.[0]?.text ?? '';
+
+    // Precondition: the spine must have formed, or nothing skeletonizes.
+    expect(text).toContain('## Flow (call path among the symbols you queried)');
+
+    for (const [file, marker] of [
+      ['bridge-interceptor.ts', 'BRIDGE_BODY_MARKER'],
+      ['cache-interceptor.ts', 'CACHE_BODY_MARKER'],
+      ['retry-interceptor.ts', 'RETRY_BODY_MARKER'],
+    ] as const) {
+      const section = sectionFor(text, file);
+      expect(section, `${file} should be present in the explore output`).not.toBe('');
+      expect(section, `${file} should be skeletonized`).toContain(SKELETON_MARK);
+      // The signature line survives; the body (with its marker) is elided.
+      expect(section).toContain('intercept(request');
+      expect(section, `${file} body marker must NOT survive skeletonization`).not.toContain(marker);
+    }
+  });
+
+  it('keeps the on-spine exemplar full even though it is a sibling', async () => {
+    const result = await handler.execute('codegraph_explore', { query: QUERY, maxFiles: 12 });
+    const text = result.content?.[0]?.text ?? '';
+
+    const section = sectionFor(text, 'logging-interceptor.ts');
+    expect(section, 'logging-interceptor.ts should be present').not.toBe('');
+    expect(section, 'on-spine exemplar must NOT be skeletonized').not.toContain(SKELETON_MARK);
+    // Full source => the body marker is present.
+    expect(section).toContain('LOGGING_BODY_MARKER');
+  });
+
+  it('keeps a distinct step full (off-spine but supertype has < 3 implementers)', async () => {
+    const result = await handler.execute('codegraph_explore', { query: QUERY, maxFiles: 12 });
+    const text = result.content?.[0]?.text ?? '';
+
+    const section = sectionFor(text, 'response-formatter.ts');
+    expect(section, 'response-formatter.ts should be present').not.toBe('');
+    expect(section, 'a 1:1 interface impl is not a sibling and must stay full').not.toContain(SKELETON_MARK);
+    expect(section).toContain('FORMATTER_BODY_MARKER');
+  });
+
+  it('CODEGRAPH_ADAPTIVE_EXPLORE=0 disables skeletonization (siblings render full)', async () => {
+    process.env.CODEGRAPH_ADAPTIVE_EXPLORE = '0';
+    try {
+      const result = await handler.execute('codegraph_explore', { query: QUERY, maxFiles: 12 });
+      const text = result.content?.[0]?.text ?? '';
+
+      expect(text, 'no file should be skeletonized with the flag off').not.toContain(SKELETON_MARK);
+      // The previously-skeletonized siblings now render their full bodies.
+      const section = sectionFor(text, 'bridge-interceptor.ts');
+      expect(section).not.toBe('');
+      expect(section).toContain('BRIDGE_BODY_MARKER');
+    } finally {
+      delete process.env.CODEGRAPH_ADAPTIVE_EXPLORE;
+    }
+  });
+
+  // Names AuthInterceptor's `authenticate` and Codec's `encode` (both methods),
+  // plus the spine tokens so a spine still forms. Same Interceptor family as the
+  // skeleton test, plus the Codec base+subclasses family.
+  const SPARE_QUERY = `${QUERY} authenticate encode AuthInterceptor Codec JsonCodec`;
+
+  it('spares an off-spine sibling when the agent NAMED a callable in it (RealCall fix)', async () => {
+    const result = await handler.execute('codegraph_explore', { query: SPARE_QUERY, maxFiles: 15 });
+    const text = result.content?.[0]?.text ?? '';
+    expect(text).toContain('## Flow (call path among the symbols you queried)');
+
+    // auth-interceptor.ts is an off-spine Interceptor sibling — would skeletonize —
+    // but the agent named its method `authenticate`, so it stays FULL.
+    const auth = sectionFor(text, 'auth-interceptor.ts');
+    expect(auth, 'auth-interceptor.ts should be present').not.toBe('');
+    expect(auth, 'a file holding an agent-named callable must NOT be skeletonized').not.toContain(SKELETON_MARK);
+    expect(auth).toContain('AUTH_BODY_MARKER');
+
+    // Contrast: bridge-interceptor.ts — same family, named only by TYPE — still skeletonizes.
+    const bridge = sectionFor(text, 'bridge-interceptor.ts');
+    expect(bridge, 'a sibling named only by type still skeletonizes').toContain(SKELETON_MARK);
+    expect(bridge).not.toContain('BRIDGE_BODY_MARKER');
+  });
+
+  it('skeletonizes a base+subclasses family file even when named (compiler.py: family override beats the named spare)', async () => {
+    const result = await handler.execute('codegraph_explore', { query: SPARE_QUERY, maxFiles: 15 });
+    const text = result.content?.[0]?.text ?? '';
+
+    // codec.ts defines the base Codec (>=3 subclasses extend it) and co-locates the
+    // subclasses — a redundant, Read-anyway "family" file (Django's compiler.py). Even
+    // though the agent named `encode`, it STILL skeletonizes: a full one would eat the
+    // explore budget and starve the sibling files. Contrast auth-interceptor.ts above,
+    // which is named AND not a family file → spared. This is the override that keeps
+    // Django from regressing (sparing the family file cost more and Read more).
+    const codec = sectionFor(text, 'codec.ts');
+    expect(codec, 'codec.ts should be present').not.toBe('');
+    expect(codec, 'a named base+subclasses family file still skeletonizes (budget)').toContain(SKELETON_MARK);
+    expect(codec, 'the elided base body marker must NOT survive').not.toContain('CODEC_BASE_MARKER');
+  });
+});
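
For readers skimming the test file, here is a short standalone illustration of how its `sectionFor` helper and `SKELETON_MARK` work together. The helper body and the marker string are copied from the test above; the `sample` output is a hypothetical, hand-written approximation of an explore response, not captured tool output, and its exact header layout is an assumption.

```typescript
const SKELETON_MARK = '· skeleton (signatures only; Read for a full body)';

// Same logic as the test helper: return the `#### <path>` section for a file,
// from its header up to (not including) the next `###`/`####` header.
function sectionFor(text: string, basename: string): string {
  const lines = text.split('\n');
  const start = lines.findIndex((l) => l.startsWith('#### ') && l.includes(basename));
  if (start < 0) return '';
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    if (lines[i].startsWith('### ') || lines[i].startsWith('#### ')) {
      end = i;
      break;
    }
  }
  return lines.slice(start, end).join('\n');
}

// Hypothetical explore output: one skeletonized sibling, one full on-spine file.
const sample = [
  '## Flow (call path among the symbols you queried)',
  'dispatch -> proceed -> handleLogging',
  '### Source',
  `#### src/bridge-interceptor.ts ${SKELETON_MARK}`,
  'export class BridgeInterceptor implements Interceptor {',
  '  intercept(request: string): string',
  '}',
  '#### src/logging-interceptor.ts',
  'export class LoggingInterceptor implements Interceptor {',
  "  handleLogging(): string { const tag = 'LOGGING_BODY_MARKER'; return this.intercept(tag); }",
  '}',
].join('\n');

const bridge = sectionFor(sample, 'bridge-interceptor.ts');
const logging = sectionFor(sample, 'logging-interceptor.ts');

console.log(bridge.includes(SKELETON_MARK));          // true: rendered as a skeleton
console.log(logging.includes(SKELETON_MARK));         // false: rendered in full
console.log(logging.includes('LOGGING_BODY_MARKER')); // true: body survived
```

Scoping each assertion to one file's section is what lets the tests distinguish "this sibling was skeletonized" from "some file somewhere was skeletonized".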

+ 258 - 0
docs/design/adaptive-explore-sizing.md

@@ -0,0 +1,258 @@
+# Design + status: adaptive `codegraph_explore` sizing (sibling skeletonization)
+
+**Status:** Implemented & validated, **default-on**, on branch
+`feat/adaptive-explore-sizing` (initial commit `d6d059f`; **refined 2026-05-29**
+after a real-agent A/B exposed a read-back regression — see
+"Refinement" below). Escape hatch: `CODEGRAPH_ADAPTIVE_EXPLORE=0`.
+**Motivation:** make `codegraph_explore` size its output to the *answer* rather
+than always filling the budget cap — so a "sibling-heavy" flow (many
+interchangeable implementations of one interface) stops costing *more* than
+plain grep/read, without starving "diffuse" flows that genuinely need broad
+source.
+
+> **Refinement (2026-05-29) — the read-back regression.** The first cut gated
+> only on *off-spine + polymorphic-sibling*. A real-agent A/B (not the
+> deterministic probe) showed that this skeletonized two files the agent then
+> **Read back**, defeating the point: OkHttp's `RealCall` (it implements the
+> 9-impl `Lockable` *mixin*, so it tripped the sibling signal even though it's
+> the orchestrator) and Django's `compiler.py` (it *defines* `SQLCompiler` and
+> co-locates its subclasses). Two conditions fixed it — a file skeletonizes only
+> if it is **not spared**, where **spared = the agent NAMED a callable in it**
+> (`getResponseWithInterceptorChain`, `SQLCompiler.execute_sql` → keep it full)
+> **UNLESS the file DEFINES a ≥3-impl supertype** (a base+subclasses "family"
+> file is huge and Read-anyway, so skeletonizing it *frees explore budget* for
+> the sibling files the agent would otherwise Read). Result: OkHttp **3%
+> costlier → ~10% cheaper** (RealCall full, 0 read-backs); Django **10% costlier
+> → ~10% cheaper** (compiler.py skeleton frees ~6.5 KB of the 28 KB budget; half
+> the runs answer with 0 reads). The supertype signal was initially used as a
+> *spare* — that was backwards and regressed Django to 9% costlier by starving
+> its budget; it is now an *override* of the named-callable spare. The
+> single-condition history below is kept for context.
+
+---
+
+## TL;DR
+
+`codegraph_explore` returned full source for **every** relevant file up to its
+char budget. On a question whose answer spans many *same-shaped* classes — e.g.
+"how does OkHttp process a request through its interceptor chain?", which touches
+~14 `class … : Interceptor` implementations — that meant ~28 KB of mostly
+**redundant full bodies**. Because those bodies ride in the context window for
+the rest of the session, the WITH-CodeGraph arm cost *more* than the WITHOUT arm
+(which answers the well-named interceptor question in ~10 cheap greps). OkHttp
+was the benchmark's cost outlier (−3% — i.e. *costlier* than native search).
+
+Fix: when a file is **both (a) off the synthesized flow spine and (b) a
+polymorphic sibling**, render it as a **skeleton** (class + member *signatures*,
+bodies elided) instead of full source — keeping the on-spine exemplar and the
+mechanism in full.
+
+- **OkHttp:** the interceptor-chain flow skeletonizes the 5 redundant
+  `: Interceptor` impls while keeping `RealInterceptorChain` (the dispatch
+  mechanism) and `RealCall` (the orchestrator the agent named) full → **~10%
+  cheaper than native, 0 RealCall read-backs** (see Refinement for the corrected
+  numbers; the original `28.5k → 16.6k` / "reads 1 vs 3" figures came from a
+  deterministic probe query, not the agent's real query).
+- **Django:** the QuerySet→SQL flow skeletonizes `compiler.py` (a
+  base+subclasses family file), freeing budget → **~10% cheaper**. (The earlier
+  claim that Django was "byte-identical / 0 skeletons" was an artifact of the
+  *probe* query; the agent's real query DOES surface the SQLCompiler family.)
+- **Excalidraw / Tokio / VS Code / Gin:** explore output is **byte-identical**
+  with the flag on/off (0 skeletons) — their flows have no off-spine
+  ≥3-implementer sibling group. The corrected gate only *adds* a spare
+  condition, so it skeletonizes a **subset** of what the original gate did → these
+  repos provably stay at 0 skeletons (verified by probe).
+
+---
+
+## The problem in one picture
+
+`handleExplore` gathers relevant files, sorts by relevance, and fills up to
+`maxOutputChars` (the "whole-small-file rule" dumps any relevant file ≤220 lines
+in full). In effect the budget acts as a **target** to fill, not a ceiling:
+
+```
+OkHttp explore (shipped):  RealCall (full) + RealInterceptorChain (full)
+                         + CallServerInterceptor (full, 8.7k)
+                         + Bridge/Connect/Cache/… (full, ~4-5k each)   ← all ~same shape
+                         = ~28k, most of it redundant interceptor bodies
+```
+
+The agent only needs the **mechanism** (`RealInterceptorChain.proceed` iterating
+the chain) + the **contract** every interceptor implements + maybe one concrete
+example. The other five full bodies are padding — but only *because they're
+interchangeable*. On a diffuse question (Excalidraw's render pipeline:
+`mutateElement → … → renderStaticScene`), the off-spine files are **distinct
+steps**, and their bodies do real work — eliding them just makes the agent
+reconstruct them from signatures (more reasoning, net costlier; see "Dead ends").
+
+So the whole game is: **tell "interchangeable sibling" apart from "distinct
+step," cheaply.**
+
+## The gate (refined)
+
+A file is skeletonized iff **all** hold (and `CODEGRAPH_ADAPTIVE_EXPLORE != 0`):
+
+1. **A spine exists.** `buildFlowFromNamedSymbols` returns its path node set
+   (`pathNodeIds`) and the full set of agent-named callables (`namedNodeIds`). If
+   no spine forms, nothing skeletonizes.
+
+2. **Off the flow spine.** No symbol in the file is on the traced chain — that
+   chain is the mechanism the agent is walking, always kept full.
+
+3. **A polymorphic sibling.** The file's class `implements`/`extends` a supertype
+   with **≥ 3 implementers** (`MIN_SIBLINGS`) — the signal that it's one of many
+   *interchangeable* impls. From real `implements`/`extends` edges, cached.
+
+4. **Not spared.** A file is **spared** (kept full) iff the agent **named a
+   callable in it** — a named method/function is something the agent asked to
+   *see* (`getResponseWithInterceptorChain`, `SQLCompiler.execute_sql`), not an
+   interchangeable leaf — **UNLESS the file itself DEFINES a ≥3-impl supertype**.
+   That last clause is the override: a base+subclasses "family" file (Django's
+   `compiler.py`) is huge and Read-anyway, so a full copy just eats explore
+   budget; skeletonizing it *frees* that budget for the sibling files the agent
+   would otherwise Read. So: *named ⇒ spare, unless it's a family file ⇒
+   skeletonize anyway.*
+
+Worked through the two repos:
+
+- **`RealInterceptorChain`** — `proceed` is on the spine → kept full (cond. 2).
+- **`RealCall`** — off-spine, and it trips the sibling signal via the **9-impl
+  `Lockable` mixin** (not because it's an interchangeable interceptor). But the
+  agent named `getResponseWithInterceptorChain`/`execute`/`enqueue` in it, and it
+  defines no ≥3-impl supertype → **spared, kept full** (cond. 4). This is the fix
+  for the read-back: before cond. 4 it skeletonized and the agent Read it back.
+- **`BridgeInterceptor` & the other 4** — off-spine, ≥3-impl siblings, named only
+  by *type*, define no supertype → **skeletonized**. The win.
+- **Django `compiler.py`** — off-spine, a sibling (its subclasses extend
+  `SQLCompiler`), the agent named `execute_sql` in it — *but it defines the
+  `SQLCompiler` supertype*, so the override fires → **skeletonized** (frees
+  budget). Sparing it instead (the wrong first attempt) cost MORE and Read MORE.
+
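The four worked cases reduce to one small boolean predicate. The following is a minimal sketch, assuming a hypothetical `FileFacts` record that pre-computes each condition. `FileFacts` and `shouldSkeletonize` are illustrative names only; the real gate derives these facts from graph queries inside `handleExplore`.

```typescript
// Hypothetical per-file summary; each field stands in for one graph lookup.
interface FileFacts {
  onSpine: boolean;            // cond. 2: some symbol in the file is on the flow spine
  polymorphicSibling: boolean; // cond. 3: implements/extends a supertype with >= 3 impls
  namedInFile: boolean;        // the agent named a callable in this file
  definesSupertype: boolean;   // the file itself defines a >= 3-impl supertype
}

function shouldSkeletonize(f: FileFacts, spineExists: boolean, enabled = true): boolean {
  if (!enabled || !spineExists) return false; // flag off, or cond. 1 fails
  if (f.onSpine) return false;                // the mechanism path stays full
  if (!f.polymorphicSibling) return false;    // distinct steps stay full
  // cond. 4: named => spared, unless the file is a family file (the override).
  const spared = f.namedInFile && !f.definesSupertype;
  return !spared;
}

// For an on-spine file the remaining fields do not affect the result;
// the values given here are placeholders.
const realInterceptorChain: FileFacts =
  { onSpine: true, polymorphicSibling: false, namedInFile: true, definesSupertype: false };
const realCall: FileFacts =
  { onSpine: false, polymorphicSibling: true, namedInFile: true, definesSupertype: false };
const bridgeInterceptor: FileFacts =
  { onSpine: false, polymorphicSibling: true, namedInFile: false, definesSupertype: false };
const djangoCompiler: FileFacts =
  { onSpine: false, polymorphicSibling: true, namedInFile: true, definesSupertype: true };

const rows: Array<[string, FileFacts]> = [
  ["RealInterceptorChain", realInterceptorChain],
  ["RealCall", realCall],
  ["BridgeInterceptor", bridgeInterceptor],
  ["compiler.py", djangoCompiler],
];
for (const [file, facts] of rows) {
  console.log(file, shouldSkeletonize(facts, true) ? "skeleton" : "full");
}
```

Writing the spare as `namedInFile && !definesSupertype` keeps the override in one place: a family file can never be spared, whatever the agent named in it.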
+## Why "shared supertype with ≥3 implementers" is the signal
+
+The thing that makes OkHttp's interceptors interchangeable is precisely that
+they're **N implementations of one interface**, invoked polymorphically. That is
+a *structural* property the graph records as `implements`/`extends` edges:
+
+```
+14 classes ──implements──▶ Interceptor      (BridgeInterceptor, CacheInterceptor,
+                                              CallServerInterceptor, … )
+```
+
+Excalidraw's `renderStaticScene`, `Scene`, `Collab` share **no** common
+supertype — the ≥3-implementer query returns nothing for them. So the signal
+cleanly separates the two repos, and (validated below) leaves every non-sibling
+flow untouched.
+
+The `≥ 3` threshold matters: 1:1 "service interface → single impl" pairs (the
+common Spring/Java shape) are **not** siblings and stay full. Only genuine
+many-impl families (interceptor chains, strategy/visitor families, codec
+registries) trip the gate.
+
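The threshold check amounts to counting incoming `implements`/`extends` edges on the supertype. Below is a rough sketch over a plain in-memory edge list. The `Edge` shape, `makeSiblingCheck`, and the `OrderService` pair are assumptions made for illustration; the real code queries the graph through `cg.getOutgoingEdges` / `cg.getIncomingEdges`.

```typescript
// Hypothetical minimal edge shape; only the fields this check needs.
interface Edge { source: string; target: string; kind: string }

const MIN_SIBLINGS = 3;
const isInheritance = (e: Edge): boolean => e.kind === "implements" || e.kind === "extends";

// Returns a checker that caches, per supertype, whether it has >= MIN_SIBLINGS implementers.
function makeSiblingCheck(edges: Edge[]): (nodeId: string) => boolean {
  const hasManyImpls = new Map<string, boolean>();
  return (nodeId: string): boolean => {
    for (const e of edges) {
      if (e.source !== nodeId || !isInheritance(e)) continue;
      let many = hasManyImpls.get(e.target);
      if (many === undefined) {
        many = edges.filter((x) => x.target === e.target && isInheritance(x)).length >= MIN_SIBLINGS;
        hasManyImpls.set(e.target, many);
      }
      if (many) return true;
    }
    return false;
  };
}

const edges: Edge[] = [
  { source: "BridgeInterceptor", target: "Interceptor", kind: "implements" },
  { source: "CacheInterceptor", target: "Interceptor", kind: "implements" },
  { source: "CallServerInterceptor", target: "Interceptor", kind: "implements" },
  // A 1:1 "service interface -> single impl" pair; names are made up for the example.
  { source: "OrderServiceImpl", target: "OrderService", kind: "implements" },
];

const isSibling = makeSiblingCheck(edges);
console.log(isSibling("BridgeInterceptor")); // true: Interceptor has 3 implementers
console.log(isSibling("OrderServiceImpl"));  // false: OrderService has only 1
```

Caching per supertype rather than per class means a family of N siblings costs one count, not N.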
+## Skeleton rendering
+
+For a skeletonized file we emit the class + member **signature lines** (not
+bodies). Because a symbol node's `startLine` can point at a decorator/annotation
+(`@Throws`, `@Override`, `@objc`), we scan forward up to 4 lines for the line
+that actually *names* the symbol, so the skeleton shows the real signature:
+
+````
+#### …/CallServerInterceptor.kt — CallServerInterceptor, intercept, … · skeleton (signatures only; Read for a full body)
+```kotlin
+30  object CallServerInterceptor : Interceptor {
+32  override fun intercept(chain: Interceptor.Chain): Response {
+194 private fun shouldIgnoreAndWaitForRealResponse(code: Int): Boolean =
+```
+````
+
+The header still lists the file's symbols and says `Read for a full body`, so the
+agent can pull one specific implementation if it truly needs it.
+
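The forward scan is small enough to show in isolation. This is a sketch, assuming the file is already split into lines and `startLine` is 1-based. `signatureLine` is a hypothetical helper name, and the sample Kotlin lines are made up for the example.

```typescript
// Find the 1-based line that actually names the symbol, scanning at most
// `maxScan` lines from startLine; fall back to startLine if none matches.
function signatureLine(fileLines: string[], startLine: number, name: string, maxScan = 4): number {
  for (let k = 0; k < maxScan; k++) {
    if ((fileLines[startLine - 1 + k] ?? "").includes(name)) return startLine + k;
  }
  return startLine;
}

const sample: string[] = [
  "@Throws(IOException::class)",
  "override fun intercept(chain: Interceptor.Chain): Response {",
  "  return chain.proceed(chain.request())",
  "}",
];

// The symbol node starts at the annotation (line 1); the signature is on line 2.
const line = signatureLine(sample, 1, "intercept");
console.log(`${line}\t${(sample[line - 1] ?? "").trim()}`);
```

A plain substring match is deliberately loose: it can hit an earlier line that merely mentions the name, which is acceptable for a skeleton but would not be for anything that needs exact spans.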
+## Validation (refined gate)
+
+Headless `claude -p`, Opus 4.8, **WITH vs WITHOUT** CodeGraph (the real benchmark
+arm, not the on/off probe the first cut used). Cost = median `total_cost_usd`.
+
+| Repo | WITH→WITHOUT cost | WITH reads | WITHOUT reads | RealCall/compiler read-back |
+|---|---|---|---|---|
+| **OkHttp** (n=4) | **$0.45 → $0.50** (~10% cheaper) | 2 | 3.5 | **0 / —** (RealCall full) |
+| **Django** (n=6) | **$0.56 → $0.63** (~10% cheaper) | 2 | 8.5 | half the runs read 0 |
+
+Both were the README's **cost outliers** (OkHttp 3% costlier, Django 10%
+costlier) and both flipped to clear wins. OkHttp WITH was cheaper in all 4 runs;
+Django in 5 of 6 (n=6 to see through its high variance). WITHOUT baselines match
+the README ($0.50/$0.63 vs $0.57/$0.64), so the gain is the WITH-arm improving.
+
+The **decisive check now passes for the right reason**: with the named-callable
+spare, OkHttp's `RealCall` stays full and is **never** Read back (it was Read
+back in 3/4 runs before the fix). The inert repos (Excalidraw / Tokio / VS Code /
+Gin) stay at **0 skeletons** — verified by probe — because the refined gate
+skeletonizes only a subset of what the original did. (The first cut's "on vs off, reads
+flat 1 vs 3" claim came from a deterministic probe query and did **not** hold for
+the agent's real query — that mismatch is what this refinement corrects.)
+
+## Dead ends (don't re-attempt these)
+
+1. **Demote/rank low-value files** (e.g. broaden `isLowValuePath` to drop
+   `*-testing-support/` fixtures). Improves *content quality* but **not size** —
+   explore refills the freed budget with other full bodies (28,478 → 28,424).
+   Ranking ≠ shrinking; you must *skeletonize* to shrink.
+2. **Gate on entry-node membership.** A precise symbol-bag explore query *names*
+   every chain participant, so they're all "entry nodes" — no separation, nothing
+   skeletonizes.
+3. **Rely on interface-impl synthesizer edges** (`synthesizedBy:'interface-impl'`)
+   for the sibling signal. They were **not** created for OkHttp's `Interceptor`
+   (a Kotlin `fun interface`), so the signal must come from the real
+   `implements`/`extends` edges, not synth edges.
+4. **A plain "core-floor" gate** (keep first N full, skeletonize the rest) —
+   skeletonized Excalidraw's *distinct* steps → **+17% cost regression**. The
+   sibling condition is what makes it safe.
+5. **Sparing a file because it DEFINES the supertype** (the first refinement
+   attempt). Backwards: a base+subclasses *family* file (Django's `compiler.py`,
+   2,266 lines) is huge and Read-anyway, so keeping it full just **eats the 28 KB
+   explore budget and starves the sibling files** the agent then Reads — it
+   regressed Django to **9% costlier** ($0.71). Defining a supertype is instead
+   an **override** that lets a named family file skeletonize anyway.
+6. **Validating skeletonization with the deterministic probe query only.** The
+   probe (`probe-explore.mjs "<symbol bag>"`) and the *agent's* real explore
+   query name symbols differently, so they form different spines and skeletonize
+   different files. The probe said "Django: 0 skeletons / reads flat"; the real
+   agent query skeletonized `compiler.py` and Read it back. **Always confirm with
+   a real-agent A/B (`run-all.sh`), not just the probe.**
+
+## Code
+
+- `src/mcp/tools.ts`
+  - `adaptiveExploreEnabled()` — the flag (default on).
+  - `buildFlowFromNamedSymbols()` — returns `{ text, pathNodeIds, namedNodeIds }`.
+    `namedNodeIds` is every callable the agent named (a superset of the spine) —
+    the named-callable spare reads it.
+  - `handleExplore()` — two cached helpers: `isPolymorphicSibling()` (a node has
+    an outgoing `implements`/`extends` to a ≥3-impl supertype) and
+    `definesPolymorphicSupertype()` (a node HAS ≥3 incoming `implements`/`extends`
+    — i.e. the file is the family base). The skeleton branch:
+    `off-spine && isPolymorphicSibling && !(namedInFile && !definesSupertype)`.
+- `__tests__/adaptive-explore-sizing.test.ts` — 7 cases incl. the named-callable
+  spare (RealCall) and the supertype-family override (compiler.py).
+
+## Frontier / future work
+
+- **Per-symbol skeletonization within a family file.** `compiler.py` is
+  skeletonized whole, so `SQLCompiler.execute_sql` (the base mechanism) becomes a
+  signature too and *is* Read back in ~half the Django runs. The ideal is to keep
+  the base class's methods full and elide only the redundant subclass bodies —
+  shrinking the payload without eliding the answer. Whole-file skeletonization
+  can't express that yet.
+- **Big non-sibling files dominate Django's residual reads.** `query.py` (3,040
+  lines) and `sql/query.py` are not polymorphic families, so skeletonization
+  can't touch them; the agent Reads them when the 28 KB clustered view is
+  insufficient. That's the explore-budget / big-file-clustering frontier, not
+  skeletonization.
+- **Non-interface sibling families** (Go `HandlerFunc` slices, function-pointer
+  registries) aren't caught — they have no `implements`/`extends` edge. Gin's
+  middleware chain, for instance, doesn't trip the gate (its handlers are funcs,
+  not interface impls).
+- **Exemplar selection** when *no* interceptor is on the spine: today all siblings
+  skeletonize and the agent leans on the interface contract; showing one as a
+  forced exemplar might read slightly better (untested).

+ 135 - 7
src/mcp/tools.ts

@@ -239,6 +239,22 @@ function exploreLineNumbersEnabled(): boolean {
   return process.env.CODEGRAPH_EXPLORE_LINENUMS !== '0';
 }
 
+/**
+ * Adaptive explore sizing (default ON). `codegraph_explore` skeletonizes OFF-SPINE
+ * polymorphic-sibling files (a class that is one of ≥3 interchangeable impls of a
+ * shared supertype, e.g. OkHttp's `: Interceptor` classes) to class + member
+ * signatures, bodies elided. On-spine files stay full, as does any file the agent
+ * named a callable in, unless that file defines the family's supertype (see the
+ * skeleton gate in `handleExplore`). This sizes the response to the answer, not the
+ * budget cap: in the real-agent A/B, OkHttp and Django each went from costlier than
+ * native search to ~10% cheaper. Flows with no ≥3-implementer supertype (Excalidraw's
+ * `renderStaticScene`) keep full source; output is byte-identical to shipped on
+ * excalidraw / tokio / vscode / gin. Set `CODEGRAPH_ADAPTIVE_EXPLORE=0` to disable.
+ */
+function adaptiveExploreEnabled(): boolean {
+  return process.env.CODEGRAPH_ADAPTIVE_EXPLORE !== '0' && process.env.CODEGRAPH_ADAPTIVE_EXPLORE !== 'false';
+}
+
 /**
  * Prefix each line of a source slice with its 1-based line number, matching
  * the Read tool's `cat -n` convention (number + tab) so the agent treats it
@@ -1908,7 +1924,8 @@ export class ToolHandler {
    * whose qualifiedName contains another named token (`PmsProductServiceImpl::list`),
    * dropping unrelated `OmsOrderService::list`.
    */
-  private buildFlowFromNamedSymbols(cg: CodeGraph, query: string): string {
+  private buildFlowFromNamedSymbols(cg: CodeGraph, query: string): { text: string; pathNodeIds: Set<string>; namedNodeIds: Set<string> } {
+    const EMPTY = { text: '', pathNodeIds: new Set<string>(), namedNodeIds: new Set<string>() };
     try {
       const CALLABLE = new Set(['method', 'function', 'component', 'constructor']);
       // Strip only a REAL file extension (Create.cs → Create); KEEP qualified
@@ -1921,7 +1938,7 @@ export class ToolHandler {
           .map((t) => t.replace(FILE_EXT, '').trim())
           .filter((t) => t.length >= 3 && /^[A-Za-z_$][\w$]*(?:(?:::|\.)[\w$]+)*$/.test(t))
       )].slice(0, 16);
-      if (tokens.length < 2) return '';
+      if (tokens.length < 2) return EMPTY;
       // Pool of name SEGMENTS (Class + method from every token) used to
       // disambiguate an ambiguous SIMPLE name: keep a candidate only if its
       // CONTAINER class is itself named in the query.
@@ -1942,7 +1959,7 @@ export class ToolHandler {
         for (const n of pick.slice(0, 6)) named.set(n.id, n);
         if (named.size > 40) break;
       }
-      if (named.size < 2) return '';
+      if (named.size < 2) return EMPTY;
       const MAX_HOPS = 7;
       let best: Array<{ node: Node; edge: Edge | null }> | null = null;
       // BFS the full call graph (incl. synth edges) from each named seed, but
@@ -1974,7 +1991,7 @@ export class ToolHandler {
         chain.reverse();
         if (!best || chain.length > best.length) best = chain;
       }
-      if (!best || best.length < 3) return '';
+      if (!best || best.length < 3) return EMPTY;
       const out = ['## Flow (call path among the symbols you queried)', ''];
       for (let i = 0; i < best.length; i++) {
         const step = best[i]!;
@@ -1982,9 +1999,14 @@ export class ToolHandler {
         out.push(`${i + 1}. ${step.node.name} (${step.node.filePath}:${step.node.startLine})`);
       }
       out.push('', '> Full source for these symbols is below; codegraph_trace(from,to) for the exact path between two endpoints.', '');
-      return out.join('\n');
+      // namedNodeIds = every callable the agent explicitly named (a superset of
+      // the spine). A file holding one is something the agent asked to SEE, so it
+      // must keep full source even if it's an off-spine polymorphic sibling — the
+      // agent named `getResponseWithInterceptorChain` / `SQLCompiler.execute_sql`
+      // as the mechanism, not as an interchangeable leaf. See the skeleton gate.
+      return { text: out.join('\n'), pathNodeIds: new Set(best.map((s) => s.node.id)), namedNodeIds: new Set(named.keys()) };
     } catch {
-      return '';
+      return EMPTY;
     }
   }
 
@@ -2217,6 +2239,63 @@ export class ToolHandler {
     }
 
     // Step 4: Read contiguous file sections
+    // Compute the flow spine once — used both to prepend the Flow section (below)
+    // and to gate adaptive source sizing: files on the spine get full source,
+    // off-spine peers skeletonize.
+    const flow = this.buildFlowFromNamedSymbols(cg, query);
+
+    // Polymorphic-sibling detector for adaptive sizing. A class that implements/
+    // extends a supertype shared by >= MIN_SIBLINGS classes is one of many
+    // INTERCHANGEABLE implementations (OkHttp's 14 `: Interceptor` classes —
+    // showing one + the rest as signatures is enough), as opposed to a DISTINCT
+    // pipeline step (Excalidraw's `renderStaticScene`, which shares no supertype and
+    // must stay full or the agent loses real content). Only off-spine sibling files
+    // skeletonize; distinct steps and on-spine files keep full source. Cache
+    // supertype→(has ≥N implementers) so this stays a handful of edge queries.
+    const MIN_SIBLINGS = 3;
+    const siblingSuper = new Map<string, boolean>();
+    const isPolymorphicSibling = (nodes: Node[]): boolean => {
+      for (const n of nodes) {
+        for (const e of cg.getOutgoingEdges(n.id)) {
+          if (e.kind !== 'implements' && e.kind !== 'extends') continue;
+          let many = siblingSuper.get(e.target);
+          if (many === undefined) {
+            many = cg.getIncomingEdges(e.target)
+              .filter((x) => x.kind === 'implements' || x.kind === 'extends').length >= MIN_SIBLINGS;
+            siblingSuper.set(e.target, many);
+          }
+          if (many) return true;
+        }
+      }
+      return false;
+    };
+
+    // A file that DEFINES a polymorphic supertype (a class/interface with ≥
+    // MIN_SIBLINGS implementers) AND co-locates its subclasses is a redundant
+    // "family" file — Django's compiler.py holds `SQLCompiler` + its 4 subclasses
+    // (SQLInsert/Update/Delete/AggregateCompiler) in 2,266 lines. Such files are
+    // huge and read-anyway, so they should STILL skeletonize even when the agent
+    // named a method in them: a full one eats ~6.5K of the explore budget (Django
+    // is pinned at the 28K cap, truncating), starving the sibling files the agent
+    // then Reads. This flag OVERRIDES the named-callable spare below — it does NOT
+    // by itself spare a file. (OkHttp's RealCall implements the `Lockable` mixin
+    // but defines no ≥3-impl supertype, so the named spare keeps it full.)
+    const superMany = new Map<string, boolean>();
+    const definesPolymorphicSupertype = (nodes: Node[]): boolean => {
+      for (const n of nodes) {
+        if (n.kind !== 'class' && n.kind !== 'interface' && n.kind !== 'struct'
+            && n.kind !== 'trait' && n.kind !== 'protocol' && n.kind !== 'type_alias') continue;
+        let many = superMany.get(n.id);
+        if (many === undefined) {
+          many = cg.getIncomingEdges(n.id)
+            .filter((x) => x.kind === 'implements' || x.kind === 'extends').length >= MIN_SIBLINGS;
+          superMany.set(n.id, many);
+        }
+        if (many) return true;
+      }
+      return false;
+    };
+
     lines.push('### Source Code');
     lines.push('');
     lines.push('> The code below is the **verbatim, current on-disk source** of these files — re-read from disk on this call and line-numbered, byte-for-byte identical to what the Read tool returns. It is NOT a summary, outline, or stale cache. Treat each block as a Read you have already performed: do not Read a file shown here.');
@@ -2243,6 +2322,55 @@ export class ToolHandler {
       const fileLines = fileContent.split('\n');
       const lang = group.nodes[0]?.language || '';
 
+      // Adaptive sizing (CODEGRAPH_ADAPTIVE_EXPLORE, default on): skeletonize a file
+      // (member signatures, bodies elided) when it is a redundant member of a
+      // polymorphic family. Skeletonize iff ALL hold:
+      //   1. a flow spine exists,
+      //   2. no symbol in the file is on that spine (it's not the mechanism path),
+      //   3. it IS a polymorphic sibling (≥ MIN_SIBLINGS impls of a shared supertype),
+      //   4. it is NOT SPARED, where a file is spared iff the agent NAMED a callable
+      //      in it (`getResponseWithInterceptorChain` → keep RealCall.kt full so the
+      //      agent doesn't Read it back) UNLESS the file also DEFINES the family's
+      //      supertype — a base+subclasses "family" file (Django's compiler.py) is
+      //      huge and Read-anyway, so skeletonizing it FREES budget for the sibling
+      //      files the agent would otherwise Read (it's the cheaper option, proven by
+      //      A/B: sparing compiler.py cost MORE and Read MORE).
+      // Before condition 4, off-spine + sibling alone skeletonized RealCall.kt (it
+      // implements the 9-impl `Lockable` mixin), which the agent then Read back.
+      const namedInFile = group.nodes.some(n => flow.namedNodeIds.has(n.id));
+      const spared = namedInFile && !definesPolymorphicSupertype(group.nodes);
+      if (adaptiveExploreEnabled() && flow.pathNodeIds.size > 0
+          && !group.nodes.some(n => flow.pathNodeIds.has(n.id))
+          && isPolymorphicSibling(group.nodes)
+          && !spared) {
+        const syms = group.nodes
+          .filter(n => n.kind !== 'import' && n.kind !== 'export' && n.startLine > 0)
+          .sort((a, b) => a.startLine - b.startLine);
+        const seenLn = new Set<number>();
+        const skel: string[] = [];
+        for (const n of syms) {
+          // node.startLine can point at a decorator/annotation (@Throws, @Override,
+          // @objc), so scan forward a few lines for the line that actually NAMES the
+          // symbol — that's the signature the agent needs from a skeleton.
+          let lineNo = n.startLine;
+          for (let k = 0; k < 4; k++) {
+            if ((fileLines[n.startLine - 1 + k] || '').includes(n.name)) { lineNo = n.startLine + k; break; }
+          }
+          if (seenLn.has(lineNo)) continue;
+          seenLn.add(lineNo);
+          const sig = (fileLines[lineNo - 1] || '').trim();
+          if (sig) skel.push(exploreLineNumbersEnabled() ? `${lineNo}\t${sig}` : sig);
+        }
+        if (skel.length > 0) {
+          const names = [...new Set(group.nodes.filter(n => n.kind !== 'import' && n.kind !== 'export').map(n => n.name))]
+            .slice(0, budget.maxSymbolsInFileHeader).join(', ');
+          lines.push(`#### ${filePath} — ${names} · skeleton (signatures only; Read for a full body)`, '', '```' + lang, skel.join('\n'), '```', '');
+          totalChars += skel.join('\n').length + 120;
+          filesIncluded++;
+          continue;
+        }
+      }
+
       // Whole-small-file rule: if a relevant file is small enough to afford,
       // return it ENTIRELY instead of clustering. Clustering exists to tame
       // god-files (App.tsx ~13k lines); on a ~134-line component a cluster is a
@@ -2542,7 +2670,7 @@ export class ToolHandler {
     // maxOutputChars (observed 30k against a 28k tier cap). A fat explore
     // payload persists in the agent's context and is re-read as cache-input
     // on every subsequent turn, so the overrun is paid many times over.
-    const output = this.buildFlowFromNamedSymbols(cg, query) + lines.join('\n');
+    const output = flow.text + lines.join('\n');
     if (output.length > budget.maxOutputChars) {
       const cut = output.slice(0, budget.maxOutputChars);
       const lastNewline = cut.lastIndexOf('\n');