
docs: finalize 0.9.4 — consolidate CHANGELOG + re-validate README benchmark

Folds the framework sweep + retrieval work into [0.9.4] (2026-05-24). README benchmark table refreshed with current-build medians (avg 35% cost / 57% tokens / 46% time / 71% tool calls) + a v0.9.4 re-validation note.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Colby McHenry 1 month ago
parent
commit
3310285b94
2 changed files with 45 additions and 35 deletions
  1. CHANGELOG.md (+28 -18)
  2. README.md (+17 -17)

+ 28 - 18
CHANGELOG.md

@@ -7,23 +7,36 @@ a [GitHub Release](https://github.com/colbymchenry/codegraph/releases) tagged
 This project follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
 and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [Unreleased]
+## [0.9.4] - 2026-05-24
 
 ### Added
+- **Framework-aware route resolution — `request → route → handler → service`
+  flows now resolve end-to-end across the supported stacks.** Added or fixed
+  routing for Express (inline arrow handlers → services), Rails, Spring (Java +
+  Kotlin; bare and class-prefixed mappings), Django/DRF (`router.register` →
+  ViewSet), Laravel (`Controller@method`), Flask/FastAPI (decorator stacks,
+  empty-path routers, Flask-RESTful `add_resource`), Gin/chi (group-var routing),
+  ASP.NET (feature-folder + bare attribute routes), Drupal, Rust (Axum chained
+  methods, actix builder API), Vapor (Swift grouped routes), Play (`conf/routes`),
+  Vue/Nuxt SFC templates, Svelte/SvelteKit, and React Router (`<Route>` JSX +
+  object data-router).
+- **Dynamic-dispatch flow synthesis — `codegraph_trace`, `codegraph_callees`, and
+  `codegraph_explore` now follow flows that have no static call edge.** Bridged
+  channels: callback/observer registration, EventEmitter (`on`/`emit`), React
+  re-render (`setState` → `render`) and JSX children, Flutter `setState` → `build`,
+  C++ virtual overrides, and Java/Kotlin interface → implementation dispatch
+  (e.g. Spring `@Autowired svc.list()` → the impl). Each synthesized hop is
+  labeled inline in `trace` with where it was wired up.
 - **`CODEGRAPH_MCP_TOOLS` — trim the exposed MCP tool surface.** Set it to a
   comma-separated list of tool names (e.g. `trace,search,node,context`) to expose
   only those codegraph tools over MCP; unset exposes all of them. Names match on
   the short form, so `trace` and `codegraph_trace` are equivalent. Lets you
   constrain an agent to a minimal surface (or A/B-test tool selection) without
   editing the client's MCP config. Inert by default.
-- **Java/Kotlin interface & abstract dispatch is now traceable.** A call through
-  an injected interface (Spring `@Autowired FooService svc; svc.list()`) or an
-  abstract base previously dead-ended at the interface method — there's no static
-  edge to the implementation — so request→service→impl flows broke at the DI
-  boundary. CodeGraph now synthesizes interface/base-method → implementing-override
-  edges, so `codegraph_trace` and `codegraph_callees` follow the flow into the
-  implementation (e.g. controller → service interface → service impl). JVM-gated,
-  capped per class, overload-aware.
+- **Release archives now ship with a `SHA256SUMS` file**, and the npm launcher
+  verifies the bundle it downloads against it — a mismatch aborts before anything
+  runs. Releases published before this change have no checksum file, so the
+  verification is skipped (not failed) when none is available.
 
 ### Changed
 - **`codegraph_trace` now returns a self-contained flow dossier.** Each hop on
@@ -47,9 +60,13 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
   Scoped to the named symbols (no wrong-feature wandering) and bridge-capped (no
   god-function fan-out); absent when the query is fuzzy or has no connected chain.
 
-## [0.9.4] - 2026-05-22
-
 ### Fixed
+- **Static-extraction & resolution correctness fixes** underpinning the framework
+  work above: C++ inheritance (`base_class_clause` was unhandled, so C++ `extends`
+  edges were missing), Dart method body ranges (methods were extracted
+  signature-only), a Python builtin-name handler guard (handlers named
+  `index`/`get`/`update` were silently dropped), and an explore output-budget
+  regression that under-returned source on god-file repos.
 - **Orphaned `codegraph serve --mcp` processes after a parent SIGKILL.** When
   the MCP host (Claude Code, opencode, …) was force-killed — OOM killer, a
   `kill -9`, a container teardown — the child kept running indefinitely on
@@ -61,13 +78,6 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
   `5000`, `0` disables). Resolves
   [#277](https://github.com/colbymchenry/codegraph/issues/277).
 
-### Added
-- **Release archives now ship with a `SHA256SUMS` file**, and the npm launcher
-  verifies the bundle it downloads against it — a mismatch aborts before
-  anything runs. Releases published before this change have no checksum file, so
-  the verification is skipped (not failed) when none is available.
-
-### Fixed
 - **`codegraph: no prebuilt bundle for <platform>` after installing through a
   registry mirror.** Installing `@colbymchenry/codegraph` from a registry that
   hadn't mirrored the matching per-platform package — most often the
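
The `CODEGRAPH_MCP_TOOLS` entry in the CHANGELOG diff above describes a comma-separated allow-list in which the short and prefixed tool names are equivalent, and an unset variable exposes everything. A minimal sketch of that matching rule follows. It is not the project's implementation: the tool list, the `codegraph_` prefix constant, and both function names are assumptions made for illustration.

```python
import os

# Hypothetical tool list; the names are inferred from the changelog text,
# and the real server may expose a different set.
ALL_TOOLS = [
    "codegraph_trace",
    "codegraph_search",
    "codegraph_node",
    "codegraph_context",
    "codegraph_callees",
    "codegraph_explore",
]
PREFIX = "codegraph_"  # assumed prefix, so `trace` == `codegraph_trace`


def short_name(name):
    """Normalize a tool name to its short form."""
    name = name.strip()
    return name[len(PREFIX):] if name.startswith(PREFIX) else name


def exposed_tools(env_value, all_tools=ALL_TOOLS):
    """Return the tools to expose for a given CODEGRAPH_MCP_TOOLS value."""
    # Unset or blank: inert, expose every tool.
    if not env_value or not env_value.strip():
        return list(all_tools)
    wanted = {short_name(part) for part in env_value.split(",") if part.strip()}
    # Keep the server's own ordering; unknown names simply match nothing.
    return [tool for tool in all_tools if short_name(tool) in wanted]


if __name__ == "__main__":
    print(exposed_tools(os.environ.get("CODEGRAPH_MCP_TOOLS")))
```

Normalizing both sides to the short form keeps the comparison symmetric, so `trace,codegraph_context` selects the same tools whichever spelling is used.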

+ 17 - 17
README.md

@@ -76,26 +76,26 @@ When Claude Code explores a codebase, it spawns **Explore agents** that scan fil
 
 ### Benchmark Results
 
-Tested across **7 real-world open-source codebases** spanning 7 languages, comparing an agent (Claude Code, headless) answering one architecture question **with** and **without** CodeGraph. Each cell is the savings at the **median of 4 runs per arm**.
+Tested across **7 real-world open-source codebases** spanning 7 languages, comparing an agent (Claude Code, headless) answering one architecture question **with** and **without** CodeGraph. Each cell is the savings at the **median of 4 runs per arm**. _Re-validated on **v0.9.4** (2026-05-24)._
 
-> **Average: 35% cheaper · 59% fewer tokens · 49% faster · 70% fewer tool calls**
+> **Average: 35% cheaper · 57% fewer tokens · 46% faster · 71% fewer tool calls**
 
 | Codebase | Language | Cost | Tokens | Time | Tool calls |
 |----------|----------|------|--------|------|------------|
-| **VS Code** | TypeScript · ~10k files | 35% cheaper | 73% fewer | 41% faster | 72% fewer |
-| **Excalidraw** | TypeScript · ~600 | 47% cheaper | 73% fewer | 60% faster | 86% fewer |
-| **Django** | Python · ~2.7k | 34% cheaper | 64% fewer | 59% faster | 81% fewer |
-| **Tokio** | Rust · ~700 | 52% cheaper | 81% fewer | 63% faster | 89% fewer |
-| **OkHttp** | Java · ~640 | 17% cheaper | 41% fewer | 36% faster | 64% fewer |
-| **Gin** | Go · ~150 | 22% cheaper | 23% fewer | 34% faster | 19% fewer |
-| **Alamofire** | Swift · ~100 | 38% cheaper | 59% fewer | 51% faster | 77% fewer |
+| **VS Code** | TypeScript · ~10k files | 26% cheaper | 78% fewer | 52% faster | 85% fewer |
+| **Excalidraw** | TypeScript · ~640 | 52% cheaper | 90% fewer | 73% faster | 96% fewer |
+| **Django** | Python · ~3k | 12% cheaper | 36% fewer | 19% faster | 53% fewer |
+| **Tokio** | Rust · ~790 | 82% cheaper | 86% fewer | 71% faster | 92% fewer |
+| **OkHttp** | Java · ~645 | 2% cheaper | 13% fewer | 31% faster | 45% fewer |
+| **Gin** | Go · ~110 | 21% cheaper | 34% fewer | 27% faster | 40% fewer |
+| **Alamofire** | Swift · ~110 | 47% cheaper | 64% fewer | 48% faster | 83% fewer |
 
 The gains scale with codebase size: on large repos the agent answers from the index in a handful of calls with **zero file reads**, while the no-CodeGraph agent fans out across grep/find/Read (and the sub-agents it spawns). On a small repo like Gin (~150 files) native search is already cheap, so the margin narrows.
 
 <details>
 <summary><strong>Full benchmark details</strong></summary>
 
-**Methodology.** Each arm is `claude -p` (Claude Opus 4.7, Claude Code v2.1.145) run headlessly against the repo with `--strict-mcp-config`: **WITH** = CodeGraph's MCP server enabled, **WITHOUT** = an empty MCP config. Built-in Read/Grep/Bash stay available to both. Same question per repo, **4 runs per arm, median reported**. Cost = the run's `total_cost_usd`; Tokens = total tokens processed (input incl. cached + output); Time = wall-clock; Tool calls = every tool invocation, including those inside any sub-agents the model spawns. Repos cloned at `--depth 1` and indexed by the same CodeGraph build that served them.
+**Methodology.** Each arm is `claude -p` (Claude Opus 4.7) run headlessly against the repo with `--strict-mcp-config`: **WITH** = CodeGraph's MCP server enabled, **WITHOUT** = an empty MCP config. Built-in Read/Grep/Bash stay available to both. Same question per repo, **4 runs per arm, median reported**. Cost = the run's `total_cost_usd`; Tokens = total tokens processed (input incl. cached + output); Time = wall-clock; Tool calls = every tool invocation, including those inside any sub-agents the model spawns. Repos cloned at `--depth 1` and indexed by the same CodeGraph build that served them. Re-validated on codegraph **v0.9.4** (2026-05-24); per-repo numbers move run-to-run with how hard the without-arm thrashes (the median-of-4 smooths it, but tails remain — e.g. Tokio's without-arm hit $2.41/3m one batch).
 
 **Queries:**
 | Codebase | Query |
@@ -111,13 +111,13 @@ The gains scale with codebase size: on large repos the agent answers from the in
 **Raw medians — WITH → WITHOUT:**
 | Codebase | Cost | Tokens | Time | Tool calls |
 |----------|------|--------|------|------------|
-| VS Code | $0.42 → $0.64 | 393k → 1.4M | 1m 0s → 1m 43s | 7 → 23 |
-| Excalidraw | $0.54 → $1.02 | 851k → 3.2M | 1m 17s → 3m 14s | 12 → 83 |
-| Django | $0.41 → $0.62 | 499k → 1.4M | 1m 0s → 2m 25s | 9 → 48 |
-| Tokio | $0.50 → $1.04 | 657k → 3.4M | 1m 5s → 2m 56s | 9 → 75 |
-| OkHttp | $0.36 → $0.44 | 352k → 596k | 45s → 1m 11s | 5 → 14 |
-| Gin | $0.36 → $0.46 | 431k → 562k | 47s → 1m 11s | 7 → 8 |
-| Alamofire | $0.61 → $0.99 | 1.1M → 2.6M | 1m 19s → 2m 41s | 15 → 64 |
+| VS Code | $0.60 → $0.80 | 601k → 2.8M | 1m 10s → 2m 26s | 8 → 55 |
+| Excalidraw | $0.43 → $0.90 | 344k → 3.5M | 48s → 2m 58s | 3 → 79 |
+| Django | $0.59 → $0.67 | 739k → 1.2M | 1m 19s → 1m 38s | 9 → 19 |
+| Tokio | $0.42 → $2.41 | 379k → 2.6M | 53s → 3m 2s | 4 → 53 |
+| OkHttp | $0.47 → $0.47 | 636k → 730k | 42s → 1m 1s | 6 → 11 |
+| Gin | $0.37 → $0.47 | 444k → 675k | 44s → 1m 0s | 6 → 10 |
+| Alamofire | $0.61 → $1.14 | 1.0M → 2.8M | 1m 17s → 2m 27s | 12 → 69 |
 
 **Why CodeGraph wins:** with the index available, the agent answers directly — `codegraph_context` to map the area, then one `codegraph_explore` for the relevant source — and stops, usually with zero file reads. Without it, the agent (and the Explore sub-agents it spawns) spends most of its budget on discovery (find/ls/grep) before reading the right code. CodeGraph only helps when queried *directly*, so its instructions steer agents to answer directly rather than delegate exploration to file-reading sub-agents — otherwise a sub-agent reads files regardless and CodeGraph becomes overhead.
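
The README's methodology (median of 4 runs per arm, savings reported per cell) can be sketched as a small calculation. The per-run values below are hypothetical, chosen only so their medians match the Excalidraw cost row ($0.43 → $0.90). The per-repo percentages are copied from the updated table. Treating the headline figure as the plain mean of those percentages is an assumption, though it does reproduce the published averages.

```python
from statistics import mean, median


def savings_pct(with_runs, without_runs):
    """Percent saved at the median of each arm, rounded to a whole number."""
    with_median = median(with_runs)
    without_median = median(without_runs)
    return round(100 * (1 - with_median / without_median))


# Hypothetical per-run costs (USD); only their medians come from the README.
excalidraw_with = [0.40, 0.42, 0.44, 0.50]      # median 0.43
excalidraw_without = [0.85, 0.88, 0.92, 1.10]   # median 0.90

# Per-repo savings copied from the updated benchmark table, in row order:
# VS Code, Excalidraw, Django, Tokio, OkHttp, Gin, Alamofire.
table = {
    "cost": [26, 52, 12, 82, 2, 21, 47],
    "tokens": [78, 90, 36, 86, 13, 34, 64],
    "time": [52, 73, 19, 71, 31, 27, 48],
    "tool_calls": [85, 96, 53, 92, 45, 40, 83],
}

# Assumed aggregation: unweighted mean across the seven repos.
headline = {metric: round(mean(values)) for metric, values in table.items()}

if __name__ == "__main__":
    print(savings_pct(excalidraw_with, excalidraw_without))
    print(headline)
```

Using the median rather than the mean per arm is what keeps a single thrashing run, such as the Tokio without-arm batch mentioned in the methodology note, from dominating a cell.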