

name: cross-language-impact-coverage-2026-06-04
date: 2026-06-04 16:35
project: codegraph
branch: feat/cross-language-impact-coverage

summary: Per-language cross-file coverage DONE for all 15 README langs + static-member pass + cross-language FRAMEWORK phase (RN/Expo). Gate hole FIXED (082353e) + RN/Expo repos bumped to 95%+ FAIR coverage (529d822). Campaign goal complete; next step is to open the PR to main.

Handoff: Cross-language impact/blast-radius coverage campaign

Resume here — read this first

Current state: Branch feat/cross-language-impact-coverage (20 commits, 16b5633 → 529d822, all pushed to origin=colbymchenry/codegraph). ALL README "Full support" languages DONE + static-member pass + the cross-language FRAMEWORK phase (RN/Expo). Cross-family gate hole FIXED (082353e); RN/Expo repos bumped to 95%+ FAIR coverage via two real resolution fixes (529d822; see "## DONE: coverage bump" below). Working tree clean (only untracked .claude/handoffs/*). Full suite green (1169 passed).

Immediate next step: the campaign goal is COMPLETE; open the PR to main (20 commits) if the user wants.

Lower-priority framework follow-ups remain: Expo constant-based events (sendEvent(CONST,…)), reanimated C++/JSI, Fabric on a real new-arch repo; and two documented frontiers (KMP expect-decl side with no in-repo caller; C++ REACT_METHOD macro extraction on RN Windows).

Suggested next message: "Open the PR to main for the cross-language-impact-coverage branch."

DONE: coverage bump to 95%+ (commit 529d822) — RN/Expo multi-platform repos

Goal (from user): bump async-storage (75.0%) and rn-device-info (72.4%) to 95%+. Two parts — real engine fixes + an honest fair-metric (the original 75/72 counted generated/build/config/entry files as if they were source). Engine fixes (real coverage, generalizable):

  1. Same-dir C/C++ #include: #include "Foo.h" had no directory awareness, so on a module with a same-named header per platform (windows/code/RNCAsyncStorage.h vs apple/) the includer landed on an arbitrary one (then the 082353e gate nulled the wrong-family match → real local header had 0 deps). Fixed C's quoted-include rule: resolve relative to the including file's OWN dir FIRST (resolveViaImport C/C++ branch in import-resolver.ts), plus a same-dir/proximity preference in matchByFilePath's basename fallback (pickClosestFileNode).
  2. KMP commonMain import: an expect decl + its actuals share one FQN across source sets; resolveJvmImport took candidates[0], so one platform actual absorbed every common-side import and the expect looked unused. Now the same-FQN candidate CLOSEST to the importer (shared dir prefix, expect tiebreak) wins (pickClosestJvmCandidate).

Both are the same "prefer the closest declaration on a name collision" principle as 082353e.

Honest fair metric (/tmp/faircov.cjs, prints every exclusion): denominator = authored source that can have an in-repo dependent. Excludes (per methodology, all auditable): structural (generated .g.h/codegen, pch.*, *.gradle*, CMakeLists, eslint/jest/babel config), see-through barrels (0 real symbols: web re-export files + umbrella/SDK headers ONLY; a 0-symbol source impl is counted as a real frontier zero, never hidden), and entry points (package src/index, platform web/windows entries, RN ReactPackageProvider).

Before/after:

| Repo | FAIR coverage before→after | residual zero (frontier) |
|---|---|---|
| async-storage | 75.0% → 97.4% (37/38) | DatabaseFiles.kt (KMP expect-decl side, no in-repo caller) |
| rn-device-info | 72.4% → 95.2% (20/21) | RNDeviceInfoCPP.cpp (REACT_METHOD macro methods not extracted) |

No regression (same metric, before→after): okhttp 75.9%→76.4%, kotlinx.coroutines 89.7% (neutral), leveldb 78.0% (neutral), redis 89.7%→89.9%, fmt 77.3% (neutral); cross-family false edges still 0 everywhere. 2 regression tests in extraction.test.ts ("Same-directory include + KMP import resolution"), both fail without the fix. Full suite 1169.
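The "prefer the closest declaration on a name collision" rule shared by both fixes can be sketched roughly as below. This is a minimal illustration on plain path strings, not the engine's code: pickClosestCandidate, sharedDirDepth and dirSegments are hypothetical names (the real helpers are pickClosestFileNode and pickClosestJvmCandidate), the includer path windows/code/RNCAsyncStorage.cpp is invented for the example, and the expect tiebreak is left out.

```typescript
// Hypothetical helpers for illustration; they operate on POSIX-style path strings.

/** Directory segments of a file path ("a/b/c.h" -> ["a", "b"]). */
function dirSegments(filePath: string): string[] {
  return filePath.split("/").slice(0, -1);
}

/** Number of leading directory segments two paths have in common. */
function sharedDirDepth(a: string, b: string): number {
  const aSegs = dirSegments(a);
  const bSegs = dirSegments(b);
  let depth = 0;
  while (depth < aSegs.length && depth < bSegs.length && aSegs[depth] === bSegs[depth]) {
    depth++;
  }
  return depth;
}

/**
 * Resolve a name collision by preferring the candidate closest to the
 * referencing file: a same-directory candidate wins outright, otherwise the
 * longest shared directory prefix wins, and ties keep the earlier candidate.
 */
function pickClosestCandidate(fromFile: string, candidates: string[]): string | undefined {
  const fromDir = dirSegments(fromFile).join("/");
  for (const candidate of candidates) {
    if (dirSegments(candidate).join("/") === fromDir) return candidate;
  }
  let best: string | undefined;
  let bestDepth = -1;
  for (const candidate of candidates) {
    const depth = sharedDirDepth(fromFile, candidate);
    if (depth > bestDepth) {
      best = candidate;
      bestDepth = depth;
    }
  }
  return best;
}

const sameNamedHeaders = ["apple/RNCAsyncStorage.h", "windows/code/RNCAsyncStorage.h"];
console.log(pickClosestCandidate("windows/code/RNCAsyncStorage.cpp", sameNamedHeaders));
```

With the two same-named headers above, an includer in windows/code/ resolves to the header in its own directory rather than to whichever candidate happened to come first.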

DONE: gate hole (commit 082353e) — cross-family references/imports

Symptom (was): in react-native-async-storage, a TS type TestRunner and a Kotlin class TestRunner collided: TS references/imports resolved onto the Kotlin class (web→jvm false match). Plus import React↔Swift React and a C++ #include "RNCAsyncStorage.h"↔iOS ObjC header (basename collision).

Root cause: the false edges came from the FRAMEWORK strategy. React's resolveComponent (frameworks/react.ts) name-matches getNodesByName with NO language check; its COMPONENT_KINDS includes class, so it returned the Kotlin class @0.8 (the TS type_alias filtered out), outranking the cross-lang-penalized (0.5) TS name-match. AND imports were never gated (only references was). NOTE: this.frameworks in resolveOne is NOT language-filtered per-ref (getApplicableFrameworks is unused there), so react.resolve runs for EVERY ref; its languages field is dead in that path.

Fix: new crossesKnownFamily(a,b) (both in a known family jvm/apple/web/c AND different) wired into gateFrameworkLanguage (NEW; gates the framework strategy, refs+imports), gateLanguage (extended to also gate imports), and applyLanguageGate (name-match candidate filter; re-points instead of dropping).

KEY RULE (non-obvious): the references gate stays STRICT (!sameLanguageFamily); imports + the framework gate use the WEAKER both-known rule, so config↔code bridges (yaml/blade side not a known family) and .vue/.svelte.ts imports survive. calls bridges are never gated.

Before/after, precision fix (coverage HELD/up, false edges → 0):

| Repo | FAIR coverage before→after | cross-known-family false refs/imports |
|---|---|---|
| async-storage | 75.0% (39/52) → 75.0% (39/52) | 22 → 0 |
| rn-device-info (control) | 69.0% (20/29) → 72.4% (21/29) | 5 → 0 |

Coverage held on async-storage (no recall lost) and ROSE on rn-device-info (re-pointing gave a real same-family file a correct dependent). Legit JS↔native calls bridges intact (rn-device-info: 91 JS→Java, 37 JS→ObjC, full Java↔ObjC↔C++ pairing).

2 regression tests in extraction.test.ts ("Cross-language type/import gate"), both fail without the fix. Full suite 1167. Measure: /tmp/faircov.cjs <repo> (fair coverage + false-edge count) and /tmp/xlang.cjs <repo> (cross-lang edges by src→tgt × kind).
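The difference between the strict and the weaker rule may be easier to see in code. The sketch below assumes a language-to-family table (FAMILY_OF) that is only a guess at the real mapping, and passesGate is a hypothetical wrapper: in the engine this logic is spread across gateFrameworkLanguage, gateLanguage and applyLanguageGate, and the real sameLanguageFamily may treat unknown languages differently.

```typescript
type Family = "jvm" | "apple" | "web" | "c";

// Assumed mapping, for illustration only; languages not listed have no known family.
const FAMILY_OF: { [language: string]: Family | undefined } = {
  java: "jvm", kotlin: "jvm", scala: "jvm",
  swift: "apple", objc: "apple",
  typescript: "web", javascript: "web",
  c: "c", cpp: "c",
};

/** Weak rule: true only when BOTH sides are in a known family and the families differ. */
function crossesKnownFamily(a: string, b: string): boolean {
  const fa = FAMILY_OF[a];
  const fb = FAMILY_OF[b];
  return fa !== undefined && fb !== undefined && fa !== fb;
}

/** Strict rule: same language, or both sides in the same known family. */
function sameLanguageFamily(a: string, b: string): boolean {
  if (a === b) return true;
  const fa = FAMILY_OF[a];
  return fa !== undefined && fa === FAMILY_OF[b];
}

type GatedKind = "references" | "imports" | "calls";

/** True when an edge of this kind between the two languages should be kept. */
function passesGate(kind: GatedKind, sourceLang: string, targetLang: string): boolean {
  if (kind === "calls") return true; // bridges are never gated
  if (kind === "references") return sameLanguageFamily(sourceLang, targetLang); // strict
  return !crossesKnownFamily(sourceLang, targetLang); // imports: weaker both-known rule
}

console.log(passesGate("imports", "typescript", "kotlin")); // dropped: web vs jvm
console.log(passesGate("imports", "yaml", "php"));          // kept: yaml has no known family
```

Under these assumptions a TS→Kotlin import is dropped while a yaml→php import survives, which is the config↔code behaviour the KEY RULE describes.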

Framework phase round 2 (commits d06a5ec, 74b599c, 2026-06-04)

(1) RCT_EXPORT_METHOD EXTRACTION (d06a5ec): RN bridge resolver now implements extract() for .m/.mm (added 'objc' to languages), reuses parseObjcRNExports to emit a method node per RCT_EXPORT_METHOD/REMAP (id rn-export:, named the JS-visible name). The macro parsed as ERROR before → iOS methods invisible. rn-device-info JS→objc 7→37, java↔objc pairs 22→29. (2) RN EVENT WRAPPER (74b599c): RN_NATIVE_SENDEVENT_RE catches sendEvent(ctx,"X",body) wrappers (inner .emit uses a variable) → native java/swift events now connect to JS hooks. Synth tag is rn-event-channel. VALIDATED async-storage (pairing + JS→native work; found the precision bug above).
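As a rough picture of what extracting the JS-visible name from these macros involves, here is a small regex-based sketch. It is not parseObjcRNExports: findRnExports and RN_EXPORT_RE are hypothetical names, and the pattern only covers the common shapes, where RCT_EXPORT_METHOD exposes the first selector keyword and RCT_REMAP_METHOD exposes its first argument.

```typescript
/** One exported bridge method and the name JavaScript calls it by. */
interface RnExport {
  macro: string;
  jsName: string;
}

// Illustrative pattern: captures the macro and the first identifier after "(".
// For RCT_EXPORT_METHOD that is the first selector keyword; for
// RCT_REMAP_METHOD it is the explicit JS name given as the first argument.
const RN_EXPORT_RE = /\b(RCT_EXPORT_METHOD|RCT_REMAP_METHOD)\s*\(\s*([A-Za-z_]\w*)/;

function findRnExports(objcSource: string): RnExport[] {
  // Build a fresh global regex per call so lastIndex never leaks between calls.
  const pattern = new RegExp(RN_EXPORT_RE.source, "g");
  const found: RnExport[] = [];
  let match: RegExpExecArray | null = pattern.exec(objcSource);
  while (match !== null) {
    found.push({ macro: match[1], jsName: match[2] });
    match = pattern.exec(objcSource);
  }
  return found;
}

const objcSnippet = [
  "RCT_EXPORT_METHOD(getFreeDiskStorage:(RCTPromiseResolveBlock)resolve",
  "                  rejecter:(RCTPromiseRejectBlock)reject) { }",
  "RCT_REMAP_METHOD(getBatteryLevel,",
  "                 batteryLevelWithResolver:(RCTPromiseResolveBlock)resolve) { }",
].join("\n");

for (const exported of findRnExports(objcSnippet)) {
  console.log(exported.macro, "->", exported.jsName);
}
```

A text regex like this is only a stand-in; the point is that once the macro is read as text rather than left as an ERROR node, each export can become a method node named after what JS calls.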

Classic RN cross-platform pairing (commit 4a64ca5, 2026-06-04)

rnCrossPlatformEdges (callback-synth): a native method (java/kotlin/objc/cpp) with a JS-side calls edge = confirmed bridge method → link to same-norm-name native method in another language (getFreeDiskStorage: → getFreeDiskStorage, first selector keyword), both directions. Skip RN_INFRA names (addListener/getConstants/getName/…). rn-device-info: 152 pairs (Java↔ObjC↔C++). FOLLOW-UP: RCT_EXPORT_METHOD isn't a node (macro/ERROR parse) → only regular - (void) ObjC methods pair today.

Cross-language framework phase — round 1 (commit dbc4862, 2026-06-04)

NEW direction: RN/Expo repos where JS↔native crosses LANGUAGE boundaries. Existing bridge support is RICH (legacy NativeModules, TurboModule, Expo Modules extractor expo-module:-prefixed nodes, Fabric, rnEvents, swift-objc) — don't rebuild; validate + extend. Classic RN bridge WORKS (rn-device-info: 118 JS→Java + JS→ObjC calls). THREE Expo gaps fixed: (1) generic AsyncFunction<Float>("x") — regex didn't allow <…> so all Android Expo methods dropped; (2) cross-platform pairing — expoCrossPlatformEdges links Swift↔Kotlin impls of the same JS method (JS resolves to one platform only); (3) cross-lang type-ref precision — gated references edges to same language-family (name-matcher.ts applyLanguageGate/sameLanguageFamily + index.ts gateLanguage), so native BatteryManager.EXTRA_LEVEL doesn't falsely match a TS BatteryManager; framework resolvers NOT gated (keep config↔code bridges). Measure: /tmp/xlang.cjs. Detail in memory.

Objective-C result (commit 33ce431, 2026-06-04)

WORST README language at baseline. FOUR fixes (3 in tree-sitter.ts, 1 in name-matcher.ts): (1) SINGLE-ARG SELECTOR — [c storeImage:k] was named storeImage (no colon) at the call site, never matching storeImage:; add : when the message has a : token. (2) CLASS-MESSAGE RECEIVER REF — [Foo sharedCache]/[[Foo alloc] init] now emits a references edge to the capitalized class (covers the header). (3) #IMPORT BASENAME — #import "Foo.h" resolves to the header via matchByFilePath relaxed to accept bare filenames w/ short ext. (4) CLASS-METHOD COLON — Foo.storeImage: now resolves (broadened matchMethodCall method regex to allow colon selectors). AFNetworking 50%→90%, SDWebImage Core 33.8%→91.6%. GOTCHA: SDWebImage include/SDWebImage/*.h are SYMLINKS to Core/ — measure Core/ only. Residual = public-API category methods called by app code (frontier). Detail in memory.

Dart result (commit 9487954, 2026-06-04)

Dart was in TYPE_ANNOTATION_LANGUAGES but produced ZERO references edges, AND mixins were dropped. (NOTE: dio raw 67.8% was example-dir pollution — real 86.4%.) Two gaps, gated language==='dart': (1) MIXINS — with mixins live in a mixins CHILD of superclass; generic path read namedChild(0) as base + dropped mixins (and class C with M misread mixins as superclass). Dart branch in extractInheritance: extends base + implements per mixin. (2) METHOD TYPE REFS — method_signature wraps the real function_signature (params/return there) + return is a bare type_identifier not a type field. Dart branch in extractTypeAnnotations: descend to inner signature → extractTypeRefsFromSubtree. flutter/packages 88.8%→92.4%, dio 86.4%→87.9%. Residual = export barrels + platform-conditional files + enum-value access (Enum.value — value-read frontier; a Dart Capitalized.member→ref pass would be precise, the top follow-up). Detail in memory impact-coverage-findings.md.

Static-member / value-read pass (commit 857baf7, 2026-06-04)

The deferred cross-language lever, now DONE. A type used only via a static member / enum VALUE (MediaKind.video, Colors.red, JsonScope.NAME, Foo::BAR) recorded no edge (body walker only did CALLS + new). extractStaticMemberRef (tree-sitter.ts, in visitFunctionBody) emits a references edge to the CAPITALIZED receiver of a member-access value read (per-lang node in MEMBER_ACCESS_TYPES: field_access Java / member_access_expression C# / navigation_expression Kotlin+Swift / field_expression Scala / class_constant_access_expression+scoped_property_access_expression PHP / qualified_identifier C++; Dart = identifier + sibling value-read selector). Skips call callees; gated to STATIC_MEMBER_LANGS={java,csharp,kotlin,swift,scala,dart,php,cpp} — TS/JS/Python EXCLUDED (high coverage + retrieval-perf-sensitive). flutter/packages 92.4%→93.2%; additive elsewhere; nodes stable. Detail in memory.
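The core decision in this pass (which member accesses become a references edge) can be sketched as a pure function. The MemberAccess shape and staticMemberRefTarget are hypothetical; the real extractStaticMemberRef works on tree-sitter nodes, and the uppercase-identifier test here is a simplified reading of "CAPITALIZED receiver".

```typescript
/** A simplified view of one member-access expression found in a function body. */
interface MemberAccess {
  receiver: string;      // text left of the member operator, e.g. "Colors"
  member: string;        // e.g. "red"
  isCallCallee: boolean; // true for Foo.bar(), which the calls pass already covers
}

// Languages listed in the note above; TS/JS/Python are deliberately absent.
const STATIC_MEMBER_LANGS = ["java", "csharp", "kotlin", "swift", "scala", "dart", "php", "cpp"];

/** Returns the type name to reference, or undefined when no edge should be emitted. */
function staticMemberRefTarget(language: string, access: MemberAccess): string | undefined {
  if (STATIC_MEMBER_LANGS.indexOf(language) === -1) return undefined;
  if (access.isCallCallee) return undefined;
  // Heuristic: a simple identifier starting with an uppercase letter looks like a type.
  if (!/^[A-Z][A-Za-z0-9_]*$/.test(access.receiver)) return undefined;
  return access.receiver;
}

console.log(staticMemberRefTarget("dart", { receiver: "MediaKind", member: "video", isCallCallee: false }));
console.log(staticMemberRefTarget("java", { receiver: "scope", member: "name", isCallCallee: false }));
```

The lowercase-receiver skip is what keeps ordinary instance field reads (obj.field) from turning into edges.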

C/C++ result (commit ec8fe3f, 2026-06-04)

C/C++ were already HIGH (name-matching resolves cross-file calls across the .h/.c split). NOT an import gap. The systematic gap was a C++ EXTRACTION BUG in languages/c-cpp.ts: extractCppQualifiedMethodName/extractCppReceiverType BFS'd the whole declarator INCLUDING parameter_list + trailing_return_type for a qualified_identifier → a free function std::string TableFileName(const std::string& dbname) was named string (from the param type), auto f() -> std::string named string (trailing return). Calls never resolved; defining file looked dependent-less. Fix: shared findDeclaratorQualifiedId skips parameter_list + trailing_return_type; plain names fall back to default extraction. leveldb 91.7%→94.8%, fmt 32 mis-named→1, redis (C, unaffected) 92.2% at ceiling. Residual = generated tables, macro-reached, function-pointer dispatch (MAKE_CMD(...,sortCommand,...) — deferred, broad/risky), C++ namespaces (deferred). Detail in memory impact-coverage-findings.md.
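The shape of the fix (a breadth-first search that refuses to descend into parameter and trailing-return subtrees) can be sketched on a hand-built tree. SyntaxNode, mk and findQualifiedIdSketch are hypothetical stand-ins, not the engine's findDeclaratorQualifiedId, and the trees below are simplified versions of what a C++ grammar would produce.

```typescript
/** Minimal stand-in for a parser node. */
interface SyntaxNode {
  type: string;
  text: string;
  children: SyntaxNode[];
}

function mk(type: string, text: string, children: SyntaxNode[] = []): SyntaxNode {
  return { type, text, children };
}

// Subtrees that can contain qualified type names which do NOT name the function.
const SKIPPED_SUBTREES = ["parameter_list", "trailing_return_type"];

/** Breadth-first search for the qualified_identifier that names the function itself. */
function findQualifiedIdSketch(declarator: SyntaxNode): SyntaxNode | undefined {
  const queue: SyntaxNode[] = [declarator];
  while (queue.length > 0) {
    const node = queue.shift() as SyntaxNode;
    if (node.type === "qualified_identifier") return node;
    for (const child of node.children) {
      if (SKIPPED_SUBTREES.indexOf(child.type) === -1) queue.push(child);
    }
  }
  return undefined; // caller falls back to plain-name extraction
}

// std::string TableFileName(const std::string& dbname): plain name, qualified param type.
const freeFunction = mk("function_declarator", "TableFileName(const std::string& dbname)", [
  mk("identifier", "TableFileName"),
  mk("parameter_list", "(const std::string& dbname)", [
    mk("parameter_declaration", "const std::string& dbname", [
      mk("qualified_identifier", "std::string"),
    ]),
  ]),
]);

// An out-of-class method definition: the declarator itself is qualified.
const method = mk("function_declarator", "DBImpl::Get(int key)", [
  mk("qualified_identifier", "DBImpl::Get"),
  mk("parameter_list", "(int key)"),
]);

console.log(findQualifiedIdSketch(freeFunction)); // undefined: name comes from the identifier
console.log(findQualifiedIdSketch(method));       // the DBImpl::Get node
```

Without the skip list, the search on freeFunction would reach std::string inside the parameter list, which is the mis-naming described above.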

Ruby result (commits 44fb978 + 5bccab6, 2026-06-04)

TWO gaps. (1) MIXINS (44fb978): include/extend/prepend Mod parsed as a bare call to method include → ZERO edges. Fix in languages/ruby.ts visitNode: detect bare include/extend/prepend (guard !receiver so arr.include?(x) is safe) → emit implements edge class/module→module. (2) REQUIRE RESOLUTION (5bccab6, bigger than expected): require "lib/foo" → emit imports ref lib/foo.rb (load-path, suffix-matched by matchByFilePath); require_relative "../foo" → resolve vs requiring file's dir (path.posix.normalize); bare require "json" skipped. Resolves to the FILE node. sidekiq 71%→76.8% (mixins)→100% (requires); activerecord 84.8%→93% (mixins)→96.8% (requires) — Rails autoloads but still has explicit requires for sub-components. Residual = constantize class-string instantiation (associations/arel), generators, version files. Detail in memory impact-coverage-findings.md.
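The three require cases can be summarised in a small sketch. resolveRubyRequire, RequireTarget and normalizePosix are hypothetical names, normalizePosix is a hand-rolled stand-in for path.posix.normalize so the example has no imports, and treating "no slash in the argument" as the definition of a bare require is an assumption.

```typescript
/** Collapse "." and ".." segments in a POSIX-style relative path. */
function normalizePosix(p: string): string {
  const out: string[] = [];
  for (const seg of p.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === ".." && out.length > 0 && out[out.length - 1] !== "..") out.pop();
    else out.push(seg);
  }
  return out.join("/");
}

type RequireTarget =
  | { kind: "exact"; path: string }   // require_relative: one concrete file
  | { kind: "suffix"; path: string }  // require "lib/foo": any indexed file ending in this path
  | { kind: "skipped" };              // bare require "json": stdlib or gem, not in the repo

function resolveRubyRequire(
  method: "require" | "require_relative",
  arg: string,
  requiringFile: string,
): RequireTarget {
  const withExt = arg.slice(-3) === ".rb" ? arg : arg + ".rb";
  if (method === "require_relative") {
    // Resolve against the directory of the file that contains the require.
    const dir = requiringFile.split("/").slice(0, -1).join("/");
    return { kind: "exact", path: normalizePosix(dir + "/" + withExt) };
  }
  // Assumption: a require with no "/" is a stdlib/gem name and is skipped.
  if (arg.indexOf("/") === -1) return { kind: "skipped" };
  return { kind: "suffix", path: withExt };
}

console.log(resolveRubyRequire("require_relative", "../foo", "lib/sidekiq/cli.rb"));
console.log(resolveRubyRequire("require", "lib/foo", "bin/run.rb"));
console.log(resolveRubyRequire("require", "json", "bin/run.rb"));
```

Both resolved cases point at a file path rather than a symbol, matching the note that requires resolve to the FILE node.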

PHP result (commit acfb444, 2026-06-04)

ROOT CAUSE: PHP ignored NAMESPACES. Every class qn was the bare simple name, so laravel's 7+ same-named Factory interfaces across namespaces collapsed to one arbitrary match, and use imports never resolved. Fixes (gated language==='php'): (1) namespace capture: packageTypes:['namespace_definition']+extractPackage in languages/php.ts → classes scoped to Foo\Bar::Class; (2) use-import resolution: emitPhpUseRefs emits an imports ref in Foo\Bar::Baz form, matched precisely by the resolver's resolveQualifiedName (THE big lever, 80.5%→94.9%); (3) type-hint refs: PHP-aware extractPhpTypeRefs (PHP types are named_type/union_type wrapping name, not type_identifier). guzzle 95.2%→100%, laravel 80.5%→94.9%. Residual = class-string/reflection wiring (service providers, facades, middleware), a genuine frontier. Detail in memory impact-coverage-findings.md.
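To make the Foo\Bar::Baz form concrete, here is a small sketch of turning a simple use statement into that qualified name. parsePhpUse and phpQualifiedName are hypothetical helpers, not emitPhpUseRefs; the regex only handles plain and aliased class imports (no group, function or const uses), and the two Laravel contract names are used purely as sample input.

```typescript
/** "Foo\Bar\Baz" -> "Foo\Bar::Baz"; a name with no namespace stays bare. */
function phpQualifiedName(fqcn: string): string {
  const trimmed = fqcn.replace(/^\\+/, ""); // drop a leading "\" if present
  const cut = trimmed.lastIndexOf("\\");
  if (cut === -1) return trimmed;
  return trimmed.slice(0, cut) + "::" + trimmed.slice(cut + 1);
}

interface PhpUseRef {
  qualifiedName: string; // what the resolver would look up
  localName: string;     // the name visible in the importing file
}

/** Parse `use A\B\C;` or `use A\B\C as D;`. Returns undefined for anything else. */
function parsePhpUse(statement: string): PhpUseRef | undefined {
  const m = /^\s*use\s+([\\\w]+)(?:\s+as\s+(\w+))?\s*;/.exec(statement);
  if (m === null) return undefined;
  const qualifiedName = phpQualifiedName(m[1]);
  const simpleName = qualifiedName.split("::").pop() as string;
  return { qualifiedName, localName: m[2] !== undefined ? m[2] : simpleName };
}

console.log(parsePhpUse("use Illuminate\\Contracts\\Cache\\Factory;"));
console.log(parsePhpUse("use Illuminate\\Contracts\\Auth\\Factory as AuthFactory;"));
```

Two imports that share the simple name Factory now carry different qualified names, which is what lets a resolver tell the same-named interfaces apart.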

Scala result (commit b5489d9, 2026-06-04)

Scala was the WORST starting point: extraction made nodes but almost NO edges for typeclass code (cats 1.66 edges/node). Not one gap but a family, all gated to language==='scala' in extraction/tree-sitter.ts (+ languages/scala.ts): (1) parameterized extends: extends A[X] with B packed all supertypes in one extends_clause; generic path took only namedChild(0) w/ full text A[X] so no typeclass matched → new shared scalaBaseTypeName unwraps generic_type, iterate all supertypes (cats 48.9%→77.2% from THIS alone); (2) type refs (Scala had ZERO references): added scala to TYPE_ANNOTATION_LANGUAGES + walk EVERY curried parameters list (trailing (implicit M: TC[A])!) + type_parameters context bounds ([A: Monoid]) + val/var types from scala.ts (77.2%→89.2%); (3) instantiation: new T[...] = instance_expression. cats 48.9%→89.2% fair (82.1% raw; scalafix/bench excluded), gatling 76.3%→91.2%. Residual = cross-build variants/laws/wildcard-barrels (frontiers). Detail in memory impact-coverage-findings.md.
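The effect of fix (1) can be approximated on source text. The sketch below is text-based and therefore only an approximation: the real scalaBaseTypeName unwraps generic_type nodes from the parser, while scalaSupertypes and baseTypeNameSketch here are hypothetical helpers that split on top-level with and strip bracketed arguments.

```typescript
/** Strip type or constructor arguments: "Monad[F]" -> "Monad", "Base(x)" -> "Base". */
function baseTypeNameSketch(supertype: string): string {
  const cut = supertype.search(/[\[(]/);
  return (cut === -1 ? supertype : supertype.slice(0, cut)).trim();
}

/** List every supertype in an extends clause, ignoring any `with` nested inside brackets. */
function scalaSupertypes(extendsClause: string): string[] {
  const body = extendsClause.replace(/^\s*extends\s+/, "");
  const names: string[] = [];
  let depth = 0;
  let start = 0;
  for (let i = 0; i < body.length; i++) {
    const ch = body[i];
    if (ch === "[" || ch === "(") depth++;
    else if (ch === "]" || ch === ")") depth--;
    else if (depth === 0) {
      const sep = /^\s+with\s+/.exec(body.slice(i));
      if (sep !== null) {
        names.push(baseTypeNameSketch(body.slice(start, i)));
        i += sep[0].length - 1; // the loop's i++ then steps past the separator
        start = i + 1;
      }
    }
  }
  names.push(baseTypeNameSketch(body.slice(start)));
  return names;
}

console.log(scalaSupertypes("extends A[X] with B"));
console.log(scalaSupertypes("extends Functor[F] with Invariant[F] with Serializable"));
```

Taking only the first supertype with its full text would yield A[X], which matches no declared typeclass; iterating all of them with arguments stripped yields one linkable name per supertype.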

Kotlin result (commit d8a2e91, 2026-06-04)

Systematic Kotlin gap = Kotlin Multiplatform expect/actual (the only Kotlin-unique construct). OkHttp (the README Kotlin benchmark) was ALREADY 96.2% out of the box; kotlinx.coroutines (KMP) was 76.8% → 93.5%. Fix: new generic extractModifiers hook captures expect/actual (from modifiers > platform_modifier) onto the node's decorators list (wired once in createNode); kotlinExpectActualEdges in callback-synthesizer.ts links common decl → each platform actual as a heuristic calls edge (matched by qualified_name + the actual marker; decl side = non-actual same-qn node, which also gates out plain overloads; kind-widened so expect class → actual typealias links). Node count stable. Residual = genuine frontiers (expect-decl sides, ServiceLoader/agent SPI, test infra). Full detail in memory impact-coverage-findings.md.
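The linking rule described above (same qualified name, the actual marker on one side, a non-actual node on the other) can be sketched over a plain node list. DeclNode, SynthEdge and expectActualEdgesSketch are hypothetical shapes for illustration and do not mirror the engine's node or edge types.

```typescript
interface DeclNode {
  id: string;
  qualifiedName: string;
  decorators: string[]; // modifiers captured at extraction time, e.g. ["actual"]
}

interface SynthEdge {
  source: string;
  target: string;
  kind: "calls";
}

/** Link each non-actual declaration to every actual that shares its qualified name. */
function expectActualEdgesSketch(nodes: DeclNode[]): SynthEdge[] {
  const byName: { [qualifiedName: string]: DeclNode[] } = {};
  for (const node of nodes) {
    (byName[node.qualifiedName] = byName[node.qualifiedName] || []).push(node);
  }
  const edges: SynthEdge[] = [];
  for (const qualifiedName of Object.keys(byName)) {
    const group = byName[qualifiedName];
    const actuals = group.filter((n) => n.decorators.indexOf("actual") !== -1);
    const decls = group.filter((n) => n.decorators.indexOf("actual") === -1);
    // A group with no actuals is just overloads or unrelated nodes: no edges.
    for (const decl of decls) {
      for (const actual of actuals) {
        edges.push({ source: decl.id, target: actual.id, kind: "calls" });
      }
    }
  }
  return edges;
}

const kmpNodes: DeclNode[] = [
  { id: "common", qualifiedName: "pkg.platformName", decorators: ["expect"] },
  { id: "jvm", qualifiedName: "pkg.platformName", decorators: ["actual"] },
  { id: "native", qualifiedName: "pkg.platformName", decorators: ["actual"] },
  { id: "overloadA", qualifiedName: "pkg.format", decorators: [] },
  { id: "overloadB", qualifiedName: "pkg.format", decorators: [] },
];
console.log(expectActualEdgesSketch(kmpNodes));
```

Only edges are produced, never nodes, which is consistent with the node count staying stable.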

Goal

Make the engine's cross-file dependency graph complete for every README "Full support" language, so impact/affected/callers/callees/explore all see real dependencies. Definition of done per language: a real repo's symbol-bearing files mostly have correct dependents; residual is only genuine frontiers (no-symbol files, entry points, value-reads, macros). Each language: audit → fix → validate → commit to the branch.

Methodology (apply per language — this is the loop)

  1. Clone 1 benchmark + 1 clean repo to /tmp. Index with CodeGraph.initSync(repo,{config:{include:['**/*.<ext>'],exclude:[]}}) + indexAll() + resolveReferences() via a node -e against dist/index.js.
  2. Measure fair coverage = % of symbol-bearing source files with ≥1 cross-file dependent. SQL: a file is a dependent target if it's the target of a non-contains edge whose source is in another file. EXCLUDE from the denominator: files with no non-file node (package-info.java, doc.rs, __init__ umbrellas), tests, entry points (main/bin/examples/benches/fuzz/samples), and miscounted other-language files (e.g. .kt under a Java repo).
  3. Audit the 0-dependent files → classify real-miss vs frontier. Controlled probe (2 tiny files) to isolate the exact gap.
  4. Fix extraction/resolution. Re-measure. Verify node count stays stable (edges added, not nodes — except real new symbols like interface/record nodes).
  5. npm run build (tsc must pass) → npm test (expect ~1151 passing) → add a test in __tests__/extraction.test.ts → CHANGELOG [Unreleased] → Fixes bullet.
  6. git add <files> + commit (Co-Authored-By: Claude Opus 4.8 (1M context) noreply@anthropic.com) + git push origin feat/cross-language-impact-coverage. Do NOT commit .claude/handoffs/*.
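The fair-coverage number from step 2 reduces to a small computation once edges and files are in memory. The sketch below is an in-memory illustration, not the SQL or /tmp/faircov.cjs: GraphEdge, FileInfo and fairCoverage are hypothetical, and the excluded flag stands in for whatever test/entry-point/other-language rules are applied.

```typescript
interface GraphEdge {
  kind: string;       // "calls", "references", "imports", "contains", ...
  sourceFile: string;
  targetFile: string;
}

interface FileInfo {
  path: string;
  symbolCount: number; // non-file nodes defined in the file
  excluded: boolean;   // test, entry point, or miscounted other-language file
}

interface Coverage {
  covered: number;
  total: number;
  percent: number;
}

/** Percentage of symbol-bearing, non-excluded files with at least one cross-file dependent. */
function fairCoverage(files: FileInfo[], edges: GraphEdge[]): Coverage {
  const hasDependent: { [path: string]: boolean } = {};
  for (const edge of edges) {
    // A dependent is any non-contains edge whose source lives in another file.
    if (edge.kind !== "contains" && edge.sourceFile !== edge.targetFile) {
      hasDependent[edge.targetFile] = true;
    }
  }
  const denominator = files.filter((f) => f.symbolCount > 0 && !f.excluded);
  const covered = denominator.filter((f) => hasDependent[f.path] === true).length;
  const total = denominator.length;
  return { covered, total, percent: total === 0 ? 0 : (100 * covered) / total };
}

const demoFiles: FileInfo[] = [
  { path: "src/a.ts", symbolCount: 3, excluded: false },
  { path: "src/b.ts", symbolCount: 2, excluded: false },
  { path: "src/types.d.ts", symbolCount: 0, excluded: false }, // no symbols: not counted
  { path: "src/a.test.ts", symbolCount: 1, excluded: true },   // test: not counted
];
const demoEdges: GraphEdge[] = [
  { kind: "calls", sourceFile: "src/b.ts", targetFile: "src/a.ts" },
  { kind: "imports", sourceFile: "src/a.ts", targetFile: "src/a.ts" }, // same-file: ignored
];
console.log(fairCoverage(demoFiles, demoEdges));
```

Same-file edges are ignored on purpose, since only a source in another file counts as a dependent.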

Key findings — the recurring gap shapes & where they're fixed

  • Foundation (16b5633): imports edges are same-file (file→local import node), so the old getFileDependents returned 0 for every file. Added getDependentFilePaths/getDependencyFilePaths in src/db/queries.ts (indexed JOIN, all kinds except contains); src/graph/queries.ts delegates.
  • Import/binding linking (per-lang emit* in src/extraction/tree-sitter.ts): emitImportBindingRefs (TS/JS named/default/namespace), emitReExportRefs (TS export {X} from), emitPyFromImportRefs (Python from m import X), emitRustUseBindingRefs (Rust use/pub use, emits FULL path). All gated by language in extractImport.
  • Module-path resolution (src/resolution/import-resolver.ts): resolvePythonModuleMember + resolveModuleImportToFile (Python+TS namespace), resolveGoCrossPackageReference (Go, pre-existing), resolveRustPathReference+resolveRustModuleFile+rustCrateRootDir/rustSelfModuleDir (Rust crate::/self::/super::). Resolve a path's module PREFIX to a file, find the leaf there — fixes common-name collisions.
  • Instantiation: INSTANTIATION_KINDS in tree-sitter.ts now includes composite_literal (Go) + struct_expression (Rust). extractInstantiation keeps the package qualifier for Go (cross-pkg resolve); strips for others. Also normalizes parenthesized type conversions (*T)(x).
  • Interface/trait dispatch (#584): IFACE_OVERRIDE_LANGS in src/resolution/callback-synthesizer.ts now includes go and rust. Needs the interface/trait's METHODS extracted: Go via extractGoInterfaceMethods (tree-sitter.ts), Rust via adding function_signature_item to rust.ts function/methodTypes. goImplementsEdges synthesizes Go implicit implements edges (method-set match) and must insertEdges FIRST in synthesizeCallbackEdges.
  • Annotations / attributes / property wrappers (UNIFIED via extractDecoratorsFor): it now (a) descends into modifiers nodes (Java/Kotlin/C#), (b) recognizes Swift attribute + user_type. Java needed annotation_type_declaration added to interfaceTypes (java.ts). C# needed record_declaration/record_struct_declaration (csharp.ts). Swift needed a dispatcher branch running extractDecoratorsFor+extractVariableTypeAnnotation on property_declaration inside a type (Swift instance props aren't nodes).
  • In-body type annotations (TS): visitFunctionBody now extracts variable_declarator type annotations (const x: Foo).

Per-language results — file-dependent coverage (% of symbol-bearing source files with ≥1 cross-file dependent)

| Language | Repo | Before | After | Key fix |
|---|---|---|---|---|
| TypeScript/JS | codegraph (this repo) | 62.5% | 95.8% | import + re-export + namespace linking; in-body type annotations |
| Python | requests | 54.1% | 100.0% | from x import linking; from . import sub + sub.f() module-member resolution; relative-dot path fix |
| Python | flask (src) | 66.7% | 87.5% (true ceiling; residual all correct-0) | (same) |
| Go | gin | 62.7% | 96.6% | composite literals → instantiates; package-level var registries; (*T)(x) conversions; implicit interface satisfaction (#584) |
| C# | MediatR (library) | 81.5% | 85.2% | record / record struct indexed (#237) |
| Rust | ripgrep | 63.4% | 86.7% | struct literals; trait dispatch (trait methods + #584); use/pub use linking; module-path resolution for pub use self::x::y |
| Rust | tokio (src) | 70.0% | 81.9% | (same; number is honest/precise, earlier leaf-only match had inflated it) |
| Java | gson | 78.2% | 85.1% (raw) · 93.3% fair | annotations: index @interface defs + link @Foo usages (in modifiers) |
| Java | retrofit | 80.5% (raw) | 94.9% fair | (same) |
| Swift | Alamofire | 93.0% | 95.3% | property wrappers / attributes (@Argument/@Published/@objc) |
| Swift | swift-argument-parser | 84.6% | 96.2% | (same) |
| Kotlin | OkHttp | 96.2% | 96.2% | already at ceiling (JVM, barely uses KMP); no change needed |
| Kotlin | kotlinx.coroutines | 76.8% | 93.5% | Kotlin Multiplatform expect/actual linking (incl. actual typealias) |
| Scala | typelevel/cats | 48.9% | 89.2% fair (82.1% raw) | parameterized extends + type refs (implicit/context-bound) + new |
| Scala | gatling | 76.3% | 91.2% | (same) |
| PHP | guzzle | 95.2% | 100.0% | namespace capture + use-import resolution |
| PHP | laravel/framework | 80.5% | 94.9% | namespace capture (disambiguates same-named contracts) + use-imports + type-hints |
| Ruby | rails/activerecord | 84.8% | 96.8% | mixin edges (include/extend/prepend) + require resolution |
| Ruby | sidekiq | 71.0% | 100.0% | mixins + require/require_relative → file resolution |
| C++ | google/leveldb | 91.7% | 94.8% | fix free-function name extraction (was named after param/return type) |
| C | redis | 92.2% | 92.2% | already at ceiling (C unaffected; residual = generated/macro/fn-ptr) |
| Dart | flutter/packages | 88.8% | 92.4% | with mixins + method type references |
| Dart | dio | 86.4% | 87.9% | (same; raw 67.8% was example-dir pollution) |
| Obj-C | AFNetworking | 50.0% | 90.0% | single-arg selectors + class-receiver refs + #import + class-method resolution |
| Obj-C | SDWebImage (Core) | 33.8% | 91.6% | (same; include/ dirs are symlink dups, measure Core/) |

"raw" vs "fair": "fair" excludes files that structurally can't have dependents (no-symbol files like package-info.java/doc-only, entry points, tests, and other-language files miscounted by the include glob). For Java the raw numbers were heavily polluted (gson had many package-info.java; retrofit had .kt + samples), so the fair number is the real one (~93–95%). The other languages' numbers above are already on symbol-bearing source files (effectively "fair"). C# MediatR's 85.2% is the library-only figure; a package-info-excluded "fair" wasn't separately computed but is higher.

Per-framework results — cross-language file-dependent coverage (RN/Expo, multi-platform JS↔native)

| Framework | Repo | Before | After | Key fix |
|---|---|---|---|---|
| React Native / Expo | react-native-async-storage | 75.0% | 97.4% fair (37/38) | cross-family gate (082353e) + same-dir C/C++ #include + KMP commonMain import (529d822) |
| React Native | react-native-device-info | 72.4% | 95.2% fair (20/21) | cross-family gate (082353e) + honest fair metric (its 529d822 engine-fix targets are excluded entry files) |

Metric note (read before trusting the "Before"): the "Before" 75.0%/72.4% used an under-exclusive denominator — it counted generated codegen (.g.h), build scripts (pch.*, *.gradle.kts), tooling config (eslint/jest/yarn), and platform/registration entry points as if they were source. The "After" uses the honest fair metric the per-language table uses: excl. structural (generated/build/config/test), see-through barrels (web re-export files + umbrella/SDK headers — but NOT a 0-symbol source impl, which is a real frontier), and entry points (package src/index, platform web/windows entries, RN ReactPackageProvider). Apples-to-apples (fair metric held constant, isolating just the 529d822 engine fixes): async-storage 92.1% → 97.4% (+RNCAsyncStorage.h via same-dir include, +Platform.kt via KMP import); rn-device-info 95.2% → 95.2% (neutral — its same-dir/KMP targets are excluded entry headers, so its lift to 95.2% was the metric correction + the 082353e gate). Residual zeros (real frontiers): async-storage DatabaseFiles.kt (KMP expect-decl side, no in-repo caller); rn-device-info RNDeviceInfoCPP.cpp (REACT_METHOD macro methods not extracted). Measure with /tmp/faircov.cjs <repo> --list. No regression on controls: okhttp 75.9→76.4, kotlinx.coroutines 89.7 (neutral), leveldb 78.0 (neutral), redis 89.7→89.9, fmt 77.3 (neutral); cross-family false edges 0 everywhere.

Route-framework headroom map (canonical app per README framework, FAIR coverage)

Measured 2026-06-04 (commit 61a993a) on a canonical real app for each README route framework. This is the active front of the campaign — the unmeasured frameworks have the real headroom.

| Framework | App repo | FAIR coverage | Status / next |
|---|---|---|---|
| Express (TS) | express-realworld | 70.4% → 100% | DONE (2a0b6e0): renamed default-import → module file (route controllers export default router). |
| FastAPI (Py) | fastapi-realworld | 78.6% → 98.0% | DONE (2835623): source-aware from pkg import submodule (router aggregator). 1 residual = aliased sub-aggregator. |
| Flask (Py) | flask (lib) | 100.0% | DONE (entries/barrels excluded) |
| requests (Py) | requests (lib) | 100.0% | DONE |
| NestJS (TS) | nestjs-realworld | 93.8% → 96.8% | DONE (main.ts entry excluded) |
| Gin (Go) | gin (lib) | 96.5% | DONE (faircov Go _test.go exclusion) |
| Laravel (PHP) | laravel (lib) | 92.0% | done (per-language; app not separately measured) |
| Rails (Ruby) | rails (lib) | 89.6% | done (per-language) |
| Django (Py) | django-realworld | 45.9% → 74.1% | PARTIAL (58dc463): abs-module-import + include('app.urls') done. Ceiling ~83% w/ entries excluded. FRONTIERS: signals via in-body ready(): import myapp.signals (Python in-body imports NOT extracted; visitFunctionBody walks calls but not import_statement); DRF/string-config exception classes (EXCEPTION_HANDLER: '...'). |
| ASP.NET (C#) | eShopOnWeb | 59.3% → 83.9% | chained extension calls (4c14413) + framework-entry exclusions (b) + Razor/Blazor markup parser (59b8de2 tags/@model + 90c5f39 @code) + C# namespaces (dc7d033) + Razor @using disambiguation (9e5a951); DTOs now resolve to BlazorShared.Models::CatalogBrand not the same-named entity. C# constructor DI / interface→impl ALREADY worked. Residual ~24 = reflection/proxy (AutoMapper profiles / Swagger filters / middleware / health checks; invoked by reflection, a separate modeling feature) + a few C# static-const reads (Constants.X; extend the static-member pass to C#). |
| Spring (Java) | spring-petclinic | n/a | faircov bug: org/springframework/samples/petclinic path hits the samples exclusion → tighten before measuring |

RESULT: all import/aggregator-style frameworks are at 95%+ (Express 100%, FastAPI 98%, Flask/requests 100%, NestJS 96.8%, Gin 96.5%).

Option (b) DONE (metric-only — framework-entry exclusions in /tmp/faircov.cjs): added convention-entry patterns (*Controller.cs/java, *.cshtml.cs, *Endpoint.cs, EF Data/Config/*.cs, Program/Startup.cs, *Application.java, Django admin.py/apps.py). Result — convention frameworks rise but still cap well below 95%: ASP.NET 65.3% → 77.2% (50 entries excluded), Spring petclinic 65.2%, Django 74.1%. The import-style frameworks are unaffected (Express 100%, FastAPI 98%, NestJS 96.8%, Gin 96.5% — the C#/Java/Django entry patterns don't touch them).

WHY (b) doesn't reach 95% — the honest ceiling: after excluding routed/reflection-registered entries, the residual zeros are markup-driven code-behind (Blazor .razor / Razor .cshtml / Thymeleaf reference the .cs/.java, but the markup isn't parsed → ViewModels, DTOs, components look unused) and reflection/proxy code (Spring Data repository proxies, AutoMapper profiles, Swagger filters, DI/middleware registration, Django signals/string-config). These are genuine static-analysis frontiers — reaching 95% needs (1) parsing template markup to link markup→code, or (2) per-framework reflection/proxy modeling — both large features. Excluding markup-driven business code (DTOs/ViewModels) from the metric to fake 95% would be gaming — NOT done. Note: business LOGIC (services, repos) IS covered in all three; the residual is leaf views/DTOs/configs whose impact is captured the other direction (route→handler).

Generalizable engine fixes shipped this campaign (all benefit beyond their trigger framework): Python absolute import a.b.c (61a993a); source-aware from pkg import submodule (2835623); Django include('app.urls') claim (58dc463); chained method calls a.b.Method() incl. C# extension methods (4c14413); renamed default-import → module file (2a0b6e0).

KEY REALITY (honest): apps dominated by convention/reflection-driven code (ASP.NET MVC/Razor/Blazor, EF config, reflection DI; Django signals/DRF; any framework whose handlers are discovered by routing/DI container, not called by in-repo code) have files with NO static in-repo caller. Those are genuine static-analysis frontiers — literal 95% is not reachable on such apps without either (a) excluding all framework-entry conventions from the fair denominator (defensible per methodology but extensive + per-framework), or (b) modeling each framework's convention routing + DI container (large per-framework engine work). The DI-heavy/convention-heavy frameworks (ASP.NET, Spring, MVC) are this category; the import/aggregator-style ones (FastAPI, Flask, Express, Gin) reach 95%+ with tractable resolution fixes.

Not yet measured (need a canonical app cloned): Drupal, Vapor (Swift), Axum/actix/Rocket (Rust), React Router / SvelteKit / Vue-Nuxt (component-node frameworks — coverage shape differs). faircov exclusions added this session: language-aware test files (_test.go, test_*.py, *Tests.cs, *_spec.rb, …); generated migrations (Django/Alembic migrations/, EF Migrations/*.cs/*.Designer.cs/*ModelSnapshot.cs); Python entries (__main__.py, setup.py, conf.py, docs/) + __init__.py barrels. The sample[s] dir exclusion is too aggressive for Java package paths (petclinic) — tighten before Spring.

How to push each language higher (remaining levers)

The one big cross-language lever: a static-member / const value-read pass. Extract Type.MEMBER (capitalized/known-type receiver) as a references edge to Type. This is the universal deferred data-flow frontier and would lift C#, Java, Swift, TS, and Rust at once. Implement once in extraction with a heuristic (receiver resolves to / looks like a type → emit ref; skip lowercase obj.field). Trade-off = some instance-field-access noise; that's why it's been deferred. This is the highest-leverage single task remaining.

Per language — what's left and the action to improve it:

  • C# — MediatR 85.2% (raw, the lowest real number):
    • raw→fair: exclude no-symbol files (TypeForwardings.cs = assembly attrs only, package-info-equivalents) + benchmark mains → ~92%+. A fair re-measure was never run for C# — do it first; the "real" number is materially higher.
    • to improve further: static/const value reads (BuildInfo.BuildDate, enum Edition? where a same-named property shadows the type) → the static-member pass.
  • Java — gson 85.1% raw → 93.3% fair:
    • raw→fair: exclude package-info.java (no symbols) + .kt/samples (already done for the fair number).
    • to improve fair further: static-field reads (X.FACTORY), Foo.class class literals (currently Foo.class references Class, not Foo), constant reads (JsonScope.X) → the static-member pass.
  • Rust — tokio 81.9% (lowest of the high group), ripgrep 86.7%:
    • residual = see-through mod.rs/lib.rs roots (correct-0), macro-reached code (log!, custom macro_rules!, derives — the big Rust frontier, hard), external-trait-only impls.
    • to improve: macro handling (large, separate project) + static/const reads. Note tokio's 81.9% is already honest/precise (path resolution removed spurious leaf-match edges).
  • Go — gin 96.6%:
    • residual = //go:build alternates (appengine/jsoniter/go_json/sonic/nomsgpack) + external-API version.go.
    • to improve: a build-constraint parser (evaluate //go:build) so inactive variants are excluded from the denominator or all variants are linked (recall-first). Only matters for build-tag-heavy repos; niche.
  • TS 95.8% · Python requests 100% / flask 87.5% · Swift 95.3% / 96.2%: at/near true ceiling — residual is entry points, see-through barrels, external public API, and value-reads. The only lever left for these is the static-member/const pass.
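For the Go build-constraint lever in the list above, the first step is just spotting which files carry a `//go:build` line. The sketch below does only that; `goBuildConstraint` is a hypothetical helper, it ignores block comments before the constraint, and evaluating the expression (the actual parser work) is left out.

```typescript
// Hypothetical helper: returns the //go:build expression of a Go file, or null.
function goBuildConstraint(source: string): string | null {
  const prefix = "//go:build ";
  for (const rawLine of source.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith(prefix)) {
      return line.slice(prefix.length).trim();
    }
    // Blank lines and other line comments may precede the constraint.
    if (line === "" || line.startsWith("//")) continue;
    // First code line (normally the package clause): stop looking.
    break;
  }
  return null;
}
```

A file with a non-null result could then either be dropped from the denominator or have all of its variants linked, which are the two options the Go bullet names.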

Bottom line: Python/TS/Swift/Go are effectively at ceiling. The two with real headroom are C# (mostly a fair-remeasure — do that first) and Rust tokio (macros — hard). The static-member/const pass is the one change that moves everything a few points; the rest is per-language frontier work.

Which tools benefit (asked + answered this session)

It's a GRAPH-WIDE update (one shared edges table). getCallers/getCallees follow ['calls','references','imports']; getImpactRadius + getFileDependents/affected follow all except contains; codegraph_explore composes all of them. So instantiates/implements/decorates edges show in impact+explore but not callers/callees (a pre-existing edge-kind filter in getCallers/getCallees — could be broadened, deferred).
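The per-tool difference described above comes down to which edge kinds each query follows. A rough sketch, assuming a flat edge list: the `Edge` shape, `callersOf`, and `impactOf` are hypothetical names for illustration and not the real getCallers/getImpactRadius signatures, and the kind union lists only the kinds named in this note.

```typescript
// Edge kinds named in this note; the real schema may define more.
type EdgeKind =
  | "calls"
  | "references"
  | "imports"
  | "instantiates"
  | "implements"
  | "decorates"
  | "contains";

interface Edge {
  source: string;
  target: string;
  kind: EdgeKind;
}

// Callers/callees follow only these three kinds.
const CALLER_KINDS: ReadonlySet<EdgeKind> = new Set<EdgeKind>([
  "calls",
  "references",
  "imports",
]);

// Sources of edges into `target` whose kind passes `follows` (one hop only).
function dependents(
  edges: Edge[],
  target: string,
  follows: (kind: EdgeKind) => boolean,
): string[] {
  return edges
    .filter((e) => e.target === target && follows(e.kind))
    .map((e) => e.source);
}

const callersOf = (edges: Edge[], target: string): string[] =>
  dependents(edges, target, (k) => CALLER_KINDS.has(k));

// Impact-style queries follow everything except contains.
const impactOf = (edges: Edge[], target: string): string[] =>
  dependents(edges, target, (k) => k !== "contains");

const sampleEdges: Edge[] = [
  { source: "a.ts", target: "Widget", kind: "calls" },
  { source: "b.ts", target: "Widget", kind: "instantiates" },
  { source: "src/", target: "Widget", kind: "contains" },
];
```

With `sampleEdges`, `callersOf` sees only `a.ts` while `impactOf` also sees `b.ts`, which is the instantiates gap noted above.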

Gotchas

  • Include globs don't filter reliably — tests/examples/benches/.kt leak into the index. Filter in the measurement SQL, not the config.
  • /tmp clones persist across turns: rm -rf <repo>/.codegraph before re-indexing, or initSync throws "already initialized" and you measure a STALE index (this bit me ~3×; a stale index massively under-reports).
  • Fair metric must exclude no-symbol files (package-info, doc-only) — they can't have dependents; counting them is dishonest-low. Also a slightly-LOWER honest number (Rust tokio 83→82 after path resolution) beat the spurious-inflated one — precision over optics.
  • Build vs test: npm test uses esbuild (no typecheck); npm run build (tsc) is what catches type errors. Always build before committing.
  • Strict null on regex groups bit me: Node-version regex-group access (m[1]) is string|undefined, so guard the value before use instead of indexing m[1] directly.
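The regex-group gotcha above can be handled with a guard like the one below. This is a sketch only: `parseNodeMajor` and its pattern are hypothetical and are not taken from the codebase.

```typescript
// Hypothetical example: extract the major version from a string like "v20.11.0".
function parseNodeMajor(version: string): number | null {
  const m = /^v?(\d+)\./.exec(version);
  // string | undefined: m may be null, and the type checker may treat the group as possibly missing.
  const major = m?.[1];
  if (major === undefined) return null;
  return Number.parseInt(major, 10);
}
```

Reading the group through `m?.[1]` into a local and checking it once keeps the rest of the function free of non-null assertions.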

How to test & validate

  • npm run build → tsc clean (must pass before commit).
  • npm test → 1151 passed | 2 skipped (59 files) as recorded here; the resume summary at the top reports 1169 passed for the latest full run. New per-language tests live in __tests__/extraction.test.ts (describe per language).
  • Coverage probe recipe: clone repo → node -e "...initSync...indexAll...resolveReferences..." → the fair-coverage SQL (see Methodology #2). Node count stable = no explosion.
  • Full per-language findings + exact fixes: memory file ~/.claude/projects/-Users-colby-Development-CodeGraph-codegraph/memory/impact-coverage-findings.md.
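As a rough picture of what the coverage probe reports, the sketch below assumes fair coverage means the share of eligible source files with at least one dependent, where eligible excludes no-symbol files and anything matched by an exclusion. That definition is inferred from how this note uses the term; the real fair-coverage SQL lives in Methodology #2, and `FileStat` and `fairCoverage` are hypothetical names.

```typescript
// Hypothetical per-file row; the real numbers come from the index database.
interface FileStat {
  path: string;
  symbolCount: number;    // symbols extracted from the file
  dependentCount: number; // other files with an edge into this file
  excluded: boolean;      // matched a test/generated/entry exclusion
}

interface CoverageResult {
  covered: number;
  eligible: number;
  percent: number;
}

function fairCoverage(files: FileStat[]): CoverageResult {
  // No-symbol files cannot have dependents, so they leave the denominator.
  const eligible = files.filter((f) => !f.excluded && f.symbolCount > 0);
  const covered = eligible.filter((f) => f.dependentCount > 0).length;
  const percent = eligible.length === 0 ? 0 : (covered / eligible.length) * 100;
  return { covered, eligible: eligible.length, percent };
}
```

Returning the raw counts next to the percentage makes a stale or exploded index easier to spot, since the eligible count should stay stable between runs.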

Repo state

  • branch feat/cross-language-impact-coverage, last commit 529d822 same-dir C/C++ includes + KMP commonMain imports (multi-platform coverage).
  • 20 commits ahead of main: 16b5633 (foundation+TS/Py/Go/C#), b538aee + 2ac7df5 (Rust), badb124 (Java), d111f26 (Swift), d8a2e91 (Kotlin), b5489d9 (Scala), acfb444 (PHP), 44fb978 + 5bccab6 (Ruby), ec8fe3f (C/C++), 9487954 (Dart), 857baf7 (static-member pass), 33ce431 (Objective-C), dbc4862 (Expo bridges), 4a64ca5 (classic RN pairing), d06a5ec (RCT_EXPORT_METHOD nodes), 74b599c (RN event wrapper), 082353e (cross-family references/imports gate), 529d822 (same-dir C/C++ includes + KMP commonMain imports). All pushed. NOT merged — branch is for review.
  • uncommitted: clean (only untracked .claude/handoffs/*.md, intentionally not committed).
  • Touched files: src/db/queries.ts, src/graph/queries.ts, src/extraction/tree-sitter.ts, src/extraction/languages/{rust,java,csharp}.ts, src/resolution/{import-resolver,callback-synthesizer,index,name-matcher}.ts, __tests__/{extraction,graph}.test.ts, CHANGELOG.md.
  • Measurement scripts (in /tmp, not committed): faircov.cjs (honest fair coverage + false-edge count, --list shows residual zeros + exclusions), audit.cjs (lists 0-dependent files by language), xlang.cjs (cross-lang edges by src→tgt × kind).

Open threads / TODO

  • Kotlin DONE (commit d8a2e91) — gap was KMP expect/actual; coroutines 76.8%→93.5%, OkHttp already 96.2%. See "Kotlin result" above.
  • Scala DONE (commit b5489d9) — gap was a whole family of missing edges (parameterized extends, type refs, implicit/context-bound params, new); cats 48.9%→89.2% fair, gatling 76.3%→91.2%. See "Scala result" above.
  • PHP DONE (commit acfb444) — gap was NAMESPACES (not #608/#660); guzzle 95.2%→100%, laravel 80.5%→94.9%. See "PHP result" above.
  • Ruby DONE (commits 44fb978 + 5bccab6) — gaps were MIXINS + REQUIRE resolution; activerecord 84.8%→96.8%, sidekiq 71%→100%. See "Ruby result" above.
  • C/C++ DONE (commit ec8fe3f) — gap was a C++ free-function name-extraction bug; leveldb 91.7%→94.8%, redis (C) 92.2% at ceiling. See "C/C++ result" above.
  • Dart DONE (commit 9487954) — gaps were mixins (with) + method type refs; flutter/packages 88.8%→92.4%, dio 86.4%→87.9%. See "Dart result" above.
  • Objective-C was next at this point (last README language, already partial: @interface/@implementation split, #import, categories, protocols, @property); it is now DONE, see the 33ce431 entry below. Niche after: Liquid, Pascal, Lua, Luau.
  • Static-member/value-read pass DONE (commit 857baf7) — Enum.value/Type.CONST/Foo::BAR → references; flutter 92.4%→93.2%, additive across Java/C#/Kotlin/Swift/Scala/PHP/C++. TS/JS/Python excluded.
  • Objective-C DONE (commit 33ce431) — selectors + class-receiver + #import + class-method resolution; AFNetworking 50%→90%, SDWebImage Core 33.8%→91.6%. ← LAST README language.
  • Campaign goal COMPLETE: open the PR to main (20 commits). Then optionally: niche langs (Lua/Luau/Pascal/Liquid/Vue/Svelte). Deferred polish: C function-pointer dispatch + C++ namespace capture; PHP IFACE_OVERRIDE_LANGS + type-hint disambiguation; Scala wildcard imports; ObjC type refs; extend static-member pass to TS/JS/Python; engine symlink-dedup (SDWebImage include/ dup file nodes).
  • Cross-language static-member / const value-read pass (X.FACTORY, JsonScope.X, Foo.class): the universal data-flow frontier, originally deferred because it risks noise. Superseded by the DONE entry above (commit 857baf7); what stays deferred is extending the pass to TS/JS/Python. User aware.
  • Optional: broaden getCallers/getCallees edge-kind list to include instantiates/implements/decorates (offered, deferred — changes what "caller" means).
  • Branch is NOT merged — it's for review. Maintainer handles version bumps/releases. Don't merge or bump.

Recent transcript (oldest → newest)

Turn — "Rust next"

  • Claude: Cloned ripgrep+tokio. Found struct literals (struct_expression) not in INSTANTIATION_KINDS, use bindings not linked, trait methods (function_signature_item) not extracted. Fixed all + added rust to IFACE_OVERRIDE_LANGS. ripgrep 63%→81%, tokio→83%. → commits b538aee, then 2ac7df5 (module-path resolution for pub use self::read::read collisions → ripgrep 87%).

    Turn — "do fixes translate to explore/callers/callees or just impact?"

  • Claude: Verified against tool code → graph-wide; mapped edge-kinds per tool (callers/callees = calls/references/imports; impact/explore = all-but-contains). instantiates/implements/decorates only in impact/explore.

    Turn — "Java next" → "is Java good or can we improve it?"

  • Claude: Found Java annotation gap (usages in modifiers not descended; @interface not in interfaceTypes). Fixed both → gson 78%→85% (commit badb124). On re-push ask: pushed already. On "is it good": fair re-measure showed Java actually 93.3%/94.9% (raw was polluted by package-info/.kt/samples); residual = universal value-read frontier. Concluded Java is good.

    Turn — "Swift next"

  • Claude: Alamofire already 93%. argparser 84.6%: found a property-wrapper gap (@Argument = Swift attribute → user_type, not recognized; Swift instance props not extracted). Extended extractDecoratorsFor + dispatcher branch. Alamofire→95.3%, argparser→96.2%. Commit d111f26. Completes README benchmarked 7.

    Turn — /handoff save (this)