
fix(extraction): qualified Type::member refs skip the name gate — no-import references resolve (#812)

`KtHandlers::handle` registered from another file produced no edge: the
extraction gate required the scope to be a same-file type or an IMPORTED
name, but Java/Kotlin same-package references and Kotlin companion members
need no import at all, so the gate could never see them. (The "companion
members extract unqualified" limit recorded during Arc A was a probe
artifact: a SINGLE-LINE `class X { companion object { … } }` is an
upstream tree-sitter-kotlin misparse (ERROR node); real multi-line
companions extract transparently as qualified methods of the class.)

Qualified `Type::member` candidates now skip the name gate the same way
`this.<member>` ones do: the explicit-ref syntax is self-selecting, and
resolution stays scope-suffix-anchored + unique-or-drop, so a
`Decoy::handle` can never match a `KtHandlers::handle` ref (tested).
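
To make the two halves of that argument concrete, here is a minimal sketch of the gate and of a unique-or-drop lookup. It is not the code in `src/extraction/tree-sitter.ts`: `Candidate`, `IndexedNode`, `passesNameGate` and `resolveQualified` are hypothetical names, the file-scope `ungatedModes` condition is left out, and reading "scope-suffix-anchored" as "the node's qualified name must end with the full `Type::member` ref" is an assumption.

```typescript
// Hypothetical, simplified shapes; the real candidate and node types carry more fields.
interface Candidate {
  name: string; // e.g. "handle", "this.handleSubmit", "KtHandlers::handle"
  skipGate?: boolean; // set for PHP HOF-position string callables
}

interface IndexedNode {
  id: string;
  qualifiedName: string; // e.g. "KtHandlers::handle"
}

// Name gate after this change: `this.<member>` and qualified `Type::member`
// candidates bypass it; every other candidate must name a same-file
// function/method or an imported binding.
function passesNameGate(
  c: Candidate,
  definedHere: Set<string>,
  importedNames: Set<string>,
): boolean {
  if (c.name.startsWith('this.') || c.name.includes('::')) return true;
  if (c.skipGate === true) return true;
  return definedHere.has(c.name) || importedNames.has(c.name);
}

// Assumed reading of "scope-suffix-anchored + unique-or-drop": a node matches
// only if its qualified name is the ref itself or ends with "::" + ref, and an
// edge is produced only when exactly one node matches.
function resolveQualified(ref: string, nodes: IndexedNode[]): IndexedNode | undefined {
  const matches = nodes.filter(
    (n) => n.qualifiedName === ref || n.qualifiedName.endsWith('::' + ref),
  );
  return matches.length === 1 ? matches[0] : undefined;
}

const nodes: IndexedNode[] = [
  { id: 'a', qualifiedName: 'KtHandlers::handle' },
  { id: 'b', qualifiedName: 'Decoy::handle' },
];

// No same-file definitions and no imports: the qualified ref still passes.
const gateOpen = passesNameGate({ name: 'KtHandlers::handle' }, new Set(), new Set());
// The ref is anchored to its scope, so the decoy is never a candidate.
const resolved = resolveQualified('KtHandlers::handle', nodes);
```

Under these assumptions the gate no longer filters qualified refs at all, so precision rests entirely on the resolver: a ref whose scope matches no node, or more than one, yields no edge rather than a guessed one.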

A/B vs main: rxjava +4 (same-package `Maybe::just` / `Single::just`
method refs), fmt +3 (gtest `&Test::DeleteSelf_` /
`&TestSuite::RunSetUpTestSuite` cross-file member pointers), okio 0-delta,
redis byte-identical — every new edge verified genuine, zero calls edges
touched, node counts identical.

Full suite 1392 passed. EXTRACTION_VERSION 22 → 23 (re-index to benefit).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry 1 week ago
parent
commit
dce61a5f4a

+ 1 - 0
CHANGELOG.md

@@ -21,6 +21,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - Callback-registration coverage deepened across four more shapes: a `this.<member>` registration whose method lives on a **base class** now resolves through the inheritance chain (`bus.on("submit", this.handleSubmit)` in a subclass links to the parent's `handleSubmit`); Java and Kotlin **method references to other classes** (`Handlers::onMessage`, `OtherClass::handle`) resolve across files, with `this::` and `super::` scoped to the defining class and references through a variable deliberately left out; and Swift bare callback names now match only the **enclosing type's** methods (implicit `self`), eliminating a class of wrong edges where a parameter like `request` linked to a same-named method on an unrelated type. (Java, Kotlin, Swift, TypeScript, JavaScript)
 - PHP **string and array callables** now register: a string passed to a callable-taking core function (`usort($items, 'cmp_items')`, `array_map('absint', …)`, `call_user_func`, `spl_autoload_register`, …) links to that function — including across files — and the array forms `[$this, 'method']` and `[Foo::class, 'method']` link to the named method (the `$this` form resolves through the class and its parents). Strings passed to arbitrary functions are deliberately ignored: only known callable positions are trusted. Validated on WordPress core (+556 edges, every sampled edge a genuine registration). (PHP)
 - Ruby **lifecycle-hook symbols** now register: `before_action :authenticate`, `after_save :reindex`, `around_create`, `validate :check`, `rescue_from(…, with: :handler)` and friends link the symbol to the method it names — on the class itself or **inherited from a parent** (`before_action :authenticate` in a controller resolves to `ApplicationController`'s method). `validates` (plural) is excluded since its symbols name attributes, not methods. Validated on rails/rails (+385 edges, every sampled edge genuine). (Ruby)
+- Method references to a type that needed **no import** now resolve: Java/Kotlin same-package references (`.concatMapMaybe(Maybe::just, …)`), **Kotlin companion-object members** (`KtHandlers::handle`), and cross-file C++ member pointers (`&TestSuite::RunSetUpTestSuite`). Resolution stays anchored to the named type, so a same-named member on a different class never matches. (Java, Kotlin, C++)
 - CodeGraph now sees where a function is **registered as a callback**, not just where it's called. A function name passed as an argument (`signal(SIGINT, handler)`, `qsort(…, compare)`, `addEventListener(…, onBlur)`), assigned to a function pointer or field (`ops->recv_cb = my_cb`, `OnClick := Handler`), or placed in a struct initializer or handler table (`{ .recv_cb = my_cb }`, `{ "get", getCommand }`) now produces a reference edge from the registration site to the function — so `codegraph_callers` and `codegraph_impact` surface callback wiring that previously looked like dead code. Works across all supported languages, including the language-specific forms: C/C++ `&fn`, Java `Class::method`, Kotlin `::fn`, Swift `#selector`, Objective-C `@selector`, Ruby `method(:fn)`, Scala eta-expansion, and Delphi/Pascal `@Handler` and `OnClick := Handler` event wiring. Callers output labels these "via callback registration". Resolution is deliberately conservative: an ambiguous name produces no edge rather than a wrong one. Re-index a project to benefit. Thanks @zmcrazy. (#756)
 - The `codegraph_node` MCP tool can now **read a whole source file like the built-in Read tool — only faster, served from the index**. Pass a file path with no symbol and it returns that file's current source with line numbers (the same `<n>⇥<line>` shape Read produces, so an assistant can edit straight from it), narrowable with `offset`/`limit` exactly like Read, plus a one-line note of which files depend on it (the file's blast radius). Use it anywhere you'd reach for Read on an indexed source file. Pass `symbolsOnly: true` for just the file's structure. Configuration/data files (`.yml` / `.properties`) are summarized by key only, never dumped, so secrets in them are never surfaced. The agent-facing guidance was also retuned so assistants reach for codegraph while *implementing* a change (not only when answering questions), since one codegraph call returns the same bytes plus the blast radius, faster than re-reading the file.
 - New `codegraph upgrade` command updates CodeGraph to the latest release in place — it detects how you installed (the standalone `install.sh` / `install.ps1` bundle, npm, or npx) and does the right thing for each, on macOS, Linux, and Windows. Use `codegraph upgrade --check` to see whether an update is available without installing, or `codegraph upgrade <version>` to move to a specific version. After upgrading it reminds you to re-index your projects so they pick up the newer engine's improvements. (#679)

+ 45 - 0
__tests__/function-ref.test.ts

@@ -544,6 +544,51 @@ describe('Function-as-value capture (#756)', () => {
     }
   });
 
+  it('KOTLIN: companion-object refs resolve cross-file without imports; decoy companion untouched', async () => {
+    tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-fnref-ktcomp-'));
+    // Same package, no imports — the Java/Kotlin reality the name gate can't
+    // see, which is why qualified `Type::member` candidates skip it.
+    fs.writeFileSync(
+      path.join(tmpDir, 'Handlers.kt'),
+      [
+        'class KtHandlers {',
+        '  companion object {',
+        '    fun handle(x: Int) {}',
+        '  }',
+        '}',
+        'class Decoy {',
+        '  companion object {',
+        '    fun handle(x: Int) {}',
+        '  }',
+        '}',
+      ].join('\n')
+    );
+    fs.writeFileSync(
+      path.join(tmpDir, 'Wirer.kt'),
+      [
+        'fun register(cb: Any) {}',
+        'class Wirer {',
+        '  fun wire() { register(KtHandlers::handle) }',
+        '}',
+      ].join('\n')
+    );
+
+    const cg = CodeGraph.initSync(tmpDir);
+    try {
+      await cg.indexAll();
+      const handles = cg.getNodesByName('handle');
+      const target = handles.find((n) => n.qualifiedName.includes('KtHandlers'))!;
+      const decoy = handles.find((n) => n.qualifiedName.includes('Decoy'))!;
+      const into = cg.getIncomingEdges(target.id).filter((e) => e.metadata?.fnRef === true);
+      expect(into).toHaveLength(1);
+      expect(cg.getNode(into[0]!.source)?.name).toBe('wire');
+      expect(cg.getIncomingEdges(decoy.id).filter((e) => e.metadata?.fnRef === true)).toHaveLength(0);
+    } finally {
+      cg.destroy();
+      tmpDir = undefined;
+    }
+  });
+
   it('SWIFT SCOPING: bare ids hit only the enclosing type’s methods; top-level bare hits functions only', async () => {
     tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-fnref-swiftscope-'));
     fs.writeFileSync(

+ 10 - 4
docs/design/function-ref-capture.md

@@ -192,10 +192,16 @@ Index cost on redis: +6% time, +5% db size.
   cross-file (scope gated on same-file types ∪ imported names, incl. the last
   segment of dotted JVM imports); `this::m` / `super::m` ride the
   class-scoped + supertype path.
-- **Kotlin companion-object members** extract UNQUALIFIED (node `handle`, not
-  `KtHandlers::Companion::handle` — pre-existing extraction shape), so
-  `KtHandlers::handle` refs to companion members stay silent rather than
-  guess. Fix belongs in kotlin companion extraction.
+- **Qualified `Type::member` candidates skip the name gate** (like `this.X`):
+  Java/Kotlin same-package references and Kotlin companions need NO import,
+  so the gate could never see their scope — and the explicit-ref syntax is
+  self-selecting while resolution stays scope-suffix-anchored +
+  unique-or-drop (a `Decoy::handle` can't match a `KtHandlers::handle` ref).
+  This is also what resolves companion-member refs: companions extract
+  TRANSPARENTLY (`KtHandlers::handle`, method of the class) in real
+  multi-line code. (A single-line `class X { companion object { … } }` is an
+  upstream tree-sitter-kotlin misparse — ERROR node — and only ever appeared
+  in our own probe fixture; don't chase it.)
 - **Swift cross-file bare references**: Swift sees module-wide symbols without
   imports, so cross-file bare callbacks only resolve when repo-unique
   (functions; methods are enclosing-type-only). Cross-TYPE `#selector`

+ 1 - 1
src/extraction/extraction-version.ts

@@ -21,4 +21,4 @@
  * turns the re-index hint into noise — keep it honest (see CLAUDE.md, "Honesty
  * in the product is load-bearing").
  */
-export const EXTRACTION_VERSION = 22;
+export const EXTRACTION_VERSION = 23;

+ 8 - 26
src/extraction/tree-sitter.ts

@@ -435,15 +435,8 @@ export class TreeSitterExtractor {
     if (isGeneratedFile(this.filePath)) return;
 
     const definedHere = new Set<string>();
-    const definedTypes = new Set<string>();
     for (const n of this.nodes) {
       if (n.kind === 'function' || n.kind === 'method') definedHere.add(n.name);
-      if (
-        n.kind === 'class' || n.kind === 'struct' || n.kind === 'interface' ||
-        n.kind === 'enum' || n.kind === 'trait' || n.kind === 'protocol'
-      ) {
-        definedTypes.add(n.name);
-      }
     }
 
     // Import-binding names only (all binding emitters push kind 'imports').
@@ -493,31 +486,20 @@ export class TreeSitterExtractor {
       //    strictly class-scoped (own members or the validated supertype
       //    pass), so nothing fuzzy can leak.
       //  - `Scope::member` (C++ member-pointers, Java/Kotlin type-qualified
-      //    method refs): the SCOPE name must be a type defined here or an
-      //    imported name (covers `OtherClass::method` cross-file), or the
-      //    member matches the plain gate (back-compat for C++ same-file).
+      //    method refs, PHP `'Cls::m'`): ALWAYS flush — the explicit-ref
+      //    syntax is self-selecting, the referenced type often needs NO
+      //    import (Java/Kotlin same-package, Kotlin companions), and
+      //    resolution is scope-suffix-anchored + unique-or-drop, so a
+      //    same-named member on another class can't match.
       //  - C-family file-scope initializers skip the gate entirely
       //    (constant-expression context — see FnRefSpec.ungatedModes).
       //  - everything else: name ∈ same-file functions/methods ∪ imports.
-      if (!c.name.startsWith('this.')) {
+      if (!c.name.startsWith('this.') && !c.name.includes('::')) {
         const skipGate =
           (ungated?.has(c.mode) === true && atFileScope) ||
           c.skipGate === true; // PHP HOF-position string callables (see FnRefCandidate.skipGate)
-        if (!skipGate) {
-          if (c.name.includes('::')) {
-            const scopeName = c.name.slice(0, c.name.indexOf('::'));
-            const memberName = c.name.slice(c.name.lastIndexOf('::') + 2);
-            if (
-              !definedTypes.has(scopeName) &&
-              !importedNames.has(scopeName) &&
-              !definedHere.has(memberName) &&
-              !importedNames.has(memberName)
-            ) {
-              continue;
-            }
-          } else if (!definedHere.has(c.name) && !importedNames.has(c.name)) {
-            continue;
-          }
+        if (!skipGate && !definedHere.has(c.name) && !importedNames.has(c.name)) {
+          continue;
         }
       }
       const key = `${c.fromNodeId}|${c.name}`;