
fix(rust): resolve chained associated-function calls Foo::new().bar() (#750) (#757)

A Rust call through a chained associated function (`Foo::new().bar()`,
`Foo::with(cfg).build()`) dropped the receiver and kept only the bare method name,
which then attached to a same-named method on an unrelated type (a wrong edge) or
didn't resolve at all. Ports the #645/#608 mechanism to Rust's `::` receivers:

- Part 1: capture Rust return types; `-> Self` yields the `self` marker (resolved
  to the impl's own type, like PHP), references/generics are unwrapped/reduced.
- Part 2: encode an associated-function chain (`Foo::new().bar`), gated to a
  scoped_identifier receiver so instance chains (`x.foo().bar()`) keep bare-name.
- Part 3: resolve via matchScopedCallChain (PHP's `::` resolver, generalized),
  validated by resolveMethodOnType. Wire Rust into the conformance second pass
  (matchScopedCallChain variant) so a chained method provided by a trait the type
  implements (`impl Trait for Type` → existing implements edges) resolves too.

Validated: synthetic decoy + args + Self + trait-default-conformance + absent
safety tests; full suite green (the lone failure is the known-flaky #662 daemon
test, which passes in isolation). Real-repo A/B vs main: clap (329 .rs files) is a
net precision win:
**+937 added (96% correct builder methods), 622 wrong->right retargets**
(`Command::new().arg()` was mis-resolving to `ArgGroup::arg`, now `Command::arg`),
+162 net unique edges; the pure drops are largely wrong bare-name edges the fix
correctly stops emitting. tokio-rs/bytes 0/0 (no regression). Known limit: the
single-hop mechanism re-encodes only the first hop of a chain (deeper hops keep
bare-name), so clap's unusually deep builder chains are only partly covered.
EXTRACTION_VERSION 10 -> 11.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry, 2 weeks ago
Parent
Commit
5805f01957

+ 1 - 0
CHANGELOG.md

@@ -29,6 +29,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ### Fixes
 
+- Rust method calls made through a chained associated function now resolve to the correct type. A call like `Foo::new().bar()` or `Foo::with(cfg).build()` used to drop the receiver, so the chained method silently attached to a same-named method on an unrelated type — or didn't resolve. CodeGraph now captures Rust return types (`-> Self` resolves to the implementing type), infers the chained receiver's type from what the associated function returns, and resolves the method on it — including methods provided by a trait the type implements (via the new `impl Trait for Type` relationships) — creating the edge only when the type or one of its traits genuinely has the method. Existing Rust indexes should be re-indexed (`codegraph index -f`) to benefit. (#750) (Rust)
 - Chained method calls now resolve when the chained method is **inherited from a superclass or declared on an interface/protocol** the receiver's type conforms to — for example a call on a sealed-subclass instance (`Either.Right(x).combine(...)`) that invokes a method defined on its parent type. Previously these chains found no caller edge even though the factory's type was known, so the call was invisible to callers, impact, and trace. CodeGraph now walks the type's supertypes (its `extends` / `implements` relationships) to find the method, creating the edge only when a supertype genuinely declares it (so a wrong inference still produces no edge). This makes Java, Kotlin, and C# factory and fluent chains more complete. Existing indexes should be re-indexed (`codegraph index -f`) to benefit. (#750)
 - Swift method calls made through a static factory, fluent chain, or constructor now resolve to the correct class. A call like `Foo.make().draw()` or `Foo().draw()` used to drop the receiver, so the chained method silently attached to a same-named method on an unrelated class — or didn't resolve at all. CodeGraph now captures Swift return types and infers the chained receiver's type from what the inner call returns (or the constructed type), creating the edge only when that class genuinely has the method (so a wrong inference produces no edge instead of a misleading one). Existing Swift indexes should be re-indexed (`codegraph index -f`) to benefit. (#750) (Swift)
 - C# method calls made through a static factory or fluent chain now resolve to the correct class. A call like `Foo.Create().Bar()` or `JObject.Parse(s).Property(...)` used to lose the receiver's type, so the chained method didn't resolve and the call was invisible to callers/impact/trace. CodeGraph now captures C# return types and infers the chained receiver's type from what the inner call returns, creating the edge only when that class genuinely has the method (so a wrong inference produces no edge). Existing C# indexes should be re-indexed (`codegraph index -f`) to benefit. (#750) (C#)

+ 83 - 0
__tests__/resolution.test.ts

@@ -2534,4 +2534,87 @@ class Caller {
       expect(callerNamesOf('Other::onlyOther')).toEqual([]);
     });
   });
+
+  describe('Rust chained associated-function call resolution (#645/#608 mechanism)', () => {
+    function callerNamesOf(qualifiedName: string): string[] {
+      const target = cg.getNodesByKind('method').find((n) => n.qualifiedName === qualifiedName);
+      if (!target) return [];
+      const names = cg
+        .getIncomingEdges(target.id)
+        .filter((e) => e.kind === 'calls')
+        .map((e) => cg.getNode(e.source)?.name)
+        .filter((n): n is string => !!n);
+      return [...new Set(names)].sort();
+    }
+
+    it('resolves Foo::new().bar() (and a Self return) via the associated fn, never a same-named decoy', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'main.rs'),
+        `struct Aaa { _x: i32 }
+impl Aaa { fn bar(&self) {} }
+struct Foo { _x: i32 }
+impl Foo {
+    fn new() -> Foo { Foo { _x: 0 } }
+    fn make() -> Self { Foo { _x: 0 } }
+    fn bar(&self) {}
+}
+fn caller() {
+    Foo::new().bar();
+    Foo::make().bar();
+}
+`
+      );
+      cg = await CodeGraph.init(tempDir, { index: true });
+      expect(callerNamesOf('Foo::bar')).toEqual(['caller']);
+      expect(callerNamesOf('Aaa::bar')).toEqual([]);
+    });
+
+    it('resolves a chain that passes arguments — Foo::with(c).build()', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'main.rs'),
+        `struct Config;
+struct Foo { _x: i32 }
+impl Foo {
+    fn with(c: Config) -> Foo { Foo { _x: 0 } }
+    fn build(&self) {}
+}
+fn caller() { Foo::with(Config).build(); }
+`
+      );
+      cg = await CodeGraph.init(tempDir, { index: true });
+      expect(callerNamesOf('Foo::build')).toEqual(['caller']);
+    });
+
+    it('resolves a chained method from a trait the type implements (default method, via conformance)', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'main.rs'),
+        `struct Foo { _x: i32 }
+impl Foo { fn new() -> Foo { Foo { _x: 0 } } }
+struct Decoy { _x: i32 }
+impl Decoy { fn draw(&self) {} }
+trait Drawable { fn draw(&self) {} }
+impl Drawable for Foo {}
+fn caller() { Foo::new().draw(); }
+`
+      );
+      cg = await CodeGraph.init(tempDir, { index: true });
+      expect(callerNamesOf('Drawable::draw')).toEqual(['caller']);
+      expect(callerNamesOf('Decoy::draw')).toEqual([]);
+    });
+
+    it('creates NO edge when neither the type nor a supertype has the method (silent miss)', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'main.rs'),
+        `struct Foo { _x: i32 }
+impl Foo { fn new() -> Foo { Foo { _x: 0 } } }
+struct Other { _x: i32 }
+impl Other { fn only_other(&self) {} }
+fn caller() { Foo::new().only_other(); }
+`
+      );
+      cg = await CodeGraph.init(tempDir, { index: true });
+      // Foo has no only_other() — must not mis-attach to the same-named Other::only_other.
+      expect(callerNamesOf('Other::only_other')).toEqual([]);
+    });
+  });
 });

+ 1 - 1
src/extraction/extraction-version.ts

@@ -21,4 +21,4 @@
  * turns the re-index hint into noise — keep it honest (see CLAUDE.md, "Honesty
  * in the product is load-bearing").
  */
-export const EXTRACTION_VERSION = 10;
+export const EXTRACTION_VERSION = 11;

+ 31 - 0
src/extraction/languages/rust.ts

@@ -2,6 +2,36 @@ import type { Node as SyntaxNode } from 'web-tree-sitter';
 import { getNodeText, getChildByField } from '../tree-sitter-helpers';
 import type { LanguageExtractor } from '../tree-sitter-types';
 
+/**
+ * A Rust function's declared return type, normalized to the bare type a chained
+ * `Foo::new().bar()` could be called on (the #645/#608 mechanism). Reads the
+ * `return_type` field: `-> Self` yields the marker `self` (resolved to the impl's
+ * own type at resolution time, like PHP's `self`/`static`); a concrete `-> Foo` /
+ * `-> FooBuilder` its name; a reference (`&Foo`) is unwrapped; generics are reduced
+ * to the base type (`Vec<Foo>` → `Vec`); primitives / unit / tuple yield undefined.
+ * Stdlib types that aren't in the graph simply fail the later existence check.
+ */
+function extractRustReturnType(node: SyntaxNode, source: string): string | undefined {
+  let rt = getChildByField(node, 'return_type');
+  if (!rt) return undefined;
+  if (rt.type === 'reference_type') {
+    rt =
+      rt.namedChildren.find(
+        (c: SyntaxNode) =>
+          c.type === 'type_identifier' ||
+          c.type === 'scoped_type_identifier' ||
+          c.type === 'generic_type',
+      ) ?? rt;
+  }
+  if (!rt || rt.type === 'primitive_type' || rt.type === 'unit_type' || rt.type === 'tuple_type') {
+    return undefined;
+  }
+  const text = getNodeText(rt, source).trim().replace(/<[^>]*>/g, '');
+  const last = text.split('::').pop()?.trim();
+  if (!last || !/^[A-Za-z_]\w*$/.test(last)) return undefined;
+  return last === 'Self' ? 'self' : last;
+}
+
 export const rustExtractor: LanguageExtractor = {
   // `function_signature_item` is a trait method DECLARATION (`fn render(&self);`,
   // no body). Extracting it makes a trait's method set first-class, which
@@ -23,6 +53,7 @@ export const rustExtractor: LanguageExtractor = {
   bodyField: 'body',
   paramsField: 'parameters',
   returnField: 'return_type',
+  getReturnType: extractRustReturnType,
   getSignature: (node, source) => {
     const params = getChildByField(node, 'parameters');
     const returnType = getChildByField(node, 'return_type');
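
The normalization rules described in the `extractRustReturnType` doc comment can be tried on plain strings. The sketch below is a hypothetical string-level stand-in (`normalizeReturnTypeText` is not part of this commit). It assumes the return type arrives as text rather than as a tree-sitter node, so the node-type checks that reject primitives are not modeled; only the reference unwrapping, generic stripping, path reduction, and `Self` handling are.

```typescript
// Hypothetical stand-in for the string handling in extractRustReturnType.
// Assumption: input is the return type's source text, not a syntax node.
function normalizeReturnTypeText(raw: string): string | undefined {
  let text = raw.trim();
  // Unwrap a reference: `&Foo`, `&mut Foo`, `&'a Foo`, `&'a mut Foo`.
  text = text.replace(/^&\s*('\w+\s+)?(mut\s+)?/, '');
  // Strip generic arguments with the same regex the diff uses.
  text = text.replace(/<[^>]*>/g, '');
  // Keep only the last path segment: `crate::builder::FooBuilder` -> `FooBuilder`.
  const last = text.split('::').pop()?.trim();
  // Anything that is not a plain identifier (unit, tuples, leftovers) is rejected.
  if (!last || !/^[A-Za-z_]\w*$/.test(last)) return undefined;
  // `Self` becomes the lowercase marker resolved later to the impl's own type.
  return last === 'Self' ? 'self' : last;
}

console.log(normalizeReturnTypeText('Self'));                       // self
console.log(normalizeReturnTypeText("&'a mut Foo"));                // Foo
console.log(normalizeReturnTypeText('Vec<Foo>'));                   // Vec
console.log(normalizeReturnTypeText('crate::builder::FooBuilder')); // FooBuilder
console.log(normalizeReturnTypeText('Vec<Option<Foo>>'));           // undefined
```

The last case is worth noting: `[^>]*` stops at the first `>`, so a nested generic leaves a stray `>` behind and fails the identifier check. With this regex a nested generic return type therefore yields no type (and so no edge) rather than a wrong one.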

+ 16 - 9
src/extraction/tree-sitter.ts

@@ -2528,18 +2528,19 @@ export class TreeSitterExtractor {
               (this.language === 'cpp' ||
                 this.language === 'c' ||
                 this.language === 'kotlin' ||
-                this.language === 'swift') &&
+                this.language === 'swift' ||
+                this.language === 'rust') &&
               receiver &&
               receiver.type === 'call_expression'
             ) {
               // Receiver that is itself a call — `Foo::instance().bar()`,
-              // `openSession()->run()`, `mgr.view().render()` (C/C++), or
-              // `Foo.getInstance().bar()` (Kotlin) / `Foo.make().draw()` (Swift).
-              // Keep the inner call so resolution can infer bar()'s class from what
-              // the inner call RETURNS (#645/#608). Encode as `<innerCallee>().<method>`;
-              // the `().` marker never appears in an ordinary ref, so the resolver
-              // can detect and split it. Other languages keep the bare-name
-              // behavior (dropping the receiver) below.
+              // `openSession()->run()`, `mgr.view().render()` (C/C++),
+              // `Foo.getInstance().bar()` (Kotlin) / `Foo.make().draw()` (Swift), or
+              // `Foo::new().bar()` (Rust). Keep the inner call so resolution can
+              // infer bar()'s class from what the inner call RETURNS (#645/#608).
+              // Encode as `<innerCallee>().<method>`; the `().` marker never appears
+              // in an ordinary ref, so the resolver can detect and split it. Other
+              // languages keep the bare-name behavior (dropping the receiver) below.
               let innerCallee: string;
               let reencode: boolean;
               if (this.language === 'kotlin' || this.language === 'swift') {
@@ -2561,7 +2562,13 @@ export class TreeSitterExtractor {
                 innerCallee = innerFn
                   ? getNodeText(innerFn, this.source).replace(/->/g, '.').replace(/\s+/g, '')
                   : '';
-                reencode = !!innerCallee;
+                // Rust: only re-encode an associated-function chain
+                // (`Foo::new().bar()`), whose inner callee is a path/`scoped_identifier`.
+                // An instance chain (`x.foo().bar()`, inner callee a field_expression)
+                // keeps bare-name — the `::` resolver can't recover a variable's type,
+                // so re-encoding would only drop the edge. C/C++ re-encode any inner.
+                reencode =
+                  this.language === 'rust' ? innerFn?.type === 'scoped_identifier' : !!innerCallee;
               }
               calleeName = reencode ? `${innerCallee}().${methodName}` : methodName;
             } else {
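
The Rust gate above, and the single-hop limit mentioned in the commit message, can be illustrated with a small model. This is a sketch only: `Chain` and `encodeRustChain` are hypothetical names, and the chain is represented as a root callee plus a list of method names instead of a real syntax tree. The one rule taken from the diff is that a call is re-encoded only when the inner callee of its receiver is a `scoped_identifier`.

```typescript
// Kinds of inner callee the model distinguishes (tree-sitter node type names).
type InnerCalleeKind = 'scoped_identifier' | 'field_expression' | 'identifier';

// Hypothetical flattened view of `root(...).m1(...).m2(...)`.
interface Chain {
  rootCallee: string;        // e.g. 'Command::new' or 'x.foo'
  rootKind: InnerCalleeKind; // node type of the root callee
  methods: string[];         // chained method names, innermost first
}

// Returns the reference name emitted for each chained method call.
function encodeRustChain(chain: Chain): string[] {
  return chain.methods.map((method, i) => {
    // Only the first method has the root call as its receiver; every later
    // method's receiver is a method call, whose callee is a field_expression.
    const innerKind: InnerCalleeKind = i === 0 ? chain.rootKind : 'field_expression';
    const reencode = innerKind === 'scoped_identifier';
    return reencode ? `${chain.rootCallee}().${method}` : method;
  });
}

// Associated-function chain: first hop is re-encoded, deeper hops stay bare.
console.log(encodeRustChain({
  rootCallee: 'Command::new',
  rootKind: 'scoped_identifier',
  methods: ['arg', 'about'],
})); // [ 'Command::new().arg', 'about' ]

// Instance chain: nothing is re-encoded.
console.log(encodeRustChain({
  rootCallee: 'x.foo',
  rootKind: 'field_expression',
  methods: ['bar'],
})); // [ 'bar' ]
```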

+ 15 - 5
src/resolution/index.ts

@@ -16,7 +16,7 @@ import {
   FrameworkResolver,
   ImportMapping,
 } from './types';
-import { matchReference, matchDottedCallChain, sameLanguageFamily, crossesKnownFamily } from './name-matcher';
+import { matchReference, matchDottedCallChain, matchScopedCallChain, sameLanguageFamily, crossesKnownFamily } from './name-matcher';
 import { resolveViaImport, resolveJvmImport, extractImportMappings, extractReExports, loadCppIncludeDirs, isPhpIncludePathRef } from './import-resolver';
 import { detectFrameworks } from './frameworks';
 import { synthesizeCallbackEdges } from './callback-synthesizer';
@@ -32,8 +32,13 @@ const SUPERTYPE_BEARING_KINDS = new Set<Node['kind']>([
   'class', 'struct', 'interface', 'trait', 'protocol', 'enum',
 ]);
 
-/** Languages whose chained calls use the dotted `inner().method` encoding. */
-const DOT_CHAIN_LANGUAGES = new Set(['java', 'kotlin', 'csharp', 'swift']);
+/**
+ * Languages whose chained static-factory/fluent calls defer to the conformance
+ * second pass. Dotted-receiver languages resolve via matchDottedCallChain; the
+ * `::`-receiver ones (Rust) via matchScopedCallChain.
+ */
+const CHAIN_LANGUAGES = new Set(['java', 'kotlin', 'csharp', 'swift', 'rust']);
+const SCOPED_CHAIN_LANGUAGES = new Set(['rust']);
 
 /** The extractor's chained-receiver encoding: `<inner>().<method>`. */
 const CHAIN_SHAPE = /^(.+)\(\)\.(\w+)$/;
@@ -726,7 +731,7 @@ export class ReferenceResolver {
       // resolvable once implements/extends edges exist (the conformance pass).
       if (
         ref.referenceKind === 'calls' &&
-        DOT_CHAIN_LANGUAGES.has(ref.language) &&
+        CHAIN_LANGUAGES.has(ref.language) &&
         CHAIN_SHAPE.test(ref.referenceName)
       ) {
         this.deferredChainRefs.push(ref);
@@ -839,7 +844,12 @@ export class ReferenceResolver {
     this.clearCaches();
     const resolved: ResolvedRef[] = [];
     for (const ref of deferred) {
-      const match = this.gateLanguage(matchDottedCallChain(ref, this.context), ref);
+      // `::`-receiver languages (Rust) split on `::` (matchScopedCallChain);
+      // dotted-receiver languages on `.` (matchDottedCallChain).
+      const chainMatch = SCOPED_CHAIN_LANGUAGES.has(ref.language)
+        ? matchScopedCallChain(ref, this.context)
+        : matchDottedCallChain(ref, this.context);
+      const match = this.gateLanguage(chainMatch, ref);
       if (match) resolved.push(match);
     }
     if (resolved.length === 0) return 0;
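
To make the `<inner>().<method>` encoding concrete, here is a minimal sketch of how a deferred reference name could be taken apart. `CHAIN_SHAPE` is copied from the diff; `ScopedChain` and `splitScopedChain` are hypothetical, and splitting the inner callee on its last `::` is an assumption, since the body of `matchScopedCallChain` is not shown in this commit.

```typescript
// Copied from src/resolution/index.ts in the diff above.
const CHAIN_SHAPE = /^(.+)\(\)\.(\w+)$/;

// Hypothetical result shape for a `Type::factory().method` reference.
interface ScopedChain {
  typeName: string;
  factory: string;
  method: string;
}

// Assumption: the inner callee is split on its LAST `::`.
function splitScopedChain(referenceName: string): ScopedChain | null {
  const m = CHAIN_SHAPE.exec(referenceName);
  if (!m) return null; // not a chained-receiver encoding at all
  const inner = m[1] ?? '';
  const method = m[2] ?? '';
  const sep = inner.lastIndexOf('::');
  if (sep <= 0) return null; // no `::` receiver, e.g. a dotted chain
  return {
    typeName: inner.slice(0, sep),
    factory: inner.slice(sep + 2),
    method,
  };
}

console.log(splitScopedChain('Foo::new().bar'));
// { typeName: 'Foo', factory: 'new', method: 'bar' }
console.log(splitScopedChain('bar'));         // null (bare name)
console.log(splitScopedChain('x.foo().bar')); // null (no `::` in the inner callee)
```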

+ 16 - 13
src/resolution/name-matcher.ts

@@ -570,15 +570,17 @@ export function matchCppCallChain(
 }
 
 /**
- * Resolve a PHP fluent static-factory chain whose receiver is a static call —
- * `Cls::for($x)->method()`, encoded by the extractor as `Cls::for().method`
- * (#608, the per-credential Laravel client idiom). The receiver's type is what
- * `Cls::for` returns: a `: self` / `: static` resolves to `Cls` itself, a
- * concrete `: Type` to that type. The outer method is then resolved and
- * VALIDATED on it (resolveMethodOnType requires the method to exist), so a
- * wrong inference yields no edge rather than a wrong one.
+ * Resolve a `::`-scoped factory chain whose receiver is a scoped/static call —
+ * PHP `Cls::for($x)->method()` (#608, the per-credential Laravel client idiom) or
+ * Rust `Foo::new().bar()` (an associated-function call) — both encoded by the
+ * extractor as `Cls::factory().method`. The receiver's type is what `Cls::factory`
+ * returns: a `self` marker (PHP `: self`/`: static`, Rust `-> Self`) resolves to
+ * the factory's own type, a concrete return type to that type. The outer method is
+ * then resolved and VALIDATED on it (resolveMethodOnType requires the method to
+ * exist on the type or a supertype it conforms to), so a wrong inference yields no
+ * edge rather than a wrong one. Shared by the `::`-receiver languages (PHP, Rust).
  */
-export function matchPhpCallChain(
+export function matchScopedCallChain(
   ref: UnresolvedRef,
   context: ResolutionContext,
 ): ResolvedRef | null {
@@ -1080,11 +1082,12 @@ export function matchReference(
     if (result) return result;
   }
 
-  // 1c. PHP fluent static-factory chain — `Cls::for($x)->method()` encoded as
-  // `Cls::for().method` (#608). Same idea as 1b: the receiver's type is the
-  // factory's `: self` / `: Type` return.
-  if (ref.language === 'php') {
-    result = matchPhpCallChain(ref, context);
+  // 1c. `::`-scoped factory chain — PHP `Cls::for($x)->method()` (#608) or Rust
+  // `Foo::new().bar()`, both encoded as `Cls::factory().method`. The receiver's
+  // type is the factory's `self` (PHP `: self`/`: static`, Rust `-> Self`) or
+  // concrete return type.
+  if (ref.language === 'php' || ref.language === 'rust') {
+    result = matchScopedCallChain(ref, context);
     if (result) return result;
   }
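
The resolve-then-validate behavior described in the `matchScopedCallChain` doc comment can be sketched on a toy in-memory graph. Everything here is an assumption made for illustration: `TypeInfo`, `Graph`, `findMethodOnType` (a stand-in for the idea behind `resolveMethodOnType`), and `resolveScopedChain` are hypothetical and do not reflect CodeGraph's real data structures. The graph mirrors the fixtures in the tests above.

```typescript
// Toy model only; not CodeGraph's real node or edge types.
interface TypeInfo {
  // Method name -> normalized return type ('self', a type name, or undefined).
  methods: Map<string, string | undefined>;
  // Traits this type implements.
  traits: string[];
}
type Graph = Map<string, TypeInfo>;

// Find `method` on the type itself, then on the traits it implements.
function findMethodOnType(
  graph: Graph, typeName: string, method: string, seen = new Set<string>(),
): string | null {
  if (seen.has(typeName)) return null; // guard against cyclic trait links
  seen.add(typeName);
  const info = graph.get(typeName);
  if (!info) return null; // type not in the graph (e.g. a stdlib type)
  if (info.methods.has(method)) return `${typeName}::${method}`;
  for (const trait of info.traits) {
    const hit = findMethodOnType(graph, trait, method, seen);
    if (hit) return hit;
  }
  return null;
}

// Infer the receiver type from the factory's return type, then validate.
function resolveScopedChain(
  graph: Graph, typeName: string, factory: string, method: string,
): string | null {
  const ret = graph.get(typeName)?.methods.get(factory);
  if (!ret) return null; // unknown factory or no usable return type
  const receiver = ret === 'self' ? typeName : ret;
  return findMethodOnType(graph, receiver, method);
}

const t = (methods: [string, string | undefined][], traits: string[] = []): TypeInfo =>
  ({ methods: new Map(methods), traits });

const graph: Graph = new Map<string, TypeInfo>([
  ['Foo', t([['new', 'Foo'], ['make', 'self'], ['bar', undefined]], ['Drawable'])],
  ['Aaa', t([['bar', undefined]])],
  ['Drawable', t([['draw', undefined]])],
  ['Decoy', t([['draw', undefined]])],
  ['Other', t([['only_other', undefined]])],
]);

console.log(resolveScopedChain(graph, 'Foo', 'new', 'bar'));        // Foo::bar
console.log(resolveScopedChain(graph, 'Foo', 'make', 'bar'));       // Foo::bar
console.log(resolveScopedChain(graph, 'Foo', 'new', 'draw'));       // Drawable::draw
console.log(resolveScopedChain(graph, 'Foo', 'new', 'only_other')); // null
```

The last call returns `null` because neither `Foo` nor `Drawable` declares `only_other`, which corresponds to the "no edge rather than a wrong one" case that the final test asserts.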