
fix(swift): resolve chained static-factory/fluent calls + nested-extension naming (#750) (#755)

Completes Swift in the #750 chained-call series (after Java #751, Kotlin #752,
C# #753, conformance #754). Two parts:

1. Swift chained-call resolution (the #645/#608 mechanism): capture Swift return
   types (positional, member types -> last segment), encode capitalized-receiver
   chains `Foo.make().draw()` / `Foo(args).draw()`, resolve+validate via the
   shared matchDottedCallChain (+ constructor branch). Fixes the decoy wrong-edge
   bug where a chained method dropped to a bare name and attached to a same-named
   method on an unrelated class.

2. Nested-type extension naming fix: `extension KF.Builder: KFOptionSetter` parsed
   as a class_declaration named `KF.Builder` (dot) — inconsistent with the type's
   own declaration `KF::Builder` (name `Builder`) — so the extension's conformances
   and members were invisible to a chained call on the type. A Swift resolveName
   now names a nested-type extension by its last segment (`Builder`), so its
   `implements`/`extends` edges and methods are found by the supertype walk
   (conformance #754) and the simple-name method match.
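
Both parts rely on the same "last dotted segment" normalization. A minimal sketch of that step on plain type text, assuming a hypothetical helper name `lastTypeSegment` (the real code in `swift.ts` works on syntax nodes and also unwraps optionals, which is not modeled here):

```typescript
// Hypothetical helper, for illustration only: reduce a Swift type's source
// text to the simple name that method matching keys off.
function lastTypeSegment(typeText: string): string | undefined {
  // Drop generic arguments: `Result<Foo>` -> `Result`.
  const stripped = typeText.trim().replace(/<[^>]*>/g, '');
  // Keep the last dotted segment: `KF.Builder` -> `Builder`.
  const last = stripped.split('.').pop()?.trim();
  // Reject anything that is not a plain identifier, and reject `Void`.
  if (!last || !/^[A-Za-z_]\w*$/.test(last) || last === 'Void') return undefined;
  return last;
}

console.log(lastTypeSegment('KF.Builder'));  // Builder
console.log(lastTypeSegment('Result<Foo>')); // Result
console.log(lastTypeSegment('[Foo]'));       // undefined
console.log(lastTypeSegment('Void'));        // undefined
```

Taking the last segment rather than the first is what keeps `KF.Builder` from being mistaken for the outer `KF`.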

Validated: synthetic decoy + args + constructor + absent-method tests; full suite
green; nested-extension repro (`KF.url().onSuccess()` resolves via conformance to
the protocol method). Real-repo A/B vs main (conformance) — Alamofire and
Kingfisher both **0 added / 0 removed, node count unchanged**: NEUTRAL and SAFE.
The prior -168 Kingfisher regression (from the naming inconsistency) is eliminated;
Swift's unique-named fluent methods already resolved by bare name, so the chain
path lands the same edges — the value here is decoy-collision correctness, the
nested-extension naming fix, and consistency with the other four languages.
EXTRACTION_VERSION 9 -> 10.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry 2 weeks ago
parent
commit
7c7f0dd56f

+ 1 - 0
CHANGELOG.md

@@ -30,6 +30,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ### Fixes
 
 - Chained method calls now resolve when the chained method is **inherited from a superclass or declared on an interface/protocol** the receiver's type conforms to — for example a call on a sealed-subclass instance (`Either.Right(x).combine(...)`) that invokes a method defined on its parent type. Previously these chains found no caller edge even though the factory's type was known, so the call was invisible to callers, impact, and trace. CodeGraph now walks the type's supertypes (its `extends` / `implements` relationships) to find the method, creating the edge only when a supertype genuinely declares it (so a wrong inference still produces no edge). This makes Java, Kotlin, and C# factory and fluent chains more complete. Existing indexes should be re-indexed (`codegraph index -f`) to benefit. (#750)
+- Swift method calls made through a static factory, fluent chain, or constructor now resolve to the correct class. A call like `Foo.make().draw()` or `Foo().draw()` used to drop the receiver, so the chained method silently attached to a same-named method on an unrelated class — or didn't resolve at all. CodeGraph now captures Swift return types and infers the chained receiver's type from what the inner call returns (or the constructed type), creating the edge only when that class genuinely has the method (so a wrong inference produces no edge instead of a misleading one). Existing Swift indexes should be re-indexed (`codegraph index -f`) to benefit. (#750) (Swift)
 - C# method calls made through a static factory or fluent chain now resolve to the correct class. A call like `Foo.Create().Bar()` or `JObject.Parse(s).Property(...)` used to lose the receiver's type, so the chained method didn't resolve and the call was invisible to callers/impact/trace. CodeGraph now captures C# return types and infers the chained receiver's type from what the inner call returns, creating the edge only when that class genuinely has the method (so a wrong inference produces no edge). Existing C# indexes should be re-indexed (`codegraph index -f`) to benefit. (#750) (C#)
 - Kotlin method calls made through a companion-object factory or fluent chain now resolve to the correct class. A call like `Foo.getInstance().bar()` or `Config.create(opts).build()` used to drop the receiver entirely, so the chained method silently attached to a same-named method on an unrelated class — or didn't resolve at all — corrupting callers, impact, and trace. CodeGraph now captures Kotlin return types and infers the chained receiver's type from what the inner call returns, creating the edge only when that class genuinely has the method (so a wrong inference produces no edge instead of a misleading one). Existing Kotlin indexes should be re-indexed (`codegraph index -f`) to benefit. (#750) (Kotlin)
 - Java method calls made through a static factory or fluent chain now resolve to the correct class. A call like `Foo.getInstance().bar()` or `Config.create(opts).build()` used to lose the receiver's type, so when two classes had a same-named method the call silently attached to whichever was indexed first — or didn't resolve at all — corrupting callers, impact, and trace. CodeGraph now captures Java return types and infers the chained receiver's type from what the inner call returns, creating the edge only when that class genuinely has the method (so a wrong inference produces no edge instead of a misleading one). Covers factories and fluent builders that take arguments (`hashKeys().arrayListValues()`), including builders that return a nested type. Existing Java indexes should be re-indexed (`codegraph index -f`) to benefit. (#750) (Java)

+ 66 - 0
__tests__/resolution.test.ts

@@ -2403,6 +2403,72 @@ class Caller {
     });
   });
 
+  describe('Swift chained static-factory call resolution (#645/#608 mechanism)', () => {
+    function callerNamesOf(qualifiedName: string): string[] {
+      const target = cg.getNodesByKind('method').find((n) => n.qualifiedName === qualifiedName);
+      if (!target) return [];
+      const names = cg
+        .getIncomingEdges(target.id)
+        .filter((e) => e.kind === 'calls')
+        .map((e) => cg.getNode(e.source)?.name)
+        .filter((n): n is string => !!n);
+      return [...new Set(names)].sort();
+    }
+
+    it('resolves Foo.make().draw() via the factory return type, never a same-named decoy', async () => {
+      // Aaa sorts first and has a same-named draw() — without the fix Swift dropped
+      // the receiver to a bare `draw` and attached to Aaa (a wrong edge).
+      fs.writeFileSync(
+        path.join(tempDir, 'Main.swift'),
+        `class Aaa { func draw() {} }
+class Foo {
+    static func make() -> Foo { return Foo() }
+    func draw() {}
+}
+func runCaller() { Foo.make().draw() }
+`
+      );
+      cg = await CodeGraph.init(tempDir, { index: true });
+      expect(callerNamesOf('Foo::draw')).toEqual(['runCaller']);
+      expect(callerNamesOf('Aaa::draw')).toEqual([]);
+    });
+
+    it('resolves a constructor chain Foo().draw() and an args factory chain Foo.build(c).render()', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'Main.swift'),
+        `class Config {}
+class Foo {
+    static func build(_ c: Config) -> Foo { return Foo() }
+    func draw() {}
+    func render() {}
+}
+func runCaller() {
+    Foo().draw()
+    Foo.build(Config()).render()
+}
+`
+      );
+      cg = await CodeGraph.init(tempDir, { index: true });
+      expect(callerNamesOf('Foo::draw')).toEqual(['runCaller']);
+      expect(callerNamesOf('Foo::render')).toEqual(['runCaller']);
+    });
+
+    it('creates NO edge when the factory return type lacks the method (silent miss, not a wrong edge)', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'Main.swift'),
+        `class Foo {
+    static func make() -> Foo { return Foo() }
+}
+class Other { func onlyOther() {} }
+func runCaller() { Foo.make().onlyOther() }
+`
+      );
+      cg = await CodeGraph.init(tempDir, { index: true });
+      // Foo has no onlyOther() — must not mis-attach to the same-named Other::onlyOther.
+      expect(callerNamesOf('Other::onlyOther')).toEqual([]);
+    });
+  });
+
   describe('Chained call resolves a method on a supertype (conformance, #750)', () => {
     function callerNamesOf(qualifiedName: string): string[] {
       const target = cg.getNodesByKind('method').find((n) => n.qualifiedName === qualifiedName);

+ 1 - 1
src/extraction/extraction-version.ts

@@ -21,4 +21,4 @@
  * turns the re-index hint into noise — keep it honest (see CLAUDE.md, "Honesty
  * in the product is load-bearing").
  */
-export const EXTRACTION_VERSION = 9;
+export const EXTRACTION_VERSION = 10;

+ 55 - 0
src/extraction/languages/swift.ts

@@ -2,6 +2,44 @@ import type { Node as SyntaxNode } from 'web-tree-sitter';
 import { getNodeText, getChildByField } from '../tree-sitter-helpers';
 import type { LanguageExtractor } from '../tree-sitter-types';
 
+/**
+ * A Swift function's declared return type, normalized to the bare class name a
+ * chained `Foo.make().draw()` could be called on (the #645/#608 mechanism).
+ * tree-sitter-swift labels BOTH the function name (`simple_identifier`) and the
+ * return type (a `user_type`) with the field `name`, so `childForFieldName`
+ * returns the name; the return type is found positionally — the first type node
+ * after the `simple_identifier` name, before the body. Optionals (`Foo?`) are
+ * unwrapped; arrays/tuples/function types and `Void` yield undefined.
+ */
+function extractSwiftReturnType(node: SyntaxNode, source: string): string | undefined {
+  let seenName = false;
+  for (let i = 0; i < node.namedChildCount; i++) {
+    const child = node.namedChild(i);
+    if (!child) continue;
+    if (child.type === 'simple_identifier' && !seenName) {
+      seenName = true;
+      continue;
+    }
+    if (!seenName) continue;
+    if (child.type === 'function_body') return undefined; // body reached: no return type
+    let typeNode: SyntaxNode | null = null;
+    if (child.type === 'user_type') typeNode = child;
+    else if (child.type === 'optional_type') {
+      typeNode = child.namedChildren.find((c: SyntaxNode) => c.type === 'user_type') ?? null;
+    }
+    if (typeNode) {
+      // Use the whole type node's text, strip generics, then take the LAST
+      // dotted segment — a member type `KF.Builder` resolves to `Builder` (its
+      // first type_identifier is the OUTER `KF`, which would be wrong).
+      const name = getNodeText(typeNode, source).trim().replace(/<[^>]*>/g, '');
+      const last = name.split('.').pop()?.trim();
+      if (!last || !/^[A-Za-z_]\w*$/.test(last) || last === 'Void') return undefined;
+      return last;
+    }
+  }
+  return undefined;
+}
+
 export const swiftExtractor: LanguageExtractor = {
   functionTypes: ['function_declaration'],
   classTypes: ['class_declaration'],
@@ -18,6 +56,23 @@ export const swiftExtractor: LanguageExtractor = {
   bodyField: 'body',
   paramsField: 'parameter',
   returnField: 'return_type',
+  getReturnType: extractSwiftReturnType,
+  resolveName: (node, source) => {
+    // A nested-type extension `extension KF.Builder { … }` parses as a
+    // class_declaration whose `name` is a multi-segment `user_type` (`KF.Builder`
+    // = type_identifiers `KF`, `Builder`). Name the node by the LAST segment
+    // (`Builder`) so it shares the simple name of the extended type's own
+    // declaration (`struct Builder` → `KF::Builder`) instead of becoming a
+    // distinct `KF.Builder` node. Without this, the extension's conformances and
+    // members are invisible to a chained call on the type — supertype lookup and
+    // method matching both key off the simple name (#750). Simple names (regular
+    // class/struct/enum, or `extension Plain`) fall through to default extraction.
+    if (node.type !== 'class_declaration') return undefined;
+    const nameNode = getChildByField(node, 'name');
+    if (!nameNode || nameNode.type !== 'user_type') return undefined;
+    const ids = nameNode.namedChildren.filter((c: SyntaxNode) => c.type === 'type_identifier');
+    return ids.length > 1 ? getNodeText(ids[ids.length - 1]!, source) : undefined;
+  },
   getSignature: (node, source) => {
     // Swift function signature: func name(params) -> ReturnType
     const params = getChildByField(node, 'parameter');

+ 18 - 14
src/extraction/tree-sitter.ts

@@ -2525,32 +2525,36 @@ export class TreeSitterExtractor {
                 calleeName = methodName;
               }
             } else if (
-              (this.language === 'cpp' || this.language === 'c' || this.language === 'kotlin') &&
+              (this.language === 'cpp' ||
+                this.language === 'c' ||
+                this.language === 'kotlin' ||
+                this.language === 'swift') &&
               receiver &&
               receiver.type === 'call_expression'
             ) {
               // Receiver that is itself a call — `Foo::instance().bar()`,
               // `openSession()->run()`, `mgr.view().render()` (C/C++), or
-              // `Foo.getInstance().bar()` (Kotlin). Keep the inner call so
-              // resolution can infer bar()'s class from what the inner call
-              // RETURNS (#645/#608). Encode as `<innerCallee>().<method>`; the
-              // `().` marker never appears in an ordinary ref, so the resolver
+              // `Foo.getInstance().bar()` (Kotlin) / `Foo.make().draw()` (Swift).
+              // Keep the inner call so resolution can infer bar()'s class from what
+              // the inner call RETURNS (#645/#608). Encode as `<innerCallee>().<method>`;
+              // the `().` marker never appears in an ordinary ref, so the resolver
               // can detect and split it. Other languages keep the bare-name
               // behavior (dropping the receiver) below.
               let innerCallee: string;
               let reencode: boolean;
-              if (this.language === 'kotlin') {
-                // tree-sitter-kotlin has no field names — the inner callee is the
+              if (this.language === 'kotlin' || this.language === 'swift') {
+                // tree-sitter-kotlin/swift expose the inner callee as the
                 // call_expression's first named child (a navigation_expression
-                // `Foo.getInstance`, or a bare identifier for a free call).
+                // `Foo.getInstance`, or a bare identifier for a free/constructor call).
                 const innerNav = receiver.namedChild(0);
                 innerCallee = innerNav ? getNodeText(innerNav, this.source).replace(/\s+/g, '') : '';
-                // Only re-encode a CLASS / companion-factory chain, whose receiver
-                // chain starts with a capitalized type (`Foo.getInstance().bar()`).
-                // An instance chain (`list.filter{}.map{}`) has a lowercase receiver
-                // whose type we can't recover here — re-encoding it would only drop
-                // the edge (no chain resolution, no bare-name fallback), regressing
-                // recall in fluent codebases. Leave those to the bare-name path.
+                // Only re-encode a CLASS / companion-factory / constructor chain,
+                // whose receiver chain starts with a capitalized type
+                // (`Foo.getInstance().bar()`, `Foo().bar()`). An instance chain
+                // (`list.filter{}.map{}`) has a lowercase receiver whose type we
+                // can't recover here — re-encoding it would only drop the edge (no
+                // chain resolution, no bare-name fallback), regressing recall in
+                // fluent codebases. Leave those to the bare-name path.
                 reencode = /^[A-Z]/.test(innerCallee);
               } else {
                 const innerFn = getChildByField(receiver, 'function');
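
The Kotlin/Swift branch above boils down to a small decision on the inner callee's text. A rough sketch of that decision in isolation, assuming a hypothetical function name `encodeChainedRef` and string inputs in place of syntax nodes:

```typescript
// Hypothetical helper, for illustration only: pick the reference name emitted
// for `<innerCallee>(...).<method>()` in Kotlin/Swift.
function encodeChainedRef(innerCalleeText: string, method: string): string {
  // Collapse whitespace so a chain split across lines still encodes cleanly.
  const innerCallee = innerCalleeText.replace(/\s+/g, '');
  // Only a capitalized receiver (class, factory, or constructor chain) is
  // re-encoded; a lowercase instance chain keeps the bare method name.
  const reencode = /^[A-Z]/.test(innerCallee);
  return reencode ? `${innerCallee}().${method}` : method;
}

console.log(encodeChainedRef('Foo.make', 'draw'));   // Foo.make().draw
console.log(encodeChainedRef('Foo', 'draw'));        // Foo().draw
console.log(encodeChainedRef('list.filter', 'map')); // map
```

The lowercase case falls back to the bare name because, per the comment in the hunk, re-encoding it would drop the edge without any way to recover the receiver's type.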

+ 1 - 1
src/resolution/index.ts

@@ -33,7 +33,7 @@ const SUPERTYPE_BEARING_KINDS = new Set<Node['kind']>([
 ]);
 
 /** Languages whose chained calls use the dotted `inner().method` encoding. */
-const DOT_CHAIN_LANGUAGES = new Set(['java', 'kotlin', 'csharp']);
+const DOT_CHAIN_LANGUAGES = new Set(['java', 'kotlin', 'csharp', 'swift']);
 
 /** The extractor's chained-receiver encoding: `<inner>().<method>`. */
 const CHAIN_SHAPE = /^(.+)\(\)\.(\w+)$/;
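
To make the encoding concrete, here is a sketch of how an encoded reference could be split and routed to the constructor or factory branch. `CHAIN_SHAPE` and `CONSTRUCTS_VIA_BARE_CALL` are copied from this diff; `ChainPlan` and `planChain` are hypothetical names, and the sketch stops before the return-type lookup and method validation that the real resolver performs:

```typescript
// Copied from the diff: the encoded chain shape and the languages where a
// bare capitalized call constructs the class.
const CHAIN_SHAPE = /^(.+)\(\)\.(\w+)$/;
const CONSTRUCTS_VIA_BARE_CALL = new Set(['kotlin', 'swift']);

// Hypothetical result type, for illustration only.
type ChainPlan =
  | { kind: 'constructor'; typeName: string; method: string }
  | { kind: 'factory'; receiver: string; factory: string; method: string };

function planChain(refName: string, language: string): ChainPlan | null {
  const m = CHAIN_SHAPE.exec(refName);
  if (!m) return null; // not a chained-receiver reference
  const inner = m[1]!;
  const method = m[2]!;
  const lastDot = inner.lastIndexOf('.');
  if (lastDot <= 0) {
    // Bare inner call: a constructor only if the language constructs via a
    // bare call and the name is capitalized; otherwise give up.
    if (!CONSTRUCTS_VIA_BARE_CALL.has(language) || !/^[A-Z]/.test(inner)) return null;
    return { kind: 'constructor', typeName: inner, method };
  }
  // Dotted inner call: `Receiver.factory`, whose return type is looked up later.
  return {
    kind: 'factory',
    receiver: inner.slice(0, lastDot),
    factory: inner.slice(lastDot + 1),
    method,
  };
}

console.log(planChain('Foo.make().draw', 'swift')?.kind); // factory
console.log(planChain('Foo().draw', 'swift')?.kind);      // constructor
console.log(planChain('Foo().draw', 'java'));             // null
```

Returning `null` for the Java case reflects the rule stated in the diff: a bare `Foo()` there is a method call, so construction must not be assumed.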

+ 21 - 9
src/resolution/name-matcher.ts

@@ -595,6 +595,13 @@ export function matchPhpCallChain(
   return resolveMethodOnType(resolvedClass, method, ref, context, 0.85, 'instance-method');
 }
 
+/**
+ * Languages where an unprefixed capitalized call `Foo(args)` constructs the
+ * class (so a `Foo(args).method()` receiver's type is `Foo`). Java/C# need `new`,
+ * so a bare `Foo()` there is a method call, not construction — excluded.
+ */
+const CONSTRUCTS_VIA_BARE_CALL = new Set(['kotlin', 'swift']);
+
 /**
  * Resolve a dotted chained call whose receiver is a static factory / fluent call —
  * `Foo.getInstance().bar()`, encoded by the extractor as `Foo.getInstance().bar`
@@ -603,7 +610,7 @@ export function matchPhpCallChain(
  * it (resolveMethodOnType requires `Type::method` to exist), so a wrong inference
  * yields no edge rather than a wrong one (e.g. a same-named `bar()` on an
  * unrelated class is never matched). Shared by the dot-notation languages
- * (Java, Kotlin, C#) — same receiver shape, same `Class::method` qualified names.
+ * (Java, Kotlin, C#, Swift) — same receiver shape, same `Class::method` qualified names.
  */
 export function matchDottedCallChain(
   ref: UnresolvedRef,
@@ -617,13 +624,13 @@ export function matchDottedCallChain(
 
   // Constructor receiver `Foo(args).method()` (encoded `Foo().method`): a bare,
   // capitalized inner is a class construction, so the receiver's type is the
-  // class itself — resolve the method on it. Kotlin only: there an unprefixed
-  // capitalized call constructs the class, whereas in Java a bare `Foo()` is a
-  // method call (constructors need `new`), so we must not assume construction.
-  // A lowercase bare inner is a top-level `factory().method()` whose type we
-  // can't recover — bail.
+  // class itself — resolve the method on it. Only in languages where an
+  // unprefixed capitalized call constructs the class (Kotlin, Swift); in Java/C#
+  // a bare `Foo()` is a method call (constructors need `new`), so we must not
+  // assume construction. A lowercase bare inner is a top-level `factory().method()`
+  // whose type we can't recover — bail.
   if (lastDot <= 0) {
-    if (ref.language !== 'kotlin' || !/^[A-Z]/.test(inner)) return null;
+    if (!CONSTRUCTS_VIA_BARE_CALL.has(ref.language) || !/^[A-Z]/.test(inner)) return null;
     return resolveMethodOnType(inner, method, ref, context, 0.85, 'instance-method', importedFqnOf(inner, ref, context));
   }
 
@@ -1081,11 +1088,16 @@ export function matchReference(
     if (result) return result;
   }
 
-  // 1d. Dotted chained static-factory / fluent call (Java / Kotlin / C#) —
+  // 1d. Dotted chained static-factory / fluent call (Java / Kotlin / C# / Swift) —
   // `Foo.getInstance().bar()` encoded as `Foo.getInstance().bar` (#645/#608
   // mechanism). Resolve bar's class from getInstance's declared return type, then
   // validate the method on it.
-  if (ref.language === 'java' || ref.language === 'kotlin' || ref.language === 'csharp') {
+  if (
+    ref.language === 'java' ||
+    ref.language === 'kotlin' ||
+    ref.language === 'csharp' ||
+    ref.language === 'swift'
+  ) {
     result = matchDottedCallChain(ref, context);
     if (result) return result;
   }