Pārlūkot izejas kodu

fix(resolution): Java/Kotlin imports disambiguate same-name classes (#314)

A Maven multi-module project where `dao/converter/FooConverter` and
`service/converter/FooConverter` both expose a `convert` method used
to resolve by file-path proximity — picking whichever class was closer
to the caller, which is wrong any time the caller lives in an
equidistant cross-cutting module. `extractImportMappings` had no Java
branch at all, so the FQN signal Java imports carry — `import
com.example.dao.converter.FooConverter;` — was thrown away.

- `extractJavaImports` parses regular and `import static` directives;
  wildcard imports (`*`) are intentionally skipped.
- `resolveViaImport` has a new Java/Kotlin cross-file branch that
  converts the imported FQN to a file-path suffix
  (`com/example/dao/converter/FooConverter.java`, or `.kt`) and
  resolves the symbol against the file whose path matches by suffix.
- For the field-receiver pattern (`@Autowired private FooConverter
  fooConverter; fooConverter.convert(...)` — Spring's typical shape),
  `matchMethodCall` now looks up the receiver's inferred type in the
  caller file's imports and threads the resulting FQN through to
  `resolveMethodOnType`. When two `FooConverter::convert` candidates
  exist, the import — not iteration order — picks the right one.

Validated end-to-end with a synthetic 3-module repro:
  import com.example.dao.converter.FooConverter;     -> dao convert
  import com.example.service.converter.FooConverter; -> service convert
Same field declaration, same call site, swapping only the import
swaps the resolved target.

spring-petclinic (47 .java files, single-module so no same-name
collisions): +15 newly import-resolved edges, +2 references, no
regression in calls / imports / extends / instantiates.

Closes #314.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Colby McHenry 3 nedēļas atpakaļ
vecāks
revīzija
ef29de4fd2

+ 1 - 0
CHANGELOG.md

@@ -16,6 +16,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
   - **Field-injected concrete-bean trace.** A Spring controller's `@Resource(name="userBO") private UserBO userbo;` followed by `this.userbo.toLogin2(...)` now resolves through to `UserBO.toLogin2` even when the field type is a concrete class whose name doesn't match the field by Java naming convention (`userbo` → `UserBO`). The fix is two layered changes in the language layer (Java only): (a) the call extractor unwraps `this.<field>` receivers (previously surfaced as `this.userbo.toLogin2` and dropped through every name-matcher strategy); (b) the resolver looks up the receiver name in the enclosing class's field declarations and uses the declared type to resolve the method. This generalizes beyond Spring — any Java code using `this.field.method()` now resolves correctly.
 
 ### Fixed
+- **Java/Kotlin imports now disambiguate same-name classes across modules (#314).** A Maven multi-module project where `dao/converter/FooConverter` and `service/converter/FooConverter` both expose a `convert` method used to resolve via file-path proximity — picking whichever class was closer to the caller, which is wrong any time the caller lives in an equidistant cross-cutting module. The import resolver had no Java branch at all (`extractImportMappings` returned `[]` for `.java`/`.kt`), so the FQN signal Java imports carry — `import com.example.dao.converter.FooConverter;` — was being thrown away. New `extractJavaImports` parses regular and `import static` directives. `resolveViaImport` now has a Java/Kotlin cross-file branch that converts the imported FQN to a file-path suffix (`com/example/dao/converter/FooConverter.java`) and resolves the symbol against the file whose path matches. For the `@Autowired private FooConverter fooConverter; fooConverter.convert(...)` field-receiver pattern (Spring's typical shape), `matchMethodCall` now passes the imported FQN to `resolveMethodOnType` so when multiple `FooConverter::convert` candidates exist, the import — not iteration order — picks the right one. Validated end-to-end on a synthetic two-module repro: swapping only the `import` line on the caller (with identical field declaration and call site) switches the resolved target between dao and service correctly. On spring-petclinic, +15 newly import-resolved Java edges with no regression in `calls`/`imports`/`extends`.
 - **TypeScript `type` aliases with object shapes no longer cause cross-module false-positive call edges (#359).** Receiver-typed `handle.stop()` where `handle: RecorderHandle` and `RecorderHandle = { stop: () => Promise<void> }` used to attach the call edge to an unrelated `class Foo { stop() {} }` in a sibling directory via path-proximity matching, because the type alias had no `stop` node — only the look-alike class did. The fix surfaces type-alias object-shape members (and intersection-type members) as first-class `property`/`method` nodes under the alias: `type X = { foo: T; bar(): T }` now produces `X::foo` and `X::bar` in the graph. Function-typed properties (`stop: () => Promise<void>`) are emitted as `method` kind so `obj.stop()` resolves to them; non-function properties remain `property` kind. With the alias's members in the graph, the existing camelCase receiver-name word overlap (`recorder` ↔ `RecorderHandle`) routes the call to the correct alias member instead of the wrong class. Anonymous nested object types inside generic arguments (`Promise<{ ok: true }>`) intentionally don't produce phantom members — only immediate `object_type` / `intersection_type` operands of the alias value are walked. Measured on excalidraw/excalidraw (314 .ts files): **+776 new property nodes** + **+1,008 method nodes from type-alias members** + **+226 newly accurate `calls` edges** pointing at alias members (some shifted from incorrect class targets, some previously unresolved).
 - **C# now produces `references` edges for parameter, return, property, and field types (#381).** Indexing any C# project used to yield **zero** `references` edges, so `codegraph_callers SomeDto` returned no results even when the DTO was used as a parameter or return type across the codebase, and `codegraph_callees` on a service class only saw its `using` imports. Two root causes: `csharp.ts` was missing `returnField`, and the type-leaf walker only matched `type_identifier` nodes — but C# tree-sitter emits `identifier`/`predefined_type`/`qualified_name`/`generic_name` instead. The fix adds the missing extractor field, routes C# through a dedicated type walker that only descends into known type-position fields (so parameter NAMES like `request` in `Build(UserDto request)` never mis-emit as type refs), and hooks `extractField`/`extractProperty` to invoke the walker. Measured on dotnet/eShop (527 `.cs` files): C# `references` edges go from **35 → 925** (+26x), with no regression in `calls`/`imports`/`instantiates`/`extends`/`implements`.
 - **Go cross-package qualified calls (`pkga.FuncX(...)`) now resolve to the right package (#388).** On a Go monorepo with a layered package layout (handler/service/domain/dao), `codegraph_callers`, `_callees`, `_impact`, and `_trace` used to return ~0-1 results where grep finds hundreds to thousands of real call sites — the central value proposition of CodeGraph silently degraded on entire Go codebases. Root cause: the import resolver flagged every Go import path without `/internal/` as third-party (because it had no idea what the project's own module path was), so cross-package calls fell through to name-matching with path-proximity scoring, which on real codebases picks ~one accidental candidate per call site. The Go branch now reads the project's `go.mod`, treats `<module-path>/...` imports as in-module, and looks up the qualified symbol in the imported package's directory; same-name functions in *different* packages no longer collide. As a side fix, Go nodes now correctly carry `is_exported=1` for capitalized identifiers (the resolver needs this to filter candidates). Measured on gRPC-Go (1,031 `.go` files, layered packages): cross-package `calls` edges go from 10,880 → 19,929 (**+83%**), total `calls` from 23,803 → 34,105 (**+43%**), with no false-positive resolution of stdlib calls (`fmt.Println` etc. stay external).

+ 58 - 0
__tests__/resolution.test.ts

@@ -802,6 +802,64 @@ export async function finaliseRecording(recorder: RecorderHandle) {
       expect(handleStop!.kind).toBe('method');
     });
 
+    it('Java import disambiguates same-name classes across modules (#314)', async () => {
+      // Pre-#314 the import resolver had no Java branch at all, so a
+      // multi-module Maven repo where `dao/converter/FooConverter` and
+      // `service/converter/FooConverter` both export a `convert` method
+      // resolved by file-path proximity — picking whichever class was
+      // closer to the caller, which is wrong any time the caller lives
+      // in an equidistant cross-cutting module.
+      const daoDir = path.join(tempDir, 'dao/src/main/java/com/example/dao/converter');
+      const serviceDir = path.join(tempDir, 'service/src/main/java/com/example/service/converter');
+      const webDir = path.join(tempDir, 'web/src/main/java/com/example/web');
+      fs.mkdirSync(daoDir, { recursive: true });
+      fs.mkdirSync(serviceDir, { recursive: true });
+      fs.mkdirSync(webDir, { recursive: true });
+
+      fs.writeFileSync(
+        path.join(daoDir, 'FooConverter.java'),
+        `package com.example.dao.converter;
+public class FooConverter { public String convert(String x) { return "dao:" + x; } }
+`
+      );
+      fs.writeFileSync(
+        path.join(serviceDir, 'FooConverter.java'),
+        `package com.example.service.converter;
+public class FooConverter { public String convert(String x) { return "svc:" + x; } }
+`
+      );
+      // The caller imports the SERVICE version — even though dao is
+      // alphabetically/lexically first in the candidate list, the
+      // import must trump that order.
+      fs.writeFileSync(
+        path.join(webDir, 'Handler.java'),
+        `package com.example.web;
+
+import com.example.service.converter.FooConverter;
+
+public class Handler {
+  private FooConverter fooConverter;
+  public String use() { return fooConverter.convert("input"); }
+}
+`
+      );
+
+      cg = await CodeGraph.init(tempDir, { index: true });
+
+      const use = cg
+        .getNodesByKind('method')
+        .find((n) => n.qualifiedName === 'Handler::use');
+      expect(use).toBeDefined();
+      const calls = cg.getOutgoingEdges(use!.id).filter((e) => e.kind === 'calls');
+      expect(calls.length).toBeGreaterThanOrEqual(1);
+
+      const target = cg.getNode(calls[0]!.target);
+      expect(target?.name).toBe('convert');
+      expect(target?.filePath.replace(/\\/g, '/')).toBe(
+        'service/src/main/java/com/example/service/converter/FooConverter.java'
+      );
+    });
+
     it('C# extracts references from method/property/field types (#381)', async () => {
       // Pre-#381, every C# project produced ZERO `references` edges:
       // csharp.ts was missing returnField, and the type-leaf walker

+ 133 - 0
src/resolution/import-resolver.ts

@@ -232,6 +232,8 @@ export function extractImportMappings(
     mappings.push(...extractPythonImports(content));
   } else if (language === 'go') {
     mappings.push(...extractGoImports(content));
+  } else if (language === 'java' || language === 'kotlin') {
+    mappings.push(...extractJavaImports(content));
   } else if (language === 'php') {
     mappings.push(...extractPHPImports(content));
   }
@@ -441,6 +443,49 @@ function extractGoImports(content: string): ImportMapping[] {
   return mappings;
 }
 
+/**
+ * Extract Java / Kotlin import mappings.
+ *
+ * Java/Kotlin imports carry the full qualified name of the imported
+ * symbol — `import com.example.dao.converter.FooConverter;` — which is
+ * exactly the disambiguation signal we need when two packages both
+ * declare a `FooConverter`. Pre-#314 the resolver had no Java branch
+ * here at all, so this mapping was empty and cross-module name
+ * collisions were resolved by file-path proximity (often wrongly).
+ *
+ * `import static com.example.Foo.bar;` is parsed as a local-name `bar`
+ * pointing at FQN `com.example.Foo.bar` so static-method call sites
+ * (`bar(...)`) can resolve through the same import lookup.
+ */
+function extractJavaImports(content: string): ImportMapping[] {
+  const mappings: ImportMapping[] = [];
+  // Strip line and block comments so `// import foo;` doesn't false-match.
+  const stripped = content
+    .replace(/\/\*[\s\S]*?\*\//g, '')
+    .replace(/\/\/[^\n]*/g, '');
+  // `import [static] <fqn>[.*];`
+  const re = /^\s*import\s+(static\s+)?([\w.]+(?:\.\*)?)\s*;/gm;
+  let match: RegExpExecArray | null;
+  while ((match = re.exec(stripped)) !== null) {
+    const fqn = match[2]!;
+    // `import com.example.*;` — wildcard. We can't materialize a single
+    // local name; skip and let name-matching handle members reachable
+    // through the wildcard. (Future enhancement: enumerate package files.)
+    if (fqn.endsWith('.*')) continue;
+    const parts = fqn.split('.');
+    const localName = parts[parts.length - 1];
+    if (!localName) continue;
+    mappings.push({
+      localName,
+      exportedName: localName,
+      source: fqn,
+      isDefault: false,
+      isNamespace: false,
+    });
+  }
+  return mappings;
+}
+
 /**
  * Extract PHP import mappings (use statements)
  */
@@ -619,6 +664,17 @@ export function resolveViaImport(
     if (goResult) return goResult;
   }
 
+  // Java / Kotlin: imports are FQNs (`import com.example.Foo;`) — no
+  // resolvable file path the JS/TS-style chain below could follow. Look
+  // up the symbol by name and filter to the candidate whose file path
+  // matches the imported FQN. This is the disambiguation signal that
+  // breaks the same-name class collision the path-proximity matcher
+  // can't resolve (issue #314).
+  if (ref.language === 'java' || ref.language === 'kotlin') {
+    const javaResult = resolveJavaImportedReference(ref, imports, context);
+    if (javaResult) return javaResult;
+  }
+
   // Check if the reference name matches any import
   for (const imp of imports) {
     if (imp.localName === ref.referenceName || ref.referenceName.startsWith(imp.localName + '.')) {
@@ -659,6 +715,83 @@ export function resolveViaImport(
   return null;
 }
 
+/**
+ * Resolve a Java/Kotlin reference whose receiver is the simple name of
+ * an imported FQN: `Foo.bar(...)` where `import com.example.Foo;`. The
+ * imported FQN converts to a file-path suffix (`com/example/Foo.java`
+ * or `.kt`) which uniquely identifies the right symbol when multiple
+ * classes share the same simple name.
+ *
+ * Also handles bare references to the imported class itself
+ * (`new Foo()` extraction emits `Foo` as a `references`/`instantiates`
+ * ref) and `import static <Foo>.bar` style imports of a single member.
+ */
+function resolveJavaImportedReference(
+  ref: UnresolvedRef,
+  imports: ImportMapping[],
+  context: ResolutionContext
+): ResolvedRef | null {
+  if (imports.length === 0) return null;
+
+  const ext = ref.language === 'kotlin' ? '.kt' : '.java';
+
+  for (const imp of imports) {
+    const matchesBare = imp.localName === ref.referenceName;
+    const matchesQualified = ref.referenceName.startsWith(imp.localName + '.');
+    if (!matchesBare && !matchesQualified) continue;
+
+    // Convert FQN to a file-path suffix. `com.example.Foo` ->
+    // `com/example/Foo.java` (or `.kt`). The actual file may live
+    // under any source root (`src/main/java/`, `src/`, etc.), so match
+    // by suffix rather than exact path.
+    const fqnPath = imp.source.replace(/\./g, '/') + ext;
+
+    // Which symbol name to look up: the class itself, or a member.
+    const memberName = matchesBare
+      ? imp.localName
+      : ref.referenceName.substring(imp.localName.length + 1);
+
+    const candidates = context.getNodesByName(memberName);
+    for (const node of candidates) {
+      if (node.language !== ref.language) continue;
+      const fp = node.filePath.replace(/\\/g, '/');
+      if (fp.endsWith(fqnPath) || fp.endsWith('/' + fqnPath)) {
+        return {
+          original: ref,
+          targetNodeId: node.id,
+          confidence: 0.9,
+          resolvedBy: 'import',
+        };
+      }
+    }
+
+    // `import static com.example.Foo.bar;` — the FQN's tail is the
+    // member name, the part before is the owner class. Look up the
+    // member named `<imp.localName>` (e.g. `bar`) and prefer the
+    // candidate whose file matches the parent FQN's path.
+    if (matchesBare) {
+      const dot = imp.source.lastIndexOf('.');
+      if (dot > 0) {
+        const ownerFqn = imp.source.substring(0, dot);
+        const ownerPath = ownerFqn.replace(/\./g, '/') + ext;
+        for (const node of candidates) {
+          if (node.language !== ref.language) continue;
+          const fp = node.filePath.replace(/\\/g, '/');
+          if (fp.endsWith(ownerPath) || fp.endsWith('/' + ownerPath)) {
+            return {
+              original: ref,
+              targetNodeId: node.id,
+              confidence: 0.9,
+              resolvedBy: 'import',
+            };
+          }
+        }
+      }
+    }
+  }
+  return null;
+}
+
 /**
  * Resolve a Go cross-package qualified reference (`pkga.FuncX`) by matching
  * the package alias against an in-module import, stripping the module prefix

+ 36 - 2
src/resolution/name-matcher.ts

@@ -153,6 +153,15 @@ function resolveMethodOnType(
   context: ResolutionContext,
   confidence: number,
   resolvedBy: ResolvedRef['resolvedBy'],
+  /**
+   * Optional FQN that identifies WHICH class declaration `typeName`
+   * refers to in the caller's file. When multiple candidates share
+   * the same qualifiedName (`FooConverter::convert` in both
+   * `dao/converter/` and `service/converter/`), the FQN's
+   * file-path-suffix picks the right one — the disambiguation
+   * signal Java imports carry but the call site doesn't (#314).
+   */
+  preferredFqn?: string,
 ): ResolvedRef | null {
   // Look up methods by name and match by qualifiedName ending in
   // `<typeName>::<methodName>`. This works whether the method is defined
@@ -161,21 +170,40 @@ function resolveMethodOnType(
   // The previous same-file approach missed the latter — the typical C++ layout.
   const methodCandidates = context.getNodesByName(methodName);
   const want = `${typeName}::${methodName}`;
+  const matches: Node[] = [];
   for (const m of methodCandidates) {
     if (m.kind !== 'method') continue;
     if (m.language !== ref.language) continue;
     const qn = m.qualifiedName;
     if (qn === want || qn.endsWith(`::${want}`)) {
+      matches.push(m);
+    }
+  }
+  if (matches.length === 0) return null;
+
+  if (matches.length > 1 && preferredFqn) {
+    const ext = ref.language === 'kotlin' ? '.kt' : '.java';
+    const fqnPath = preferredFqn.replace(/\./g, '/') + ext;
+    const chosen = matches.find((m) => {
+      const fp = m.filePath.replace(/\\/g, '/');
+      return fp.endsWith(fqnPath) || fp.endsWith('/' + fqnPath);
+    });
+    if (chosen) {
       return {
         original: ref,
-        targetNodeId: m.id,
+        targetNodeId: chosen.id,
         confidence,
         resolvedBy,
       };
     }
   }
 
-  return null;
+  return {
+    original: ref,
+    targetNodeId: matches[0]!.id,
+    confidence,
+    resolvedBy,
+  };
 }
 
 // C++ keywords/control-flow tokens that can appear right before a receiver
@@ -365,6 +393,11 @@ export function matchMethodCall(
   if ((ref.language === 'java' || ref.language === 'kotlin') && dotMatch) {
     const inferredType = inferJavaFieldReceiverType(objectOrClass!, ref, context);
     if (inferredType) {
+      // When two classes share the same simple name, the caller file's
+      // import is the only signal that names WHICH one — pass the
+      // imported FQN so resolveMethodOnType can disambiguate (#314).
+      const imports = context.getImportMappings(ref.filePath, ref.language);
+      const importedFqn = imports.find((i) => i.localName === inferredType)?.source;
       const typedMatch = resolveMethodOnType(
         inferredType,
         methodName!,
@@ -372,6 +405,7 @@ export function matchMethodCall(
         context,
         0.9,
         'instance-method',
+        importedFqn,
       );
       if (typedMatch) {
         return typedMatch;