Răsfoiți Sursa

fix(resolution): Go cross-package qualified calls resolve via go.mod (#388) (#469)

`pkga.FuncX(...)` cross-package calls in Go monorepos were dropping
through the import resolver — `isExternalImport(go)` flagged any
non-`/internal/` import as third-party because the resolver had no idea
what the project's own module path was. Resolution fell back to name
matching with path-proximity scoring, which on a layered codebase picks
one accidental candidate per call site (~<1% recall per #388's
5,303-vs-1 figure).

- `src/resolution/go-module.ts` (new) parses the `module ...` directive
  from project-root `go.mod`, exposed via `getGoModule()` on
  `ResolutionContext`.
- `isExternalImport(go)` treats `<module-path>/...` imports as in-module;
  the existing `/internal/` escape hatch is preserved for repos without
  a parsed go.mod.
- `resolveViaImport` gets a Go cross-package branch that strips the
  module prefix to a project-relative directory, then resolves the
  qualified member via `getNodesByName(member)` filtered to that exact
  directory and `isExported=true`. Sub-packages don't collide with their
  parents; same-name funcs in different packages don't false-merge.
- Go extractor sets `isExported` from the identifier's first character
  (Go's universal uppercase=exported convention). The resolver depends
  on this to filter candidates.

Validation on gRPC-Go (1,031 .go files, layered package tree):
  total `calls` edges:    23,803 -> 34,105 (+43%)
  cross-pkg `calls`:      10,880 -> 19,929 (+83%)
  fmt/strconv/etc. stdlib calls: stay external (no false positives)

Tests cover in-module disambiguation with same-name funcs in two
packages, aliased imports, and stdlib calls not being false-resolved to
in-project nodes.

Closes #388.
Colby Mchenry 3 săptămâni în urmă
părinte
comite
f1b79eeae1

+ 1 - 0
CHANGELOG.md

@@ -16,6 +16,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
   - **Field-injected concrete-bean trace.** A Spring controller's `@Resource(name="userBO") private UserBO userbo;` followed by `this.userbo.toLogin2(...)` now resolves through to `UserBO.toLogin2` even when the field type is a concrete class whose name doesn't match the field by Java naming convention (`userbo` → `UserBO`). The fix is two layered changes in the language layer (Java only): (a) the call extractor unwraps `this.<field>` receivers (previously surfaced as `this.userbo.toLogin2` and dropped through every name-matcher strategy); (b) the resolver looks up the receiver name in the enclosing class's field declarations and uses the declared type to resolve the method. This generalizes beyond Spring — any Java code using `this.field.method()` now resolves correctly.
 
 ### Fixed
+- **Go cross-package qualified calls (`pkga.FuncX(...)`) now resolve to the right package (#388).** On a Go monorepo with a layered package layout (handler/service/domain/dao), `codegraph_callers`, `_callees`, `_impact`, and `_trace` used to return ~0-1 results where grep finds hundreds to thousands of real call sites — the central value proposition of CodeGraph silently degraded on entire Go codebases. Root cause: the import resolver flagged every Go import path without `/internal/` as third-party (because it had no idea what the project's own module path was), so cross-package calls fell through to name-matching with path-proximity scoring, which on real codebases picks ~one accidental candidate per call site. The Go branch now reads the project's `go.mod`, treats `<module-path>/...` imports as in-module, and looks up the qualified symbol in the imported package's directory; same-name functions in *different* packages no longer collide. As a side fix, Go nodes now correctly carry `is_exported=1` for capitalized identifiers (the resolver needs this to filter candidates). Measured on gRPC-Go (1,031 `.go` files, layered packages): cross-package `calls` edges go from 10,880 → 19,929 (**+83%**), total `calls` from 23,803 → 34,105 (**+43%**), with no false-positive resolution of stdlib calls (`fmt.Println` etc. stay external).
 - **`codegraph_files` now returns the whole project when an agent passes `path="/"`, `"."`, `"./"`, `""`, or a Windows-style `"\\"` — instead of "No files found matching the criteria."** Indexed file paths are stored as project-relative POSIX (e.g. `src/foo.ts`), but the path filter used a plain `startsWith`, so a leading slash or any of the other root-ish shapes an agent might guess matched nothing and pushed the agent back to Read/Glob — the exact opencode + Gemini Flash regression reported on Windows 11. Subdirectory filters are now equally forgiving: `"/src"`, `"./src"`, `"src/"`, `"src\\components"`, etc. all resolve correctly. Sibling-prefix bleed (`"src"` was previously matching `src-utils/...`) is also fixed — the filter now requires either an exact match or a `<filter>/` boundary. Closes #426.
 - **File watcher no longer marks edited files as fresh when another process holds the index lock.** When a second writer (concurrent `codegraph index`, a git hook, another MCP daemon) held `.codegraph/codegraph.lock`, `CodeGraph.sync()` returned a zero-shape no-op instead of throwing. The file watcher took that as a successful sync and cleared `pendingFiles` — so the per-file staleness signal MCP tools surface to agents (issue #403) dropped immediately, even though the edit was never indexed. `CodeGraph.watch()` now converts that no-op into a typed `LockUnavailableError` thrown into the watcher; the existing retry path preserves `pendingFiles` and reschedules until the lock becomes available. The error is logged at debug only (no `onSyncError` callback) so a long-running external indexer doesn't spam stderr every debounce cycle. Closes #449.
 - **Watch sync no longer aborts with `FOREIGN KEY constraint failed`.** PR #62 plugged this FK violation at the extraction layer (empty-named nodes whose containment edges had no target), but the same violation kept reappearing on v0.9.5 during the daemon's *watch sync* — not on initial index. Once an agent's daemon had been running long enough to accumulate edits, a resolver lookup that crossed a framework-specific cache could hand back a node whose row had been removed by a recent file rewrite, and the FK check then aborted the entire resolution batch, leaving the user's daemon log filling with `Watch sync failed { error: 'FOREIGN KEY constraint failed' }`. `QueryBuilder.insertEdges` now validates every batch's endpoints against the `nodes` table directly (one fresh `SELECT id IN (...)` per batch, no cache) and silently skips edges with missing source or target — so a stale lookup result drops one edge instead of aborting the whole sync. Surfaces as a fresh `codegraph init`/`index` cycle now surviving its first watch-sync cycle without the FK error, and the daemon recovering naturally instead of compounding into further failures. Closes #455.

+ 121 - 0
__tests__/resolution.test.ts

@@ -645,6 +645,127 @@ def bootstrap():
       );
       expect(callsToUserService).toHaveLength(0);
     });
+
+    it('resolves Go cross-package qualified calls via go.mod module path (#388)', async () => {
+      // Pre-#388, every `pkga.FuncX(...)` call in a Go monorepo was flagged
+      // external (isExternalImport returned true for any non-`/internal/`
+      // import without `.`-prefix) and resolution fell through to name-match
+      // with path proximity — recall on cross-package callers was ~<1%.
+      fs.writeFileSync(
+        path.join(tempDir, 'go.mod'),
+        'module github.com/example/myproject\n\ngo 1.21\n'
+      );
+
+      const pkgaDir = path.join(tempDir, 'pkga');
+      const pkgbDir = path.join(tempDir, 'pkgb');
+      const pkgcDir = path.join(tempDir, 'pkgc');
+      fs.mkdirSync(pkgaDir);
+      fs.mkdirSync(pkgbDir);
+      fs.mkdirSync(pkgcDir);
+
+      // Same-name exported function in two packages — only the imported one
+      // should resolve. Exercises disambiguation, not just connectivity.
+      fs.writeFileSync(
+        path.join(pkgaDir, 'conv.go'),
+        'package pkga\nfunc Convert(x int) int { return x * 2 }\n'
+      );
+      fs.writeFileSync(
+        path.join(pkgbDir, 'conv.go'),
+        'package pkgb\nfunc Convert(x int) int { return x + 1 }\n'
+      );
+      fs.writeFileSync(
+        path.join(pkgcDir, 'use.go'),
+        `package pkgc
+
+import "github.com/example/myproject/pkga"
+
+func UsePkga() {
+  pkga.Convert(5)
+}
+`
+      );
+
+      cg = await CodeGraph.init(tempDir, { index: true });
+
+      const usePkga = cg.getNodesByKind('function').filter((n) => n.name ==='UsePkga')[0];
+      expect(usePkga).toBeDefined();
+
+      const outgoing = cg.getOutgoingEdges(usePkga!.id);
+      const callEdges = outgoing.filter((e) => e.kind === 'calls');
+      expect(callEdges).toHaveLength(1);
+
+      const target = cg.getNode(callEdges[0]!.target);
+      expect(target?.name).toBe('Convert');
+      // Critical: the resolver must pick the imported pkga's Convert,
+      // not pkgb's. With the broken (pre-fix) resolver this lands on
+      // whichever Convert happens to be cheaper under path proximity.
+      expect(target?.filePath.replace(/\\/g, '/')).toBe('pkga/conv.go');
+    });
+
+    it('resolves Go aliased imports across packages (#388)', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'go.mod'),
+        'module github.com/example/myproject\n\ngo 1.21\n'
+      );
+      fs.mkdirSync(path.join(tempDir, 'pkgb'));
+      fs.mkdirSync(path.join(tempDir, 'pkgd'));
+
+      fs.writeFileSync(
+        path.join(tempDir, 'pkgb', 'lib.go'),
+        'package pkgb\nfunc Compute(x int) int { return x }\n'
+      );
+      fs.writeFileSync(
+        path.join(tempDir, 'pkgd', 'use.go'),
+        `package pkgd
+
+import (
+  "fmt"
+  alias "github.com/example/myproject/pkgb"
+)
+
+func UseAliased() {
+  fmt.Println("hi")
+  alias.Compute(3)
+}
+`
+      );
+
+      cg = await CodeGraph.init(tempDir, { index: true });
+
+      const useAliased = cg.getNodesByKind('function').filter((n) => n.name ==='UseAliased')[0];
+      expect(useAliased).toBeDefined();
+      const calls = cg.getOutgoingEdges(useAliased!.id).filter((e) => e.kind === 'calls');
+      // fmt.Println is stdlib — must stay external. alias.Compute must resolve.
+      expect(calls).toHaveLength(1);
+      const target = cg.getNode(calls[0]!.target);
+      expect(target?.name).toBe('Compute');
+      expect(target?.filePath.replace(/\\/g, '/')).toBe('pkgb/lib.go');
+    });
+
+    it('Go: leaves stdlib calls (fmt.Println, etc.) external', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'go.mod'),
+        'module github.com/example/myproject\n\ngo 1.21\n'
+      );
+      fs.writeFileSync(
+        path.join(tempDir, 'main.go'),
+        `package main
+
+import "fmt"
+
+func main() {
+  fmt.Println("hi")
+}
+`
+      );
+
+      cg = await CodeGraph.init(tempDir, { index: true });
+
+      const mainFn = cg.getNodesByKind('function').filter((n) => n.name ==='main')[0];
+      const calls = cg.getOutgoingEdges(mainFn!.id).filter((e) => e.kind === 'calls');
+      // No spurious in-project edge — fmt.* must stay unresolved/external.
+      expect(calls).toHaveLength(0);
+    });
   });
 
   describe('Name Matcher: kind bias for new ref kinds', () => {

+ 12 - 0
src/extraction/languages/go.ts

@@ -36,6 +36,18 @@ export const goExtractor: LanguageExtractor = {
     if (typeChild.type === 'interface_type') return 'interface';
     return undefined;
   },
+  isExported: (node, source) => {
+    // Go: a symbol is exported when its identifier starts with an uppercase letter.
+    // Look at the `name` field directly (works for function_declaration,
+    // method_declaration, type_spec, and var_spec / const_spec via extractor flow).
+    const nameNode = getChildByField(node, 'name');
+    if (nameNode) {
+      const text = getNodeText(nameNode, source);
+      const first = text.charCodeAt(0);
+      return first >= 65 && first <= 90; // A-Z
+    }
+    return false;
+  },
   getReceiverType: (node, source) => {
     // Go method_declaration has a "receiver" field: func (sl *scrapeLoop) run(...)
     // The receiver is a parameter_list containing a parameter_declaration

+ 47 - 0
src/resolution/go-module.ts

@@ -0,0 +1,47 @@
+/**
+ * Go module path detection.
+ *
+ * A Go monorepo's cross-package calls (`pkga.FuncX(...)`) only resolve when
+ * the resolver knows the project's module path (the `module ...` directive
+ * in `go.mod`). Without it, `isExternalImport` treats every in-module import
+ * — `github.com/example/myproject/pkga` — as a third-party package, so
+ * resolution falls through to name-matching with path proximity and returns
+ * a tiny fraction of the real call sites. See issue #388.
+ */
+
+import * as fs from 'fs';
+import * as path from 'path';
+
+export interface GoModule {
+  /** The module path declared in `go.mod`, e.g. `github.com/example/myproject` */
+  modulePath: string;
+  /** Absolute path to the directory containing the `go.mod` file. */
+  rootDir: string;
+}
+
+/**
+ * Read the `go.mod` file at the project root and extract the module path.
+ * Returns `null` if no `go.mod` exists or it has no `module` directive.
+ *
+ * Limitation: only the project-root `go.mod` is read. Nested `go.mod` files
+ * (Go workspaces, monorepos with multiple modules) are not yet resolved —
+ * a follow-up if a real repro shows up.
+ */
+export function loadGoModule(projectRoot: string): GoModule | null {
+  const goModPath = path.join(projectRoot, 'go.mod');
+  let content: string;
+  try {
+    content = fs.readFileSync(goModPath, 'utf-8');
+  } catch {
+    return null;
+  }
+  // `module <path>` is the first non-comment directive in any valid go.mod.
+  // Strip line comments so a `// module foo` doesn't false-match.
+  const stripped = content.replace(/\/\/[^\n]*/g, '');
+  const match = stripped.match(/^\s*module\s+(\S+)\s*$/m);
+  if (!match) return null;
+  // Strip optional quoting around the module path.
+  const modulePath = match[1]!.replace(/^["']|["']$/g, '');
+  if (!modulePath) return null;
+  return { modulePath, rootDir: projectRoot };
+}

+ 84 - 3
src/resolution/import-resolver.ts

@@ -103,10 +103,24 @@ function isExternalImport(
   }
 
   if (language === 'go') {
-    // Standard library or external packages
-    if (!importPath.startsWith('.') && !importPath.includes('/internal/')) {
-      return true;
+    // Relative imports (rare in idiomatic Go but the grammar allows them).
+    if (importPath.startsWith('.')) {
+      return false;
+    }
+    // In-module imports look like `<module-path>/sub/pkg` — local to
+    // this project. Without the module-path check we'd flag every
+    // cross-package call in a Go monorepo as external (issue #388).
+    const mod = context?.getGoModule?.();
+    if (mod && (importPath === mod.modulePath || importPath.startsWith(mod.modulePath + '/'))) {
+      return false;
+    }
+    // `internal/` packages stay local even when go.mod is missing —
+    // preserves the pre-#388 escape hatch for repos without a parsed module path.
+    if (importPath.includes('/internal/')) {
+      return false;
     }
+    // Anything else is the Go standard library or a third-party module.
+    return true;
   }
 
   return false;
@@ -596,6 +610,15 @@ export function resolveViaImport(
     return null;
   }
 
+  // Go cross-package calls: `pkga.FuncX(...)` extracts to referenceName
+  // `pkga.FuncX` and the import `github.com/example/myproject/pkga`
+  // maps to a *package directory* containing one or more .go files.
+  // The generic file-based lookup below can't follow that — issue #388.
+  if (ref.language === 'go') {
+    const goResult = resolveGoCrossPackageReference(ref, imports, context);
+    if (goResult) return goResult;
+  }
+
   // Check if the reference name matches any import
   for (const imp of imports) {
     if (imp.localName === ref.referenceName || ref.referenceName.startsWith(imp.localName + '.')) {
@@ -636,6 +659,64 @@ export function resolveViaImport(
   return null;
 }
 
+/**
+ * Resolve a Go cross-package qualified reference (`pkga.FuncX`) by matching
+ * the package alias against an in-module import, stripping the module prefix
+ * to a project-relative directory, and locating the exported symbol in any
+ * `.go` file under that directory. Returns `null` for stdlib / third-party
+ * imports (no `go.mod`-relative match) so the rest of `resolveViaImport`
+ * can still try the file-based path.
+ */
+function resolveGoCrossPackageReference(
+  ref: UnresolvedRef,
+  imports: ImportMapping[],
+  context: ResolutionContext
+): ResolvedRef | null {
+  const mod = context.getGoModule?.();
+  if (!mod) return null;
+
+  // Qualified call: receiver before `.`, member after. A bare reference
+  // (no dot) is a same-file/in-package call — handled elsewhere.
+  const dotIdx = ref.referenceName.indexOf('.');
+  if (dotIdx <= 0) return null;
+  const receiver = ref.referenceName.substring(0, dotIdx);
+  const memberName = ref.referenceName.substring(dotIdx + 1);
+  if (!memberName) return null;
+
+  for (const imp of imports) {
+    if (imp.localName !== receiver) continue;
+    // Only in-module imports map to a known directory.
+    if (imp.source !== mod.modulePath && !imp.source.startsWith(mod.modulePath + '/')) {
+      continue;
+    }
+    const pkgDir = imp.source === mod.modulePath
+      ? ''
+      : imp.source.substring(mod.modulePath.length + 1);
+
+    // Look up the member by name and pick the candidate whose file lives
+    // directly in the package directory. Match the immediate parent dir
+    // exactly so a call to `pkga.FuncX` doesn't accidentally land on a
+    // `FuncX` declared in `pkga/subpkg/`.
+    const candidates = context.getNodesByName(memberName);
+    for (const node of candidates) {
+      if (node.language !== 'go') continue;
+      if (!node.isExported) continue;
+      const fp = node.filePath.replace(/\\/g, '/');
+      const lastSlash = fp.lastIndexOf('/');
+      const fileDir = lastSlash >= 0 ? fp.substring(0, lastSlash) : '';
+      if (fileDir === pkgDir) {
+        return {
+          original: ref,
+          targetNodeId: node.id,
+          confidence: 0.9,
+          resolvedBy: 'import',
+        };
+      }
+    }
+  }
+  return null;
+}
+
 /** Recursive depth cap for re-export chain following. Real codebases
  *  rarely chain barrels more than 2–3 deep; 8 is a generous safety
  *  net that still bounds worst-case work. */

+ 10 - 0
src/resolution/index.ts

@@ -21,6 +21,7 @@ import { resolveViaImport, extractImportMappings, extractReExports } from './imp
 import { detectFrameworks } from './frameworks';
 import { synthesizeCallbackEdges } from './callback-synthesizer';
 import { loadProjectAliases, type AliasMap } from './path-aliases';
+import { loadGoModule, type GoModule } from './go-module';
 import { logDebug } from '../errors';
 import type { ReExport } from './types';
 import { LRUCache } from './lru-cache';
@@ -157,6 +158,8 @@ export class ReferenceResolver {
   // `null` = computed and absent. Treated as immutable for the
   // resolver's lifetime; callers re-create the resolver if config changes.
   private projectAliases: AliasMap | null | undefined = undefined;
+  // go.mod module path. Same lazy/immutable convention as projectAliases.
+  private goModule: GoModule | null | undefined = undefined;
 
   constructor(projectRoot: string, queries: QueryBuilder) {
     this.projectRoot = projectRoot;
@@ -370,6 +373,13 @@ export class ReferenceResolver {
         return this.projectAliases;
       },
 
+      getGoModule: () => {
+        if (this.goModule === undefined) {
+          this.goModule = loadGoModule(this.projectRoot);
+        }
+        return this.goModule;
+      },
+
       getReExports: (filePath: string, language) => {
         const cached = this.reExportCache.get(filePath);
         if (cached) return cached;

+ 8 - 0
src/resolution/types.ts

@@ -91,6 +91,14 @@ export interface ResolutionContext {
    * compile without modification; production resolver implements it.
    */
   getProjectAliases?(): import('./path-aliases').AliasMap | null;
+  /**
+   * Go module info from `go.mod` at the project root. Returns `null`
+   * when the project has no `go.mod` (non-Go projects, pre-modules
+   * Go code, or projects whose modules live in subdirectories). Used
+   * by the Go branch of import resolution to distinguish in-module
+   * cross-package imports from third-party packages.
+   */
+  getGoModule?(): import('./go-module').GoModule | null;
   /**
    * Re-exports declared by a file (`export { x } from './other'`,
    * `export * from './other'`). Empty array when the file has none.