feat(impact): Rust cross-module coverage — struct literals, trait dispatch, use-binding linking

Continues the cross-language blast-radius work for Rust (ripgrep 63%->81%,
tokio src 70%->83% file-dependent coverage).

- Struct literals: add `struct_expression` to the instantiation kinds so
  `Widget { n: 1 }` / `m::Widget { .. }` record an `instantiates` edge.
- Trait dispatch (#584 for Rust): extract trait method declarations
  (`function_signature_item`) as method nodes so a trait's method set is
  first-class, and enable Rust in the interface-override bridge — a struct's
  explicit `impl Trait for T` now links `Trait.method -> impl.method`, so a type
  reached only through `&dyn Trait` gets dependents.
- `use` / `pub use` binding linking: emit the leaf binding of each use path so a
  `pub use self::sub::Item` re-export hub depends on the module it re-exports
  (the largest lever), and items imported but used only in non-call positions
  still link.
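
For orientation, the three source shapes covered by the bullets above can be
sketched in one file. This is a minimal illustration, not code from the test
suite: the `api`, `widget` and `render` modules and the `build` / `draw`
functions are hypothetical names, and the edge descriptions in the comments
restate this commit message rather than verified graph output.

```rust
mod api {
    mod widget {
        pub struct Widget {
            pub n: i32,
        }
    }
    // Re-export hub: the leaf binding `Widget` is what links `api` to `widget`.
    pub use self::widget::Widget;
}

mod render {
    pub trait Render {
        // Bodiless trait method declaration (`function_signature_item`).
        fn render(&self) -> i32;
    }
}

// Explicit `impl Trait for T`: the case the override bridge now handles.
impl render::Render for api::Widget {
    fn render(&self) -> i32 {
        self.n
    }
}

// Struct literal (`struct_expression`): recorded as an `instantiates` edge.
fn build() -> api::Widget {
    api::Widget { n: 1 }
}

// The call site names only the trait method, never `Widget` itself.
fn draw(r: &dyn render::Render) -> i32 {
    r.render()
}

fn main() {
    let w = build();
    println!("{}", draw(&w));
}
```

Before this change, per the description above, none of these three shapes
would have given `Widget` or its defining module a dependent.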

Known remaining Rust frontier: common-name re-export collisions
(`pub use self::read::read`), which need Rust module-path resolution, plus
macros and external-trait-only impls. Adds tests and a CHANGELOG entry.
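
The re-export collision is easiest to see in code. A minimal sketch, assuming
hypothetical `fs` and `net` modules (only the `pub use self::read::read` line
comes from this message): the bare leaf name `read` matches one module and two
functions, so a name-based match presumably cannot pick the right definition
without following the `self::read::` prefix.

```rust
mod fs {
    // The submodule and the function inside it share the name `read`.
    mod read {
        pub fn read() -> &'static str {
            "fs::read::read"
        }
    }
    // Re-export line: the leaf binding is just the bare name `read`.
    pub use self::read::read;
}

mod net {
    // An unrelated function with the same common name.
    pub fn read() -> &'static str {
        "net::read"
    }
}

fn main() {
    // rustc resolves each full path; the leaf name alone is ambiguous.
    println!("{}", fs::read());
    println!("{}", net::read());
}
```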

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby McHenry 2 weeks ago
parent
commit
b538aee853

+ 1 - 0
CHANGELOG.md

@@ -18,6 +18,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - `codegraph affected` now reports the tests and files that actually depend on your changes. It used to follow only `import` statements — but those never cross file boundaries in CodeGraph's graph — so it returned **no affected tests for any change, in every language**. It now traces the real cross-file usage graph (calls, references, instantiations, and class `extends` / `implements`), so `git diff --name-only | codegraph affected` surfaces the test files that exercise the changed code. Circular-dependency detection, which had the same blind spot, now works too.
 - Blast radius, callers, and `codegraph affected` now recognize far more of the dependencies that were already in your code. A symbol now counts as a dependency whether it's called, used only in a type annotation inside a function body (`const items: Foo[] = []`), imported and placed in a registry array or passed as an argument, used as a JSX component, simply re-exported from a barrel (`export { X } from './x'`), or pulled in as a namespace (`import * as ns from '@/x'`) — including through tsconfig path aliases like `@/`. Previously only called, instantiated, or signature-typed symbols created a cross-file link, so a file that used a dependency in any other way could look like it depended on nothing — and the file that defined a widely-used symbol could look like nothing depended on it. The graph still indexes exactly the same symbols; it just connects the ones that were already there. (TypeScript/JavaScript)
 - The same completeness fix now applies to **Python**: a name brought in with `from module import X` is recorded as a dependency on that module even when `X` is only stored in a list/dict, passed as an argument, used as a decorator, or re-exported through an `__init__.py`. Previously Python linked only imports that were called or instantiated, so a module consumed purely by value — or only re-exported — looked like nothing depended on it.
+- Rust impact and `codegraph affected` now connect far more of the module graph. Struct literals (`Widget { n: 1 }`) are recorded as instantiations. A `use` / `pub use` brings its item into the dependency graph, so a `pub use` re-export hub (a `mod.rs` re-exporting its submodules) depends on the modules it re-exports. Trait dispatch reaches implementations: a struct's explicit `impl Trait for T` links each trait method to its implementation, so a type reached only through `&dyn Trait` gets dependents. Previously a Rust type linked only when called or used in a type position, so structs built by literal, modules surfaced only through `pub use`, and trait-only implementations looked like they had no dependents. (#584 for Rust traits)
 - C# `record` types are now indexed. `record`, `record class`, and `record struct` declarations (everywhere in modern C# — DTOs, value objects, CQRS messages, MediatR notifications) were previously skipped entirely, so every reference, generic type argument (`IEnumerable<MyRecord>`), and `new MyRecord(...)` pointed at nothing and the file defining a record looked like it had no callers or dependents. (#237)
 - Go interfaces now connect to their implementations. Go has no `implements` keyword — a type satisfies an interface just by having the right methods — so CodeGraph now infers that link: a struct whose methods cover an interface's method set is treated as implementing it, and a call through the interface (`API.Marshal(...)`) reaches every concrete implementation. This means a type used only via an interface (the common plugin/strategy pattern — e.g. JSON-codec or renderer implementations selected at runtime) is no longer reported as having no callers or no dependents, and impact now flows from an interface method to the implementations behind it. (#584)
 - Go now records cross-package struct creation. A composite literal like `render.XML{...}` or `pkga.Widget{...}` — including ones registered in a package-level `var registry = map[string]R{...}` — now links to the package that defines the type. Cross-package function calls and type references already resolved; this closes struct instantiation, so a package whose types are only *constructed* elsewhere (a common pattern for interface implementations) is no longer reported as having no dependents. Go type conversions such as `(*Wrapped)(x)` now link to the converted-to type as well.

+ 62 - 0
__tests__/extraction.test.ts

@@ -4778,3 +4778,65 @@ describe('C# records (blast-radius recall)', () => {
     }
   });
 });
+
+describe('Rust cross-module recall', () => {
+  function rustProject(files: Record<string, string>): string {
+    const dir = createTempDir();
+    fs.writeFileSync(path.join(dir, 'Cargo.toml'), '[package]\nname = "proj"\nversion = "0.1.0"\nedition = "2021"\n');
+    fs.mkdirSync(path.join(dir, 'src'), { recursive: true });
+    for (const [rel, content] of Object.entries(files)) {
+      const full = path.join(dir, 'src', rel);
+      fs.mkdirSync(path.dirname(full), { recursive: true });
+      fs.writeFileSync(full, content);
+    }
+    return dir;
+  }
+
+  it('extracts a struct literal `Foo { .. }` as an instantiation across modules', async () => {
+    const dir = rustProject({
+      'lib.rs': 'pub mod types;\npub mod consumer;\n',
+      'types.rs': 'pub struct Widget { pub n: i32 }\n',
+      'consumer.rs': 'use crate::types::Widget;\npub fn build() -> Widget { Widget { n: 1 } }\n',
+    });
+    try {
+      const cg = CodeGraph.initSync(dir, { config: { include: ['src/**/*.rs'], exclude: [] } });
+      await cg.indexAll();
+      cg.resolveReferences();
+      expect(cg.getFileDependents('src/types.rs')).toContain('src/consumer.rs');
+      cg.destroy();
+    } finally { cleanupTempDir(dir); }
+  });
+
+  it('extracts trait method declarations and bridges trait dispatch to the impl', async () => {
+    const dir = rustProject({
+      'lib.rs': 'pub mod types;\npub mod consumer;\n',
+      'types.rs': 'pub trait Render { fn render(&self) -> i32; }\n',
+      // Mine implements Render via an explicit `impl Render for Mine`; such a type is typically reached via &dyn Render dispatch.
+      'consumer.rs': 'use crate::types::Render;\npub struct Mine { pub x: i32 }\nimpl Render for Mine { fn render(&self) -> i32 { self.x } }\n',
+    });
+    try {
+      const cg = CodeGraph.initSync(dir, { config: { include: ['src/**/*.rs'], exclude: [] } });
+      await cg.indexAll();
+      cg.resolveReferences();
+      // implements edge (Mine -> Render) makes consumer.rs a dependent of types.rs, where the trait is defined.
+      expect(cg.getFileDependents('src/types.rs')).toContain('src/consumer.rs');
+      cg.destroy();
+    } finally { cleanupTempDir(dir); }
+  });
+
+  it('links `pub use` re-export hubs to the modules they re-export', async () => {
+    const dir = rustProject({
+      'lib.rs': 'pub mod api;\n',
+      'api/mod.rs': 'mod widget;\npub use self::widget::Widget;\n',
+      'api/widget.rs': 'pub struct Widget { pub n: i32 }\n',
+    });
+    try {
+      const cg = CodeGraph.initSync(dir, { config: { include: ['src/**/*.rs'], exclude: [] } });
+      await cg.indexAll();
+      cg.resolveReferences();
+      // The re-export hub depends on the module it re-exports from.
+      expect(cg.getFileDependents('src/api/widget.rs')).toContain('src/api/mod.rs');
+      cg.destroy();
+    } finally { cleanupTempDir(dir); }
+  });
+});

+ 6 - 2
src/extraction/languages/rust.ts

@@ -3,9 +3,13 @@ import { getNodeText, getChildByField } from '../tree-sitter-helpers';
 import type { LanguageExtractor } from '../tree-sitter-types';
 
 export const rustExtractor: LanguageExtractor = {
-  functionTypes: ['function_item'],
+  // `function_signature_item` is a trait method DECLARATION (`fn render(&self);`,
+  // no body). Extracting it makes a trait's method set first-class, which
+  // impl-navigation and trait-dispatch synthesis need (a struct's method set is
+  // matched against the trait's).
+  functionTypes: ['function_item', 'function_signature_item'],
   classTypes: [], // Rust has impl blocks
-  methodTypes: ['function_item'], // Methods are functions in impl blocks
+  methodTypes: ['function_item', 'function_signature_item'], // Methods are functions in impl blocks, plus bodiless trait declarations
   interfaceTypes: ['trait_item'],
   structTypes: ['struct_item'],
   enumTypes: ['enum_item'],

+ 69 - 0
src/extraction/tree-sitter.ts

@@ -125,6 +125,7 @@ const INSTANTIATION_KINDS: ReadonlySet<string> = new Set([
   'object_creation_expression',      // java / c#
   'instance_creation_expression',    // some grammars
   'composite_literal',               // go — `Widget{...}` / `pkga.Widget{...}`
+  'struct_expression',               // rust — `Widget { n: 1 }` / `m::Widget { .. }`
 ]);
 
 /**
@@ -1692,6 +1693,14 @@ export class TreeSitterExtractor {
           const parentId = this.nodeStack[this.nodeStack.length - 1];
           if (parentId) this.emitPyFromImportRefs(node, parentId);
         }
+        // Rust `use crate::m::Item;` / `pub use self::sub::Item;` — link each
+        // imported leaf to its definition. Covers `pub use` re-export hubs
+        // (a `mod.rs` re-exporting submodule items, e.g. tokio's `fs/mod.rs`)
+        // and items imported but used in non-call/non-type positions.
+        if (this.language === 'rust' && node.type === 'use_declaration') {
+          const parentId = this.nodeStack[this.nodeStack.length - 1];
+          if (parentId) this.emitRustUseBindingRefs(node, parentId);
+        }
         return;
       }
       // Hook returned null — fall through to multi-import inline handlers only
@@ -1878,6 +1887,66 @@ export class TreeSitterExtractor {
     }
   }
 
+  /**
+   * Emit one `imports` reference per leaf binding of a Rust `use` declaration —
+   * `use crate::m::Item`, `use crate::m::{A, B as C}`, `pub use self::sub::Item`.
+   * The leaf name (the defining symbol, not the local alias) is resolved by the
+   * name-matcher to its definition, so a `pub use` re-export hub (a `mod.rs`
+   * re-exporting submodule items) depends on the modules it re-exports, and a
+   * `use`d item that's only stored/passed (not called/typed) still links.
+   * `use ...::*` and bare `self`/`super`/`crate` segments have no leaf to link.
+   */
+  private emitRustUseBindingRefs(node: SyntaxNode, fromNodeId: string): void {
+    const leaves: SyntaxNode[] = [];
+    const collect = (n: SyntaxNode): void => {
+      switch (n.type) {
+        case 'identifier':
+          leaves.push(n);
+          break;
+        case 'scoped_identifier': {
+          // `a::b::C` → the leaf is the final `name` segment.
+          const name = getChildByField(n, 'name') ?? n.namedChild(n.namedChildCount - 1);
+          if (name && name.type === 'identifier') leaves.push(name);
+          break;
+        }
+        case 'scoped_use_list': {
+          // `path::{ ... }` → recurse into the list; the path prefix isn't a leaf.
+          const list = getChildByField(n, 'list') ?? n.namedChildren.find((c) => c.type === 'use_list');
+          if (list) collect(list);
+          break;
+        }
+        case 'use_list':
+          for (let i = 0; i < n.namedChildCount; i++) {
+            const c = n.namedChild(i);
+            if (c) collect(c);
+          }
+          break;
+        case 'use_as_clause': {
+          // `Path as Alias` → link the source path (the definition), not the alias.
+          const path = getChildByField(n, 'path') ?? n.namedChild(0);
+          if (path) collect(path);
+          break;
+        }
+        // use_wildcard / self / super / crate → no specific leaf to link.
+      }
+    };
+    for (let i = 0; i < node.namedChildCount; i++) {
+      const c = node.namedChild(i);
+      if (c) collect(c);
+    }
+    for (const leaf of leaves) {
+      const name = getNodeText(leaf, this.source);
+      if (!name || name === 'self' || name === 'super' || name === 'crate') continue;
+      this.unresolvedReferences.push({
+        fromNodeId,
+        referenceName: name,
+        referenceKind: 'imports',
+        line: leaf.startPosition.row + 1,
+        column: leaf.startPosition.column,
+      });
+    }
+  }
+
   /**
    * Emit one `imports` reference per name imported in a Python
    * `from module import A, B as C` statement, attributed to the file node — so

+ 1 - 1
src/resolution/callback-synthesizer.ts

@@ -444,7 +444,7 @@ function cppOverrideEdges(queries: QueryBuilder): Edge[] {
 // and are added below; their concrete-side nodes can be a `struct` (Swift)
 // or an `object` (Scala) so the loop also iterates those kinds.
 const IFACE_OVERRIDE_LANGS = new Set([
-  'java', 'kotlin', 'csharp', 'typescript', 'javascript', 'swift', 'scala', 'go',
+  'java', 'kotlin', 'csharp', 'typescript', 'javascript', 'swift', 'scala', 'go', 'rust',
 ]);
 /**
  * Go implicit interface satisfaction (#584). Go has no `implements` keyword — a