
feat(impact): PHP namespace capture + use-import resolution + type-hint refs

PHP extraction ignored namespaces entirely: every class's qualified name was its
bare simple name, so a framework's many same-named classes (Laravel has 7+
`Factory` interfaces and several `Dispatcher`s across namespaces) collapsed into
one arbitrary match, and `use` imports created nodes that never resolved to a
definition. As a result, an interface that is imported and constructor-injected
(the DI pattern) showed zero dependents.

- Capture `namespace Foo\Bar;` as a file-level package (like a Java/Kotlin
  package) so each class is scoped to `Foo\Bar::Class`, which makes same-named
  classes distinguishable.
- Resolve `use Foo\Bar\Baz;` (single and grouped) to the namespace-qualified
  definition via an `imports` ref that the resolver matches by exact qualified
  name, so a contract that is imported and DI-injected records a precise
  cross-file dependency.
- Add `php` to the type-annotation languages with a PHP-aware walker. Parameter,
  property, and return type-hints parse as `named_type`/`union_type`/… wrapping
  `name`/`qualified_name` (never `type_identifier`), so the generic walker
  emitted nothing for them. The PHP walker visits only type positions, so
  `variable_name` (`$x`) parameter names are never mis-emitted as refs.
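
A minimal sketch of the `use`-import resolution described above, not the actual resolver. The only detail taken from this commit is the stored `Foo\Bar::Baz` form; the helper names `toStoredQualifiedName` and `resolveUse` and the `definitions` map are hypothetical stand-ins for illustration.

```typescript
// Hypothetical helper: convert a PHP FQN (`Foo\Bar\Baz`) to the stored
// `Foo\Bar::Baz` form. Returns null for a global-namespace class, which has
// no namespace part to qualify.
function toStoredQualifiedName(fqn: string): string | null {
  const clean = fqn.replace(/^\\/, ''); // drop a leading `\` if present
  const lastSep = clean.lastIndexOf('\\');
  if (lastSep < 0) return null;
  return `${clean.slice(0, lastSep)}::${clean.slice(lastSep + 1)}`;
}

// Toy definition index keyed by qualified name (an assumption for this
// sketch; the real graph stores nodes, not file paths).
const definitions = new Map<string, string>([
  ['Contracts\\Cache::Factory', 'src/Cache/Factory.php'],
  ['Contracts\\Mail::Factory', 'src/Mail/Factory.php'],
]);

// Exact-match lookup: two classes that share the simple name `Factory`
// resolve to different definitions because the namespace is part of the key.
function resolveUse(fqn: string): string | undefined {
  const key = toStoredQualifiedName(fqn);
  return key === null ? undefined : definitions.get(key);
}
```

Under these assumptions, `resolveUse('Contracts\\Cache\\Factory')` and `resolveUse('Contracts\\Mail\\Factory')` land on different files, whereas a lookup keyed on the bare name `Factory` could not tell them apart.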

Measured (fair cross-file dependent coverage, symbol-bearing source files):
guzzle 95.2% → 100.0%, laravel 80.5% → 94.9%. Node count grows only by the new
namespace nodes (one per file, each a real symbol, like JVM package nodes), not
by duplication, and the new edges are precise. The remaining laravel gap is
genuine frontiers: service providers / controllers / middleware registered by
class-string, console view components dispatched by magic `__call`, and mixin
traits meant for app consumption. Full suite green; non-PHP languages are
unaffected (changes are gated to `language === 'php'`).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby McHenry 2 weeks ago
parent
commit
acfb4445a1

+ 1 - 0
CHANGELOG.md

@@ -23,6 +23,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - Java annotations are now connected. Annotation definitions (`@interface Foo`) are indexed as types, and every `@Foo` usage on a class, method, or field is recorded as a dependency on it. Previously neither side was captured — annotation usages were dropped (they live inside the declaration's modifiers) and `@interface` types weren't indexed at all — so annotation-driven code (Spring `@GetMapping`, JPA `@Entity`, Gson `@SerializedName`, …) showed the annotation as having no users and the annotated class as not depending on it.
 - Kotlin Multiplatform `expect`/`actual` declarations are now connected. A platform implementation — `actual fun`, `actual class`, or an `actual typealias` in a `jvm` / `native` / `js` / `wasm` source set — is linked to the common `expect` declaration it fulfills (including the common case of an `expect class` fulfilled by an `actual typealias`). Previously a caller in common code resolved to the `expect` declaration, so every platform `actual` looked like it had no dependents and editing one showed an empty blast radius; now changing a platform implementation surfaces the common API and everything that uses it. (Kotlin)
 - Scala impact and `codegraph affected` now connect the type graph that typeclass-style code is built on. A parameterized supertype (`trait Monoid[A] extends Semigroup[A] with Serializable`) now links to each parent; a type used in a `val`/`def` signature, as a type argument, or as a context bound (`def f[A: Monoid]`) — including the trailing implicit parameter list (`(implicit M: Monoid[A])`) where typeclass instances are passed — now records a dependency; and `new T[...] { … }` counts as an instantiation. Previously Scala linked only plain calls and bare, non-generic supertypes, so a trait extended with type parameters, used as a type, or required as an implicit looked like nothing depended on it — which on a typeclass-heavy codebase (cats, algebra) was most of the graph. (Scala)
+- PHP impact and `codegraph affected` now understand namespaces and `use` imports. Classes are tracked by their namespaced name, so the many same-named classes a framework defines (Laravel has 7+ `Factory` interfaces, several `Dispatcher`s, across namespaces) are told apart instead of collapsing into one arbitrary match. A `use App\Contracts\Cache\Factory;` now records a dependency on exactly that class — so a contract or interface that's imported and constructor-injected (the dependency-injection pattern) is no longer reported as having no dependents — and parameter, property, and return type-hints are recorded too. Previously PHP ignored namespaces entirely and linked only calls, `new`, and inheritance. (PHP)
 - C# `record` types are now indexed. `record`, `record class`, and `record struct` declarations (everywhere in modern C# — DTOs, value objects, CQRS messages, MediatR notifications) were previously skipped entirely, so every reference, generic type argument (`IEnumerable<MyRecord>`), and `new MyRecord(...)` pointed at nothing and the file defining a record looked like it had no callers or dependents. (#237)
 - Go interfaces now connect to their implementations. Go has no `implements` keyword — a type satisfies an interface just by having the right methods — so CodeGraph now infers that link: a struct whose methods cover an interface's method set is treated as implementing it, and a call through the interface (`API.Marshal(...)`) reaches every concrete implementation. This means a type used only via an interface (the common plugin/strategy pattern — e.g. JSON-codec or renderer implementations selected at runtime) is no longer reported as having no callers or no dependents, and impact now flows from an interface method to the implementations behind it. (#584)
 - Go now records cross-package struct creation. A composite literal like `render.XML{...}` or `pkga.Widget{...}` — including ones registered in a package-level `var registry = map[string]R{...}` — now links to the package that defines the type. Cross-package function calls and type references already resolved; this closes struct instantiation, so a package whose types are only *constructed* elsewhere (a common pattern for interface implementations) is no longer reported as having no dependents. Go type conversions such as `(*Wrapped)(x)` now link to the converted-to type as well.

+ 86 - 0
__tests__/extraction.test.ts

@@ -3359,6 +3359,92 @@ object Folding {
   });
 });
 
+describe('PHP namespace + import resolution', () => {
+  let tempDir: string;
+  let cg: CodeGraph;
+
+  beforeEach(() => {
+    tempDir = createTempDir();
+  });
+
+  afterEach(() => {
+    if (cg) cg.close();
+    if (fs.existsSync(tempDir)) fs.rmSync(tempDir, { recursive: true, force: true });
+  });
+
+  it('resolves `use` imports to the namespace-qualified definition and type-hints across files', async () => {
+    const src = path.join(tempDir, 'src');
+    // Two interfaces with the SAME simple name in different namespaces — the
+    // exact ambiguity (Laravel has 7+ `Factory`) that bare-name matching can't
+    // resolve. The namespace qualifies them; the `use` import disambiguates.
+    fs.mkdirSync(path.join(src, 'Cache'), { recursive: true });
+    fs.mkdirSync(path.join(src, 'Mail'), { recursive: true });
+    fs.mkdirSync(path.join(src, 'App'), { recursive: true });
+    fs.writeFileSync(
+      path.join(src, 'Cache', 'Factory.php'),
+      `<?php
+namespace Contracts\\Cache;
+
+interface Factory {
+    public function store(): object;
+}
+`
+    );
+    fs.writeFileSync(
+      path.join(src, 'Mail', 'Factory.php'),
+      `<?php
+namespace Contracts\\Mail;
+
+interface Factory {
+    public function mailer(): object;
+}
+`
+    );
+    fs.writeFileSync(
+      path.join(src, 'App', 'Service.php'),
+      `<?php
+namespace App;
+
+use Contracts\\Cache\\Factory;
+
+class Service {
+    public function make(): Factory {
+        return resolve(Factory::class);
+    }
+}
+`
+    );
+
+    cg = CodeGraph.initSync(tempDir);
+    await cg.indexAll();
+    cg.resolveReferences();
+
+    // The PHP namespace is captured into the qualified name, so the two
+    // same-named interfaces are distinguishable.
+    const cacheFactory = cg
+      .getNodesByKind('interface')
+      .find((n) => n.qualifiedName === 'Contracts\\Cache::Factory');
+    const mailFactory = cg
+      .getNodesByKind('interface')
+      .find((n) => n.qualifiedName === 'Contracts\\Mail::Factory');
+    expect(cacheFactory).toBeDefined();
+    expect(mailFactory).toBeDefined();
+
+    // Service `use`s Contracts\Cache\Factory, so editing THAT interface reaches
+    // Service.php — and editing the same-named Contracts\Mail\Factory must NOT
+    // (the import resolved to the right namespace, not an arbitrary `Factory`).
+    const serviceFile = 'src/App/Service.php';
+    const cacheReaches = [...cg.getImpactRadius(cacheFactory!.id, 3).nodes.values()].some(
+      (n) => (n.filePath ?? '').endsWith(serviceFile)
+    );
+    const mailReaches = [...cg.getImpactRadius(mailFactory!.id, 3).nodes.values()].some(
+      (n) => (n.filePath ?? '').endsWith(serviceFile)
+    );
+    expect(cacheReaches).toBe(true);
+    expect(mailReaches).toBe(false);
+  });
+});
+
 describe('Full Indexing', () => {
   let tempDir: string;
 

+ 12 - 0
src/extraction/languages/php.ts

@@ -78,6 +78,18 @@ export const phpExtractor: LanguageExtractor = {
 
     return false;
   },
+  // PHP `namespace Foo\Bar;` is file-level (like a Java/Kotlin package). Capturing
+  // it scopes every class under a `Foo\Bar::` qualified name, which is what makes
+  // `use` imports and same-named types (Laravel has 7+ `Factory` interfaces across
+  // namespaces) resolvable to the RIGHT definition instead of an arbitrary match.
+  packageTypes: ['namespace_definition'],
+  extractPackage: (node, source) => {
+    const nsName = node.namedChildren.find((c: SyntaxNode) => c.type === 'namespace_name');
+    // Skip braced `namespace Foo { … }` (has a body) — file-level only.
+    const hasBody = node.namedChildren.some((c: SyntaxNode) => c.type === 'compound_statement' || c.type === 'declaration_list');
+    if (!nsName || hasBody) return null;
+    return getNodeText(nsName, source);
+  },
   extractImport: (node, source) => {
     const importText = source.substring(node.startIndex, node.endIndex).trim();
 

+ 129 - 1
src/extraction/tree-sitter.ts

@@ -147,6 +147,18 @@ function scalaBaseTypeName(node: SyntaxNode | null, source: string): string | nu
   }
 }
 
+/**
+ * PHP type-position wrapper node kinds (a type-hint is `named_type`,
+ * `?Foo` is `optional_type`, `A|B` is `union_type`, `A&B` is
+ * `intersection_type`). Used to find the type subtree inside a parameter /
+ * property / return position before walking it for class references.
+ */
+const PHP_TYPE_NODES: ReadonlySet<string> = new Set([
+  'named_type', 'optional_type', 'nullable_type',
+  'union_type', 'intersection_type', 'disjunctive_normal_form_type',
+  'primitive_type',
+]);
+
 /**
  * Tree-sitter node kinds that represent constructor invocations
  * (`new Foo()` and friends). Used by extractInstantiation to emit
@@ -1760,6 +1772,13 @@ export class TreeSitterExtractor {
           const parentId = this.nodeStack[this.nodeStack.length - 1];
           if (parentId) this.emitRustUseBindingRefs(node, parentId);
         }
+        // PHP `use Foo\Bar\Baz;` — link to the namespace-qualified definition so
+        // an imported-but-DI-injected contract (Laravel's pattern) records a
+        // cross-file dependency. Grouped imports are handled in their own branch.
+        if (this.language === 'php' && node.type === 'namespace_use_declaration') {
+          const parentId = this.nodeStack[this.nodeStack.length - 1];
+          if (parentId) this.emitPhpUseRefs(node, parentId);
+        }
         return;
       }
       // Hook returned null — fall through to multi-import inline handlers only
@@ -1847,6 +1866,8 @@ export class TreeSitterExtractor {
             this.createNode('import', fullPath, node, {
               signature: importText,
             });
+            const parentId = this.nodeStack[this.nodeStack.length - 1];
+            if (parentId) this.pushPhpUseRef(fullPath, parentId, node);
           }
         }
         return;
@@ -2011,6 +2032,36 @@ export class TreeSitterExtractor {
     }
   }
 
+  /**
+   * Emit an `imports` reference for a single PHP `use Foo\Bar\Baz;` (grouped
+   * imports `use Foo\{A, B}` are handled where their per-item nodes are created).
+   * The reference targets the namespace-qualified `Foo\Bar::Baz` form classes are
+   * stored under (see the PHP `namespace` capture), so it resolves to the RIGHT
+   * definition — Laravel has many same-named contracts (`Factory`, `Dispatcher`,
+   * `Guard`) across namespaces that a bare-name match can't disambiguate.
+   */
+  private emitPhpUseRefs(node: SyntaxNode, fromNodeId: string): void {
+    const clause = node.namedChildren.find((c: SyntaxNode) => c.type === 'namespace_use_clause');
+    if (!clause) return;
+    const qn = clause.namedChildren.find((c: SyntaxNode) => c.type === 'qualified_name')
+      ?? clause.namedChildren.find((c: SyntaxNode) => c.type === 'name');
+    if (qn) this.pushPhpUseRef(getNodeText(qn, this.source), fromNodeId, node);
+  }
+
+  /** Convert a PHP FQN `Foo\Bar\Baz` to the stored `Foo\Bar::Baz` and emit an `imports` ref. */
+  private pushPhpUseRef(fqn: string, fromNodeId: string, node: SyntaxNode): void {
+    const clean = fqn.replace(/^\\/, '');
+    const lastSep = clean.lastIndexOf('\\');
+    if (lastSep < 0) return; // global-namespace class — already matches by simple name
+    this.unresolvedReferences.push({
+      fromNodeId,
+      referenceName: `${clean.slice(0, lastSep)}::${clean.slice(lastSep + 1)}`,
+      referenceKind: 'imports',
+      line: node.startPosition.row + 1,
+      column: node.startPosition.column,
+    });
+  }
+
   /**
    * Emit one `imports` reference per name imported in a Python
    * `from module import A, B as C` statement, attributed to the file node — so
@@ -2984,7 +3035,16 @@ export class TreeSitterExtractor {
    * Languages that support type annotations (TypeScript, etc.)
    */
   private readonly TYPE_ANNOTATION_LANGUAGES = new Set([
-    'typescript', 'tsx', 'dart', 'kotlin', 'swift', 'rust', 'go', 'java', 'csharp', 'scala',
+    'typescript', 'tsx', 'dart', 'kotlin', 'swift', 'rust', 'go', 'java', 'csharp', 'scala', 'php',
+  ]);
+
+  /**
+   * PHP pseudo-types and `self`/`static`/`parent` that aren't project symbols.
+   * (Scalar primitives parse as `primitive_type` and are skipped structurally.)
+   */
+  private readonly PHP_PSEUDO_TYPES = new Set([
+    'self', 'static', 'parent', 'mixed', 'object', 'iterable', 'callable', 'void',
+    'null', 'false', 'true', 'never', 'array', 'int', 'float', 'string', 'bool',
   ]);
 
   /**
@@ -3025,6 +3085,17 @@ export class TreeSitterExtractor {
       return;
     }
 
+    // PHP type-hints are `named_type`/`optional_type`/`union_type` wrapping a
+    // `name`/`qualified_name` — never `type_identifier` — so the generic walker
+    // below emits nothing for them. Dispatch to a PHP-aware path that walks only
+    // type positions (parameter / return / property types), so type-hinted
+    // dependencies (the constructor-injected contracts that dominate Laravel) are
+    // recorded and a `variable_name` like `$events` never mis-emits as a ref.
+    if (this.language === 'php') {
+      this.extractPhpTypeRefs(node, nodeId);
+      return;
+    }
+
     // Extract parameter type annotations. Scala curries — `def f(a)(implicit
     // M: TC)` has MULTIPLE `parameters` siblings, and the typeclass is almost
     // always in the trailing implicit list — so walk every parameter list, not
@@ -3178,6 +3249,63 @@ export class TreeSitterExtractor {
     }
   }
 
+  /**
+   * Extract PHP type references from a method/function/property declaration.
+   * Walks ONLY type positions: each parameter's type child (inside
+   * `formal_parameters`), the return type, and a property's type — all
+   * `named_type` / `optional_type` / `union_type` / … direct children. Parameter
+   * and property NAMES are `variable_name` (`$x`), never type nodes, so they
+   * can't be mis-emitted.
+   */
+  private extractPhpTypeRefs(node: SyntaxNode, nodeId: string): void {
+    const params = node.namedChildren.find((c: SyntaxNode) => c.type === 'formal_parameters');
+    if (params) {
+      for (const p of params.namedChildren) {
+        // simple_parameter / property_promotion_parameter / variadic_parameter
+        for (const c of p.namedChildren) {
+          if (PHP_TYPE_NODES.has(c.type)) this.walkPhpTypePosition(c, nodeId);
+        }
+      }
+    }
+    // Return type (method/function) and property type are TYPE nodes that are
+    // DIRECT children of the declaration.
+    for (const c of node.namedChildren) {
+      if (PHP_TYPE_NODES.has(c.type)) this.walkPhpTypePosition(c, nodeId);
+    }
+  }
+
+  /** Walk a PHP subtree KNOWN to be in a type position; emit class/interface refs. */
+  private walkPhpTypePosition(node: SyntaxNode, fromNodeId: string): void {
+    if (node.type === 'primitive_type') return; // int/string/void/…
+    if (node.type === 'name') {
+      const name = getNodeText(node, this.source);
+      if (name && !this.PHP_PSEUDO_TYPES.has(name)) {
+        this.unresolvedReferences.push({
+          fromNodeId, referenceName: name, referenceKind: 'references',
+          line: node.startPosition.row + 1, column: node.startPosition.column,
+        });
+      }
+      return;
+    }
+    if (node.type === 'qualified_name') {
+      // `App\Contracts\Logger` → match on the trailing simple name (what the
+      // class node is stored as, and what a `use` import brings into scope).
+      const last = getNodeText(node, this.source).split('\\').pop() ?? '';
+      if (last && !this.PHP_PSEUDO_TYPES.has(last)) {
+        this.unresolvedReferences.push({
+          fromNodeId, referenceName: last, referenceKind: 'references',
+          line: node.startPosition.row + 1, column: node.startPosition.column,
+        });
+      }
+      return;
+    }
+    // optional_type / nullable_type / union_type / intersection_type / named_type → recurse
+    for (let i = 0; i < node.namedChildCount; i++) {
+      const child = node.namedChild(i);
+      if (child) this.walkPhpTypePosition(child, fromNodeId);
+    }
+  }
+
   /**
    * Extract type references from a variable's type annotation.
    */