
feat(impact): Objective-C — single-arg selectors, class-message receivers, #import, class-method resolution

Objective-C was the worst-covered README language (~33-50%). Four gaps:

- Single-argument selectors: `[cache storeImage:key]` (the most common call
  form) was named `storeImage` — without the colon — at the call site, so it
  never matched its `storeImage:` method. A single keyword now gets its colon
  when the message has an argument (distinguished from the unary `[c reset]`).
- Class-message receiver: `[SDImageCache sharedCache]` / `[[Foo alloc] init]`
  resolved the method but never referenced the CLASS — whose @interface lives in
  the header. Emit a `references` edge to a capitalized message receiver, so a
  class used via class messages (alloc/init, singletons, factories) and its
  header record a dependent.
- `#import "Foo.h"`: the import created a node but never resolved cross-file
  (a bare `Foo.h` has no slash). matchByFilePath now resolves a bare filename
  with a short extension to the header by basename — so every imported header is
  covered.
- Class-method calls: `SDImageCache.storeImage:` didn't resolve because the
  method-call regex rejected the trailing-colon selector; broadened to allow
  colon keywords (a no-op for other languages).

Measured (fair cross-file dependent coverage): AFNetworking 50.0% → 90.0%;
SDWebImage (Core) 33.8% → 91.6%. Node count stable. Residual = public-API
category methods called only by app code (a frontier). Full suite green; the
shared matchByFilePath / method-regex changes are no-ops for other languages
(only bare-filename refs and colon-bearing selectors are newly handled).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby McHenry 2 weeks ago
parent
commit
33ce431
4 changed files with 116 additions and 7 deletions
  1. CHANGELOG.md (+1 -0)
  2. __tests__/extraction.test.ts (+74 -0)
  3. src/extraction/tree-sitter.ts (+29 -4)
  4. src/resolution/name-matcher.ts (+12 -3)

+ 1 - 0
CHANGELOG.md

@@ -24,6 +24,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - Kotlin Multiplatform `expect`/`actual` declarations are now connected. A platform implementation — `actual fun`, `actual class`, or an `actual typealias` in a `jvm` / `native` / `js` / `wasm` source set — is linked to the common `expect` declaration it fulfills (including the common case of an `expect class` fulfilled by an `actual typealias`). Previously a caller in common code resolved to the `expect` declaration, so every platform `actual` looked like it had no dependents and editing one showed an empty blast radius; now changing a platform implementation surfaces the common API and everything that uses it. (Kotlin)
 - Scala impact and `codegraph affected` now connect the type graph that typeclass-style code is built on. A parameterized supertype (`trait Monoid[A] extends Semigroup[A] with Serializable`) now links to each parent; a type used in a `val`/`def` signature, as a type argument, or as a context bound (`def f[A: Monoid]`) — including the trailing implicit parameter list (`(implicit M: Monoid[A])`) where typeclass instances are passed — now records a dependency; and `new T[...] { … }` counts as an instantiation. Previously Scala linked only plain calls and bare, non-generic supertypes, so a trait extended with type parameters, used as a type, or required as an implicit looked like nothing depended on it — which on a typeclass-heavy codebase (cats, algebra) was most of the graph. (Scala)
 - PHP impact and `codegraph affected` now understand namespaces and `use` imports. Classes are tracked by their namespaced name, so the many same-named classes a framework defines (Laravel has 7+ `Factory` interfaces, several `Dispatcher`s, across namespaces) are told apart instead of collapsing into one arbitrary match. A `use App\Contracts\Cache\Factory;` now records a dependency on exactly that class — so a contract or interface that's imported and constructor-injected (the dependency-injection pattern) is no longer reported as having no dependents — and parameter, property, and return type-hints are recorded too. Previously PHP ignored namespaces entirely and linked only calls, `new`, and inheritance. (PHP)
+- Objective-C impact and `codegraph affected` are dramatically more complete. Four gaps are fixed: a single-argument message (`[cache storeImage:key]` — the most common call form) now matches its `storeImage:` method instead of dropping the colon; a class-message receiver (`[SDImageCache sharedCache]`, `[[Foo alloc] init]`) now records a dependency on the class, whose `@interface` lives in the header; `#import "Foo.h"` now resolves to the header file, so a header is no longer reported as having no dependents; and class-method message calls now resolve through the receiver type. Together these took typical libraries from a third-to-half of their files showing real dependents to ~90%. (Objective-C)
 - A type referenced only through a static member or enum value now records a dependency. Reading an enum value (`MediaKind.video`), a static constant (`Colors.red`, `JsonScope.EMPTY_DOCUMENT`), or a class constant (`Foo::BAR`) now links to the type — previously only method calls and `new` did, so a type or enum used purely *by value* (enum-heavy APIs, constants classes — a very common pattern) looked like nothing depended on it. Applies to Java, C#, Kotlin, Swift, Scala, Dart, PHP, and C++.
 - Dart impact and `codegraph affected` now follow mixins and method type annotations. A `with` mixin — Dart's core composition mechanism, which Flutter is built on — now records a dependency, so editing a mixin surfaces every class that mixes it in (the whole `with` clause used to be dropped, and a class declared `with M` alone even lost its real superclass link). And types used in a method's parameters or return value now link to their definition, so a class or enum referenced only as a type — not constructed or called — is no longer reported as having no dependents. (Dart)
 - C++ free functions are now indexed under their real name. A function written with a qualified-type parameter (`std::string TableFileName(const std::string& dbname)`) or an `auto … -> std::string` trailing return type was mistakenly named after that type (`string`), so calls to it never resolved, `codegraph_node` couldn't find it by name, and the file defining it looked like nothing depended on it. The function now keeps its real name, so cross-file calls, callers, and blast radius work — a meaningful gain for any namespaced C++ codebase (this is how most free functions in a library look). (C++)

+ 74 - 0
__tests__/extraction.test.ts

@@ -3743,6 +3743,80 @@ describe('Static-member / value-read references', () => {
   });
 });
 
+describe('Objective-C messages, class receivers, and #import', () => {
+  let tempDir: string;
+  let cg: CodeGraph;
+
+  beforeEach(() => {
+    tempDir = createTempDir();
+  });
+
+  afterEach(() => {
+    if (cg) cg.close();
+    if (fs.existsSync(tempDir)) fs.rmSync(tempDir, { recursive: true, force: true });
+  });
+
+  it('resolves single-arg selectors, class-message receivers, and #import headers', async () => {
+    fs.writeFileSync(
+      path.join(tempDir, 'SDImageCache.h'),
+      `#import <Foundation/Foundation.h>
+@interface SDImageCache : NSObject
++ (instancetype)sharedCache;
++ (void)storeImage:(NSString *)key;
+@end
+`
+    );
+    fs.writeFileSync(
+      path.join(tempDir, 'SDImageCache.m'),
+      `#import "SDImageCache.h"
+@implementation SDImageCache
++ (instancetype)sharedCache { return nil; }
++ (void)storeImage:(NSString *)key { }
+@end
+`
+    );
+    fs.writeFileSync(
+      path.join(tempDir, 'SDManager.m'),
+      `#import "SDImageCache.h"
+@interface SDManager : NSObject
+@end
+@implementation SDManager
+- (void)run {
+  [SDImageCache sharedCache];
+  [SDImageCache storeImage:@"k"];
+}
+@end
+`
+    );
+
+    cg = CodeGraph.initSync(tempDir);
+    await cg.indexAll();
+    cg.resolveReferences();
+
+    // 1. The single-argument selector `[SDImageCache storeImage:@"k"]` resolves
+    //    to the `storeImage:` method — named WITH its colon both at the call site
+    //    and the definition (before the fix the call site dropped the colon).
+    const storeImage = cg.getNodesByKind('method').find((n) => n.name === 'storeImage:');
+    expect(storeImage, 'storeImage: method').toBeDefined();
+    const storeCallers = [...cg.getImpactRadius(storeImage!.id, 2).nodes.values()].map((n) => n.filePath ?? '');
+    expect(storeCallers.some((p) => p.endsWith('SDManager.m'))).toBe(true);
+
+    // 2. The class-message receiver `[SDImageCache sharedCache]` references the
+    //    SDImageCache class (whose @interface lives in the header).
+    const cache = cg.getNodesByKind('class').find((n) => n.name === 'SDImageCache');
+    expect(cache, 'SDImageCache class').toBeDefined();
+    const classDeps = [...cg.getImpactRadius(cache!.id, 2).nodes.values()].map((n) => n.filePath ?? '');
+    expect(classDeps.some((p) => p.endsWith('SDManager.m'))).toBe(true);
+
+    // 3. `#import "SDImageCache.h"` resolves to the header FILE — editing it
+    //    surfaces both importers.
+    const header = cg.getNodesByKind('file').find((n) => n.filePath.endsWith('SDImageCache.h'));
+    expect(header, 'SDImageCache.h indexed').toBeDefined();
+    const importers = [...cg.getImpactRadius(header!.id, 2).nodes.values()].map((n) => n.filePath ?? '');
+    expect(importers.some((p) => p.endsWith('SDManager.m'))).toBe(true);
+  });
+});
+
 describe('Full Indexing', () => {
   let tempDir: string;
 

+ 29 - 4
src/extraction/tree-sitter.ts

@@ -2247,16 +2247,41 @@ export class TreeSitterExtractor {
         }
       }
       if (methodKeywords.length > 0) {
-        const methodName: string =
-          methodKeywords.length === 1
-            ? (methodKeywords[0] as string)
-            : methodKeywords.map((k) => `${k}:`).join('');
+        // A selector keyword takes a `:` when it has an argument. A SINGLE
+        // keyword can be unary (`[c reset]` → `reset`) OR take one argument
+        // (`[c storeImage:k]` → `storeImage:`) — distinguished by whether the
+        // message has a `:` token. Without this, every single-argument message
+        // (the most common form: `addObject:`, `storeImage:`, …) was named
+        // without the colon and never matched its `storeImage:` method.
+        let hasColon = false;
+        for (let i = 0; i < node.childCount; i++) {
+          if (node.child(i)?.type === ':') { hasColon = true; break; }
+        }
+        const methodName: string = hasColon
+          ? methodKeywords.map((k) => `${k}:`).join('')
+          : (methodKeywords[0] as string);
         const receiverField = getChildByField(node, 'receiver');
         const SKIP_RECEIVERS = new Set(['self', 'super']);
         if (receiverField && receiverField.type !== 'message_expression') {
           const receiverName = getNodeText(receiverField, this.source);
           if (receiverName && !SKIP_RECEIVERS.has(receiverName)) {
             calleeName = `${receiverName}.${methodName}`;
+            // A CLASS-message receiver (`[SDImageCache alloc]`,
+            // `[SDImageCache sharedCache]`) is a capitalized class name. The
+            // call resolves the method (`alloc`/`sharedCache`), but the CLASS
+            // itself — whose @interface lives in the header — would otherwise
+            // never be referenced. Emit a `references` edge to it so a class
+            // used only via class messages (alloc/init, singletons, factories)
+            // and its header record a dependent.
+            if (/^[A-Z][A-Za-z0-9_]*$/.test(receiverName)) {
+              this.unresolvedReferences.push({
+                fromNodeId: callerId,
+                referenceName: receiverName,
+                referenceKind: 'references',
+                line: receiverField.startPosition.row + 1,
+                column: receiverField.startPosition.column,
+              });
+            }
           } else {
             calleeName = methodName;
           }
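
The two rules in the hunk above can be read in isolation as the sketch below. It is a minimal illustration, not the extractor's code: `selectorName` and `isClassReceiver` are hypothetical helper names, and the `hasColon` flag stands in for the loop that scans the message node's children for a `:` token.

```typescript
// Hypothetical helpers for illustration; in the commit this logic is inline
// in TreeSitterExtractor and reads the tree-sitter node directly.

/** Build the selector name from its keywords, following the rule in the hunk. */
function selectorName(keywords: string[], hasColon: boolean): string {
  if (keywords.length === 0) return '';
  // With a `:` token present, every keyword takes a colon: storeImage: / setX:y:
  // Without one, the message is unary and keeps its bare name: reset
  return hasColon ? keywords.map((k) => `${k}:`).join('') : (keywords[0] ?? '');
}

/** Capitalized-identifier check used to treat a message receiver as a class. */
function isClassReceiver(receiver: string): boolean {
  return /^[A-Z][A-Za-z0-9_]*$/.test(receiver);
}

console.log(selectorName(['storeImage'], true)); // storeImage:
console.log(selectorName(['reset'], false));     // reset
console.log(selectorName(['setX', 'y'], true));  // setX:y:
console.log(isClassReceiver('SDImageCache'));    // true
console.log(isClassReceiver('cache'));           // false
```

The receiver check is a naming heuristic: any receiver that starts with an uppercase letter passes it, whether or not a class of that name exists, so the emitted `references` entry only becomes an edge if resolution later finds a matching node.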

+ 12 - 3
src/resolution/name-matcher.ts

@@ -15,7 +15,13 @@ export function matchByFilePath(
   ref: UnresolvedRef,
   context: ResolutionContext
 ): ResolvedRef | null {
-  if (!ref.referenceName.includes('/')) return null;
+  // Path-like (`a/b.liquid`) OR a bare filename ending in a short extension
+  // (`Foo.h` — an Objective-C `#import "Foo.h"`, resolved to the header by
+  // basename). A bare ref WITHOUT an extension is a symbol name, not a file, so
+  // leave it to the symbol-matching strategies.
+  if (!ref.referenceName.includes('/') && !/\.[A-Za-z][A-Za-z0-9]{0,3}$/.test(ref.referenceName)) {
+    return null;
+  }
 
   // Extract the filename from the path
   const fileName = ref.referenceName.split('/').pop();
@@ -357,8 +363,11 @@ export function matchMethodCall(
   ref: UnresolvedRef,
   context: ResolutionContext
 ): ResolvedRef | null {
-  // Parse method call patterns like "obj.method" or "Class::method"
-  const dotMatch = ref.referenceName.match(/^(\w+)\.(\w+)$/);
+  // Parse method call patterns like "obj.method" or "Class::method". The method
+  // part allows trailing `:` keywords so Objective-C selectors resolve
+  // (`SDImageCache.storeImage:`, `obj.setX:y:`); colons never appear in other
+  // languages' method refs, so this is a no-op for them.
+  const dotMatch = ref.referenceName.match(/^(\w+)\.(\w+:?(?:\w+:)*)$/);
   const colonMatch = ref.referenceName.match(/^(\w+)::(\w+)$/);
 
   const match = dotMatch || colonMatch;
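
To see what the two patterns changed in `name-matcher.ts` accept, the sketch below copies both regular expressions verbatim from the diff and wraps them in small functions. The wrapper names (`looksLikeFileRef`, `splitMethodRef`) are assumptions made for illustration; the real functions also consult a `ResolutionContext`, which is left out here.

```typescript
// Both patterns are copied from the diff; the wrappers are hypothetical.
const BARE_FILE_EXT = /\.[A-Za-z][A-Za-z0-9]{0,3}$/;
const DOT_METHOD = /^(\w+)\.(\w+:?(?:\w+:)*)$/;

/** Gate from matchByFilePath: path-like, or a bare name with a 1-4 char extension. */
function looksLikeFileRef(name: string): boolean {
  return name.includes('/') || BARE_FILE_EXT.test(name);
}

/** Split a `receiver.method` ref, allowing Objective-C colon keywords in the method. */
function splitMethodRef(name: string): { receiver: string; method: string } | null {
  const m = name.match(DOT_METHOD);
  return m ? { receiver: m[1] ?? '', method: m[2] ?? '' } : null;
}

console.log(looksLikeFileRef('Foo.h'));        // true: bare filename, short extension
console.log(looksLikeFileRef('a/b.liquid'));   // true: contains a slash
console.log(looksLikeFileRef('SDImageCache')); // false: no extension, treated as a symbol
console.log(splitMethodRef('SDImageCache.storeImage:')); // method: 'storeImage:'
console.log(splitMethodRef('obj.setX:y:'));              // method: 'setX:y:'
console.log(splitMethodRef('obj.method'));               // method: 'method'
console.log(splitMethodRef('obj.setX:y'));               // null: last keyword lacks its colon
```

The file gate is purely lexical, so a short dotted name such as `obj.init` also passes it. What happens to such a ref depends on the filename lookup further down in `matchByFilePath`, which this hunk does not show.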