feat: Add TypeScript abstract class extraction and fix arrow function naming

Fixes missing TypeScript abstract classes by adding abstract_class_declaration to classTypes. Fixes single-expression arrow functions being silently dropped: extractName no longer searches for identifiers inside arrow_function/function_expression nodes, so they return <anonymous> and get their name through parent name resolution instead of incorrectly using a body identifier.
Colby McHenry, 2 months ago
parent
commit
afcb9fa3e5
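
The naming bug described in the commit message can be reproduced with a minimal sketch that does not depend on tree-sitter. Everything here is hypothetical and simplified for illustration: `MockNode` stands in for a real `SyntaxNode`, `extractNameSketch` models only the identifier fallback and the new guard, and `resolveName` is an assumed version of the parent name resolution, not the project's actual code.

```typescript
// Hypothetical, simplified stand-in for a tree-sitter SyntaxNode.
interface MockNode {
  type: string;
  text: string;
  children: MockNode[];
}

// Models only two parts of name extraction: the new guard for function
// expressions and the "first identifier child" fallback.
function extractNameSketch(node: MockNode, guardEnabled: boolean): string {
  if (
    guardEnabled &&
    (node.type === 'arrow_function' || node.type === 'function_expression')
  ) {
    return '<anonymous>';
  }
  for (const child of node.children) {
    if (child.type === 'identifier') return child.text;
  }
  return '<anonymous>';
}

// Assumed parent resolution: an anonymous function takes the name of the
// identifier in its enclosing variable_declarator.
function resolveName(declarator: MockNode, fn: MockNode, guardEnabled: boolean): string {
  const own = extractNameSketch(fn, guardEnabled);
  if (own !== '<anonymous>') return own;
  const id = declarator.children.find((c) => c.type === 'identifier');
  return id ? id.text : '<anonymous>';
}

// Mock AST for: const fn = () => someIdentifier
const body: MockNode = { type: 'identifier', text: 'someIdentifier', children: [] };
const params: MockNode = { type: 'formal_parameters', text: '()', children: [] };
const arrow: MockNode = {
  type: 'arrow_function',
  text: '() => someIdentifier',
  children: [params, body],
};
const declarator: MockNode = {
  type: 'variable_declarator',
  text: 'fn = () => someIdentifier',
  children: [{ type: 'identifier', text: 'fn', children: [] }, arrow],
};

const nameWithoutGuard = resolveName(declarator, arrow, false);
const nameWithGuard = resolveName(declarator, arrow, true);
console.log(nameWithoutGuard, nameWithGuard);
```

Without the guard, the fallback picks up the body identifier, so the function is attributed to the wrong name. With the guard, the arrow function reports `<anonymous>` and the enclosing declarator supplies `fn`.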

+ 3 - 4
docs/SEARCH_QUALITY_LOOP.md

@@ -448,6 +448,8 @@ test().catch(console.error);
 | C++ classes/structs/enums missing from macro namespaces | Macros like `NLOHMANN_JSON_NAMESPACE_BEGIN` cause tree-sitter to misparse namespace blocks as `function_definition` | `src/extraction/languages/c-cpp.ts: isMisparsedFunction` filters bad names; `src/extraction/tree-sitter.ts: visitFunctionBody` extracts structural nodes |
 | C++ classes missing from `.h` headers | `.h` files default to `c` language which has `classTypes: []` | `src/extraction/grammars.ts: looksLikeCpp()` — content-based heuristic promotes `.h` files to `cpp` when C++ patterns detected |
 | Ruby methods inside modules missing owner in `qualified_name` | Ruby `module` AST nodes not being extracted | `src/extraction/languages/ruby.ts: visitNode` hook extracts modules; `src/extraction/tree-sitter.ts: isInsideClassLikeNode` includes `module` kind |
+| TypeScript abstract classes missing | `abstract_class_declaration` not in `classTypes` | `src/extraction/languages/typescript.ts: classTypes` — add `abstract_class_declaration` |
+| Single-expression arrow functions silently dropped | `extractName` finds identifier in expression body instead of returning `<anonymous>` | `src/extraction/tree-sitter.ts: extractName` — skip identifier search for `arrow_function`/`function_expression` nodes |
 
 ## After Fixing Issues
 
@@ -531,13 +533,10 @@ if (receiverType) {
 - [x] **C++** — NOT needed for header-only libs. `isMisparsedFunction` hook filters macro-caused misparse artifacts (e.g. `NLOHMANN_JSON_NAMESPACE_BEGIN`). `visitFunctionBody` now extracts structural nodes (classes/structs/enums) inside macro-confused "function" bodies. Content-based `.h` detection (`looksLikeCpp` in `grammars.ts`) promotes C++ headers to `cpp` language so classes in `.h` files are extracted. Verified against nlohmann/json and gRPC. Note: out-of-class `Type::method()` definitions would need `getReceiverType` but are uncommon in header-only codebases.
 - [x] **C#** — NOT needed. Methods nested in class body. Added `base_list` handling in `extractInheritance` for C#'s `: Parent, IInterface` syntax. Added `propertyTypes` support for C# `property_declaration` nodes. Fixed `extractField` to handle C#'s nested `variable_declaration > variable_declarator` structure. Verified against Jellyfin
 - [x] **Ruby** — NOT needed for `getReceiverType`. Methods nested in class body. Added `visitNode` hook to extract Ruby `module` nodes (concerns, namespaces) with proper containment and qualified names. Methods inside modules get `Module::method` qualified names. Also wired up the `ExtractorContext` with `pushScope`/`popScope` for language hooks. Verified against Discourse
+- [x] **TypeScript** — NOT needed for `getReceiverType`. Methods nested in class body. Added `abstract_class_declaration` to `classTypes` so abstract classes are properly extracted. Fixed single-expression arrow function extraction (`const fn = () => expr` was silently dropped because `extractName` picked up the body identifier instead of returning `<anonymous>` for parent name resolution). Verified against Grafana
 
 ### Needs Verification
 
 Check these — may need `getReceiverType` if methods are top-level in the AST:
 
 - [ ] Kotlin — extension functions `fun Type.method()`
-
-Verify these DON'T need `getReceiverType` (methods nested in class body):
-
-- [ ] TypeScript

+ 1 - 1
src/extraction/languages/typescript.ts

@@ -3,7 +3,7 @@ import type { LanguageExtractor } from '../tree-sitter-types';
 
 export const typescriptExtractor: LanguageExtractor = {
   functionTypes: ['function_declaration', 'arrow_function', 'function_expression'],
-  classTypes: ['class_declaration'],
+  classTypes: ['class_declaration', 'abstract_class_declaration'],
   methodTypes: ['method_definition', 'public_field_definition'],
   interfaceTypes: ['interface_declaration'],
   structTypes: [],

+ 8 - 0
src/extraction/tree-sitter.ts

@@ -70,6 +70,14 @@ function extractName(node: SyntaxNode, source: string, extractor: LanguageExtrac
     }
   }
 
+  // Arrow/function expressions get their name from the parent variable_declarator,
+  // not from identifiers in their body. Without this, single-expression arrow
+  // functions like `const fn = () => someIdentifier` get named "someIdentifier"
+  // instead of "fn", because the fallback below finds the body identifier.
+  if (node.type === 'arrow_function' || node.type === 'function_expression') {
+    return '<anonymous>';
+  }
+
   // Fall back to first identifier child
   for (let i = 0; i < node.namedChildCount; i++) {
     const child = node.namedChild(i);