
fix(extraction): broaden the curated C++ inline-macro library list (#1103)

* fix(extraction): broaden the curated C++ inline-macro library list

Since #1102 the post-parse salvage already recovers the NAME for any macro, so
adding a library now buys full return-type recovery for it. Extend the curated
list across the major C++ ecosystem: Mozilla/SpiderMonkey, Protobuf, {fmt},
Hedley + nlohmann/json, GLM, Bullet (SIMD_FORCE_INLINE), Skia, OpenCV, EASTL,
Cocos2d-x, Chromium/WebKit (NEVER_INLINE), GLib, SQLite, and the unambiguous
Windows calling conventions (WINAPI / APIENTRY / STDMETHODCALLTYPE / WINAPIV —
which sit between the return type and the name, so blanking them recovers the
return type, e.g. `HRESULT WINAPI Foo()` -> Foo : HRESULT).

Every entry is an exact, curated token matched only in specifier position, so a
real all-caps return type is never touched. Anything still missed keeps its name
via the universal salvage. CARLA control unchanged (440->6 mangles, 0
regressions — none of these libs appear there, confirming no collateral). Eleven
representative full-recovery tests added.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

* docs(changelog): note broadened C++ inline-macro library coverage (#1103)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry, 20 hours ago
commit 00765200d8
3 changed files, 43 additions and 0 deletions
  1. CHANGELOG.md (+1 -0)
  2. __tests__/extraction.test.ts (+19 -0)
  3. src/extraction/languages/c-cpp.ts (+23 -0)

+ 1 - 0
CHANGELOG.md

@@ -16,6 +16,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - C++ functions written with an inline-specifier macro before the return type are now indexed correctly. In Unreal Engine, inline helpers are commonly written `FORCEINLINE FString GetEnumerationToString(...)`; the `FORCEINLINE` macro made the parser read the return type as part of the function's name (`FString GetEnumerationToString` instead of `GetEnumerationToString`) and lose the real return type, so the function couldn't be found by name and its callers weren't linked. CodeGraph now recognizes the standard Unreal inline macros (`FORCEINLINE`, `FORCENOINLINE`, `FORCEINLINE_DEBUGGABLE`), so both the name and the return type are captured. (#1100)
 - The same function-name recovery now covers inline macros from common third-party C++ libraries, not just Unreal Engine — including pugixml (`PUGI__FN`, `PUGIXML_FUNCTION`), Godot (`_FORCE_INLINE_`), Boost (`BOOST_FORCEINLINE`), and generic `ALWAYS_INLINE` / `FORCE_INLINE`. Functions decorated with these are now indexed under their real names. On a large Unreal project vendoring these libraries this cleaned up the large majority of remaining function-name garbling. (#1101)
 - C++ function names are now recovered even when decorated with a macro CodeGraph doesn't specifically know about. A function written `SOME_LIBRARY_MACRO ReturnType doWork(...)` previously had the macro or return type absorbed into its name whenever the macro wasn't one CodeGraph recognized; now the real name (`doWork`) is recovered regardless of the macro, so it's findable and its callers link — no per-library configuration needed. The recognized-macro list was also broadened (Qt, Folly, Abseil, LLVM, V8, Eigen, rapidjson) so those additionally capture the return type. This only ever cleans up an already-garbled name and is limited to C and C++, so ordinary names — and languages like Kotlin and Scala where identifiers can legitimately contain spaces — are unaffected. (#1102)
+- The set of C++ libraries whose macros are recognized for full return-type recovery was expanded well beyond Unreal Engine — now spanning Mozilla, Protobuf, {fmt}, nlohmann/json, GLM, Bullet, Skia, OpenCV, EASTL, Cocos2d-x, GLib, SQLite, and the common Windows calling conventions (so `HRESULT WINAPI CreateThing(...)` indexes as `CreateThing` returning `HRESULT`). Functions from libraries not on the list still get their name recovered automatically; being listed additionally recovers the return type. (#1103)
 
 
 ## [1.1.6] - 2026-06-30

+ 19 - 0
__tests__/extraction.test.ts

@@ -3048,6 +3048,25 @@ class APXCharacter {  // the one real definition
       expect(namesOf('V8_INLINE MaybeLocal Get(int i) { return H(i); }')).toContain('Get');
       expect(namesOf('RAPIDJSON_FORCEINLINE bool Parse(const char* s) { return H(s); }')).toContain('Parse');
     });
+
+    it('curated list spans the broader ecosystem (Mozilla, GLM, Bullet, OpenCV, Skia, EASTL, protobuf, fmt, Windows conventions)', () => {
+      const info = (c: string) =>
+        extractFromSource('x.cpp', c).nodes
+          .filter((n) => n.kind === 'method' || n.kind === 'function')
+          .map((n) => ({ name: n.name, ret: n.returnType }));
+      expect(info('MOZ_ALWAYS_INLINE Value get(int i) { return H(i); }')).toEqual([{ name: 'get', ret: 'Value' }]);
+      expect(info('GLM_FUNC_QUALIFIER vec3 cross(const vec3& a) { return H(a); }')).toEqual([{ name: 'cross', ret: 'vec3' }]);
+      expect(info('SIMD_FORCE_INLINE btScalar dot(const btVector3& v) const { return H(v); }')).toEqual([{ name: 'dot', ret: 'btScalar' }]);
+      expect(info('CV_INLINE Mat clone() const { return H(); }')).toEqual([{ name: 'clone', ret: 'Mat' }]);
+      expect(namesOf('PROTOBUF_ALWAYS_INLINE int size() const { return H(); }')).toContain('size');
+      expect(namesOf('FMT_CONSTEXPR auto parse(int x) { return H(x); }')).toContain('parse');
+      expect(namesOf('SK_ALWAYS_INLINE SkScalar width() const { return H(); }')).toContain('width');
+      expect(namesOf('EA_FORCE_INLINE size_type size() const { return H(); }')).toContain('size');
+      // Windows calling-convention macros sit between return type and name; the
+      // macro is blanked so the real return type survives.
+      expect(info('HRESULT WINAPI CreateThing(int x) { return H(x); }')).toEqual([{ name: 'CreateThing', ret: 'HRESULT' }]);
+      expect(info('ULONG STDMETHODCALLTYPE AddRef() { return H(); }')).toEqual([{ name: 'AddRef', ret: 'ULONG' }]);
+    });
   });
 
   describe('C++ templated base-class inheritance (#1043)', () => {

+ 23 - 0
src/extraction/languages/c-cpp.ts

@@ -286,6 +286,29 @@ const CPP_INLINE_MACROS = [
   'V8_INLINE', 'V8_NOINLINE',
   'EIGEN_STRONG_INLINE', 'EIGEN_ALWAYS_INLINE', 'EIGEN_DEVICE_FUNC',
   'RAPIDJSON_FORCEINLINE',
+  // Mozilla / SpiderMonkey
+  'MOZ_ALWAYS_INLINE', 'MOZ_NEVER_INLINE',
+  // Protocol Buffers
+  'PROTOBUF_ALWAYS_INLINE', 'PROTOBUF_NOINLINE',
+  // {fmt} / spdlog
+  'FMT_CONSTEXPR20', 'FMT_CONSTEXPR', 'FMT_INLINE',
+  // Hedley + nlohmann/json (bundles Hedley)
+  'JSON_HEDLEY_ALWAYS_INLINE', 'JSON_HEDLEY_NEVER_INLINE',
+  'HEDLEY_ALWAYS_INLINE', 'HEDLEY_NEVER_INLINE',
+  // GLM (graphics math — pervasive in games/rendering)
+  'GLM_FUNC_QUALIFIER', 'GLM_FUNC_DECL', 'GLM_CONSTEXPR', 'GLM_INLINE',
+  // Bullet Physics / Skia / OpenCV / EASTL / Cocos2d-x / Chromium-WebKit
+  'SIMD_FORCE_INLINE',
+  'SK_ALWAYS_INLINE',
+  'CV_ALWAYS_INLINE', 'CV_INLINE',
+  'EA_FORCE_INLINE', 'EA_NOINLINE',
+  'CC_INLINE',
+  'NEVER_INLINE',
+  // C libraries: GLib, SQLite (internal linkage)
+  'G_INLINE_FUNC', 'SQLITE_PRIVATE', 'SQLITE_API',
+  // Windows calling conventions (linkage position — recover the return type; the
+  // name is salvaged regardless). Only the unambiguous, non-word-like ones.
+  'STDMETHODCALLTYPE', 'WINAPIV', 'WINAPI', 'APIENTRY',
   // Common cross-ecosystem inline/attribute hints
   'ALWAYS_INLINE', 'FORCE_INLINE', 'NOINLINE',
 ] as const;
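
The commit message describes "blanking" a curated token so the parser sees the return type and the name on their own. The following is a minimal sketch of that idea, not the code in `c-cpp.ts`: `SAMPLE_MACROS`, `escapeRegExp`, and `blankMacros` are hypothetical names, and "specifier position" is approximated here as "a whole token followed by whitespace and the start of an identifier". Replacing the token with spaces of equal length is an assumption made so that source offsets stay stable.

```typescript
// Hypothetical subset of the curated list, for illustration only.
const SAMPLE_MACROS = [
  'FMT_CONSTEXPR20', 'FMT_CONSTEXPR',
  'WINAPIV', 'WINAPI',
  'MOZ_ALWAYS_INLINE',
] as const;

// Escape regex metacharacters so every token is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: overwrite each curated token with spaces of the same
// length. Tokens outside the list (for example HRESULT) are never touched.
function blankMacros(source: string, macros: readonly string[]): string {
  const alternation = [...macros]
    .sort((a, b) => b.length - a.length) // try longer tokens first
    .map(escapeRegExp)
    .join('|');
  // Not preceded by an identifier character, and followed by whitespace plus
  // the first character of an identifier (a rough stand-in for specifier position).
  const pattern = new RegExp(
    `(?<![A-Za-z0-9_])(?:${alternation})(?=\\s+[A-Za-z_])`,
    'g',
  );
  return source.replace(pattern, (m) => ' '.repeat(m.length));
}

const blanked = blankMacros('HRESULT WINAPI Foo()', SAMPLE_MACROS);
console.log(blanked.replace(/\s+/g, ' ')); // whitespace collapsed for display
```

Under these assumptions, `MY_WINAPI Foo()` is left alone because the token is only matched as a whole identifier, and an all-caps return type that is not on the list passes through unchanged.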