fix(extraction): resolve C++ inheritance from templated base classes (#1043) (#1048)

A C++ class deriving from a template (`class Derived : public Base<int>`,
a CRTP base `class App : public CRTPBase<App>`, a struct inheriting a
template, or a templated base mixed into a multi-base clause) recorded its
base as the full instantiation text (`Base<int>`). That never name-matched
the template, which is indexed as the bare node `Base`, so the `extends`
edge never resolved and the derived class looked like it inherited from
nothing. As a result, callers/impact analysis stopped at the boundary.
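
A minimal sketch of the mismatch described above. The `indexedNodes` set and
`resolvesByName` lookup are hypothetical stand-ins for the graph's name
resolution, which is more involved than an exact-match lookup:

```typescript
// Hypothetical node index: template definitions are stored under their bare
// names, as described above (`template<typename T> class Base` -> `Base`).
const indexedNodes = new Set<string>(['Base', 'CRTPBase', 'Plain']);

// Hypothetical exact-name lookup, used only for illustration.
function resolvesByName(referenceName: string): boolean {
  return indexedNodes.has(referenceName);
}

// Base names as recorded before the fix (full instantiation text) ...
const before = ['Base<int>', 'CRTPBase<App>', 'Plain'];
// ... and after the fix (template arguments stripped).
const after = ['Base', 'CRTPBase', 'Plain'];

const unresolvedBefore = before.filter((n) => !resolvesByName(n));
const unresolvedAfter = after.filter((n) => !resolvesByName(n));

console.log(unresolvedBefore); // the two templated bases fail to match
console.log(unresolvedAfter); // nothing is left unresolved
```

Under these assumptions only the non-templated `Plain` matched before the
change, which is why the derived classes appeared to have no base.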

Strip the template arguments from the base-type reference name in the
`base_class_clause` handler via a new `stripCppTemplateArgs` helper: it
removes every balanced `<…>` group (any nesting/position), so `Base<int>`
→ `Base` and `ns::Tpl<int>` → `ns::Tpl`. The remaining qualified head is
exactly what the non-templated base case already produces, so resolution
treats templated and non-templated bases identically; a name with no
template args passes through unchanged.
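
The depth-counter approach can be tried in isolation. The sketch below is a
standalone restatement under a different, hypothetical name
(`stripAngleGroups`), run on a few inputs beyond those in the tests. The
whitespace and unclosed-bracket inputs are illustrative assumptions about what
a base name might contain, not cases the commit claims to handle:

```typescript
// Standalone restatement of the depth-counter technique (hypothetical name).
// Characters are kept only while outside every `<...>` group.
function stripAngleGroups(name: string): string {
  if (!name.includes('<')) return name; // fast path: nothing to strip
  let out = '';
  let depth = 0;
  for (const ch of name) {
    if (ch === '<') {
      depth++;
    } else if (ch === '>') {
      if (depth > 0) depth--; // ignore a stray closing bracket
    } else if (depth === 0) {
      out += ch;
    }
  }
  return out.trim();
}

console.log(stripAngleGroups('Pair<A, B<C>>::type')); // "Pair::type"
console.log(stripAngleGroups('Base <int>')); // "Base" (trailing space trimmed)
console.log(stripAngleGroups('Base<int')); // "Base" (unclosed group dropped)
console.log(stripAngleGroups('Plain')); // "Plain" (unchanged)
```

A single pass with a depth counter handles nesting without a parser or a
recursive regular expression, which is sufficient here because only the
qualified head of the name is needed.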

Covers same-file and same-namespace bases (the dominant real-world
patterns). A base in a different namespace referenced with its qualifier
(`other_ns::Tpl<int>`) still doesn't resolve. That is a pre-existing,
orthogonal namespace-resolution gap rather than a template issue: the
non-templated `other_ns::Plain` fails identically.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry 11 hours ago
parent
commit
f4e03e9cdc

+ 1 - 0
CHANGELOG.md

@@ -14,6 +14,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - CodeGraph now indexes nested repositories that git records as gitlinks, so a workspace built by stacking several repos inside one another indexes completely from a single `codegraph init` at the top. When a repo contains another git repo that was `git add`ed into it — so git tracks it as a `160000` "commit" pointer rather than a folder of files — or a submodule that isn't an active, initialized submodule in your checkout, that nested repo's source used to be skipped entirely: indexing the top level stopped at the nested repo's boundary and pulled in only the outer repo's own files, so a stacked-repo project came up nearly empty (one report saw ~10 files indexed at the root). CodeGraph now descends into each such nested repo that has a real working tree on disk and indexes it as its own embedded repository, recursively, so every layer of a stacked workspace is covered. Active submodules (already handled) and plain untracked nested clones are unchanged; a nested repo under a dependency directory such as `vendor/` or `node_modules/` stays excluded; and a submodule with nothing checked out on disk is correctly left alone rather than reported as empty. Thanks @ofergr and @kun-yx for the reports. (#1031, #1033)
 - CodeGraph no longer shows a misleading "different git working tree" warning when you work inside a submodule (or other nested repo) of a workspace you indexed at its root. Because indexing a workspace now pulls in its submodules and embedded clones, a query run from inside one correctly resolves up to the workspace's single index — but it was still warning that the results came from "a different working tree" and suggesting you run `codegraph init -i`, which would have split the submodule back out into its own separate index and undone the unified view. CodeGraph now recognizes that the nested repo's code is already part of the workspace index and stays quiet. The warning still appears for a genuine git worktree — a second checkout of the *same* repository on another branch, which really does have its own uncommitted symbols — since that's the case it exists for. (#1031, #1033)
 - On Windows, CodeGraph's background server now shuts down cleanly instead of occasionally aborting with a crash error. When the indexed project contained a nested repository (a submodule or embedded clone), stopping the server could race the file watcher's teardown and exit with a Windows crash code rather than a clean exit. Shutdown now lets that teardown finish first, so the server stops cleanly and promptly. (Windows only; other platforms were unaffected.) (#1033)
+- C++ classes that inherit from a templated base — `class Widget : public Base<int>`, a CRTP base like `class App : public CRTPBase<App>`, or a struct inheriting a template — are now linked to that base class in the graph. Previously the template arguments (`<int>`) made the inheritance go unrecognized, so these classes looked like they inherited from nothing and impact/callers analysis stopped at the boundary; the connection is now followed like any other base class. Thanks @ryancu7 for the report. (#1043)
 
 
 ## [1.1.2] - 2026-06-28

+ 42 - 0
__tests__/extraction.test.ts

@@ -11,6 +11,7 @@ import * as os from 'os';
 import { CodeGraph } from '../src';
 import { extractFromSource, scanDirectory, buildDefaultIgnore } from '../src/extraction';
 import { detectLanguage, isLanguageSupported, getSupportedLanguages, initGrammars, loadAllGrammars, isSourceFile } from '../src/extraction/grammars';
+import { stripCppTemplateArgs } from '../src/extraction/languages/c-cpp';
 import { normalizePath } from '../src/utils';
 
 beforeAll(async () => {
@@ -2721,6 +2722,47 @@ class Plain : public Base { public: int y; };
     });
   });
 
+  describe('C++ templated base-class inheritance (#1043)', () => {
+    // Inheriting from a template (`class D : public Base<int>`) recorded the base
+    // ref as the full instantiation `Base<int>`, which never name-matched the
+    // template indexed as the bare node `Base`. The `<…>` args are stripped so the
+    // `extends` reference matches.
+    it('strips template args from a templated base so the extends ref is the bare name', () => {
+      const code = `
+template<typename T> class Base {};
+template<typename D> class CRTPBase {};
+namespace ns { template<typename T> class Tpl {}; }
+class Plain {};
+
+class Widget : public Base<int> {};
+class App : public CRTPBase<App> {};
+class Q : public ns::Tpl<int> {};
+class Both : public Base<char>, public Plain {};
+`;
+      const extendsRefs = extractFromSource('f.cpp', code).unresolvedReferences.filter(
+        (r) => r.referenceKind === 'extends'
+      );
+      const names = extendsRefs.map((r) => r.referenceName);
+
+      // Templated bases carry the bare name, NOT the `<…>` instantiation.
+      expect(names).toContain('Base'); // from Base<int> / Base<char>
+      expect(names).toContain('CRTPBase'); // from CRTPBase<App> (CRTP)
+      expect(names).toContain('ns::Tpl'); // qualified head preserved, args dropped
+      expect(names).toContain('Plain'); // non-templated base unchanged
+      // No reference still carries angle brackets.
+      expect(names.find((n) => n.includes('<'))).toBeUndefined();
+    });
+
+    it('stripCppTemplateArgs removes balanced <…> at any depth and is a no-op without them', () => {
+      expect(stripCppTemplateArgs('Base<int>')).toBe('Base');
+      expect(stripCppTemplateArgs('ns::Tpl<int>')).toBe('ns::Tpl');
+      expect(stripCppTemplateArgs('ns::Tpl<Foo<int>>')).toBe('ns::Tpl'); // nested
+      expect(stripCppTemplateArgs('Outer<int>::Inner')).toBe('Outer::Inner'); // mid-name
+      expect(stripCppTemplateArgs('Base')).toBe('Base'); // no-op
+      expect(stripCppTemplateArgs('ns::Plain')).toBe('ns::Plain'); // no-op qualified
+    });
+  });
+
   describe('C/C++ imports', () => {
     it('should extract system include', () => {
       const code = `#include <iostream>`;

+ 45 - 0
__tests__/resolution.test.ts

@@ -2173,6 +2173,51 @@ func main() {
     });
   });
 
+  describe('C++ templated base-class inheritance (#1043)', () => {
+    // A class deriving from a TEMPLATE — `class D : public Base<int>` (or a CRTP
+    // `class W : public CRTPBase<W>`, or a qualified `class Q : public ns::Tpl<int>`)
+    // recorded its base as the full instantiation text (`Base<int>`), which never
+    // name-matched the template, indexed as the bare node `Base`. The `<…>` args
+    // are now stripped so the `extends` edge resolves end-to-end.
+    it('resolves an extends edge to a templated base (plain, CRTP, struct, multi-base)', async () => {
+      fs.writeFileSync(
+        path.join(tempDir, 'lib.hpp'),
+        `#pragma once
+template<typename T> class Base { public: void foo(); };
+template<typename Derived> class CRTPBase {};
+class Plain {};
+
+class Widget : public Base<int> {};            // plain template base
+class App : public CRTPBase<App> {};           // CRTP (curiously-recurring)
+struct Node : public Base<double> {};          // struct inheriting a template
+class Both : public Base<char>, public Plain {}; // templated + plain in one clause
+`
+      );
+      cg = await CodeGraph.init(tempDir, { index: true });
+      const db = DatabaseConnection.open(path.join(tempDir, '.codegraph', 'codegraph.db'));
+      const edges = db
+        .getDb()
+        .prepare(
+          `select src.name as fromName, dst.name as toName
+             from edges e
+             join nodes src on e.source = src.id
+             join nodes dst on e.target = dst.id
+            where e.kind = 'extends'`
+        )
+        .all() as Array<{ fromName: string; toName: string }>;
+      const has = (from: string, to: string) =>
+        edges.some((r) => r.fromName === from && r.toName === to);
+
+      // Every templated base now resolves to the bare template node.
+      expect(has('Widget', 'Base'), 'Widget : Base<int>').toBe(true);
+      expect(has('App', 'CRTPBase'), 'App : CRTPBase<App> (CRTP)').toBe(true);
+      expect(has('Node', 'Base'), 'struct Node : Base<double>').toBe(true);
+      // A mixed clause resolves BOTH the templated and the plain base.
+      expect(has('Both', 'Base'), 'Both : Base<char>').toBe(true);
+      expect(has('Both', 'Plain'), 'Both : Plain (non-templated, regression guard)').toBe(true);
+    });
+  });
+
   describe('PHP Include Resolution', () => {
     it('isPhpIncludePathRef distinguishes include paths from namespace use (#660)', () => {
       const mk = (name: string, over: Partial<UnresolvedRef> = {}): UnresolvedRef => ({

+ 27 - 0
src/extraction/languages/c-cpp.ts

@@ -84,6 +84,33 @@ export function normalizeCppReturnType(raw: string): string | undefined {
   return last;
 }
 
+/**
+ * Strip C++ template arguments from a base-type reference name so it matches the
+ * bare class/struct the template was DEFINED as. `template<typename T> class
+ * Base { … }` is indexed as a node named `Base`, but a derived class
+ * `class D : public Base<int>` records its base as the full `Base<int>` (and
+ * `class Q : public ns::Tpl<int>` as `ns::Tpl<int>`) — neither name-matches
+ * `Base` / `ns::Tpl`, so the `extends` edge never resolves and the derived class
+ * looks like it inherits from nothing (#1043).
+ *
+ * Removes every balanced `<…>` group regardless of nesting or position, so
+ * `Base<int>` → `Base`, `ns::Tpl<Foo<int>>` → `ns::Tpl`, and the rare
+ * `Outer<int>::Inner` → `Outer::Inner`. The remaining qualified head is exactly
+ * what the non-templated base case already produces, so resolution treats them
+ * identically. A name with no template args passes through unchanged.
+ */
+export function stripCppTemplateArgs(name: string): string {
+  if (!name.includes('<')) return name;
+  let out = '';
+  let depth = 0;
+  for (const ch of name) {
+    if (ch === '<') depth++;
+    else if (ch === '>') { if (depth > 0) depth--; }
+    else if (depth === 0) out += ch;
+  }
+  return out.trim();
+}
+
 /**
  * A function/method's return type lives in the `function_definition`'s `type`
  * field (`Metrics& Metrics::instance()` → `Metrics`). Constructors, destructors,

+ 7 - 2
src/extraction/tree-sitter.ts

@@ -21,6 +21,7 @@ import { FN_REF_SPECS, captureFnRefCandidates, type FnRefSpec, type FnRefCandida
 import { isGeneratedFile } from './generated-detection';
 import type { LanguageExtractor, ExtractorContext } from './tree-sitter-types';
 import { EXTRACTORS } from './languages';
+import { stripCppTemplateArgs } from './languages/c-cpp';
 import { LiquidExtractor } from './liquid-extractor';
 import { RazorExtractor } from './razor-extractor';
 import { SvelteExtractor } from './svelte-extractor';
@@ -4455,7 +4456,11 @@ export class TreeSitterExtractor {
 
       // C++ base classes: `class Derived : public Base, private Other` →
       // base_class_clause holds access specifiers + base type(s). Emit an extends
-      // ref per base type (skip the public/private/protected keywords).
+      // ref per base type (skip the public/private/protected keywords). A
+      // templated base (`Base<int>`, `ns::Tpl<int>`) arrives as a `template_type`
+      // or a `qualified_identifier` wrapping one; strip the `<…>` args so the ref
+      // matches the bare class the template was defined as — `Base`, `ns::Tpl` —
+      // instead of never resolving (#1043).
       if (child.type === 'base_class_clause') {
         for (const t of child.namedChildren) {
           if (
@@ -4465,7 +4470,7 @@ export class TreeSitterExtractor {
           ) {
             this.unresolvedReferences.push({
               fromNodeId: classId,
-              referenceName: getNodeText(t, this.source),
+              referenceName: stripCppTemplateArgs(getNodeText(t, this.source)),
               referenceKind: 'extends',
               line: t.startPosition.row + 1,
               column: t.startPosition.column,