
feat(extraction): add C, Java, C#, PHP, Scala, and Kotlin to value-reference edges

Extends same-file constant-reader impact edges to six more languages, each
requiring a different technique:

**C** — file-scope `static const` / global nodes weren't extracted at all (name
nests in `init_declarator`, the generic fallback missed it). Added a C branch in
`extractVariable` with `cDeclaratorIdentifier` to walk the declarator chain.
Skips bare-`identifier` declarators to eliminate the false-positive cluster from
the macro-prefixed-prototype misparse (a `MACRO RetType fn(args);` mints a
spurious global named after the return type).
Prune switch gains `init_declarator`.
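
As a rough illustration of the declarator walk described above (not the actual
`cDeclaratorIdentifier`; the `DeclNode` shape and the function name are
hypothetical stand-ins for tree-sitter nodes), a minimal sketch could look like
this:

```typescript
// Hypothetical, simplified stand-in for a tree-sitter node: only the fields
// this sketch needs. The real extractor works on parser nodes instead.
interface DeclNode {
  type: string;
  text: string;
  declarator?: DeclNode; // the nested `declarator` child, if any
}

// Walk init_declarator -> pointer/array declarator -> identifier and return
// the name. A declarator that is itself a bare identifier is skipped (null),
// which is what drops the macro-prefixed-prototype misparse.
function declaratorIdentifierSketch(top: DeclNode): string | null {
  if (top.type === 'identifier') return null; // bare-identifier skip
  let cur: DeclNode | undefined = top;
  while (cur) {
    if (cur.type === 'identifier') return cur.text;
    cur = cur.declarator;
  }
  return null;
}

// `static const char *const STATUS_NAMES[] = {...}` nests roughly like this.
const tableDecl: DeclNode = {
  type: 'init_declarator',
  text: '*const STATUS_NAMES[] = { "ok" }',
  declarator: {
    type: 'pointer_declarator',
    text: '*const STATUS_NAMES[]',
    declarator: {
      type: 'array_declarator',
      text: 'STATUS_NAMES[]',
      declarator: { type: 'identifier', text: 'STATUS_NAMES' },
    },
  },
};

// The misparsed prototype exposes only a bare identifier (the return type).
const misparsed: DeclNode = { type: 'identifier', text: 'CURLcode' };

console.log(declaratorIdentifierSketch(tableDecl)); // STATUS_NAMES
console.log(declaratorIdentifierSketch(misparsed)); // null
```

In this sketch the skip also drops an uninitialized `int counter;`, whose
declarator is likewise a bare identifier; whether the real extractor behaves
the same way is not stated here.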

**Java** — `static final` fields already existed but as `field` kind, which the
value-ref gate rejects. An `isConst` predicate (`static` + `final` modifiers)
re-kinds the constant subset to `constant` in `extractField`. Instance `final`
fields stay `field`.

**C#** — same `field`→`constant` approach: `const` modifier or `static readonly`
modifier pair → `constant`; instance `readonly` stays `field`.
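
The Java and C# rules above reduce to a small predicate over a field's
modifiers. The sketch below is illustrative only: the function names and the
`ReadonlySet<string>` input are assumptions, and the real `isConst` /
`extractField` code reads modifiers from parser nodes.

```typescript
type FieldKind = 'constant' | 'field';

// Java: only `static` + `final` together mark a shared constant.
function javaFieldKindSketch(modifiers: ReadonlySet<string>): FieldKind {
  return modifiers.has('static') && modifiers.has('final') ? 'constant' : 'field';
}

// C#: `const` alone, or the `static readonly` pair, marks a shared constant.
function csharpFieldKindSketch(modifiers: ReadonlySet<string>): FieldKind {
  const isConst =
    modifiers.has('const') || (modifiers.has('static') && modifiers.has('readonly'));
  return isConst ? 'constant' : 'field';
}

console.log(javaFieldKindSketch(new Set(['public', 'static', 'final']))); // constant
console.log(javaFieldKindSketch(new Set(['final']))); // field
console.log(csharpFieldKindSketch(new Set(['static', 'readonly']))); // constant
console.log(csharpFieldKindSketch(new Set(['readonly']))); // field
```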

**PHP** — constants already extracted as `constant`; the only gap was the
reader-scan. PHP represents a constant reference as a `name` node (not
`identifier`), so bare `X` and the const half of `self::X` / `Foo::X` were
invisible. Added `name` to the scan. No prune wiring needed: a `$var` local is a
`variable_name` — a different namespace — and can never shadow a bare constant.

**Scala** — top-level `val` was already `constant`; `object` and `class` vals
both came out as `field`. Fixed by walking to the enclosing AST definition in the
`val_definition` handler: `object_definition` → `constant`/`variable`; `class`/
`trait`/`enum`/`given` → `field`. Prune switch gains `val_definition`/
`var_definition`.

**Kotlin** — properties weren't extracted at all (`property_declaration →
variable_declaration → simple_identifier` is too deep for the generic path).
Added a `property_declaration` handler in `visitNode`: pulls the nested name,
walks to the enclosing scope for kind (`object`/`companion object`/top-level →
`constant`/`variable`; `class` → `field`; function body → local, skipped). Reader-
scan gains `simple_identifier`; prune switch gains `property_declaration`.
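
The enclosing-scope walk used for Kotlin (and, in the same spirit, Scala) can be
sketched as follows. This is a simplified illustration, not the `visitNode`
handler itself: `ScopeNode`, the function name, and the exact node-type strings
are assumptions, and the mapping of `val` to `constant` and `var` to `variable`
is inferred from the `constant`/`variable` pairing above.

```typescript
// Hypothetical minimal node: a type tag plus a parent pointer.
interface ScopeNode {
  type: string;
  parent: ScopeNode | null;
}

type PropertyKind = 'constant' | 'variable' | 'field' | null;

// Walk outward from a property declaration; the first enclosing scope decides
// the kind. null means "local, do not extract".
function kotlinPropertyKindSketch(prop: ScopeNode, isVal: boolean): PropertyKind {
  for (let p = prop.parent; p !== null; p = p.parent) {
    switch (p.type) {
      case 'function_body':
        return null; // function-local property: skipped
      case 'object_declaration':
      case 'companion_object':
        return isVal ? 'constant' : 'variable'; // singleton scope
      case 'class_declaration':
        return 'field'; // per-instance state
    }
  }
  return isVal ? 'constant' : 'variable'; // reached the root: top-level
}

const file: ScopeNode = { type: 'source_file', parent: null };
const widget: ScopeNode = { type: 'class_declaration', parent: file };
const companion: ScopeNode = { type: 'companion_object', parent: widget };

const maxRetries: ScopeNode = { type: 'property_declaration', parent: companion };
const instanceField: ScopeNode = { type: 'property_declaration', parent: widget };
const topLevelMax: ScopeNode = { type: 'property_declaration', parent: file };

console.log(kotlinPropertyKindSketch(maxRetries, true)); // constant
console.log(kotlinPropertyKindSketch(instanceField, true)); // field
console.log(kotlinPropertyKindSketch(topLevelMax, true)); // constant
```

Because the walk stops at the nearest enclosing scope, a `companion object`
inside a class resolves to `constant` before the surrounding class is reached.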

Validated on small/medium/large repos across all six languages (hiredis/curl/redis, gson/commons-lang/
guava, AutoMapper/Newtonsoft/efcore, guzzle/monolog/laravel, upickle/cats/pekko,
okio/coroutines/ktor): node count identical on/off, precision guards held, 0
shadow leaks, impact wins confirmed (`INDEX_NOT_FOUND` 4→165, `Curl_ssl` 3→57,
`_resourceManager` 22→1664, `STATE_IN_QUEUE` 1→32, `BLOCKING_SHIFT` 1→24).

C++ was attempted and reverted — tree-sitter-cpp parse fidelity on
template/macro-heavy code leaks class members to file scope as bogus constants;
did not reach the precision bar.
Colby McHenry 1 week ago
parent
commit
907098a703

+ 3 - 1
CHANGELOG.md

@@ -11,7 +11,9 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ### New Features
 
-- Impact and blast-radius analysis for TypeScript, JavaScript, Go, Python, Rust, and Ruby now understands the readers of a constant. When you change a file-scope, package-level, module-level, or class-level constant — a config object, a lookup table, a shared constant — the other symbols in that file that read it now show up as affected, where before they were invisible (impact only followed calls, imports, and inheritance, so a constant's consumers looked like "nothing depends on this"). This makes `codegraph impact`, and the impact trail in `codegraph_explore`/`codegraph_node`, catch the "change this table, break its readers" class of change. It's on by default and adds no nodes to your graph; bundled/minified files and ambiguously-shadowed names are skipped to keep results precise. Set `CODEGRAPH_VALUE_REFS=0` to turn it off.
+- Impact and blast-radius analysis for TypeScript, JavaScript, Go, Python, Rust, Ruby, C, Java, C#, PHP, Scala, and Kotlin now understands the readers of a constant. When you change a file-scope, package-level, module-level, or class-level constant — a config object, a lookup table, a shared constant — the other symbols in that file that read it now show up as affected, where before they were invisible (impact only followed calls, imports, and inheritance, so a constant's consumers looked like "nothing depends on this"). This makes `codegraph impact`, and the impact trail in `codegraph_explore`/`codegraph_node`, catch the "change this table, break its readers" class of change. It's on by default and adds no nodes to your graph; bundled/minified files and ambiguously-shadowed names are skipped to keep results precise. Set `CODEGRAPH_VALUE_REFS=0` to turn it off.
+- C file-scope constants and globals — `static const` scalars, pointer/array lookup tables, and shared mutable globals — are now recognized as symbols in their own right. They previously weren't extracted at all, so they never appeared in search or carried any dependents; now they show up in `codegraph search` and participate in impact analysis (see above), so changing a C lookup table surfaces the same-file functions that read it.
+- Java `static final` constants, C# `const` / `static readonly` constants, Scala `object` vals, and Kotlin top-level / `object` / `companion object` `val`s are now classified as constants rather than generic fields, so they participate in the constant-reader impact analysis above — change a `public static final` table, a `const string`, a Scala `object Config { val Timeout = … }`, or a Kotlin `companion object { const val … }` and the methods that read it now show up as affected. (Per-object Java `final` / C# `readonly` / Scala & Kotlin `class` instance properties are unchanged.) Kotlin constants were previously not indexed as their own symbols at all, so they now also appear in `codegraph search`.
 
 ### Fixes
 

+ 290 - 0
__tests__/value-reference-edges.test.ts

@@ -258,6 +258,296 @@ describe('value-reference edges', () => {
     expect(valueRefReaders(cg, 'TIMEOUT')).toEqual(expect.arrayContaining(['get_timeout', 'describe']));
   });
 
+  it('edges same-file readers to a file-scope const/table (C)', async () => {
+    // C keeps shareable values at file scope as `static const` — scalars and,
+    // very commonly, pointer/array lookup tables. Both must be extracted as
+    // nodes (the generic fallback misses C's nested init_declarator name) and
+    // their same-file readers edged.
+    fs.writeFileSync(
+      path.join(dir, 'config.c'),
+      [
+        'static const int MAX_ITEMS = 100;',
+        'static const char *const STATUS_NAMES[] = { "ok", "fail", "pending" };',
+        '',
+        'int capped(int n) { return n > MAX_ITEMS ? MAX_ITEMS : n; }',
+        'const char *label(int i) { return STATUS_NAMES[i]; }',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'MAX_ITEMS')).toEqual(expect.arrayContaining(['capped']));
+    expect(valueRefReaders(cg, 'STATUS_NAMES')).toEqual(expect.arrayContaining(['label']));
+  });
+
+  it('does NOT edge a C file const shadowed by a function-local of the same name', async () => {
+    // `TIMEOUT` is a file const AND a local `int TIMEOUT = 5` (init_declarator)
+    // in shadows(). The local read resolves to the inner binding, so a
+    // file-scope edge would be a false positive — the shadow prune drops it.
+    fs.writeFileSync(
+      path.join(dir, 'shadow.c'),
+      [
+        'static const int TIMEOUT = 30;',
+        '',
+        'int uses_const(void) { return TIMEOUT; }',
+        'int shadows(void) {',
+        '    int TIMEOUT = 5;',
+        '    return TIMEOUT;',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'TIMEOUT')).toEqual([]);
+  });
+
+  it('does NOT mint a value target from a macro-prefixed C prototype (return-type misparse)', async () => {
+    // A prototype led by an unknown macro (`CURL_EXTERN CURLcode fn(args);`)
+    // makes tree-sitter-c misparse it as a declaration whose "variable" is the
+    // bare return-type identifier — which would mint a spurious `CURLcode`
+    // value target read by every function of that type. The bare-identifier
+    // skip prevents it, while real file-scope consts still edge their readers.
+    fs.writeFileSync(
+      path.join(dir, 'api.c'),
+      [
+        'typedef enum { CURLE_OK, CURLE_FAIL } CURLcode;',
+        'CURL_EXTERN CURLcode curl_easy_init(int x);',
+        'CURL_EXTERN CURLcode curl_easy_setopt(int y);',
+        '',
+        'static const int REAL_LIMIT = 42;',
+        'int use_real(void) { return REAL_LIMIT; }',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    // The return-type name is never extracted as a const/var, so it is not a
+    // value-ref target at all.
+    const curlcodeValues = cg
+      .searchNodes('CURLcode')
+      .map((r) => r.node)
+      .filter((n) => n.name === 'CURLcode' && (n.kind === 'constant' || n.kind === 'variable'));
+    expect(curlcodeValues).toEqual([]);
+    // Real file-scope consts alongside the misparse-prone prototypes still work.
+    expect(valueRefReaders(cg, 'REAL_LIMIT')).toEqual(expect.arrayContaining(['use_real']));
+  });
+
+  it('edges same-file methods to a class-scope static final constant (Java)', async () => {
+    // Java keeps constants as `static final` fields inside a class. They extract
+    // as `constant` kind (not `field`) so the value-ref gate targets them; a
+    // plain instance `final` field is NOT a constant and must not be a target.
+    fs.writeFileSync(
+      path.join(dir, 'Limits.java'),
+      [
+        'class Limits {',
+        '  public static final int MAX_ITEMS = 100;',
+        '  static final String[] STATUS_NAMES = { "ok", "fail" };',
+        '  final int instanceId = 1;',
+        '  int capped(int n) { return n > MAX_ITEMS ? MAX_ITEMS : n; }',
+        '  String label(int i) { return STATUS_NAMES[i]; }',
+        '  int id() { return instanceId; }',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'MAX_ITEMS')).toEqual(expect.arrayContaining(['capped']));
+    expect(valueRefReaders(cg, 'STATUS_NAMES')).toEqual(expect.arrayContaining(['label']));
+    // An instance `final` field is mutable per-object state, not a shared
+    // constant — it stays `field` kind and is never a value-ref target.
+    expect(valueRefReaders(cg, 'instanceId')).toEqual([]);
+  });
+
+  it('does NOT edge a Java class const shadowed by a method-local of the same name', async () => {
+    fs.writeFileSync(
+      path.join(dir, 'Shadow.java'),
+      [
+        'class Shadow {',
+        '  static final int TIMEOUT = 30;',
+        '  int usesConst() { return TIMEOUT; }',
+        '  int shadows() { int TIMEOUT = 5; return TIMEOUT; }',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'TIMEOUT')).toEqual([]);
+  });
+
+  it('edges same-file methods to a class const / static readonly (C#)', async () => {
+    // C# constants are `const` (compile-time) or `static readonly` (runtime);
+    // both extract as `constant`. An instance `readonly` field is per-object and
+    // stays `field`.
+    fs.writeFileSync(
+      path.join(dir, 'Limits.cs'),
+      [
+        'class Limits {',
+        '  const int MAX_ITEMS = 100;',
+        '  static readonly string[] STATUS_NAMES = { "ok", "fail" };',
+        '  readonly int instanceId = 1;',
+        '  int Capped(int n) { return n > MAX_ITEMS ? MAX_ITEMS : n; }',
+        '  string Label(int i) { return STATUS_NAMES[i]; }',
+        '  int Id() { return instanceId; }',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'MAX_ITEMS')).toEqual(expect.arrayContaining(['Capped']));
+    expect(valueRefReaders(cg, 'STATUS_NAMES')).toEqual(expect.arrayContaining(['Label']));
+    expect(valueRefReaders(cg, 'instanceId')).toEqual([]);
+  });
+
+  it('does NOT edge a C# class const shadowed by a method-local of the same name', async () => {
+    fs.writeFileSync(
+      path.join(dir, 'Shadow.cs'),
+      [
+        'class Shadow {',
+        '  const int TIMEOUT = 30;',
+        '  int UsesConst() { return TIMEOUT; }',
+        '  int Shadows() { int TIMEOUT = 5; return TIMEOUT; }',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'TIMEOUT')).toEqual([]);
+  });
+
+  it('edges same-file readers to a top-level and class const, incl. self:: / Class:: (PHP)', async () => {
+    // PHP keeps constants at file scope (`const X`) and inside classes (`const
+    // X`), both extracted as `constant`. A constant *reference* is a `name` node
+    // (bare `X`, or the const half of `self::X` / `Foo::X`), so the reader-scan
+    // must match `name`. A `$var` local is a different namespace and can never
+    // shadow a bare constant — so there is nothing to prune.
+    fs.writeFileSync(
+      path.join(dir, 'Config.php'),
+      [
+        '<?php',
+        'const APP_VERSION = "1.0";',
+        'class Config {',
+        '  const MAX_ITEMS = 100;',
+        '  const STATUS_NAMES = ["ok", "fail"];',
+        '  public static $counter = 0;',
+        '  function capped($n) { return $n > self::MAX_ITEMS ? self::MAX_ITEMS : $n; }',
+        '  function label($i) { return Config::STATUS_NAMES[$i]; }',
+        '  function version() { return APP_VERSION; }',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'MAX_ITEMS')).toEqual(expect.arrayContaining(['capped']));
+    expect(valueRefReaders(cg, 'STATUS_NAMES')).toEqual(expect.arrayContaining(['label']));
+    expect(valueRefReaders(cg, 'APP_VERSION')).toEqual(expect.arrayContaining(['version']));
+    // A static property is mutable class state, not a constant — never a target.
+    expect(valueRefReaders(cg, 'counter')).toEqual([]);
+  });
+
+  it('edges readers to a top-level and object-scope val, not a class instance val (Scala)', async () => {
+    // Scala has no `static`: an `object` is a singleton, so its `val`s are the
+    // shared-constant idiom (extracted as `constant`, like a top-level val). A
+    // `class` val is a per-instance immutable field (`field`, never a target).
+    fs.writeFileSync(
+      path.join(dir, 'Demo.scala'),
+      [
+        'val AppVersion = "1.0"',
+        'object Config {',
+        '  val TIMEOUT_MS = 30',
+        '  val STATUS_NAMES = List("ok", "fail")',
+        '  def capped(n: Int): Int = if (n > TIMEOUT_MS) TIMEOUT_MS else n',
+        '  def label(i: Int): String = STATUS_NAMES(i)',
+        '}',
+        'class Widget {',
+        '  val MaxItems = 100',
+        '  def within(n: Int): Int = if (n < MaxItems) n else MaxItems',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'TIMEOUT_MS')).toEqual(expect.arrayContaining(['capped']));
+    expect(valueRefReaders(cg, 'STATUS_NAMES')).toEqual(expect.arrayContaining(['label']));
+    // A class instance `val` is per-object state (kind `field`), not a shared
+    // constant — never a value-ref target even though `within` reads it.
+    expect(valueRefReaders(cg, 'MaxItems')).toEqual([]);
+  });
+
+  it('does NOT edge a Scala object val shadowed by a method-local val of the same name', async () => {
+    fs.writeFileSync(
+      path.join(dir, 'Shadow.scala'),
+      [
+        'object Config {',
+        '  val TIMEOUT = 30',
+        '  def usesConst(): Int = TIMEOUT',
+        '  def shadows(): Int = { val TIMEOUT = 5; TIMEOUT }',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'TIMEOUT')).toEqual([]);
+  });
+
+  it('edges readers to top-level, object, and companion-object constants, not a class val (Kotlin)', async () => {
+    // Kotlin has no `static`: a top-level property, an `object` (singleton), and a
+    // class's `companion object` all hold shared constants (`val`→constant). A
+    // class instance `val` is per-object state (`field`, never a target). The
+    // property name nests as variable_declaration→simple_identifier, and a const
+    // reference is a `simple_identifier`.
+    fs.writeFileSync(
+      path.join(dir, 'Demo.kt'),
+      [
+        'const val TOP_LEVEL_MAX = 100',
+        'object Config {',
+        '  const val TIMEOUT_MS = 30',
+        '  val STATUS_NAMES = listOf("ok", "fail")',
+        '  fun capped(n: Int): Int = if (n > TIMEOUT_MS) TIMEOUT_MS else n',
+        '  fun label(i: Int): String = STATUS_NAMES[i]',
+        '}',
+        'class Widget {',
+        '  companion object { const val MAX_RETRIES = 3 }',
+        '  val instanceField = 1',
+        '  fun retries(): Int = MAX_RETRIES',
+        '  fun within(n: Int): Int = if (n < TOP_LEVEL_MAX) n else TOP_LEVEL_MAX',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'STATUS_NAMES')).toEqual(expect.arrayContaining(['label']));
+    expect(valueRefReaders(cg, 'MAX_RETRIES')).toEqual(expect.arrayContaining(['retries']));
+    expect(valueRefReaders(cg, 'TOP_LEVEL_MAX')).toEqual(expect.arrayContaining(['within']));
+    // A class instance `val` is per-object state (kind `field`), never a target.
+    expect(valueRefReaders(cg, 'instanceField')).toEqual([]);
+  });
+
+  it('does NOT edge a Kotlin object const shadowed by a method-local val of the same name', async () => {
+    fs.writeFileSync(
+      path.join(dir, 'Shadow.kt'),
+      [
+        'object Config {',
+        '  const val TIMEOUT = 30',
+        '  fun usesConst(): Int = TIMEOUT',
+        '  fun shadows(): Int { val TIMEOUT = 5; return TIMEOUT }',
+        '}',
+      ].join('\n'),
+    );
+    cg = index();
+    await cg.indexAll();
+
+    expect(valueRefReaders(cg, 'TIMEOUT')).toEqual([]);
+  });
+
   it('emits nothing when CODEGRAPH_VALUE_REFS=0', async () => {
     const prev = process.env.CODEGRAPH_VALUE_REFS;
     process.env.CODEGRAPH_VALUE_REFS = '0';

+ 130 - 20
docs/design/value-reference-edges-playbook.md

@@ -45,7 +45,7 @@ agent read-reduction (see §4.3).
 
 | Symbol | Role |
 |---|---|
-| `VALUE_REF_LANGS` (static Set) | languages the feature runs for. Currently `typescript`, `javascript`, `tsx`, `go`, `python`, `rust`, `ruby`. **Add the new language here.** |
+| `VALUE_REF_LANGS` (static Set) | languages the feature runs for. Currently `typescript`, `javascript`, `tsx`, `go`, `python`, `rust`, `ruby`, `c`, `java`, `csharp`, `php`, `scala`, `kotlin`. **Add the new language here.** |
 | `valueRefsEnabled` | `process.env.CODEGRAPH_VALUE_REFS !== '0'` — default ON, env opts out. |
 | `MAX_VALUE_REF_NODES` (20_000) | per-scope traversal cap (and the shadow-scan cap). |
 | `captureValueRefScope(kind, name, id, node)` | called from `createNode` on every node. Records **targets** (file-scope `const`/`var`) and **reader scopes** (`function`/`method`/`const`/`var`). |
@@ -66,13 +66,59 @@ targets** (see §3).
 
 ## 2. Current state (what's shipped + validated)
 
-- **Default ON** for TS/JS/tsx + Go + Python + Rust + Ruby (`CODEGRAPH_VALUE_REFS=0` disables). Shipped in **PR #895**
+- **Default ON** for TS/JS/tsx + Go + Python + Rust + Ruby + C + Java + C# (`CODEGRAPH_VALUE_REFS=0` disables). Shipped in **PR #895**
   (flip-on + the shadow prune); Go added in a later PR (the shadow-prune declarator switch +
-  `VALUE_REF_LANGS`).
-- **Validated S/M/L** in **TS, JS, tsx, Go, Python, Rust, and Ruby** — see the matrix in the
+  `VALUE_REF_LANGS`); C added later still (extractor change to emit the nodes + the bare-identifier
+  misparse guard); Java + C# after that (field→constant kind switch for the const subset).
+- **Validated S/M/L** in **TS, JS, tsx, Go, Python, Rust, Ruby, C, Java, and C#** — see the matrix in the
   design doc. All clean: node count identical on/off, precision guards held, impact win
   reproduced. Go required extending the shadow prune (per-grammar declarators) — the worked
-  example of "step B is load-bearing."
+  example of "step B is load-bearing." **C required the Ruby treatment** (the extractor didn't emit
+  C file-scope const/var nodes at all) **plus** a C-specific FP guard (a macro-prefixed-prototype
+  misparse mints a bare-identifier "variable" named after the return type — skip bare-`identifier`
+  declarators). It was the worked example of "the §2b coverage table's *easy-path* guess can be
+  wrong — always do §5 step C (confirm the nodes exist) before trusting it."
+- **Java + C# were the cleanest class-scope ("Ruby treatment") languages.** The constants already
+  extract — but as `field` kind, which the gate rejects. The whole change was emitting the const
+  *subset* as `constant`: an `isConst` predicate on each extractor (Java `static final`; C# `const`
+  / `static readonly`) + a kind switch in `extractField`. **No new shadow-prune wiring** (method
+  locals are `variable_declarator`, already in the switch) and **no FP guards** (UPPER_SNAKE /
+  PascalCase fit the distinctive-name gate). Instance `final`/`readonly` fields correctly stay
+  `field`. Validated S/M/L: gson/commons-lang/guava, automapper/newtonsoft/efcore — 0 leaks, node
+  parity, big impact wins (`INDEX_NOT_FOUND` 4→165, `_resourceManager` 22→1664).
+- **PHP was the cleanest of all — one reader-scan line.** Constants already extract as `constant`
+  (top-level + class), so the only change was teaching the reader-scan that a PHP constant
+  *reference* is a `name` node (bare `X`, or the const half of `self::X` / `Foo::X`). **No extractor
+  change, no prune wiring** (a `$var` local can't shadow a bare constant — different namespace).
+  Validated S/M/L (guzzle/monolog/laravel), all clean, 0 class/const collisions. The honest caveat:
+  **lower yield** — PHP reads constants cross-file far more than same-file (laravel 2,956 files → 86
+  edges), and value-refs is same-file only; still correct, just a smaller contribution.
+- **Scala — an `object` is the constant scope.** Scala has no `static`; a singleton `object`'s `val`s
+  are the shared-constant idiom (`object Config { val Timeout = 30 }`). Top-level `val` already
+  extracted as `constant`, but object/class vals both came out as `field`. The fix: in the Scala
+  `val_definition` handler, walk to the enclosing definition — `object_definition` (or top-level) →
+  `constant`/`variable`; `class`/`trait`/`enum` → `field` (per-instance, like Java instance `final`).
+  Added `val_definition`/`var_definition` to the shadow prune (method-local `val` shadows). Reader-scan
+  needed nothing (refs are `identifier`). Minor known limitation: Scala uses `val`/`def`
+  interchangeably for members, so a camelCase val can share a name with a method — same-file name
+  matching can't tell them apart (bounded, like Ruby's sibling-class; sweep showed flagged collisions
+  were mostly real object vals read by siblings). Validated S/M/L (upickle/cats/pekko).
+- **C++ was attempted and reverted — DON'T retry without solving parse fidelity first.** tree-sitter-cpp
+  mis-parses real template/macro-heavy C++ (and `.h` files route to the C grammar): class members and
+  parameters leak to file scope as bogus constants/variables. Two guards (skip `ERROR`-ancestor and
+  `compound_statement`-ancestor declarations) removed ~83% of gross leaks, but the residual pervades
+  even well-structured library source (template-class member leaks, amalgamated mega-headers,
+  `.h`-as-C++). It did not reach the precision bar of the other languages. See the C++ section below.
+- **Kotlin = C + Scala + PHP techniques combined (and clean).** Nothing extracted before (property name
+  nests `property_declaration → variable_declaration → simple_identifier` — the C problem). Fix:
+  handle `property_declaration` in the Kotlin `visitNode` hook — pull the nested name, walk to the
+  enclosing definition for the kind (`object`/`companion object`/top-level → `constant`/`variable`;
+  `class` → `field` — the Scala rule; skip locals under a `function_body`/`init`/lambda), add
+  `simple_identifier` to the reader-scan (the PHP-`name` move), and `property_declaration` to the
+  shadow prune. Clean parse fidelity (the one `fun interface` misparse is already handled), so no
+  C++-style tail. One of the cleanest yields — companion-object bit-masks/state consts are a heavy
+  same-file-read idiom. Validated S/M/L (okio/coroutines/ktor); only the bounded val/def-or-class and
+  sibling-companion name overlaps remain (shared with Scala/Ruby).
 - **Tests:** `__tests__/value-reference-edges.test.ts` — same-file readers edged; surfaced in
   impact radius; shadowed const NOT edged (verified to fail without the guard); JSX-only read
   edged (tsx); `CODEGRAPH_VALUE_REFS=0` emits nothing.
@@ -95,16 +141,25 @@ the bottom of this section).
 | Go | package `const`/`var` |
 | Rust | module + impl `const`/`static` |
 | Ruby | class/module `CONST` (the class-scope extension) |
+| C | file-scope `static const` scalars + pointer/array lookup tables + mutable globals. **Needed an extractor change** (nodes weren't emitted) + a bare-identifier misparse guard — NOT the easy path the table below first guessed |
+| Java | class `static final` fields. Nodes existed as `field` kind; emitted the const subset as `constant` (`isConst` + `extractField` kind switch). No new prune wiring, no FP guards |
+| C# | class `const` / `static readonly`. Identical to Java — same `field`→`constant` change |
+| PHP | top-level `const` + class `const` (both already `constant` kind). **Only** change was the reader-scan: a PHP const *reference* is a `name` node. No extractor change, no prune wiring (a `$var` local can't shadow a bare constant). Lower yield — PHP reads consts cross-file more than same-file |
+| Scala | top-level `val` (already `constant`) + **`object` val** (the singleton-constant idiom; re-kinded from `field` by walking to the enclosing `object_definition`). `class`/`trait`/`enum` vals stay `field`. `val_definition`/`var_definition` added to the shadow prune. Minor val/def name-collision limit |
+| Kotlin | top-level / `object` / `companion object` `val` (re-kinded from nothing — properties weren't extracted at all). Handled in `visitNode`: nested name (`variable_declaration → simple_identifier`, the C move) + scope-walk for kind (Scala move) + `simple_identifier` in the reader-scan (PHP move) + prune. `class` instance vals stay `field`. Clean — one of the best yields (companion bit-masks) |
 | **Svelte, Vue, Astro** | **inherited for free** — their extractors re-parse the `<script>`/frontmatter block as `typescript`/`javascript`, which are in `VALUE_REF_LANGS` (verified: a `.svelte` `const` edges its readers). No separate work; no separate matrix row needed. |
 
 **🔜 Remaining — likely the easy path** (constants are file/module-scope, or top-level; do §5: add
 to `VALUE_REF_LANGS`, verify the declarator node type + extractor kind, sweep). Classify each
-*before* building — several are mixed file+class scope:
+*before* building — several are mixed file+class scope. **Caveat learned from C:** "easy path" here
+means *scope* fits — it does NOT promise the extractor already emits the const nodes. C was in this
+column but emitted *no* file-scope const/var nodes (its name nests in an `init_declarator` the
+generic fallback can't read), so it needed the Ruby-style extractor change after all. **Always run
+§5 step C (confirm `select kind,name from nodes …` actually shows the consts) before trusting this
+column.**
 
 | Language | Constant forms | Note |
 |---|---|---|
-| C | file-scope `const` / `static const` | `init_declarator` in a `declaration`; `#define` macros aren't value nodes |
-| Kotlin | top-level `const val`/`val` (file-scope) + `companion`/`object` (class-scope) | top-level is easy; companion needs the class-scope gate (present) |
 | Swift | top-level `let` (file) + `static let` in a type (class) | README notes Swift stored properties aren't extracted as own nodes — check |
 | Dart | top-level `const`/`final` (file) + `static const` (class) | mixed |
 | Lua / Luau | file/chunk `local X =` + globals; no `const` keyword | distinctive-name gate (needs `[A-Z_]`) catches fewer — Lua casing varies |
@@ -112,19 +167,25 @@ to `VALUE_REF_LANGS`, verify the declarator node type + extractor kind, sweep).
 
 **🧱 Remaining — needs the Ruby treatment** (constants live almost entirely **inside a
 class/type**; the class-scope *gate* exists now, but first confirm the extractor emits them as
-`constant`/`variable` nodes — Ruby's weren't extracted at all, and Java/C# fields may come out as
-`field`/`property` kind):
+`constant`/`variable` nodes — Ruby's weren't extracted at all, and class fields often come out as
+`field`/`property` kind, which the gate rejects). **Java + C# (done) were this case**: their
+constants extracted as `field` kind, and the fix was emitting the const subset (`static final` /
+`const` / `static readonly`) as `constant` — the template for the rest of this bucket:
 
 | Language | Constant forms |
 |---|---|
-| Java | `static final` fields in a class |
-| C# | `const` / `static readonly` in a class |
-| Scala | `val` / `final val` in an `object`/`class` |
-| PHP | class `const` + top-level `const` + `define()` |
-| C++ | file-scope + class `static const`/`constexpr` (mixed) |
 | Pascal / Delphi | `const` sections at unit (file) or class scope (mixed) |
 | Objective-C | `static const` / `extern const` / `#define` (file-ish; macros unparsed; already "partial support") |
 
+**⛔ Attempted & reverted — C++.** file-scope + class `static const`/`constexpr` (mixed). Machinery
+built and correct on clean C++, but **tree-sitter-cpp parse fidelity is the blocker**: template/
+macro-heavy real C++ leaks class members + parameters to file scope as bogus constants/variables, and
+`.h` files route to the C grammar (mangling C++ classes). Two guards (skip `ERROR`-ancestor and
+`compound_statement`-ancestor declarations) cut ~83% of gross leaks but the residual pervades even
+well-structured library source. **Did not meet the precision bar; reverted.** Don't retry as a
+"value-refs" task — it needs prior work on C++ parse handling (template-class member scoping,
+`.h`-as-C++ detection, amalgamated-header exclusion).
+
 **🚫 N/A:** Liquid (template language — no value constants to track).
 
 **Frameworks — not a value-refs axis.** The README's framework list (Django, Flask, Express,
@@ -267,10 +328,18 @@ The target gate now accepts **`file:`, `class:`, and `module:`** parents. Before
   enclosing class's constant, and strict matching would drop those valid reads. The only real FP
   is the same constant name in *sibling* classes in one file (~1.7% of Ruby targets on rails);
   valid code rarely hits it (a bare sibling-class constant is a NameError in Ruby).
-- **Still untested:** Java `static final`, C# `const`, Swift `static let`. The gate covers them
-  now, but confirm the extractor emits them as `constant`/`variable` nodes with a `class:`/
-  `struct:` parent (Swift stored properties, for one, aren't extracted as their own nodes) — and
-  if the parent kind is `struct:`/`interface:` rather than `class:`/`module:`, widen the gate.
+- **Java `static final` + C# `const`/`static readonly` are DONE** (emitted as `field` → re-kinded to
+  `constant`). **Still untested:** Swift `static let`, Kotlin `companion`/`object`. The gate covers
+  them, but confirm the extractor emits them as `constant`/`variable` nodes with a `class:`/`struct:`
+  parent (Swift stored properties aren't extracted as their own nodes) — and if the parent kind is
+  `struct:`/`interface:` rather than `class:`/`module:`, widen the gate.
+- **Confirm the reader-scan matches the language's constant *reference* node type (the PHP lesson).**
+  The reader-scan in `flushValueRefs` matches `identifier` / `constant` / `name` / `simple_identifier`. If the new language
+  represents a constant *read* as some other node type, the scan finds nothing and **no edges form**
+  even with targets correctly registered. PHP refs a const as a **`name`** node (bare `X`, and the
+  const half of `self::X` / `Foo::X`), which the scan missed until `name` was added. Dump a sample's
+  reader body (`scripts/agent-eval` or a quick `getParser` walk) and check the node type of a
+  constant reference *before* sweeping — a zero-edge sweep usually means this, not a target-gate bug.
 
 ### B. Confirm the declarator node type (for the shadow prune)
 
@@ -290,7 +359,13 @@ silently does nothing for the new language and intra-file shadowing produces fal
 | Rust | `const_item`, `static_item`, `let_declaration` | const/static → `name` field; let → `pattern` field | **done** |
 | Ruby | `assignment` (LHS is a `constant` node) | already in the switch; Ruby can't local-shadow a constant, so the prune is effectively a no-op for it | **done** (class-scope) |
 | Ruby | `assignment` with constant LHS (`CONST`) | LHS | to verify |
-| C/C++ | `init_declarator` in a file-scope `declaration` | declarator id | to verify |
+| C | `init_declarator` in a file-scope `declaration` | `cDeclaratorIdentifier` walks the `declarator` chain (init → pointer/array → identifier) | **done** |
+| C++ | **attempted & reverted** — parse fidelity (see the C++ note in §2b) | — | reverted |
+| Java | `variable_declarator` (field AND method-local) | `namedChild(0)` = name identifier — **already the TS/JS case**, no new wiring | **done** |
+| C# | `variable_declarator` (field AND method-local) | same as Java — already in the switch | **done** |
+| PHP | **none** | a `$var` local (`variable_name`) is a different namespace from a bare constant — a local can never shadow a constant, so the prune is a no-op and needs no PHP declarator | **done** (n/a) |
+| Scala | `val_definition`, `var_definition` | `pattern` field (identifier) — catches an object/top-level val shadowed by a method-local `val` | **done** |
+| Kotlin | `property_declaration` | `variable_declaration → simple_identifier` (and `bump` accepts `simple_identifier`) — catches an object/companion const shadowed by a method-local `val` | **done** |
 
 **The prune rule is `declarators > file-scope-node-count`, NOT `> 1`.** A name can be bound
 twice *at file scope* legitimately — a **conditional module def** (`try: X = a; except: X = b`,
@@ -367,6 +442,41 @@ fixed); impact delta shows the blind→real radius win; full test suite green.
 - **require-bindings (CommonJS) are not FPs** — see §3. Don't "fix" them.
 - **Don't over-engineer a guard for a gap that doesn't manifest** (e.g. param-only shadow):
   evidence-driven only. The maintainer steered toward minimal, surgical fixes.
+- **C macro-prefixed-prototype misparse (the C FP cluster):** an unknown leading macro
+  (`CURL_EXTERN`, `XXH_PUBLIC_API`) makes tree-sitter-c misparse a prototype `MACRO RetType
+  fn(args);` as a *declaration* whose declared "variable" is the bare return-type identifier
+  (`XXH_errorcode`), splitting `fn(args)` into a bogus expression. It mints one spurious type-named
+  global per prototype — then edged by every function of that type (redis `XXH_errorcode` 1→18).
+  These misparses *always* produce a **bare `identifier`** declarator (checked across
+  pointer/array/sized-return variants); real consts/tables always have an `init_declarator` and real
+  pointer/array globals their own declarator. Fix = **skip bare-`identifier` declarators** in the C
+  branch. Dropping those spurious file-scope nodes also lowers the node count relative to a pre-fix
+  pass: the on and off arms still match each other, but don't be surprised that the post-fix count is *lower*.
+- **"Easy path" ≠ "nodes already exist."** The §2b table classifies by *scope*; it does not promise
+  the language's consts are extracted. C sat in the easy column yet emitted zero file-scope const
+  nodes. Run §5 step C (`select kind,name from nodes where file_path like '%sample%'`) on a sample
+  *first* — if the consts aren't there, you're doing the Ruby treatment, not the easy path.
+- **Class consts may extract as `field` kind, not `constant` (Java/C#).** Step C must check the
+  *kind*, not just that a node exists: Java `static final` and C# `const`/`static readonly` came out
+  as `field`, which the value-ref target gate (`constant`/`variable` only) silently rejects — so the
+  feature emitted nothing despite the nodes being present. Fix = an `isConst` predicate on the
+  extractor (gated on the const modifiers) + a kind switch in `extractField` (scoped per-language so
+  other languages' fields stay `field`). Don't widen the *gate* to accept `field` — that would pull
+  in every mutable instance field as a target. And only the const *subset* converts: a Java instance
+  `final` or C# instance `readonly` is per-object state, must stay `field`.
+- **A zero-edge sweep with correctly-registered targets = the reader-scan node type (the PHP trap).**
+  Targets can register perfectly (right kind, right scope) and *still* produce zero edges if the
+  reader-scan doesn't recognise how the language writes a constant *read*. PHP refs a const as a
+  **`name`** node, not `identifier`/`constant`, so the scan saw nothing until `name` was added to the
+  match. Before assuming a target-gate bug on a sparse/empty sweep, dump a reader body and check the
+  node type of a known constant reference. (Adding a ref node type to the scan is safe across
+  languages — `flushValueRefs` only runs for the value-ref set, and a file holds only its own
+  grammar's nodes; `name` is PHP-only among the current set.)
+- **Same-file-only means cross-file-heavy languages yield less — that's correct, not a miss.** PHP
+  reads constants across files far more than within one (`Logger::DEBUG` everywhere), so laravel
+  (2,956 files) gave only 86 edges vs Ruby rails's 2,255. Don't chase it: cross-file value consumers
+  are out of scope for *every* language (would need import/scope resolution). Report the lower yield
+  honestly in the matrix rather than treating it as a bug to fix.
 
 ---
 

+ 176 - 4
docs/design/value-reference-edges.md

@@ -1,6 +1,6 @@
 # Design + status: same-file value-reference edges
 
-**Status:** SHIPPED (default-on for TS/JS/tsx + Go + Python + Rust + Ruby; `CODEGRAPH_VALUE_REFS=0` disables). The
+**Status:** SHIPPED (default-on for TS/JS/tsx + Go + Python + Rust + Ruby + C + Java + C# + PHP + Scala + Kotlin; `CODEGRAPH_VALUE_REFS=0` disables). The
 emitter lives in `TreeSitterExtractor.flushValueRefs` (`src/extraction/tree-sitter.ts`).
 **Motivation:** close the impact-analysis hole for *value consumers*. Static
 extraction edges calls, imports, and inheritance, but never edges a constant to the
@@ -13,7 +13,7 @@ readers" class of change (the ReScript-PR false positive that motivated the work
 ## TL;DR for a new session
 
 We emit a `references` edge (`metadata: { valueRef: true }`) from a reader symbol to
-the **file/package-scope `const`/`var` it reads**, same-file only, for TS/JS/tsx + Go + Python + Rust + Ruby. Those edges
+the **file/package-scope `const`/`var` it reads**, same-file only, for TS/JS/tsx + Go + Python + Rust + Ruby + C + Java + C# + PHP + Scala + Kotlin. Those edges
 flow straight into `getImpactRadius` / `codegraph impact` and the impact trail in
 `codegraph_explore` / `codegraph_node` — no agent-behaviour change required.
 
@@ -46,7 +46,7 @@ The win is **impact-radius correctness**, not agent read-reduction (see "Agent A
    the content-minified bundles guard #1 misses.
 3. **Distinctive-name + same-file** as above.
 
-## Validation matrix — TS / JS / Go / Python / Rust / Ruby
+## Validation matrix — TS / JS / Go / Python / Rust / Ruby / C / Java / C# / PHP / Scala / Kotlin
 
 Method per repo: index the same tree twice (value-refs on vs `CODEGRAPH_VALUE_REFS=0`),
 diff node/edge counts, spot-check precision, and measure `codegraph impact` on a few
@@ -101,7 +101,56 @@ file-scope consts. Node count must be **identical** on/off (edges-only feature).
 | jekyll/jekyll | medium | 218 | 1,906 (stable) | +100 (2.4%) | ~100% TP | `DEFAULT_PRIORITY` 1→3, `LOG_LEVELS` 4→5 |
 | rails/rails | large | 1,452 | 61,911 (stable) | +2,255 (1.2%) | ~98% TP (same-file ambiguity 21/1208 targets) | `Post` (Struct const) 75 readers |
 
-Across S/M/L in all six languages: node count never moved, the precision guards held, and
+**C** (file-scope `static const` scalars + pointer/array lookup tables + mutable globals; required
+extracting the nodes first — see below)
+
+| Repo | size | files | nodes (on=off) | +value-ref edges | precision | `impact` off→on example |
+|---|---|---|---|---|---|---|
+| redis/hiredis | small | 52 | 1,161 (stable) | +29 (2.5%) | all sampled TP; guard holds | `hiredisAllocFns` 1→**71** |
+| redis/redis | medium | 782 | 19,446 (stable) | +1,634 (8.4%) | all sampled TP after the macro-misparse fix; guard holds | `asmManager` 2→**97**, `keyMetaClass` 1→36, `XXH3_kSecret` 1→27, `helpEntries` 1→13 |
+| curl/curl | large | 994 | 16,124 (stable) | +597 (3.7%) | all sampled TP; guard holds; no minified FPs | `Curl_ssl` 3→**57** |
+
+**Java** (class-scope `static final` constants; required emitting them as `constant` kind — see below)
+
+| Repo | size | files | nodes (on=off) | +value-ref edges | precision | `impact` off→on example |
+|---|---|---|---|---|---|---|
+| google/gson | small | 262 | 8,563 (stable) | +387 | all sampled TP; guard holds | `PEEKED_NONE` 1→**31** |
+| apache/commons-lang | medium | 623 | 19,976 (stable) | +2,087 | all sampled TP; guard holds; no minified FPs | `INDEX_NOT_FOUND` 4→**165**, `EMPTY` 5→161 |
+| google/guava | large | 3,227 | 130,945 (stable) | +6,354 | all sampled TP; guard holds; no minified FPs | `APPLICATION_TYPE` 2→**126**, `ABSENT` 4→66 |
+
+**C#** (class-scope `const` / `static readonly`; same `field`→`constant` change as Java)
+
+| Repo | size | files | nodes (on=off) | +value-ref edges | precision | `impact` off→on example |
+|---|---|---|---|---|---|---|
+| AutoMapper/AutoMapper | small | 511 | 19,254 (stable) | +133 | all sampled TP; guard holds | `ContextParameter` 1→**17**, `InstanceFlags` 1→14 |
+| JamesNK/Newtonsoft.Json | medium | 945 | 20,208 (stable) | +344 | all sampled TP; guard holds | `DefaultFlags` 1→**37**, `JsonNamespaceUri` 1→15 |
+| dotnet/efcore | large | 5,731 | 140,847 (stable) | +3,720 | all sampled TP; guard holds; no minified FPs | `_resourceManager` 22→**1664**, `Prefix` 40→237, `Guid77` 2→191 |
+
+**PHP** (top-level `const` + class `const`, both already `constant`; needed only a reader-scan tweak — see below)
+
+| Repo | size | files | nodes (on=off) | +value-ref edges | precision | `impact` off→on example |
+|---|---|---|---|---|---|---|
+| guzzle/guzzle | small | 81 | 1,655 (stable) | +5 (sparse — see note) | all sampled TP; no collisions | `CONNECTION_ERRORS` 1→3 |
+| Seldaek/monolog | medium | 217 | 3,047 (stable) | +79 | all sampled TP; no class/const collisions | `DEFAULT_JSON_FLAGS` 1→**18**, `RFC_5424_LEVELS` 1→17 |
+| laravel/framework | large | 2,956 | 57,519 (stable) | +86 | all sampled TP; no minified/collision FPs | `INVISIBLE_CHARACTERS` 1→**93**, `SESSION_ID_LENGTH` 1→9 |
+
+**Scala** (top-level `val` + `object` val — re-kinded from `field`; `class` instance vals stay `field`)
+
+| Repo | size | files | nodes (on=off) | +value-ref edges | precision | `impact` off→on example |
+|---|---|---|---|---|---|---|
+| com-lihaoyi/upickle | small | 145 | 3,052 (stable) | +82 | all sampled TP; no class/method collisions | `IntegralPattern` 1→**9** |
+| typelevel/cats | medium | 835 | 15,774 (stable) | +89 | sampled TP; flagged val/def name-collisions were real object vals read by siblings | `maxArity` 3→**17**, `fusionMaxStackDepth` 1→13, `minIntValue` 1→7 |
+| apache/pekko | large | 2,720 | 135,041 (stable) | +8,453 (2,065 Scala) | Scala object vals clean; the bulk are valid Java `PARSER`/`DEFAULT_INSTANCE` from generated protobuf `.java` | `ErrorLevel` 5→**33**, `WarningLevel` 5→29 |
+
+**Kotlin** (top-level / `object` / `companion object` `val` → `constant`; `class` instance vals stay `field`)
+
+| Repo | size | files | nodes (on=off) | +value-ref edges | precision | `impact` off→on example |
+|---|---|---|---|---|---|---|
+| square/okio | small | 307 | 8,540 (stable) | +157 | all sampled TP; 0 collisions | `STATE_IN_QUEUE` 1→**32**, `HMAC_KEY` 1→9 |
+| Kotlin/kotlinx.coroutines | medium | 1,039 | 17,058 (stable) | +210 | all sampled TP; 1 cross-file collision | `BLOCKING_SHIFT` 1→**24**, `TERMINATED` 2→22 (companion bit-masks) |
+| ktorio/ktor | large | 2,302 | 43,272 (stable) | +849 | object/companion consts (HTTP header names); flagged collisions are real consts; `TYPE` is a sibling-companion ambiguity | `TYPE` 8→**109**, `FailedPath` 1→22 |
+
+Across S/M/L in all twelve languages: node count never moved, the precision guards held, and
 the `impact` OFF column is the bug — a const that 80–140 symbols read reports "1 affected"
 without value-refs.
 
@@ -155,6 +204,129 @@ classes in one file — 21 of 1,208 targets (1.7%) on rails, and most of those r
 referencing a sibling class's bare constant is a NameError in real Ruby, so valid code rarely
 hits it. Net precision ~98–100%.
 
+**C was NOT the "easy path" the language tracker first assumed — it needed the extractor to emit
+the nodes first.** C keeps shareable values at file scope (`static const` scalars, and very
+commonly pointer/array **lookup tables** + mutable global state), which fits the file-scope target
+gate. But unlike Go/Rust (whose const nodes already existed), C's file-scope `const`/`var` were
+**never extracted as nodes at all**: a C `declaration` nests its name inside an `init_declarator`
+(through `pointer_declarator`/`array_declarator`), and the generic variable-extraction fallback
+only finds a *direct* `identifier` child — so it produced nothing. Three changes (the same shape as
+Ruby's): (1) a C branch in `extractVariable` that resolves the name through the declarator chain and
+emits file-scope declarations as `constant`/`variable` (skipping function-body locals via an
+ancestor check, and `function_declarator` prototypes); (2) an `isConst` on the C extractor (a
+`const` `type_qualifier` → `constant` kind); (3) the shadow prune's declarator switch extended with
+`init_declarator`. Scoped to **C only** — C++ stays on the generic fallback (its class-scope members
+are the harder bucket).
+
+The one false-positive cluster the sweep surfaced was a **macro-prefixed-prototype misparse**, and
+the fix is the load-bearing C detail: an unknown leading macro (`CURL_EXTERN`, `XXH_PUBLIC_API`)
+makes tree-sitter-c misparse a prototype `MACRO RetType fn(args);` as a declaration whose declared
+"variable" is the **bare return-type identifier** (`XXH_errorcode`/`CURLcode`), splitting `fn(args)`
+off as a bogus expression — minting one spurious type-named global per prototype, then edged by
+every function returning that type (redis's `XXH_errorcode` 1→18 before the fix). These misparses
+*always* yield a **bare `identifier`** declarator (verified across pointer/array/sized return
+variants); real consts/tables always carry an initializer (`init_declarator`) and real
+pointer/array globals carry their own declarator. So the C branch **skips bare-`identifier`
+declarators entirely** — killing the whole FP class at the cost of only uninitialized scalar globals
+(`static int g;`), which are rare and low-value. After the fix: every sampled edge on
+hiredis/redis/curl was a true positive, the guard-invariant leak check found 0 shadows across all
+three, and `impact` deltas confirm the blind→real radius (`asmManager` 2→97, `Curl_ssl` 3→57,
+`hiredisAllocFns` 1→71).
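
Illustrative sketch, not part of this diff: the declarator-chain walk and the bare-`identifier` skip described above can be modelled on a mock node shape. `DeclNode`, `declaredName`, and `fileScopeTarget` are hypothetical names; the real extractor works on tree-sitter `SyntaxNode`s and `cDeclaratorIdentifier`, and the node layouts below are assumptions about how the grammar nests these declarators.

```typescript
// Hypothetical minimal node shape; the real code uses tree-sitter SyntaxNodes.
interface DeclNode {
  type: string;
  text?: string;
  declarator?: DeclNode; // stands in for the grammar's `declarator` field
}

// Follow the `declarator` field through wrapper nodes down to the identifier.
// A function_declarator (prototype or function pointer) yields null.
function declaredName(node: DeclNode | undefined): string | null {
  let cur = node;
  for (let depth = 0; cur && depth < 12; depth++) {
    switch (cur.type) {
      case 'identifier':
        return cur.text ?? null;
      case 'init_declarator':
      case 'pointer_declarator':
      case 'array_declarator':
      case 'parenthesized_declarator':
        cur = cur.declarator;
        break;
      default:
        return null; // function_declarator and anything unexpected
    }
  }
  return null;
}

// Name of a file-scope target, or null when the declaration is skipped.
// A bare `identifier` declarator is skipped: that is the shape the
// macro-prefixed-prototype misparse produces.
function fileScopeTarget(topDeclarator: DeclNode): string | null {
  if (topDeclarator.type === 'identifier') return null;
  return declaredName(topDeclarator);
}

// Assumed shape for `static const char *names[] = {...};`
const table: DeclNode = {
  type: 'init_declarator',
  declarator: {
    type: 'pointer_declarator',
    declarator: {
      type: 'array_declarator',
      declarator: { type: 'identifier', text: 'names' },
    },
  },
};
// Assumed shape for the misparsed `XXH_PUBLIC_API XXH_errorcode fn(args);`
const misparse: DeclNode = { type: 'identifier', text: 'XXH_errorcode' };

const tableName = fileScopeTarget(table); // 'names'
const misparseName = fileScopeTarget(misparse); // null
```

The trade-off is visible in `fileScopeTarget`: an uninitialized scalar global such as `static int g;` also has a bare `identifier` declarator in this model, so it is dropped along with the misparses.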
+
+**Java + C# were the cleanest class-scope languages — one kind switch, no new guards.** Both keep
+constants *inside a class* (Java `static final` fields; C# `const` / `static readonly`), so unlike
+C the nodes already existed — but as **`field`** kind, which the value-ref gate (`constant`/
+`variable` only) rejects. The whole change was emitting the constant *subset* as `constant`: an
+`isConst` predicate on each extractor (Java = a `static final` field; C# = a `const`, or a `static
+readonly`) plus a kind switch in `extractField`. Everything else was already in place — the
+class-scope target gate (from Ruby), the `identifier` reader-scan, and crucially the shadow prune:
+a method-local that shadows a class const is a `variable_declarator` in both grammars, *already* in
+the prune switch, so a class const shadowed by a local is dropped with no new wiring (validated by
+the Java/C# shadow tests). Instance fields stay `field` — a Java instance `final` or a C# instance
+`readonly` is per-object state, not a shared constant, so it's never a target. The distinctive-name
+gate fits both conventions cleanly (Java `UPPER_SNAKE`, C# `PascalCase`), so no FP class emerged:
+across S/M/L (gson/commons-lang/guava, automapper/newtonsoft/efcore) every sampled edge was a true
+positive, 0 shadow leaks, no minified-file FPs, node count identical on/off. The `impact` wins are
+the headline: Java's canonical `public static final` constants (`INDEX_NOT_FOUND` 4→165, `EMPTY`
+5→161) and C#'s `const`/`static readonly` (`Prefix` 40→237, a generated `_resourceManager` 22→1664)
+all went from a blind, undercounted radius to their real one. The known sibling-class limitation (the
+same const name in two classes in one file resolves to the file-wide target) is shared with Ruby and
+stayed negligible.
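
Illustrative sketch, not part of this diff: the `field`→`constant` rule for the two languages, reduced to a pure function over modifier keywords. `fieldKind` and its string-array input are hypothetical; the real predicates are the `isConst` hooks on each extractor, which read modifier nodes from the syntax tree.

```typescript
type Lang = 'java' | 'csharp';
type FieldKind = 'constant' | 'field';

// Hypothetical helper: modifiers arrive as plain keyword strings.
function fieldKind(lang: Lang, modifiers: readonly string[]): FieldKind {
  const has = (m: string): boolean => modifiers.indexOf(m) !== -1;
  const isConst =
    lang === 'java'
      ? has('static') && has('final') // Java: both modifiers required
      : has('const') || (has('static') && has('readonly')); // C#: const, or the pair
  return isConst ? 'constant' : 'field';
}

const javaShared = fieldKind('java', ['public', 'static', 'final']); // 'constant'
const javaInstance = fieldKind('java', ['private', 'final']); // 'field'
const csConst = fieldKind('csharp', ['public', 'const']); // 'constant'
const csShared = fieldKind('csharp', ['static', 'readonly']); // 'constant'
const csInstance = fieldKind('csharp', ['private', 'readonly']); // 'field'
```

Only the constant subset changes kind, so the target gate can stay at `constant`/`variable` and never sees mutable instance fields.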
+
+**PHP was a near-pure "easy path" — one reader-scan line, no extractor change, no prune wiring.**
+PHP already extracts both top-level `const X = …` and class `const X = …` as `constant` kind (a
+dedicated `const_declaration` handler), inside the right scope (`file:` / `class:`, both gated). The
+*only* change was the reader-scan: PHP represents a constant *reference* — bare `X`, or the const
+half of `self::X` / `Foo::X` / `static::X` — as a **`name`** node, which the scan (matching
+`identifier` / `constant`) missed, so it found nothing until `name` was added. That's safe across
+languages: `flushValueRefs` only runs for the value-ref set, and `name` is PHP-only among them. **No
+shadow prune was needed at all** — a PHP local is a `$var` (`variable_name`), a different namespace
+from a bare constant, so a local can *never* shadow a constant; there is nothing to prune (the
+cleanest case yet). Precision was excellent: UPPER_SNAKE constants fit the distinctive-name gate, and
+a dedicated check for a target whose name collides with a same-file *class* (PHP's one realistic FP —
+`name` nodes also name classes in `new Foo()` / `Foo::`) found **zero** collisions across
+guzzle/monolog/laravel; every sampled edge was a true positive, node count identical on/off.
+
+**The honest caveat: PHP is lower-yield than the class-scope languages, by design.** PHP idiom reads
+constants *across* files far more than within one (a `Logger::DEBUG` or a config constant consumed
+everywhere), and value-refs is **same-file only** — so laravel (2,956 files) produced only 86 edges
+vs. Ruby rails's 2,255 (1,452 files). This is not a miss: the cross-file reads are out of scope for
+*every* language (resolution would need import/scope analysis), and PHP simply leans on them more.
+The same-file reads it *does* capture are clean and the transitive impact wins are real
+(`INVISIBLE_CHARACTERS` 1→93 from 3 direct readers). Net: correct and additive, just a smaller
+absolute contribution than Java/C#/Go.
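
Illustrative sketch, not part of this diff: why a reader-scan that does not know the reference node type yields zero edges even when the target is registered. `AstNode` and `collectReads` are hypothetical, and the mock tree for `self::MAX_RETRIES + $max` is an assumption about the PHP grammar's shape; the real scan lives in `flushValueRefs`.

```typescript
// Hypothetical minimal AST node; the real scan walks tree-sitter nodes.
interface AstNode {
  type: string;
  text: string;
  children?: AstNode[];
}

// Collect the distinct target names read inside `body`. Only nodes whose type
// is listed in `refTypes` count as a constant reference.
function collectReads(body: AstNode, targets: string[], refTypes: string[]): string[] {
  const found: string[] = [];
  const stack: AstNode[] = [body];
  while (stack.length > 0) {
    const node = stack.pop() as AstNode;
    if (
      refTypes.indexOf(node.type) !== -1 &&
      targets.indexOf(node.text) !== -1 &&
      found.indexOf(node.text) === -1
    ) {
      found.push(node.text);
    }
    for (const child of node.children ?? []) stack.push(child);
  }
  return found;
}

// Assumed shape of a method body containing `self::MAX_RETRIES + $max`.
const body: AstNode = {
  type: 'compound_statement',
  text: '',
  children: [
    {
      type: 'class_constant_access_expression',
      text: 'self::MAX_RETRIES',
      children: [
        { type: 'relative_scope', text: 'self' },
        { type: 'name', text: 'MAX_RETRIES' },
      ],
    },
    { type: 'variable_name', text: '$max' },
  ],
};

const withoutName = collectReads(body, ['MAX_RETRIES'], ['identifier', 'constant']); // []
const withName = collectReads(body, ['MAX_RETRIES'], ['identifier', 'constant', 'name']); // ['MAX_RETRIES']
```

The `$max` local is a `variable_name` in this model, so it can never match a constant target, which mirrors why PHP needed no shadow prune.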
+
+**Scala — the `object` is the constant scope.** Scala has no `static`; the idiom for a shared
+constant is a `val` inside a singleton `object` (`object Config { val Timeout = 30 }`). A top-level
+`val` was already extracted as `constant`, but `object` and `class` vals both came out as `field`
+(which the gate rejects). The fix is a kind refinement in the Scala `val_definition` handler: walk to
+the enclosing definition and treat an `object_definition` (or top-level) val as `constant`/`variable`,
+while a `class`/`trait`/`enum`/`given` val stays `field`, because it is per-instance immutable state, the
+exact analogue of the Java instance `final` we also keep as `field`. (`object` and `class` both
+extract as `class` *kind*, so the distinction is the enclosing AST node type, not the node kind.)
+The shadow prune gained `val_definition`/`var_definition` (a method-local `val` can shadow an object
+val); the reader-scan needed nothing, since a Scala val reference is a plain `identifier`. Method-local
+vals are not extracted at all, so they're not a target source. The one **known limitation** is
+Scala's interchangeable `val`/`def` for members: a camelCase val can share a name with a method in the
+same file, and same-file name matching can't distinguish them — but it's bounded (like Ruby's
+sibling-class case), and on the sweep every flagged val/def collision turned out to be a real `object`
+val read by sibling vals (cats' typeclass instances: `val flatMap = monad`, read by
+`invariantSemigroupal`). Validated S/M/L (upickle/cats/pekko): node count identical on/off, top
+targets genuine object vals (`maxArity` `val = 22`, `DigitTens` lookup table), impact wins real
+(`maxArity` 3→17). The distinctive-name gate fits Scala's camelCase/PascalCase constants (`maxArity`,
+`IntegralPattern`) via their internal uppercase letter.
+
+**Kotlin combined three already-built techniques.** Kotlin has no `static`: shared constants live at
+top level, in an `object` (singleton), or in a class's `companion object` — all `val`/`const val`. A
+class instance `val` is per-object state. Nothing was extracted before because a Kotlin property name
+nests (`property_declaration → variable_declaration → simple_identifier`) and the generic path reads
+only a direct child (the **C** problem). The fix handles `property_declaration` in the Kotlin
+`visitNode` hook (which already handles the `fun interface` misparses): pull the nested
+name, then walk to the enclosing definition to set the kind — `object_declaration`/`companion_object`
+(or top level) → `constant`/`variable` (the **Scala** object-vs-class rule), `class_declaration` →
+`field`, and a property under a `function_body`/`init`/lambda is a local and skipped. The reader-scan
+gained `simple_identifier` (Kotlin's reference node — the **PHP `name`** move; `simple_identifier` is
+Kotlin-only among the value-ref set), and the shadow prune gained `property_declaration` (a method-local
+`val` can shadow an object const). Kotlin's parse fidelity is clean (its one known misparse,
+`fun interface`, is already handled), so unlike C++ no precision tail emerged. It validated as one of
+the *cleanest* languages: companion-object bit-masks and state constants are a heavy, same-file-read
+idiom (coroutines' `BLOCKING_SHIFT` 1→24, `TERMINATED` 2→22 in the scheduler; okio's `STATE_IN_QUEUE`
+1→32; ktor's content-type `TYPE` 8→109). okio had 0 collisions, coroutines 1 (cross-file). The same
+val/def-or-class name-overlap limitation as Scala applies (ktor's HTTP DSL names a header const and a
+class the same), plus the sibling-companion case (several `companion object { const val TYPE }` in one
+file collapse to the file-wide target, like Ruby's sibling-class) — both bounded, and every flagged
+collision investigated was a real object/companion const.
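
Illustrative sketch, not part of this diff: the enclosing-scope walk that Kotlin shares with the Scala object-vs-class rule, on a mock parent chain. `ScopeNode`, `propertyKind`, and `chain` are hypothetical, the chains are assumed tree shapes, and the local-scope list is a subset of what the real `visitNode` handler checks.

```typescript
// Hypothetical minimal node with a parent link; the real code uses SyntaxNode.parent.
interface ScopeNode {
  type: string;
  parent?: ScopeNode | undefined;
}
type PropKind = 'constant' | 'variable' | 'field';

const LOCAL_SCOPES = ['function_body', 'lambda_literal', 'anonymous_initializer'];
const SHARED_SCOPES = ['object_declaration', 'companion_object'];

// Kind for a property node, or null when it is a local and should be skipped.
// The nearest enclosing scope wins; reaching the root means top level.
function propertyKind(node: ScopeNode, isVal: boolean): PropKind | null {
  for (let p = node.parent; p; p = p.parent) {
    if (LOCAL_SCOPES.indexOf(p.type) !== -1) return null;
    if (SHARED_SCOPES.indexOf(p.type) !== -1) break;
    if (p.type === 'class_declaration') return 'field';
  }
  return isVal ? 'constant' : 'variable';
}

// Build a parent chain from outermost to innermost; returns the innermost node.
function chain(...types: string[]): ScopeNode {
  let parent: ScopeNode | undefined;
  for (const type of types) parent = { type, parent };
  return parent as ScopeNode;
}

const companionVal = propertyKind(
  chain('source_file', 'class_declaration', 'class_body', 'companion_object', 'class_body', 'property_declaration'),
  true
); // 'constant'
const instanceVal = propertyKind(
  chain('source_file', 'class_declaration', 'class_body', 'property_declaration'),
  true
); // 'field'
const localVal = propertyKind(
  chain('source_file', 'function_declaration', 'function_body', 'statements', 'property_declaration'),
  true
); // null
const topLevelVar = propertyKind(chain('source_file', 'property_declaration'), false); // 'variable'
```

Because the walk stops at the nearest enclosing scope, a companion-object `val` resolves to `constant` even though the companion itself sits inside a `class_declaration`.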
+
+**C++ was attempted and reverted** — the machinery (file/namespace-scope + class `field_declaration`
+extraction) is correct on clean C++, but tree-sitter-cpp's parse fidelity on real template/macro-heavy
+code (and the `.h`→C-grammar routing) leaks class members and parameters to file scope as bogus
+constants. Two guards (skip declarations under an `ERROR` or `compound_statement` ancestor) removed
+~83% of the gross leaks, but the residual pervaded even well-structured library source
+(template-class member leaks, amalgamated mega-headers, `.h`-as-C++). It did not reach the precision
+bar the other languages hold, so it was reverted. Reviving C++ needs prior work on C++ parse handling
+(template-class member scoping, `.h`-as-C++ detection, amalgamated-header exclusion), not a value-refs
+wiring pass. See the playbook's §2b C++ note.
+
 **`tsx` is covered by the TS rows** — excalidraw is a React/.tsx codebase, so the headline
 `tablerIconProps` (1→170) and most of its targets live in `.tsx` files. The one
 tsx-specific path — a const read *only* inside JSX (`<Foo x={CONST}/>`) — relies on the

+ 7 - 0
src/extraction/languages/c-cpp.ts

@@ -110,6 +110,13 @@ export const cExtractor: LanguageExtractor = {
   nameField: 'declarator',
   bodyField: 'body',
   paramsField: 'parameters',
+  // A `const`/`static const` file-scope declaration carries a `type_qualifier`
+  // child reading "const" — extract those as `constant`, plain globals as
+  // `variable`.
+  isConst: (node) =>
+    node.namedChildren.some(
+      (c: SyntaxNode) => c.type === 'type_qualifier' && c.text === 'const'
+    ),
   getReturnType: extractCppReturnType,
   resolveTypeAliasKind: (node, _source) => {
     // C typedef: `typedef enum { ... } name;` or `typedef struct { ... } name;`

+ 16 - 0
src/extraction/languages/csharp.ts

@@ -121,6 +121,22 @@ export const csharpExtractor: LanguageExtractor = {
     }
     return false;
   },
+  // `const` and `static readonly` fields are C# constants (`MaxItems`, lookup
+  // tables, shared config). Drives `constant` kind so value-reference edges
+  // target them; instance `readonly` / plain `static` fields stay `field`s.
+  isConst: (node) => {
+    let hasStatic = false;
+    let hasReadonly = false;
+    for (let i = 0; i < node.childCount; i++) {
+      const child = node.child(i);
+      if (child?.type !== 'modifier') continue;
+      const t = child.text;
+      if (t === 'const') return true;
+      if (t === 'static') hasStatic = true;
+      else if (t === 'readonly') hasReadonly = true;
+    }
+    return hasStatic && hasReadonly;
+  },
   isAsync: (node) => {
     for (let i = 0; i < node.childCount; i++) {
       const child = node.child(i);

+ 13 - 0
src/extraction/languages/java.ts

@@ -86,6 +86,19 @@ export const javaExtractor: LanguageExtractor = {
     }
     return false;
   },
+  // A `static final` field is a Java constant (`MAX_ITEMS`, lookup tables,
+  // shared config). Drives `constant` kind so value-reference edges target it;
+  // instance / `final`-only / `static`-only fields stay `field`s.
+  isConst: (node) => {
+    for (let i = 0; i < node.childCount; i++) {
+      const child = node.child(i);
+      if (child?.type === 'modifiers') {
+        const text = child.text;
+        return /\bstatic\b/.test(text) && /\bfinal\b/.test(text);
+      }
+    }
+    return false;
+  },
   extractImport: (node, source) => {
     const importText = source.substring(node.startIndex, node.endIndex).trim();
     const scopedId = node.namedChildren.find((c: SyntaxNode) => c.type === 'scoped_identifier');

+ 45 - 0
src/extraction/languages/kotlin.ts

@@ -85,6 +85,51 @@ export const kotlinExtractor: LanguageExtractor = {
   nameField: 'simple_identifier',
   bodyField: 'function_body',
   visitNode: (node, ctx) => {
+    // Kotlin properties (`val` / `var` / `const val`). The name nests as
+    // property_declaration → variable_declaration → simple_identifier, which the
+    // generic variable/field path can't read — so nothing was extracted before.
+    // Kind by enclosing scope: a singleton `object` / `companion object` (and a
+    // top-level property) holds *shared* values — `val`→`constant`,
+    // `var`→`variable` (the Scala-object rule; a `const val` is a `val`). A
+    // `class`/`interface`/`enum` instance `val`/`var` is per-instance state →
+    // `field` (never a value-ref target, like a Java instance `final`). A
+    // property inside a function body / `init` block / lambda is a local and is
+    // skipped entirely.
+    if (node.type === 'property_declaration') {
+      const varDecl = node.namedChildren.find((c) => c.type === 'variable_declaration');
+      const nameNode = varDecl?.namedChildren.find((c) => c.type === 'simple_identifier');
+      if (!nameNode) return false; // destructuring `val (a,b)` etc. — leave to default
+      const name = getNodeText(nameNode, ctx.source);
+      if (!name) return false;
+
+      // Walk to the nearest enclosing definition: a function body / init / lambda
+      // means it's a local; `object`/`companion object` is a constant scope; a
+      // `class_declaration` (covers class/interface/enum) is an instance scope.
+      let scope: 'local' | 'const' | 'instance' = 'const';
+      for (let p = node.parent; p; p = p.parent) {
+        const pt = p.type;
+        if (
+          pt === 'function_body' || pt === 'function_declaration' ||
+          pt === 'lambda_literal' || pt === 'anonymous_initializer' ||
+          pt === 'control_structure_body' || pt === 'getter' || pt === 'setter'
+        ) { scope = 'local'; break; }
+        if (pt === 'companion_object' || pt === 'object_declaration') { scope = 'const'; break; }
+        if (pt === 'class_declaration') { scope = 'instance'; break; }
+      }
+      if (scope === 'local') return true; // a local — don't extract
+
+      const binding = node.namedChildren.find((c) => c.type === 'binding_pattern_kind');
+      const isVal = binding != null && getNodeText(binding, ctx.source) === 'val';
+      const kind = scope === 'instance' ? 'field' : isVal ? 'constant' : 'variable';
+
+      const typeNode = node.childForFieldName('type');
+      const sig = typeNode
+        ? `${isVal ? 'val' : 'var'} ${name}: ${getNodeText(typeNode, ctx.source)}`
+        : undefined;
+      ctx.createNode(kind, name, node, { signature: sig });
+      return true;
+    }
+
     // Handle Kotlin `fun interface` declarations.
     // Tree-sitter-kotlin doesn't support `fun interface` syntax (Kotlin 1.4+).
     // It produces two different misparse patterns:

+ 23 - 12
src/extraction/languages/scala.ts

@@ -136,18 +136,29 @@ export const scalaExtractor: LanguageExtractor = {
       const name = getValVarName(node, ctx.source);
       if (!name) return false;
 
-      const isInClass = ctx.nodeStack.length > 0 &&
-        (() => {
-          const parentId = ctx.nodeStack[ctx.nodeStack.length - 1];
-          const parentNode = ctx.nodes.find((n) => n.id === parentId);
-          return parentNode != null && (
-            parentNode.kind === 'class' || parentNode.kind === 'trait' ||
-            parentNode.kind === 'interface' || parentNode.kind === 'struct' ||
-            parentNode.kind === 'enum' || parentNode.kind === 'module'
-          );
-        })();
-
-      const kind = isInClass ? 'field' : (t === 'val_definition' ? 'constant' : 'variable');
+      // An `object` is a singleton: its `val`s are shared constants (the Scala
+      // idiom for `static final` — `object Config { val Timeout = 30 }`), so
+      // emit them as `constant`/`variable` like a top-level val, which lets
+      // value-reference edges target them. A `class`/`trait`/`enum`/`given` val
+      // is a per-instance immutable field. Both an `object` and a `class`
+      // extract as `class` kind, so the AST node type of the enclosing
+      // definition — not the parent node's kind — is what distinguishes them.
+      let enclosingDef: string | null = null;
+      for (let p = node.parent; p; p = p.parent) {
+        if (
+          p.type === 'class_definition' || p.type === 'trait_definition' ||
+          p.type === 'enum_definition' || p.type === 'given_definition' ||
+          p.type === 'object_definition'
+        ) {
+          enclosingDef = p.type;
+          break;
+        }
+      }
+      const isInstanceField =
+        enclosingDef === 'class_definition' || enclosingDef === 'trait_definition' ||
+        enclosingDef === 'enum_definition' || enclosingDef === 'given_definition';
+
+      const kind = isInstanceField ? 'field' : (t === 'val_definition' ? 'constant' : 'variable');
       const typeNode = node.childForFieldName('type');
       const sig = typeNode
         ? `${t === 'val_definition' ? 'val' : 'var'} ${name}: ${getNodeText(typeNode, ctx.source)}`

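The enclosing-definition walk in the `val_definition` hunk above can be tried in isolation. What follows is a minimal sketch, not the extractor's code: `AstNode`, `classifyValVar`, and `chain` are hypothetical names introduced here, and `AstNode` stands in for tree-sitter's `SyntaxNode` with only the `type` and `parent` properties the walk reads. The node type strings in the usage lines are illustrative.

```typescript
// Hypothetical stand-in for a tree-sitter SyntaxNode: only what the walk needs.
interface AstNode {
  type: string;
  parent: AstNode | null;
}

type ValKind = 'field' | 'constant' | 'variable';

// Enclosing definitions whose val/var members are per-instance fields.
const INSTANCE_DEFS = new Set<string>([
  'class_definition', 'trait_definition', 'enum_definition', 'given_definition',
]);

// Find the nearest enclosing definition, then classify the val/var:
// instance definitions yield 'field'; an object or no enclosing
// definition yields 'constant' (val) or 'variable' (var).
function classifyValVar(
  node: AstNode,
  declType: 'val_definition' | 'var_definition',
): ValKind {
  let enclosingDef: string | null = null;
  for (let p = node.parent; p; p = p.parent) {
    if (INSTANCE_DEFS.has(p.type) || p.type === 'object_definition') {
      enclosingDef = p.type;
      break;
    }
  }
  if (enclosingDef !== null && INSTANCE_DEFS.has(enclosingDef)) return 'field';
  return declType === 'val_definition' ? 'constant' : 'variable';
}

// Test helper: build a parent chain from outermost to innermost node type
// and return the innermost node.
function chain(...types: string[]): AstNode {
  let node: AstNode | null = null;
  for (const type of types) node = { type, parent: node };
  if (!node) throw new Error('chain() needs at least one node type');
  return node;
}

// Usage: the same `val` classifies differently depending on what encloses it.
const inObject = chain('compilation_unit', 'object_definition', 'template_body', 'val_definition');
const inClass = chain('compilation_unit', 'class_definition', 'template_body', 'val_definition');
const topLevel = chain('compilation_unit', 'val_definition');

console.log(classifyValVar(inObject, 'val_definition')); // constant
console.log(classifyValVar(inClass, 'val_definition'));  // field
console.log(classifyValVar(topLevel, 'val_definition')); // constant
```

Because the walk stops at the first matching ancestor, a `val` in an `object` nested inside a `class` still classifies as `constant` in this sketch: the nearest definition wins.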
+ 129 - 6
src/extraction/tree-sitter.ts

@@ -151,6 +151,47 @@ function scalaBaseTypeName(node: SyntaxNode | null, source: string): string | nu
   }
 }
 
+/**
+ * Resolve the declared identifier inside a C declarator. A `declaration`'s
+ * `declarator` field nests the name through `init_declarator` (with value)
+ * and `pointer_declarator`/`array_declarator`/`parenthesized_declarator`
+ * wrappers (each via its own `declarator` field) down to an `identifier`.
+ * A `function_declarator` means the declaration is a function prototype (or a
+ * function-pointer var) — return null so it isn't extracted as a variable.
+ */
+function cDeclaratorIdentifier(node: SyntaxNode | null): SyntaxNode | null {
+  let cur: SyntaxNode | null = node;
+  let guard = 0;
+  while (cur && guard++ < 12) {
+    switch (cur.type) {
+      case 'identifier':
+        return cur;
+      case 'function_declarator':
+        return null;
+      case 'init_declarator':
+      case 'pointer_declarator':
+      case 'array_declarator':
+      case 'parenthesized_declarator':
+        cur = getChildByField(cur, 'declarator');
+        break;
+      default:
+        return null;
+    }
+  }
+  return null;
+}
+
+/** True when `node` is (transitively) inside a C function body — i.e. a local,
+ * not a file/namespace-scope declaration. Walks the parent chain to the root. */
+function hasFunctionAncestor(node: SyntaxNode): boolean {
+  let p = node.parent;
+  while (p) {
+    if (p.type === 'function_definition') return true;
+    p = p.parent;
+  }
+  return false;
+}
+
 /**
  * PHP type-position wrapper node kinds (a type-hint is `named_type`,
  * `?Foo` is `optional_type`, `A|B` is `union_type`, `A&B` is
@@ -224,7 +265,7 @@ export class TreeSitterExtractor {
   // Value-reference edges (default ON; set CODEGRAPH_VALUE_REFS=0 to disable; see flushValueRefs).
   // Same-file reads of file-scope const/var symbols → `references` edges so impact analysis catches
   // value consumers ("change this constant/table, affect its readers").
-  private static readonly VALUE_REF_LANGS = new Set<string>(['typescript', 'javascript', 'tsx', 'go', 'python', 'rust', 'ruby']);
+  private static readonly VALUE_REF_LANGS = new Set<string>(['typescript', 'javascript', 'tsx', 'go', 'python', 'rust', 'ruby', 'c', 'java', 'csharp', 'php', 'scala', 'kotlin']);
   private static readonly MAX_VALUE_REF_NODES = 20_000;
   private readonly valueRefsEnabled = process.env.CODEGRAPH_VALUE_REFS !== '0';
   private fileScopeValues = new Map<string, string>();
@@ -587,7 +628,8 @@ export class TreeSitterExtractor {
     if (this.tree) {
       const declCounts = new Map<string, number>();
       const bump = (nameNode: SyntaxNode | null) => {
-        if (nameNode && nameNode.type === 'identifier') {
+        // `simple_identifier` is Kotlin's name node (a property declarator's name).
+        if (nameNode && (nameNode.type === 'identifier' || nameNode.type === 'simple_identifier')) {
           const nm = getNodeText(nameNode, this.source);
           if (targets.has(nm)) declCounts.set(nm, (declCounts.get(nm) ?? 0) + 1);
         }
@@ -615,6 +657,21 @@ export class TreeSitterExtractor {
             else if (left) for (const c of left.namedChildren) bump(c);
             break;
           }
+          case 'init_declarator':       // C  `T X = …` (file-scope const AND the local that shadows it)
+            bump(cDeclaratorIdentifier(n));
+            break;
+          case 'val_definition':        // Scala  `val X = …` (object/top-level const AND a method-local that shadows it)
+          case 'var_definition': {      // Scala  `var X = …`
+            const pat = getChildByField(n, 'pattern');
+            if (pat?.type === 'identifier') bump(pat);
+            break;
+          }
+          case 'property_declaration': { // Kotlin  `val X = …` (object/top-level const AND a method-local that shadows it)
+            const vd = n.namedChildren.find((c) => c.type === 'variable_declaration');
+            const id = vd?.namedChildren.find((c) => c.type === 'simple_identifier');
+            if (id) bump(id);
+            break;
+          }
         }
         for (let i = 0; i < n.namedChildCount; i++) {
           const c = n.namedChild(i);
@@ -633,8 +690,18 @@ export class TreeSitterExtractor {
         const n = stack.pop()!;
         visited++;
         // `constant` covers Ruby, where both a constant's definition and its
-        // references are `constant`-typed nodes, not `identifier`.
-        if (n.type === 'identifier' || n.type === 'constant') {
+        // references are `constant`-typed nodes, not `identifier`. `name` covers
+        // PHP, where a constant reference — bare `MAX_ITEMS` or the const half of
+        // `self::MAX_ITEMS` / `Foo::MAX_ITEMS` — is a `name` node (a `$var` local
+        // is a `variable_name`, a different namespace, so it can never shadow a
+        // bare constant — no prune wiring needed). `simple_identifier` covers
+        // Kotlin, whose every name reference (a const read included) is that
+        // node type. Safe across languages: a file only holds its own grammar's
+        // nodes; `name` is PHP-only and `simple_identifier` is Kotlin-only here.
+        if (
+          n.type === 'identifier' || n.type === 'constant' ||
+          n.type === 'name' || n.type === 'simple_identifier'
+        ) {
           const refName = getNodeText(n, this.source);
           const targetId = targets.get(refName);
           // Skip self and same-name targets: a symbol referencing a file-scope
@@ -1581,6 +1648,17 @@ export class TreeSitterExtractor {
     const visibility = this.extractor.getVisibility?.(node);
     const isStatic = this.extractor.isStatic?.(node) ?? false;
 
+    // A class field that is actually a CONSTANT (Java `static final`, C# `const`
+    // / `static readonly`) is extracted as `constant` kind, not `field`, so
+    // value-reference edges treat it as a target (the gate accepts
+    // constant/variable, not field). Scoped to languages whose `isConst`
+    // predicate is field-shaped — other languages' fields stay `field`.
+    const fieldKind: NodeKind =
+      (this.language === 'java' || this.language === 'csharp') &&
+      (this.extractor.isConst?.(node) ?? false)
+        ? 'constant'
+        : 'field';
+
     // Java field_declaration: "private final String name = value;" → variable_declarator(s) are direct children
     // C# field_declaration: wraps in variable_declaration → variable_declarator(s)
     let declarators = node.namedChildren.filter(
@@ -1641,7 +1719,7 @@ export class TreeSitterExtractor {
         if (!nameNode) continue;
         const name = getNodeText(nameNode, this.source);
         const signature = typeText ? `${typeText} ${name}` : name;
-        const fieldNode = this.createNode('field', name, decl, {
+        const fieldNode = this.createNode(fieldKind, name, decl, {
           docstring,
           signature,
           visibility,
@@ -1665,7 +1743,7 @@ export class TreeSitterExtractor {
         || node.namedChildren.find(c => c.type === 'identifier');
       if (nameNode) {
         const name = getNodeText(nameNode, this.source);
-        this.createNode('field', name, node, {
+        this.createNode(fieldKind, name, node, {
           docstring,
           visibility,
           isStatic,
@@ -1967,6 +2045,51 @@ export class TreeSitterExtractor {
         const initSignature = initValue ? `= ${initValue}${initValue.length >= 100 ? '...' : ''}` : undefined;
         this.createNode(kind, name, nameNode, { docstring, signature: initSignature, isExported });
       });
+    } else if (this.language === 'c') {
+      // C: a `declaration` node's name nests inside the `declarator` field —
+      // `init_declarator` (with value) or bare/pointer/array declarators (no
+      // value); a `function_declarator` is a prototype, not a variable. The
+      // generic fallback below only finds a *direct* identifier child, which C
+      // never has, so file-scope consts/globals went unextracted entirely (and
+      // so had no impact-radius edges). Only file-scope declarations are tracked
+      // — locals inside a function body are skipped (a `static const` table read
+      // by same-file functions is the value the impact graph wants, not every
+      // block-local). C allows several declarators per declaration
+      // (`int a = 1, b = 2;`), so iterate them.
+      if (!hasFunctionAncestor(node)) {
+        for (let i = 0; i < node.namedChildCount; i++) {
+          const child = node.namedChild(i);
+          if (!child) continue;
+          // Accept only `init_declarator` (has a value) and pointer/array
+          // declarators. A *bare* `identifier` declarator is deliberately
+          // skipped: an unknown leading macro (`CURL_EXTERN`, `XXH_PUBLIC_API`)
+          // makes tree-sitter-c misparse a prototype `MACRO RetType fn(args);`
+          // as a declaration whose "variable" is the bare return-type
+          // identifier, splitting `fn(args)` off as a bogus expression — minting
+          // a spurious type-named global for every macro-prefixed prototype in a
+          // header. Those misparses are always bare identifiers; real
+          // consts/tables always carry an initializer. The only legit loss is
+          // uninitialized scalar globals (`static int g;`).
+          if (
+            child.type !== 'init_declarator' &&
+            child.type !== 'pointer_declarator' &&
+            child.type !== 'array_declarator'
+          ) {
+            continue;
+          }
+          const nameNode = cDeclaratorIdentifier(child);
+          if (!nameNode) continue;
+          const name = getNodeText(nameNode, this.source);
+          if (!name) continue;
+          const valueNode =
+            child.type === 'init_declarator' ? getChildByField(child, 'value') : null;
+          const initValue = valueNode ? getNodeText(valueNode, this.source).slice(0, 100) : undefined;
+          const initSignature = initValue
+            ? `= ${initValue}${initValue.length >= 100 ? '...' : ''}`
+            : undefined;
+          this.createNode(kind, name, child, { docstring, signature: initSignature, isExported });
+        }
+      }
     } else {
       // Generic fallback for other languages
       // Try to find identifier children