# Playbook: extend value-reference edges to a new language

**Purpose.** This is the operational runbook for adding + validating value-reference-edge coverage for one more language. Point a fresh session at this file and say **"Start on language X"**. It has everything: how the feature works, where the code is, the exact validation recipe (with scripts), the per-language checklist, and the traps already hit.

Design rationale + the validation matrix already done live in the companion doc: [`value-reference-edges.md`](./value-reference-edges.md). This file is the *how-to*.

---

## 0. "Start on language X": do this in order

1. Read §1 (how it works) and §2 (current state) so you know the mechanism and what's done.
2. Do the **per-language wiring check** (§5 step A–C). This is where languages differ and where most of the real work/decisions are. Do NOT skip: a wrong declarator node type or a class-scope-vs-file-scope mismatch makes the feature silently emit nothing (or wrong edges).
3. Run the **validation sweep** (§4) on small/medium/large **public OSS** repos for that language. Hunt FPs. **Fix FP clusters; record singletons.** (See §3 for what a real FP looks like vs an acceptable one.)
4. Add a **row to the matrix** in `value-reference-edges.md` and a **test case** in `__tests__/value-reference-edges.test.ts`.
5. Commit on a branch, open a PR. (§6 has the git workflow + how the prior PRs were done.)

Scope rule (hard): **never eval on the maintainer's own repos**; clone a real public OSS repo for the language. (Memory: `agent-eval-targets-public-oss-only`.)

---

## 1. How value-reference edges work

**What:** a `references` edge with `metadata: { valueRef: true }` from a *reader symbol* to the **file-scope `const`/`var` it reads**, same-file only.
It exists so impact analysis catches "change this constant / config object / lookup table → affect its readers", a class of change calls/imports/inheritance edges never captured (a const's consumers used to look like "nothing depends on this").

**Where it flows:** straight into `getImpactRadius` → `codegraph impact` and the impact trail in `codegraph_explore` / `codegraph_node`. No agent-behaviour change required. **The win is impact-radius correctness** (a const 90 symbols read going from "1 affected" to "90"), *not* agent read-reduction (see §4.3).

**Code (all in `src/extraction/tree-sitter.ts`):**

| Symbol | Role |
|---|---|
| `VALUE_REF_LANGS` (static Set) | languages the feature runs for. Currently `typescript`, `javascript`, `tsx`, `go`, `python`, `rust`, `ruby`. **Add the new language here.** |
| `valueRefsEnabled` | `process.env.CODEGRAPH_VALUE_REFS !== '0'`: default ON, env opts out. |
| `MAX_VALUE_REF_NODES` (20_000) | per-scope traversal cap (and the shadow-scan cap). |
| `captureValueRefScope(kind, name, id, node)` | called from `createNode` on every node. Records **targets** (file-scope `const`/`var`) and **reader scopes** (`function`/`method`/`const`/`var`). |
| `flushValueRefs()` | called once at end of `extract()`. Prunes shadowed targets, then for each reader scope walks its subtree for identifiers matching a target name and emits the edges. |

**The two gates inside `captureValueRefScope`** (what you may need to adjust per language):

- **Target gate:** `kind ∈ {constant, variable}` **and** `name.length >= 3` **and** `/[A-Z_]/.test(name)` (distinctive name: dodges single-letter / all-lowercase shadowing) **and** the node's parent id starts with `file:`, `class:`, or `module:` (file/class/module scope).
- **Reader gate:** `kind ∈ {function, method, constant, variable}`.

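
The two gates above can be read as a pair of pure predicates. The sketch below is a minimal standalone illustration, not the code in `src/extraction/tree-sitter.ts`: the helper names `isValueRefTarget` and `isValueRefReader`, the `parentId` string parameter, and the sample ids are assumptions made for the example, and the real checks inside `captureValueRefScope` may be structured differently. It can help when deciding whether a new language's declarations would pass the target gate at all.

```typescript
// Hypothetical sketch of the two gates described above. Names and signatures
// are illustrative assumptions, not the actual captureValueRefScope code.

const TARGET_KINDS = new Set<string>(['constant', 'variable']);
const READER_KINDS = new Set<string>(['function', 'method', 'constant', 'variable']);
const TARGET_SCOPE_PREFIXES = ['file:', 'class:', 'module:'];

// Target gate: a const/var with a distinctive name, declared at
// file, class, or module scope (judged by the parent id prefix).
function isValueRefTarget(kind: string, name: string, parentId: string): boolean {
  return (
    TARGET_KINDS.has(kind) &&
    name.length >= 3 &&
    /[A-Z_]/.test(name) && // needs an uppercase letter or underscore
    TARGET_SCOPE_PREFIXES.some((prefix) => parentId.startsWith(prefix))
  );
}

// Reader gate: any symbol kind whose subtree may read a target.
function isValueRefReader(kind: string): boolean {
  return READER_KINDS.has(kind);
}

// Sample ids below are made up for illustration.
const samples: Array<[string, string, string]> = [
  ['constant', 'MAX_RETRIES', 'file:src/a.ts'], // passes every clause
  ['constant', 'defaultConfig', 'class:Client'], // passes: has an uppercase letter
  ['variable', 'config', 'file:src/a.ts'], // rejected: all lowercase, no underscore
  ['constant', 'ID', 'file:src/a.ts'], // rejected: shorter than 3 characters
  ['constant', 'MAX_RETRIES', 'function:run'], // rejected: not file/class/module scope
];

for (const [kind, name, parentId] of samples) {
  console.log(`${kind} ${name} in ${parentId}:`, isValueRefTarget(kind, name, parentId));
}
```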
**The emit loop in `flushValueRefs`:** same-file only (targets + scopes are per-file, reset each flush); deduped per `(reader, target)`; skips `isGeneratedFile(path)`; **prunes shadowed targets** (see §3).

---

## 2. Current state (what's shipped + validated)

- **Default ON** for TS/JS/tsx + Go + Python + Rust + Ruby (`CODEGRAPH_VALUE_REFS=0` disables). Shipped in **PR #895** (flip-on + the shadow prune); Go added in a later PR (the shadow-prune declarator switch + `VALUE_REF_LANGS`).
- **Validated S/M/L** in **TS, JS, tsx, Go, Python, Rust, and Ruby** (see the matrix in the design doc). All clean: node count identical on/off, precision guards held, impact win reproduced. Go required extending the shadow prune (per-grammar declarators), the worked example of "step B is load-bearing."
- **Tests:** `__tests__/value-reference-edges.test.ts`: same-file readers edged; surfaced in impact radius; shadowed const NOT edged (verified to fail without the guard); JSX-only read edged (tsx); `CODEGRAPH_VALUE_REFS=0` emits nothing.
- **Memory:** `value-reference-edges-default-on` (the A/B finding + shadow guard rationale).

---

## 2b. Coverage vs the README (languages + frameworks)

Tracked against the README's **Supported Languages** table (24 rows) and **Framework-aware Routes** list. Value-refs is **language-level**, so frameworks are *not* a separate axis (see the bottom of this section).

**✅ Done: validated S/M/L (7 + 3 inherited):**

| Language | How |
|---|---|
| TypeScript, JavaScript, tsx | file-scope `const`/`var`; the original languages |
| Python | module-level `NAME =` |
| Go | package `const`/`var` |
| Rust | module + impl `const`/`static` |
| Ruby | class/module `CONST` (the class-scope extension) |
| **Svelte, Vue, Astro** | **inherited for free**: their extractors re-parse the `