il y a 1 mois · 4329a52bec
--- a/.claude/skills/add-lang/SKILL.md
+++ b/.claude/skills/add-lang/SKILL.md
@@ -0,0 +1,219 @@
 
				+---
			
 
				+name: add-lang
			
 
				+description: Add tree-sitter language support to codegraph end-to-end — wire the grammar + extractor, write tests, then benchmark extraction quality and retrieval value on 3 popular real-world repos. Use when the user runs /add-lang <language> or asks to add/support a new language (e.g. Lua, Elixir, Zig, OCaml) in codegraph.
			
 
				+---
			
 
				+
			
 
				+# Add a language to CodeGraph
			
 
				+
			
 
				+Wire a new tree-sitter language into codegraph's extraction pipeline, prove it
			
 
				+extracts real symbols on popular repos, and prove it beats no-codegraph for an
			
 
				+agent. Runs **fully autonomously** — pick repos, benchmark, update docs, then
			
 
				+report. **Never commit, push, publish, or tag** (house rule); leave all changes
			
 
				+for the user to review.
			
 
				+
			
 
				+The argument is the language token used throughout the `Language` union, e.g.
			
 
				+`lua`, `elixir`, `zig`. If none was given, ask which language. Use the lowercase
			
 
				+single-token form everywhere (`csharp`, not `c#`).
			
 
				+
			
 
				+## Prerequisites
			
 
				+- Run from the codegraph repo root. `node`, `git`, `gh`, and a logged-in
			
 
				+  `claude` CLI (the benchmark spawns real `claude -p` runs).
			
 
				+- The benchmark uses the local dev build — Step 8 builds + links it on PATH.
			
 
				+
			
 
				+## Workflow
			
 
				+
			
 
				+Copy this checklist and work through it in order:
			
 
				+```
			
 
				+- [ ] 1. Resolve language; bail early if already supported (just benchmark)
			
 
				+- [ ] 2. Find a grammar + health-check it (ABI / heap corruption)
			
 
				+- [ ] 3. Discover the grammar's AST node types (dump-ast.mjs)
			
 
				+- [ ] 4. Wire the language (4 files; sometimes a 5th core touch)
			
 
				+- [ ] 5. Build + verify-extraction loop until PASS
			
 
				+- [ ] 6. Add extraction tests; make them green
			
 
				+- [ ] 7. Auto-pick 3 popular repos by size tier; add to corpus.json
			
 
				+- [ ] 8. Benchmark all 3: extraction + with/without A/B
			
 
				+- [ ] 9. Update README + CHANGELOG
			
 
				+- [ ] 10. Report; do NOT commit
			
 
				+```
			
 
				+
			
 
				+### Step 1 — Resolve + short-circuit
			
 
				+
			
 
				+Check whether the language is already wired: look for the token in the
			
 
				+`LANGUAGES` const (`src/types.ts`) and the `EXTRACTORS` map
			
 
				+(`src/extraction/languages/index.ts`). If it is already supported (e.g.
			
 
				+`typescript`, `rust`), **skip Steps 2–6** and go straight to benchmarking
			
 
				+(Steps 7–8) to validate/measure it — note in the report that no code changed.
			
 
				+
			
 
				+### Step 2 — Find a grammar, then health-check it
			
 
				+
			
 
				+```bash
			
 
				+ls node_modules/tree-sitter-wasms/out/ | grep -i <lang>   # csharp -> c_sharp
			
 
				+```
			
 
				+- **Present** → likely off-the-shelf; `grammars.ts` resolves it from
			
 
				+  `tree-sitter-wasms` automatically. (Many languages: elixir, zig, ocaml,
			
 
				+  solidity, toml, yaml, …)
			
 
				+- **Absent** → vendor a `.wasm` into `src/extraction/wasm/` (like `pascal` /
			
 
				+  `scala` / `lua`) and add the token to the vendored branch in Step 4.
			
 
				+
			
 
				+**Always health-check before writing an extractor — a *present* grammar can
			
 
				+still be unusable:**
			
 
				+```bash
			
 
				+node scripts/add-lang/check-grammar.mjs <lang> path/to/valid-sample.<ext>
			
 
				+```
			
 
				+It prints the grammar's ABI version and parses a valid sample many times in a
			
 
				+multi-grammar runtime. If it **FAILs** (ERROR trees on valid code — an old ABI
			
 
				+corrupting the shared WASM heap, which silently drops nested calls/imports on
			
 
				+every file after the first; e.g. the tree-sitter-wasms **Lua** grammar is ABI 13
			
 
				+and fails), do NOT use that wasm. **Vendor a newer (ABI 14/15) build instead:**
			
 
				+```bash
			
 
				+npm pack @tree-sitter-grammars/tree-sitter-<lang>   # often ships a prebuilt *.wasm
			
 
				+# or build one: npx tree-sitter build --wasm   (needs Docker/emscripten)
			
 
				+cp <the>.wasm src/extraction/wasm/tree-sitter-<lang>.wasm
			
 
				+```
			
 
				+then add the token to the vendored branch in Step 4 and re-run check-grammar on
			
 
				+the vendored path until it PASSes. **If you cannot obtain a healthy wasm, STOP
			
 
				+and tell the user.**
			
 
				+
			
 
				+### Step 3 — Discover AST node types
			
 
				+
			
 
				+Get a representative source file (write a small sample covering functions,
			
 
				+classes/structs, imports, enums; or `curl` a raw file from a known repo), then:
			
 
				+```bash
			
 
				+node scripts/add-lang/dump-ast.mjs <lang> path/to/sample.<ext>
			
 
				+# vendored grammar: pass the wasm path instead of the token
			
 
				+node scripts/add-lang/dump-ast.mjs src/extraction/wasm/tree-sitter-<lang>.wasm sample.<ext>
			
 
				+```
			
 
				+The frequency table + field names (`name:`, `parameters:`, `body:`,
			
 
				+`return_type:`) tell you what to map. Open the existing extractor closest to the
			
 
				+language's paradigm as a model: `rust.ts`/`scala.ts` (functional, traits),
			
 
				+`java.ts`/`csharp.ts` (OO), `python.ts`/`ruby.ts` (scripting), `go.ts`
			
 
				+(top-level methods + receivers).
			
 
				+
			
 
				+### Step 4 — Wire the language (4 files)
			
 
				+
			
 
				+These are exact, fragile wiring — match the existing style precisely:
			
 
				+
			
 
				+1. **`src/types.ts`** — TWO edits:
			
 
				+   - add `'<lang>',` to the `LANGUAGES` const (before `'unknown'`);
			
 
				+   - add `'**/*.<ext>',` to `DEFAULT_CONFIG.include`. **Don't skip this** — it's
			
 
				+     the file-scan allowlist; without the glob, `codegraph init` finds **0
			
 
				+     files** even though detection/extraction are wired.
			
 
				+2. **`src/extraction/grammars.ts`** — three maps:
			
 
				+   - `WASM_GRAMMAR_FILES`: `<lang>: 'tree-sitter-<lang>.wasm',`
			
 
				+   - `EXTENSION_MAP`: each file extension → `'<lang>'` (e.g. `'.lua': 'lua',`)
			
 
				+   - `getLanguageDisplayName`: `<lang>: '<Display Name>',`
			
 
				+   - **vendored only**: add `<lang>` to the
			
 
				+     `(lang === 'pascal' || lang === 'scala' || …)` wasm-path branch.
			
 
				+3. **`src/extraction/languages/<lang>.ts`** — new file exporting
			
 
				+   `export const <lang>Extractor: LanguageExtractor = { … }`. Map the node types
			
 
				+   from Step 3. Required fields: `functionTypes`, `classTypes`, `methodTypes`,
			
 
				+   `interfaceTypes`, `structTypes`, `enumTypes`, `typeAliasTypes`,
			
 
				+   `importTypes`, `callTypes`, `variableTypes`, `nameField`, `bodyField`,
			
 
				+   `paramsField`. Add hooks as the grammar needs them (`getSignature`,
			
 
				+   `getVisibility`, `isExported`, `extractImport`, `visitNode`, `getReceiverType`,
			
 
				+   `interfaceKind`, `enumMemberTypes`, etc. — see
			
 
				+   `src/extraction/tree-sitter-types.ts`).
			
 
				+4. **`src/extraction/languages/index.ts`** — `import { <lang>Extractor } from
			
 
				+   './<lang>';` and add `<lang>: <lang>Extractor,` to `EXTRACTORS`.
			
 
				+
			
 
				+**Sometimes a 5th, core touch in `src/extraction/tree-sitter.ts`** — variable
			
 
				+extraction has per-language branches in `extractVariable` (the generic fallback
			
 
				+only finds direct `identifier`/`variable_declarator` children). If the grammar
			
 
				+nests declared names (e.g. Lua's `variable_declaration → variable_list`), add a
			
 
				+`} else if (this.language === '<lang>')` branch there, mirroring the existing
			
 
				+ts/python/go ones. Import forms that aren't a distinct node (Lua/Ruby `require`
			
 
				+is a *call*) are handled in the extractor's `visitNode` hook instead.
			
 
				+
			
 
				+### Step 5 — Build + verify loop
			
 
				+
			
 
				+```bash
			
 
				+npm run build            # tsc + copy-assets (copies any vendored *.wasm into dist/)
			
 
				+```
			
 
				+Index a small sample repo and check extraction:
			
 
				+```bash
			
 
				+( cd <sample-repo> && codegraph init -i )
			
 
				+node scripts/add-lang/verify-extraction.mjs <sample-repo> <lang>
			
 
				+```
			
 
				+`verify-extraction.mjs` fails (exit 1) if the language isn't detected or only
			
 
				+`file`/`import` nodes were produced — the classic symptom of wrong node-type
			
 
				+names. On FAIL or a thin WARN: re-run `dump-ast.mjs` on a richer file, fix the
			
 
				+mappings in `<lang>.ts`, `npm run build`, re-index, re-verify. **Repeat until
			
 
				+PASS.**
			
 
				+
			
 
				+### Step 6 — Tests
			
 
				+
			
 
				+Add to `__tests__/extraction.test.ts`, modeled on the `Rust Extraction` block:
			
 
				+- a `detectLanguage` assertion in `describe('Language Detection')`
			
 
				+- a `describe('<Lang> Extraction')` block asserting functions/classes/imports
			
 
				+  are extracted from an inline source string.
			
 
				+```bash
			
 
				+npx vitest run __tests__/extraction.test.ts
			
 
				+```
			
 
				+Green before continuing.
			
 
				+
			
 
				+### Step 7 — Auto-pick 3 repos + corpus
			
 
				+
			
 
				+Pick **without asking**. Find candidates, then curate 3 that are genuinely
			
 
				+`<lang>`-dominant, one per size tier:
			
 
				+```bash
			
 
				+gh search repos --language=<lang> --sort=stars --limit 40 \
			
 
				+  --json fullName,stargazerCount,description
			
 
				+```
			
 
				+Tiers (match `corpus.json`): **Small** <~150 files · **Medium** ~150–1500 ·
			
 
				+**Large** >~1500. Skip repos that are tagged `<lang>` but mostly another
			
 
				+language. Write one cross-file architecture **question** per repo (the kind that
			
 
				+needs tracing across files). Add a `"<Language>"` block to
			
 
				+`.claude/skills/agent-eval/corpus.json` (fields: `name`, `repo`, `size`,
			
 
				+`files`, `question`) so `/agent-eval` can reuse them.
			
 
				+
			
 
				+### Step 8 — Benchmark all 3 (extraction + A/B)
			
 
				+
			
 
				+Make the dev build the codegraph on PATH **once**, then loop:
			
 
				+```bash
			
 
				+npm run build && ./scripts/local-install.sh
			
 
				+scripts/add-lang/bench.sh <lang> <name> <url> "<question>" headless   # ×3
			
 
				+```
			
 
				+`bench.sh` clones (shared `/tmp/codegraph-corpus`), wipes + indexes, runs
			
 
				+`verify-extraction.mjs`, then the with/without retrieval A/B via
			
 
				+`scripts/agent-eval/run-all.sh` (skips the paid A/B if extraction is broken).
			
 
				+Read each `parse-run.mjs` summary printed by `run-all.sh`: tool calls, file
			
 
				+`Read`s, Grep/Bash, codegraph-tool calls, duration, and **cost** — for both the
			
 
				+`with` and `without` arms. After the loop, restore the dev link if needed:
			
 
				+`./scripts/local-install.sh`.
			
 
				+
			
 
				+### Step 9 — Docs + CHANGELOG
			
 
				+
			
 
				+- **README.md**: add `<Lang>` to the "19+ Languages" feature bullet, and add a
			
 
				+  row to the **Supported Languages** table:
			
 
				+  `| <Lang> | \`.ext\` | Full support (classes, methods, …) |`.
			
 
				+- **CHANGELOG.md**: add an `## [Unreleased]` section at the top (above the
			
 
				+  latest version) with `### Added` → a user-perspective bullet, e.g.
			
 
				+  *"CodeGraph now indexes **<Lang>** (`.ext`) — functions, classes, imports, and
			
 
				+  call edges."* If `## [Unreleased]` already exists, append under it. (`/publish`
			
 
				+  folds this into the next versioned block at release time.)
			
 
				+
			
 
				+### Step 10 — Report (do NOT commit)
			
 
				+
			
 
				+Summarize for review:
			
 
				+- **Files changed**: the 4 wiring edits + new extractor + tests + README +
			
 
				+  CHANGELOG + corpus.json (+ any vendored `.wasm`).
			
 
				+- **Extraction** per repo: files / nodes / edges / `verify-extraction` result.
			
 
				+- **A/B** per repo: `with` vs `without` (tool calls, file Reads, cost) and a
			
 
				+  one-line verdict — did codegraph reduce effort, and did both arms reach a
			
 
				+  correct answer?
			
 
				+- **Gaps / follow-ups** (node types not yet mapped, resolution edges missing,
			
 
				+  framework routes, etc.).
			
 
				+
			
 
				+Hand the changes to the user. **Do not** run `git commit`/`push`,
			
 
				+`npm publish`, or `scripts/release.sh`.
			
 
				+
			
 
				+## Notes
			
 
				+- The A/B spawns real **paid** `claude -p` runs (opus, `--max-budget-usd`),
			
 
				+  2 arms × 3 repos. The corpus dir `/tmp/codegraph-corpus` is shared with
			
 
				+  `/agent-eval`, so clones are reused across runs.
			
 
				+- Any new `*.wasm` must live in `src/extraction/wasm/` — `copy-assets` (run by
			
 
				+  `npm run build`) ships it; otherwise it won't be in `dist/`.
			
 
				+- An index must be served by the **same** binary that built it. Step 8 builds +
			
 
				+  links the dev build first, so this holds.
			
 
				+- If a grammar can't be obtained, or extraction can't reach PASS, **STOP and
			
 
				+  report** — don't ship a half-wired language.
			
--- a/.claude/skills/agent-eval/corpus.json
+++ b/.claude/skills/agent-eval/corpus.json
@@ -59,5 +59,15 @@
 
				   ],
			
 
				   "Svelte": [
			
 
				     { "name": "shadcn-svelte", "repo": "https://github.com/huntabyte/shadcn-svelte", "size": "Medium", "files": "~600", "question": "How do shadcn-svelte components compose and apply their styling?" }
			
 
				+  ],
			
 
				+  "Lua": [
			
 
				+    { "name": "lualine.nvim", "repo": "https://github.com/nvim-lualine/lualine.nvim", "size": "Small", "files": "~120", "question": "How does lualine assemble and render its statusline sections and components?" },
			
 
				+    { "name": "telescope.nvim", "repo": "https://github.com/nvim-telescope/telescope.nvim", "size": "Medium", "files": "~80", "question": "How does Telescope wire a picker to its finder, sorter, and previewer?" },
			
 
				+    { "name": "kong", "repo": "https://github.com/Kong/kong", "size": "Large", "files": "~1330", "question": "How does Kong execute plugins across a request's lifecycle phases?" }
			
 
				+  ],
			
 
				+  "Luau": [
			
 
				+    { "name": "Knit", "repo": "https://github.com/Sleitnick/Knit", "size": "Small", "files": "~10", "question": "How does Knit register services and expose them to clients?" },
			
 
				+    { "name": "vide", "repo": "https://github.com/centau/vide", "size": "Small", "files": "~40", "question": "How does vide track reactive sources and re-run effects when state changes?" },
			
 
				+    { "name": "Fusion", "repo": "https://github.com/dphfox/Fusion", "size": "Medium", "files": "~115", "question": "How does Fusion build and update its reactive UI graph from state objects?" }
			
 
				   ]
			
 
				 }
			
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,20 @@ a [GitHub Release](https://github.com/colbymchenry/codegraph/releases) tagged
 
				 This project follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/)
			
 
				 and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
			
 
				 
			
 
				+## [Unreleased]
			
 
				+
			
 
				+### Added
			
 
				+- **Lua**: CodeGraph now indexes Lua (`.lua`) — functions, methods (table `t.f`
			
 
				+  and `t:m` definitions become methods with a `t::f` receiver-qualified name),
			
 
				+  local variables, `require(...)` imports, and the call edges between them.
			
 
				+  Querying a Lua project (Neovim plugins, Kong, OpenResty, game code) now
			
 
				+  surfaces its modules, methods, and call graph.
			
 
				+- **Luau** ([#232](https://github.com/colbymchenry/codegraph/issues/232)):
			
 
				+  CodeGraph now indexes Luau (`.luau`), Roblox's typed superset of Lua —
			
 
				+  everything Lua extracts, plus `type` / `export type` aliases, typed function
			
 
				+  signatures, generics, and Roblox instance-path `require(script.Parent.X)`
			
 
				+  imports.
			
 
				+
			
 
				 ## [0.8.0] - 2026-05-20
			
 
				 
			
 
				 ### Added
			
--- a/README.md
+++ b/README.md
@@ -107,7 +107,7 @@ The gains scale with codebase size: on large repos the agent answers from the in
 
				 | **Full-Text Search** | Find code by name instantly across your entire codebase, powered by FTS5 |
			
 
				 | **Impact Analysis** | Trace callers, callees, and the full impact radius of any symbol before making changes |
			
 
				 | **Always Fresh** | File watcher uses native OS events (FSEvents/inotify/ReadDirectoryChangesW) with debounced auto-sync — the graph stays current as you code, zero config |
			
 
				-| **19+ Languages** | TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Swift, Kotlin, Dart, Svelte, Liquid, Pascal/Delphi |
			
 
				+| **19+ Languages** | TypeScript, JavaScript, Python, Go, Rust, Java, C#, PHP, Ruby, C, C++, Swift, Kotlin, Dart, Lua, Luau, Svelte, Liquid, Pascal/Delphi |
			
 
				 | **Framework-aware Routes** | Recognizes web-framework routing files and links URL patterns to their handlers across 13 frameworks |
			
 
				 | **100% Local** | No data leaves your machine. No API keys. No external services. SQLite database only |
			
 
				 
			
@@ -447,6 +447,8 @@ The `.codegraph/config.json` file controls indexing:
 
				 | Vue | `.vue` | Full support (script + script-setup extraction, Nuxt page/API/middleware routes) |
			
 
				 | Liquid | `.liquid` | Full support |
			
 
				 | Pascal / Delphi | `.pas`, `.dpr`, `.dpk`, `.lpr` | Full support (classes, records, interfaces, enums, DFM/FMX form files) |
			
 
				+| Lua | `.lua` | Full support (functions, methods with receivers, local variables, `require` imports, call edges) |
			
 
				+| Luau | `.luau` | Full support (everything in Lua, plus `type`/`export type` aliases, typed signatures, and Roblox instance-path `require`) |
			
 
				 
			
 
				 ## Troubleshooting
			
 
				 
			
--- a/__tests__/extraction.test.ts
+++ b/__tests__/extraction.test.ts
@@ -3722,3 +3722,180 @@ class Svc {
 
				     expect(decoratedNode?.name).toBe('method');
			
 
				   });
			
 
				 });
			
 
				+
			
 
				+// =============================================================================
			
 
				+// Lua
			
 
				+// =============================================================================
			
 
				+
			
 
				+describe('Lua Extraction', () => {
			
 
				+  describe('Language detection', () => {
			
 
				+    it('should detect Lua files', () => {
			
 
				+      expect(detectLanguage('init.lua')).toBe('lua');
			
 
				+      expect(detectLanguage('src/util.lua')).toBe('lua');
			
 
				+    });
			
 
				+
			
 
				+    it('should report Lua as supported', () => {
			
 
				+      expect(isLanguageSupported('lua')).toBe(true);
			
 
				+      expect(getSupportedLanguages()).toContain('lua');
			
 
				+    });
			
 
				+  });
			
 
				+
			
 
				+  describe('Function extraction', () => {
			
 
				+    it('should extract global and local functions', () => {
			
 
				+      const code = `
			
 
				+function configure(opts) return opts end
			
 
				+local function helper(x) return x * 2 end
			
 
				+`;
			
 
				+      const result = extractFromSource('init.lua', code);
			
 
				+      const funcs = result.nodes.filter((n) => n.kind === 'function').map((n) => n.name);
			
 
				+      expect(funcs).toContain('configure');
			
 
				+      expect(funcs).toContain('helper');
			
 
				+      const configure = result.nodes.find((n) => n.name === 'configure');
			
 
				+      expect(configure?.language).toBe('lua');
			
 
				+      expect(configure?.signature).toBe('(opts)');
			
 
				+    });
			
 
				+
			
 
				+    it('should split table/method functions into a receiver and method name', () => {
			
 
				+      const code = `
			
 
				+function M.connect(host, port) return host end
			
 
				+function M:send(data) return self end
			
 
				+`;
			
 
				+      const result = extractFromSource('init.lua', code);
			
 
				+      const methods = result.nodes.filter((n) => n.kind === 'method');
			
 
				+      const connect = methods.find((m) => m.name === 'connect');
			
 
				+      expect(connect?.qualifiedName).toBe('M::connect');
			
 
				+      const send = methods.find((m) => m.name === 'send');
			
 
				+      expect(send?.qualifiedName).toBe('M::send');
			
 
				+    });
			
 
				+  });
			
 
				+
			
 
				+  describe('Variable extraction', () => {
			
 
				+    it('should extract local variable declarations', () => {
			
 
				+      const code = `
			
 
				+local M = {}
			
 
				+local count = 0
			
 
				+`;
			
 
				+      const result = extractFromSource('mod.lua', code);
			
 
				+      const vars = result.nodes.filter((n) => n.kind === 'variable').map((n) => n.name);
			
 
				+      expect(vars).toContain('M');
			
 
				+      expect(vars).toContain('count');
			
 
				+    });
			
 
				+  });
			
 
				+
			
 
				+  describe('Import extraction (require)', () => {
			
 
				+    it('should extract require() in local declarations and bare calls', () => {
			
 
				+      const code = `
			
 
				+local socket = require("socket")
			
 
				+local http = require "resty.http"
			
 
				+require("side.effect")
			
 
				+`;
			
 
				+      const result = extractFromSource('net.lua', code);
			
 
				+      const imports = result.nodes.filter((n) => n.kind === 'import').map((n) => n.name);
			
 
				+      expect(imports).toContain('socket');
			
 
				+      expect(imports).toContain('resty.http');
			
 
				+      expect(imports).toContain('side.effect');
			
 
				+
			
 
				+      const ref = result.unresolvedReferences.find(
			
 
				+        (r) => r.referenceKind === 'imports' && r.referenceName === 'socket'
			
 
				+      );
			
 
				+      expect(ref).toBeDefined();
			
 
				+    });
			
 
				+
			
 
				+    // Regression: the tree-sitter-wasms Lua grammar (ABI 13) corrupts the shared
			
 
				+    // WASM heap under web-tree-sitter 0.25, dropping nested calls/imports on every
			
 
				+    // parse after the first. We vendor the ABI-15 grammar instead — this guards it
			
 
				+    // by extracting several sources in sequence and asserting the LAST still works.
			
 
				+    it('should keep extracting require across many sequential parses', () => {
			
 
				+      let last;
			
 
				+      for (let i = 0; i < 8; i++) {
			
 
				+        last = extractFromSource(`f${i}.lua`, `local m = require("module.${i}")\nreturn m\n`);
			
 
				+      }
			
 
				+      const imports = last!.nodes.filter((n) => n.kind === 'import').map((n) => n.name);
			
 
				+      expect(imports).toContain('module.7');
			
 
				+    });
			
 
				+  });
			
 
				+
			
 
				+  describe('Call extraction', () => {
			
 
				+    it('should record intra-file calls as resolvable references', () => {
			
 
				+      const code = `
			
 
				+local function helper(x) return x end
			
 
				+local function run(y) return helper(y) end
			
 
				+`;
			
 
				+      const result = extractFromSource('calls.lua', code);
			
 
				+      const call = result.unresolvedReferences.find(
			
 
				+        (r) => r.referenceKind === 'calls' && r.referenceName === 'helper'
			
 
				+      );
			
 
				+      expect(call).toBeDefined();
			
 
				+    });
			
 
				+  });
			
 
				+});
			
 
				+
			
 
				+// =============================================================================
			
 
				+// Luau (typed superset of Lua — https://luau.org)
			
 
				+// =============================================================================
			
 
				+
			
 
				+describe('Luau Extraction', () => {
			
 
				+  describe('Language detection', () => {
			
 
				+    it('should detect Luau files', () => {
			
 
				+      expect(detectLanguage('init.luau')).toBe('luau');
			
 
				+      expect(detectLanguage('src/Client.luau')).toBe('luau');
			
 
				+    });
			
 
				+
			
 
				+    it('should report Luau as supported', () => {
			
 
				+      expect(isLanguageSupported('luau')).toBe(true);
			
 
				+      expect(getSupportedLanguages()).toContain('luau');
			
 
				+    });
			
 
				+  });
			
 
				+
			
 
				+  describe('Type aliases', () => {
			
 
				+    it('should extract `type` and `export type` definitions', () => {
			
 
				+      const code = `
			
 
				+export type Vector = { x: number, y: number }
			
 
				+type Handler = (msg: string) -> boolean
			
 
				+`;
			
 
				+      const result = extractFromSource('types.luau', code);
			
 
				+      const aliases = result.nodes.filter((n) => n.kind === 'type_alias');
			
 
				+      const vector = aliases.find((a) => a.name === 'Vector');
			
 
				+      expect(vector).toBeDefined();
			
 
				+      expect(vector?.isExported).toBe(true);
			
 
				+      const handler = aliases.find((a) => a.name === 'Handler');
			
 
				+      expect(handler).toBeDefined();
			
 
				+      expect(handler?.isExported).toBe(false);
			
 
				+    });
			
 
				+  });
			
 
				+
			
 
				+  describe('Typed functions and methods', () => {
			
 
				+    it('should capture typed signatures and split methods by receiver', () => {
			
 
				+      const code = `
			
 
				+function configure(opts: { debug: boolean }): boolean
			
 
				+	return opts.debug
			
 
				+end
			
 
				+function Client:fetch(path: string): Response
			
 
				+	return path
			
 
				+end
			
 
				+`;
			
 
				+      const result = extractFromSource('client.luau', code);
			
 
				+      const configure = result.nodes.find((n) => n.kind === 'function' && n.name === 'configure');
			
 
				+      expect(configure?.language).toBe('luau');
			
 
				+      expect(configure?.signature).toBe('(opts: { debug: boolean }): boolean');
			
 
				+      const fetch = result.nodes.find((n) => n.kind === 'method' && n.name === 'fetch');
			
 
				+      expect(fetch?.qualifiedName).toBe('Client::fetch');
			
 
				+    });
			
 
				+  });
			
 
				+
			
 
				+  describe('Imports and variables', () => {
			
 
				+    it('should extract string and Roblox instance-path require imports', () => {
			
 
				+      const code = `
			
 
				+local http = require("http")
			
 
				+local Signal = require(script.Parent.Signal)
			
 
				+local count = 0
			
 
				+`;
			
 
				+      const result = extractFromSource('mod.luau', code);
			
 
				+      const imports = result.nodes.filter((n) => n.kind === 'import').map((n) => n.name);
			
 
				+      expect(imports).toContain('http'); // string require
			
 
				+      expect(imports).toContain('Signal'); // Roblox instance-path require
			
 
				+      const vars = result.nodes.filter((n) => n.kind === 'variable').map((n) => n.name);
			
 
				+      expect(vars).toContain('count');
			
 
				+    });
			
 
				+  });
			
 
				+});
			
--- a/scripts/add-lang/bench.sh
+++ b/scripts/add-lang/bench.sh
@@ -0,0 +1,60 @@
 
				+#!/usr/bin/env bash
			
 
				+# Add-lang benchmark for ONE repo:
			
 
				+#   clone -> wipe+index (with the codegraph on PATH) -> verify extraction ->
			
 
				+#   with/without retrieval A/B (reuses scripts/agent-eval/run-all.sh).
			
 
				+#
			
 
				+# Assumes the codegraph dev build is already built + linked on PATH — the skill
			
 
				+# runs `npm run build && ./scripts/local-install.sh` ONCE before looping repos.
			
 
				+# The A/B is skipped if extraction fails its critical checks (don't burn $ on a
			
 
				+# broken extractor); set FORCE_AB=1 to run it anyway.
			
 
				+#
			
 
				+# Usage: bench.sh <lang> <repo-name> <repo-url> "<question>" [headless|tmux|all]
			
 
				+# Env:   CORPUS   corpus dir (default /tmp/codegraph-corpus, shared with agent-eval)
			
 
				+set -uo pipefail
			
 
				+
			
 
				+LANG_TOKEN="${1:?usage: bench.sh <lang> <repo-name> <repo-url> \"<question>\" [mode]}"
			
 
				+NAME="${2:?repo-name required}"
			
 
				+URL="${3:?repo-url required}"
			
 
				+Q="${4:?question required}"
			
 
				+MODE="${5:-headless}"
			
 
				+
			
 
				+HARNESS="$(cd "$(dirname "$0")" && pwd)"
			
 
				+AGENT_EVAL="$(cd "$HARNESS/../agent-eval" && pwd)"
			
 
				+CORPUS="${CORPUS:-/tmp/codegraph-corpus}"
			
 
				+REPO="$CORPUS/$NAME"
			
 
				+
			
 
				+command -v codegraph >/dev/null || { echo "no codegraph on PATH (build + ./scripts/local-install.sh first)"; exit 1; }
			
 
				+
			
 
				+echo "==================== add-lang bench: $NAME ($LANG_TOKEN) ===================="
			
 
				+echo "codegraph: $(command -v codegraph) -> $(codegraph --version 2>/dev/null || echo '?')"
			
 
				+
			
 
				+# 1. Ensure the repo (shallow clone, reuse if present).
			
 
				+mkdir -p "$CORPUS"
			
 
				+if [ -d "$REPO/.git" ]; then
			
 
				+  echo "→ reusing checkout: $REPO"
			
 
				+else
			
 
				+  echo "→ cloning $URL"
			
 
				+  git clone --depth 1 "$URL" "$REPO" || { echo "git clone failed"; exit 1; }
			
 
				+fi
			
 
				+
			
 
				+# 2. Wipe + index with the binary under test.
			
 
				+echo "→ wiping .codegraph and indexing"
			
 
				+rm -rf "$REPO/.codegraph"
			
 
				+( cd "$REPO" && codegraph init -i ) || { echo "indexing failed"; exit 1; }
			
 
				+
			
 
				+# 3. Verify extraction (cheap guard before the paid A/B).
			
 
				+echo "→ verifying extraction"
			
 
				+node "$HARNESS/verify-extraction.mjs" "$REPO" "$LANG_TOKEN"
			
 
				+VERIFY=$?
			
 
				+
			
 
				+# 4. Retrieval A/B (skipped if extraction is broken, unless FORCE_AB=1).
			
 
				+if [ "$VERIFY" -ne 0 ] && [ "${FORCE_AB:-0}" != "1" ]; then
			
 
				+  echo "→ SKIPPING A/B — extraction failed critical checks (set FORCE_AB=1 to override)"
			
 
				+else
			
 
				+  echo "→ retrieval A/B (mode=$MODE)"
			
 
				+  bash "$AGENT_EVAL/run-all.sh" "$REPO" "$Q" "$MODE"
			
 
				+fi
			
 
				+
			
 
				+echo "==================== bench complete: $NAME (verify exit=$VERIFY) ===================="
			
 
				+# Exit reflects extraction: 0 = pass/warn, 1 = critical fail, 2 = couldn't read status.
			
 
				+exit "$VERIFY"
			
--- a/scripts/add-lang/check-grammar.mjs
+++ b/scripts/add-lang/check-grammar.mjs
@@ -0,0 +1,75 @@
 
				+#!/usr/bin/env node
			
 
				+// Verify a tree-sitter grammar wasm is HEALTHY under the project's web-tree-sitter
			
 
				+// runtime BEFORE writing an extractor. Prints the ABI version and parses a valid
			
 
				+// sample many times in a multi-grammar context, to catch heap-corruption bugs
			
 
				+// that silently drop nodes on every parse after the first.
			
 
				+//
			
 
				+// Why this exists: the tree-sitter-wasms Lua grammar is ABI 13 and corrupts the
			
 
				+// shared WASM heap under web-tree-sitter 0.25 — Lua extraction degraded on every
			
 
				+// file after the first (nested calls/imports vanished). The fix was to vendor the
			
 
				+// upstream ABI-15 wasm. Run this on any new grammar first; if it FAILs, vendor a
			
 
				+// newer build instead of using the tree-sitter-wasms one.
			
 
				+//
			
 
				+// Usage: node scripts/add-lang/check-grammar.mjs <lang|wasm-path> <valid-sample> [iterations]
			
 
				+// Exit: 0 healthy, 1 corruption / parse errors, 2 could not run.
			
 
				+// NOTE: the sample must be SYNTACTICALLY VALID — a broken sample fails for the
			
 
				+//       wrong reason.
			
 
				+
			
 
				+import { readFileSync, existsSync } from 'node:fs';
			
 
				+import { createRequire } from 'node:module';
			
 
				+import { Parser, Language } from 'web-tree-sitter';
			
 
				+
			
 
				+const require = createRequire(import.meta.url);
			
 
				+const fail = (code, msg) => { console.error(`[check-grammar] ${msg}`); process.exit(code); };
			
 
				+
			
 
				+const [token, sample, iterArg] = process.argv.slice(2);
			
 
				+if (!token || !sample) fail(2, 'usage: check-grammar.mjs <lang|wasm-path> <valid-sample> [iterations]');
			
 
				+if (!existsSync(sample)) fail(2, `sample not found: ${sample}`);
			
 
				+const iters = iterArg ? parseInt(iterArg, 10) : 20;
			
 
				+
			
 
				+const SPECIAL = { csharp: 'c_sharp', 'c#': 'c_sharp' };
			
 
				+function resolveWasm(t) {
			
 
				+  if (t.endsWith('.wasm')) return existsSync(t) ? t : fail(2, `wasm not found: ${t}`);
			
 
				+  const base = SPECIAL[t.toLowerCase()] ?? t.toLowerCase();
			
 
				+  try { return require.resolve(`tree-sitter-wasms/out/tree-sitter-${base}.wasm`); } catch { /* try vendored */ }
			
 
				+  const vendored = `src/extraction/wasm/tree-sitter-${base}.wasm`;
			
 
				+  if (existsSync(vendored)) return vendored;
			
 
				+  return fail(2, `no grammar for "${t}" — not in tree-sitter-wasms and not vendored`);
			
 
				+}
			
 
				+
			
 
				+const wasmPath = resolveWasm(token);
			
 
				+const source = readFileSync(sample, 'utf8');
			
 
				+
			
 
				+try { await Parser.init(); }
			
 
				+catch { await Parser.init({ locateFile: () => require.resolve('web-tree-sitter/tree-sitter.wasm') }); }
			
 
				+
			
 
				+// Load a second, known-good grammar — the corruption surfaces under the
			
 
				+// multi-grammar runtime that real indexing uses, not a single grammar in isolation.
			
 
				+try { await Language.load(require.resolve('tree-sitter-wasms/out/tree-sitter-python.wasm')); } catch { /* ok */ }
			
 
				+
			
 
				+let language;
			
 
				+try { language = await Language.load(wasmPath); }
			
 
				+catch (e) { fail(2, `failed to load ${wasmPath}: ${e.message}`); }
			
 
				+
			
 
				+const parser = new Parser();
			
 
				+parser.setLanguage(language);
			
 
				+
			
 
				+let ok = 0, err = 0;
			
 
				+for (let i = 0; i < iters; i++) {
			
 
				+  const tree = parser.parse(source);
			
 
				+  if (tree.rootNode.hasError) err++; else ok++;
			
 
				+}
			
 
				+
			
 
				+console.log(`grammar: ${wasmPath.split('/').pop()}`);
			
 
				+console.log(`  ABI version: ${language.abiVersion}`);
			
 
				+console.log(`  parses: ${ok} clean / ${err} with errors (of ${iters})`);
			
 
				+if (err > 0) {
			
 
				+  console.log(
			
 
				+    `RESULT: FAIL — ${err}/${iters} parses produced ERROR trees on a valid sample. ` +
			
 
				+    `This grammar corrupts under web-tree-sitter; vendor a newer (ABI 14/15) wasm ` +
			
 
				+    `(see SKILL.md "Find a grammar"). Confirm your sample is syntactically valid first.`
			
 
				+  );
			
 
				+  process.exit(1);
			
 
				+}
			
 
				+console.log('RESULT: PASS — grammar parses cleanly and reuses safely.');
			
 
				+process.exit(0);
			
--- a/scripts/add-lang/dump-ast.mjs
+++ b/scripts/add-lang/dump-ast.mjs
@@ -0,0 +1,103 @@
 
				+#!/usr/bin/env node
			
 
				+// Dump the tree-sitter AST for a sample file so you can write a LanguageExtractor
			
 
				+// mapping. Loads a grammar .wasm directly via web-tree-sitter (the same runtime
			
 
				+// codegraph uses) — you do NOT need to register the language first.
			
 
				+//
			
 
				+// Usage:
			
 
				+//   node scripts/add-lang/dump-ast.mjs <lang|wasm-path> <sample-file> [--depth=N] [--full]
			
 
				+// Examples:
			
 
				+//   node scripts/add-lang/dump-ast.mjs lua sample.lua
			
 
				+//   node scripts/add-lang/dump-ast.mjs src/extraction/wasm/tree-sitter-zig.wasm a.zig --depth=4
			
 
				+//
			
 
				+// Output: an indented AST (named nodes, with field names) followed by a
			
 
				+// node-type FREQUENCY table. The frequency table is the payoff — it tells you
			
 
				+// which node types to map to functionTypes / classTypes / importTypes / etc.
			
 
				+
			
 
				+import { readFileSync, existsSync } from 'node:fs';
			
 
				+import { createRequire } from 'node:module';
			
 
				+import { Parser, Language } from 'web-tree-sitter';
			
 
				+
			
 
				+const require = createRequire(import.meta.url);
			
 
				+const fail = (msg) => { console.error(`[dump-ast] ${msg}`); process.exit(1); };
			
 
				+
			
 
				+const argv = process.argv.slice(2);
			
 
				+const positional = argv.filter((a) => !a.startsWith('--'));
			
 
				+const [langOrWasm, sampleFile] = positional;
			
 
				+const depthFlag = argv.find((a) => a.startsWith('--depth='));
			
 
				+const showAll = argv.includes('--full'); // also print anonymous (token) nodes
			
 
				+const maxDepth = depthFlag ? parseInt(depthFlag.split('=')[1], 10) : (showAll ? Infinity : 8);
			
 
				+
			
 
				+if (!langOrWasm || !sampleFile) {
			
 
				+  fail('usage: dump-ast.mjs <lang|wasm-path> <sample-file> [--depth=N] [--full]');
			
 
				+}
			
 
				+if (!existsSync(sampleFile)) fail(`sample file not found: ${sampleFile}`);
			
 
				+
			
 
				+// Language tokens whose tree-sitter-wasms filename differs from the token.
			
 
				+const WASM_SPECIAL = { csharp: 'c_sharp', 'c#': 'c_sharp' };
			
 
				+
			
 
				+function resolveWasm(token) {
			
 
				+  if (token.endsWith('.wasm')) {
			
 
				+    if (!existsSync(token)) fail(`wasm not found: ${token}`);
			
 
				+    return token;
			
 
				+  }
			
 
				+  const base = WASM_SPECIAL[token.toLowerCase()] ?? token.toLowerCase();
			
 
				+  try {
			
 
				+    return require.resolve(`tree-sitter-wasms/out/tree-sitter-${base}.wasm`);
			
 
				+  } catch {
			
 
				+    /* not in tree-sitter-wasms — try a vendored copy */
			
 
				+  }
			
 
				+  const vendored = `src/extraction/wasm/tree-sitter-${base}.wasm`;
			
 
				+  if (existsSync(vendored)) return vendored;
			
 
				+  fail(
			
 
				+    `no grammar for "${token}" — not in tree-sitter-wasms and not vendored at ` +
			
 
				+      `${vendored}. Pass an explicit .wasm path, or vendor one (see SKILL.md "Find a grammar").`
			
 
				+  );
			
 
				+}
			
 
				+
			
 
				+const wasmPath = resolveWasm(langOrWasm);
			
 
				+const source = readFileSync(sampleFile, 'utf8');
			
 
				+
			
 
				+try {
			
 
				+  await Parser.init();
			
 
				+} catch {
			
 
				+  await Parser.init({ locateFile: () => require.resolve('web-tree-sitter/tree-sitter.wasm') });
			
 
				+}
			
 
				+
			
 
				+let language;
			
 
				+try {
			
 
				+  language = await Language.load(wasmPath);
			
 
				+} catch (e) {
			
 
				+  fail(`failed to load grammar ${wasmPath}: ${e.message}`);
			
 
				+}
			
 
				+
			
 
				+const parser = new Parser();
			
 
				+parser.setLanguage(language);
			
 
				+const tree = parser.parse(source);
			
 
				+
			
 
				+const freq = new Map();
			
 
				+const snippet = (node) => {
			
 
				+  const t = node.text.replace(/\s+/g, ' ').trim();
			
 
				+  return t.length > 48 ? `${t.slice(0, 48)}…` : t;
			
 
				+};
			
 
				+
			
 
				+function walk(node, depth, fieldName) {
			
 
				+  if (node.isNamed) freq.set(node.type, (freq.get(node.type) || 0) + 1);
			
 
				+  if ((node.isNamed || showAll) && depth <= maxDepth) {
			
 
				+    const field = fieldName ? `${fieldName}: ` : '';
			
 
				+    const leaf = node.childCount === 0 ? `  "${snippet(node)}"` : '';
			
 
				+    console.log(`${'  '.repeat(depth)}${field}${node.type}  @${node.startPosition.row + 1}:${node.startPosition.column}${leaf}`);
			
 
				+  }
			
 
				+  for (let i = 0; i < node.childCount; i++) {
			
 
				+    const child = node.child(i);
			
 
				+    if (child) walk(child, depth + 1, node.fieldNameForChild(i));
			
 
				+  }
			
 
				+}
			
 
				+
			
 
				+console.log(`\n# AST for ${sampleFile}  (grammar: ${wasmPath.split('/').pop()})\n`);
			
 
				+walk(tree.rootNode, 0, null);
			
 
				+
			
 
				+console.log('\n# Node-type frequency (named nodes) — map the relevant ones in your extractor:\n');
			
 
				+[...freq.entries()]
			
 
				+  .sort((a, b) => b[1] - a[1])
			
 
				+  .forEach(([type, n]) => console.log(`  ${String(n).padStart(5)}  ${type}`));
			
 
				+console.log();
			
--- a/scripts/add-lang/verify-extraction.mjs
+++ b/scripts/add-lang/verify-extraction.mjs
@@ -0,0 +1,70 @@
 
				+#!/usr/bin/env node
			
 
				+// Sanity-check that codegraph extracted REAL symbols (not just file/import nodes)
			
 
				+// from a repo for a given language. Exits non-zero on a critical failure so it
			
 
				+// can drive a write-extractor -> build -> re-check loop.
			
 
				+//
			
 
				+// Usage: node scripts/add-lang/verify-extraction.mjs <repo-path> <lang>
			
 
				+// Reads `codegraph status <repo> --json` using whatever codegraph is on PATH,
			
 
				+// so it reflects the binary that built the index.
			
 
				+//
			
 
				+// Exit codes: 0 = pass or soft-warn, 1 = critical fail, 2 = could not run.
			
 
				+
			
 
				+import { execFileSync } from 'node:child_process';
			
 
				+
			
 
				+const [repo, lang] = process.argv.slice(2);
			
 
				+if (!repo || !lang) {
			
 
				+  console.error('usage: verify-extraction.mjs <repo-path> <lang>');
			
 
				+  process.exit(2);
			
 
				+}
			
 
				+
			
 
				+let status;
			
 
				+try {
			
 
				+  const out = execFileSync('codegraph', ['status', repo, '--json'], { encoding: 'utf8' });
			
 
				+  status = JSON.parse(out);
			
 
				+} catch (e) {
			
 
				+  console.error(`[verify] could not read codegraph status for ${repo}: ${e.message}`);
			
 
				+  process.exit(2);
			
 
				+}
			
 
				+
			
 
				+// Kinds that prove the extractor mapped AST node types (everything except
			
 
				+// 'file' and 'import', which codegraph creates structurally for any language).
			
 
				+const SYMBOL_KINDS = new Set([
			
 
				+  'module', 'class', 'struct', 'interface', 'trait', 'protocol', 'function',
			
 
				+  'method', 'property', 'field', 'variable', 'constant', 'enum', 'enum_member',
			
 
				+  'type_alias', 'namespace', 'route', 'component',
			
 
				+]);
			
 
				+
			
 
				+const byKind = status.nodesByKind || {};
			
 
				+const langs = status.languages || [];
			
 
				+const files = status.fileCount || 0;
			
 
				+const edges = status.edgeCount || 0;
			
 
				+const symbolKinds = Object.keys(byKind).filter((k) => SYMBOL_KINDS.has(k));
			
 
				+const symbolCount = symbolKinds.reduce((s, k) => s + byKind[k], 0);
			
 
				+
			
 
				+const checks = [];
			
 
				+const add = (severity, ok, label, detail) => checks.push({ severity, ok, label, detail });
			
 
				+
			
 
				+add('critical', status.initialized === true, 'index initialized', `initialized=${status.initialized}`);
			
 
				+add('critical', langs.includes(lang), `language "${lang}" detected`, `languages=[${langs.join(', ')}]`);
			
 
				+add('critical', symbolCount > 0, 'structural symbols extracted', `${symbolCount} symbols (${symbolKinds.join(', ') || 'NONE — only file/import nodes!'})`);
			
 
				+add('soft', symbolCount >= files, 'symbol density >= 1/file', `${symbolCount} symbols across ${files} files`);
			
 
				+add('soft', edges > files, 'edges resolved', `${edges} edges across ${files} files`);
			
 
				+
			
 
				+console.log(`\n# Extraction check — ${repo}  (lang=${lang}, backend=${status.backend})`);
			
 
				+console.log(`  files=${files} nodes=${status.nodeCount} edges=${edges}`);
			
 
				+console.log(`  nodesByKind: ${JSON.stringify(byKind)}\n`);
			
 
				+for (const c of checks) console.log(`  ${c.ok ? '✓' : '✗'} ${c.label} — ${c.detail}`);
			
 
				+
			
 
				+const critical = checks.filter((c) => !c.ok && c.severity === 'critical');
			
 
				+const soft = checks.filter((c) => !c.ok && c.severity === 'soft');
			
 
				+console.log();
			
 
				+if (critical.length) {
			
 
				+  console.log(`RESULT: FAIL (${critical.length} critical) — extractor or grammar wiring is broken. Re-run dump-ast.mjs and fix the node-type mappings.`);
			
 
				+  process.exit(1);
			
 
				+}
			
 
				+if (soft.length) {
			
 
				+  console.log(`RESULT: WARN (${soft.length} soft) — extraction works but looks thin; inspect the counts above.`);
			
 
				+  process.exit(0);
			
 
				+}
			
 
				+console.log('RESULT: PASS — extraction looks healthy.');
			
 
				+process.exit(0);
			
--- a/src/extraction/grammars.ts
+++ b/src/extraction/grammars.ts
@@ -35,6 +35,8 @@ const WASM_GRAMMAR_FILES: Record<GrammarLanguage, string> = {
 
				   dart: 'tree-sitter-dart.wasm',
			
 
				   pascal: 'tree-sitter-pascal.wasm',
			
 
				   scala: 'tree-sitter-scala.wasm',
			
 
				+  lua: 'tree-sitter-lua.wasm',
			
 
				+  luau: 'tree-sitter-luau.wasm',
			
 
				 };
			
 
				 
			
 
				 /**
			
@@ -78,6 +80,8 @@ export const EXTENSION_MAP: Record<string, Language> = {
 
				   '.fmx': 'pascal',
			
 
				   '.scala': 'scala',
			
 
				   '.sc': 'scala',
			
 
				+  '.lua': 'lua',
			
 
				+  '.luau': 'luau',
			
 
				 };
			
 
				 
			
 
				 /**
			
@@ -125,8 +129,12 @@ export async function loadGrammarsForLanguages(languages: Language[]): Promise<v
 
				   for (const lang of toLoad) {
			
 
				     const wasmFile = WASM_GRAMMAR_FILES[lang];
			
 
				     try {
			
 
				-      // Pascal and Scala ship their own WASMs (not in tree-sitter-wasms)
			
 
				-      const wasmPath = (lang === 'pascal' || lang === 'scala')
			
 
				+      // Some grammars ship their own WASMs (not in tree-sitter-wasms, or the
			
 
				+      // tree-sitter-wasms build is too old). Lua: tree-sitter-wasms ships an
			
 
				+      // ABI-13 build that corrupts the shared WASM heap under web-tree-sitter
			
 
				+      // 0.25 (drops nested calls/imports on every file after the first); we
			
 
				+      // vendor the upstream ABI-15 wasm instead.
			
 
				+      const wasmPath = (lang === 'pascal' || lang === 'scala' || lang === 'lua' || lang === 'luau')
			
 
				         ? path.join(__dirname, 'wasm', wasmFile)
			
 
				         : require.resolve(`tree-sitter-wasms/out/${wasmFile}`);
			
 
				       const language = await WasmLanguage.load(wasmPath);
			
@@ -291,6 +299,8 @@ export function getLanguageDisplayName(language: Language): string {
 
				     liquid: 'Liquid',
			
 
				     pascal: 'Pascal / Delphi',
			
 
				     scala: 'Scala',
			
 
				+    lua: 'Lua',
			
 
				+    luau: 'Luau',
			
 
				     unknown: 'Unknown',
			
 
				   };
			
 
				   return names[language] || language;
			
--- a/src/extraction/languages/index.ts
+++ b/src/extraction/languages/index.ts
@@ -23,6 +23,8 @@ import { kotlinExtractor } from './kotlin';
 
				 import { dartExtractor } from './dart';
			
 
				 import { pascalExtractor } from './pascal';
			
 
				 import { scalaExtractor } from './scala';
			
 
				+import { luaExtractor } from './lua';
			
 
				+import { luauExtractor } from './luau';
			
 
				 
			
 
				 export const EXTRACTORS: Partial<Record<Language, LanguageExtractor>> = {
			
 
				   typescript: typescriptExtractor,
			
@@ -43,4 +45,6 @@ export const EXTRACTORS: Partial<Record<Language, LanguageExtractor>> = {
 
				   dart: dartExtractor,
			
 
				   pascal: pascalExtractor,
			
 
				   scala: scalaExtractor,
			
 
				+  lua: luaExtractor,
			
 
				+  luau: luauExtractor,
			
 
				 };
			
--- a/src/extraction/languages/lua.ts
+++ b/src/extraction/languages/lua.ts
@@ -0,0 +1,152 @@
 
				+import type { Node as SyntaxNode } from 'web-tree-sitter';
			
 
				+import { getNodeText, getChildByField } from '../tree-sitter-helpers';
			
 
				+import type { LanguageExtractor } from '../tree-sitter-types';
			
 
				+
			
 
				+// Node names follow the vendored ABI-15 grammar (@tree-sitter-grammars/
			
 
				+// tree-sitter-lua), NOT the older tree-sitter-wasms build — see grammars.ts.
			
 
				+
			
 
				+/** First descendant of a given type (breadth-first), or null. */
			
 
				+function findDescendant(node: SyntaxNode, type: string): SyntaxNode | null {
			
 
				+  const queue: SyntaxNode[] = [...node.namedChildren];
			
 
				+  while (queue.length) {
			
 
				+    const n = queue.shift()!;
			
 
				+    if (n.type === type) return n;
			
 
				+    queue.push(...n.namedChildren);
			
 
				+  }
			
 
				+  return null;
			
 
				+}
			
 
				+
			
 
				+/**
			
 
				+ * If `callNode` is a `require(...)` call, return the module name; otherwise null.
			
 
				+ * Lua/Luau have no import statement — modules are loaded by calling the global
			
 
				+ * `require`. Handles both:
			
 
				+ *   - string requires:  `require("net.http")` / `require "net.http"`  → "net.http"
			
 
				+ *   - Roblox/Luau path requires: `require(script.Parent.Signal)`      → "Signal"
			
 
				+ *     (the dominant idiom in Roblox code, where the argument is an instance path
			
 
				+ *     rather than a string — use the trailing field as the module name).
			
 
				+ */
			
 
				+function requireModule(callNode: SyntaxNode, source: string): string | null {
			
 
				+  // function_call > name: <callee>, arguments: arguments
			
 
				+  const name = getChildByField(callNode, 'name');
			
 
				+  // A dotted/colon callee (e.g. `socket.connect`) is dot/method_index_expression,
			
 
				+  // never a bare `require`.
			
 
				+  if (!name || name.type !== 'identifier') return null;
			
 
				+  if (getNodeText(name, source) !== 'require') return null;
			
 
				+
			
 
				+  const args = getChildByField(callNode, 'arguments');
			
 
				+  if (!args) return null;
			
 
				+
			
 
				+  // String require — `string > content: string_content` gives the bare name.
			
 
				+  const content = findDescendant(args, 'string_content');
			
 
				+  if (content) return getNodeText(content, source).trim() || null;
			
 
				+  const str = findDescendant(args, 'string');
			
 
				+  if (str) {
			
 
				+    const mod = getNodeText(str, source)
			
 
				+      .trim()
			
 
				+      .replace(/^\[\[/, '')
			
 
				+      .replace(/\]\]$/, '')
			
 
				+      .replace(/^["']/, '')
			
 
				+      .replace(/["']$/, '');
			
 
				+    if (mod) return mod;
			
 
				+  }
			
 
				+
			
 
				+  // Roblox/Luau instance-path require: `require(script.Parent.Signal)` → "Signal".
			
 
				+  const idx = findDescendant(args, 'dot_index_expression') ?? findDescendant(args, 'method_index_expression');
			
 
				+  if (idx) {
			
 
				+    const field = getChildByField(idx, 'field') ?? getChildByField(idx, 'method');
			
 
				+    if (field) return getNodeText(field, source).trim() || null;
			
 
				+  }
			
 
				+  return null;
			
 
				+}
			
 
				+
			
 
				+export const luaExtractor: LanguageExtractor = {
			
 
				+  // function_declaration covers global (`function f`), table (`function t.f`),
			
 
				+  // method (`function t:m`), and local (`local function f`) forms — the form is
			
 
				+  // distinguished by the `name:` child (identifier / dot_index_expression /
			
 
				+  // method_index_expression) and a `local` token, not by separate node types.
			
 
				+  // Anonymous `function() ... end` (function_definition) has no name and is
			
 
				+  // captured via its enclosing variable instead.
			
 
				+  functionTypes: ['function_declaration'],
			
 
				+  classTypes: [], // Lua has no classes/structs/interfaces/enums — tables are used for everything
			
 
				+  methodTypes: [],
			
 
				+  interfaceTypes: [],
			
 
				+  structTypes: [],
			
 
				+  enumTypes: [],
			
 
				+  typeAliasTypes: [],
			
 
				+  importTypes: [], // `require` is a function_call — handled in visitNode below
			
 
				+  callTypes: ['function_call'],
			
 
				+  variableTypes: ['variable_declaration'], // see the `lua` branch in extractVariable
			
 
				+  nameField: 'name',
			
 
				+  bodyField: 'body',
			
 
				+  paramsField: 'parameters',
			
 
				+
			
 
				+  getSignature: (node, source) => {
			
 
				+    const params = getChildByField(node, 'parameters');
			
 
				+    return params ? getNodeText(params, source) : undefined;
			
 
				+  },
			
 
				+
			
 
				+  // `function t.f()` / `function t:m()` are methods on table `t`: return the
			
 
				+  // table as the receiver so they extract as methods with a `t::f` qualified
			
 
				+  // name. Plain `function f()` / `local function f()` have no receiver and stay
			
 
				+  // functions. (For `a.b.c`, the receiver is the nested `a.b`.)
			
 
				+  getReceiverType: (node, source) => {
			
 
				+    const name = getChildByField(node, 'name');
			
 
				+    if (name && (name.type === 'dot_index_expression' || name.type === 'method_index_expression')) {
			
 
				+      const table = getChildByField(name, 'table');
			
 
				+      if (table) return getNodeText(table, source);
			
 
				+    }
			
 
				+    return undefined;
			
 
				+  },
			
 
				+
			
 
				+  // Emit import nodes for `require(...)`. The local-declaration form is handled
			
 
				+  // explicitly because the variable branch skips the initializer subtree; bare
			
 
				+  // and global `require` calls are caught when the walker reaches the
			
 
				+  // function_call node.
			
 
				+  visitNode: (node, ctx) => {
			
 
				+    const source = ctx.source;
			
 
				+
			
 
				+    const emit = (callNode: SyntaxNode): void => {
			
 
				+      const mod = requireModule(callNode, source);
			
 
				+      if (!mod) return;
			
 
				+      const imp = ctx.createNode('import', mod, callNode, {
			
 
				+        signature: getNodeText(callNode, source).trim().slice(0, 100),
			
 
				+      });
			
 
				+      if (imp && ctx.nodeStack.length > 0) {
			
 
				+        const parentId = ctx.nodeStack[ctx.nodeStack.length - 1];
			
 
				+        if (parentId) {
			
 
				+          ctx.addUnresolvedReference({
			
 
				+            fromNodeId: parentId,
			
 
				+            referenceName: mod,
			
 
				+            referenceKind: 'imports',
			
 
				+            line: callNode.startPosition.row + 1,
			
 
				+            column: callNode.startPosition.column,
			
 
				+          });
			
 
				+        }
			
 
				+      }
			
 
				+    };
			
 
				+
			
 
				+    // Bare / global `require("x")` — claim it so it isn't double-counted as a call.
			
 
				+    if (node.type === 'function_call') {
			
 
				+      if (requireModule(node, source)) {
			
 
				+        emit(node);
			
 
				+        return true;
			
 
				+      }
			
 
				+      return false;
			
 
				+    }
			
 
				+
			
 
				+    // `local x = require("x")` — variable_declaration wraps an assignment_statement
			
 
				+    // whose initializer subtree the variable branch will skip, so dig it out here.
			
 
				+    if (node.type === 'variable_declaration') {
			
 
				+      const assign = node.namedChildren.find((c) => c.type === 'assignment_statement');
			
 
				+      const exprList = assign?.namedChildren.find((c) => c.type === 'expression_list');
			
 
				+      if (exprList) {
			
 
				+        for (const val of exprList.namedChildren) {
			
 
				+          if (val.type === 'function_call') emit(val);
			
 
				+        }
			
 
				+      }
			
 
				+      return false;
			
 
				+    }
			
 
				+
			
 
				+    return false;
			
 
				+  },
			
 
				+};
			
--- a/src/extraction/languages/luau.ts
+++ b/src/extraction/languages/luau.ts
@@ -0,0 +1,36 @@
 
				+import { getNodeText, getChildByField } from '../tree-sitter-helpers';
			
 
				+import type { LanguageExtractor } from '../tree-sitter-types';
			
 
				+import { luaExtractor } from './lua';
			
 
				+
			
 
				+// Luau (https://luau.org) is a gradually-typed superset of Lua. The
			
 
				+// tree-sitter-luau grammar reuses the same node names as the vendored Lua
			
 
				+// grammar (function_declaration, variable_declaration, function_call,
			
 
				+// dot/method_index_expression, …), so the Luau extractor extends the Lua one
			
 
				+// and adds the type-system pieces Luau introduces:
			
 
				+//   - `type X = ...` / `export type X = ...`  → type_definition (type_alias)
			
 
				+//   - typed parameters and return types        → richer signatures
			
 
				+//
			
 
				+// require detection, receiver-splitting (t.f / t:m → methods), and local
			
 
				+// variable extraction are inherited unchanged from luaExtractor. The shared
			
 
				+// `extractVariable` core branch is gated on `lua` || `luau`.
			
 
				+export const luauExtractor: LanguageExtractor = {
			
 
				+  ...luaExtractor,
			
 
				+
			
 
				+  // `type X = ...` and `export type X = ...`
			
 
				+  typeAliasTypes: ['type_definition'],
			
 
				+
			
 
				+  // Only Luau `export type` is exported; the keyword leads the node.
			
 
				+  isExported: (node, source) => source.slice(node.startIndex, node.startIndex + 7) === 'export ',
			
 
				+
			
 
				+  // Params + Luau return type (the named child after `parameters`, before the body).
			
 
				+  getSignature: (node, source) => {
			
 
				+    const params = getChildByField(node, 'parameters');
			
 
				+    if (!params) return undefined;
			
 
				+    let sig = getNodeText(params, source);
			
 
				+    const kids = node.namedChildren;
			
 
				+    const idx = kids.findIndex((c) => c.startIndex === params.startIndex);
			
 
				+    const ret = idx >= 0 ? kids[idx + 1] : null;
			
 
				+    if (ret && ret.type !== 'block') sig += `: ${getNodeText(ret, source)}`;
			
 
				+    return sig;
			
 
				+  },
			
 
				+};
			
--- a/src/extraction/tree-sitter.ts
+++ b/src/extraction/tree-sitter.ts
@@ -50,6 +50,17 @@ function extractName(node: SyntaxNode, source: string, extractor: LanguageExtrac
 
				       const innerName = getChildByField(resolved, 'declarator') || resolved.namedChild(0);
			
 
				       return innerName ? getNodeText(innerName, source) : getNodeText(resolved, source);
			
 
				     }
			
 
				+    // Lua: `function t.f()` / `function t:m()` — the name node is a dot/method
			
 
				+    // index expression; the simple name is the trailing field/method (the table
			
 
				+    // receiver is captured separately via getReceiverType).
			
 
				+    if (resolved.type === 'dot_index_expression') {
			
 
				+      const field = getChildByField(resolved, 'field');
			
 
				+      if (field) return getNodeText(field, source);
			
 
				+    }
			
 
				+    if (resolved.type === 'method_index_expression') {
			
 
				+      const method = getChildByField(resolved, 'method');
			
 
				+      if (method) return getNodeText(method, source);
			
 
				+    }
			
 
				     return getNodeText(resolved, source);
			
 
				   }
			
 
				 
			
@@ -1111,6 +1122,23 @@ export class TreeSitterExtractor {
 
				           }
			
 
				         }
			
 
				       }
			
 
				+    } else if (this.language === 'lua' || this.language === 'luau') {
			
 
				+      // Lua/Luau: variable_declaration → assignment_statement → variable_list
			
 
				+      //      (name: identifier...) = expression_list. `local x, y = 1, 2`
			
 
				+      //      declares multiple names; only plain identifiers are locals.
			
 
				+      const assign = node.namedChildren.find((c) => c.type === 'assignment_statement') ?? node;
			
 
				+      const varList = assign.namedChildren.find((c) => c.type === 'variable_list');
			
 
				+      const exprList = assign.namedChildren.find((c) => c.type === 'expression_list');
			
 
				+      const values = exprList ? exprList.namedChildren : [];
			
 
				+      const names = varList ? varList.namedChildren.filter((c) => c.type === 'identifier') : [];
			
 
				+      names.forEach((nameNode, i) => {
			
 
				+        const name = getNodeText(nameNode, this.source);
			
 
				+        if (!name) return;
			
 
				+        const valueNode = values[i];
			
 
				+        const initValue = valueNode ? getNodeText(valueNode, this.source).slice(0, 100) : undefined;
			
 
				+        const initSignature = initValue ? `= ${initValue}${initValue.length >= 100 ? '...' : ''}` : undefined;
			
 
				+        this.createNode(kind, name, nameNode, { docstring, signature: initSignature, isExported });
			
 
				+      });
			
 
				     } else {
			
 
				       // Generic fallback for other languages
			
 
				       // Try to find identifier children
			
--- a/src/extraction/wasm/tree-sitter-lua.wasm
+++ b/src/extraction/wasm/tree-sitter-lua.wasm
--- a/src/extraction/wasm/tree-sitter-luau.wasm
+++ b/src/extraction/wasm/tree-sitter-luau.wasm
--- a/src/types.ts
+++ b/src/types.ts
@@ -85,6 +85,8 @@ export const LANGUAGES = [
 
				   'liquid',
			
 
				   'pascal',
			
 
				   'scala',
			
 
				+  'lua',
			
 
				+  'luau',
			
 
				   'unknown',
			
 
				 ] as const;
			
 
				 
			
@@ -545,6 +547,10 @@ export const DEFAULT_CONFIG: CodeGraphConfig = {
 
				     // Scala
			
 
				     '**/*.scala',
			
 
				     '**/*.sc',
			
 
				+    // Lua
			
 
				+    '**/*.lua',
			
 
				+    // Luau
			
 
				+    '**/*.luau',
			
 
				   ],
			
 
				   exclude: [
			
 
				     // Version control