2 months ago · b04ee9f9bb
--- a/docs/SEARCH_QUALITY_LOOP.md
+++ b/docs/SEARCH_QUALITY_LOOP.md
@@ -1,45 +1,183 @@
 
															-# CodeGraph Search Quality Loop
														
 
															+# CodeGraph Language Verification Guide
														
 
															-You are testing and improving CodeGraph's search quality for a specific language. The user will give you a real-world codebase path to test against.
														
 
															+You are verifying that CodeGraph fully supports a specific programming language. The user will give you a path to a real-world, popular open-source codebase cloned locally. Your job is to run a battery of realistic prompts against it using CodeGraph's API and verify the results are good enough to say that language is **covered and supported**.
														
 
															-## What You're Fixing
														
 
															+A language is NOT verified until an LLM can reliably use CodeGraph's MCP tools to navigate that codebase — finding the right symbols, understanding call chains, exploring subsystems, and getting useful context for real tasks.
														
 
															-When an LLM queries CodeGraph via MCP tools (`codegraph_search`, `codegraph_explore`, `codegraph_callees`), the results must be relevant. The main failure mode is: methods with common names (like `run`, `get`, `handle`) flood results and bury the actual target. The fix is usually adding `getReceiverType` to the language extractor so methods include their owner type in the FTS-indexed `qualified_name`.
														
 
															+## Setup
														
 
															-**Example:** Go's `func (sl *scrapeLoop) run()` was indexed as `scrape.go::scrape.go::run`. After adding `getReceiverType`, it became `scrape.go::scrapeLoop::run` — now FTS can rank it above unrelated `run` methods when the query mentions "scrapeLoop".
														
 
															+### 1. Build and index
														
 
															-## The Loop
														
 
															+```bash
														
 
															+npm run build
														
 
															+rm -rf <codebase_path>/.codegraph
														
 
															+node dist/bin/codegraph.js init -iv <codebase_path>
														
 
															+```
														
 
															+
														
 
															+The `-iv` flag gives verbose output showing extraction progress, node/edge counts, and timing.
														
 
															+
														
 
															+### 2. Quick sanity check
														
 
															+
														
 
															+```bash
														
 
															+# Verify nodes were extracted with proper qualified names
														
 
															+sqlite3 <codebase_path>/.codegraph/codegraph.db \
														
 
															+  "SELECT name, kind, qualified_name FROM nodes WHERE kind = 'method' LIMIT 10;"
														
 
															+
														
 
															+# GOOD: file.go::StructName::method_name  (owner type present)
														
 
															+# BAD:  file.go::file.go::method_name     (owner type missing — needs getReceiverType)
														
 
															+
														
 
															+# Check edge counts
														
 
															+sqlite3 <codebase_path>/.codegraph/codegraph.db \
														
 
															+  "SELECT kind, COUNT(*) FROM edges GROUP BY kind ORDER BY COUNT(*) DESC;"
														
 
															+
														
 
															+# Check node kind distribution
														
 
															+sqlite3 <codebase_path>/.codegraph/codegraph.db \
														
 
															+  "SELECT kind, COUNT(*) FROM nodes GROUP BY kind ORDER BY COUNT(*) DESC;"
														
 
															+```
														
 
															+
														
 
															+If methods are missing their owner type in `qualified_name`, fix that first (see [Adding getReceiverType](#adding-getreceivertype)) before proceeding with the full test battery.
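
The GOOD/BAD distinction above can be checked mechanically once the `qualified_name` values are pulled from the database. The following is a minimal sketch, not part of CodeGraph's API: `hasOwnerType` is a hypothetical helper, and it assumes qualified names use `::` separators with the file name as the first segment, as in the examples above.

```javascript
// Hypothetical helper (not a CodeGraph API).
// Assumption: qualified names look like "file::Owner::method", and a missing
// owner shows up as the file segment repeated ("file::file::method").
function hasOwnerType(qualifiedName) {
  const parts = qualifiedName.split('::');
  if (parts.length < 3) return false; // no segment available for an owner type
  const file = parts[0];
  const owner = parts[parts.length - 2]; // segment directly before the method name
  return owner !== file;
}

// Illustrative inputs copied from the GOOD/BAD comments above.
const samples = [
  'file.go::StructName::method_name',
  'file.go::file.go::method_name',
];
for (const qn of samples) {
  console.log(`${hasOwnerType(qn) ? 'GOOD' : 'BAD '} ${qn}`);
}
```

Feeding it the rows returned by the `sqlite3` query makes it easy to count how many methods still need `getReceiverType` before running the full battery.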
														
 
															+
														
 
															+## The Test Battery
														
 
															-### 1. Pick a test query
														
 
															+Run **all** of the following test categories against the codebase. Use the Node.js API directly — the test scripts below are templates. Adapt the queries to match real types, methods, and subsystems in the codebase you're testing.
														
 
															-Choose a query that exercises the language's method-on-type pattern. Good queries mention:
														
 
															-- A specific type/class/struct name
														
 
															-- A method on that type
														
 
															-- A broader topic connecting multiple files
														
 
															+**Pass criteria for each test:** Does the result give an LLM enough correct information to answer the question or complete the task? Would you trust these results if you were the LLM?
														
 
															-Example for Go: `"scrapeLoop run scrape lifecycle TSDB storage"`
														
 
															+---
														
 
															-### 2. Index the codebase
														
 
															+### Test 1: `codegraph_explore` — Deep Exploration (MOST IMPORTANT)
														
 
															+
														
 
															+This is the primary tool LLMs use. It must return relevant source code grouped by file, with correct relationships, for a natural language query. Test it with **at least 5 different query types**:
														
 
															 ```bash
														
 
															-rm -rf <codebase_path>/.codegraph
														
 
															-node dist/bin/codegraph.js init -iv <codebase_path>
														
 
															+node -e "
														
 
															+const { CodeGraph } = require('./dist/index.js');
														
 
															+async function test() {
														
 
															+  const cg = await CodeGraph.open('<codebase_path>');
														
 
															+
														
 
															+  const queries = [
														
 
															+    // A. Subsystem exploration — broad topic, should find the right files and key classes
														
 
															+    'How does the caching system work?',
														
 
															+
														
 
															+    // B. Specific class/type deep dive — should return that class, its methods, and related types
														
 
															+    'CacheBuilder configuration and build process',
														
 
															+
														
 
															+    // C. Cross-cutting concern — should find implementations across multiple files
														
 
															+    'How are errors handled and propagated?',
														
 
															+
														
 
															+    // D. Data flow question — should trace through multiple layers
														
 
															+    'How does data flow from input to storage?',
														
 
															+
														
 
															+    // E. Implementation detail — specific method behavior
														
 
															+    'How does eviction decide which entries to remove?',
														
 
															+  ];
														
 
															+
														
 
															+  for (const query of queries) {
														
 
															+    console.log(\`\n========================================\`);
														
 
															+    console.log(\`QUERY: \${query}\`);
														
 
															+    console.log(\`========================================\`);
														
 
															+
														
 
															+    const subgraph = await cg.findRelevantContext(query, {
														
 
															+      searchLimit: 8, traversalDepth: 3, maxNodes: 80, minScore: 0.2,
														
 
															+    });
														
 
															+
														
 
															+    // Show entry points — these are what the LLM sees first
														
 
															+    console.log(\`\nEntry points (\${subgraph.roots.length}):\`);
														
 
															+    for (const rootId of subgraph.roots.slice(0, 8)) {
														
 
															+      const node = subgraph.nodes.get(rootId);
														
 
															+      if (node) console.log(\`  \${node.name} (\${node.kind}) — \${node.filePath}:\${node.startLine}\`);
														
 
															+    }
														
 
															+
														
 
															+    // Show file distribution — are the right files surfacing?
														
 
															+    const fileGroups = new Map();
														
 
															+    for (const node of subgraph.nodes.values()) {
														
 
															+      if (!fileGroups.has(node.filePath)) fileGroups.set(node.filePath, []);
														
 
															+      fileGroups.get(node.filePath).push(node.name);
														
 
															+    }
														
 
															+    console.log(\`\nFiles (\${fileGroups.size}):\`);
														
 
															+    for (const [file, nodes] of [...fileGroups.entries()].sort((a,b) => b[1].length - a[1].length).slice(0, 8)) {
														
 
															+      console.log(\`  \${file} (\${nodes.length} symbols): \${nodes.slice(0, 6).join(', ')}\`);
														
 
															+    }
														
 
															+
														
 
															+    // Show edge distribution — are relationships being captured?
														
 
															+    const edgeKinds = new Map();
														
 
															+    for (const edge of subgraph.edges) {
														
 
															+      edgeKinds.set(edge.kind, (edgeKinds.get(edge.kind) || 0) + 1);
														
 
															+    }
														
 
															+    console.log(\`\nEdges (\${subgraph.edges.length}):\`);
														
 
															+    for (const [kind, count] of [...edgeKinds.entries()].sort((a,b) => b[1] - a[1])) {
														
 
															+      console.log(\`  \${kind}: \${count}\`);
														
 
															+    }
														
 
															+
														
 
															+    console.log(\`\nTotal: \${subgraph.nodes.size} nodes, \${subgraph.edges.length} edges, \${fileGroups.size} files\`);
														
 
															+  }
														
 
															+
														
 
															+  await cg.close();
														
 
															+}
														
 
															+test().catch(console.error);
														
 
															+"
														
 
															 ```
														
 
															-The `-iv` flag gives verbose output showing extraction progress, node/edge counts, and timing.
														
 
															+**What to check for each query:**
														
 
															+- Do the entry points make sense for the question?
														
 
															+- Are the right files surfacing (not just test files or unrelated code)?
														
 
															+- Is there a mix of edge types (calls, contains, extends, implements) — not just `contains`?
														
 
															+- Does the node count feel right? Too few (<5) means search failed. Too many irrelevant ones means noise.
														
 
															+
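
Two of the checks in this list (node count and edge-type mix) are mechanical and can be flagged automatically, leaving relevance for manual review. A minimal sketch, assuming you have already reduced the subgraph to a node count and a `{ kind: count }` object; `triageSubgraph` is a hypothetical helper, not a CodeGraph API.

```javascript
// Hypothetical triage helper (not a CodeGraph API).
// The "<5 nodes" threshold comes from the checklist above.
function triageSubgraph({ nodeCount, edgeKinds }) {
  const warnings = [];
  if (nodeCount < 5) warnings.push('too few nodes: search likely failed');

  // Keep only edge kinds that actually occur.
  const kinds = Object.keys(edgeKinds).filter((k) => edgeKinds[k] > 0);
  if (kinds.length === 0) {
    warnings.push('no edges captured');
  } else if (kinds.every((k) => k === 'contains')) {
    warnings.push('only contains edges: relationships are missing');
  }
  return warnings;
}

// Illustrative numbers, not real CodeGraph output.
console.log(triageSubgraph({ nodeCount: 42, edgeKinds: { contains: 30, calls: 10, extends: 2 } }));
console.log(triageSubgraph({ nodeCount: 3, edgeKinds: { contains: 3 } }));
```

An empty warning list does not mean the query passed; whether the entry points and files are the right ones still needs a human (or LLM) judgment.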
														
 
															+---
														
 
															-### 3. Check what the DB produced
														
 
															+### Test 2: `codegraph_search` — Symbol Lookup
														
 
															+
														
 
															+Test that searching for specific symbols returns the right results ranked correctly.
														
 
															 ```bash
														
 
															-# Does the method have its owner type in qualified_name?
														
 
															-sqlite3 <codebase_path>/.codegraph/codegraph.db \
														
 
															-  "SELECT name, kind, qualified_name FROM nodes WHERE name = '<method>' AND file_path LIKE '%<file>%';"
														
 
															+node -e "
														
 
															+const { CodeGraph } = require('./dist/index.js');
														
 
															+async function test() {
														
 
															+  const cg = await CodeGraph.open('<codebase_path>');
														
 
															+
														
 
															+  const searches = [
														
 
															+    // A. Class by name
														
 
															+    { query: 'CacheBuilder', kinds: ['class'], desc: 'Find a specific class' },
														
 
															+
														
 
															+    // B. Method on a specific type (the classic disambiguation test)
														
 
															+    { query: 'CacheBuilder build', kinds: ['method'], desc: 'Method on specific class' },
														
 
															+
														
 
															+    // C. Common method name — should still find relevant ones
														
 
															+    { query: 'get', kinds: ['method'], desc: 'Common method name' },
														
 
															+
														
 
															+    // D. Interface/trait
														
 
															+    { query: 'Cache', kinds: ['interface'], desc: 'Find an interface' },
														
 
															+
														
 
															+    // E. Enum
														
 
															+    { query: 'Strength', kinds: ['enum'], desc: 'Find an enum' },
														
 
															+  ];
														
 
															-# GOOD: file.rs::StructName::method_name
														
 
															-# BAD:  file.rs::file.rs::method_name  ← owner type missing, FTS can't find it
														
 
															+  for (const s of searches) {
														
 
															+    console.log(\`\n--- \${s.desc}: \"\${s.query}\" (kinds: \${s.kinds}) ---\`);
														
 
															+    const results = cg.searchNodes(s.query, { limit: 10, kinds: s.kinds });
														
 
															+    for (const r of results) {
														
 
															+      console.log(\`  \${r.score.toFixed(1)} | \${r.node.name} (\${r.node.kind}) | \${r.node.qualifiedName}\`);
														
 
															+    }
														
 
															+    if (results.length === 0) console.log('  *** NO RESULTS ***');
														
 
															+  }
														
 
															+
														
 
															+  await cg.close();
														
 
															+}
														
 
															+test().catch(console.error);
														
 
															+"
														
 
															 ```
														
 
															-### 4. Test search ranking
														
 
															+**What to check:**
														
 
															+- Does the target symbol rank in the top 3?
														
 
															+- For common names like `get`, do the results include qualified names that help disambiguate?
														
 
															+- Are there zero-result queries? That's a bug.
														
 
															+
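
The top-3 and zero-result checks can be scripted. The sketch below assumes results shaped like the `searchNodes` output printed above (`{ score, node: { name, qualifiedName } }`); `rankOf` and `checkSearch` are hypothetical helpers, and the sample data is invented for illustration.

```javascript
// Hypothetical helpers (not CodeGraph APIs).
// Returns the 1-based rank of the first result whose qualifiedName ends
// with the expected suffix, or null if it is absent.
function rankOf(results, qualifiedSuffix) {
  const idx = results.findIndex((r) => r.node.qualifiedName.endsWith(qualifiedSuffix));
  return idx === -1 ? null : idx + 1;
}

function checkSearch(results, qualifiedSuffix) {
  if (results.length === 0) return 'FAIL: no results';
  const rank = rankOf(results, qualifiedSuffix);
  if (rank === null) return 'FAIL: target not in results';
  return rank <= 3 ? `PASS: rank ${rank}` : `FAIL: rank ${rank}`;
}

// Invented sample results for illustration only.
const sample = [
  { score: 9.1, node: { name: 'build', qualifiedName: 'Other.java::Other::build' } },
  { score: 8.7, node: { name: 'build', qualifiedName: 'CacheBuilder.java::CacheBuilder::build' } },
];
console.log(checkSearch(sample, 'CacheBuilder::build'));
console.log(checkSearch([], 'CacheBuilder::build'));
```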
														
 
															+---
														
 
															+
														
 
															+### Test 3: `codegraph_callers` / `codegraph_callees` — Call Chain Tracing
														
 
															+
														
 
															+Test that call relationships were extracted correctly.
														
 
															 ```bash
														
 
															 node -e "
														
@@ -47,41 +185,237 @@ const { CodeGraph } = require('./dist/index.js');
 
															 async function test() {
														
 
															   const cg = await CodeGraph.open('<codebase_path>');
														
 
															-  // Does the target method rank #1?
														
 
															-  console.log('=== searchNodes ===');
														
 
															-  const results = cg.searchNodes('<OwnerType> <method>', { limit: 10, kinds: ['method'] });
														
 
															-  for (const r of results) {
														
 
															-    console.log(\`\${r.score.toFixed(2)} | \${r.node.name} (\${r.node.kind}) | \${r.node.filePath}:\${r.node.startLine}\`);
														
 
															+  // Pick 3-4 important methods and check their call graphs
														
 
															+  const symbols = ['build', 'get', 'put', 'invalidate'];
														
 
															+
														
 
															+  for (const sym of symbols) {
														
 
															+    // Find the symbol
														
 
															+    const results = cg.searchNodes(sym, { limit: 5, kinds: ['method'] });
														
 
															+    if (results.length === 0) { console.log(\`\${sym}: not found\`); continue; }
														
 
															+
														
 
															+    const node = results[0].node;
														
 
															+    console.log(\`\n--- \${node.name} (\${node.qualifiedName}) ---\`);
														
 
															+
														
 
															+    // Check callees (what does it call?)
														
 
															+    const callees = cg.getCallees(node.id);
														
 
															+    console.log(\`  Callees (\${callees.length}): \${callees.slice(0, 10).map(c => c.node.name).join(', ')}\`);
														
 
															+
														
 
															+    // Check callers (what calls it?)
														
 
															+    const callers = cg.getCallers(node.id);
														
 
															+    console.log(\`  Callers (\${callers.length}): \${callers.slice(0, 10).map(c => c.node.name).join(', ')}\`);
														
 
															   }
														
 
															-  // Does explore find the right file?
														
 
															-  console.log('\n=== findRelevantContext ===');
														
 
															-  const subgraph = await cg.findRelevantContext('<your natural language query>', {
														
 
															-    searchLimit: 8, traversalDepth: 3, maxNodes: 80, minScore: 0.2,
														
 
															-  });
														
 
															-  const fileGroups = new Map();
														
 
															-  for (const node of subgraph.nodes.values()) {
														
 
															-    if (!fileGroups.has(node.filePath)) fileGroups.set(node.filePath, []);
														
 
															-    fileGroups.get(node.filePath).push(node.name);
														
 
															+  await cg.close();
														
 
															+}
														
 
															+test().catch(console.error);
														
 
															+"
														
 
															+```
														
 
															+
														
 
															+**What to check:**
														
 
															+- Do methods have callers AND callees? If a method has 0 of both, edge extraction may be broken.
														
 
															+- Do the callers/callees make sense? A `build()` method should call constructors or factory helpers and be called by setup/initialization code.
														
 
															+- Are the counts reasonable? A core method in a popular codebase should have multiple callers.
														
 
															+
														
 
															+---
														
 
															+
														
 
															+### Test 4: `codegraph_impact` — Change Impact Analysis
														
 
															+
														
 
															+Test that the impact radius correctly identifies affected code.
														
 
															+
														
 
															+```bash
														
 
															+node -e "
														
 
															+const { CodeGraph } = require('./dist/index.js');
														
 
															+async function test() {
														
 
															+  const cg = await CodeGraph.open('<codebase_path>');
														
 
															+
														
 
															+  // Pick a core class or interface that many things depend on
														
 
															+  const results = cg.searchNodes('<CoreClass>', { limit: 1, kinds: ['class', 'interface'] });
														
 
															+  if (results.length === 0) { console.log('Not found'); return; }
														
 
															+
														
 
															+  const node = results[0].node;
														
 
															+  console.log(\`Impact analysis for: \${node.name} (\${node.kind}) — \${node.filePath}\`);
														
 
															+
														
 
															+  const impact = cg.getImpactRadius(node.id, { depth: 2 });
														
 
															+  console.log(\`\nAffected nodes: \${impact.nodes.size}\`);
														
 
															+  console.log(\`Affected edges: \${impact.edges.length}\`);
														
 
															+
														
 
															+  // Group by file
														
 
															+  const files = new Map();
														
 
															+  for (const n of impact.nodes.values()) {
														
 
															+    if (!files.has(n.filePath)) files.set(n.filePath, []);
														
 
															+    files.get(n.filePath).push(n.name);
														
 
															   }
														
 
															-  console.log('Entry points:');
														
 
															-  for (const rootId of subgraph.roots.slice(0, 8)) {
														
 
															-    const node = subgraph.nodes.get(rootId);
														
 
															-    if (node) console.log(\`  \${node.name} (\${node.kind}) - \${node.filePath}:\${node.startLine}\`);
														
 
															+  console.log(\`Affected files: \${files.size}\`);
														
 
															+  for (const [file, nodes] of [...files.entries()].sort((a,b) => b[1].length - a[1].length).slice(0, 10)) {
														
 
															+    console.log(\`  \${file}: \${nodes.slice(0, 5).join(', ')}\`);
														
 
															   }
														
 
															-  console.log('Top files:');
														
 
															-  for (const [file, nodes] of [...fileGroups.entries()].sort((a,b) => b[1].length - a[1].length).slice(0, 5)) {
														
 
															-    console.log(\`  \${file} (\${nodes.length}): \${nodes.slice(0, 5).join(', ')}\`);
														
 
															+
														
 
															+  await cg.close();
														
 
															+}
														
 
															+test().catch(console.error);
														
 
															+"
														
 
															+```
														
 
															+
														
 
															+**What to check:**
														
 
															+- Does changing a core interface/class show a wide impact radius?
														
 
															+- Are the affected files reasonable (things that import/extend/use it)?
														
 
															+- Is the impact radius non-empty? Zero impact on a core type means edges are missing.
														
 
															+
														
 
															+---
														
 
															+
														
 
															+### Test 5: Edge Extraction Quality
														
 
															+
														
 
															+Directly verify that the major edge types are being extracted for this language.
														
 
															+
														
 
															+```bash
														
 
															+node -e "
														
 
															+const { CodeGraph } = require('./dist/index.js');
														
 
															+async function test() {
														
 
															+  const cg = await CodeGraph.open('<codebase_path>');
														
 
															+
														
 
															+  // Check overall edge distribution
														
 
															+  console.log('=== Edge distribution ===');
														
 
															+  // (Use sqlite3 query from sanity check above)
														
 
															+
														
 
															+  // Find a class that extends another
														
 
															+  const classes = cg.searchNodes('', { limit: 100, kinds: ['class'] });
														
 
															+  let foundExtends = false, foundImplements = false;
														
 
															+  for (const r of classes) {
														
 
															+    const callees = cg.getCallees(r.node.id);
														
 
															+    // getCallees returns all outgoing edges, check for extends/implements
														
 
															+    // Better: use graph traversal
														
 
															   }
														
 
															-  // Does qualified lookup resolve correctly?
														
 
															-  console.log('\n=== qualified lookup ===');
														
 
															-  const qr = cg.searchNodes('<OwnerType>.<method>', { limit: 50 });
														
 
															-  const exact = qr.filter(r => r.node.qualifiedName.includes('<OwnerType>::<method>'));
														
 
															-  console.log(\`\${exact.length} match(es) for <OwnerType>.<method>\`);
														
 
															-  if (exact[0]) {
														
 
															-    const callees = cg.getCallees(exact[0].node.id);
														
 
															-    console.log('Callees:', callees.map(c => c.node.name).join(', '));
														
 
															+  // Verify specific relationship types exist
														
 
															+  const checks = [
														
 
															+    { desc: 'contains edges (class → method)', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"contains\"' },
														
 
															+    { desc: 'calls edges', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"calls\"' },
														
 
															+    { desc: 'imports edges', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"imports\"' },
														
 
															+    { desc: 'extends edges', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"extends\"' },
														
 
															+    { desc: 'implements edges', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"implements\"' },
														
 
															+  ];
														
 
															+  // Run these via sqlite3 (shown in sanity check section)
														
 
															+
														
 
															+  await cg.close();
														
 
															+}
														
 
															+test().catch(console.error);
														
 
															+"
														
 
															+```
														
 
															+
														
 
															+```bash
														
 
															+sqlite3 <codebase_path>/.codegraph/codegraph.db "
														
 
															+  SELECT kind, COUNT(*) as cnt FROM edges GROUP BY kind ORDER BY cnt DESC;
														
 
															+"
														
 
															+```
														
 
															+
														
 
															+**What to check:**
														
 
															+- `contains` should be the most common (structural hierarchy).
														
 
															+- `calls` should be plentiful — if near zero, call extraction is broken for this language.
														
 
															+- `imports` should exist — if zero, import parsing is broken.
														
 
															+- `extends` and `implements` should exist if the language has inheritance — if zero, `extractInheritance()` may not handle this language's AST.
														
 
															+
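
The edge checks above can be applied to the `GROUP BY kind` output as a quick script. This is a sketch under stated assumptions: `checkEdgeDistribution` is a hypothetical helper, and "near zero" is interpreted as fewer than 1% of the `contains` count, which is an arbitrary threshold you may want to tune.

```javascript
// Hypothetical helper (not a CodeGraph API).
// counts: { kind: count }, e.g. built from the sqlite3 GROUP BY output.
// Assumption: "near zero" calls means < 1% of the contains count.
function checkEdgeDistribution(counts, { hasInheritance = true } = {}) {
  const get = (kind) => counts[kind] || 0;
  const issues = [];

  const max = Math.max(0, ...Object.values(counts));
  if (get('contains') < max) issues.push('contains is not the most common edge kind');
  if (get('calls') === 0 || get('calls') < 0.01 * get('contains')) {
    issues.push('calls near zero: call extraction may be broken');
  }
  if (get('imports') === 0) issues.push('imports is zero: import parsing may be broken');
  if (hasInheritance && get('extends') === 0 && get('implements') === 0) {
    issues.push('no extends/implements edges: check extractInheritance()');
  }
  return issues;
}

// Illustrative counts, not real measurements.
console.log(checkEdgeDistribution({ contains: 1000, calls: 400, imports: 120, extends: 30, implements: 10 }));
console.log(checkEdgeDistribution({ contains: 1000, calls: 2, imports: 0 }));
```

Pass `{ hasInheritance: false }` for languages without inheritance so the last check is skipped.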
														
 
															+---
														
 
															+
														
 
															+### Test 6: Node Extraction Completeness
														
 
															+
														
 
															+Verify all expected node kinds are being extracted.
														
 
															+
														
 
															+```bash
														
 
															+sqlite3 <codebase_path>/.codegraph/codegraph.db "
														
 
															+  SELECT kind, COUNT(*) as cnt FROM nodes GROUP BY kind ORDER BY cnt DESC;
														
 
															+"
														
 
															+```
														
 
															+
														
 
															+**What to check for each language:**
														
 
															+
														
 
															+| Node Kind | Expected? | Notes |
														
 
															+|-----------|-----------|-------|
														
 
															+| `file` | Always | One per source file |
														
 
															+| `class` | If language has classes | |
														
 
															+| `method` | If language has methods | Should include owner type in `qualified_name` |
														
 
															+| `function` | If language has top-level functions | |
														
 
															+| `interface` | If language has interfaces/protocols | |
														
 
															+| `enum` | If language has enums | |
														
 
															+| `enum_member` | If language has enums | Values inside enums |
														
 
															+| `import` | Always | One per import statement |
														
 
															+| `variable` / `field` | Usually | Fields, constants, top-level vars |
														
 
															+| `struct` | If language has structs | Go, Rust, C, Swift |
														
 
															+| `trait` | If language has traits | Rust |
														
 
															+
														
 
															+If an expected node kind has 0 count, the language extractor is missing that AST type.
														
 
															+
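
To apply the table, list the kinds you expect for the language under test and compare them with the counts from the query. A minimal sketch with a hypothetical `missingKinds` helper; the expected list and the counts below are illustrative, not real output.

```javascript
// Hypothetical helper (not a CodeGraph API).
// counts: { kind: count } from the node-kind GROUP BY query.
// expectedKinds: the kinds from the table that apply to this language.
function missingKinds(counts, expectedKinds) {
  return expectedKinds.filter((kind) => !(counts[kind] > 0));
}

// Illustrative expectation for a language with structs and interfaces.
const expected = ['file', 'struct', 'method', 'function', 'interface', 'import'];
const counts = { file: 120, struct: 85, method: 940, function: 310, import: 400 };

console.log(missingKinds(counts, expected));
```

Any kind this returns points at an AST node type the language extractor is not handling yet.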
														
 
															+---
														
 
															+
														
 
															+### Test 7: Real-World LLM Prompts
														
 
															+
														
 
															+This is the final test and, together with Test 1, the most important. Simulate the kinds of questions a developer would actually ask an LLM that's using CodeGraph. For each prompt, run `findRelevantContext` (which powers `codegraph_explore`) and evaluate whether the returned context would let an LLM give a correct, complete answer.
														
 
															+
														
 
															+**Run at least 5 of these prompt styles, adapted to the actual codebase:**
														
 
															+
														
 
															+```bash
														
 
															+node -e "
														
 
															+const { CodeGraph } = require('./dist/index.js');
														
 
															+async function test() {
														
 
															+  const cg = await CodeGraph.open('<codebase_path>');
														
 
															+
														
 
															+  const prompts = [
														
 
															+    // 1. \"How does X work?\" — subsystem understanding
														
 
															+    'How does the cache eviction policy work?',
														
 
															+
														
 
															+    // 2. \"Where is X implemented?\" — symbol location
														
 
															+    'Where is the LRU eviction logic implemented?',
														
 
															+
														
 
															+    // 3. \"What calls X?\" — usage discovery
														
 
															+    'What code triggers cache invalidation?',
														
 
															+
														
 
															+    // 4. \"I want to change X, what breaks?\" — impact assessment
														
 
															+    'If I change the Cache interface, what else is affected?',
														
 
															+
														
 
															+    // 5. \"How do X and Y interact?\" — cross-component relationships
														
 
															+    'How does CacheBuilder connect to LocalCache?',
														
 
															+
														
 
															+    // 6. \"Show me the flow from A to B\" — data/control flow
														
 
															+    'What happens when a cache entry expires?',
														
 
															+
														
 
															+    // 7. \"What are all the implementations of X?\" — polymorphism
														
 
															+    'What classes implement the Cache interface?',
														
 
															+
														
 
															+    // 8. Bug investigation prompt
														
 
															+    'Cache entries are not being evicted when they should be — where should I look?',
														
 
															+  ];
														
 
															+
														
 
															+  for (const prompt of prompts) {
														
 
															+    console.log(\`\n========================================\`);
														
 
															+    console.log(\`PROMPT: \${prompt}\`);
														
 
															+    console.log(\`========================================\`);
														
 
															+
														
 
															+    const subgraph = await cg.findRelevantContext(prompt, {
														
 
															+      searchLimit: 8, traversalDepth: 3, maxNodes: 80, minScore: 0.2,
														
 
															+    });
														
 
															+
														
 
															+    console.log(\`Result: \${subgraph.nodes.size} nodes, \${subgraph.edges.length} edges, \${subgraph.roots.length} entry points\`);
														
 
															+
														
 
															+    console.log('Entry points:');
														
 
															+    for (const rootId of subgraph.roots.slice(0, 5)) {
														
 
															+      const node = subgraph.nodes.get(rootId);
														
 
															+      if (node) console.log(\`  \${node.name} (\${node.kind}) — \${node.filePath}:\${node.startLine}\`);
														
 
															+    }
														
 
															+
														
 
															+    const fileGroups = new Map();
														
 
															+    for (const node of subgraph.nodes.values()) {
														
 
															+      if (!fileGroups.has(node.filePath)) fileGroups.set(node.filePath, []);
														
 
															+      fileGroups.get(node.filePath).push(node.name);
														
 
															+    }
														
 
															+    console.log('Top files:');
														
 
															+    for (const [file, nodes] of [...fileGroups.entries()].sort((a,b) => b[1].length - a[1].length).slice(0, 5)) {
														
 
															+      console.log(\`  \${file} (\${nodes.length}): \${nodes.slice(0, 5).join(', ')}\`);
														
 
															+    }
														
 
															+
														
 
															+    // Structural PASS/FAIL heuristic only. Relevance still needs the manual checks below.
														
 
															+    const hasEntryPoints = subgraph.roots.length > 0;
														
 
															+    const hasEdges = subgraph.edges.length > 0;
														
 
															+    const hasMultipleFiles = fileGroups.size > 1;
														
 
															+    console.log(\`\\nVERDICT: \${hasEntryPoints && hasEdges && hasMultipleFiles ? 'PASS' : 'FAIL — needs investigation'}\`);
														
 
															   }
														
 
															   await cg.close();
														
@@ -90,36 +424,46 @@ test().catch(console.error);
 
															 "
														
 
															 ```
														
 
															-### 5. If results are bad, diagnose and fix
														
 
															+**What to check for each prompt:**
														
 
															+- Does it return entry points? Zero entry points = total failure.
														
 
															+- Are the entry points **relevant** to the question? (Not just random symbols that happen to share a word.)
														
 
															+- Does it span multiple files? Most real questions involve cross-file understanding.
														
 
															+- Are relationships present? An LLM needs to see how symbols connect, not just get a list of names.
														
 
															+- Would **you** be able to answer the question from this context?
														
 
															+
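The relevance question in the checklist is the one the script's VERDICT line cannot answer, since it only counts entry points, edges, and files. A rough sketch of how it could be approximated, assuming you hand-write a short list of symbol names you expect for each prompt. The `entryPointRecall` helper and the `expectedSymbols` list are hypothetical and not part of CodeGraph, and the names in the example are made up:

```javascript
// Hypothetical helper: fraction of the expected symbols that show up
// among the entry point names returned for a prompt (case-insensitive).
function entryPointRecall(entryPointNames, expectedSymbols) {
  if (expectedSymbols.length === 0) return 0;
  const found = new Set(entryPointNames.map((name) => name.toLowerCase()));
  const hits = expectedSymbols.filter((sym) => found.has(sym.toLowerCase()));
  return hits.length / expectedSymbols.length;
}

// Example with made-up names: 2 of the 4 expected symbols are present.
const recall = entryPointRecall(
  ['LocalCache', 'evictEntries', 'toString'],
  ['LocalCache', 'evictEntries', 'CacheBuilder', 'expireEntries'],
);
console.log(`entry point recall: ${recall}`);
```

In the test script, the entry point names could be collected by mapping `subgraph.roots` through `subgraph.nodes.get(rootId)` and reading `name`, as the "Entry points" loop already does. A low recall is a prompt to look closer, not a verdict by itself.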
														
 
															+---
														
 
															-| Symptom | Cause | Fix |
														
 
															-|---------|-------|-----|
														
 
															-| Target method not in top 10 of `searchNodes` | Owner type missing from `qualified_name` | Add `getReceiverType` to `src/extraction/languages/<lang>.ts` |
														
 
															-| Explore returns irrelevant files | Common method name flooding exact matches | Check co-location boost in `src/db/queries.ts: findNodesByExactName` |
														
 
															-| A key term is being dropped from search | It's in the STOP_WORDS list | Edit `src/search/query-utils.ts` |
														
 
															-| `<OwnerType>.<method>` returns "not found" | `qualified_name` doesn't contain `OwnerType::method` | Fix `getReceiverType` output |
														
 
															+## Diagnosing Failures
														
 
															-### 6. Rebuild and re-test
														
 
															+| Symptom | Likely Cause | Where to Fix |
														
 
															+|---------|-------------|--------------|
														
 
															+| Method missing owner type in `qualified_name` | Language needs `getReceiverType` | `src/extraction/languages/<lang>.ts` |
														
 
															+| `codegraph_explore` returns irrelevant files | Common names flooding FTS; co-location boost not helping | `src/db/queries.ts: findNodesByExactName`, `src/context/index.ts` |
														
 
															+| Zero `calls` edges | `callTypes` missing or wrong AST node type | `src/extraction/languages/<lang>.ts: callTypes` |
														
 
															+| Zero `extends`/`implements` edges | `extractInheritance()` doesn't handle this language's AST | `src/extraction/tree-sitter.ts: extractInheritance()` |
														
 
															+| Missing node kinds (no enums, no interfaces) | AST type not listed in extractor | `src/extraction/languages/<lang>.ts: enumTypes`, `interfaceTypes`, etc. |
														
 
															+| Search term dropped from query | Term is in the stop words list | `src/search/query-utils.ts: STOP_WORDS` |
														
 
															+| `qualified_name` missing class for nested methods | Extraction not walking parent stack correctly | `src/extraction/tree-sitter.ts: visitNode()` |
														
 
															+| Import edges missing | `extractImport` returns null for this syntax | `src/extraction/languages/<lang>.ts: extractImport` |
														
 
															+
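Several rows in the table come down to one edge kind being absent altogether. A small sketch for spotting that quickly, assuming each entry in `subgraph.edges` carries a `kind` string such as `calls`, `extends`, or `implements`. The `kind` field name and the `missingEdgeKinds` helper are assumptions for illustration, so check the actual edge type before relying on them:

```javascript
// Hypothetical helper: tally edges by kind and list the expected kinds
// that never appear. Assumes each edge looks like { kind: string, ... }.
function missingEdgeKinds(edges, expectedKinds) {
  const counts = new Map();
  for (const edge of edges) {
    counts.set(edge.kind, (counts.get(edge.kind) ?? 0) + 1);
  }
  const missing = expectedKinds.filter((kind) => !counts.has(kind));
  return { counts, missing };
}

// Example with made-up edges: no `implements` edge is present.
const sampleEdges = [{ kind: 'calls' }, { kind: 'calls' }, { kind: 'extends' }];
const report = missingEdgeKinds(sampleEdges, ['calls', 'extends', 'implements']);
console.log(`missing edge kinds: ${report.missing.join(', ')}`);
```

An empty `missing` list does not prove extraction is correct. It only rules out the "zero edges of this kind" rows, so the matching extractor file is worth opening whenever a kind is reported missing.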
														
 
															+## After Fixing Issues
														
 
															 ```bash
														
 
															 npm run build
														
 
															-# If you changed extraction (getReceiverType), must re-index:
														
 
															 rm -rf <codebase_path>/.codegraph
														
 
															 node dist/bin/codegraph.js init -iv <codebase_path>
														
 
															-# Then re-run Step 4
														
 
															+# Re-run the failing tests from above
														
 
															 ```
														
 
															-### 7. Run the test suite before finishing
														
 
															+Always run the full test suite before marking a language as verified:
														
 
															 ```bash
														
 
															 npm test
														
 
															 ```
														
 
															-All 378+ tests must pass.
														
 
															-
														
 
															-## How to Add `getReceiverType` for a Language
														
 
															+## Adding `getReceiverType`
														
 
															-**Only needed for languages where methods are top-level or outside their owner type in the AST.** If the language nests methods inside class/struct bodies (Python, Java, TypeScript, C#), the qualified name already includes the parent — verify with Step 3 before adding anything.
														
 
															+**Only needed for languages where methods are top-level or outside their owner type in the AST.** If the language nests methods inside class/struct bodies (Python, Java, TypeScript, C#), the qualified name already includes the parent — verify with the sanity check before adding anything.
														
 
															 ### 1. Add the hook to the language extractor
														
@@ -129,7 +473,7 @@ In `src/extraction/languages/<lang>.ts`, add `getReceiverType` to the extractor
 
															 getReceiverType: (node, source) => {
														
 
															   // Extract the owner type name from the method's AST node.
														
 
															   // Return the type name string, or undefined if not applicable.
														
 
															-  // 
														
 
															+  //
														
 
															   // The core extractMethod() in tree-sitter.ts will use this to set:
														
 
															   //   qualifiedName = `${filePath}::${receiverType}::${methodName}`
														
 
															 },
														
@@ -163,29 +507,32 @@ if (receiverType) {
 
															 | File | Role |
														
 
															 |------|------|
														
 
															-| `src/extraction/languages/<lang>.ts` | Language extractor — implement `getReceiverType` here |
														
 
															-| `src/extraction/tree-sitter.ts` | Core extraction — `extractMethod()` uses the hook |
														
 
															+| `src/extraction/languages/<lang>.ts` | Language extractor — node types, call types, `getReceiverType` |
														
 
															+| `src/extraction/tree-sitter.ts` | Core extraction — `extractMethod()`, `extractCall()`, `extractInheritance()` |
														
 
															 | `src/extraction/tree-sitter-types.ts` | `LanguageExtractor` interface definition |
														
 
															 | `src/search/query-utils.ts` | `STOP_WORDS`, `extractSearchTerms`, `scorePathRelevance` |
														
 
															-| `src/db/queries.ts` | `searchNodesFTS` (BM25), `findNodesByExactName` (co-location) |
														
 
															-| `src/context/index.ts` | `findRelevantContext` — hybrid search + co-location boost |
														
 
															-| `src/mcp/tools.ts` | MCP handlers — `matchesSymbol` uses `qualifiedName.includes("Type::method")` |
														
 
															+| `src/db/queries.ts` | `searchNodesFTS` (BM25), `findNodesByExactName` (co-location boost) |
														
 
															+| `src/context/index.ts` | `findRelevantContext` — hybrid search + graph traversal |
														
 
															+| `src/mcp/tools.ts` | MCP tool handlers — `codegraph_explore` implementation |
														
 
															-## Languages Completed
														
 
															+## Language Status
														
 
															+
														
 
															+### Verified
														
 
															 - [x] **Go** — `getReceiverType` extracts receiver from `func (sl *Type) method()`
														
 
															-- [x] **Swift** — NOT needed. Tree-sitter parses `extension Type { }` as `class_declaration`, so methods already get owner type in `qualified_name` (e.g., `SimplifyApply.swift::SimplifyApply.swift::ApplyInst::simplify`)
														
 
															+- [x] **Swift** — NOT needed. Tree-sitter nests methods inside class/extension bodies
														
 
															+- [x] **Java** — NOT needed. Methods nested in class body. Verified against Guava
														
 
															-## Languages To Do
														
 
															+### Needs Verification
														
 
															-Check these — only add `getReceiverType` if methods are top-level (not nested inside their owner type in the AST):
														
 
															+Check these — may need `getReceiverType` if methods are top-level in the AST:
														
 
															 - [ ] Rust — methods in `impl Type { }` blocks
														
 
															 - [ ] C++ — out-of-class method definitions `Type::method()`
														
 
															 - [ ] Kotlin — extension functions `fun Type.method()`
														
 
															-Verify these DON'T need it (methods nested in class body → qualified name should already be correct):
														
 
															-- [ ] Python — verify `qualified_name` includes class name
														
 
															-- [ ] Java — verify `qualified_name` includes class name
														
 
															-- [ ] TypeScript — verify `qualified_name` includes class name
														
 
															-- [ ] C# — verify `qualified_name` includes class name
														
 
															+Verify these DON'T need `getReceiverType` (methods nested in class body):
														
 
															+
														
 
															+- [ ] Python
														
 
															+- [ ] TypeScript
														
 
															+- [ ] C#