|
@@ -1,45 +1,183 @@
|
|
|
-# CodeGraph Search Quality Loop
|
|
|
|
|
|
|
+# CodeGraph Language Verification Guide
|
|
|
|
|
|
|
|
-You are testing and improving CodeGraph's search quality for a specific language. The user will give you a real-world codebase path to test against.
|
|
|
|
|
|
|
+You are verifying that CodeGraph fully supports a specific programming language. The user will give you a path to a real-world, popular open-source codebase cloned locally. Your job is to run a battery of realistic prompts against it using CodeGraph's API and verify the results are good enough to say that language is **covered and supported**.
|
|
|
|
|
|
|
|
-## What You're Fixing
|
|
|
|
|
|
|
+A language is NOT verified until an LLM can reliably use CodeGraph's MCP tools to navigate that codebase — finding the right symbols, understanding call chains, exploring subsystems, and getting useful context for real tasks.
|
|
|
|
|
|
|
|
-When an LLM queries CodeGraph via MCP tools (`codegraph_search`, `codegraph_explore`, `codegraph_callees`), the results must be relevant. The main failure mode is: methods with common names (like `run`, `get`, `handle`) flood results and bury the actual target. The fix is usually adding `getReceiverType` to the language extractor so methods include their owner type in the FTS-indexed `qualified_name`.
|
|
|
|
|
|
|
+## Setup
|
|
|
|
|
|
|
|
-**Example:** Go's `func (sl *scrapeLoop) run()` was indexed as `scrape.go::scrape.go::run`. After adding `getReceiverType`, it became `scrape.go::scrapeLoop::run` — now FTS can rank it above unrelated `run` methods when the query mentions "scrapeLoop".
|
|
|
|
|
|
|
+### 1. Build and index
|
|
|
|
|
|
|
|
-## The Loop
|
|
|
|
|
|
|
+```bash
|
|
|
|
|
+npm run build
|
|
|
|
|
+rm -rf <codebase_path>/.codegraph
|
|
|
|
|
+node dist/bin/codegraph.js init -iv <codebase_path>
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+The `-iv` flag gives verbose output showing extraction progress, node/edge counts, and timing.
|
|
|
|
|
+
|
|
|
|
|
+### 2. Quick sanity check
|
|
|
|
|
+
|
|
|
|
|
+```bash
|
|
|
|
|
+# Verify nodes were extracted with proper qualified names
|
|
|
|
|
+sqlite3 <codebase_path>/.codegraph/codegraph.db \
|
|
|
|
|
+ "SELECT name, kind, qualified_name FROM nodes WHERE kind = 'method' LIMIT 10;"
|
|
|
|
|
+
|
|
|
|
|
+# GOOD: file.go::StructName::method_name (owner type present)
|
|
|
|
|
+# BAD: file.go::file.go::method_name (owner type missing — needs getReceiverType)
|
|
|
|
|
+
|
|
|
|
|
+# Check edge counts
|
|
|
|
|
+sqlite3 <codebase_path>/.codegraph/codegraph.db \
|
|
|
|
|
+ "SELECT kind, COUNT(*) FROM edges GROUP BY kind ORDER BY COUNT(*) DESC;"
|
|
|
|
|
+
|
|
|
|
|
+# Check node kind distribution
|
|
|
|
|
+sqlite3 <codebase_path>/.codegraph/codegraph.db \
|
|
|
|
|
+ "SELECT kind, COUNT(*) FROM nodes GROUP BY kind ORDER BY COUNT(*) DESC;"
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+If methods are missing their owner type in `qualified_name`, fix that first (see [Adding getReceiverType](#adding-getreceivertype)) before proceeding with the full test battery.
|
|
|
|
|
+
|
|
|
|
|
+## The Test Battery
|
|
|
|
|
|
|
|
-### 1. Pick a test query
|
|
|
|
|
|
|
+Run **all** of the following test categories against the codebase. Use the Node.js API directly — the test scripts below are templates. Adapt the queries to match real types, methods, and subsystems in the codebase you're testing.
|
|
|
|
|
|
|
|
-Choose a query that exercises the language's method-on-type pattern. Good queries mention:
|
|
|
|
|
-- A specific type/class/struct name
|
|
|
|
|
-- A method on that type
|
|
|
|
|
-- A broader topic connecting multiple files
|
|
|
|
|
|
|
+**Pass criteria for each test:** Does the result give an LLM enough correct information to answer the question or complete the task? Would you trust these results if you were the LLM?
|
|
|
|
|
|
|
|
-Example for Go: `"scrapeLoop run scrape lifecycle TSDB storage"`
|
|
|
|
|
|
|
+---
|
|
|
|
|
|
|
|
-### 2. Index the codebase
|
|
|
|
|
|
|
+### Test 1: `codegraph_explore` — Deep Exploration (MOST IMPORTANT)
|
|
|
|
|
+
|
|
|
|
|
+This is the primary tool LLMs use. It must return relevant source code grouped by file, with correct relationships, for a natural language query. Test it with **at least 5 different query types**:
|
|
|
|
|
|
|
|
```bash
|
|
```bash
|
|
|
-rm -rf <codebase_path>/.codegraph
|
|
|
|
|
-node dist/bin/codegraph.js init -iv <codebase_path>
|
|
|
|
|
|
|
+node -e "
|
|
|
|
|
+const { CodeGraph } = require('./dist/index.js');
|
|
|
|
|
+async function test() {
|
|
|
|
|
+ const cg = await CodeGraph.open('<codebase_path>');
|
|
|
|
|
+
|
|
|
|
|
+ const queries = [
|
|
|
|
|
+ // A. Subsystem exploration — broad topic, should find the right files and key classes
|
|
|
|
|
+ 'How does the caching system work?',
|
|
|
|
|
+
|
|
|
|
|
+ // B. Specific class/type deep dive — should return that class, its methods, and related types
|
|
|
|
|
+ 'CacheBuilder configuration and build process',
|
|
|
|
|
+
|
|
|
|
|
+ // C. Cross-cutting concern — should find implementations across multiple files
|
|
|
|
|
+ 'How are errors handled and propagated?',
|
|
|
|
|
+
|
|
|
|
|
+ // D. Data flow question — should trace through multiple layers
|
|
|
|
|
+ 'How does data flow from input to storage?',
|
|
|
|
|
+
|
|
|
|
|
+ // E. Implementation detail — specific method behavior
|
|
|
|
|
+ 'How does eviction decide which entries to remove?',
|
|
|
|
|
+ ];
|
|
|
|
|
+
|
|
|
|
|
+ for (const query of queries) {
|
|
|
|
|
+ console.log(\`\n========================================\`);
|
|
|
|
|
+ console.log(\`QUERY: \${query}\`);
|
|
|
|
|
+ console.log(\`========================================\`);
|
|
|
|
|
+
|
|
|
|
|
+ const subgraph = await cg.findRelevantContext(query, {
|
|
|
|
|
+ searchLimit: 8, traversalDepth: 3, maxNodes: 80, minScore: 0.2,
|
|
|
|
|
+ });
|
|
|
|
|
+
|
|
|
|
|
+ // Show entry points — these are what the LLM sees first
|
|
|
|
|
+ console.log(\`\nEntry points (\${subgraph.roots.length}):\`);
|
|
|
|
|
+ for (const rootId of subgraph.roots.slice(0, 8)) {
|
|
|
|
|
+ const node = subgraph.nodes.get(rootId);
|
|
|
|
|
+ if (node) console.log(\` \${node.name} (\${node.kind}) — \${node.filePath}:\${node.startLine}\`);
|
|
|
|
|
+ }
|
|
|
|
|
+
|
|
|
|
|
+ // Show file distribution — are the right files surfacing?
|
|
|
|
|
+ const fileGroups = new Map();
|
|
|
|
|
+ for (const node of subgraph.nodes.values()) {
|
|
|
|
|
+ if (!fileGroups.has(node.filePath)) fileGroups.set(node.filePath, []);
|
|
|
|
|
+ fileGroups.get(node.filePath).push(node.name);
|
|
|
|
|
+ }
|
|
|
|
|
+ console.log(\`\nFiles (\${fileGroups.size}):\`);
|
|
|
|
|
+ for (const [file, nodes] of [...fileGroups.entries()].sort((a,b) => b[1].length - a[1].length).slice(0, 8)) {
|
|
|
|
|
+ console.log(\` \${file} (\${nodes.length} symbols): \${nodes.slice(0, 6).join(', ')}\`);
|
|
|
|
|
+ }
|
|
|
|
|
+
|
|
|
|
|
+ // Show edge distribution — are relationships being captured?
|
|
|
|
|
+ const edgeKinds = new Map();
|
|
|
|
|
+ for (const edge of subgraph.edges) {
|
|
|
|
|
+ edgeKinds.set(edge.kind, (edgeKinds.get(edge.kind) || 0) + 1);
|
|
|
|
|
+ }
|
|
|
|
|
+ console.log(\`\nEdges (\${subgraph.edges.length}):\`);
|
|
|
|
|
+ for (const [kind, count] of [...edgeKinds.entries()].sort((a,b) => b - a)) {
|
|
|
|
|
+ console.log(\` \${kind}: \${count}\`);
|
|
|
|
|
+ }
|
|
|
|
|
+
|
|
|
|
|
+ console.log(\`\nTotal: \${subgraph.nodes.size} nodes, \${subgraph.edges.length} edges, \${fileGroups.size} files\`);
|
|
|
|
|
+ }
|
|
|
|
|
+
|
|
|
|
|
+ await cg.close();
|
|
|
|
|
+}
|
|
|
|
|
+test().catch(console.error);
|
|
|
|
|
+"
|
|
|
```
|
|
```
|
|
|
|
|
|
|
|
-The `-iv` flag gives verbose output showing extraction progress, node/edge counts, and timing.
|
|
|
|
|
|
|
+**What to check for each query:**
|
|
|
|
|
+- Do the entry points make sense for the question?
|
|
|
|
|
+- Are the right files surfacing (not just test files or unrelated code)?
|
|
|
|
|
+- Is there a mix of edge types (calls, contains, extends, implements) — not just `contains`?
|
|
|
|
|
+- Does the node count feel right? Too few (<5) means search failed. Too many irrelevant ones means noise.
|
|
|
|
|
+
|
|
|
|
|
+---
|
|
|
|
|
|
|
|
-### 3. Check what the DB produced
|
|
|
|
|
|
|
+### Test 2: `codegraph_search` — Symbol Lookup
|
|
|
|
|
+
|
|
|
|
|
+Test that searching for specific symbols returns the right results ranked correctly.
|
|
|
|
|
|
|
|
```bash
|
|
```bash
|
|
|
-# Does the method have its owner type in qualified_name?
|
|
|
|
|
-sqlite3 <codebase_path>/.codegraph/codegraph.db \
|
|
|
|
|
- "SELECT name, kind, qualified_name FROM nodes WHERE name = '<method>' AND file_path LIKE '%<file>%';"
|
|
|
|
|
|
|
+node -e "
|
|
|
|
|
+const { CodeGraph } = require('./dist/index.js');
|
|
|
|
|
+async function test() {
|
|
|
|
|
+ const cg = await CodeGraph.open('<codebase_path>');
|
|
|
|
|
+
|
|
|
|
|
+ const searches = [
|
|
|
|
|
+ // A. Class by name
|
|
|
|
|
+ { query: 'CacheBuilder', kinds: ['class'], desc: 'Find a specific class' },
|
|
|
|
|
+
|
|
|
|
|
+ // B. Method on a specific type (the classic disambiguation test)
|
|
|
|
|
+ { query: 'CacheBuilder build', kinds: ['method'], desc: 'Method on specific class' },
|
|
|
|
|
+
|
|
|
|
|
+ // C. Common method name — should still find relevant ones
|
|
|
|
|
+ { query: 'get', kinds: ['method'], desc: 'Common method name' },
|
|
|
|
|
+
|
|
|
|
|
+ // D. Interface/trait
|
|
|
|
|
+ { query: 'Cache', kinds: ['interface'], desc: 'Find an interface' },
|
|
|
|
|
+
|
|
|
|
|
+ // E. Enum
|
|
|
|
|
+ { query: 'Strength', kinds: ['enum'], desc: 'Find an enum' },
|
|
|
|
|
+ ];
|
|
|
|
|
|
|
|
-# GOOD: file.rs::StructName::method_name
|
|
|
|
|
-# BAD: file.rs::file.rs::method_name ← owner type missing, FTS can't find it
|
|
|
|
|
|
|
+ for (const s of searches) {
|
|
|
|
|
+ console.log(\`\n--- \${s.desc}: \"\${s.query}\" (kinds: \${s.kinds}) ---\`);
|
|
|
|
|
+ const results = cg.searchNodes(s.query, { limit: 10, kinds: s.kinds });
|
|
|
|
|
+ for (const r of results) {
|
|
|
|
|
+ console.log(\` \${r.score.toFixed(1)} | \${r.node.name} (\${r.node.kind}) | \${r.node.qualifiedName}\`);
|
|
|
|
|
+ }
|
|
|
|
|
+ if (results.length === 0) console.log(' *** NO RESULTS ***');
|
|
|
|
|
+ }
|
|
|
|
|
+
|
|
|
|
|
+ await cg.close();
|
|
|
|
|
+}
|
|
|
|
|
+test().catch(console.error);
|
|
|
|
|
+"
|
|
|
```
|
|
```
|
|
|
|
|
|
|
|
-### 4. Test search ranking
|
|
|
|
|
|
|
+**What to check:**
|
|
|
|
|
+- Does the target symbol rank in the top 3?
|
|
|
|
|
+- For common names like `get`, do the results include qualified names that help disambiguate?
|
|
|
|
|
+- Are there zero-result queries? That's a bug.
|
|
|
|
|
+
|
|
|
|
|
+---
|
|
|
|
|
+
|
|
|
|
|
+### Test 3: `codegraph_callers` / `codegraph_callees` — Call Chain Tracing
|
|
|
|
|
+
|
|
|
|
|
+Test that call relationships were extracted correctly.
|
|
|
|
|
|
|
|
```bash
|
|
```bash
|
|
|
node -e "
|
|
node -e "
|
|
@@ -47,41 +185,237 @@ const { CodeGraph } = require('./dist/index.js');
|
|
|
async function test() {
|
|
async function test() {
|
|
|
const cg = await CodeGraph.open('<codebase_path>');
|
|
const cg = await CodeGraph.open('<codebase_path>');
|
|
|
|
|
|
|
|
- // Does the target method rank #1?
|
|
|
|
|
- console.log('=== searchNodes ===');
|
|
|
|
|
- const results = cg.searchNodes('<OwnerType> <method>', { limit: 10, kinds: ['method'] });
|
|
|
|
|
- for (const r of results) {
|
|
|
|
|
- console.log(\`\${r.score.toFixed(2)} | \${r.node.name} (\${r.node.kind}) | \${r.node.filePath}:\${r.node.startLine}\`);
|
|
|
|
|
|
|
+ // Pick 3-4 important methods and check their call graphs
|
|
|
|
|
+ const symbols = ['build', 'get', 'put', 'invalidate'];
|
|
|
|
|
+
|
|
|
|
|
+ for (const sym of symbols) {
|
|
|
|
|
+ // Find the symbol
|
|
|
|
|
+ const results = cg.searchNodes(sym, { limit: 5, kinds: ['method'] });
|
|
|
|
|
+ if (results.length === 0) { console.log(\`\${sym}: not found\`); continue; }
|
|
|
|
|
+
|
|
|
|
|
+ const node = results[0].node;
|
|
|
|
|
+ console.log(\`\n--- \${node.name} (\${node.qualifiedName}) ---\`);
|
|
|
|
|
+
|
|
|
|
|
+ // Check callees (what does it call?)
|
|
|
|
|
+ const callees = cg.getCallees(node.id);
|
|
|
|
|
+ console.log(\` Callees (\${callees.length}): \${callees.slice(0, 10).map(c => c.node.name).join(', ')}\`);
|
|
|
|
|
+
|
|
|
|
|
+ // Check callers (what calls it?)
|
|
|
|
|
+ const callers = cg.getCallers(node.id);
|
|
|
|
|
+ console.log(\` Callers (\${callers.length}): \${callers.slice(0, 10).map(c => c.node.name).join(', ')}\`);
|
|
|
}
|
|
}
|
|
|
|
|
|
|
|
- // Does explore find the right file?
|
|
|
|
|
- console.log('\n=== findRelevantContext ===');
|
|
|
|
|
- const subgraph = await cg.findRelevantContext('<your natural language query>', {
|
|
|
|
|
- searchLimit: 8, traversalDepth: 3, maxNodes: 80, minScore: 0.2,
|
|
|
|
|
- });
|
|
|
|
|
- const fileGroups = new Map();
|
|
|
|
|
- for (const node of subgraph.nodes.values()) {
|
|
|
|
|
- if (!fileGroups.has(node.filePath)) fileGroups.set(node.filePath, []);
|
|
|
|
|
- fileGroups.get(node.filePath).push(node.name);
|
|
|
|
|
|
|
+ await cg.close();
|
|
|
|
|
+}
|
|
|
|
|
+test().catch(console.error);
|
|
|
|
|
+"
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+**What to check:**
|
|
|
|
|
+- Do methods have callers AND callees? If a method has 0 of both, edge extraction may be broken.
|
|
|
|
|
+- Do the callers/callees make sense? A `build()` method should call constructor-like things, and be called by setup/initialization code.
|
|
|
|
|
+- Are the counts reasonable? A core method in a popular codebase should have multiple callers.
|
|
|
|
|
+
|
|
|
|
|
+---
|
|
|
|
|
+
|
|
|
|
|
+### Test 4: `codegraph_impact` — Change Impact Analysis
|
|
|
|
|
+
|
|
|
|
|
+Test that the impact radius correctly identifies affected code.
|
|
|
|
|
+
|
|
|
|
|
+```bash
|
|
|
|
|
+node -e "
|
|
|
|
|
+const { CodeGraph } = require('./dist/index.js');
|
|
|
|
|
+async function test() {
|
|
|
|
|
+ const cg = await CodeGraph.open('<codebase_path>');
|
|
|
|
|
+
|
|
|
|
|
+ // Pick a core class or interface that many things depend on
|
|
|
|
|
+ const results = cg.searchNodes('<CoreClass>', { limit: 1, kinds: ['class', 'interface'] });
|
|
|
|
|
+ if (results.length === 0) { console.log('Not found'); return; }
|
|
|
|
|
+
|
|
|
|
|
+ const node = results[0].node;
|
|
|
|
|
+ console.log(\`Impact analysis for: \${node.name} (\${node.kind}) — \${node.filePath}\`);
|
|
|
|
|
+
|
|
|
|
|
+ const impact = cg.getImpactRadius(node.id, { depth: 2 });
|
|
|
|
|
+ console.log(\`\nAffected nodes: \${impact.nodes.size}\`);
|
|
|
|
|
+ console.log(\`Affected edges: \${impact.edges.length}\`);
|
|
|
|
|
+
|
|
|
|
|
+ // Group by file
|
|
|
|
|
+ const files = new Map();
|
|
|
|
|
+ for (const n of impact.nodes.values()) {
|
|
|
|
|
+ if (!files.has(n.filePath)) files.set(n.filePath, []);
|
|
|
|
|
+ files.get(n.filePath).push(n.name);
|
|
|
}
|
|
}
|
|
|
- console.log('Entry points:');
|
|
|
|
|
- for (const rootId of subgraph.roots.slice(0, 8)) {
|
|
|
|
|
- const node = subgraph.nodes.get(rootId);
|
|
|
|
|
- if (node) console.log(\` \${node.name} (\${node.kind}) - \${node.filePath}:\${node.startLine}\`);
|
|
|
|
|
|
|
+ console.log(\`Affected files: \${files.size}\`);
|
|
|
|
|
+ for (const [file, nodes] of [...files.entries()].sort((a,b) => b[1].length - a[1].length).slice(0, 10)) {
|
|
|
|
|
+ console.log(\` \${file}: \${nodes.slice(0, 5).join(', ')}\`);
|
|
|
}
|
|
}
|
|
|
- console.log('Top files:');
|
|
|
|
|
- for (const [file, nodes] of [...fileGroups.entries()].sort((a,b) => b[1].length - a[1].length).slice(0, 5)) {
|
|
|
|
|
- console.log(\` \${file} (\${nodes.length}): \${nodes.slice(0, 5).join(', ')}\`);
|
|
|
|
|
|
|
+
|
|
|
|
|
+ await cg.close();
|
|
|
|
|
+}
|
|
|
|
|
+test().catch(console.error);
|
|
|
|
|
+"
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+**What to check:**
|
|
|
|
|
+- Does changing a core interface/class show a wide impact radius?
|
|
|
|
|
+- Are the affected files reasonable (things that import/extend/use it)?
|
|
|
|
|
+- Is the impact radius non-empty? Zero impact on a core type means edges are missing.
|
|
|
|
|
+
|
|
|
|
|
+---
|
|
|
|
|
+
|
|
|
|
|
+### Test 5: Edge Extraction Quality
|
|
|
|
|
+
|
|
|
|
|
+Directly verify that the major edge types are being extracted for this language.
|
|
|
|
|
+
|
|
|
|
|
+```bash
|
|
|
|
|
+node -e "
|
|
|
|
|
+const { CodeGraph } = require('./dist/index.js');
|
|
|
|
|
+async function test() {
|
|
|
|
|
+ const cg = await CodeGraph.open('<codebase_path>');
|
|
|
|
|
+
|
|
|
|
|
+ // Check overall edge distribution
|
|
|
|
|
+ console.log('=== Edge distribution ===');
|
|
|
|
|
+ // (Use sqlite3 query from sanity check above)
|
|
|
|
|
+
|
|
|
|
|
+ // Find a class that extends another
|
|
|
|
|
+ const classes = cg.searchNodes('', { limit: 100, kinds: ['class'] });
|
|
|
|
|
+ let foundExtends = false, foundImplements = false;
|
|
|
|
|
+ for (const r of classes) {
|
|
|
|
|
+ const callees = cg.getCallees(r.node.id);
|
|
|
|
|
+ // getCallees returns all outgoing edges, check for extends/implements
|
|
|
|
|
+ // Better: use graph traversal
|
|
|
}
|
|
}
|
|
|
|
|
|
|
|
- // Does qualified lookup resolve correctly?
|
|
|
|
|
- console.log('\n=== qualified lookup ===');
|
|
|
|
|
- const qr = cg.searchNodes('<OwnerType>.<method>', { limit: 50 });
|
|
|
|
|
- const exact = qr.filter(r => r.node.qualifiedName.includes('<OwnerType>::<method>'));
|
|
|
|
|
- console.log(\`\${exact.length} match(es) for <OwnerType>.<method>\`);
|
|
|
|
|
- if (exact[0]) {
|
|
|
|
|
- const callees = cg.getCallees(exact[0].node.id);
|
|
|
|
|
- console.log('Callees:', callees.map(c => c.node.name).join(', '));
|
|
|
|
|
|
|
+ // Verify specific relationship types exist
|
|
|
|
|
+ const checks = [
|
|
|
|
|
+ { desc: 'contains edges (class → method)', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"contains\"' },
|
|
|
|
|
+ { desc: 'calls edges', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"calls\"' },
|
|
|
|
|
+ { desc: 'imports edges', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"imports\"' },
|
|
|
|
|
+ { desc: 'extends edges', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"extends\"' },
|
|
|
|
|
+ { desc: 'implements edges', query: 'SELECT COUNT(*) FROM edges WHERE kind = \"implements\"' },
|
|
|
|
|
+ ];
|
|
|
|
|
+ // Run these via sqlite3 (shown in sanity check section)
|
|
|
|
|
+
|
|
|
|
|
+ await cg.close();
|
|
|
|
|
+}
|
|
|
|
|
+test().catch(console.error);
|
|
|
|
|
+"
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+```bash
|
|
|
|
|
+sqlite3 <codebase_path>/.codegraph/codegraph.db "
|
|
|
|
|
+ SELECT kind, COUNT(*) as cnt FROM edges GROUP BY kind ORDER BY cnt DESC;
|
|
|
|
|
+"
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+**What to check:**
|
|
|
|
|
+- `contains` should be the most common (structural hierarchy).
|
|
|
|
|
+- `calls` should be plentiful — if near zero, call extraction is broken for this language.
|
|
|
|
|
+- `imports` should exist — if zero, import parsing is broken.
|
|
|
|
|
+- `extends` and `implements` should exist if the language has inheritance — if zero, `extractInheritance()` may not handle this language's AST.
|
|
|
|
|
+
|
|
|
|
|
+---
|
|
|
|
|
+
|
|
|
|
|
+### Test 6: Node Extraction Completeness
|
|
|
|
|
+
|
|
|
|
|
+Verify all expected node kinds are being extracted.
|
|
|
|
|
+
|
|
|
|
|
+```bash
|
|
|
|
|
+sqlite3 <codebase_path>/.codegraph/codegraph.db "
|
|
|
|
|
+ SELECT kind, COUNT(*) as cnt FROM nodes GROUP BY kind ORDER BY cnt DESC;
|
|
|
|
|
+"
|
|
|
|
|
+```
|
|
|
|
|
+
|
|
|
|
|
+**What to check for each language:**
|
|
|
|
|
+
|
|
|
|
|
+| Node Kind | Expected? | Notes |
|
|
|
|
|
+|-----------|-----------|-------|
|
|
|
|
|
+| `file` | Always | One per source file |
|
|
|
|
|
+| `class` | If language has classes | |
|
|
|
|
|
+| `method` | If language has methods | Should include owner type in `qualified_name` |
|
|
|
|
|
+| `function` | If language has top-level functions | |
|
|
|
|
|
+| `interface` | If language has interfaces/protocols | |
|
|
|
|
|
+| `enum` | If language has enums | |
|
|
|
|
|
+| `enum_member` | If language has enums | Values inside enums |
|
|
|
|
|
+| `import` | Always | One per import statement |
|
|
|
|
|
+| `variable` / `field` | Usually | Fields, constants, top-level vars |
|
|
|
|
|
+| `struct` | If language has structs | Go, Rust, C, Swift |
|
|
|
|
|
+| `trait` | If language has traits | Rust |
|
|
|
|
|
+
|
|
|
|
|
+If an expected node kind has 0 count, the language extractor is missing that AST type.
|
|
|
|
|
+
|
|
|
|
|
+---
|
|
|
|
|
+
|
|
|
|
|
+### Test 7: Real-World LLM Prompts
|
|
|
|
|
+
|
|
|
|
|
+This is the final and most important test. Simulate the kinds of questions a developer would actually ask an LLM that's using CodeGraph. For each prompt, run `findRelevantContext` (which powers `codegraph_explore`) and evaluate whether the returned context would let an LLM give a correct, complete answer.
|
|
|
|
|
+
|
|
|
|
|
+**Run at least 5 of these prompt styles, adapted to the actual codebase:**
|
|
|
|
|
+
|
|
|
|
|
+```bash
|
|
|
|
|
+node -e "
|
|
|
|
|
+const { CodeGraph } = require('./dist/index.js');
|
|
|
|
|
+async function test() {
|
|
|
|
|
+ const cg = await CodeGraph.open('<codebase_path>');
|
|
|
|
|
+
|
|
|
|
|
+ const prompts = [
|
|
|
|
|
+ // 1. \"How does X work?\" — subsystem understanding
|
|
|
|
|
+ 'How does the cache eviction policy work?',
|
|
|
|
|
+
|
|
|
|
|
+ // 2. \"Where is X implemented?\" — symbol location
|
|
|
|
|
+ 'Where is the LRU eviction logic implemented?',
|
|
|
|
|
+
|
|
|
|
|
+ // 3. \"What calls X?\" — usage discovery
|
|
|
|
|
+ 'What code triggers cache invalidation?',
|
|
|
|
|
+
|
|
|
|
|
+ // 4. \"I want to change X, what breaks?\" — impact assessment
|
|
|
|
|
+ 'If I change the Cache interface, what else is affected?',
|
|
|
|
|
+
|
|
|
|
|
+ // 5. \"How do X and Y interact?\" — cross-component relationships
|
|
|
|
|
+ 'How does CacheBuilder connect to LocalCache?',
|
|
|
|
|
+
|
|
|
|
|
+ // 6. \"Show me the flow from A to B\" — data/control flow
|
|
|
|
|
+ 'What happens when a cache entry expires?',
|
|
|
|
|
+
|
|
|
|
|
+ // 7. \"What are all the implementations of X?\" — polymorphism
|
|
|
|
|
+ 'What classes implement the Cache interface?',
|
|
|
|
|
+
|
|
|
|
|
+ // 8. Bug investigation prompt
|
|
|
|
|
+ 'Cache entries are not being evicted when they should be — where should I look?',
|
|
|
|
|
+ ];
|
|
|
|
|
+
|
|
|
|
|
+ for (const prompt of prompts) {
|
|
|
|
|
+ console.log(\`\n========================================\`);
|
|
|
|
|
+ console.log(\`PROMPT: \${prompt}\`);
|
|
|
|
|
+ console.log(\`========================================\`);
|
|
|
|
|
+
|
|
|
|
|
+ const subgraph = await cg.findRelevantContext(prompt, {
|
|
|
|
|
+ searchLimit: 8, traversalDepth: 3, maxNodes: 80, minScore: 0.2,
|
|
|
|
|
+ });
|
|
|
|
|
+
|
|
|
|
|
+ console.log(\`Result: \${subgraph.nodes.size} nodes, \${subgraph.edges.length} edges, \${subgraph.roots.length} entry points\`);
|
|
|
|
|
+
|
|
|
|
|
+ console.log('Entry points:');
|
|
|
|
|
+ for (const rootId of subgraph.roots.slice(0, 5)) {
|
|
|
|
|
+ const node = subgraph.nodes.get(rootId);
|
|
|
|
|
+ if (node) console.log(\` \${node.name} (\${node.kind}) — \${node.filePath}:\${node.startLine}\`);
|
|
|
|
|
+ }
|
|
|
|
|
+
|
|
|
|
|
+ const fileGroups = new Map();
|
|
|
|
|
+ for (const node of subgraph.nodes.values()) {
|
|
|
|
|
+ if (!fileGroups.has(node.filePath)) fileGroups.set(node.filePath, []);
|
|
|
|
|
+ fileGroups.get(node.filePath).push(node.name);
|
|
|
|
|
+ }
|
|
|
|
|
+ console.log('Top files:');
|
|
|
|
|
+ for (const [file, nodes] of [...fileGroups.entries()].sort((a,b) => b[1].length - a[1].length).slice(0, 5)) {
|
|
|
|
|
+ console.log(\` \${file} (\${nodes.length}): \${nodes.slice(0, 5).join(', ')}\`);
|
|
|
|
|
+ }
|
|
|
|
|
+
|
|
|
|
|
+ // PASS/FAIL judgment
|
|
|
|
|
+ const hasEntryPoints = subgraph.roots.length > 0;
|
|
|
|
|
+ const hasEdges = subgraph.edges.length > 0;
|
|
|
|
|
+ const hasMultipleFiles = fileGroups.size > 1;
|
|
|
|
|
+ console.log(\`\\nVERDICT: \${hasEntryPoints && hasEdges && hasMultipleFiles ? 'PASS' : 'FAIL — needs investigation'}\`);
|
|
|
}
|
|
}
|
|
|
|
|
|
|
|
await cg.close();
|
|
await cg.close();
|
|
@@ -90,36 +424,46 @@ test().catch(console.error);
|
|
|
"
|
|
"
|
|
|
```
|
|
```
|
|
|
|
|
|
|
|
-### 5. If results are bad, diagnose and fix
|
|
|
|
|
|
|
+**What to check for each prompt:**
|
|
|
|
|
+- Does it return entry points? Zero entry points = total failure.
|
|
|
|
|
+- Are the entry points **relevant** to the question? (Not just random symbols that happen to share a word.)
|
|
|
|
|
+- Does it span multiple files? Most real questions involve cross-file understanding.
|
|
|
|
|
+- Are relationships present? An LLM needs to understand how symbols connect, not just a list of names.
|
|
|
|
|
+- Would **you** be able to answer the question from this context?
|
|
|
|
|
+
|
|
|
|
|
+---
|
|
|
|
|
|
|
|
-| Symptom | Cause | Fix |
|
|
|
|
|
-|---------|-------|-----|
|
|
|
|
|
-| Target method not in top 10 of `searchNodes` | Owner type missing from `qualified_name` | Add `getReceiverType` to `src/extraction/languages/<lang>.ts` |
|
|
|
|
|
-| Explore returns irrelevant files | Common method name flooding exact matches | Check co-location boost in `src/db/queries.ts: findNodesByExactName` |
|
|
|
|
|
-| A key term is being dropped from search | It's in the STOP_WORDS list | Edit `src/search/query-utils.ts` |
|
|
|
|
|
-| `<OwnerType>.<method>` returns "not found" | `qualified_name` doesn't contain `OwnerType::method` | Fix `getReceiverType` output |
|
|
|
|
|
|
|
+## Diagnosing Failures
|
|
|
|
|
|
|
|
-### 6. Rebuild and re-test
|
|
|
|
|
|
|
+| Symptom | Likely Cause | Where to Fix |
|
|
|
|
|
+|---------|-------------|--------------|
|
|
|
|
|
+| Method missing owner type in `qualified_name` | Language needs `getReceiverType` | `src/extraction/languages/<lang>.ts` |
|
|
|
|
|
+| `codegraph_explore` returns irrelevant files | Common names flooding FTS; co-location boost not helping | `src/db/queries.ts: findNodesByExactName`, `src/context/index.ts` |
|
|
|
|
|
+| Zero `calls` edges | `callTypes` missing or wrong AST node type | `src/extraction/languages/<lang>.ts: callTypes` |
|
|
|
|
|
+| Zero `extends`/`implements` edges | `extractInheritance()` doesn't handle this language's AST | `src/extraction/tree-sitter.ts: extractInheritance()` |
|
|
|
|
|
+| Missing node kinds (no enums, no interfaces) | AST type not listed in extractor | `src/extraction/languages/<lang>.ts: enumTypes`, `interfaceTypes`, etc. |
|
|
|
|
|
+| Search term dropped from query | Term is in the stop words list | `src/search/query-utils.ts: STOP_WORDS` |
|
|
|
|
|
+| `qualified_name` missing class for nested methods | Extraction not walking parent stack correctly | `src/extraction/tree-sitter.ts: visitNode()` |
|
|
|
|
|
+| Import edges missing | `extractImport` returns null for this syntax | `src/extraction/languages/<lang>.ts: extractImport` |
|
|
|
|
|
+
|
|
|
|
|
+## After Fixing Issues
|
|
|
|
|
|
|
|
```bash
|
|
```bash
|
|
|
npm run build
|
|
npm run build
|
|
|
-# If you changed extraction (getReceiverType), must re-index:
|
|
|
|
|
rm -rf <codebase_path>/.codegraph
|
|
rm -rf <codebase_path>/.codegraph
|
|
|
node dist/bin/codegraph.js init -iv <codebase_path>
|
|
node dist/bin/codegraph.js init -iv <codebase_path>
|
|
|
-# Then re-run Step 4
|
|
|
|
|
|
|
+# Re-run the failing tests from above
|
|
|
```
|
|
```
|
|
|
|
|
|
|
|
-### 7. Run the test suite before finishing
|
|
|
|
|
|
|
+Always run the full test suite before marking a language as verified:
|
|
|
|
|
|
|
|
```bash
|
|
```bash
|
|
|
npm test
|
|
npm test
|
|
|
```
|
|
```
|
|
|
|
|
|
|
|
-All 378+ tests must pass.
|
|
|
|
|
-
|
|
|
|
|
-## How to Add `getReceiverType` for a Language
|
|
|
|
|
|
|
+## Adding `getReceiverType`
|
|
|
|
|
|
|
|
-**Only needed for languages where methods are top-level or outside their owner type in the AST.** If the language nests methods inside class/struct bodies (Python, Java, TypeScript, C#), the qualified name already includes the parent — verify with Step 3 before adding anything.
|
|
|
|
|
|
|
+**Only needed for languages where methods are top-level or outside their owner type in the AST.** If the language nests methods inside class/struct bodies (Python, Java, TypeScript, C#), the qualified name already includes the parent — verify with the sanity check before adding anything.
|
|
|
|
|
|
|
|
### 1. Add the hook to the language extractor
|
|
### 1. Add the hook to the language extractor
|
|
|
|
|
|
|
@@ -129,7 +473,7 @@ In `src/extraction/languages/<lang>.ts`, add `getReceiverType` to the extractor
|
|
|
getReceiverType: (node, source) => {
|
|
getReceiverType: (node, source) => {
|
|
|
// Extract the owner type name from the method's AST node.
|
|
// Extract the owner type name from the method's AST node.
|
|
|
// Return the type name string, or undefined if not applicable.
|
|
// Return the type name string, or undefined if not applicable.
|
|
|
- //
|
|
|
|
|
|
|
+ //
|
|
|
// The core extractMethod() in tree-sitter.ts will use this to set:
|
|
// The core extractMethod() in tree-sitter.ts will use this to set:
|
|
|
// qualifiedName = `${filePath}::${receiverType}::${methodName}`
|
|
// qualifiedName = `${filePath}::${receiverType}::${methodName}`
|
|
|
},
|
|
},
|
|
@@ -163,29 +507,32 @@ if (receiverType) {
|
|
|
|
|
|
|
|
| File | Role |
|
|
| File | Role |
|
|
|
|------|------|
|
|
|------|------|
|
|
|
-| `src/extraction/languages/<lang>.ts` | Language extractor — implement `getReceiverType` here |
|
|
|
|
|
-| `src/extraction/tree-sitter.ts` | Core extraction — `extractMethod()` uses the hook |
|
|
|
|
|
|
|
+| `src/extraction/languages/<lang>.ts` | Language extractor — node types, call types, `getReceiverType` |
|
|
|
|
|
+| `src/extraction/tree-sitter.ts` | Core extraction — `extractMethod()`, `extractCall()`, `extractInheritance()` |
|
|
|
| `src/extraction/tree-sitter-types.ts` | `LanguageExtractor` interface definition |
|
|
| `src/extraction/tree-sitter-types.ts` | `LanguageExtractor` interface definition |
|
|
|
| `src/search/query-utils.ts` | `STOP_WORDS`, `extractSearchTerms`, `scorePathRelevance` |
|
|
| `src/search/query-utils.ts` | `STOP_WORDS`, `extractSearchTerms`, `scorePathRelevance` |
|
|
|
-| `src/db/queries.ts` | `searchNodesFTS` (BM25), `findNodesByExactName` (co-location) |
|
|
|
|
|
-| `src/context/index.ts` | `findRelevantContext` — hybrid search + co-location boost |
|
|
|
|
|
-| `src/mcp/tools.ts` | MCP handlers — `matchesSymbol` uses `qualifiedName.includes("Type::method")` |
|
|
|
|
|
|
|
+| `src/db/queries.ts` | `searchNodesFTS` (BM25), `findNodesByExactName` (co-location boost) |
|
|
|
|
|
+| `src/context/index.ts` | `findRelevantContext` — hybrid search + graph traversal |
|
|
|
|
|
+| `src/mcp/tools.ts` | MCP tool handlers — `codegraph_explore` implementation |
|
|
|
|
|
|
|
|
-## Languages Completed
|
|
|
|
|
|
|
+## Language Status
|
|
|
|
|
+
|
|
|
|
|
+### Verified
|
|
|
|
|
|
|
|
- [x] **Go** — `getReceiverType` extracts receiver from `func (sl *Type) method()`
|
|
- [x] **Go** — `getReceiverType` extracts receiver from `func (sl *Type) method()`
|
|
|
-- [x] **Swift** — NOT needed. Tree-sitter parses `extension Type { }` as `class_declaration`, so methods already get owner type in `qualified_name` (e.g., `SimplifyApply.swift::SimplifyApply.swift::ApplyInst::simplify`)
|
|
|
|
|
|
|
+- [x] **Swift** — NOT needed. Tree-sitter nests methods inside class/extension bodies
|
|
|
|
|
+- [x] **Java** — NOT needed. Methods nested in class body. Verified against Guava
|
|
|
|
|
|
|
|
-## Languages To Do
|
|
|
|
|
|
|
+### Needs Verification
|
|
|
|
|
|
|
|
-Check these — only add `getReceiverType` if methods are top-level (not nested inside their owner type in the AST):
|
|
|
|
|
|
|
+Check these — may need `getReceiverType` if methods are top-level in the AST:
|
|
|
|
|
|
|
|
- [ ] Rust — methods in `impl Type { }` blocks
|
|
- [ ] Rust — methods in `impl Type { }` blocks
|
|
|
- [ ] C++ — out-of-class method definitions `Type::method()`
|
|
- [ ] C++ — out-of-class method definitions `Type::method()`
|
|
|
- [ ] Kotlin — extension functions `fun Type.method()`
|
|
- [ ] Kotlin — extension functions `fun Type.method()`
|
|
|
|
|
|
|
|
-Verify these DON'T need it (methods nested in class body → qualified name should already be correct):
|
|
|
|
|
-- [ ] Python — verify `qualified_name` includes class name
|
|
|
|
|
-- [ ] Java — verify `qualified_name` includes class name
|
|
|
|
|
-- [ ] TypeScript — verify `qualified_name` includes class name
|
|
|
|
|
-- [ ] C# — verify `qualified_name` includes class name
|
|
|
|
|
|
|
+Verify these DON'T need `getReceiverType` (methods nested in class body):
|
|
|
|
|
+
|
|
|
|
|
+- [ ] Python
|
|
|
|
|
+- [ ] TypeScript
|
|
|
|
|
+- [ ] C#
|