1 month ago · 627485b25c
--- a/.cursor/rules/codegraph.mdc
+++ b/.cursor/rules/codegraph.mdc
@@ -13,22 +13,22 @@ Use codegraph for **structural** questions — what calls what, what would break
 
															 | Question | Tool |
														
 
															 |---|---|
														
 
															+| "How does X work? / trace X / explain a system / architecture" | `codegraph_explore` (seed with symbol names) |
														
 
															 | "Where is X defined?" / "Find symbol named X" | `codegraph_search` |
														
 
															 | "What calls function Y?" | `codegraph_callers` |
														
 
															 | "What does Y call?" | `codegraph_callees` |
														
 
															 | "What would break if I changed Z?" | `codegraph_impact` |
														
 
															 | "Show me Y's signature / source / docstring" | `codegraph_node` |
														
 
															 | "Give me focused context for a task/area" | `codegraph_context` |
														
 
															-| "Survey an unfamiliar module/topic" | `codegraph_explore` |
														
 
															 | "What files exist under path/" | `codegraph_files` |
														
 
															 | "Is the index healthy?" | `codegraph_status` |
														
 
															 ### Rules of thumb
														
 
															+- **`codegraph_explore` is the workhorse for understanding questions** ("how does X work", "trace…", "explain the Y system"). Feed it the key symbol/file names and read its output (line-numbered source from many files in one call). If the question names nothing concrete, do one quick `codegraph_search`/`codegraph_context` to surface the names, then explore with them. Fill gaps with `codegraph_node`/Read — don't grep-and-read your way through; that's the loop explore replaces.
														
 
															+- **Delegating exploration to a subagent?** Tell it to call `codegraph_explore` first and trust the result. A generic "explore"-style agent defaults to grep+Read and treats codegraph as just a search index, throwing away the token savings.
														
 
															 - **Trust codegraph results.** They come from a full AST parse. Do NOT re-verify them with grep — that's slower, less accurate, and wastes context.
														
 
															 - **Don't grep first** when looking up a symbol by name. `codegraph_search` is faster and returns kind + location + signature in one call.
														
 
															-- **Don't chain `codegraph_search` + `codegraph_node`** when you just want context — `codegraph_context` is one call.
														
 
															-- **`codegraph_explore` is the heavy hitter** for unfamiliar areas — it returns full source from all relevant files in one call, but is token-heavy. If your harness supports parallel subagents (e.g., Claude Code's Task tool), spawn one for explore-class questions to keep main session context clean.
														
 
															 - **Index lag**: the file watcher debounces ~500ms behind writes; don't re-query immediately after editing a file in the same turn.
														
 
															 ### If `.codegraph/` doesn't exist
														
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -44,6 +44,17 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
															   VS Code ~12%. Agent-trust floor still holds — the Relationships section,
														
 
															   scored cluster selection, and structured-source output are all retained.
														
 
															   Thanks to [@essopsp](https://github.com/essopsp) for the repro.
														
 
															+- **MCP / tool guidance**: the tool descriptions and installed instructions
														
 
															+  now steer agents to treat `codegraph_explore` as the workhorse for
														
 
															+  understanding/architecture/"how does X work" questions — seed it with the
														
 
															+  key symbol names (a quick `codegraph_search`/`codegraph_context` first if
														
 
															+  the question names nothing concrete) and read its output, rather than
														
 
															+  searching and then Reading each file. Diagnosed from a benchmark run where
														
 
															+  Claude Code's Explore agent used `codegraph_search` + Read + grep (37 tool
														
 
															+  calls, ~90k tokens) and never called `codegraph_explore`, vs a
														
 
															+  general-purpose agent that led with explore (13 calls, ~55k tokens) for the
														
 
															+  same VS Code question. Updated in lockstep across `server-instructions.ts`,
														
 
															+  `instructions-template.ts`, and `.cursor/rules/codegraph.mdc`.
														
 
															 ### Fixed
														
 
															 - **MCP**: source-omission markers in `codegraph_explore` and
														
@@ -51,6 +62,15 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
															   `... (trimmed) ...`, `... (truncated) ...`) instead of C-style `//`
														
 
															   comments, which were misleading inside Python, Ruby, and other non-C
														
 
															   fenced source blocks.
														
 
															+- **Search/explore ranking**: test-file detection now recognizes Kotlin
														
 
															+  (`*Test.kt`, `jvmTest/`/`commonTest/`/`androidTest/` source sets), Swift
														
 
															+  (`*Tests.swift`), and other camelCase test conventions, so test code is
														
 
															+  properly deprioritized in `codegraph_explore` / `codegraph_context`
														
 
															+  results. Previously only Java/JS/Python/Go/Rust conventions were known, which let
														
 
															+  test files dominate exploration of Kotlin/Swift codebases (e.g. an OkHttp
														
 
															+  "trace a request" query returned 8/9 test files; now it surfaces
														
 
															+  `Call.kt`, `OkHttpClient.kt`, `Request.kt`, `Response.kt`). Capital-led
														
 
															+  matching keeps production files like `latest.kt` / `manifest.kt` unflagged.
														
 
															 ## [0.7.10] - 2026-05-19
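Review note: the changelog entry above says test code is "deprioritized" in `codegraph_explore` / `codegraph_context` results, but this diff does not show how the ranking uses the flag. The sketch below is one plausible shape, under stated assumptions: `SearchHit`, `deprioritizeTests`, and the `TEST_FILE_PENALTY` value are hypothetical names invented for illustration, and the predicate is passed in rather than imported so the example stays self-contained.

```typescript
// Hypothetical result shape; the project's real type is not shown in this diff.
interface SearchHit {
  filePath: string;
  score: number;
}

// Assumed penalty factor, chosen only for illustration.
const TEST_FILE_PENALTY = 0.5;

// Scale down the score of every hit the predicate flags as a test file,
// then sort by descending score so production files surface first.
function deprioritizeTests(
  hits: SearchHit[],
  isTest: (filePath: string) => boolean,
): SearchHit[] {
  return hits
    .map((hit) =>
      isTest(hit.filePath) ? { ...hit, score: hit.score * TEST_FILE_PENALTY } : hit,
    )
    .sort((a, b) => b.score - a.score);
}
```

With a penalty like this, a test file needs a substantially higher raw score than a production file to outrank it, which is the behavior the OkHttp example in the changelog describes.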
														
--- a/__tests__/is-test-file.test.ts
+++ b/__tests__/is-test-file.test.ts
@@ -0,0 +1,53 @@
 
															+/**
														
 
															+ * isTestFile heuristic — test-file detection used to deprioritize test code in
														
 
															+ * search/explore ranking.
														
 
															+ *
														
 
															+ * Regression coverage for the cold-query fix: the heuristic previously only
														
 
															+ * knew Java/JS/Python conventions, so Kotlin (`*Test.kt`, `jvmTest/`), Swift
														
 
															+ * (`*Tests.swift`), and camelCase test source-set dirs slipped through — which
														
 
															+ * let OkHttp's tests flood `codegraph_explore` results on a plain-language
														
 
															+ * query. The false-positive guards matter just as much: `latest.kt` /
														
 
															+ * `manifest.kt` / a `RealCall.kt` production file must NOT be flagged.
														
 
															+ */
														
 
															+import { describe, it, expect } from 'vitest';
														
 
															+import { isTestFile } from '../src/search/query-utils';
														
 
															+
														
 
															+describe('isTestFile', () => {
														
 
															+  it('flags Kotlin test files and source sets', () => {
														
 
															+    expect(isTestFile('okhttp/src/jvmTest/kotlin/okhttp3/CallTest.kt')).toBe(true);
														
 
															+    expect(isTestFile('okhttp/src/commonTest/kotlin/okhttp3/CompressionInterceptorTest.kt')).toBe(true);
														
 
															+    expect(isTestFile('app/src/androidTest/java/com/example/FooTest.kt')).toBe(true);
														
 
															+    expect(isTestFile('module/src/integrationTest/kotlin/BarSpec.kt')).toBe(true);
														
 
															+  });
														
 
															+
														
 
															+  it('flags Swift test files', () => {
														
 
															+    expect(isTestFile('Tests/SessionTests.swift')).toBe(true);
														
 
															+    expect(isTestFile('Sources/FooTest.swift')).toBe(true);
														
 
															+  });
														
 
															+
														
 
															+  it('still flags the previously-supported conventions', () => {
														
 
															+    expect(isTestFile('foo/test_bar.py')).toBe(true);
														
 
															+    expect(isTestFile('pkg/bar_test.go')).toBe(true);
														
 
															+    expect(isTestFile('src/foo.test.ts')).toBe(true);
														
 
															+    expect(isTestFile('src/foo.spec.ts')).toBe(true);
														
 
															+    expect(isTestFile('com/example/FooTest.java')).toBe(true);
														
 
															+    expect(isTestFile('com/example/FooTestCase.java')).toBe(true);
														
 
															+    expect(isTestFile('project/__tests__/foo.ts')).toBe(true);
														
 
															+    expect(isTestFile('project/tests/foo.rb')).toBe(true);
														
 
															+  });
														
 
															+
														
 
															+  it('does NOT flag production files that merely contain lowercase "test"', () => {
														
 
															+    // The fix is capital-led so camelCase boundaries distinguish these.
														
 
															+    expect(isTestFile('src/latest/loader.kt')).toBe(false);
														
 
															+    expect(isTestFile('lib/manifest.kt')).toBe(false);
														
 
															+    expect(isTestFile('okhttp/src/jvmMain/kotlin/okhttp3/internal/connection/RealCall.kt')).toBe(false);
														
 
															+    expect(isTestFile('src/contestEntry.ts')).toBe(false);
														
 
															+    expect(isTestFile('pkg/greatest.go')).toBe(false);
														
 
															+  });
														
 
															+
														
 
															+  it('does NOT flag ordinary production source', () => {
														
 
															+    expect(isTestFile('src/flask/app.py')).toBe(false);
														
 
															+    expect(isTestFile('src/vs/workbench/api/common/extensionHostMain.ts')).toBe(false);
														
 
															+    expect(isTestFile('okhttp/src/commonJvmAndroid/kotlin/okhttp3/OkHttpClient.kt')).toBe(false);
														
 
															+  });
														
 
															+});
														
--- a/src/installer/instructions-template.ts
+++ b/src/installer/instructions-template.ts
@@ -31,22 +31,22 @@ Use codegraph for **structural** questions — what calls what, what would break
 
															 | Question | Tool |
														
 
															 |---|---|
														
 
															+| "How does X work? / trace X / explain a system / architecture" | \`codegraph_explore\` (seed with symbol names) |
														
 
															 | "Where is X defined?" / "Find symbol named X" | \`codegraph_search\` |
														
 
															 | "What calls function Y?" | \`codegraph_callers\` |
														
 
															 | "What does Y call?" | \`codegraph_callees\` |
														
 
															 | "What would break if I changed Z?" | \`codegraph_impact\` |
														
 
															 | "Show me Y's signature / source / docstring" | \`codegraph_node\` |
														
 
															 | "Give me focused context for a task/area" | \`codegraph_context\` |
														
 
															-| "Survey an unfamiliar module/topic" | \`codegraph_explore\` |
														
 
															 | "What files exist under path/" | \`codegraph_files\` |
														
 
															 | "Is the index healthy?" | \`codegraph_status\` |
														
 
															 ### Rules of thumb
														
 
															+- **\`codegraph_explore\` is the workhorse for understanding questions** ("how does X work", "trace…", "explain the Y system"). Feed it the key symbol/file names and read its output (line-numbered source from many files in one call). If the question names nothing concrete, do one quick \`codegraph_search\`/\`codegraph_context\` to surface the names, then explore with them. Fill gaps with \`codegraph_node\`/Read — don't grep-and-read your way through; that's the loop explore replaces.
														
 
															+- **Delegating exploration to a subagent?** Tell it to call \`codegraph_explore\` first and trust the result. A generic "explore"-style agent defaults to grep+Read and treats codegraph as just a search index, throwing away the token savings.
														
 
															 - **Trust codegraph results.** They come from a full AST parse. Do NOT re-verify them with grep — that's slower, less accurate, and wastes context.
														
 
															 - **Don't grep first** when looking up a symbol by name. \`codegraph_search\` is faster and returns kind + location + signature in one call.
														
 
															-- **Don't chain \`codegraph_search\` + \`codegraph_node\`** when you just want context — \`codegraph_context\` is one call.
														
 
															-- **\`codegraph_explore\` is the heavy hitter** for unfamiliar areas — it returns full source from all relevant files in one call, but is token-heavy. If your harness supports parallel subagents (e.g., Claude Code's Task tool), spawn one for explore-class questions to keep main session context clean.
														
 
															 - **Index lag**: the file watcher debounces ~500ms behind writes; don't re-query immediately after editing a file in the same turn.
														
 
															 ### If \`.codegraph/\` doesn't exist
														
--- a/src/mcp/server-instructions.ts
+++ b/src/mcp/server-instructions.ts
@@ -24,27 +24,27 @@ editing code, not during.
 
															 ## Tool selection by intent
														
 
															-- **"What is the symbol named X?"** → \`codegraph_search\`
														
 
															-- **"What's the deal with this task / feature / area?"** → \`codegraph_context\` (PRIMARY — composes search + node + callers + callees in one call)
														
 
															+- **"How does X work? / trace X end to end / explain the Y system / architecture?"** → \`codegraph_explore\` (PRIMARY for understanding — seed it with the key symbol names, read its output, don't grep+Read your way there)
														
 
															+- **"What is the symbol named X? / where is X defined?"** → \`codegraph_search\` (pinpoint lookups)
														
 
															+- **"What's the deal with this task / feature / area?"** → \`codegraph_context\` (lighter composed view of search + node + callers + callees)
														
 
															 - **"What calls this?"** → \`codegraph_callers\`
														
 
															 - **"What does this call?"** → \`codegraph_callees\`
														
 
															 - **"What would changing this break?"** → \`codegraph_impact\`
														
 
															 - **"Show me this symbol's source / signature / docstring."** → \`codegraph_node\`
														
 
															-- **"Survey an unfamiliar topic / pattern / module."** → \`codegraph_explore\` (heavier; deep dive)
														
 
															 - **"What's in directory X?"** → \`codegraph_files\`
														
 
															 - **"Is the index ready / what's its size?"** → \`codegraph_status\`
														
 
															 ## Common chains
														
 
															-- **Onboarding**: \`codegraph_context\` first. If still unclear, \`codegraph_explore\` for breadth, then \`codegraph_node\` on specific symbols.
														
 
															+- **Understanding / onboarding**: feed \`codegraph_explore\` the key symbol/file names and read its output (line-numbered source from many files in one call). If the question names nothing concrete, do ONE quick \`codegraph_search\` / \`codegraph_context\` to surface the names, then explore with them. Fill remaining gaps with \`codegraph_node\` / Read — don't drop back to grep+Read for the whole topic.
														
 
															 - **Refactor planning**: \`codegraph_search\` → \`codegraph_callers\` → \`codegraph_impact\`. The blast-radius answer comes from impact, not from walking callers manually.
														
 
															 - **Debugging a regression**: \`codegraph_callers\` of the suspected symbol; widen with \`codegraph_impact\` if an unexpected call appears.
														
 
															 ## Anti-patterns
														
 
															+- **Don't search-then-Read your way through an understanding question** — feed the names you find into \`codegraph_explore\` instead of Reading the files one by one; it does that whole loop in one call and returns line numbers you can cite directly.
														
 
															 - **Don't grep first** when looking up a symbol by name — \`codegraph_search\` is faster and returns kind + location + signature.
														
 
															-- **Don't chain \`codegraph_search\` + \`codegraph_node\`** when you just want context — \`codegraph_context\` is one round-trip.
														
 
															-- **Don't use \`codegraph_explore\` for narrow questions** — it's a multi-call deep dive, expensive in tokens. Save it for genuine "I'm new here" surveys.
														
 
															+- **Don't reach for \`codegraph_explore\` on a pinpoint "where is X defined" lookup** — \`codegraph_search\` is one cheap call.
														
 
															 - **Don't query the index immediately after editing a file** — the watcher needs ~500ms to debounce + sync. Wait for the next turn.
														
 
															 ## Limitations
														
--- a/src/mcp/tools.ts
+++ b/src/mcp/tools.ts
@@ -238,7 +238,7 @@ const projectPathProperty: PropertySchema = {
 
															 export const tools: ToolDefinition[] = [
														
 
															   {
														
 
															     name: 'codegraph_search',
														
 
															-    description: 'Quick symbol search by name. Returns locations only (no code). Use codegraph_context instead for comprehensive task context.',
														
 
															+    description: 'Quick symbol search by name. Returns locations only (no code) — best for pinpoint "where is X defined / find the symbol named X" lookups. For understanding how something works or tracing a flow, lead with codegraph_explore instead of searching then reading.',
														
 
															     inputSchema: {
														
 
															       type: 'object',
														
 
															       properties: {
														
@@ -368,13 +368,13 @@ export const tools: ToolDefinition[] = [
 
															   },
														
 
															   {
														
 
															     name: 'codegraph_explore',
														
 
															-    description: 'Deep exploration tool — returns comprehensive context for a topic in a SINGLE call. Groups all relevant source code by file (contiguous sections, not snippets), includes a relationship map, and uses deeper graph traversal. Designed to replace multiple codegraph_node + file Read calls. Use this instead of codegraph_context when you need thorough understanding. IMPORTANT: Use specific symbol names, file names, or short code terms in your query — NOT natural language sentences. Before calling this, use codegraph_search to discover relevant symbol names, then include those names in your query. Bad: "how are agent prompts loaded and passed to the CLI". Good: "readAgentsFromDirectory createClaudeSession chat-manager agents.ts".',
														
 
															+    description: 'PRIMARY TOOL for understanding questions — "how does X work", "trace X end to end", "explain the Y system", architecture/onboarding. Returns comprehensive context in a SINGLE call: relevant source grouped by file (contiguous, line-numbered sections, not snippets) + a relationship map + deep graph traversal. It REPLACES the grep+Read exploration loop: feed it the key symbol/file names and read its output — do NOT Read the files one by one. It works best when your query names the relevant symbols (e.g. "readAgentsFromDirectory createClaudeSession chat-manager agents.ts"); if the question is a plain sentence that names nothing concrete, do ONE quick codegraph_search or codegraph_context to surface the names, then call this with them. After exploring, use codegraph_node / Read only to fill specific gaps it did not cover. Prefer codegraph_search over this only for a pinpoint "where is X defined" lookup.',
														
 
															     inputSchema: {
														
 
															       type: 'object',
														
 
															       properties: {
														
 
															         query: {
														
 
															           type: 'string',
														
 
															-          description: 'Symbol names, file names, or short code terms to explore (e.g., "AuthService loginUser session-manager", "GraphTraverser BFS impact traversal.ts"). Use codegraph_search first to find relevant names.',
														
 
															+          description: 'What to explore. A short list of symbol/file/keyword terms works best (e.g., "AuthService loginUser session-manager", "GraphTraverser BFS impact traversal.ts"), but a plain-language phrase also works — the tool runs its own retrieval. No need to codegraph_search first.',
														
 
															         },
														
 
															         maxFiles: {
														
 
															           type: 'number',
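Review note: the updated description asks agents to seed `codegraph_explore` with symbol and file names. As a rough illustration, the sketch below builds the MCP `tools/call` JSON-RPC request such a call could translate to. `query` and `maxFiles` are the input properties visible in this hunk; `ToolCallRequest`, `buildExploreRequest`, and the example values are hypothetical, and a real client would normally go through an MCP SDK rather than assemble the message by hand.

```typescript
// Minimal shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: Record<string, unknown> };
}

// Join the seed terms into a single space-separated query string,
// matching the "short list of symbol/file/keyword terms" the schema describes.
function buildExploreRequest(
  id: number,
  terms: string[],
  maxFiles?: number,
): ToolCallRequest {
  const args: Record<string, unknown> = { query: terms.join(' ') };
  // Only include maxFiles when the caller supplies it.
  if (maxFiles !== undefined) {
    args.maxFiles = maxFiles;
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'codegraph_explore', arguments: args },
  };
}

// Example: seed the query with names taken from the schema's own example.
const request = buildExploreRequest(1, ['AuthService', 'loginUser', 'session-manager']);
console.log(JSON.stringify(request, null, 2));
```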
														
--- a/src/search/query-utils.ts
+++ b/src/search/query-utils.ts
@@ -207,36 +207,43 @@ export function scorePathRelevance(filePath: string, query: string): number {
 
															  */
														
 
															 export function isTestFile(filePath: string): boolean {
														
 
															   const lower = filePath.toLowerCase();
														
 
															-  const fileName = path.basename(lower);
														
 
															-
														
 
															-  // Common test file patterns
														
 
															-  return (
														
 
															-    fileName.startsWith('test_') ||
														
 
															-    fileName.startsWith('test.') ||
														
 
															-    fileName.endsWith('.test.ts') ||
														
 
															-    fileName.endsWith('.test.js') ||
														
 
															-    fileName.endsWith('.test.tsx') ||
														
 
															-    fileName.endsWith('.test.jsx') ||
														
 
															-    fileName.endsWith('.spec.ts') ||
														
 
															-    fileName.endsWith('.spec.js') ||
														
 
															-    fileName.endsWith('_test.go') ||
														
 
															-    fileName.endsWith('_test.py') ||
														
 
															-    fileName.endsWith('_test.rs') ||
														
 
															-    fileName.endsWith('Tests.java') ||
														
 
															-    fileName.endsWith('Test.java') ||
														
 
															-    fileName.endsWith('Tester.java') ||
														
 
															-    fileName.endsWith('TestCase.java') ||
														
 
															-    lower.includes('/tests/') ||
														
 
															-    lower.includes('/test/') ||
														
 
															-    lower.includes('/__tests__/') ||
														
 
															-    lower.includes('/spec/') ||
														
 
															-    lower.includes('/testlib/') ||
														
 
															+  const fileName = path.basename(filePath);   // original case — needed for camelCase boundaries
														
 
															+  const lowerName = fileName.toLowerCase();
														
 
															+
														
 
															+  // --- Filename patterns ---
														
 
															+  if (
														
 
															+    lowerName.startsWith('test_') ||                              // python: test_foo.py
														
 
															+    lowerName.startsWith('test.') ||
														
 
															+    // separator-delimited: foo_test.go, foo.test.ts, foo-spec.rb, bar_spec.py
														
 
															+    /[._-](test|tests|spec|specs)\.[a-z0-9]+$/.test(lowerName) ||
														
 
															+    // CamelCase suffix (Java/Kotlin/Swift/C#/Scala): FooTest.kt, BarTests.swift,
														
 
															+    // BazSpec.scala, QuxTestCase.java. Capital-led so "latest.kt"/"manifest.kt"
														
 
															+    // (lowercase "test") are NOT matched.
														
 
															+    /(?:Test|Tests|TestCase|Tester|Spec|Specs)\.[A-Za-z0-9]+$/.test(fileName)
														
 
															+  ) {
														
 
															+    return true;
														
 
															+  }
														
 
															+
														
 
															+  // --- Directory patterns ---
														
 
															+  if (
														
 
															+    lower.includes('/tests/') || lower.includes('/test/') ||
														
 
															+    lower.includes('/__tests__/') || lower.includes('/spec/') ||
														
 
															+    lower.includes('/specs/') || lower.includes('/testlib/') ||
														
 
															     lower.includes('/testing/') ||
														
 
															-    // Non-production directories: examples, samples, benchmarks, fixtures, demos.
														
 
															-    // Check both mid-path (/integration/) and start-of-path (integration/) since
														
 
															-    // file paths may be stored as relative paths without a leading slash.
														
 
															-    matchesNonProductionDir(lower)
														
 
															-  );
														
 
															+    lower.startsWith('test/') || lower.startsWith('tests/') ||
														
 
															+    lower.startsWith('spec/') || lower.startsWith('specs/') ||
														
 
															+    // CamelCase test source-set dirs (Kotlin Multiplatform / Gradle / Xcode):
														
 
															+    // jvmTest/, commonTest/, androidTest/, iosTest/, integrationTest/. Capital-led
														
 
															+    // so "latest/" / "manifest/" are not matched.
														
 
															+    /(?:^|\/)[A-Za-z0-9]*(?:Test|Tests|Spec)\//.test(filePath)
														
 
															+  ) {
														
 
															+    return true;
														
 
															+  }
														
 
															+
														
 
															+  // Non-production directories: examples, samples, benchmarks, fixtures, demos.
														
 
															+  // Check both mid-path (/integration/) and start-of-path (integration/) since
														
 
															+  // file paths may be stored as relative paths without a leading slash.
														
 
															+  return matchesNonProductionDir(lower);
														
 
															 }
														
 
															 /**
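Review note: the capital-led matching is the subtle part of this change, so here is a reduced sketch that isolates it. The two regular expressions are copied from the hunk above; `looksLikeCamelCaseTest` is a hypothetical helper name, and the sketch deliberately omits the lowercase filename patterns, the plain `/test/`-style directory checks, and `matchesNonProductionDir`, so it is not a substitute for `isTestFile`.

```typescript
// CamelCase filename suffix: FooTest.kt, BarTests.swift, BazSpec.scala.
const CAMEL_FILE_SUFFIX = /(?:Test|Tests|TestCase|Tester|Spec|Specs)\.[A-Za-z0-9]+$/;
// CamelCase source-set directory: jvmTest/, commonTest/, integrationTest/.
const CAMEL_TEST_DIR = /(?:^|\/)[A-Za-z0-9]*(?:Test|Tests|Spec)\//;

function looksLikeCamelCaseTest(filePath: string): boolean {
  // Keep the original case: lowercasing would erase the capital "T"/"S"
  // boundary that separates "CallTest.kt" from "latest.kt".
  const fileName = filePath.slice(filePath.lastIndexOf('/') + 1);
  return CAMEL_FILE_SUFFIX.test(fileName) || CAMEL_TEST_DIR.test(filePath);
}

const samples = [
  'okhttp/src/jvmTest/kotlin/okhttp3/CallTest.kt',
  'Tests/SessionTests.swift',
  'src/latest/loader.kt',
  'lib/manifest.kt',
  'src/contestEntry.ts',
];
for (const sample of samples) {
  console.log(sample, looksLikeCamelCaseTest(sample));
}
```

Because both patterns require an uppercase `T` or `S`, paths such as `src/latest/loader.kt` and `src/contestEntry.ts` fall through, which is the false-positive guard the new tests assert.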