
feat: extend coverage to all supported languages, not just Go

PR review feedback: the audit was Go-driven, so the patterns I added
were Go-flavored. Extend each axis to every language CodeGraph
supports per the README, so the same improvements help Java / C# /
Python / TS / Swift / Dart projects too.

**generated-detection.ts** — Added patterns for:
- TS/JS: `.gen.[jt]sx?`, `.pb.[jt]s`, `_pb.[jt]s`, `_grpc_pb.[jt]s`
  (ts-proto, gRPC-web, Apollo / GraphQL codegen, Hasura).
- Python: `_pb2.pyi` (mypy stubs from protobuf).
- C#: `.g.cs` (T4 / Razor codegen), `Grpc.cs` (protoc-gen-csharp).
- Java: `OuterClass.java` (protoc-gen-java), `Grpc.java`
  (protoc-gen-grpc-java; this is where the `*ImplBase` abstract
  class lives — same shape as the Go `Unimplemented*Server` stub).
- Swift: `.pb.swift` (protoc-gen-swift).
- Dart: `.pb.dart`, `.pbgrpc.dart`, `.chopper.dart`.
- Rust: `.generated.rs`.
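
A minimal sketch of how suffix patterns like these classify paths. The regexes are copied from this commit; `GENERATED_SUFFIXES` and `looksGenerated` are hypothetical names for illustration, and whether the real detector matches the full path or only the basename is an assumption not settled by this diff.

```typescript
// Subset of the patterns added in this commit (regexes copied verbatim).
// The array name is hypothetical; the real list is GENERATED_PATTERNS.
const GENERATED_SUFFIXES: ReadonlyArray<RegExp> = [
  /\.gen\.[jt]sx?$/,
  /_grpc_pb\.[jt]s$/,
  /_pb2\.pyi$/,
  /\.g\.cs$/,
  /Grpc\.java$/,
  /\.pb\.swift$/,
  /\.pbgrpc\.dart$/,
  /\.generated\.rs$/,
];

// Hypothetical helper: true when any suffix pattern matches the path.
function looksGenerated(path: string): boolean {
  return GENERATED_SUFFIXES.some((re) => re.test(path));
}

const samplePaths = [
  'src/api/schema.gen.ts',
  'proto/user_pb2.pyi',
  'GreeterGrpc.java',
  'lib/api.pbgrpc.dart',
  'src/UserService.java',
  'src/general.ts',
];
for (const p of samplePaths) {
  console.log(p, looksGenerated(p));
}
```

Because `Grpc\.java$` and `Grpc\.cs$` have no separator before `Grpc`, a hand-written file whose name merely ends in `Grpc` would also match; that trade-off is inherent to suffix matching.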

**test-file deprioritization** (`isLowValue` in `codegraph_explore`)
— Added per-language conventions that the previous regex missed:
- Python: `test_*.py` (pytest discovery) and `*_test.py`.
- Ruby: `*_test.rb` (minitest) — `*_spec.rb` already covered.
- C#: `*Tests.cs`, `*Test.cs`, `*Spec.cs`.
- Swift: `*Tests.swift` and `*Test.swift` (XCTest).
- Dart: `*_test.dart`.
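
A rough sketch of the effect on ranking. The predicate below copies the regexes from this commit's `isLowValue`; `rankPaths` is a hypothetical stand-in for the real comparator in `codegraph_explore`, which this diff shows only in part.

```typescript
// Regexes copied from the isLowValue predicate in this commit.
const isLowValue = (p: string): boolean =>
  /\/(tests?|__tests?__|spec)\//i.test(p) ||
  /_test\.go$/i.test(p) ||
  /(?:^|\/)test_[^/]+\.py$/i.test(p) ||
  /_test\.py$/i.test(p) ||
  /_spec\.rb$/i.test(p) ||
  /_test\.rb$/i.test(p) ||
  /\.(test|spec)\.[jt]sx?$/i.test(p) ||
  /(Test|Spec|Tests)\.(java|kt|scala)$/.test(p) ||
  /(Tests?|Spec)\.cs$/.test(p) ||
  /Tests?\.swift$/.test(p) ||
  /_test\.dart$/i.test(p) ||
  /\bicons?\b/i.test(p) ||
  /\bi18n\b/i.test(p);

// Hypothetical ranking step: a stable sort that moves low-value paths
// after everything else while preserving the original relative order.
function rankPaths(paths: string[]): string[] {
  return [...paths].sort(
    (a, b) => Number(isLowValue(a)) - Number(isLowValue(b)),
  );
}

const ranked = rankPaths([
  'server/storage/mvcc/watchable_store_test.go',
  'app/test_models.py',
  'server/storage/mvcc/watchable_store.go',
  'Sources/App/LoginTests.swift',
  'app/models.py',
]);
console.log(ranked);
```

The `(?:^|\/)` anchor on the pytest pattern keeps names such as `contest_models.py` from being treated as tests.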

**IFACE_OVERRIDE_LANGS** in `callback-synthesizer.ts`'s
`interfaceOverrideEdges` was extended from `java, kotlin` to
`java, kotlin, csharp, typescript, javascript, swift, scala`. These
languages share the same shape: a class declares a nominal
`implements`/`extends` relationship to an interface or abstract base.
The loop now iterates `struct` nodes (Swift value types conforming to
a protocol) in addition to `class` nodes. The existing
matchesSymbol-style logic and `getOutgoingEdges(..., ['implements',
'extends'])` work unchanged.
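
The bridging idea can be sketched over a tiny in-memory graph. Everything here is illustrative: `GNode`, `GEdge`, and `bridgeOverrides` are hypothetical, methods are matched by plain name equality (the real matching logic is not shown in this diff), and emitting a `calls` edge is an assumption about what the synthesizer produces.

```typescript
type Kind = 'class' | 'struct' | 'interface' | 'method';
type EdgeKind = 'contains' | 'implements' | 'extends' | 'calls';
interface GNode { id: string; kind: Kind; name: string; language: string; }
interface GEdge { source: string; target: string; kind: EdgeKind; }

const IFACE_OVERRIDE_LANGS = new Set([
  'java', 'kotlin', 'csharp', 'typescript', 'javascript', 'swift', 'scala',
]);

// Hypothetical re-implementation: link each supertype method to the
// same-named method on every class or struct that implements/extends it.
function bridgeOverrides(nodes: GNode[], edges: GEdge[]): GEdge[] {
  const byId = new Map(nodes.map((n) => [n.id, n] as const));
  const outgoing = (id: string, kinds: string[]) =>
    edges.filter((e) => e.source === id && kinds.includes(e.kind));
  const methodsOf = (id: string) =>
    outgoing(id, ['contains'])
      .map((e) => byId.get(e.target))
      .filter((n): n is GNode => !!n && n.kind === 'method');

  const result: GEdge[] = [];
  const seen = new Set<string>();
  for (const kind of ['class', 'struct'] as const) {
    for (const cls of nodes.filter((n) => n.kind === kind)) {
      const impl = methodsOf(cls.id).filter((m) => IFACE_OVERRIDE_LANGS.has(m.language));
      if (impl.length === 0) continue;
      for (const sup of outgoing(cls.id, ['implements', 'extends'])) {
        for (const superMethod of methodsOf(sup.target)) {
          const match = impl.find((m) => m.name === superMethod.name);
          if (!match) continue;
          const key = `${superMethod.id}->${match.id}`;
          if (seen.has(key)) continue; // avoid duplicate edges
          seen.add(key);
          result.push({ source: superMethod.id, target: match.id, kind: 'calls' });
        }
      }
    }
  }
  return result;
}

// Swift-style example: struct Circle conforms to protocol Shape.
const demoNodes: GNode[] = [
  { id: 'Shape', kind: 'interface', name: 'Shape', language: 'swift' },
  { id: 'Shape.area', kind: 'method', name: 'area', language: 'swift' },
  { id: 'Circle', kind: 'struct', name: 'Circle', language: 'swift' },
  { id: 'Circle.area', kind: 'method', name: 'area', language: 'swift' },
];
const demoEdges: GEdge[] = [
  { source: 'Shape', target: 'Shape.area', kind: 'contains' },
  { source: 'Circle', target: 'Circle.area', kind: 'contains' },
  { source: 'Circle', target: 'Shape', kind: 'implements' },
];
const bridged = bridgeOverrides(demoNodes, demoEdges);
console.log(bridged);
```

With only `class` in the kind list, the `Circle` struct above would be skipped and no edge would be produced, which is the gap the `struct` iteration closes.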

**CLAUDE.md** now has a house rule: when the user references issues
or comments, anchor them to a date and version (last release vs.
last main commit vs. current branch tip) BEFORE concluding a fix is
incomplete. The Issue #388 comments from May 25-27 were responding to
the released v0.9.5 / merged-PR-469 state, not to this branch's
in-flight work. The new rule walks through the disambiguation:
`grep -m1 '^## \[' CHANGELOG.md` for the release version, `git log
--first-parent main -1` for the main tip.

Tests: 1076/1076 still pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Colby McHenry 3 weeks ago
parent
commit
27524e36a9

+ 5 - 0
CLAUDE.md

@@ -256,3 +256,8 @@ publish actions on shared state. Write the files, hand the user the commands.
 - The `0.7.x` line is in active multi-agent rollout. Any change to `src/installer/` (especially `targets/`) needs corresponding test coverage and a CHANGELOG entry — installer regressions break every new install silently.
 - When changing what the MCP tools do or how agents should use them, update **all three** of `src/mcp/server-instructions.ts`, `src/installer/instructions-template.ts`, and `.cursor/rules/codegraph.mdc` — they're written to different places but say the same thing.
 - CodeGraph provides **code context**, not product requirements. For new features, ask the user about UX, edge cases, and acceptance criteria — the graph won't tell you.
+- **When the user references issues, PR comments, or external reports, anchor them to a date and version before drawing conclusions.** Check the comment's `createdAt` against:
+  - The **last released version** — `grep -m1 '^## \[' CHANGELOG.md` shows the top-of-file version (older releases follow). A comment dated before the latest `## [X.Y.Z] - YYYY-MM-DD` is reacting to *released* state — work that's only on `main` or on an unmerged branch doesn't apply.
+  - The **last main commit** — `git log --first-parent main -1 --format='%ai %h %s'`. A comment after the last release but before a fix on main may already be addressed there but unreleased.
+  - The **current branch's tip** — your own unmerged work obviously can't be what the comment is reacting to.
+  Always disambiguate "released," "merged-but-unreleased," and "in-progress" before agreeing that a user-reported problem is unfixed (or that a fix is incomplete). A user saying "your fix only covers X" about a recent PR is usually pointing at the *released* shortcomings — your in-flight branch may already address them but they have no way to know that.

+ 26 - 3
src/extraction/generated-detection.ts

@@ -34,15 +34,38 @@ const GENERATED_PATTERNS: ReadonlyArray<RegExp> = [
   /_mock\.go$/,
   /_mocks\.go$/,
   /^mock_[^/]+\.go$/,
-  // TypeScript / JavaScript — common codegen suffix
+  // TypeScript / JavaScript — common codegen suffixes (Apollo / GraphQL
+  // codegen, Prisma, Hasura, ts-proto, gRPC-web, swagger-codegen).
   /\.generated\.[jt]sx?$/,
-  // Python — protobuf
+  /\.gen\.[jt]sx?$/,
+  /\.pb\.[jt]s$/,
+  /_pb\.[jt]s$/,
+  /_grpc_pb\.[jt]s$/,
+  // Python — protobuf / gRPC / openapi-codegen
   /_pb2(_grpc)?\.py$/,
+  /_pb2\.pyi$/,
   // C++ — protobuf
   /\.pb\.(cc|h)$/,
-  // Dart — build_runner / freezed
+  // C# — protobuf / gRPC (protoc-gen-csharp puts output under obj/ but
+  // many projects also commit *.g.cs and *Grpc.cs siblings)
+  /\.g\.cs$/,
+  /Grpc\.cs$/,
+  // Java — protobuf / gRPC: protoc-gen-java emits `*OuterClass.java`,
+  // protoc-gen-grpc-java emits `*Grpc.java`. The XxxImplBase abstract
+  // class lives inside Xxx*Grpc.java.
+  /OuterClass\.java$/,
+  /Grpc\.java$/,
+  // Swift — protobuf
+  /\.pb\.swift$/,
+  // Dart — build_runner / freezed / json_serializable / chopper
   /\.g\.dart$/,
   /\.freezed\.dart$/,
+  /\.pb\.dart$/,
+  /\.pbgrpc\.dart$/,
+  /\.chopper\.dart$/,
+  // Rust — common build.rs OUT_DIR outputs are usually outside the source
+  // tree, but in-tree generated files often use `*.generated.rs`.
+  /\.generated\.rs$/,
 ];
 
 /**

+ 21 - 4
src/mcp/tools.ts

@@ -1914,15 +1914,32 @@ export class ToolHandler {
 
       // Deprioritize test files, icon files, and i18n files. Covers both
       // directory-style (`/tests/`, `/spec/`) AND suffix-style conventions
-      // (`*_test.go`, `*_spec.rb`, `*.test.ts`, `*.spec.tsx`, `*Test.java`,
-      // `*Spec.kt`) — without the suffix check, etcd's `watchable_store_test.go`
-      // displaced 5K chars of real-flow source in codegraph_explore for Q2.
+      // across every language we support — without the suffix check, etcd's
+      // `watchable_store_test.go` displaced 5K chars of real-flow source in
+      // codegraph_explore for Q2.
       const isLowValue = (p: string) =>
         /\/(tests?|__tests?__|spec)\//i.test(p) ||
-        /_test\.(go|py|rb)$/i.test(p) ||
+        // Go: `*_test.go`
+        /_test\.go$/i.test(p) ||
+        // Python: `test_*.py` (pytest discovery) and `*_test.py`
+        /(?:^|\/)test_[^/]+\.py$/i.test(p) ||
+        /_test\.py$/i.test(p) ||
+        // Ruby: `*_spec.rb` (rspec) and `*_test.rb` (minitest)
         /_spec\.rb$/i.test(p) ||
+        /_test\.rb$/i.test(p) ||
+        // JS / TS: `*.test.ts`, `*.spec.tsx`, etc.
         /\.(test|spec)\.[jt]sx?$/i.test(p) ||
+        // JVM: `*Test.java`, `*Tests.java`, `*Spec.kt`, `*Spec.scala`
         /(Test|Spec|Tests)\.(java|kt|scala)$/.test(p) ||
+        // C#: `*Tests.cs`, `*Test.cs`, `*Spec.cs`
+        /(Tests?|Spec)\.cs$/.test(p) ||
+        // Swift: `*Tests.swift` (XCTest convention)
+        /Tests?\.swift$/.test(p) ||
+        // Dart: `*_test.dart`
+        /_test\.dart$/i.test(p) ||
+        // Rust: `tests/*.rs` already caught by `/tests/` above; `_test.rs`
+        // and `_tests.rs` aren't Rust conventions (Rust uses `#[cfg(test)]`
+        // inside source files), so nothing extra needed.
         /\bicons?\b/i.test(p) ||
         /\bi18n\b/i.test(p);
       const aLow = isLowValue(aPath);

+ 17 - 2
src/resolution/callback-synthesizer.ts

@@ -338,7 +338,16 @@ function cppOverrideEdges(queries: QueryBuilder): Edge[] {
  * trace/callees reach the implementation. Over-approximation accepted
  * (reachability-correct); capped per class, gated to JVM languages.
  */
-const IFACE_OVERRIDE_LANGS = new Set(['java', 'kotlin']);
+// Languages whose static `implements`/`extends` edges should bridge an
+// interface (or abstract base) method to the matching concrete-class method.
+// The set is "languages with explicit nominal subtyping and a single class
+// kind that holds methods" — i.e. the shape this loop expects. Swift and
+// Scala fit shape-wise (Swift `protocol`/`class`, Scala `trait`/`class`)
+// and are added below; Swift's concrete-side nodes can also be a `struct`,
+// so the loop iterates that kind too. Scala `object` nodes are not iterated.
+const IFACE_OVERRIDE_LANGS = new Set([
+  'java', 'kotlin', 'csharp', 'typescript', 'javascript', 'swift', 'scala',
+]);
 function interfaceOverrideEdges(queries: QueryBuilder): Edge[] {
   const edges: Edge[] = [];
   const seen = new Set<string>();
@@ -347,7 +356,12 @@ function interfaceOverrideEdges(queries: QueryBuilder): Edge[] {
       .getOutgoingEdges(classId, ['contains'])
       .map((e) => queries.getNodeById(e.target))
       .filter((n): n is Node => !!n && n.kind === 'method');
-  for (const cls of queries.getNodesByKind('class')) {
+  // Concrete-side kinds vary by language: `class` covers Java / Kotlin /
+  // C# / TS / Swift-classes / Scala-classes; `struct` covers Swift value
+  // types that conform to protocols. Iterate both.
+  const concreteKinds = ['class', 'struct'] as const;
+  for (const kind of concreteKinds) {
+  for (const cls of queries.getNodesByKind(kind)) {
     const implMethods = methodsOf(cls.id).filter((n) => IFACE_OVERRIDE_LANGS.has(n.language));
     if (implMethods.length === 0) continue;
     for (const sup of queries.getOutgoingEdges(cls.id, ['implements', 'extends'])) {
@@ -384,6 +398,7 @@ function interfaceOverrideEdges(queries: QueryBuilder): Edge[] {
       }
     }
   }
+  }
   return edges;
 }