fix(mcp): don't warn "different git working tree" for submodules covered by the parent index (#1031, #1033) (#1039)

Indexing a super-repo now descends into its submodules and gitlinked clones, so
a query run from inside one resolves up to the parent's unified index — whose
graph DOES contain that nested repo's files. But the git-worktree-mismatch
warning still fired, telling the agent the results were from "a different working
tree" and to run `codegraph init -i` — which would split the submodule back into
its own index and undo the unified view. A false positive carrying harmful advice.

Distinguish a genuine borrowed worktree (the SAME repository on a different
branch — shares a git common dir with the index root) from a submodule/embedded
clone (a DIFFERENT repository — its own common dir), and suppress the warning
only for the latter. Add gitCommonDir() for the check. The issue-#155 linked-worktree
case is unchanged.
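
The rule above can be sketched as a pure function over the two common dirs. This is an illustrative sketch, not the code in the diff: `worktreeVerdict` and `Verdict` are hypothetical names, the paths in the usage note are made up, and it assumes the caller has already established that the query ran from a working tree other than the index root.

```typescript
// Hypothetical helper for illustration; the real check lives inside
// detectWorktreeIndexMismatch() in src/sync/worktree.ts.
type Verdict = 'warn' | 'suppress';

/**
 * Decide whether the "different git working tree" warning applies, given the
 * git common dir of the working tree the query ran from and the git common
 * dir of the index root (null when a dir could not be resolved).
 */
function worktreeVerdict(
  worktreeCommon: string | null,
  indexCommon: string | null,
): Verdict {
  // Suppress only when both dirs are known AND clearly differ: that means a
  // different repository (submodule or embedded clone) the parent index covers.
  if (worktreeCommon && indexCommon && worktreeCommon !== indexCommon) {
    return 'suppress';
  }
  // Same common dir (a linked worktree of the same repo) or an unknown dir:
  // stay conservative and keep the warning.
  return 'warn';
}
```

With made-up paths, `worktreeVerdict('/repo/.git', '/repo/.git')` returns `'warn'` (linked worktree), `worktreeVerdict('/repo/.git/modules/service-a', '/repo/.git')` returns `'suppress'` (submodule), and `worktreeVerdict(null, '/repo/.git')` returns `'warn'`, so a failed lookup never hides a genuine mismatch.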

Verified end-to-end: the warning no longer fires for a submodule-rooted MCP
session and still fires for a real linked worktree. Edit-sync (manual sync +
the live watcher) keeps the nested repo's files current on both macOS and Linux
(active-submodule and bare-gitlink shapes), so suppressing the warning is safe.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry 13 hours ago
parent
commit
f73227d2f5
3 changed files with 137 additions and 0 deletions
  1. CHANGELOG.md (+1, -0)
  2. __tests__/worktree-detection.test.ts (+98, -0)
  3. src/sync/worktree.ts (+38, -0)

+ 1 - 0
CHANGELOG.md

@@ -12,6 +12,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ### Fixes
 
 - CodeGraph now indexes nested repositories that git records as gitlinks, so a workspace built by stacking several repos inside one another indexes completely from a single `codegraph init` at the top. When a repo contains another git repo that was `git add`ed into it — so git tracks it as a `160000` "commit" pointer rather than a folder of files — or a submodule that isn't an active, initialized submodule in your checkout, that nested repo's source used to be skipped entirely: indexing the top level stopped at the nested repo's boundary and pulled in only the outer repo's own files, so a stacked-repo project came up nearly empty (one report saw ~10 files indexed at the root). CodeGraph now descends into each such nested repo that has a real working tree on disk and indexes it as its own embedded repository, recursively, so every layer of a stacked workspace is covered. Active submodules (already handled) and plain untracked nested clones are unchanged; a nested repo under a dependency directory such as `vendor/` or `node_modules/` stays excluded; and a submodule with nothing checked out on disk is correctly left alone rather than reported as empty. Thanks @ofergr and @kun-yx for the reports. (#1031, #1033)
+- CodeGraph no longer shows a misleading "different git working tree" warning when you work inside a submodule (or other nested repo) of a workspace you indexed at its root. Because indexing a workspace now pulls in its submodules and embedded clones, a query run from inside one correctly resolves up to the workspace's single index — but it was still warning that the results came from "a different working tree" and suggesting you run `codegraph init -i`, which would have split the submodule back out into its own separate index and undone the unified view. CodeGraph now recognizes that the nested repo's code is already part of the workspace index and stays quiet. The warning still appears for a genuine git worktree — a second checkout of the *same* repository on another branch, which really does have its own uncommitted symbols — since that's the case it exists for. (#1031, #1033)
 
 
 ## [1.1.2] - 2026-06-28

+ 98 - 0
__tests__/worktree-detection.test.ts

@@ -20,6 +20,7 @@ import {
   detectWorktreeIndexMismatch,
   worktreeMismatchWarning,
   gitWorktreeRoot,
+  gitCommonDir,
 } from '../src/sync/worktree';
 import CodeGraph from '../src/index';
 import { ToolHandler } from '../src/mcp/tools';
@@ -262,3 +263,100 @@ describe('worktree mismatch verdict re-resolves when the index root changes (iss
     expect(after.content[0].text).not.toContain('different git working tree');
   });
 });
+
+/**
+ * A nested repo (submodule / embedded clone) whose files the PARENT index
+ * already covers must NOT be flagged as a borrowed worktree: indexing a
+ * super-repo descends into its submodules and gitlinked clones, so a query run
+ * from inside one resolves up to the parent index — which genuinely contains
+ * that nested repo's symbols. The warning's premise is false there, and its
+ * "run codegraph init -i" advice would fragment the unified index. (#1031, #1033)
+ */
+describe('detectWorktreeIndexMismatch — nested repos covered by the parent index (#1031, #1033)', () => {
+  let parent: string;     // super-repo that owns the .codegraph index
+  let subSource: string;  // separate repo used as the submodule source
+
+  beforeEach(() => {
+    parent = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-wt-parent-'));
+    subSource = fs.mkdtempSync(path.join(os.tmpdir(), 'cg-wt-subsrc-'));
+    for (const r of [parent, subSource]) {
+      git(r, 'init', '-q');
+      git(r, 'config', 'user.email', 'test@example.com');
+      git(r, 'config', 'user.name', 'Test');
+      git(r, 'config', 'commit.gpgsign', 'false');
+    }
+    fs.writeFileSync(path.join(subSource, 'lib.ts'), 'export const x = 1;\n');
+    git(subSource, 'add', '.');
+    git(subSource, 'commit', '-q', '-m', 'sub');
+    fs.writeFileSync(path.join(parent, 'README.md'), '# parent\n');
+    git(parent, 'add', '.');
+    git(parent, 'commit', '-q', '-m', 'init');
+  });
+
+  afterEach(() => {
+    fs.rmSync(parent, { recursive: true, force: true });
+    fs.rmSync(subSource, { recursive: true, force: true });
+  });
+
+  function addSubmodule(name: string): string {
+    execFileSync(
+      'git',
+      ['-c', 'protocol.file.allow=always', 'submodule', 'add', '-q', subSource, name],
+      { cwd: parent, stdio: ['ignore', 'ignore', 'ignore'] },
+    );
+    git(parent, 'commit', '-q', '-m', 'add submodule');
+    return path.join(parent, name);
+  }
+
+  function addBareGitlink(name: string): string {
+    const dir = path.join(parent, name);
+    fs.mkdirSync(dir);
+    git(dir, 'init', '-q');
+    git(dir, 'config', 'user.email', 'test@example.com');
+    git(dir, 'config', 'user.name', 'Test');
+    git(dir, 'config', 'commit.gpgsign', 'false');
+    fs.writeFileSync(path.join(dir, 'tool.ts'), 'export const y = 2;\n');
+    git(dir, 'add', '.');
+    git(dir, 'commit', '-q', '-m', 'tool');
+    git(parent, 'add', name); // records a 160000 gitlink, no .gitmodules
+    git(parent, 'commit', '-q', '-m', 'gitlink');
+    return dir;
+  }
+
+  it('does NOT flag an active submodule covered by the parent index', () => {
+    const sub = addSubmodule('service-a');
+    // The submodule IS its own working-tree root (so the old logic flagged it)…
+    expect(gitWorktreeRoot(sub)).toBe(real(sub));
+    // …but the parent index covers it, so there must be no warning.
+    expect(detectWorktreeIndexMismatch(sub, parent)).toBeNull();
+    expect(detectWorktreeIndexMismatch(path.join(sub, 'src'), parent)).toBeNull();
+  });
+
+  it('does NOT flag a bare gitlink (embedded clone, no .gitmodules) covered by the parent index', () => {
+    const embedded = addBareGitlink('embedded');
+    expect(gitWorktreeRoot(embedded)).toBe(real(embedded));
+    expect(detectWorktreeIndexMismatch(embedded, parent)).toBeNull();
+  });
+
+  it('gitCommonDir differs for a nested repo vs the parent (the discriminator)', () => {
+    const sub = addSubmodule('service-a');
+    const subCommon = gitCommonDir(sub);
+    const parentCommon = gitCommonDir(parent);
+    expect(subCommon).not.toBeNull();
+    expect(parentCommon).not.toBeNull();
+    expect(subCommon).not.toBe(parentCommon); // different repository → suppress
+  });
+
+  it('still flags a genuine linked worktree (same repo, different branch)', () => {
+    // A real worktree shares the parent's git common dir, so it stays flagged —
+    // the suppression must not weaken the issue-#155 case.
+    const wt = path.join(parent, 'wt');
+    git(parent, 'worktree', 'add', '-q', '-b', 'feature', wt);
+    try {
+      expect(gitCommonDir(wt)).toBe(gitCommonDir(parent)); // SAME repository
+      expect(detectWorktreeIndexMismatch(wt, parent)).not.toBeNull();
+    } finally {
+      try { git(parent, 'worktree', 'remove', '--force', wt); } catch { /* best effort */ }
+    }
+  });
+});

+ 38 - 0
src/sync/worktree.ts

@@ -43,6 +43,30 @@ export function gitWorktreeRoot(dir: string): string | null {
   }
 }
 
+/**
+ * Absolute, symlink-resolved git **common** directory for `dir` — the shared
+ * `.git` that all worktrees of one repository point at. Linked worktrees of the
+ * same repo report the SAME common dir; a submodule or an embedded clone is a
+ * DIFFERENT repository and reports its own (`…/.git/modules/<name>` or its own
+ * `.git`). That distinction is what separates a genuine "borrowed worktree"
+ * from a nested repo the parent index already covers. Null when not a repo.
+ */
+export function gitCommonDir(dir: string): string | null {
+  try {
+    const out = execFileSync('git', ['rev-parse', '--git-common-dir'], {
+      cwd: dir,
+      encoding: 'utf8',
+      stdio: ['ignore', 'pipe', 'ignore'],
+      windowsHide: true,
+    }).trim();
+    if (!out) return null;
+    // `--git-common-dir` is relative to cwd unless already absolute.
+    return realpath(path.isAbsolute(out) ? out : path.resolve(dir, out));
+  } catch {
+    return null;
+  }
+}
+
 export interface WorktreeIndexMismatch {
   /** The git working tree the command was run from. */
   worktreeRoot: string;
@@ -76,6 +100,20 @@ export function detectWorktreeIndexMismatch(
   // plain ancestor directory", and avoids warning outside git entirely.
   if (gitWorktreeRoot(resolvedIndexRoot) !== resolvedIndexRoot) return null;
 
+  // Don't flag a nested repo (submodule / embedded clone) that `indexRoot`'s
+  // index ALREADY covers: indexing a super-repo descends into its submodules
+  // and gitlinked clones, so a query run from inside one resolves up to the
+  // parent index — whose graph *does* contain that nested repo's files. The
+  // warning's premise ("results are a different branch; symbols changed only
+  // here are missing") is false there, and its "run codegraph init -i" advice
+  // would needlessly fragment the unified workspace index. A genuine borrowed
+  // worktree and the index root are the SAME repository (they share a git
+  // common dir); a submodule/embedded clone is a DIFFERENT repository and does
+  // not — so suppress only when the two clearly differ. (#1031, #1033)
+  const worktreeCommon = gitCommonDir(worktreeRoot);
+  const indexCommon = gitCommonDir(resolvedIndexRoot);
+  if (worktreeCommon && indexCommon && worktreeCommon !== indexCommon) return null;
+
   return { worktreeRoot, indexRoot: resolvedIndexRoot };
 }