fix(extraction): drop the `./` self-entry from `git ls-files --directory` (#936) (#941)

When the indexed root is a directory an enclosing git repo ignores,
`git ls-files --directory` collapses the whole cwd to a single literal
`./` entry. That sentinel reached the `ignore` matcher, which rejects it
("path should be a `path.relative()`d string, but got "./""), aborting
buildScopeIgnore — the one ignore-building call in FileWatcher.start().
So the MCP daemon's startWatching() threw, was caught as "Failed to open
project", and auto-sync never started: the index silently went stale
until a manual `codegraph sync` (CODEGRAPH_NO_DAEMON=1 was the only
workaround).

Filter the `./`/`.` self-entry wherever we consume `--directory` output
(listIgnoredDirs + the untracked-dir loop in discoverEmbeddedRepoRoots).
Semantically correct, not just a crash guard: `./` means "the whole cwd",
never a nested repo to recurse into.
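
A minimal sketch of that filtering rule, for illustration only. The names `isSelfEntry` and `parseDirectoryEntries` are hypothetical and the sample string is made up; only the rule itself (drop `./` and `.` from NUL-separated `--directory` output, keep the other trailing-slash entries) comes from this commit.

```typescript
// Hypothetical helper: true for the entry that stands for "the whole cwd".
function isSelfEntry(entry: string): boolean {
  return entry === './' || entry === '.' || entry === '';
}

// Hypothetical helper: split NUL-separated `git ls-files -z --directory`
// output and keep only real nested directories (trailing-slash form).
function parseDirectoryEntries(raw: string): string[] {
  return raw
    .split('\0')
    .filter((e) => e.endsWith('/') && !isSelfEntry(e));
}

// Made-up sample: a self-entry, two directories, a plain file, and the
// empty string left by the trailing NUL terminator.
const sample = ['./', 'dist/', 'node_modules/', 'notes.txt', ''].join('\0');
const dirs = parseDirectoryEntries(sample);
console.log(dirs);
```

With the self-entry removed before it reaches the `ignore` matcher, the matcher only ever sees relative directory paths, and the embedded-repo walk never recurses back into the cwd itself.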

Not platform-specific (reported on Codex/Windows, reproduced on macOS):
the trigger is git state, not the OS.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry, 23 hours ago
parent
commit
03666584ed
3 changed files with 39 additions and 2 deletions
  1. CHANGELOG.md (+1 -0)
  2. __tests__/multi-repo-workspace.test.ts (+21 -0)
  3. src/extraction/index.ts (+17 -2)

+ 1 - 0
CHANGELOG.md

@@ -40,6 +40,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - On Linux, hitting the kernel's inotify watch limit on a large project no longer silently leaves half the tree unwatched. CodeGraph now tells you once — naming the exact setting to raise (`fs.inotify.max_user_watches`, e.g. `sudo sysctl fs.inotify.max_user_watches=1048576`) — and keeps live-watching the directories it could register while `codegraph sync` (or the git sync hooks) covers the rest. (#876)
 - A long-running MCP server now notices when a git worktree gains its own index. Before, if the server (or shared daemon) first saw a worktree before you ran `codegraph init` in it — so the lookup walked up to the main checkout's index — it pinned that decision for its whole life: even after the worktree had its own `.codegraph/`, every query kept hitting the main checkout's index and every result carried a false "this index belongs to a different git working tree" warning, until you restarted the server. The CLI got it right but the MCP server didn't, and re-indexing didn't help. The server now re-checks which index a path belongs to on each call, so the worktree's own index is picked up — and the stale warning drops — without a restart. (#926)
 - A long-running MCP server now recovers when your index is deleted and rebuilt at the same path. If `.codegraph/` was removed and recreated while the server held it open — most easily by recreating a git worktree at the same path, or `rm`-ing `.codegraph/` and running `codegraph init` again — the server kept reading the old, deleted database file and served a frozen snapshot: renamed or removed symbols still showed as live, new ones were missing, and `codegraph sync` couldn't refresh it — only restarting the server fixed it. The server now detects that the database file was swapped out from under it and reopens the live one in place, so results stay correct without a restart. (On Linux and macOS; Windows doesn't allow deleting an open file, so it isn't affected.) (#925)
+- The MCP server now opens and auto-syncs a project that lives inside a folder an enclosing git repository ignores. Before, if the directory you indexed sat within a larger repo that gitignored it, the shared MCP daemon failed to open the project — its log repeated ``Failed to open project … path should be a `path.relative()`d string, but got "./"`` — so the file watcher never started and the index silently went stale until you ran `codegraph sync` by hand (setting `CODEGRAPH_NO_DAEMON=1` was the only workaround). The daemon now opens the project and starts watching as expected. Most visible with Codex on Windows, but the cause wasn't platform-specific. (#936)
 
 
 ## [1.0.1] - 2026-06-13

+ 21 - 0
__tests__/multi-repo-workspace.test.ts

@@ -203,6 +203,27 @@ describe('multi-repo workspaces (#514)', () => {
     expect(scope.ignores('src/app.ts')).toBe(false);
   });
 
+  it('buildScopeIgnore: indexed root is itself a gitignored subdir of an enclosing repo (#936)', () => {
+    // `child/` is NOT its own repo, so `git` resolves the ENCLOSING repo from
+    // inside it — and `git ls-files --directory`, whose cwd is then a wholly
+    // ignored directory, emits the literal `./` ("this entire directory").
+    // That sentinel used to reach the `ignore` matcher and throw
+    // ("path should be a `path.relative()`d string, but got "./""), aborting
+    // buildScopeIgnore → the MCP daemon's watcher never started and auto-sync
+    // silently stalled until a manual `codegraph sync`.
+    write(path.join(ws, 'child/src/a.ts'), 'export const x = 1;\n');
+    write(path.join(ws, '.gitignore'), '/child/\n');
+    makeRepo(ws);
+
+    const child = path.join(ws, 'child');
+    // The crux: building scope for the ignored subdir must not throw.
+    const scope = buildScopeIgnore(child);
+    // The subdir's own source is watchable/indexable, not ignored.
+    expect(scope.ignores('src/a.ts')).toBe(false);
+    // And the `./` self entry must not be mistaken for a nested embedded repo.
+    expect(discoverEmbeddedRepoRoots(child)).toEqual([]);
+  });
+
   it('sync picks up a change inside a gitignored embedded repo', async () => {
     write(path.join(ws, 'packages/proj-a/src/auth.ts'), 'export function login() { return 1; }\n');
     makeRepo(path.join(ws, 'packages/proj-a'));

+ 17 - 2
src/extraction/index.ts

@@ -256,6 +256,21 @@ function defaultsOnlyIgnore(): Ignore {
   return ignore().add(DEFAULT_IGNORE_PATTERNS);
 }
 
+/**
+ * `git ls-files --directory` collapses a wholly-untracked/ignored directory into
+ * one entry — and when the command's own cwd is such a directory (the indexed
+ * root is itself a git-ignored subdir of an enclosing repo), git emits the
+ * literal `./` meaning "this entire directory". That sentinel is not a real
+ * nested path: feeding it to the `ignore` matcher throws ("path should be a
+ * `path.relative()`d string, but got "./""), which used to abort `buildScopeIgnore`
+ * and so break the MCP daemon's watcher/auto-sync on connect; and joining it back
+ * onto `repoDir` would just re-point at the cwd. Drop it wherever we consume
+ * `--directory` output. (#936)
+ */
+function isWholeCwdEntry(entry: string): boolean {
+  return entry === './' || entry === '.' || entry === '';
+}
+
 /**
  * List the gitignored DIRECTORIES of a repo (collapsed, trailing-slash form),
  * relative to `repoDir`. These are invisible to every other `git ls-files` /
@@ -270,7 +285,7 @@ function listIgnoredDirs(repoDir: string): string[] {
       ['ls-files', '-z', '-o', '-i', '--exclude-standard', '--directory'],
       { cwd: repoDir, encoding: 'utf-8' as const, timeout: 30000, maxBuffer: 50 * 1024 * 1024, stdio: ['pipe', 'pipe', 'pipe'] as ['pipe', 'pipe', 'pipe'], windowsHide: true }
     );
-    return out.split('\0').filter((e) => e.endsWith('/'));
+    return out.split('\0').filter((e) => e.endsWith('/') && !isWholeCwdEntry(e));
   } catch {
     return [];
   }
@@ -434,7 +449,7 @@ export function discoverEmbeddedRepoRoots(rootDir: string): string[] {
         { cwd: repoAbs, encoding: 'utf-8', timeout: 30000, maxBuffer: 50 * 1024 * 1024, stdio: ['pipe', 'pipe', 'pipe'], windowsHide: true }
       );
       for (const e of o.split('\0')) {
-        if (e.endsWith('/') && !defaults.ignores(e)) {
+        if (e.endsWith('/') && !isWholeCwdEntry(e) && !defaults.ignores(e)) {
           candidates.push(...findNestedGitRepos(path.join(repoAbs, e), e));
         }
       }