docs: explain auto-syncing (no manual sync needed) in site + README (#457)

Originated from issue #438 ("Will newly created files be missing from
query results if sync is not manually run?"). Real users are second-
guessing whether their agent's freshly-created files are getting
indexed. They shouldn't have to test for themselves to find out.

## site/src/content/docs/guides/indexing.md

Expanded the existing 2-sentence "Stay fresh automatically" section
into the full three-layer explanation:

  1. File watcher with debounced auto-sync (default 2000ms, tunable
     via CODEGRAPH_WATCH_DEBOUNCE_MS, clamped to [100ms, 60s]).
  2. Per-file staleness banner (#403) — covers the debounce window.
     Quoted the actual banner format + the verified Claude Code
     follow-up Read behaviour.
  3. Connect-time catch-up (#414) — covers gaps when the MCP server
     wasn't running.
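
As an illustration of item 1, the default-plus-clamp rule for the debounce setting could look roughly like the sketch below. The function and constant names are hypothetical and are not taken from the codegraph source. Only the default (2000ms) and the clamp range come from this commit; the fallback for unparseable values is an assumption.

```typescript
// Hypothetical helper, not the actual codegraph implementation.
const DEFAULT_DEBOUNCE_MS = 2000; // default stated in this commit
const MIN_DEBOUNCE_MS = 100; // lower clamp bound
const MAX_DEBOUNCE_MS = 60000; // upper clamp bound (60s)

// `raw` stands in for the value of CODEGRAPH_WATCH_DEBOUNCE_MS.
function resolveDebounceMs(raw: string | undefined): number {
  if (raw === undefined || raw.trim() === "") return DEFAULT_DEBOUNCE_MS;
  const parsed = Number(raw);
  // Assumption: values that are not finite numbers fall back to the default.
  if (!Number.isFinite(parsed)) return DEFAULT_DEBOUNCE_MS;
  // Clamp into [100ms, 60s].
  return Math.min(MAX_DEBOUNCE_MS, Math.max(MIN_DEBOUNCE_MS, Math.round(parsed)));
}
```

Under these assumptions, a value such as `5000` passes through unchanged, while values outside the range are pulled back to the nearest bound.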

Plus: how to verify state via codegraph_status (### Pending sync:),
when manual codegraph sync DOES make sense (watcher disabled / CI
scripting), and a link out to the v0.9.5 release notes.
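
For scripted checks, the pending set could be pulled out of the status text with something like the sketch below. It assumes the status output is plain text in which a `### Pending sync:` heading is followed by `- path (edited … ago)` lines, as in the guide example added by this commit. The function name is hypothetical and the real output may carry more detail.

```typescript
// Hypothetical parser; the layout is assumed from the guide's status example.
function parsePendingSync(statusText: string): string[] {
  const lines = statusText.split("\n");
  const start = lines.findIndex((l) => l.trim() === "### Pending sync:");
  if (start === -1) return []; // no section: nothing is in flight

  const pending: string[] = [];
  for (const line of lines.slice(start + 1)) {
    const trimmed = line.trim();
    if (trimmed.startsWith("#")) break; // the next heading ends the section
    // Capture the path and ignore the trailing "(edited ... ago)" note.
    const match = /^- (.+?)(?: \(.*\))?$/.exec(trimmed);
    const file = match?.[1];
    if (file) pending.push(file);
  }
  return pending;
}
```

An empty result corresponds to the "nothing is in flight" case described in the guide.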

## README.md

Added a <details><summary> collapsible right under the Key Features
table — primed by the existing 'Always Fresh' row in that table.
Condensed to ~10 lines covering the same three layers + a code-block
flow diagram + the verify command, with a deep link to the full guide.
GitHub renders <details> blocks natively, so the section is collapsed
by default and doesn't make the README scroll-length grow visibly.

Heading kept as 'Stay fresh automatically' (slug: stay-fresh-automatically)
so the README's deep-link anchor is predictable; the longer tagline lives on
its own line below.

940/942 tests still pass; no code changes.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Colby Mchenry, 3 weeks ago
parent
commit
7479c5e82b
2 changed files with 91 additions and 2 deletions
  1. README.md (+27, -0)
  2. site/src/content/docs/guides/indexing.md (+64, -2)

+ 27 - 0
README.md

@@ -140,6 +140,33 @@ The gains scale with codebase size: on large repos the agent answers from the in
 | **Mixed iOS / React Native / Expo** | Closes cross-language flows that static parsing misses: Swift ↔ ObjC bridging, React Native legacy bridge + TurboModules + Fabric view components, native → JS event emitters, Expo Modules |
 | **100% Local** | No data leaves your machine. No API keys. No external services. SQLite database only |
 
+<details>
+<summary><strong>How auto-syncing works — and why you don't need to run <code>codegraph sync</code> manually</strong></summary>
+
+When your agent (Claude Code, Cursor, Codex, opencode) launches `codegraph serve --mcp`, three layers keep the index in step with your code — and make sure the agent never gets a silent wrong answer in the brief window between an edit and the next sync:
+
+1. **File watcher with debounced auto-sync.** A native FSEvents / inotify / ReadDirectoryChangesW watcher captures every source-file create / modify / delete and triggers a re-index after a debounce window (default `2000ms`, tunable via `CODEGRAPH_WATCH_DEBOUNCE_MS`, clamped to `[100ms, 60s]`). Bursts of edits collapse into a single sync.
+
+2. **Per-file staleness banner.** During the brief debounce window, MCP tool responses that would reference a still-pending file prepend a `⚠️` banner naming it and telling the agent to `Read` it directly. Pending files NOT referenced by the response surface as a small footer instead. Either way, the agent gets an explicit signal — validated with Claude Code, where the agent literally says "Reading the file directly for the live content" before opening it.
+
+3. **Connect-time catch-up.** When the MCP server (re)connects, codegraph runs a fast `(size, mtime)` + content-hash reconciliation against the working tree before answering the first query — so edits made while no MCP server was running (a `git pull` from the terminal, edits from another editor, a previous agent session that exited) get absorbed on the next session's first tool call.
+
+```
+agent writes src/Widget.ts
+  → watcher fires (<100ms)
+  → debounce (default 2s)
+  → sync; Widget.ts is in the index
+  → next agent query sees it
+```
+
+**Verify any time** with `codegraph_status` (via MCP) or `codegraph status` (CLI). If anything is pending, you'll see a `### Pending sync:` section naming the files and their edit age.
+
+The handful of cases where manual `codegraph sync` makes sense: the watcher is disabled (sandboxed environments, or `CODEGRAPH_NO_DAEMON=1`), or you're scripting against the index outside an agent session and want a pre-flight sync at the start of your script.
+
+→ Full deep-dive in [Guides → Indexing a Project](https://colbymchenry.github.io/codegraph/guides/indexing/#stay-fresh-automatically).
+
+</details>
+
 ---
 
 ## Framework-aware Routes
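
The connect-time catch-up described in item 3 of the README text above (a `(size, mtime)` pre-filter, then a content hash on whatever fails it) can be sketched as a pure function. The record shapes and names here are hypothetical, since this commit does not show the real codegraph schema. The hash is supplied as a lazy callback so it is computed only for files whose stat data differs.

```typescript
// Hypothetical record shapes; not taken from the codegraph source.
interface IndexedFile {
  path: string;
  size: number;
  mtimeMs: number;
  hash: string;
}

interface DiskFile {
  path: string;
  size: number;
  mtimeMs: number;
  contentHash: () => string; // lazy: only called when the stat check fails
}

interface CatchUpPlan {
  added: string[];
  changed: string[];
  removed: string[];
}

function planCatchUp(indexed: IndexedFile[], onDisk: DiskFile[]): CatchUpPlan {
  const byPath = new Map<string, IndexedFile>();
  for (const f of indexed) byPath.set(f.path, f);

  const plan: CatchUpPlan = { added: [], changed: [], removed: [] };
  const seen = new Set<string>();

  for (const file of onDisk) {
    seen.add(file.path);
    const known = byPath.get(file.path);
    if (known === undefined) {
      plan.added.push(file.path); // on disk but never indexed
      continue;
    }
    // Cheap pre-filter: identical size and mtime is treated as unchanged.
    if (known.size === file.size && known.mtimeMs === file.mtimeMs) continue;
    // Stat data differs, so compare content hashes before re-indexing.
    if (known.hash !== file.contentHash()) plan.changed.push(file.path);
  }

  // Anything indexed but no longer on disk was deleted in the meantime.
  for (const f of indexed) {
    if (!seen.has(f.path)) plan.removed.push(f.path);
  }
  return plan;
}
```

In this sketch a file that was merely touched (new mtime, same content) is hashed once and then left alone, which is the point of pairing the stat check with a hash.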

+ 64 - 2
site/src/content/docs/guides/indexing.md

@@ -24,7 +24,69 @@ codegraph sync            # incremental — only changed files
 
 ## Stay fresh automatically
 
-When the MCP server is running, CodeGraph watches your project with native OS file events and syncs in the background — debounced, and filtered to source files only. You don't need to run `sync` by hand during an agent session.
+**You don't need to run `codegraph sync` by hand during an agent session.** When your agent (Claude Code, Cursor, Codex, opencode) launches `codegraph serve --mcp`, three layers cooperate to keep the index in step with your code — and to never give the agent a quiet wrong answer in the small window between an edit and the next sync.
+
+### 1. File watcher with debounced auto-sync (always on)
+
+`serve --mcp` spins up a native file watcher (FSEvents on macOS, inotify on Linux, ReadDirectoryChangesW on Windows) over the project root. Every source-file create / modify / delete is captured. A debounce timer collapses bursts of edits into a single sync.
+
+```
+agent writes src/Widget.ts
+  → watcher fires (event delivery: typically <100ms)
+  → 2000ms debounce
+  → sync runs; Widget.ts's nodes + edges are in the index
+  → next agent query sees it
+```
+
+**Tunable**: `CODEGRAPH_WATCH_DEBOUNCE_MS` overrides the default 2000ms, clamped to `[100ms, 60s]`. Useful when a build step or formatter writes many files in a tight burst — bump it to `5000` or `10000` so the watcher coalesces them into one sync.
+
+### 2. Per-file staleness banner — covers the debounce window
+
+The watcher debounce introduces a small window (typically 2s) where a freshly-edited file is on disk but not yet in the index. CodeGraph closes that window with a per-file staleness banner: if any MCP tool response would reference a file that's currently pending re-index, the response prepends a `⚠️` banner naming the stale file:
+
+```
+⚠️ Some files referenced below were edited since the last index sync —
+their codegraph entries may be stale:
+  - src/Widget.ts (edited 800ms ago, pending sync)
+For accurate content of those specific files, Read them directly.
+The rest of this response is fresh.
+
+## Code Context
+…
+```
+
+Agents read this and follow up with a direct `Read` on the named file — validated end-to-end with Claude Code, where the agent literally says "Reading the file directly for the live content" before opening it. So even during the 2-second debounce window, the agent never gets a silent wrong answer.
+
+Pending files **not** referenced by the response surface as a small footer instead (`(Note: N file(s) elsewhere in this project are pending index sync but were not referenced above: …)`). Either way, the signal is explicit.
+
+### 3. Connect-time catch-up — covers gaps when the MCP server wasn't running
+
+When your editor / agent (re)connects to the MCP server, codegraph runs a fast filesystem-based reconciliation (a `(size, mtime)` stat pre-filter, then a content hash on the rest) before answering the first query. So files changed while no MCP server was running — a `git pull` from the terminal, an edit from another editor, an agent that finished and exited — are caught up automatically on the next session's first tool call.
+
+### Verify what the watcher sees
+
+`codegraph_status` exposes the pending set first-class — useful for an agent asking "is the index caught up?" in one call:
+
+```
+codegraph_status →
+  ## CodeGraph Status
+  …
+  ### Pending sync:
+  - src/Widget.ts (edited 1200ms ago)
+```
+
+If `### Pending sync:` isn't in the response, nothing is in flight.
+
+### When manual `codegraph sync` makes sense
+
+Almost never. The edge cases:
+
+- **The watcher is disabled.** Sandboxes that block local fs watchers, or you've set `CODEGRAPH_NO_DAEMON=1` to opt out of the shared daemon. In those cases `codegraph sync` is the manual fallback.
+- **Pre-flight before a CI run.** If you're scripting against the index outside an agent session, a single `codegraph sync` at the start of the script guarantees the index reflects the current working tree.
+
+Otherwise: just use it. The watcher + banner + connect-sync covers the AI-assisted workflow end-to-end. If you're seeing files genuinely missed after the debounce window has passed, that's a bug — please file an issue with a reproduction.
+
+> See the v0.9.5 release notes for the [staleness banner (#403)](https://github.com/colbymchenry/codegraph/releases/tag/v0.9.5) and the connect-time catch-up (#414); both shipped together.
 
 ## Check status
 
@@ -32,7 +94,7 @@ When the MCP server is running, CodeGraph watches your project with native OS fi
 codegraph status
 ```
 
-Reports node/edge/file counts, the active SQLite backend, and the journal mode.
+Reports node/edge/file counts, the active SQLite backend, and the journal mode. In an agent session, the MCP-side `codegraph_status` additionally surfaces the `### Pending sync:` block described above.
 
 ## What gets indexed