v1.33.0.0 feat: /sync-gbrain memory-stage batch-import refactor (D1-D8) + F6/F9 + signal cleanup (#1432)

* refactor: batch-import architecture (D1-D8) + F6 atomic state + F9 full-file hash

bin/gstack-memory-ingest.ts: rewrite memory ingest around the
`gbrain import <dir>` batch path. Replaces the per-file gbrainPutPage loop
(~470s of subprocess startup per cold run) with prepare-then-batch:

  walkAllSources
    -> preparePages: mtime-skip + optional gitleaks (--scan-secrets) + parse
    -> writeStaged: mkdir -p per slug segment, hierarchical (D1)
    -> snapshot ~/.gbrain/sync-failures.jsonl byte offset
    -> runGbrainImport (async spawn) -> parseImportJson
    -> readNewFailures: read appended bytes, map back to source paths (D7)
    -> state.sessions[path] = {...} for files NOT in failed set
    -> saveStateAtomic (F6) + cleanupStagingDir
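
The failure-exclusion rule in the last two steps (record state only for
prepared files absent from the failed set, so failures retry next run) can
be sketched as follows — illustrative names, not the actual helpers:

```typescript
// D7 state-advance rule: a prepared file only gets state-recorded when the
// post-import failure set (read back from sync-failures.jsonl) does not
// contain its source path. Failed files stay un-recorded and retry next run.
interface Prepared {
  sourcePath: string;
  slug: string;
}

function advanceState(
  prepared: Prepared[],
  failedSources: Set<string>,
): Record<string, { slug: string }> {
  const sessions: Record<string, { slug: string }> = {};
  for (const p of prepared) {
    if (failedSources.has(p.sourcePath)) continue; // retry on next run
    sessions[p.sourcePath] = { slug: p.slug };
  }
  return sessions;
}
```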

Architecture decisions:
  D1 hierarchical staging dir
  D2 cut over, deleted gbrainPutPage entirely
  D3 source-file gitleaks made opt-in via --scan-secrets (gstack-brain-sync
     owns the cross-machine boundary; the per-file scan was a redundant
     ~470s tax)
  D4 OK/ERR verdict (no DEGRADED tri-state)
  D5 unified state schema (no separate skip-list)
  D6 trust gbrain content_hash idempotency (no skip_reason bookkeeping)
  D7 byte-offset snapshot of sync-failures.jsonl + per-source mapping
  F6 saveState uses tmp+rename atomic write
  F9 fileSha256 removes 1MB cap; full-file hash (no more silent tail-edit
     misses on long partial transcripts)

Signal handling: installSignalForwarder propagates SIGTERM/SIGINT to the
gbrain child process AND synchronously cleans the staging dir before
process.exit. Before this fix, orchestrator timeouts left gbrain processes
orphaned holding the PGLite write lock (observed: a 15-hour-CPU-time
orphan still alive a day later).

parseImportJson returns null on unparseable output (treated as ERR by the
caller) instead of silently passing zeros through.

gbrainAvailable() probes for the `import` subcommand instead of `put`.

Plan + review chain at /Users/garrytan/.claude/plans/purrfect-tumbling-quiche.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: orchestrator OK/ERR verdict parser for batch memory ingest

gstack-gbrain-sync.ts: the memory-stage parser now prefers [memory-ingest]
ERR lines over the latest [memory-ingest] line, strips the prefix and any
leading 'ERR: ' for cleaner summary output, and surfaces
'(killed by signal / timeout)' when the child exits with status=null.

Matches D6's OK/ERR contract: per-file failures (FILE_TOO_LARGE etc.)
show in the summary count but only system-level failures (gbrain crash,
process kill, missing CLI) mark the stage ERR.
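
The parsing rule above can be sketched as follows — function and field
names are illustrative; the actual parser lives in gstack-gbrain-sync.ts,
which is not shown in this diff:

```typescript
// Pick the stage-summary line: an explicit ERR line wins over the latest
// [memory-ingest] line; the prefix and any leading "ERR: " are stripped;
// status === null (killed / timed out) gets a suffix.
function summarizeMemoryStage(
  outputLines: string[],
  status: number | null,
): string {
  const tagged = outputLines.filter((l) => l.includes("[memory-ingest]"));
  const pick =
    tagged.find((l) => l.includes("[memory-ingest] ERR")) ??
    tagged[tagged.length - 1] ??
    "";
  let summary = pick
    .replace(/^.*\[memory-ingest\]\s*/, "")
    .replace(/^ERR:\s*/, "");
  if (status === null) summary += " (killed by signal / timeout)";
  return summary.trim();
}
```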

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: batch-ingest writer regressions + refresh golden ship fixtures

test/gstack-memory-ingest.test.ts: 5 new tests for the batch-import
architecture:
  1. D1 hierarchical staging slug round-trip — asserts staged file lives
     in transcripts/claude-code/<dir>/*.md, not flat at staging root
  2. Frontmatter injection — asserts title/type/tags written into the
     staged page's YAML block
  3. D7 sync-failures.jsonl exclusion — files listed as failed by
     gbrain do NOT get state-recorded; one of two test sessions lands,
     the other stays un-ingested for retry next run
  4. Missing-`import`-subcommand error path — when gbrain only advertises
     legacy `put`, memory-ingest exits 1 with [memory-ingest] ERR
  5. --scan-secrets opt-in path — verifies a dirty-source file is
     skipped via the secret-scan match when the flag is on, while a
     clean session in the same run still gets staged

Replaces the prior put-per-file shim with an import-batch shim. The
shim fails loudly (exit 99) if the new code ever regresses to per-file
`gbrain put` calls.
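
The shim idea can be sketched like this — a hypothetical helper, not the
actual test code: a fake `gbrain` placed first on PATH that answers the
`import` probe and exits 99, failing the suite loudly, on any per-file
`put` call.

```typescript
import { chmodSync, mkdtempSync, writeFileSync } from "fs";
import { join } from "path";
import { tmpdir } from "os";
import { spawnSync } from "child_process";

// Write a POSIX-sh gbrain shim into a temp dir and return the dir,
// intended to be prepended to PATH when spawning the code under test.
function installGbrainShim(): string {
  const dir = mkdtempSync(join(tmpdir(), "gbrain-shim-"));
  const shim = `#!/bin/sh
case "$1" in
  import) echo '{"status":"ok","imported":0,"skipped":0,"errors":0}' ;;
  put)    echo "regressed to per-file gbrain put" >&2; exit 99 ;;
  --help) printf '  import <dir>\\n' ;;
esac
`;
  const path = join(dir, "gbrain");
  writeFileSync(path, shim);
  chmodSync(path, 0o755);
  return dir;
}
```

With the shim first on PATH, any regression to per-file `gbrain put`
surfaces as exit 99 instead of a silent pass.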

test/fixtures/golden/{claude,codex,factory}-ship-SKILL.md: refresh
golden baselines to match the current generated SKILL.md content after
the v1.31.0.0 AskUserQuestion fallback-clause deletion. Goldens were
stale from that release; test was failing on origin/main before this
PR. Caught by the /ship test pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* v1.33.0.0 docs: design doc, P2 perf TODOs, gbrain guidance block, changelog

docs/designs/SYNC_GBRAIN_BATCH_INGEST.md: full design doc with the 8
decisions (D1-D8), source-verified gbrain behaviors (content_hash
idempotency, frontmatter parity, path-authoritative slug, per-file
failure surface), measured performance vs plan target, F9 hash
migration one-time cliff note, and follow-up TODOs.

CLAUDE.md: append `## GBrain Search Guidance` block from /sync-gbrain
indicating this worktree's pin and how the agent should prefer gbrain
search over Grep for semantic queries.

TODOS.md: P2 investigation of `gbrain import` performance on large
staging dirs (5,131 files take >10 min in gbrain while 501 take 10 s;
likely N+1 SQL or auto-link reconciliation). P3: cache
no-changes-since-last-import at the prepare-batch level for a true
no-op fast path.

VERSION + package.json: bump to 1.33.0.0 (queue-aware via
bin/gstack-next-version — skipped v1.32.0.0 which is claimed by
sibling worktree garrytan/wellington / PR #1431).

CHANGELOG.md: v1.33.0.0 entry per the release-summary format.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: setup-gbrain/memory.md reflects opt-in per-file gitleaks

Per-file gitleaks scanning during memory ingest is now opt-in via
--scan-secrets (or GSTACK_MEMORY_INGEST_SCAN_SECRETS=1). Update the
user-facing reference doc so it stops claiming "every page passes
through gitleaks." Also corrects the /gbrain-sync → /sync-gbrain
command typo and the post-incident recovery section.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-05-11 18:47:33 -07:00
committed by GitHub
parent 74895062fb
commit d21ba06b5a
12 changed files with 1523 additions and 223 deletions


@@ -47,9 +47,14 @@ import {
statSync,
mkdirSync,
appendFileSync,
renameSync,
openSync,
readSync,
closeSync,
rmSync,
} from "fs";
import { join, basename, dirname } from "path";
import { execSync, execFileSync } from "child_process";
import { execSync, execFileSync, spawnSync, spawn, type ChildProcess } from "child_process";
import { homedir } from "os";
import { createHash } from "crypto";
@@ -73,6 +78,12 @@ interface CliArgs {
sources: Set<MemoryType>;
limit: number | null;
noWrite: boolean;
/**
* Opt-in per-file gitleaks scan during the prepare phase. Off by
* default — the cross-machine boundary (gstack-brain-sync, git push)
* has its own scanner. Setting this adds ~4-8 min to cold runs.
*/
scanSecrets: boolean;
}
type MemoryType =
@@ -137,6 +148,14 @@ interface BulkResult {
failed: number;
duration_ms: number;
partial_pages: number;
/**
* D6: when set, indicates a process-level failure (gbrain CLI missing
* or `gbrain import` crashed). Per-file errors (FILE_TOO_LARGE etc.)
* land in `failed` but do NOT set this flag — the orchestrator should
* still treat the run as OK with summary mentioning the failure count.
* Only when this is set does the verdict become ERR.
*/
system_error?: string;
}
// ── Constants ──────────────────────────────────────────────────────────────
@@ -176,6 +195,9 @@ Options:
--limit <N> Stop after N pages written (smoke testing).
--no-write Skip gbrain put_page calls (still updates state file).
Used by tests + dry runs without actual ingest.
--scan-secrets Opt-in per-file gitleaks scan during prepare. Off by
default; gstack-brain-sync already gates the git-push
boundary. Adds ~4-8 min to cold runs.
--help This text.
`);
}
@@ -190,6 +212,7 @@ function parseArgs(): CliArgs {
let limit: number | null = null;
let sources: Set<MemoryType> = new Set(ALL_TYPES);
let noWrite = process.env.GSTACK_MEMORY_INGEST_NO_WRITE === "1";
let scanSecrets = process.env.GSTACK_MEMORY_INGEST_SCAN_SECRETS === "1";
for (let i = 0; i < args.length; i++) {
const a = args[i];
@@ -202,6 +225,7 @@ function parseArgs(): CliArgs {
case "--include-unattributed": includeUnattributed = true; break;
case "--all-history": allHistory = true; break;
case "--no-write": noWrite = true; break;
case "--scan-secrets": scanSecrets = true; break;
case "--limit":
limit = parseInt(args[++i] || "0", 10);
if (!Number.isFinite(limit) || limit <= 0) {
@@ -229,7 +253,7 @@ function parseArgs(): CliArgs {
}
}
return { mode, quiet, benchmark, includeUnattributed, allHistory, sources, limit, noWrite };
return { mode, quiet, benchmark, includeUnattributed, allHistory, sources, limit, noWrite, scanSecrets };
}
// ── State file ─────────────────────────────────────────────────────────────
@@ -268,9 +292,14 @@ function loadState(): IngestState {
}
function saveState(state: IngestState): void {
// F6 (Codex finding 6): tmp+rename atomic write so a crash mid-write
// never leaves a truncated/corrupt state file. Matches the pattern
// in gstack-gbrain-sync.ts:saveSyncState.
try {
mkdirSync(dirname(STATE_PATH), { recursive: true });
writeFileSync(STATE_PATH, JSON.stringify(state, null, 2), "utf-8");
const tmp = `${STATE_PATH}.tmp.${process.pid}`;
writeFileSync(tmp, JSON.stringify(state, null, 2), "utf-8");
renameSync(tmp, STATE_PATH);
} catch (err) {
console.error(`[state] write failed: ${(err as Error).message}`);
}
@@ -278,12 +307,15 @@ function saveState(state: IngestState): void {
// ── File hash + change detection ───────────────────────────────────────────
function fileSha256(path: string, maxBytes = 1024 * 1024): string {
// Hash the first 1MB only; sufficient for change detection on big JSONL.
function fileSha256(path: string): string {
// F9 (Codex finding 9): full-file hash. The prior 1MB cap silently
// missed tail edits to long partial transcripts — exactly the
// recovery case this pipeline needs to handle correctly. Realistic
// max for an ingest source is ~50MB (long JSONL); fine to load in
// memory for hashing.
try {
const fd = readFileSync(path);
const slice = fd.length > maxBytes ? fd.subarray(0, maxBytes) : fd;
return createHash("sha256").update(slice).digest("hex");
const buf = readFileSync(path);
return createHash("sha256").update(buf).digest("hex");
} catch {
return "";
}
@@ -753,51 +785,66 @@ function buildArtifactPage(path: string, type: MemoryType): PageRecord {
};
}
// ── Writer (calls `gbrain put`) ────────────────────────────────────────────
// ── Writer (batch via `gbrain import <dir>`) ───────────────────────────────
//
// Architecture (post plan-eng-review + Codex outside-voice):
//
// walkAllSources(ctx)
// → for each path: mtime-skip / source-file gitleaks (D3) / parse / buildPage
// → renderPageBody injects title/type/tags into YAML frontmatter
// → writeStaged: mkdir -p slug subdirs (D1), write ${slug}.md
// → snapshot ~/.gbrain/sync-failures.jsonl byte-offset (D7)
// → spawnSync `gbrain import <stagingDir> --no-embed --json` (D6)
// → parseImportJson(stdout) → { imported, skipped, errors, ... } (D6 OK/ERR)
// → readNewFailures(preImportOffset, slugMap) → Set<sourcePath> (D7)
// → state.sessions[path] = { ... } for prepared files NOT in failed set
// → saveStateAtomic (F6 tmp+rename) + cleanupStagingDir
//
// We trust gbrain's content_hash idempotency (verified in
// ~/git/gbrain/src/core/import-file.ts:242-243, :478) — repeated imports
// of identical content are cheap. So we do NOT track per-file skip_reasons,
// do NOT keep a SIGTERM checkpoint, and do NOT advance a three-state verdict.
let _gbrainAvailability: boolean | null = null;
function gbrainAvailable(): boolean {
if (_gbrainAvailability !== null) return _gbrainAvailability;
try {
execSync("command -v gbrain", { stdio: "ignore" });
// gbrain v0.27 retired the legacy `put_page` flag-form for `put <slug>`
// (content via stdin, metadata as YAML frontmatter). Probe `--help` for
// the `put` subcommand so we surface a single clean error here rather
// than failing every page with "Unknown command: put_page". The regex
// anchors on the indented subcommand format gbrain's help actually uses
// (" put ..."), not any whitespace-bordered "put" word in prose.
// Probe `--help` for the `import` subcommand. gbrain v0.20.0+ ships
// `import <dir>` (batch markdown import via path-authoritative slugs).
// If absent, we surface a single clean error here rather than failing
// the whole stage with a confusing usage message from gbrain itself.
const help = execFileSync("gbrain", ["--help"], {
encoding: "utf-8",
timeout: 5000,
stdio: ["ignore", "pipe", "pipe"],
});
_gbrainAvailability = /^\s+put\s/m.test(help);
_gbrainAvailability = /^\s+import\s/m.test(help);
} catch {
_gbrainAvailability = false;
}
return _gbrainAvailability;
}
function gbrainPutPage(page: PageRecord): { ok: boolean; error?: string } {
if (!gbrainAvailable()) {
return { ok: false, error: "gbrain CLI not in PATH or missing `put` subcommand" };
}
// gbrain v0.27+ uses `put <slug>` (positional, content via stdin) instead
// of the legacy `put_page` flag form. Metadata rides as YAML frontmatter:
// - When the page body already starts with frontmatter (transcripts), inject
// title/type/tags into the existing block so gbrain's frontmatter parser
// picks them up.
// - When the page body has no frontmatter (raw artifacts: design-docs,
// learnings, builder-profile-entries), wrap with a fresh frontmatter
// carrying the same fields. Without this branch, artifact pages would
// land in gbrain with empty title/type/tags.
/**
* Build the markdown body with YAML frontmatter (title/type/tags) injected.
*
* Two cases:
* - Page body already starts with `---\n` (transcripts) — inject into the
* existing frontmatter block before its close fence so gbrain's frontmatter
* parser picks up the fields alongside any session-level metadata the
* transcript builder already wrote (session_id, cwd, git_remote, etc.).
* - No leading frontmatter (raw artifacts: design-docs, learnings, etc.) —
* wrap with a fresh frontmatter block carrying title/type/tags. Without
* this branch, artifact pages would land in gbrain with empty metadata.
*
* gbrain enforces slug = path-derived (slugifyPath in gbrain's sync.ts).
* We do NOT set `slug:` in frontmatter — the staging-dir filename is the
* source of truth and gbrain rejects mismatches.
*/
function renderPageBody(page: PageRecord): string {
let body = page.body;
if (body.startsWith("---\n")) {
// Locate the closing --- delimiter. buildTranscriptPage joins with "\n"
// and does not append a trailing newline, so the close fence looks like
// "...\n---" followed directly by body content (no "\n---\n" pattern).
// Match the close on "\n---" only — the inject lands BEFORE the close
// fence, inside the frontmatter block, regardless of what follows it.
const end = body.indexOf("\n---", 4);
if (end > 0) {
const inject = [
@@ -822,27 +869,155 @@ function gbrainPutPage(page: PageRecord): { ok: boolean; error?: string } {
// Strip NUL bytes — Postgres rejects 0x00 in UTF-8 text columns. Some Claude
// Code transcripts contain NUL inside user-pasted content or tool output, and
// surfacing those as `internal_error: invalid byte sequence` from the brain
// is unhelpful when we can sanitize at write time.
// is unhelpful when we can sanitize at write time. Originally landed in v1.32.0.0
// (PR #1411) on the per-file `gbrain put` path; moved here so all staged
// pages still get the same sanitization.
body = body.replace(/\x00/g, "");
try {
execFileSync("gbrain", ["put", page.slug], {
input: body,
encoding: "utf-8",
// Bumped from 30s: auto-link reconciliation on dense transcripts hits
// 30s once the brain has a few hundred existing pages.
timeout: 60000,
// Bumped from default 1MB: without this, gbrain's actual stderr gets
// truncated and callers see only "Command failed:" with no detail.
maxBuffer: 16 * 1024 * 1024,
stdio: ["pipe", "pipe", "pipe"],
});
return { ok: true };
} catch (err: any) {
const stderr = err?.stderr?.toString?.() ?? "";
const stdout = err?.stdout?.toString?.() ?? "";
const detail = stderr || stdout || (err instanceof Error ? err.message : String(err));
return { ok: false, error: detail.split("\n")[0].slice(0, 300) };
return body;
}
interface PreparedPage {
/** Page slug (path-shaped, e.g. "transcripts/claude-code/foo"). */
slug: string;
/** Original source file on disk (e.g. ~/.claude/projects/.../foo.jsonl). */
source_path: string;
/** Full markdown including frontmatter — ready to write. */
rendered_body: string;
/** Carry-through fields for state recording on success. */
page_slug: string;
partial: boolean;
}
interface StagingResult {
staging_dir: string;
written: number;
errors: Array<{ slug: string; error: string }>;
/** Map from staging-dir-relative path (e.g. "transcripts/foo.md") → source path. */
stagedPathToSource: Map<string, string>;
}
/**
* Write prepared pages to a staging dir, mirroring slug hierarchy.
*
* D1: gbrain's `slugifyPath` (sync.ts:260) derives the slug from the
* directory-aware relative path inside the import dir, so slugs containing
* slashes (e.g. "transcripts/claude-code/foo") must live in matching
* subdirectories of the staging dir. Otherwise the slug becomes flattened
* or rejected by gbrain's path-vs-frontmatter slug check (import-file.ts:429).
*
* Filename = `${slug}.md`. mkdir is recursive. Existing files overwrite.
* Errors per-file are collected; the whole batch is best-effort.
*/
function writeStaged(prepared: PreparedPage[], stagingDir: string): StagingResult {
mkdirSync(stagingDir, { recursive: true });
const stagedPathToSource = new Map<string, string>();
const errors: Array<{ slug: string; error: string }> = [];
let written = 0;
for (const p of prepared) {
const relPath = `${p.slug}.md`;
const absPath = join(stagingDir, relPath);
try {
mkdirSync(dirname(absPath), { recursive: true });
writeFileSync(absPath, p.rendered_body, "utf-8");
stagedPathToSource.set(relPath, p.source_path);
written++;
} catch (err) {
errors.push({ slug: p.slug, error: (err as Error).message });
}
}
return { staging_dir: stagingDir, written, errors, stagedPathToSource };
}
interface ImportJsonResult {
status?: string;
duration_s?: number;
imported?: number;
skipped?: number;
errors?: number;
chunks?: number;
total_files?: number;
}
/**
* Parse the `gbrain import --json` stdout payload (single JSON object on
* the last non-empty line per commands/import.ts:271-275).
*
* Returns parsed counts on success, or `null` to signal "unparseable" — the
* caller treats null as ERR (system_error) rather than silently passing
* through as zeros. Pre-2026-05-11 this returned zeros on parse failure,
* which silently masked gbrain crashes as "0 imported, 0 failed = OK".
*/
function parseImportJson(stdout: string): ImportJsonResult | null {
const lines = stdout.split("\n").map((s) => s.trim()).filter(Boolean);
for (let i = lines.length - 1; i >= 0; i--) {
const line = lines[i];
if (line.startsWith("{") && line.endsWith("}")) {
try {
const parsed = JSON.parse(line);
if (typeof parsed === "object" && parsed && "imported" in parsed) {
return parsed as ImportJsonResult;
}
} catch {
// try next line up
}
}
}
return null;
}
/**
* Read failures appended to ~/.gbrain/sync-failures.jsonl since the
* snapshotted byte offset, and map them back to source paths.
*
* D7: gbrain import writes per-file failures to sync-failures.jsonl
* (commands/import.ts:308-310) explicitly so "callers can gate state
* advances" (comment at :28). We snapshot the file size before import
* and read only the appended bytes after, so we never confuse new
* entries with prior-run leftovers.
*
* Each line is `{ path, error, code, commit, ts }`. The `path` is the
* staging-dir-relative filename gbrain saw (e.g. "transcripts/foo.md").
* stagedPathToSource maps that back to the original source file.
*/
function readNewFailures(
syncFailuresPath: string,
preImportOffset: number,
stagedPathToSource: Map<string, string>,
): Set<string> {
const failed = new Set<string>();
try {
if (!existsSync(syncFailuresPath)) return failed;
const stat = statSync(syncFailuresPath);
if (stat.size <= preImportOffset) return failed;
// Read appended bytes only. readSync with a positional offset works
// synchronously without slurping the whole file.
const fd = openSync(syncFailuresPath, "r");
try {
const buf = Buffer.alloc(stat.size - preImportOffset);
readSync(fd, buf, 0, buf.length, preImportOffset);
const text = buf.toString("utf-8");
for (const line of text.split("\n")) {
const trimmed = line.trim();
if (!trimmed) continue;
try {
const entry = JSON.parse(trimmed) as { path?: string };
if (entry.path) {
const source = stagedPathToSource.get(entry.path);
if (source) failed.add(source);
}
} catch {
// ignore malformed line
}
}
} finally {
closeSync(fd);
}
} catch {
// Best-effort. If we can't read failures, we conservatively assume
// none — caller will state-record all prepared files. Worst case:
// failed files get a retry-on-next-run shot anyway via content_hash.
}
return failed;
}
// ── Main ingest passes ─────────────────────────────────────────────────────
@@ -901,34 +1076,72 @@ async function probeMode(args: CliArgs): Promise<ProbeReport> {
};
}
async function ingestPass(args: CliArgs): Promise<BulkResult> {
const t0 = Date.now();
const state = loadState();
const ctx = makeWalkContext(args, state);
let written = 0;
/**
* Prepare phase: walk sources, apply incremental + optional-secret-scan filters,
* parse transcripts/artifacts into PageRecord, render bodies with
* frontmatter. Returns the PreparedPage[] to stage + counts of files
* filtered at each gate.
*
* Secret scanning policy (post 2026-05-10 perf review):
*
* The actual cross-machine exfiltration boundary is `gstack-brain-sync`,
* which runs a regex-based secret scanner on the staged diff before
* `git commit` (see bin/gstack-brain-sync:78-110: AWS keys, GitHub
* tokens, OpenAI keys, PEM blocks, JWTs, bearer-token-in-JSON). That's
* the right place — it gates content leaving the machine.
*
* memory-ingest, by contrast, moves data from one local file to a
* local PGLite database. Scanning every source file at ingest time
* doesn't change exposure (the secret already lives in plaintext
* where the user keeps their transcripts and artifacts) but costs
* ~470s on cold runs. We removed the per-file gitleaks gate as
* redundant defense-in-depth and made it opt-in via `--scan-secrets`
* for users who want belt-and-suspenders.
*/
function preparePages(
args: CliArgs,
ctx: WalkContext,
state: IngestState,
): {
prepared: PreparedPage[];
skippedSecret: number;
skippedDedup: number;
skippedUnattributed: number;
parseFailed: number;
partialPages: number;
} {
const prepared: PreparedPage[] = [];
let skippedSecret = 0;
let skippedDedup = 0;
let skippedUnattributed = 0;
let failed = 0;
let parseFailed = 0;
let partialPages = 0;
for (const { path, type } of walkAllSources(ctx)) {
if (args.limit !== null && written >= args.limit) break;
if (args.limit !== null && prepared.length >= args.limit) break;
if (args.mode === "incremental" && !fileChangedSinceState(path, state)) {
skippedDedup++;
continue;
}
// Secret scan first
const scan = secretScanFile(path);
if (scan.scanner === "gitleaks" && scan.findings.length > 0) {
skippedSecret++;
if (!args.quiet) {
console.error(`[secret-scan match] ${path} (${scan.findings.length} finding${scan.findings.length === 1 ? "" : "s"}); skipped`);
// Optional belt-and-suspenders: when --scan-secrets is set, scan the
// source file with gitleaks and skip dirty ones. Off by default
// because gstack-brain-sync already gates the cross-machine boundary
// and per-file gitleaks costs ~256ms/file (4-8 min on a real corpus).
if (args.scanSecrets) {
const scan = secretScanFile(path);
if (scan.scanner === "gitleaks" && scan.findings.length > 0) {
skippedSecret++;
if (!args.quiet) {
console.error(
`[secret-scan match] ${path} (${scan.findings.length} finding${
scan.findings.length === 1 ? "" : "s"
}); skipped`,
);
}
continue;
}
continue;
}
let page: PageRecord;
@@ -936,7 +1149,7 @@ async function ingestPass(args: CliArgs): Promise<BulkResult> {
if (type === "transcript") {
const session = parseTranscriptJsonl(path);
if (!session) {
failed++;
parseFailed++;
continue;
}
if (!args.includeUnattributed && !session.cwd) {
@@ -953,38 +1166,373 @@ async function ingestPass(args: CliArgs): Promise<BulkResult> {
page = buildArtifactPage(path, type);
}
} catch (err) {
failed++;
parseFailed++;
console.error(`[parse-error] ${path}: ${(err as Error).message}`);
continue;
}
const result = args.noWrite
? { ok: true }
: await withErrorContext(
`put_page:${page.slug}`,
async () => gbrainPutPage(page),
"gstack-memory-ingest"
);
if (!result.ok) {
failed++;
if (!args.quiet) {
console.error(`[put-error] ${page.slug}: ${result.error || "unknown"}`);
prepared.push({
slug: page.slug,
source_path: path,
rendered_body: renderPageBody(page),
page_slug: page.slug,
partial: page.partial ?? false,
});
}
return {
prepared,
skippedSecret,
skippedDedup,
skippedUnattributed,
parseFailed,
partialPages,
};
}
/**
* Make a per-run staging directory at ~/.gstack/.staging-ingest-<pid>-<ts>/
* The pid+ts namespace avoids collisions when two ingest passes run
* concurrently (the orchestrator's lock should prevent this, but
* defense-in-depth).
*/
function makeStagingDir(): string {
const dir = join(GSTACK_HOME, `.staging-ingest-${process.pid}-${Date.now()}`);
mkdirSync(dir, { recursive: true });
return dir;
}
/**
* Best-effort recursive cleanup. Failures swallowed — at worst we leak a
* staging dir to disk; the next run uses a new one and they age out via
* normal disk hygiene. We deliberately do NOT crash the pipeline on
* cleanup failure.
*/
function cleanupStagingDir(dir: string): void {
try {
rmSync(dir, { recursive: true, force: true });
} catch {
// best-effort
}
}
/**
* Track the currently-running gbrain import child + active staging dir so
* SIGTERM/SIGINT on the parent process can:
* 1. forward the signal to the child (otherwise gbrain orphans, holds the
* PGLite write lock, and burns CPU — observed during 2026-05-10 cold-run
* testing)
* 2. synchronously clean up the staging dir BEFORE process.exit (otherwise
* finally blocks in async callers don't run after process.exit from
* inside a signal handler, leaking the staging dir on every interrupt)
*/
let _activeImportChild: ChildProcess | null = null;
let _activeStagingDir: string | null = null;
let _signalHandlersInstalled = false;
function installSignalForwarder(): void {
if (_signalHandlersInstalled) return;
_signalHandlersInstalled = true;
const forward = (signal: NodeJS.Signals) => () => {
if (_activeImportChild && _activeImportChild.pid && !_activeImportChild.killed) {
try {
process.kill(_activeImportChild.pid, signal);
} catch {
// child may have already exited between the alive-check and the kill
}
}
// Synchronously clean up the active staging dir before exiting. The async
// `finally` blocks in ingestPass never run after process.exit fires from
// inside this handler, so cleanup has to happen here.
if (_activeStagingDir) {
cleanupStagingDir(_activeStagingDir);
_activeStagingDir = null;
}
// Re-raise to default action so the parent actually exits. Without this,
// a SIGTERM handler that doesn't exit holds the process alive.
process.exit(signal === "SIGINT" ? 130 : 143);
};
process.on("SIGTERM", forward("SIGTERM"));
process.on("SIGINT", forward("SIGINT"));
}
/**
* Run gbrain import as an async child so we can install signal handlers
* that kill the child on parent SIGTERM/SIGINT. Returns the same shape as
* spawnSync's result so the caller doesn't care which mode was used.
*/
function runGbrainImport(
stagingDir: string,
timeoutMs: number,
): Promise<{ status: number | null; stdout: string; stderr: string }> {
installSignalForwarder();
return new Promise((resolve) => {
const child = spawn(
"gbrain",
["import", stagingDir, "--no-embed", "--json"],
{ stdio: ["ignore", "pipe", "pipe"] },
);
_activeImportChild = child;
let stdout = "";
let stderr = "";
let timedOut = false;
const timer = setTimeout(() => {
timedOut = true;
try {
if (child.pid) process.kill(child.pid, "SIGTERM");
} catch {
// already gone
}
}, timeoutMs);
child.stdout?.on("data", (chunk) => {
stdout += chunk.toString("utf-8");
});
child.stderr?.on("data", (chunk) => {
stderr += chunk.toString("utf-8");
});
child.on("close", (status) => {
clearTimeout(timer);
_activeImportChild = null;
resolve({
status: timedOut ? null : status,
stdout,
stderr,
});
});
child.on("error", (err) => {
clearTimeout(timer);
_activeImportChild = null;
resolve({
status: null,
stdout,
stderr: stderr + `\n[spawn-error] ${(err as Error).message}`,
});
});
});
}
async function ingestPass(args: CliArgs): Promise<BulkResult> {
const t0 = Date.now();
const state = loadState();
const ctx = makeWalkContext(args, state);
// Phase 1: prepare (parse + secret-scan + filter + render frontmatter).
const prep = preparePages(args, ctx, state);
let written = 0;
let failed = 0;
if (args.noWrite) {
// --no-write: skip the gbrain import call but still record state for
// prepared pages (treat them as ingested for dedup purposes). Matches
// the prior contract from --help: "Skip gbrain put_page calls (still
// updates state file)".
const nowIso = new Date().toISOString();
for (const p of prep.prepared) {
try {
state.sessions[p.source_path] = {
mtime_ns: Math.floor(statSync(p.source_path).mtimeMs * 1e6),
sha256: fileSha256(p.source_path),
ingested_at: nowIso,
page_slug: p.page_slug,
partial: p.partial,
};
written++;
} catch {
// best-effort state record
}
}
state.last_full_walk = new Date().toISOString();
state.last_writer = "gstack-memory-ingest";
saveState(state);
return {
written,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed: prep.parseFailed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
};
}
if (prep.prepared.length === 0) {
// Nothing to import — still touch state.last_full_walk and exit.
state.last_full_walk = new Date().toISOString();
state.last_writer = "gstack-memory-ingest";
saveState(state);
return {
written: 0,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed: prep.parseFailed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
};
}
if (!gbrainAvailable()) {
const msg =
"gbrain CLI not in PATH or missing `import` subcommand. Run /setup-gbrain.";
console.error(`[memory-ingest] ERR: ${msg}`);
return {
written: 0,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed: prep.parseFailed + prep.prepared.length,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
system_error: msg,
};
}
// Phase 2: stage to a per-run dir + invoke gbrain import.
const stagingDir = makeStagingDir();
// Register staging dir with the signal forwarder so SIGTERM/SIGINT can
// synchronously clean it up before process.exit (the async finally block
// below does NOT run after a signal-handler exit).
_activeStagingDir = stagingDir;
try {
const staging = writeStaged(prep.prepared, stagingDir);
failed += staging.errors.length;
if (!args.quiet && staging.errors.length > 0) {
for (const e of staging.errors.slice(0, 5)) {
console.error(`[stage-error] ${e.slug}: ${e.error}`);
}
continue;
}
state.sessions[path] = {
mtime_ns: Math.floor(statSync(path).mtimeMs * 1e6),
sha256: page.content_sha256,
ingested_at: new Date().toISOString(),
page_slug: page.slug,
partial: page.partial,
};
written++;
if (!args.quiet) {
const tag = page.partial ? " [partial]" : "";
console.log(`[${written}] ${page.slug}${tag}`);
// D7: snapshot sync-failures.jsonl byte-offset before import so we
// can read only newly-appended failure entries afterwards.
const syncFailuresPath = join(homedir(), ".gbrain", "sync-failures.jsonl");
let preImportOffset = 0;
try {
if (existsSync(syncFailuresPath)) {
preImportOffset = statSync(syncFailuresPath).size;
}
} catch {
// best-effort; absent file → 0 offset, all future entries are "new"
}
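// --- Illustrative sketch (hypothetical helper, not part of this module) of
// the D7 technique: read only the bytes appended to a JSONL file after a
// snapshotted byte offset, tolerating a torn final line. The real
// readNewFailures additionally maps staged paths back to source paths.
// Assumes openSync/readSync/closeSync/fstatSync are imported from node:fs.
function readAppendedJsonl(path: string, fromOffset: number): unknown[] {
  const fd = openSync(path, "r");
  try {
    const size = fstatSync(fd).size;
    if (size <= fromOffset) return []; // nothing appended (or file truncated)
    const buf = Buffer.alloc(size - fromOffset);
    readSync(fd, buf, 0, buf.length, fromOffset); // start at the snapshot offset
    const out: unknown[] = [];
    for (const line of buf.toString("utf8").split("\n")) {
      const t = line.trim();
      if (!t) continue;
      try {
        out.push(JSON.parse(t));
      } catch {
        // tolerate a partially-written last line; it will be whole next read
      }
    }
    return out;
  } finally {
    closeSync(fd);
  }
}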
if (!args.quiet) {
console.error(
`[memory-ingest] staged ${staging.written} pages → ${stagingDir}; running gbrain import...`,
);
}
// D6: single batch import. `--no-embed` matches the prior per-file
// behavior (we never enabled embedding); embeddings happen on-demand
// via gbrain's own pipelines. `--json` gives us structured counts.
//
// Async spawn (not spawnSync) so the signal forwarder installed in
// runGbrainImport propagates SIGTERM/SIGINT to the child. With sync
// spawn, parent termination orphans the gbrain process (observed
// during 2026-05-10 cold-run testing — gbrain kept running 15 min
// after the orchestrator timed out).
const importResult = await runGbrainImport(stagingDir, 30 * 60 * 1000);
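// --- Illustrative sketch (hypothetical name) of the async-spawn-with-timeout
// shape described above for runGbrainImport: spawn (not spawnSync) so the
// parent's signal handlers stay live, capture stdout/stderr, and SIGTERM the
// child if the deadline passes. The real helper also installs the signal
// forwarder; that part is omitted here.
function spawnWithTimeout(
  cmd: string,
  args: string[],
  timeoutMs: number,
): Promise<{ status: number | null; stdout: string; stderr: string }> {
  return new Promise((resolve) => {
    const child = spawn(cmd, args, { stdio: ["ignore", "pipe", "pipe"] });
    let stdout = "";
    let stderr = "";
    child.stdout?.on("data", (d) => (stdout += d));
    child.stderr?.on("data", (d) => (stderr += d));
    const timer = setTimeout(() => child.kill("SIGTERM"), timeoutMs);
    child.on("close", (status) => {
      clearTimeout(timer);
      resolve({ status, stdout, stderr });
    });
  });
}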
const stdout = importResult.stdout || "";
const stderr = importResult.stderr || "";
const importJson = parseImportJson(stdout);
if (importResult.status !== 0) {
const tail = (stderr.trim().split("\n").pop() || "").slice(0, 300);
const msg = `gbrain import exited ${importResult.status}: ${tail}`;
console.error(`[memory-ingest] ERR: ${msg}`);
// We conservatively state-record nothing on a non-zero exit — per-run
// partial progress is invisible to us when the importer crashed.
// sync-failures.jsonl entries may still hold per-file detail.
failed += prep.prepared.length;
return {
written: 0,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
system_error: msg,
};
}
if (!args.quiet) {
// Forward gbrain's progress lines to our stderr so the user sees them
// when running interactively: with `stdio: "pipe"` the child's stderr is
// captured by us, not inherited, so nothing reaches the terminal unless
// we write it out.
process.stderr.write(stderr);
}
if (importJson === null) {
// gbrain exited 0 but didn't emit a parseable --json line. Treat as
// ERR rather than silently passing zeros through — silent zeros let
// a future gbrain-output regression mask data loss.
const msg =
"gbrain import exited 0 but emitted no parseable --json payload. " +
"Refusing to advance state.";
console.error(`[memory-ingest] ERR: ${msg}`);
failed += prep.prepared.length;
return {
written: 0,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
system_error: msg,
};
}
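// --- Illustrative sketch of one way a parseImportJson-style helper can
// locate the --json payload: scan stdout lines from the end for the last
// line that parses as a JSON object, returning null when none does (which
// the caller above treats as ERR rather than silent zeros).
// `lastJsonObjectLine` is a hypothetical name; the real parser may differ.
function lastJsonObjectLine(stdout: string): Record<string, unknown> | null {
  const lines = stdout.split("\n");
  for (let i = lines.length - 1; i >= 0; i--) {
    const t = lines[i].trim();
    if (!t.startsWith("{")) continue; // cheap pre-filter before JSON.parse
    try {
      const v = JSON.parse(t);
      if (v && typeof v === "object" && !Array.isArray(v)) {
        return v as Record<string, unknown>;
      }
    } catch {
      // progress lines often interleave with output; keep scanning
    }
  }
  return null;
}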
// D7: identify which staged files failed to import and exclude them
// from state recording. Source paths get a retry on the next run.
const failedSources = readNewFailures(
syncFailuresPath,
preImportOffset,
staging.stagedPathToSource,
);
failed += failedSources.size;
// Phase 3: state recording. Only files that landed in gbrain get
// their mtime+sha256 stamped. Failed source paths are deliberately
// left un-state'd so the next run re-prepares them and gbrain's
// content_hash dedup short-circuits the import.
const nowIso = new Date().toISOString();
for (const p of prep.prepared) {
if (failedSources.has(p.source_path)) continue;
try {
state.sessions[p.source_path] = {
mtime_ns: Math.floor(statSync(p.source_path).mtimeMs * 1e6),
sha256: fileSha256(p.source_path),
ingested_at: nowIso,
page_slug: p.page_slug,
partial: p.partial,
};
written++;
if (!args.quiet) {
const tag = p.partial ? " [partial]" : "";
console.log(`[${written}] ${p.page_slug}${tag}`);
}
} catch (err) {
// statSync can fail if the source file was removed mid-run; skip
// recording but don't fail the whole pass.
console.error(
`[state-record] ${p.source_path}: ${(err as Error).message}`,
);
}
}
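// --- Illustrative sketch of the F9 full-file hash shape used by fileSha256
// above: hash every byte (no size cap), so a tail edit on a long partial
// transcript changes the digest and the file is re-ingested.
// `sha256OfFile` is a hypothetical standalone name; assumes createHash is
// imported from node:crypto and readFileSync from node:fs.
function sha256OfFile(path: string): string {
  const h = createHash("sha256");
  h.update(readFileSync(path)); // whole file; fine at transcript sizes
  return h.digest("hex");
}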
if (!args.quiet) {
console.error(
`[memory-ingest] gbrain import: ${importJson.imported ?? 0} imported, ` +
`${importJson.skipped ?? 0} unchanged, ${importJson.errors ?? 0} failed` +
(failedSources.size > 0
? ` (see ~/.gbrain/sync-failures.jsonl for details)`
: ""),
);
}
} finally {
cleanupStagingDir(stagingDir);
_activeStagingDir = null;
}
state.last_full_walk = new Date().toISOString();
return {
written,
skipped_secret: prep.skippedSecret,
skipped_dedup: prep.skippedDedup,
skipped_unattributed: prep.skippedUnattributed,
failed: failed + prep.parseFailed,
duration_ms: Date.now() - t0,
partial_pages: prep.partialPages,
};
}
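// --- Illustrative sketch (hypothetical names) of the signal-forwarding shape
// installSignalForwarder is described as having: propagate SIGTERM/SIGINT to
// the gbrain child so it releases the PGLite write lock, synchronously remove
// the staging dir (finally blocks do not run after a signal-handler exit),
// then exit with the conventional 128+signum code.
function forwardSignalsSketch(child: ChildProcess, stagingDir: string): void {
  for (const sig of ["SIGTERM", "SIGINT"] as const) {
    process.on(sig, () => {
      child.kill(sig); // without this, the child is orphaned holding the lock
      try {
        rmSync(stagingDir, { recursive: true, force: true }); // sync cleanup
      } catch {
        // best-effort; staging dir may already be gone
      }
      process.exit(sig === "SIGTERM" ? 143 : 130);
    });
  }
}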
if (result.written > 0 || result.failed > 0) {
console.error(`[memory-ingest] ${result.written} written, ${result.failed} failed in ${dt}ms`);
}
// D4: system_error → process-level failure; orchestrator sees ERR.
// Per-file errors do NOT exit non-zero.
if (result.system_error) process.exit(1);
return;
}
const result = await ingestPass(args);
printBulkResult(result, args);
if (result.system_error) process.exit(1);
}
main().catch((err) => {