mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-16 09:12:13 +08:00
* feat(gbrain): add lib/gbrain-local-status classifier with 5-state engine status + 60s cache
Foundation for split-engine gbrain: shared classifier used by both
bin/gstack-gbrain-detect (preamble probe) and bin/gstack-gbrain-sync.ts
(orchestrator SKIP-when-not-ok). Single source of truth.
Probes via `gbrain sources list --json` and classifies stderr against the
same patterns lib/gbrain-sources.ts:66-67 already uses ("Cannot connect to
database", "config.json"). Returns one of: ok, no-cli, missing-config,
broken-config, broken-db. Defensive default: unrecognized failures
classify as broken-config so the raw stderr can be surfaced upstream.
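A minimal sketch of the classification order described above, not the contents of lib/gbrain-local-status.ts. The `ProbeResult` shape and the `classifyProbe` name are hypothetical, and checking for a missing config file before looking at the probe output is an assumption. Only the five status names, the "Cannot connect to database" pattern, and the broken-config default come from this commit message.

```typescript
type LocalEngineStatus = "ok" | "no-cli" | "missing-config" | "broken-config" | "broken-db";

// Hypothetical summary of one probe run; the real classifier gathers these itself.
interface ProbeResult {
  cliFound: boolean;       // was a gbrain binary resolved on PATH?
  configExists: boolean;   // assumed check for ~/.gbrain/config.json
  exitCode: number | null; // exit status of `gbrain sources list --json`
  stderr: string;          // raw stderr from the probe
}

function classifyProbe(p: ProbeResult): LocalEngineStatus {
  if (!p.cliFound) return "no-cli";
  if (!p.configExists) return "missing-config";
  if (p.exitCode === 0) return "ok";
  if (p.stderr.includes("Cannot connect to database")) return "broken-db";
  // Defensive default: unrecognized failures are reported as broken-config
  // so the caller can surface the raw stderr.
  return "broken-config";
}

console.log(classifyProbe({ cliFound: true, configExists: true, exitCode: 1, stderr: "boom" }));
```

Keeping the function pure (inputs in, status out) is one way to let both the detect helper and the orchestrator share it without re-running the probe.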
Cache at ~/.gstack/.gbrain-local-status-cache.json keyed on
{home, path_hash, gbrain_bin_path, gbrain_version, config_mtime, config_size}
with 60s TTL. Cache invalidates on any invariant change. --no-cache option
busts the cache for callers that just mutated state (/setup-gbrain,
/sync-gbrain after init/migration).
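The cache rule above (reuse only while every invariant matches and the entry is under 60 seconds old) could be sketched as follows. The key field names are taken from the commit message; the field types, the `CacheEntry` shape, and the `cachedStatus` helper are assumptions made for illustration.

```typescript
interface CacheKey {
  home: string;
  path_hash: string;
  gbrain_bin_path: string;
  gbrain_version: string;
  config_mtime: number; // assumed: mtime in milliseconds
  config_size: number;  // assumed: size in bytes
}

// Hypothetical on-disk entry shape.
interface CacheEntry {
  key: CacheKey;
  status: string;
  written_at_ms: number;
}

const TTL_MS = 60_000;

function sameKey(a: CacheKey, b: CacheKey): boolean {
  return (Object.keys(a) as (keyof CacheKey)[]).every((k) => a[k] === b[k]);
}

// Returns the cached status, or null when the caller must re-probe.
function cachedStatus(
  entry: CacheEntry | null,
  current: CacheKey,
  nowMs: number,
  noCache = false,
): string | null {
  if (noCache || entry === null) return null;              // --no-cache busts the cache
  if (nowMs - entry.written_at_ms > TTL_MS) return null;   // older than 60s
  return sameKey(entry.key, current) ? entry.status : null; // any invariant change invalidates
}
```

A caller that has just mutated state would pass `noCache = true`, which mirrors the --no-cache option mentioned above.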
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* refactor(gbrain): rewrite gstack-gbrain-detect bash→TS + add gbrain_local_status field
Replaces the bash detect helper with a bun shebang script sharing the
gbrain_local_status classifier from lib/gbrain-local-status.ts with the
sync orchestrator. Single source of truth for engine-status classification
between preamble-probe and orchestrator-skip paths.
Filename stays gstack-gbrain-detect (no .ts extension) so existing skill
preamble callers shell out unchanged. Shebang `#!/usr/bin/env -S bun run`
resolves bun at runtime.
Output is key/type backward-compatible with the bash version per plan
codex #5: the 9 pre-existing keys (gbrain_on_path, gbrain_version,
gbrain_config_exists, gbrain_engine, gbrain_doctor_ok, gbrain_mcp_mode,
gstack_brain_sync_mode, gstack_brain_git, gstack_artifacts_remote) stay
identical in name + type + value semantics. One new key added:
gbrain_local_status (5-state string enum).
Updates the existing schema regression at test/gstack-gbrain-detect-mcp-mode.test.ts
to include the new key. Adds test/gbrain-detect-shape.test.ts asserting
the regression contract for future changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(gbrain): orchestrator SKIP when local engine not ok + remote-http transcripts via artifacts pipeline
Two changes in the sync orchestrator, both per plan D11/D12:
1. bin/gstack-gbrain-sync.ts: runCodeImport + runMemoryIngest call
localEngineStatus() (shared classifier from lib/gbrain-local-status.ts).
When status is not 'ok', return a SKIP stage result with a clear reason
instead of crashing with "source registration failed: gbrain not
configured". Brain-sync stage runs regardless — it doesn't depend on
local engine. dry-run preview path is gated above the check so it
continues to show would-do steps even when the engine is broken.
2. bin/gstack-memory-ingest.ts: when gbrain MCP is registered as
remote-http (Path 4), persist staged transcripts to
~/.gstack/transcripts/run-<pid>-<ts>/ instead of the ephemeral
~/.gstack/.staging-ingest-<pid>-<ts>/ tmp dir, and SKIP the local
`gbrain import` call entirely. The artifacts pipeline (gstack-brain-sync
push to git, brain admin pulls and indexes) handles routing to the
remote brain. Local PGLite (when present via Step 4.5) stays code-only.
State recording still happens — prepared pages get their mtime+sha256
stamped under remote-http mode so the next /sync-gbrain doesn't
re-stage them. Cleanup is skipped intentionally so the persisted dir
survives until gstack-brain-sync moves it.
Adds test/gbrain-sync-skip.test.ts covering 5 SKIP scenarios (broken-db,
broken-config, no-cli, missing-config, ok pass-through). All 25
sync-related unit tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(gbrain): v1.34.0.0 migration notice + transcripts allowlist for artifacts pipeline
Per plan D5 + D11. Two pieces of the split-engine rollout:
1. gstack-upgrade/migrations/v1.34.0.0.sh — prints a one-time
discoverability notice for existing Path 4 (remote-http MCP) users
whose machine has no local engine yet. Tells them about /setup-gbrain
Step 4.5 (the new local-PGLite opt-in). Silent for everyone else.
User can suppress permanently via `gstack-config set
local_code_index_offered true`. Touchfile at
~/.gstack/.migrations/v1.34.0.0.done makes it idempotent.
2. bin/gstack-artifacts-init — adds `transcripts/run-*/*.md` and
`transcripts/run-*/**/*.md` to the managed allowlist so the
gstack-memory-ingest persistent staging dir (used in remote-http
mode per D11) gets pushed to the artifacts repo. Brain admin's
pull job then indexes transcripts into the remote brain.
Privacy class: behavioral (matches transcript content).
Adds test/gstack-upgrade-migration-v1_34_0_0.test.ts with 5 cases:
state match, no-MCP, local-config-present, opt-out, and idempotency.
All 5 pass.
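The five test cases above suggest a simple gating predicate. The migration itself is a shell script; the TypeScript below is only an illustration of the decision logic, and `NoticeInputs` and `shouldPrintNotice` are hypothetical names. Treating "no local engine yet" as "no local config file present" is an assumption drawn from the local-config-present test case.

```typescript
// Hypothetical inputs the migration would read from detect output, config, and disk.
interface NoticeInputs {
  mcpMode: string;            // "remote-http" for Path 4 users
  localConfigExists: boolean; // assumed stand-in for "local engine present"
  optedOut: boolean;          // local_code_index_offered set to true
  touchfileExists: boolean;   // the .done marker from a previous run
}

function shouldPrintNotice(i: NoticeInputs): boolean {
  if (i.touchfileExists) return false;           // idempotency
  if (i.optedOut) return false;                  // user suppressed the notice
  if (i.mcpMode !== "remote-http") return false; // no-MCP or other modes stay silent
  return !i.localConfigExists;                   // only when no local engine is set up
}

console.log(shouldPrintNotice({
  mcpMode: "remote-http",
  localConfigExists: false,
  optedOut: false,
  touchfileExists: false,
}));
```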
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(gbrain): /setup-gbrain Step 1.5/4.5 + /sync-gbrain Step 1.5 templates
Per plan D4, D10, D11, D12. Wires the skill prose to the new
split-engine flow + classifier introduced in earlier commits.
setup-gbrain/SKILL.md.tmpl:
- Step 1: detect output description now includes the v1.34.0.0
gbrain_local_status field (5 values).
- Step 1.5 (NEW): broken-db / broken-config remediation. AskUserQuestion
with 4 options — Retry / Switch to PGLite / Switch brain mode / Quit
(plan D4). Retry is recommended first since broken-db often = transient
Postgres outage. PGLite is explicitly one-way + destructive (moves
existing config to ~/.gbrain/config.json.gstack-bak-<ts>); rollback on
init failure restores the .bak (plan D7).
- Step 4d → Step 4.5 (NEW): in Path 4, after the verify step, offer
local PGLite for code search. AskUserQuestion Yes/No (plan D10/D11).
Yes path runs gstack-gbrain-install + `gbrain init --pglite --json`
with the same rollback-safe sequence. No path skips Steps 3/4/5/7.5.
- Step 10 verdict (Path 4): adds "Code search" row reflecting Step 4.5
choice. Updates "Transcripts" row to describe the new D11 routing
(artifacts repo → remote brain).
sync-gbrain/SKILL.md.tmpl:
- Step 1 split-engine prose: corrects the prior misleading claim that
"memory routes through whatever setup-gbrain configured, including
remote-MCP" (codex finding #3). Memory stage shells out to local
`gbrain import` in local-stdio mode; in remote-http mode it persists
to ~/.gstack/transcripts/ for the artifacts pipeline.
- Step 1.5 (NEW): local-engine pre-flight. STOP on no-cli, broken-config,
broken-db. Soft skip (continue with code+memory SKIP) on
missing-config + remote-http per plan D12. Surfaces actionable user
remediation message instead of the orchestrator crashing two stages
with ERR.
Regenerated SKILL.md for all hosts (claude, kiro, opencode, slate,
cursor, openclaw, hermes, gbrain). All 712 skill-validation + gen-skill-docs
tests pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(gbrain): .bak-rollback contract for Step 1.5 / 4.5 init failure path
Per plan D7 (rollback semantics) and codex #10 (rollback scope). The
/setup-gbrain skill instructs the model to follow a specific shell
sequence when running `gbrain init --pglite` against an existing
config:
1. mv ~/.gbrain/config.json ~/.gbrain/config.json.gstack-bak-<ts>
2. gbrain init --pglite --json
3. on non-zero exit: mv .bak back; surface error
This test verifies that contract using a fake `gbrain` binary that
fails on init. Three cases:
- FAILURE: gbrain init exits non-zero → broken config restored to
original path, no leftover .bak.
- SUCCESS: gbrain init exits 0 → new config in place, .bak survives
for audit (user reviews + deletes manually).
- SCOPE: any partial PGLite directory at ~/.gbrain/pglite/ is NOT
auto-cleaned. We only promise to restore config.json; PGLite
cleanup is the user's call (codex #10).
If the skill template rewrites this sequence in a future change, this
test should fail until the test's shell is updated too. That's the
point — keep the test and the skill template aligned.
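The three-step sequence and the FAILURE and SUCCESS cases above can be sketched in TypeScript. This is an illustration of the contract, not the skill's shell sequence or the test's fake binary: `initWithRollback` is a hypothetical helper, and the `runInit` callback stands in for `gbrain init --pglite --json` by returning an exit code.

```typescript
import { existsSync, mkdtempSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface InitOutcome {
  ok: boolean;
  backupPath: string | null; // where the old config ended up, if it survived
}

function initWithRollback(configPath: string, runInit: () => number, stamp: string): InitOutcome {
  const backupPath = `${configPath}.gstack-bak-${stamp}`;
  const hadConfig = existsSync(configPath);
  if (hadConfig) renameSync(configPath, backupPath); // step 1: move config aside
  const exitCode = runInit();                        // step 2: run init
  if (exitCode !== 0) {
    // Step 3: restore the original config. Only config.json is restored;
    // any partial PGLite directory is deliberately left untouched.
    if (hadConfig) renameSync(backupPath, configPath);
    return { ok: false, backupPath: null };
  }
  return { ok: true, backupPath: hadConfig ? backupPath : null };
}

// Usage: simulate a failing init against a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), "gbrain-demo-"));
const demoConfig = join(demoDir, "config.json");
writeFileSync(demoConfig, "{ broken");
const demoOutcome = initWithRollback(demoConfig, () => 1, "demo");
console.log(demoOutcome.ok, readFileSync(demoConfig, "utf-8"));
```

Passing the init step as a callback keeps the sketch testable without a real `gbrain` binary, which is the same idea as the fake binary used by the test.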
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(gbrain): periodic E2E for /setup-gbrain Path 4 + Step 4.5 Yes flow
End-to-end coverage of the new opt-in question via runAgentSdkTest.
Stubs the MCP endpoint at /tools/list with a 200 response carrying a
fake gbrain v0.32.3.0 serverInfo, and fakes the gbrain + claude CLIs
so init writes a PGLite config and mcp add succeeds. Asserts the model:
1. invokes gstack-gbrain-install (Step 4.5 Yes branch)
2. invokes `gbrain init --pglite --json`
3. writes a working ~/.gbrain/config.json with engine=pglite
4. registers the remote MCP via `claude mcp add --transport http`
5. never leaks the bearer token to CLAUDE.md
Classified as periodic-tier per plan D6 (codex #12 flagged AgentSDK
flakiness; gate-tier coverage of the split-engine behavior lives in the
deterministic unit tests at gbrain-local-status.test.ts and
gbrain-sync-skip.test.ts). Touchfile fires the test when the skill
template, install/verify/init helpers, the local-status classifier, or
the agent-sdk-runner harness changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(gbrain): bump migration to v1.35.0.0 after main merge
main shipped v1.34.0.0 (factory-export submodule) + v1.34.1.0 (update-check
hardening) while this branch was in flight. The migration file I named
v1.34.0.0.sh now belongs at v1.35.0.0 — the next minor on top of main,
matching the scale of split-engine work (new lib + orchestrator skip +
template overhaul + transcripts routing).
Renames the migration script and its test file; updates all internal
version references in both files. Behavior unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* perf(gbrain): memoize gbrain resolution + use --fast doctor in detect
Cuts detect's wall time substantially by sharing fork-exec results
between the helper that walks the JSON output and the localEngineStatus
classifier from lib/gbrain-local-status.ts.
Before: detect made 2x `command -v gbrain` calls (one in detect's
detectGbrain, one in the classifier's resolveGbrainBin) and 2x
`gbrain --version` calls. With memoization keyed on PATH, both
collapse to one fork each (~400ms saved per skill preamble).
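As an illustration of memoization keyed on PATH, here is a minimal sketch. `memoizeByPath` is a hypothetical helper, and the resolver below is a stand-in that only counts calls rather than forking `command -v gbrain`.

```typescript
// Cache one result per distinct PATH value, so two callers in the same
// process share a single lookup.
function memoizeByPath<T>(resolve: (pathEnv: string) => T): (pathEnv: string) => T {
  const cache = new Map<string, T>();
  return (pathEnv: string): T => {
    if (cache.has(pathEnv)) return cache.get(pathEnv) as T;
    const value = resolve(pathEnv);
    cache.set(pathEnv, value);
    return value;
  };
}

// Stand-in resolver: a real one would spawn a process, which is the cost
// the memoization avoids.
let lookups = 0;
const resolveBin = memoizeByPath((pathEnv: string) => {
  lookups += 1;
  return pathEnv.split(":")[0] + "/gbrain";
});

resolveBin("/usr/local/bin:/usr/bin");
resolveBin("/usr/local/bin:/usr/bin"); // served from the cache
console.log(lookups);
```

Keying on the PATH string means a changed PATH naturally triggers a fresh lookup instead of returning a stale binary location.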
Also adds `--fast` to the `gbrain doctor --json` call in detect so a
broken-db config (Garry's repro) doesn't burn a full 5s timeout on the
doctor's DB-connection check. The classifier still probes the DB
directly via `gbrain sources list --json` for engine reachability —
that's `gbrain_local_status`, separate from the coarse
`gbrain_doctor_ok` summary flag.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(gbrain): relax E2E assertions to smoke-test contract
Per codex #12 (AgentSDK harness is non-deterministic): the E2E now
asserts the model followed the split-engine path WITHOUT requiring a
specific subcommand sequence. Three assertions:
1. AskUserQuestion was called (model reached interactive branches)
2. At least one of {gstack-gbrain-install, `gbrain init --pglite`,
`claude mcp add`} fired (model followed the skill, not a no-op)
3. The fake bearer token never leaked to CLAUDE.md (security regression)
Deterministic per-step coverage of the same flow lives in the gate-tier
unit tests (gbrain-local-status, gbrain-sync-skip, init-rollback,
upgrade-migration). The E2E exists to catch the "model can't follow
the skill at all" regression class, not to pin the exact tool sequence.
Test passes in 280s against the live Agent SDK.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* test(version): bump CLI smoke-test timeout to 15s (flaky at 5s under load)
The gstack-next-version integration smoke test spawns a child process
that does git operations + sibling-worktree probing. Wall time hovers
4-5s on M-series Macs; flakes at exactly 5001-5002ms when the test
suite runs under load (bun's parallel scheduling). Bumping per-test
timeout to 15s eliminates the flake without changing test logic.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: bump version and changelog (v1.37.0.0)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
682 lines
25 KiB
TypeScript
#!/usr/bin/env bun
/**
 * gstack-gbrain-sync — V1 unified sync verb.
 *
 * Orchestrates three storage tiers per plan §"Storage tiering":
 *
 *   1. Code (current repo) → `gbrain sources add` (idempotent via
 *      lib/gbrain-sources.ts) + `gbrain sync --strategy code`
 *      (incremental) or `gbrain reindex-code --yes` (--full).
 *      NEVER `gbrain import` (markdown only).
 *   2. Transcripts + curated memory → gstack-memory-ingest (typed put_page)
 *   3. Curated artifacts to git → gstack-brain-sync (existing pipeline)
 *
 * Modes:
 *   --incremental (default) — mtime fast-path; runs all 3 stages with cache hits
 *   --full — first-run; full walk + reindex; honest budget per ED2
 *   --dry-run — preview what would sync; no writes anywhere (incl. state file)
 *
 * Concurrency safety per /plan-eng-review D1:
 *   - Lock file at ~/.gstack/.sync-gbrain.lock (PID + start ts).
 *   - Stale-lock takeover after 5 min (process death).
 *   - State file written via tmp+rename for atomicity.
 *   - Lock released in finally; SIGINT/SIGTERM trapped for cleanup.
 *
 * --watch (V1.5 P0 TODO): file-watcher daemon. NOTE: gbrain v0.25.1 already
 * ships `gbrain sync --watch [--interval N]` and `gbrain sync --install-cron`;
 * when revisited, /sync-gbrain --watch wires through to the gbrain CLI rather
 * than building a gstack-side daemon.
 */

import { existsSync, statSync, mkdirSync, writeFileSync, readFileSync, unlinkSync, renameSync } from "fs";
import { join, dirname } from "path";
import { execSync, spawnSync } from "child_process";
import { homedir } from "os";
import { createHash } from "crypto";

import { detectEngineTier, withErrorContext, canonicalizeRemote } from "../lib/gstack-memory-helpers";
import { ensureSourceRegistered, sourcePageCount } from "../lib/gbrain-sources";
import { localEngineStatus, type LocalEngineStatus } from "../lib/gbrain-local-status";

// ── Types ──────────────────────────────────────────────────────────────────

type Mode = "incremental" | "full" | "dry-run";

interface CliArgs {
  mode: Mode;
  quiet: boolean;
  noCode: boolean;
  noMemory: boolean;
  noBrainSync: boolean;
  codeOnly: boolean;
}

interface CodeStageDetail {
  source_id?: string;
  source_path?: string;
  page_count?: number | null;
  last_imported?: string;
  status?: "ok" | "skipped" | "failed";
}

interface StageResult {
  name: string;
  ran: boolean;
  ok: boolean;
  duration_ms: number;
  summary: string;
  /** Stage-specific structured detail. Code stage carries source_id + page_count. */
  detail?: CodeStageDetail;
}

// ── Constants ──────────────────────────────────────────────────────────────

const HOME = homedir();
const GSTACK_HOME = process.env.GSTACK_HOME || join(HOME, ".gstack");
const STATE_PATH = join(GSTACK_HOME, ".gbrain-sync-state.json");
const LOCK_PATH = join(GSTACK_HOME, ".sync-gbrain.lock");
const STALE_LOCK_MS = 5 * 60 * 1000;

// ── CLI ────────────────────────────────────────────────────────────────────

function printUsage(): void {
  console.error(`Usage: gstack-gbrain-sync [--incremental|--full|--dry-run] [options]

Modes:
  --incremental     Default. mtime fast-path; ~50ms steady-state.
  --full            First-run; full walk + reindex. Honest ~25-35 min for big Macs (ED2).
  --dry-run         Preview what would sync; no writes anywhere.

Options:
  --quiet           Suppress per-stage output.
  --no-code         Skip the cwd code-import stage.
  --no-memory       Skip the gstack-memory-ingest stage (transcripts + artifacts).
  --no-brain-sync   Skip the gstack-brain-sync git pipeline stage.
  --code-only       Only run the code-import stage (alias for --no-memory --no-brain-sync).
  --help            This text.

Stages run in order: code → memory ingest → curated git push.
Each stage failure is non-fatal; subsequent stages still run.
`);
}

function parseArgs(): CliArgs {
  const args = process.argv.slice(2);
  let mode: Mode = "incremental";
  let quiet = false;
  let noCode = false;
  let noMemory = false;
  let noBrainSync = false;
  let codeOnly = false;

  for (let i = 0; i < args.length; i++) {
    const a = args[i];
    switch (a) {
      case "--incremental": mode = "incremental"; break;
      case "--full": mode = "full"; break;
      case "--dry-run": mode = "dry-run"; break;
      case "--quiet": quiet = true; break;
      case "--no-code": noCode = true; break;
      case "--no-memory": noMemory = true; break;
      case "--no-brain-sync": noBrainSync = true; break;
      case "--code-only":
        codeOnly = true;
        noMemory = true;
        noBrainSync = true;
        break;
      case "--help":
      case "-h":
        printUsage();
        process.exit(0);
      default:
        console.error(`Unknown argument: ${a}`);
        printUsage();
        process.exit(1);
    }
  }

  return { mode, quiet, noCode, noMemory, noBrainSync, codeOnly };
}

// ── Helpers ────────────────────────────────────────────────────────────────

function repoRoot(): string | null {
  try {
    const out = execSync("git rev-parse --show-toplevel", { encoding: "utf-8", timeout: 2000 });
    return out.trim();
  } catch {
    return null;
  }
}

function originUrl(): string | null {
  try {
    const out = execSync("git remote get-url origin", { encoding: "utf-8", timeout: 2000 });
    return out.trim();
  } catch {
    return null;
  }
}

/**
 * Derive a worktree-aware source id for the cwd code corpus.
 *
 * Pattern: `gstack-code-<slug>-<pathhash8>` where slug comes from origin
 * (org/repo) and pathhash8 is the first 8 hex chars of sha1(absolute repo
 * path). The pathhash8 is what makes Conductor worktrees of the same repo
 * coexist as separate sources in the same gbrain DB instead of stomping on
 * each other.
 *
 * Falls back to the repo basename when there is no origin (local repo).
 *
 * gbrain enforces source ids to be 1-32 lowercase alnum chars with
 * optional interior hyphens. `constrainSourceId` handles the 32-char cap
 * with a hashed-tail fallback when the combined slug exceeds budget.
 */
function deriveCodeSourceId(repoPath: string): string {
  const pathHash = createHash("sha1").update(repoPath).digest("hex").slice(0, 8);
  const remote = canonicalizeRemote(originUrl());
  if (remote) {
    const segs = remote.split("/").filter(Boolean);
    const slugSource = segs.slice(-2).join("-");
    return constrainSourceId("gstack-code", `${slugSource}-${pathHash}`);
  }
  const base = repoPath.split("/").pop() || "repo";
  return constrainSourceId("gstack-code", `${base}-${pathHash}`);
}

/**
 * Pre-pathhash source id, kept for orphan detection only.
 *
 * Earlier /sync-gbrain versions registered `gstack-code-<slug>` (no pathhash
 * suffix). On a multi-worktree repo, those collapsed onto a single source id
 * with last-sync-wins semantics. The new path-keyed id leaves the legacy
 * source orphaned in the brain — federated cross-source search would return
 * stale duplicate hits. We remove the legacy id once, on the first new-format
 * sync from any worktree of this repo, so users don't accumulate orphans.
 */
function deriveLegacyCodeSourceId(repoPath: string): string {
  const remote = canonicalizeRemote(originUrl());
  if (remote) {
    const segs = remote.split("/").filter(Boolean);
    const slugSource = segs.slice(-2).join("-");
    return constrainSourceId("gstack-code", slugSource);
  }
  const base = repoPath.split("/").pop() || "repo";
  return constrainSourceId("gstack-code", base);
}

/**
 * Build a gbrain-valid source id (1-32 lowercase alnum + interior hyphens). Sanitizes
 * `raw`, prefixes with `prefix`, and falls back to a hashed-tail form when total length
 * would exceed 32 chars.
 */
function constrainSourceId(prefix: string, raw: string): string {
  const MAX = 32;
  const slug = raw.toLowerCase().replace(/[^a-z0-9]+/g, "-").replace(/^-+|-+$/g, "");
  // Empty slug after sanitize (e.g. raw was all non-alnum like "___") would
  // produce "${prefix}-" which fails gbrain's validator on the trailing
  // hyphen. Fall back to a deterministic hash of the original input so the
  // result is stable across runs of the same repo.
  if (!slug) {
    const hash = createHash("sha1").update(raw || "_empty").digest("hex").slice(0, 6);
    return `${prefix}-${hash}`;
  }
  const full = `${prefix}-${slug}`;
  if (full.length <= MAX) return full;
  const hash = createHash("sha1").update(slug).digest("hex").slice(0, 6);
  // Total budget: prefix + "-" + tail + "-" + hash
  const tailBudget = MAX - prefix.length - 2 - hash.length;
  if (tailBudget < 1) return `${prefix}-${hash}`;
  const tail = slug.slice(-tailBudget).replace(/^-+|-+$/g, "");
  return tail ? `${prefix}-${tail}-${hash}` : `${prefix}-${hash}`;
}

function gbrainAvailable(): boolean {
  try {
    execSync("command -v gbrain", { stdio: "ignore" });
    return true;
  } catch {
    return false;
  }
}

// ── Lock file (D1) ─────────────────────────────────────────────────────────

interface LockInfo {
  pid: number;
  started_at: string;
}

function acquireLock(): boolean {
  mkdirSync(GSTACK_HOME, { recursive: true });
  if (existsSync(LOCK_PATH)) {
    // Check if stale.
    try {
      const stat = statSync(LOCK_PATH);
      const ageMs = Date.now() - stat.mtimeMs;
      if (ageMs > STALE_LOCK_MS) {
        // Stale; take over.
        unlinkSync(LOCK_PATH);
      } else {
        return false;
      }
    } catch {
      // Cannot stat; bail conservatively.
      return false;
    }
  }
  const info: LockInfo = { pid: process.pid, started_at: new Date().toISOString() };
  try {
    writeFileSync(LOCK_PATH, JSON.stringify(info), { encoding: "utf-8", flag: "wx" });
    return true;
  } catch {
    return false;
  }
}

function releaseLock(): void {
  try {
    if (!existsSync(LOCK_PATH)) return;
    const raw = readFileSync(LOCK_PATH, "utf-8");
    const info = JSON.parse(raw) as LockInfo;
    if (info.pid === process.pid) {
      unlinkSync(LOCK_PATH);
    }
  } catch {
    // Best-effort cleanup.
  }
}

// ── Stage runners ──────────────────────────────────────────────────────────

/**
 * Build a SKIP result for the code/memory stage when the local engine is
 * not in 'ok' state (per plan D12). Surface the status verbatim so the
 * verdict block tells the user exactly what's wrong without re-probing.
 *
 * Reasons mapped to user-actionable summaries:
 *   no-cli         → "gbrain CLI not on PATH; install via /setup-gbrain"
 *   missing-config → "no local engine; run /setup-gbrain to add local PGLite for code search"
 *   broken-config  → "config at ~/.gbrain/config.json is malformed; see /setup-gbrain Step 1.5"
 *   broken-db      → "config points at unreachable DB; see /setup-gbrain Step 1.5"
 */
function skipStageForLocalStatus(
  stage: "code" | "memory",
  status: LocalEngineStatus,
  t0: number,
): StageResult {
  const reasons: Record<Exclude<LocalEngineStatus, "ok">, string> = {
    "no-cli": "gbrain CLI not on PATH; install via /setup-gbrain",
    "missing-config":
      "no local engine; run /setup-gbrain to add local PGLite for code search",
    "broken-config":
      "config at ~/.gbrain/config.json is malformed; see /setup-gbrain Step 1.5",
    "broken-db":
      "config points at unreachable DB; see /setup-gbrain Step 1.5",
  };
  const reason = reasons[status as Exclude<LocalEngineStatus, "ok">];
  return {
    name: stage,
    ran: false,
    ok: true, // SKIP (per D12) — not a stage failure, just an unsatisfied prerequisite
    duration_ms: Date.now() - t0,
    summary: `skipped — local engine ${status} — ${reason}`,
  };
}

async function runCodeImport(args: CliArgs): Promise<StageResult> {
  const t0 = Date.now();
  const root = repoRoot();
  if (!root) {
    return { name: "code", ran: false, ok: true, duration_ms: 0, summary: "skipped (not in git repo)" };
  }
  if (!gbrainAvailable()) {
    return { name: "code", ran: false, ok: false, duration_ms: 0, summary: "skipped (gbrain CLI not in PATH)" };
  }

  const sourceId = deriveCodeSourceId(root);

  // dry-run preview always shows the would-do steps, regardless of local
  // engine state. Useful for "what would /sync-gbrain do" without probing
  // the engine.
  if (args.mode === "dry-run") {
    return {
      name: "code",
      ran: false,
      ok: true,
      duration_ms: 0,
      summary: `would: gbrain sources add ${sourceId} --path ${root} --federated; gbrain sync --strategy code --source ${sourceId}; gbrain sources attach ${sourceId}`,
      detail: { source_id: sourceId, source_path: root, status: "skipped" },
    };
  }

  // Split-engine pre-flight (per plan D12): when local engine is not ok, SKIP
  // code stage cleanly. Brain-sync stage still runs because it doesn't depend
  // on local engine. The /sync-gbrain Step 1.5 pre-flight surfaces the user
  // remediation message; this skip just keeps the orchestrator from crashing
  // when the local DB is dead. Skipped on --dry-run (above) since dry-run
  // never actually probes anything.
  const localStatus = localEngineStatus({ noCache: false });
  if (localStatus !== "ok") {
    return skipStageForLocalStatus("code", localStatus, t0);
  }

  // Step 0: Best-effort cleanup of pre-pathhash legacy source.
  // Earlier /sync-gbrain versions registered `gstack-code-<slug>` (no path
  // suffix). On a multi-worktree repo, those collapsed onto a single id
  // with last-sync-wins. Federated search would return stale duplicate
  // hits forever if we left the orphan in place. Remove the legacy id once
  // here so users don't accumulate orphans.
  // Failure is non-fatal — we still register the new id below.
  const legacyId = deriveLegacyCodeSourceId(root);
  let legacyRemoved = false;
  if (legacyId !== sourceId) {
    const rm = spawnSync("gbrain", ["sources", "remove", legacyId, "--confirm-destructive"], {
      encoding: "utf-8",
      timeout: 30_000,
      stdio: ["ignore", "pipe", "pipe"],
    });
    // Treat absent-source as success (clean state). gbrain emits "not found" on
    // missing id; treat any non-zero exit without "not found" as a soft fail.
    if (rm.status === 0) legacyRemoved = true;
  }

  // Step 1: Ensure source registered (idempotent). Single source of truth in lib —
  // no synchronous duplicate here (per /codex review #12).
  let registered = false;
  try {
    const result = await ensureSourceRegistered(sourceId, root, { federated: true });
    registered = result.changed;
  } catch (err) {
    return {
      name: "code",
      ran: true,
      ok: false,
      duration_ms: Date.now() - t0,
      summary: `source registration failed: ${(err as Error).message}`,
      detail: { source_id: sourceId, source_path: root, status: "failed" },
    };
  }

  // Step 2: Run sync or reindex.
  const syncArgs = args.mode === "full"
    ? ["reindex-code", "--source", sourceId, "--yes"]
    : ["sync", "--strategy", "code", "--source", sourceId];

  const syncResult = spawnSync("gbrain", syncArgs, {
    stdio: args.quiet ? ["ignore", "ignore", "ignore"] : ["ignore", "inherit", "inherit"],
    timeout: 35 * 60 * 1000,
  });

  if (syncResult.status !== 0) {
    return {
      name: "code",
      ran: true,
      ok: false,
      duration_ms: Date.now() - t0,
      summary: `gbrain ${syncArgs.join(" ")} exited ${syncResult.status}`,
      detail: { source_id: sourceId, source_path: root, status: "failed" },
    };
  }

  // Step 3: Pin this worktree's CWD to the source via .gbrain-source. Subsequent
  // gbrain code-def / code-refs / code-callers calls from anywhere under <root>
  // route to this source by default — no --source flag needed.
  //
  // If attach fails the whole flow has a silent correctness problem: sync
  // succeeded but unqualified `gbrain code-def` from this worktree will hit
  // the wrong/default source. Treat it as a stage failure (ok=false) so the
  // verdict block surfaces ERR and the user knows to retry rather than
  // trusting stale results.
  const attach = spawnSync("gbrain", ["sources", "attach", sourceId], {
    encoding: "utf-8",
    timeout: 10_000,
    cwd: root,
    stdio: ["ignore", "pipe", "pipe"],
  });
  const pageCount = sourcePageCount(sourceId);
  const legacyNote = legacyRemoved ? `, removed legacy ${legacyId}` : "";
  const baseSummary = `${registered ? "registered + " : ""}synced ${sourceId} (page_count=${pageCount ?? "unknown"}${legacyNote})`;

  if (attach.status !== 0) {
    const reason = (attach.stderr || attach.stdout || "").trim().split("\n").pop() || `exit ${attach.status}`;
    return {
      name: "code",
      ran: true,
      ok: false,
      duration_ms: Date.now() - t0,
      summary: `${baseSummary}; attach FAILED (${reason}) — code-def queries from this worktree will hit the default source until /sync-gbrain succeeds`,
      detail: {
        source_id: sourceId,
        source_path: root,
        page_count: pageCount,
        last_imported: new Date().toISOString(),
        status: "failed",
      },
    };
  }

  return {
    name: "code",
    ran: true,
    ok: true,
    duration_ms: Date.now() - t0,
    summary: baseSummary,
    detail: {
      source_id: sourceId,
      source_path: root,
      page_count: pageCount,
      last_imported: new Date().toISOString(),
      status: "ok",
    },
  };
}

function runMemoryIngest(args: CliArgs): StageResult {
  const t0 = Date.now();

  if (args.mode === "dry-run") {
    return { name: "memory", ran: false, ok: true, duration_ms: 0, summary: "would: gstack-memory-ingest --probe" };
  }

  // Split-engine pre-flight (per plan D12). gstack-memory-ingest shells out
  // to `gbrain import` which targets the LOCAL engine. When that engine is
  // not ok, SKIP cleanly so brain-sync (the only stage that doesn't depend
  // on local engine) still runs.
  const localStatus = localEngineStatus({ noCache: false });
  if (localStatus !== "ok") {
    return skipStageForLocalStatus("memory", localStatus, t0);
  }

  const ingestPath = join(import.meta.dir, "gstack-memory-ingest.ts");
  const ingestArgs = ["run", ingestPath];
  if (args.mode === "full") ingestArgs.push("--bulk");
  else ingestArgs.push("--incremental");
  if (args.quiet) ingestArgs.push("--quiet");

  const result = spawnSync("bun", ingestArgs, {
    encoding: "utf-8",
    timeout: 35 * 60 * 1000,
  });

  // D6: parse [memory-ingest] lines from the child's stderr. ERR-prefixed
  // lines indicate a system-level failure (gbrain crashed or CLI missing)
  // and the child exits non-zero. Per-file failures are summarized in the
  // last non-ERR [memory-ingest] line but do NOT make the verdict ERR.
  const stderrLines = (result.stderr || "").split("\n");
  const memLines = stderrLines.filter((l) => l.includes("[memory-ingest]"));
  const errLine = memLines.find((l) => l.includes("[memory-ingest] ERR"));
  const lastMemLine = memLines.slice(-1)[0];
  const rawSummary = errLine || lastMemLine || "ingest pass complete";
  // Strip the "[memory-ingest] " prefix and any leading "ERR: " for cleaner
  // verdict output. The orchestrator's own formatStage will prefix with OK/ERR.
  const summary = rawSummary
    .replace(/^.*\[memory-ingest\]\s*/, "")
    .replace(/^ERR:\s*/, "");

  const ok = result.status === 0;
  return {
    name: "memory",
    ran: true,
    ok,
    duration_ms: Date.now() - t0,
    summary: ok
      ? summary
      : `${summary}${result.status === null ? " (killed by signal / timeout)" : ` (exit ${result.status})`}`,
  };
}

function runBrainSyncPush(args: CliArgs): StageResult {
  const t0 = Date.now();

  if (args.mode === "dry-run") {
    return { name: "brain-sync", ran: false, ok: true, duration_ms: 0, summary: "would: gstack-brain-sync --discover-new --once" };
  }

  const brainSyncPath = join(import.meta.dir, "gstack-brain-sync");
  if (!existsSync(brainSyncPath)) {
    return { name: "brain-sync", ran: false, ok: true, duration_ms: 0, summary: "skipped (gstack-brain-sync not installed)" };
  }

  spawnSync(brainSyncPath, ["--discover-new"], {
    stdio: args.quiet ? ["ignore", "ignore", "ignore"] : ["ignore", "inherit", "inherit"],
    timeout: 60 * 1000,
  });
  const result = spawnSync(brainSyncPath, ["--once"], {
    stdio: args.quiet ? ["ignore", "ignore", "ignore"] : ["ignore", "inherit", "inherit"],
    timeout: 60 * 1000,
  });

  return {
    name: "brain-sync",
    ran: true,
    ok: result.status === 0,
    duration_ms: Date.now() - t0,
    // spawnSync reports status === null when the child was killed by a signal
    // (including the timeout above), so avoid printing "exited null".
    summary:
      result.status === 0
        ? "curated artifacts pushed"
        : result.status === null
          ? "gstack-brain-sync killed by signal / timeout"
          : `gstack-brain-sync exited ${result.status}`,
  };
}

// ── State file ─────────────────────────────────────────────────────────────

interface SyncState {
  schema_version: 1;
  last_writer: string;
  last_sync?: string;
  last_full_sync?: string;
  last_stages?: StageResult[];
}

function loadSyncState(): SyncState {
  if (!existsSync(STATE_PATH)) {
    return { schema_version: 1, last_writer: "gstack-gbrain-sync" };
  }
  try {
    const raw = JSON.parse(readFileSync(STATE_PATH, "utf-8")) as SyncState;
    if (raw.schema_version === 1) return raw;
  } catch {
    // fall through
  }
  return { schema_version: 1, last_writer: "gstack-gbrain-sync" };
}

/**
 * Atomic state file write per /plan-eng-review D1: write tmp file then rename.
 * rename(2) is atomic on POSIX filesystems. The tmp file is created next to
 * STATE_PATH, so the rename never crosses a filesystem boundary.
 */
function saveSyncState(state: SyncState): void {
  try {
    mkdirSync(dirname(STATE_PATH), { recursive: true });
    const tmp = `${STATE_PATH}.tmp.${process.pid}`;
    writeFileSync(tmp, JSON.stringify(state, null, 2), "utf-8");
    renameSync(tmp, STATE_PATH);
  } catch {
    // non-fatal
  }
}

// ── Output ─────────────────────────────────────────────────────────────────

function formatStage(s: StageResult): string {
  const status = !s.ran ? "SKIP" : s.ok ? "OK" : "ERR";
  const dur = s.duration_ms > 0 ? ` (${(s.duration_ms / 1000).toFixed(1)}s)` : "";
  return ` ${status.padEnd(5)} ${s.name.padEnd(12)} ${s.summary}${dur}`;
}

// ── Main ───────────────────────────────────────────────────────────────────

async function main(): Promise<void> {
  const args = parseArgs();

  if (!args.quiet) {
    const engine = detectEngineTier();
    console.error(`[gbrain-sync] mode=${args.mode} engine=${engine.engine}`);
  }

  // Acquire lock (skip on dry-run since dry-run never writes).
  const needsLock = args.mode !== "dry-run";
  let haveLock = false;
  if (needsLock) {
    haveLock = acquireLock();
    if (!haveLock) {
      console.error(
        `[gbrain-sync] another /sync-gbrain is running (lock at ${LOCK_PATH}). ` +
        `If that process died, the lock auto-clears after 5 min, or remove it manually.`
      );
      process.exit(2);
    }
  }

  const cleanup = () => {
    if (haveLock) releaseLock();
  };
  process.on("SIGINT", () => { cleanup(); process.exit(130); });
  process.on("SIGTERM", () => { cleanup(); process.exit(143); });

  let exitCode = 0;
  try {
    const state = loadSyncState();
    const stages: StageResult[] = [];

    if (!args.noCode) {
      stages.push(await withErrorContext("sync:code", () => runCodeImport(args), "gstack-gbrain-sync"));
    }
    if (!args.noMemory) {
      stages.push(await withErrorContext("sync:memory", () => runMemoryIngest(args), "gstack-gbrain-sync"));
    }
    if (!args.noBrainSync) {
      stages.push(await withErrorContext("sync:brain-sync", () => runBrainSyncPush(args), "gstack-gbrain-sync"));
    }

    if (args.mode !== "dry-run") {
      state.last_sync = new Date().toISOString();
      if (args.mode === "full") state.last_full_sync = state.last_sync;
      state.last_stages = stages;
      saveSyncState(state);
    }

    if (!args.quiet || args.mode === "dry-run") {
      console.log(`\ngstack-gbrain-sync (${args.mode}):`);
      for (const s of stages) console.log(formatStage(s));
      // Skipped stages report ran=false with ok=true, so count a stage as ok
      // only when it actually ran; otherwise "skipped" would always print 0.
      const okCount = stages.filter((s) => s.ran && s.ok).length;
      const errCount = stages.filter((s) => !s.ok && s.ran).length;
      console.log(`\n ${okCount} ok, ${errCount} error, ${stages.length - okCount - errCount} skipped`);
    }

    const anyError = stages.some((s) => s.ran && !s.ok);
    exitCode = anyError ? 1 : 0;
  } finally {
    cleanup();
  }

  process.exit(exitCode);
}

main().catch((err) => {
  console.error(`gstack-gbrain-sync fatal: ${err instanceof Error ? err.message : String(err)}`);
  releaseLock();
  process.exit(1);
});