mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-18 10:31:30 +08:00
* fix(gbrain-sync): fold hostname into code-source id hash + migration (#1414)

Cherry-picked from #1468 by 0xDevNinja and extended with the hostname-fold migration that codex review surfaced.

Pre-fix `deriveCodeSourceId` hashed the absolute repo path alone, so two machines with identical home-dir layouts (chezmoi-managed dotfiles, ansible-provisioned VMs) derived the same id and clobbered each other's `local_path` in a federated brain. Last-writer-wins, with cryptic "Not a git repository" errors on the loser.

Hash key is now `${hostname}::${path}`. Conductor worktrees on a single host stay distinct (path entropy unchanged within a host); cross-machine federations stop colliding.

Migration (D1=B + codex refinements): every existing user has a pre-#1468 path-only-hash source id in their brain that no longer matches what `deriveCodeSourceId` produces. Without migration, the next sync registers a fresh source and orphans the old one. This commit adds:

- `derivePathOnlyHashLegacyId`: separate helper for the pre-#1468 form. Distinct from `deriveLegacyCodeSourceId` (pre-pathhash v1.x form); both probes run.
- `planHostnameFoldMigration`: feature-checks `gbrain sources rename <old> <new>` (exact argument shape, not just `--help`), gates on path-drift (skip migration if old source's `local_path` differs from current repo root), and falls back to register-new + sync-OK + remove-old when rename is unsupported. As of gbrain 0.35.0.0 the rename subcommand does not exist, so users go through the cleanup path; the rename path stays dormant until gbrain ships it.
- `removeOrphanedSource`: called only AFTER new-source sync verifies page_count > 0. Closes the data-loss window codex flagged where "register new, remove old before sync" can wipe pages if sync fails.
- `sourceLocalPath`: looks up a source's `local_path` from `gbrain sources list --json` for the drift gate.
- Helpers accept an optional `env` parameter so tests can inject a gbrain shim via PATH without process-wide PATH mutation (Bun's spawnSync doesn't pick up runtime PATH changes). Pre-positions for commit 4's centralized gbrain-exec helper.
- `if (import.meta.main)` guard around `main()` so the helpers can be imported for in-process unit tests.

Tests cover: pure derivation, ids-match degenerate case, no-legacy short-circuit, path-drift skip path, rename path with shim, cleanup fallback when rename unsupported, cleanup fallback when rename call itself fails, source-lookup happy/missing/error paths.

`GSTACK_HOSTNAME` env var is a test-only knob; production uses `os.hostname()`.

Fixes #1414

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(gbrain-sync): cut source-id slugs on hyphen boundaries (+ #1357)

Cherry-picked from #1481 by drummerms and extended with the explicit HTTPS-remote regression case for #1357 (decision D2=A).

`constrainSourceId` truncated the slug with `slug.slice(-tailBudget)`, which cut mid-word when the boundary fell inside a token. For a repo where the combined `prefix-org-repo-pathhash` exceeded 32 chars, this produced embarrassing artifacts like `gstack-code-kill-270c0001-c32152` (from `drummerms-av-sow-wiz-skill-270c0001`).

Two changes carried from #1481, adapted for the #1468 hostpathhash:

1. `constrainSourceId` now walks hyphen-separated tokens from the right, accumulating whole tokens until adding the next would exceed `tailBudget`. When no token fits, falls through to the existing `${prefix}-${hash}` form.
2. `deriveCodeSourceId` now retries with `repo-only-hostpathhash` (dropping the org segment) when the full `org-repo-hostpathhash` triggers truncation. Keeps the repo name readable when it fits at all.

Plus a new test asserting the source id is period-free for the exact HTTPS-with-.git remote shape from #1357 (`https://github.com/foo/bar.git`). canonicalizeRemote strips `.git`; the sanitizer strips any residual non-alnum. The test closes #1357 by pinning the property.

Closes #1357

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(gbrain): probe CLI without command builtin

* fix(gbrain-sync): centralize gbrain spawn surface + seed DATABASE_URL

Cherry-picked from #1508 by jasshultz, restructured per codex review #4 and #7 to widen scope and centralize the spawn surface.

The bug: gbrain auto-loads .env.local from cwd via dotenv. When /sync-gbrain runs inside a Next.js / Prisma / Rails project whose .env.local defines its own DATABASE_URL (pointing at the app's local DB), gbrain reads that value instead of its own ~/.gbrain/config.json. Auth fails, and the code + memory stages crash.

This commit:

- Adds lib/gbrain-exec.ts: buildGbrainEnv, spawnGbrain, execGbrainJson, execGbrainText, spawnGbrainAsync (the last one for memory-ingest's streaming gbrain import call). buildGbrainEnv seeds DATABASE_URL from ${GBRAIN_HOME:-$HOME/.gbrain}/config.json, returns a fresh env object (never the caller's by identity, per codex review #11), and honors the GSTACK_RESPECT_ENV_DATABASE_URL=1 escape hatch.
- Routes every gbrain spawn in bin/gstack-gbrain-sync.ts and bin/gstack-memory-ingest.ts through the helpers. Both files now own zero direct `spawnSync("gbrain"` | `spawn("gbrain"` | `execFileSync("gbrain"` call sites.
- Threads buildGbrainEnv into the spawnSync("bun", [memory-ingest], ...) grandchild in runMemoryIngest (codex review #7). Without this, the parent fix is half-baked: the bun child inherits a clean env but needs DATABASE_URL pre-seeded too. spawnGbrainAsync inside memory-ingest provides defense in depth for standalone invocations.
- Adds GBRAIN_HOME support, which aligns with detectEngineTier (already honors GBRAIN_HOME) so all gstack-side gbrain calls agree on which config file matters. Resolves baseEnv.HOME first, then homedir(), so test injection works without process-wide HOME mutation.
- Adds test/build-gbrain-env.test.ts: 10 unit tests covering all five env-seeding branches (seed from config / override caller / GSTACK_RESPECT escape hatch / missing config / unparseable config / no database_url field / GBRAIN_HOME path / object-identity guard / unrelated-vars preservation / idempotent-when-matches).
- Adds test/gbrain-exec-invariant.test.ts: static-source check that greps both bin/gstack-gbrain-sync.ts and bin/gstack-memory-ingest.ts for direct `spawnSync("gbrain"` | `spawn("gbrain"` | `execFileSync("gbrain"` | `execSync(...gbrain` matches and fails the build if any are found. Refactor-proof against future contributors adding a new gbrain spawn without env threading.

The invariant is intentionally narrow: only the two files where the DATABASE_URL bug actually hurts users are guarded. Migrating the spawn sites in lib/gbrain-local-status.ts, lib/gstack-memory-helpers.ts, and bin/gstack-brain-context-load.ts is a follow-up.

Co-Authored-By: Jason Shultz <jasshultz@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>

* fix(gbrain-sync): add .gbrain-source to consumer repo .gitignore (#1384)

The v1.29.0.0 changelog promised .gbrain-source would be added to the consuming repo's .gitignore so the per-worktree pin stays local, but the change actually only added it to gstack's own .gitignore. Without the consumer-side entry, the pin gets committed and Conductor sibling worktrees of the same repo + branch step on each other's pin every time anyone commits.

Add ensureGbrainSourceGitignored after a successful gbrain sources attach in runCodeImport. Idempotent on repeat runs (line-trim match), creates .gitignore if missing, logs a warning and continues on permission errors so a read-only checkout doesn't fail the sync.

Gate the top-level main() call behind import.meta.main so tests can import the helper without triggering a full sync run on module load.

Tests in test/gbrain-source-gitignore.test.ts cover: create-when-missing, append-without-trailing-newline, append-with-trailing-newline, idempotent on repeat, recognize whitespace-surrounded entry, no-throw on read-only file. 6 pass.

* fix(gbrain-sources): bump gbrain sources list --json timeout 10s → 30s

Supabase free-tier cold-starts can push `gbrain sources list --json` past 10s (observed 14.5s in the wild), causing probeSource() to throw ETIMEDOUT during /sync-gbrain code stage even though the underlying CLI was healthy. Matches the 30s ceiling already used by `sources add` / `sources remove` in the same file.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(brain-allowlist): sync project-root eng-review-test-plan artifacts (#1452)

Cherry-picked from #1465 by genisis0x and extended with the v1.40.0.0 upgrade migration that codex review #5 surfaced.

#1465 alone only patches bin/gstack-artifacts-init, which means fresh installs and re-inits pick up the new pattern. But existing users who already ran v1.38.1.0 have a `.migrations/v1.38.1.0.done` marker, and that migration won't re-run no matter what we change. So their installed `.brain-allowlist`, `.brain-privacy-map.json`, and `.gitattributes` stay without the new pattern, and `/plan-eng-review` artifacts continue to silently drop out of their federation queue.

This commit:

- bin/gstack-artifacts-init: adds projects/*/*-eng-review-test-plan-*.md to the three managed blocks. v1.38.1.0 covered design + test-plan; this completes the set for /plan-eng-review.
- gstack-upgrade/migrations/v1.40.0.0.sh: targeted in-place repair for existing installs. Same idempotent jq-based shape as v1.38.1.0. Adds the new pattern to .brain-allowlist (before the USER ADDITIONS marker), .brain-privacy-map.json (as class=artifact), and .gitattributes (as merge=union). NEVER commits + pushes; the user controls when the patches ship to their federated artifacts repo.
- test/artifacts-init-migration.test.ts: 5 new tests covering the v1.40.0.0 migration applied on top of a post-v1.38.1.0 state, jq patching, gitattributes append, idempotent re-run, and done-marker write when files are missing entirely.

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(gbrain-install): skip postinstall on Windows MSYS/MINGW + post-install probe

Cherry-picked from #1487 by genisis0x and extended with the post-install subcommand probe per T6 / codex review #19.

`bun install` in $INSTALL_DIR fails on Windows MSYS/MINGW/Cygwin shells because gbrain's native postinstall script mis-parses path arguments and aborts with a non-zero exit, breaking gstack-gbrain-install for Windows users running git-bash/MSYS2. The package installs cleanly without scripts.

This commit:

- Adds Windows shell detection via `uname -s` matching MINGW*/MSYS*/CYGWIN*/Windows_NT (#1487's case statement already covers all four; codex review #18 confirmed MINGW* is included). Windows paths get `bun install --ignore-scripts`; macOS and Linux unchanged.
- Adds a post-install probe of `gbrain sources --help`. `gbrain --version` already runs (D19 PATH-shadowing validation), but version success doesn't prove the subcommand surface is reachable, and `--ignore-scripts` may have skipped artifacts that subcommands need. Probe failure logs a clear warning (with Windows-specific remediation pointing at re-running `bun install` outside MSYS) but does NOT exit non-zero; users may still get value from gbrain even if the probe fails transiently.

Refs #1271

Co-Authored-By: Claude <noreply@anthropic.com>

* chore: v1.40.0.0 - gbrain sync hardening wave

Bumps VERSION 1.39.2.0 → 1.40.0.0 (MINOR: substantial gbrain capability hardening across sync pipeline, install path, federation allowlist; ~600 net LOC added across 8 community PRs + plan-review refinements).

CHANGELOG entry follows the release-summary format: two-line headline, lead paragraph, "numbers that matter" with before/after table across 8 user-visible surfaces, "what this means for builders" closer, itemized Added/Changed/Fixed/NOT fixed/For contributors sections.

Per-commit contributor credits: 0xDevNinja, drummerms, Jayesh Betala, Jason Shultz, genisis0x. Also names NikhileshNanduri and realcarsonterry in the wave's "Fixed" section for independent submissions of the .gbrain-source gitignore bug.

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: 0xDevNinja <manmit0x@gmail.com>
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: drummerms <mike@av2o.com>
Co-authored-by: Jayesh Betala <jayesh.betala7@gmail.com>
Co-authored-by: Jason Shultz <jasshultz@gmail.com>
Co-authored-by: genisis0x <manietdavv@gmail.com>
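
The hyphen-boundary truncation described for `constrainSourceId` can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the helper name `keepTrailingTokens` is hypothetical, and it only covers the "walk whole tokens from the right" step, returning `null` where the real code would fall through to its `${prefix}-${hash}` form.

```typescript
// Hypothetical helper illustrating whole-token truncation; the real
// constrainSourceId may differ in naming, budget handling, and fallback.
function keepTrailingTokens(slug: string, tailBudget: number): string | null {
  const tokens = slug.split("-").filter((t) => t.length > 0);
  const kept: string[] = [];
  let used = 0;
  // Walk tokens right-to-left, keeping each one only if it fits whole.
  for (const token of [...tokens].reverse()) {
    // Every token after the first kept one also costs a joining hyphen.
    const cost = token.length + (kept.length > 0 ? 1 : 0);
    if (used + cost > tailBudget) break;
    kept.unshift(token);
    used += cost;
  }
  // null signals "no whole token fits"; the caller picks a fallback form.
  return kept.length > 0 ? kept.join("-") : null;
}

const slug = "drummerms-av-sow-wiz-skill-270c0001";

// Old behaviour: a raw character slice cuts "skill" mid-word.
console.log(slug.slice(-13)); // → kill-270c0001

// Token walk with the same budget keeps only whole tokens.
console.log(keepTrailingTokens(slug, 13)); // → 270c0001
console.log(keepTrailingTokens(slug, 14)); // → skill-270c0001
console.log(keepTrailingTokens(slug, 5)); // → null
```

The budget of 13 is inferred from the artifact quoted above (`gstack-code-` plus `kill-270c0001` plus `-c32152` is 32 characters); treat it as an assumption about that one example rather than a constant from the code.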
175 lines
6.6 KiB
TypeScript
/**
 * Centralized gbrain CLI invocation.
 *
 * Every `gbrain ...` spawn from `bin/gstack-gbrain-sync.ts` and
 * `bin/gstack-memory-ingest.ts` MUST go through one of the helpers in
 * this file (`spawnGbrain`, `execGbrainJson`, `execGbrainText`, or
 * `spawnGbrainAsync`), and the invariant test
 * `test/gbrain-exec-invariant.test.ts` enforces this with a static-source
 * grep. The helper layer guarantees three properties:
 *
 * 1. **DATABASE_URL is seeded from gbrain's own config**, not from the
 *    caller's `.env.local`. gbrain auto-loads `.env.local` via dotenv on
 *    startup. When `/sync-gbrain` runs inside a Next.js / Prisma / Rails
 *    project with its own `DATABASE_URL`, gbrain reads that one and not
 *    its own `${GBRAIN_HOME:-$HOME/.gbrain}/config.json`. Auth fails;
 *    code + memory stages crash; only brain-sync's git push survives.
 *
 * 2. **Bun-aware env passing.** Mutating `process.env.DATABASE_URL` does
 *    NOT propagate to children of `child_process.spawnSync`/`spawn` in
 *    Bun; the child gets the original startup env. So we cannot just
 *    set process.env; we must thread an explicit `env:` dict to every
 *    spawn. This is the central bug the helper exists to prevent
 *    regressing on.
 *
 * 3. **`GBRAIN_HOME` honored consistently.** Other gstack helpers
 *    (`detectEngineTier`) already honor `GBRAIN_HOME`. `buildGbrainEnv`
 *    reads from `${GBRAIN_HOME:-$HOME/.gbrain}/config.json` so all
 *    gstack-side gbrain calls agree on which config file matters.
 *
 * **Escape hatch:** `GSTACK_RESPECT_ENV_DATABASE_URL=1` passes the
 * caller's env through as an unmodified copy. Use only when the brain
 * intentionally lives in the project's local DB (rare).
 */

import { existsSync, readFileSync } from "fs";
import { join } from "path";
import { homedir } from "os";
import {
  spawnSync,
  spawn,
  execFileSync,
  type SpawnSyncReturns,
  type ChildProcess,
  type SpawnOptions,
} from "child_process";

interface GbrainConfig {
  database_url?: string;
}

export interface BuildGbrainEnvOptions {
  /**
   * Caller env to extend. Defaults to `process.env`. Tests inject a
   * synthetic env so the helper can be exercised without polluting the
   * real process env.
   */
  baseEnv?: NodeJS.ProcessEnv;
  /**
   * When true, announce on stderr that we overrode the caller's
   * DATABASE_URL. Suppressed for the `--quiet` sync flow.
   */
  announce?: boolean;
}

/**
 * Build an env dict with DATABASE_URL seeded from
 * `${GBRAIN_HOME:-$HOME/.gbrain}/config.json`. Returns an unmodified
 * copy of the base env when:
 *   - `GSTACK_RESPECT_ENV_DATABASE_URL=1` (intentional opt-out),
 *   - the config file is missing or unparseable,
 *   - the config has no `database_url`,
 *   - the caller already set DATABASE_URL to the same value.
 *
 * Always returns a fresh object: mutating the returned env never
 * affects the caller's env. Tests assert on effective values, not
 * object identity.
 */
export function buildGbrainEnv(opts: BuildGbrainEnvOptions = {}): NodeJS.ProcessEnv {
  const baseEnv = opts.baseEnv || process.env;
  // Shallow copy so callers never receive their own env object back.
  const out: NodeJS.ProcessEnv = { ...baseEnv };
  if (baseEnv.GSTACK_RESPECT_ENV_DATABASE_URL === "1") return out;

  // Prefer the injected HOME so tests need no process-wide HOME mutation.
  const homeBase = baseEnv.HOME || homedir();
  const gbrainHome = baseEnv.GBRAIN_HOME || join(homeBase, ".gbrain");
  const configPath = join(gbrainHome, "config.json");
  if (!existsSync(configPath)) return out;

  let cfg: GbrainConfig = {};
  try {
    cfg = JSON.parse(readFileSync(configPath, "utf-8")) as GbrainConfig;
  } catch {
    // Unreadable or unparseable config: leave the env as the caller had it.
    return out;
  }
  if (!cfg.database_url) return out;
  if (baseEnv.DATABASE_URL === cfg.database_url) return out;

  const hadCaller = baseEnv.DATABASE_URL !== undefined;
  out.DATABASE_URL = cfg.database_url;
  if (opts.announce) {
    const note = hadCaller ? " (overrode value from caller env / .env.local)" : "";
    process.stderr.write(`[gbrain-exec] seeded DATABASE_URL from ${configPath}${note}\n`);
  }
  return out;
}

export interface SpawnGbrainOptions {
  /** Timeout in milliseconds. Defaults to 30s. */
  timeout?: number;
  /** Working directory for the child process. */
  cwd?: string;
  /** Stdio configuration. Defaults to capturing both stdout and stderr. */
  stdio?: "inherit" | "pipe" | "ignore" | Array<"inherit" | "pipe" | "ignore">;
  /**
   * Base env to extend before seeding DATABASE_URL. Defaults to
   * `process.env`. Tests inject a synthetic env so the spawn picks up a
   * gbrain shim on PATH and a fake `~/.gbrain/config.json`.
   */
  baseEnv?: NodeJS.ProcessEnv;
  /** Whether to announce DATABASE_URL seeding on stderr. */
  announce?: boolean;
}

/**
 * Spawn `gbrain <args>` with the seeded env. Returns the raw
 * `SpawnSyncReturns<string>` so callers can inspect `status`, `stdout`,
 * `stderr` exactly as they would with `spawnSync` directly.
 */
export function spawnGbrain(args: string[], opts: SpawnGbrainOptions = {}): SpawnSyncReturns<string> {
  return spawnSync("gbrain", args, {
    encoding: "utf-8",
    timeout: opts.timeout ?? 30_000,
    cwd: opts.cwd,
    stdio: opts.stdio || ["ignore", "pipe", "pipe"],
    env: buildGbrainEnv({ baseEnv: opts.baseEnv, announce: opts.announce }),
  });
}

/**
 * Run `gbrain <args>` and parse stdout as JSON. Returns `null` on
 * non-zero exit, parse failure, or timeout. Useful for `gbrain sources
 * list --json` and similar.
 */
export function execGbrainJson<T = unknown>(args: string[], opts: SpawnGbrainOptions = {}): T | null {
  const r = spawnGbrain(args, opts);
  if (r.status !== 0) return null;
  try {
    return JSON.parse(r.stdout || "null") as T;
  } catch {
    return null;
  }
}

/**
 * Async streaming variant for callers that need to attach stdout/stderr
 * listeners (e.g., `gbrain import` in `gstack-memory-ingest.ts`). Always
 * injects the seeded env. Returns the raw `ChildProcess` so the caller
 * can wire up its own promise around exit/timeout/signal handling.
 */
export function spawnGbrainAsync(
  args: string[],
  opts: { stdio?: SpawnOptions["stdio"]; cwd?: string; baseEnv?: NodeJS.ProcessEnv } = {},
): ChildProcess {
  return spawn("gbrain", args, {
    stdio: opts.stdio || ["ignore", "pipe", "pipe"],
    cwd: opts.cwd,
    env: buildGbrainEnv({ baseEnv: opts.baseEnv, announce: false }),
  });
}

/**
 * Run `gbrain <args>` via execFileSync. Throws on non-zero exit. Useful
 * for callers that want to surface gbrain's stderr as the error message.
 */
export function execGbrainText(args: string[], opts: SpawnGbrainOptions = {}): string {
  return execFileSync("gbrain", args, {
    encoding: "utf-8",
    timeout: opts.timeout ?? 30_000,
    cwd: opts.cwd,
    stdio: opts.stdio || ["ignore", "pipe", "pipe"],
    env: buildGbrainEnv({ baseEnv: opts.baseEnv, announce: opts.announce }),
  });
}
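
The env-seeding behaviour documented for `buildGbrainEnv` can be exercised without touching the real process env or a real `~/.gbrain`. The sketch below is a self-contained approximation: `seedDatabaseUrl` is a hypothetical, simplified stand-in for `buildGbrainEnv` (no `announce` handling, no same-value short-circuit), and the connection strings and temp-directory prefix are made up for illustration.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { homedir, tmpdir } from "node:os";
import { join } from "node:path";

type Env = Record<string, string | undefined>;

// Simplified stand-in for buildGbrainEnv (hypothetical name). It mirrors
// the documented branches: escape hatch, missing config, unparseable
// config, missing field, and the seeding case.
function seedDatabaseUrl(baseEnv: Env): Env {
  const out: Env = { ...baseEnv }; // always a fresh object
  if (baseEnv.GSTACK_RESPECT_ENV_DATABASE_URL === "1") return out;
  const gbrainHome = baseEnv.GBRAIN_HOME || join(baseEnv.HOME || homedir(), ".gbrain");
  const configPath = join(gbrainHome, "config.json");
  if (!existsSync(configPath)) return out;
  try {
    const cfg = JSON.parse(readFileSync(configPath, "utf-8")) as { database_url?: string };
    if (cfg.database_url) out.DATABASE_URL = cfg.database_url;
  } catch {
    // Unparseable config: leave the copied env untouched.
  }
  return out;
}

// Fake GBRAIN_HOME holding a config.json with a made-up database_url.
const fakeHome = mkdtempSync(join(tmpdir(), "gbrain-home-"));
writeFileSync(
  join(fakeHome, "config.json"),
  JSON.stringify({ database_url: "postgres://brain.example/db" }),
);

// Synthetic caller env, as if a project's .env.local had set DATABASE_URL.
const callerEnv: Env = { GBRAIN_HOME: fakeHome, DATABASE_URL: "postgres://localhost/app" };

const seeded = seedDatabaseUrl(callerEnv);
const respected = seedDatabaseUrl({ ...callerEnv, GSTACK_RESPECT_ENV_DATABASE_URL: "1" });

console.log(seeded.DATABASE_URL); // → postgres://brain.example/db
console.log(respected.DATABASE_URL); // → postgres://localhost/app
console.log(callerEnv.DATABASE_URL); // → postgres://localhost/app (caller env untouched)
```

Injecting the env as a parameter, rather than mutating `process.env`, is what makes the object-identity guard and the escape hatch testable in-process; the same pattern is why the spawn helpers accept `baseEnv`.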