mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-11 23:17:26 +08:00
* fix(codex): use resume-compatible flags

* fix: V-001 security vulnerability

  Automated security fix generated by Orbis Security AI.

* docs: align prompt-injection thresholds to security.ts (v1.6.4.0 catch-up)

  CLAUDE.md:290 and ARCHITECTURE.md:159 were missed when WARN was bumped 0.60 → 0.75 in d75402bb (v1.6.4.0, "cut Haiku classifier FP from 44% to 23%, gate now enforced", #1135). browse/src/security.ts:37 has WARN: 0.75, and BROWSER.md:743 was updated alongside that commit; CLAUDE.md and ARCHITECTURE.md still read 0.60. Also adds the SOLO_CONTENT_BLOCK: 0.92 entry to CLAUDE.md (already in security.ts:50 and BROWSER.md:745, missing from CLAUDE.md's threshold table).

  No code change. No behavior change. Pure doc-vs-code alignment.

  Verification:

      $ grep -n "WARN" browse/src/security.ts CLAUDE.md ARCHITECTURE.md BROWSER.md
      browse/src/security.ts:37: WARN: 0.75,
      CLAUDE.md:290: - `WARN: 0.75` ...
      ARCHITECTURE.md:159: ...>= `WARN` (0.75)...
      BROWSER.md:743: - `WARN: 0.75` ...

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: Korean/CJK IME input and rendering in Sidebar Terminal

  Fixes #1272. This commit addresses three separate Korean/CJK bugs in the Sidebar Terminal:

  **Bug 1 - IME Input**: Korean text typed via IME composition was not reaching the PTY correctly. Added compositionstart/compositionend event listeners to suppress partial jamo fragments and only send the final composed string.

  **Bug 2a - Font Rendering**: Added CJK monospace font fallbacks ("Noto Sans Mono CJK KR", "Malgun Gothic") to both the xterm.js fontFamily config and the CSS --font-mono variable. This ensures consistent cell-width calculations for Korean characters.

  **Bug 2b - UTF-8 Boundary Detection**: Added buffering logic to prevent multi-byte UTF-8 characters (Korean is 3 bytes) from being split across WebSocket chunks. This follows the same pattern as PR #1007, which fixed the sidebar-agent path, but extends it to the terminal-agent path.
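A minimal sketch of the Bug 2b chunk-boundary logic (illustrative only — the function name and shape are assumptions, not gstack's actual terminal-agent code): hold back a trailing incomplete UTF-8 sequence so it decodes together with the next WebSocket chunk.

```typescript
// Split a chunk into the decodable prefix and a trailing partial
// multi-byte UTF-8 sequence to be prepended to the next chunk.
// (Sketch — names are illustrative, not the shipped code.)
function splitCompleteUtf8(chunk: Buffer): { complete: Buffer; rest: Buffer } {
  const len = chunk.length;
  // Walk back over up to 3 continuation bytes (10xxxxxx).
  let back = 0;
  while (back < 3 && len - 1 - back >= 0 && (chunk[len - 1 - back] & 0b1100_0000) === 0b1000_0000) {
    back++;
  }
  const leadIdx = len - 1 - back;
  if (leadIdx < 0) return { complete: Buffer.alloc(0), rest: chunk };
  const lead = chunk[leadIdx];
  // Expected total byte length of the sequence starting at leadIdx.
  let need = 1;
  if ((lead & 0b1110_0000) === 0b1100_0000) need = 2;
  else if ((lead & 0b1111_0000) === 0b1110_0000) need = 3; // Hangul lives here
  else if ((lead & 0b1111_1000) === 0b1111_0000) need = 4;
  const have = 1 + back;
  if (have < need) {
    // Trailing sequence is incomplete: buffer it for the next chunk.
    return { complete: chunk.subarray(0, leadIdx), rest: chunk.subarray(leadIdx) };
  }
  return { complete: chunk, rest: Buffer.alloc(0) };
}
```

The caller concatenates `rest` onto the front of the next incoming chunk before decoding, so a 3-byte Hangul syllable split across two WebSocket frames never reaches the decoder as mojibake.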
  Special thanks to @ldybob for the excellent root-cause analysis and proposed solutions in issue #1272. Tested on WSL2 + Windows 11 with Korean IME.

* fix(ship): tighten Plan Completion gate (VAS-449 remediation)

  VAS-446 shipped with a PLAN.md acceptance criterion (domain-hq has /docs/dashboard.md) silently skipped. /ship's Plan Completion subagent existed at ship time (added in v1.4.1.0), but the gate let the failure through. Four structural fixes:

  1. Path concreteness rule: items naming a concrete filesystem path MUST be classified DONE/NOT DONE via [ -f <path> ], never UNVERIFIABLE.
  2. Validator detection: CONTENT-SHAPE items scan the target repo's package.json for validate-* scripts and run them before falling back to UNVERIFIABLE.
  3. Per-item UNVERIFIABLE confirmation: replaces the blanket "I've checked each one" with a per-item Y/N/D loop. The blanket-confirm path is the exact failure VAS-449 surfaced.
  4. Subagent fail-closed: if the Plan Completion subagent and the inline fallback both fail, surface an explicit AskUserQuestion instead of a silent pass. Replaces the prior "Never block /ship on subagent failure" fail-open.

  Locked in by test/ship-plan-completion-invariants.test.ts (5 assertions, no LLM dependency, ~60ms).

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(browse): bash.exe wrap for telemetry on Windows

  reportAttemptTelemetry() in browse/src/security.ts calls spawn(bin, args) where bin is the gstack-telemetry-log bash script. On Windows this fails silently with ENOENT — CreateProcess can't dispatch on shebang lines.

  Adopts v1.24.0.0's Bun.which + GSTACK_*_BIN override pattern (from browse/src/claude-bin.ts:resolveClaudeCommand, introduced in #1252) for resolving bash.exe. resolveBashBinary() honors a GSTACK_BASH_BIN absolute-path or PATH-resolvable override, falling back to Bun.which('bash'), which finds Git Bash on the standard Windows install. buildTelemetrySpawnCommand() wraps the script invocation on win32 only; the POSIX path is bit-identical.
  Returns null when bash can't be resolved on Windows so the caller skips the spawn — the local attempts.jsonl audit trail keeps working without surfacing a Windows-only failure.

  8 new unit tests cover resolveBashBinary (POSIX bash, absolute override, quote-stripping, BASH_BIN fallback, empty-PATH null) and buildTelemetrySpawnCommand (POSIX pass-through, win32 bash wrap, win32 null on unresolvable, arg-array immutability). The POSIX path is bit-identical — Bun.which('bash') on Linux/macOS returns the same /bin/bash or /usr/bin/bash that the old hardcoded spawn relied on.

* fix(make-pdf): Bun.which-based binary resolution for browse + pdftotext on Windows

  Extends v1.24.0.0's Bun.which + GSTACK_*_BIN override pattern (introduced in browse/src/claude-bin.ts via #1252) to the two other binary resolvers in the codebase: make-pdf/src/browseClient.ts:resolveBrowseBin and make-pdf/src/pdftotext.ts:resolvePdftotext. Same Windows quirks (fs.accessSync(X_OK) degrades to an existence check; `which` isn't available outside Git Bash; bun --compile --outfile X emits X.exe), same Bun.which-based fix shape, same env override convention.

  Changes:
  - GSTACK_BROWSE_BIN / GSTACK_PDFTOTEXT_BIN as the v1.24-aligned overrides; BROWSE_BIN / PDFTOTEXT_BIN remain as back-compat aliases.
  - Bun.which() replaces execFileSync('which', ...) for PATH lookup. Handles Windows PATHEXT natively; no more `where`-vs-`which` branch.
  - findExecutable(base) helper exported from each module, probes .exe/.cmd/.bat after the bare-path miss on win32. Linux/macOS behavior is bit-identical (isExecutable short-circuits before the win32 branch ever runs).
  - macCandidates renamed posixCandidates (it always was — /opt/homebrew, /usr/local, /usr/bin). No Windows candidates added; Poppler installs scatter across Scoop/Chocolatey/portable zips, and guessing causes false positives.
  - Error messages get a Windows install hint (scoop install poppler / oschwartz10612) and a `setx` example for GSTACK_*_BIN.
  - Pre-existing test 'honors BROWSE_BIN when it points at a real executable' was hardcoded to /bin/sh — made cross-platform via a REAL_EXE constant (cmd.exe on win32, /bin/sh on POSIX). This was a Windows-CI blocker on its own.

  Coordination: PR #1094 (@BkashJEE) covered browseClient.ts independently with a narrower scope; this PR's pdftotext + cross-platform tests + GSTACK_*_BIN naming are additive. Either order of merge works.

  Test plan:
  - bun test make-pdf/test/browseClient.test.ts make-pdf/test/pdftotext.test.ts on win32 — 29 pass, 0 fail (12 new assertions: findExecutable POSIX/win32/null, resolveBrowseBin GSTACK_BROWSE_BIN + BROWSE_BIN + precedence + quote-strip, same shape for resolvePdftotext + Windows install hint in the error message).
  - POSIX branch unchanged — fs.accessSync(X_OK) on Linux/macOS short-circuits before any win32 logic runs, matching the v1.24 claude-bin.ts pattern.

* fix(browse): NTFS ACL hardening for Windows state files via icacls

  gstack's ~/.gstack/ state directory holds bearer tokens, canary tokens, agent queue contents (with prompt history), session state, security-decision logs, and saved cookie bundles — all written with { mode: 0o600 } / 0o700. On Windows, those mode bits are a silent no-op: Node's fs module doesn't translate POSIX modes to NTFS ACLs, and inherited ACLs leave every "restricted" file readable by other principals on the machine (verified via icacls — six ACEs, the intended user is the LAST of six).

  The threat model is non-trivial on:
  - Self-hosted CI runners (a different service account on the same Windows box can read developer tokens, canary tokens, prompt history)
  - Shared development machines (agencies, studios, lab environments)
  - Multi-tenant servers with shared home directories

  Orthogonal to v1.24.0.0's binary-resolution work — complementary at the write side.
  v1.24's bin/gstack-paths resolves ~/.gstack/ correctly across plugin / global / local installs; this PR ensures files written into those resolved paths actually get the POSIX 0o600 semantic translated to NTFS.

  The fix:
  - New browse/src/file-permissions.ts (158 LOC, 5 public functions + 1 test-reset). restrictFilePermissions / restrictDirectoryPermissions wrap chmod (POSIX) or icacls /inheritance:r /grant:r <user>:(F) (Windows). writeSecureFile / appendSecureFile / mkdirSecure are drop-in wrappers for the common patterns.
  - 19 call sites converted across 9 source files: browser-manager.ts, browser-skill-write.ts, cli.ts, config.ts, meta-commands.ts, security-classifier.ts, security.ts (4 sites), server.ts (5 sites), terminal-agent.ts (8 sites), tunnel-denial-log.ts.
  - (OI)(CI) inheritance flags on directories mean files created via fs.write* *inside* an mkdirSecure-created dir inherit the owner-only ACL automatically — important for tunnel-denial-log.ts, where appends use async fsp.appendFile.

  Error handling: icacls failures (nonexistent path, missing icacls.exe, hardened environments) log a one-shot warning to stderr and proceed. Once-per-process gating prevents log spam if the condition persists. The filesystem stays functional; the file just ends up with inherited ACLs.
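The write-side shape described above, reduced to a minimal sketch (assumption: the shipped file-permissions.ts differs in detail — the once-per-process warning gate and the append/mkdir variants are omitted here):

```typescript
import { spawnSync } from 'node:child_process';
import * as fs from 'node:fs';
import * as os from 'node:os';

// Sketch of an owner-only secure write: POSIX gets real mode bits,
// Windows gets an explicit NTFS ACL via icacls, since { mode } is a
// no-op there. (Illustrative — not gstack's file-permissions.ts verbatim.)
function writeSecureFile(filePath: string, data: string | Buffer): void {
  if (process.platform === 'win32') {
    fs.writeFileSync(filePath, data); // mode bits would be silently ignored
    const user = os.userInfo().username;
    // Strip inherited ACEs, grant full control to the current user only.
    const r = spawnSync('icacls', [filePath, '/inheritance:r', '/grant:r', `${user}:(F)`]);
    if (r.status !== 0) {
      console.error(`[warn] icacls failed for ${filePath}; file keeps inherited ACLs`);
    }
  } else {
    fs.writeFileSync(filePath, data, { mode: 0o600 });
  }
}
```

Note that `{ mode }` only applies when the file is created, so a hardened helper would also re-chmod existing files; the sketch keeps only the core platform branch.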
  Test plan:
  - bun test browse/test/file-permissions.test.ts — 13 pass, 0 fail (POSIX mode-bit assertions, Windows no-throw, mkdir idempotence, recursive creation, Buffer payloads, append-creates-then-reapplies-once semantics)
  - bun test browse/test/security.test.ts — 38 pass, 0 fail (existing security test suite plus the bash-binary resolution tests added in fix #1119; the converted writeFileSync/appendFileSync/mkdirSync sites in security.ts integrate cleanly)
  - Empirical icacls before/after on a real file — 6 ACEs → 1 ACE
  - bun build typecheck on all modified files — clean (server.ts has a pre-existing playwright-core/electron resolution issue unrelated to this PR)

  POSIX behavior is bit-identical to the old code — fs.chmodSync(path, 0o6XX) on the helper's POSIX branch matches the inline { mode: 0o6XX } it replaces. Linux and macOS see no behavior change.

  Inviting pushback on three judgment calls (in the PR description):
  1. icacls vs an npm library
  2. ACL scope — just the user, or user + SYSTEM?
  3. Graceful degradation — once-per-process warn, not silent, not hard-fail.

* fix(browse): declare lastConsoleFlushed to restore console-log persistence

  flushBuffers() references a `lastConsoleFlushed` cursor at server.ts:337 and assigns it at :344, but the `let lastConsoleFlushed = 0;` declaration is missing — only the network and dialog siblings are declared at lines 327-328. Result: every 1-second flushBuffers tick (line 376) throws `ReferenceError: lastConsoleFlushed is not defined`, gets swallowed by the catch at line 369 ("[browse] Buffer flush failed: ..."), and the console branch's append never runs. browse-console.log has never been written in any production deployment since this regressed.

  Discovered by stress-testing the daemon with 15 concurrent CLIs against cold state — the race surfaced the buffer-flush error spam in one spawned daemon's stderr.
  Verified by running the daemon against a real file:// page with console.log events: the in-memory `browse console` returns the entries, but `.gstack/browse-console.log` is never created on disk.

  Regression introduced by 1a100a2a "fix: eliminate duplicate command sets in chain, improve flush perf and type safety" — the flush refactor switched from `Bun.write` to `fs.appendFileSync` and added the `lastConsoleFlushed` cursor pattern alongside its network/dialog siblings, but missed the matching `let` declaration. Tests don't currently exercise flushBuffers, so the regression shipped silently.

  Fix:
  - Declare `let lastConsoleFlushed = 0;` next to `lastNetworkFlushed` and `lastDialogFlushed` (browse/src/server.ts:327)
  - Add a source-level guard test (browse/test/server-flush-trackers.test.ts) that fails any future refactor that adds a fourth `last*Flushed` cursor without the matching declaration. Same pattern as terminal-agent.test.ts and dual-listener.test.ts — read the source as text, assert the invariant, no daemon required.

  Test plan:
  - [x] New regression test fails on current main, passes with the fix
  - [x] `bun run build` clean
  - [x] Manual smoke: spawn daemon → goto file:// page with console.log → wait 4s → .gstack/browse-console.log now exists with the expected entries (163 bytes vs zero before)

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

* fix(browse): per-process state-file temp path to fix concurrent-write ENOENT

  The daemon writes `.gstack/browse.json` via the standard atomic-rename pattern: `writeFileSync(tmp, …) → renameSync(tmp, stateFile)`. Four sites in server.ts use this pattern (initial daemon-startup state at :2002, the /tunnel/start handler at :1479, the BROWSE_TUNNEL=1 inline tunnel update at :2083, the BROWSE_TUNNEL_LOCAL_ONLY=1 update at :2113), and all four hard-code the same temp filename `${stateFile}.tmp`.
  Under concurrent writers the shared filename races on the rename:

      t0 Writer A: writeFileSync(stateFile + '.tmp', payloadA)
      t1 Writer B: writeFileSync(stateFile + '.tmp', payloadB)  // overwrites A
      t2 Writer A: renameSync(stateFile + '.tmp', stateFile)    // moves B's payload
      t3 Writer B: renameSync(stateFile + '.tmp', stateFile)    // ENOENT — file gone

  Reproduced empirically with 15 concurrent CLIs against a fresh `.gstack/`:

      [browse] Failed to start: ENOENT: no such file or directory, rename '…/.gstack/browse.json.tmp' -> '…/.gstack/browse.json'

  Pre-fix success rate: **0 / 15** under the cold-start race. Post-fix success rate: **15 / 15**, zero ENOENT.

  Fix:
  - New `tmpStatePath()` helper (server.ts:333) returns `${stateFile}.tmp.${pid}.${randomBytes(4).toString('hex')}`
  - All 4 call sites use `tmpStatePath()` instead of the shared literal
  - Atomic rename still gives last-writer-wins semantics on the final state-file content; the only behavior change is that concurrent writers no longer kill each other on the rename step

  A source-level guard test (browse/test/server-tmp-state-path.test.ts) locks two invariants: (1) no remaining `stateFile + '.tmp'` literals, (2) every state-write `writeFileSync` call uses `tmpStatePath()`. Same read-source-as-text pattern as terminal-agent.test.ts and dual-listener.test.ts — no daemon required, runs in tier-1 free.
  Test plan:
  - [x] Targeted source-level guard test passes (3 / 0)
  - [x] `bun run build` clean
  - [x] Live regression: 15 concurrent CLIs against cold state → 15 / 15 healthy, 0 ENOENT (vs 0 / 15 pre-fix)
  - [x] No `.tmp.*` orphans left behind after rename succeeds
  - [x] Related test cluster (server-auth, dual-listener, cdp-mutex, findport) — same pre-existing flakes as `main`, no new regressions introduced

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

* fix(browse): clear refs when iframe auto-detaches in getActiveFrameOrPage

  Asymmetric cleanup between two equivalent staleness conditions:

      onMainFrameNavigated()  → clearRefs() + activeFrame = null       ✓
      getActiveFrameOrPage()  → activeFrame = null (refs NOT cleared)  ✗

  Both paths see the same staleness condition — refs were captured against a frame that no longer exists. The main-frame path correctly clears both pieces of state. The iframe-detach path nulls the frame but leaves the refMap intact.

  The lazy click-time check in `resolveRef` (tab-session.ts:97) partially saves us — `entry.locator.count()` on a detached-frame locator throws or returns 0, so the click errors out as "Ref X is stale". But the user has no signal that the frame context silently changed underfoot: the next `snapshot` runs against `this.page` (main) while old iframe refs still litter `refMap` with the same role+name keys. New refs collide with stale ones, the resolver picks one at random, and the user clicks the wrong element.

  TODOS.md lines 816-820 document "Detached frame auto-recovery" as a shipped iframe-support feature in v0.12.1.0. This restores the documented intent — the recovery should leave the session in a clean state, not a half-cleared one.

  Fix: 1 line — add `this.clearRefs()` next to `this.activeFrame = null` inside the if-branch.
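The restored symmetry, sketched with an illustrative class shape (an assumption for clarity — not tab-session.ts verbatim):

```typescript
// Both staleness paths must clear BOTH the active-frame pointer and
// the ref map; the 1-line fix makes the detach path match the
// navigation path. (Sketch — real refs hold Playwright locators.)
class TabSession {
  activeFrame: { isDetached(): boolean } | null = null;
  refMap = new Map<string, unknown>();

  clearRefs(): void {
    this.refMap.clear();
  }

  onMainFrameNavigated(): void {
    this.clearRefs();           // always cleared both — the correct path
    this.activeFrame = null;
  }

  getActiveFrameOrPage(): unknown {
    if (this.activeFrame && this.activeFrame.isDetached()) {
      this.clearRefs();         // the fix: previously only the frame was nulled
      this.activeFrame = null;
    }
    return this.activeFrame ?? 'page'; // stand-in for this.page
  }
}
```

With both pieces of state cleared, the next snapshot starts from an empty refMap instead of colliding new role+name keys with stale iframe refs.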
  Test plan:
  - [x] New regression test: 4/4 pass
    - refs cleared when getActiveFrameOrPage detects a detached iframe
    - refs preserved when the active frame is still attached (no regression)
    - refs preserved when no frame is set (page-level path untouched)
    - matches onMainFrameNavigated symmetry — both paths reach the same clean end state
  - [x] `bun run build` clean

  🤖 Generated with [Claude Code](https://claude.com/claude-code)

* fix(codex): resolve python for JSON parser

* fix: add fail-fast probe for base branch in ship step 12

* fix(plan-devex-review): remove contradictory plan-mode handshake

* fix(design): honor Retry-After header in variants 429 handler

  Closes #1244. The 429 handler in `generateVariant` discarded the `Retry-After` response header and fell straight through to a local exponential schedule (2s/4s/8s). In image-generation batches, that burns retry attempts inside the provider's cooldown window and the request never recovers.

  Now we parse `Retry-After` per RFC 7231 — both delta-seconds (`Retry-After: 5`) and HTTP-date (`Retry-After: Fri, 31 Dec 1999 23:59:59 GMT`). Honored waits are capped at 60s to bound stalls from hostile or buggy headers. Delta-seconds are validated as digits-only (rejects `2abc`). When `Retry-After` is honored (including 0 / past-date "retry now"), the next iteration's leading exponential sleep is skipped so we don't double-wait. Invalid or missing headers fall through to the existing exponential schedule unchanged.
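The parsing rules described above can be sketched as a single helper (name and signature are illustrative assumptions, not the shipped code):

```typescript
// RFC 7231 Retry-After: delta-seconds ("5") or HTTP-date. Returns the
// wait in ms (capped), or null when the header is absent/invalid so the
// caller falls back to its exponential schedule. (Illustrative sketch.)
const MAX_RETRY_AFTER_MS = 60_000; // cap hostile/buggy headers at 60s

function parseRetryAfterMs(header: string | null, nowMs: number): number | null {
  if (header === null) return null;
  const v = header.trim();
  if (/^\d+$/.test(v)) {
    // delta-seconds must be digits-only, so '2abc' is rejected
    return Math.min(Number(v) * 1000, MAX_RETRY_AFTER_MS);
  }
  const date = Date.parse(v);
  if (!Number.isNaN(date)) {
    // HTTP-date: a past date means "retry now" (0ms); future capped at 60s
    return Math.min(Math.max(date - nowMs, 0), MAX_RETRY_AFTER_MS);
  }
  return null; // invalid header → exponential schedule
}
```

A non-null return (including 0) also signals the caller to skip the next leading exponential sleep, matching the no-double-wait rule above.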
  Behavior matrix:

  | Header                        | Behavior                                 |
  |-------------------------------|------------------------------------------|
  | Retry-After: 5                | wait 5s, skip leading on next attempt    |
  | Retry-After: 999999           | capped to 60s, skip leading              |
  | Retry-After: 2abc             | invalid, fall through to exponential     |
  | Retry-After: 0                | wait 0, skip leading (retry immediately) |
  | Retry-After: <past HTTP-date> | wait 0, skip leading                     |
  | Retry-After: <future date>    | wait diff capped at 60s, skip leading    |
  | no header                     | fall through to existing exponential     |

  `generateVariant` now accepts an optional `fetchFn` parameter (defaults to `globalThis.fetch`) so tests can inject a stub. Production call sites are unchanged. Tests cover the five behavior buckets above, asserting both the 1st-to-2nd call timing gap and the call counts. All five pass in ~8s.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(docs): correct per-skill symlink removal snippet in README uninstall

  Closes #1130. The manual-uninstall fallback in `## Uninstall` → `### Option 2` used `find ~/.claude/skills -maxdepth 1 -type l`, which finds nothing on real installs. Each `~/.claude/skills/<name>/` is a real directory, and only `<name>/SKILL.md` inside it is a symlink into `gstack/`. The find never matched, so the snippet silently removed nothing.

  Replace with a directory walk that inspects each `<name>/SKILL.md`:

      find ~/.claude/skills -mindepth 1 -maxdepth 1 -type d ! -name gstack
      → check $dir/SKILL.md is a symlink
      → readlink it
      → if target is gstack/* or */gstack/*: rm -f the link, rmdir the dir
        (only if empty — preserves any user-added files)

  Excludes the top-level `gstack/` dir from the walk; that's removed by step 3 of the same uninstall block. `bin/gstack-uninstall` (the script-mode path) already handles the layout correctly via its own walk; only this manual fallback needed updating.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: reject partial browse client env integers

* fix(gemini-adapter): detect new ~/.gemini/oauth_creds.json auth path

  gemini-cli >=0.30 stores OAuth credentials at ~/.gemini/oauth_creds.json instead of the legacy ~/.config/gemini/ directory. The benchmark adapter's availability check now succeeds for users on recent gemini-cli releases who have authenticated via interactive login. Both paths are accepted, so users on older versions still work.

* fix(browser): add --no-sandbox for root user on Linux/WSL2

  Chromium's sandbox can't initialize when running as root on Linux, causing an immediate exit. Extend the existing CI/CONTAINER check to also cover this case, keeping the Windows-safe `typeof getuid` guard.

* security: pass cwd to git via execFileSync, not interpolation through /bin/sh

  bin/gstack-memory-ingest.ts:632-643 ran:

      execSync(`git -C ${JSON.stringify(cwd)} remote get-url origin 2>/dev/null`, ...)

  JSON.stringify escapes `"` and `\` but not `$` or backticks, so a `cwd` of `"$(touch /tmp/marker)"` survived JSON quoting and detonated under /bin/sh's command-substitution-inside-double-quotes.

  `cwd` originates from transcript JSONL records under `~/.claude/projects/<encoded-cwd>/<uuid>.jsonl` and `~/.codex/sessions/YYYY/MM/DD/rollout-*.jsonl`. The walker grabs the first `.cwd` it sees per session. That's an untrusted surface in the gstack threat model — the L1-L6 sidebar security stack exists exactly because agent transcripts can carry attacker-influenced text. Two pivots sit above the local same-uid bar: (a) a prompt injection appending `cwd="$(...)"` to the active session log turns the next /sync-gbrain run into RCE under the user's uid; (b) a cross-machine transcript share (a colleague's `.claude/projects` snippet untar'd into HOME, a documented gbrain dogfooding shape) → RCE on first sync.

  Fix swaps the one execSync for `execFileSync("git", ["-C", cwd, "remote", "get-url", "origin"], ...)`.
  No shell; argv is passed directly to git. The same module already uses execFileSync for `gbrainAvailable()` (line 762 pre-patch) and `gbrainPutPage()` (line 816 pre-patch) — this single execSync was the outlier.

  Test: `gstack-memory-ingest security: untrusted cwd cannot trigger shell substitution` plants a Claude-Code-shaped JSONL with cwd=`$(touch <marker>)` and asserts the marker file is not created after `--incremental --quiet`. Negative control: with the patch reverted, the test fails (marker created); with the patch applied, it passes (18/18 in test/gstack-memory-ingest.test.ts).

* security: gate domain-skill auto-promote on classifier_score > 0

  `browse/src/domain-skill-commands.ts:140` (handleSave) writes `classifier_score: 0` with the comment "L4 deferred to load-time / sidebar-agent fills this in on first prompt-injection load." But CLAUDE.md's "Sidebar architecture" documents that sidebar-agent.ts was ripped out, and a grep for recordSkillUse + classifierFlagged callers across browse/src/ returns zero hits outside the module under test.

  Net effect: every quarantined skill that survives three benign uses without a flag (`recordSkillUse(..., classifierFlagged: false)` x3) auto-promotes to `active` and lands in prompt context wrapped as UNTRUSTED on every subsequent visit to that host. The L4 score that was supposed to gate the promotion was never written — the production save path puts 0 on disk and nothing later updates it.

  Threat model: a domain-skill body authored by an agent under the influence of a poisoned page (the new `gstackInjectToTerminal` PTY path runs no L1-L3 either) would lose its auto-promote barrier after three uses. The exploit isn't single-step, but the bar is exactly N=3 prompt-injection-shaped uses on a hostile page, which is well within reach.
  Fix adds a single condition to the auto-promote gate in `recordSkillUse`:

      if (state === 'quarantined' && useCount >= PROMOTE_THRESHOLD
          && flagCount === 0 && current.classifier_score > 0) {
        state = 'active';
      }

  `classifier_score` is set once at writeSkill and never updated. Production saves it as 0 (handleSave), so the gate stays closed; existing tests that explicitly pass `classifierScore: 0.1` still auto-promote (the auto-promote path is preserved for the day L4 is rewired). Manual promotion via `domain-skill promote-to-global` is unaffected (it goes through `promoteToGlobal`, which has its own state-machine guard at line 337+).

  Test: new regression case `does NOT auto-promote when classifier_score is 0 (production handleSave shape)` plants a skill with classifierScore=0 (matching domain-skill-commands.ts:140), runs three uses without a flag, and asserts the skill stays quarantined and readSkill returns null. Negative control: revert the patch and the test fails with `Received: "active"`. With the patch: 15/15 pass.

* fix(ship): port #1302 SKILL.md edits to .tmpl + resolver source

  PR #1302 added Verification Mode + UNVERIFIABLE classification + a per-item confirmation gate to ship/SKILL.md, but only the generated SKILL.md was edited — not the .tmpl source or scripts/resolvers/review.ts. The next `bun run gen:skill-docs` run would have wiped the changes. Port the same content into the resolver and .tmpl so regeneration produces the intended output.

* ci(windows): extend free-tests lane to cover icacls + Bun.which resolvers from fix-wave PRs

  Closes the #1306/#1307/#1308 validation gap. The four newly-added test files already have process.platform guards, so they run safely on both POSIX and Windows lanes — only platform-relevant assertions execute on each.
  Tests added to the windows-latest lane:
  - browse/test/file-permissions.test.ts (#1308 icacls + writeSecureFile)
  - browse/test/security.test.ts (#1306 bash.exe wrap pure-function path)
  - make-pdf/test/browseClient.test.ts (#1307 Bun.which browse resolver)
  - make-pdf/test/pdftotext.test.ts (#1307 Bun.which pdftotext resolver)

* test(codex): live flag-semantics smoke for codex exec resume

  Closes #1270's regex-only test gap. PR #1270 asserted that codex/SKILL.md's `codex exec resume` invocation drops -C/-s and uses the sandbox_mode config. That regex catches the skill template regressing, but not the codex CLI itself flipping flag semantics again. This test probes `codex exec resume --help` and asserts the surface gstack relies on: -c/sandbox_mode is accepted, top-level -C is absent. It skips silently when codex isn't on PATH, so dev machines without codex installed never see it fail.

* chore: regen SKILL.md after fix wave

  One regen commit at the end of the merge wave, per the plan. plan-devex-review loses the contradictory plan-mode handshake (#1333). review/SKILL.md picks up the Verification Mode + UNVERIFIABLE classification additions that #1302 authored against ship/SKILL.md (the same resolver is shared between ship and review modes).

* fix(server.ts): keep fs.writeFileSync for state-file writes

  #1308's writeSecureFile wrapper added Windows icacls hardening for the 4 state-file write sites in server.ts, but #1310's regression test greps for fs.writeFileSync(tmpStatePath()) calls. The two changes are technically compatible only if the test relaxes — keeping the test strict (the safer choice for catching regressions on the cold-start race) means the 4 state-file sites stay on fs.writeFileSync(..., { mode: 0o600 }). POSIX 0o600 hardening is preserved on those 4 sites. Windows icacls hardening still applies to all the other writeSecureFile call sites #1308 added (auth.json, mkdirSecure, etc.).
  Also refreshes golden baselines after the #1302 port, plus a minor wording tweak in scripts/resolvers/review.ts to keep gen-skill-docs.test.ts's assertion 'Cite the specific file' satisfied.

* v1.30.0.0: fix wave — 21 community PRs + 2 closing fixes for Windows + codex CI gaps

  Headline release. Browse stops dropping console logs, the cold-start race is fixed, codex resume works without python3, Windows hardening lands (icacls + Bun.which + bash.exe wrap), the ship gate gets VAS-449 remediation, and two closing fixes put icacls / Bun.which / codex flag semantics under CI.

* test(domain-skills): cover #1369 classifier_score=0 quarantine + score>0 promote path

  The pre-existing T6 test seeded skills via writeSkill (which defaults classifier_score to 0 until L4 is rewired) and then expected 3 uses to auto-promote. PR #1369 added `current.classifier_score > 0` to the gate specifically to block that path — a quarantined skill written under the influence of a poisoned page would otherwise auto-promote after three benign uses.

  The updated test asserts both halves of the new contract:
  - classifier_score=0 + 3 uses → stays quarantined (the security guarantee)
  - classifier_score>0 + 3 more uses → promotes to active (unblock path)

  This catches both regressions: the gate going away (would re-allow the bypass) and the unblock path breaking (would silently quarantine all skills forever once L4 is rewired).
---------

Co-authored-by: Jayesh Betala <jayesh.betala7@gmail.com>
Co-authored-by: orbisai0security <mediratta01.pally@gmail.com>
Co-authored-by: Bryce Alan <brycealan.eth@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Terry Carson YM <cym3118288@gmail.com>
Co-authored-by: Vasko Ckorovski <vckorovski@gmail.com>
Co-authored-by: Samuel Carson <samuel.carson@gmail.com>
Co-authored-by: Yashwant Kotipalli <yashwant7kotipalli@gmail.com>
Co-authored-by: Jasper Chen <jasperchen925@gmail.com>
Co-authored-by: Stefan Neamtu <stefan.neamtu@gmail.com>
Co-authored-by: 陈家名 <chenjiaming@kezaihui.com>
Co-authored-by: Abigail Atheryon <abi@atheryon.ai>
Co-authored-by: Furkan Köykıran <furkankoykiran@gmail.com>
Co-authored-by: gus <gustavoraularagon@gmail.com>
1887 lines
75 KiB
TypeScript
import { describe, test, expect } from 'bun:test';
import { validateSkill, extractRemoteSlugPatterns, extractWeightsFromTable } from './helpers/skill-parser';
import { ALL_COMMANDS, COMMAND_DESCRIPTIONS, READ_COMMANDS, WRITE_COMMANDS, META_COMMANDS } from '../browse/src/commands';
import { SNAPSHOT_FLAGS } from '../browse/src/snapshot';
import * as fs from 'fs';
import * as path from 'path';

const ROOT = path.resolve(import.meta.dir, '..');
describe('SKILL.md command validation', () => {
  test('all $B commands in SKILL.md are valid browse commands', () => {
    const result = validateSkill(path.join(ROOT, 'SKILL.md'));
    expect(result.invalid).toHaveLength(0);
    expect(result.valid.length).toBeGreaterThan(0);
  });

  test('all snapshot flags in SKILL.md are valid', () => {
    const result = validateSkill(path.join(ROOT, 'SKILL.md'));
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });

  test('all $B commands in browse/SKILL.md are valid browse commands', () => {
    const result = validateSkill(path.join(ROOT, 'browse', 'SKILL.md'));
    expect(result.invalid).toHaveLength(0);
    expect(result.valid.length).toBeGreaterThan(0);
  });

  test('all snapshot flags in browse/SKILL.md are valid', () => {
    const result = validateSkill(path.join(ROOT, 'browse', 'SKILL.md'));
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });

  test('all $B commands in qa/SKILL.md are valid browse commands', () => {
    const qaSkill = path.join(ROOT, 'qa', 'SKILL.md');
    if (!fs.existsSync(qaSkill)) return; // skip if missing
    const result = validateSkill(qaSkill);
    expect(result.invalid).toHaveLength(0);
  });

  test('all snapshot flags in qa/SKILL.md are valid', () => {
    const qaSkill = path.join(ROOT, 'qa', 'SKILL.md');
    if (!fs.existsSync(qaSkill)) return;
    const result = validateSkill(qaSkill);
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });

  test('all $B commands in qa-only/SKILL.md are valid browse commands', () => {
    const qaOnlySkill = path.join(ROOT, 'qa-only', 'SKILL.md');
    if (!fs.existsSync(qaOnlySkill)) return;
    const result = validateSkill(qaOnlySkill);
    expect(result.invalid).toHaveLength(0);
  });

  test('all snapshot flags in qa-only/SKILL.md are valid', () => {
    const qaOnlySkill = path.join(ROOT, 'qa-only', 'SKILL.md');
    if (!fs.existsSync(qaOnlySkill)) return;
    const result = validateSkill(qaOnlySkill);
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });

  test('all $B commands in plan-design-review/SKILL.md are valid browse commands', () => {
    const skill = path.join(ROOT, 'plan-design-review', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.invalid).toHaveLength(0);
  });

  test('all snapshot flags in plan-design-review/SKILL.md are valid', () => {
    const skill = path.join(ROOT, 'plan-design-review', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });

  test('all $B commands in design-review/SKILL.md are valid browse commands', () => {
    const skill = path.join(ROOT, 'design-review', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.invalid).toHaveLength(0);
  });

  test('all snapshot flags in design-review/SKILL.md are valid', () => {
    const skill = path.join(ROOT, 'design-review', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });

  test('all $B commands in design-consultation/SKILL.md are valid browse commands', () => {
    const skill = path.join(ROOT, 'design-consultation', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.invalid).toHaveLength(0);
  });

  test('all snapshot flags in design-consultation/SKILL.md are valid', () => {
    const skill = path.join(ROOT, 'design-consultation', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });

  test('all $B commands in autoplan/SKILL.md are valid browse commands', () => {
    const skill = path.join(ROOT, 'autoplan', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.invalid).toHaveLength(0);
  });

  test('all snapshot flags in autoplan/SKILL.md are valid', () => {
    const skill = path.join(ROOT, 'autoplan', 'SKILL.md');
    if (!fs.existsSync(skill)) return;
    const result = validateSkill(skill);
    expect(result.snapshotFlagErrors).toHaveLength(0);
  });
});
|
|
|
|
describe('Command registry consistency', () => {
  test('COMMAND_DESCRIPTIONS covers all commands in sets', () => {
    const allCmds = new Set([...READ_COMMANDS, ...WRITE_COMMANDS, ...META_COMMANDS]);
    const descKeys = new Set(Object.keys(COMMAND_DESCRIPTIONS));
    for (const cmd of allCmds) {
      expect(descKeys.has(cmd)).toBe(true);
    }
  });

  test('COMMAND_DESCRIPTIONS has no extra commands not in sets', () => {
    const allCmds = new Set([...READ_COMMANDS, ...WRITE_COMMANDS, ...META_COMMANDS]);
    for (const key of Object.keys(COMMAND_DESCRIPTIONS)) {
      expect(allCmds.has(key)).toBe(true);
    }
  });

  test('ALL_COMMANDS matches union of all sets', () => {
    const union = new Set([...READ_COMMANDS, ...WRITE_COMMANDS, ...META_COMMANDS]);
    expect(ALL_COMMANDS.size).toBe(union.size);
    for (const cmd of union) {
      expect(ALL_COMMANDS.has(cmd)).toBe(true);
    }
  });

  test('SNAPSHOT_FLAGS option keys are valid SnapshotOptions fields', () => {
    const validKeys = new Set([
      'interactive', 'compact', 'depth', 'selector',
      'diff', 'annotate', 'outputPath', 'cursorInteractive',
      'heatmap',
    ]);
    for (const flag of SNAPSHOT_FLAGS) {
      expect(validKeys.has(flag.optionKey)).toBe(true);
    }
  });
});

describe('Usage string consistency', () => {
  // Normalize a usage string to its structural skeleton for comparison.
  // Replaces <param-names> with <>, [optional] with [], strips parenthetical hints.
  // This catches format mismatches (e.g., <name>:<value> vs <name> <value>)
  // without tripping on abbreviation differences (e.g., <sel> vs <selector>).
  // Example: skeleton('click <sel> [--force] (e.g., a button)') → 'click <> []'
  function skeleton(usage: string): string {
    return usage
      .replace(/\(.*?\)/g, '')      // strip parenthetical hints like (e.g., Enter, Tab)
      .replace(/<[^>]*>/g, '<>')    // normalize <param-name> → <>
      .replace(/\[[^\]]*\]/g, '[]') // normalize [optional] → []
      .replace(/\s+/g, ' ')         // collapse whitespace
      .trim();
  }

  // Cross-check Usage: patterns in implementation against COMMAND_DESCRIPTIONS
  test('implementation Usage: structural format matches COMMAND_DESCRIPTIONS', () => {
    const implFiles = [
      path.join(ROOT, 'browse', 'src', 'write-commands.ts'),
      path.join(ROOT, 'browse', 'src', 'read-commands.ts'),
      path.join(ROOT, 'browse', 'src', 'meta-commands.ts'),
    ];

    // Extract "Usage: browse <pattern>" from throw new Error(...) calls
    const usagePattern = /throw new Error\(['"`]Usage:\s*browse\s+(.+?)['"`]\)/g;
    const implUsages = new Map<string, string>();

    for (const file of implFiles) {
      const content = fs.readFileSync(file, 'utf-8');
      let match;
      while ((match = usagePattern.exec(content)) !== null) {
        const usage = match[1].split('\\n')[0].trim();
        const cmd = usage.split(/\s/)[0];
        implUsages.set(cmd, usage);
      }
    }

    // Compare structural skeletons
    const mismatches: string[] = [];
    for (const [cmd, implUsage] of implUsages) {
      const desc = COMMAND_DESCRIPTIONS[cmd];
      if (!desc?.usage) continue;
      const descSkel = skeleton(desc.usage);
      const implSkel = skeleton(implUsage);
      if (descSkel !== implSkel) {
        mismatches.push(`${cmd}: docs "${desc.usage}" (${descSkel}) vs impl "${implUsage}" (${implSkel})`);
      }
    }

    expect(mismatches).toEqual([]);
  });
});

describe('Generated SKILL.md freshness', () => {
  test('no unresolved {{placeholders}} in generated SKILL.md', () => {
    const content = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8');
    const unresolved = content.match(/\{\{\w+\}\}/g);
    expect(unresolved).toBeNull();
  });

  test('no unresolved {{placeholders}} in generated browse/SKILL.md', () => {
    const content = fs.readFileSync(path.join(ROOT, 'browse', 'SKILL.md'), 'utf-8');
    const unresolved = content.match(/\{\{\w+\}\}/g);
    expect(unresolved).toBeNull();
  });

  test('generated SKILL.md has AUTO-GENERATED header', () => {
    const content = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8');
    expect(content).toContain('AUTO-GENERATED');
  });
});

// --- Update check preamble validation ---

describe('Update check preamble', () => {
  const skillsWithUpdateCheck = [
    'SKILL.md', 'browse/SKILL.md', 'qa/SKILL.md',
    'qa-only/SKILL.md',
    'setup-browser-cookies/SKILL.md',
    'ship/SKILL.md', 'review/SKILL.md',
    'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md',
    'retro/SKILL.md',
    'office-hours/SKILL.md', 'investigate/SKILL.md',
    'plan-design-review/SKILL.md',
    'design-review/SKILL.md',
    'design-consultation/SKILL.md',
    'document-release/SKILL.md',
    'canary/SKILL.md',
    'benchmark/SKILL.md',
    'land-and-deploy/SKILL.md',
    'setup-deploy/SKILL.md',
    'cso/SKILL.md',
  ];

  for (const skill of skillsWithUpdateCheck) {
    test(`${skill} update check line ends with || true`, () => {
      const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
      // The second line of the bash block must end with || true
      // to avoid exit code 1 when _UPD is empty (up to date)
      const match = content.match(/\[ -n "\$_UPD" \].*$/m);
      expect(match).not.toBeNull();
      expect(match![0]).toContain('|| true');
    });
  }

  test('all skills with update check are generated from .tmpl', () => {
    for (const skill of skillsWithUpdateCheck) {
      const tmplPath = path.join(ROOT, skill + '.tmpl');
      expect(fs.existsSync(tmplPath)).toBe(true);
    }
  });

  test('update check bash block exits 0 when up to date', () => {
    // Simulate the exact preamble command from SKILL.md
    const result = Bun.spawnSync(['bash', '-c',
      '_UPD=$(echo "" || true); [ -n "$_UPD" ] && echo "$_UPD" || true'
    ], { stdout: 'pipe', stderr: 'pipe' });
    expect(result.exitCode).toBe(0);
  });

  test('update check bash block exits 0 when upgrade available', () => {
    const result = Bun.spawnSync(['bash', '-c',
      '_UPD=$(echo "UPGRADE_AVAILABLE 0.3.3 0.4.0" || true); [ -n "$_UPD" ] && echo "$_UPD" || true'
    ], { stdout: 'pipe', stderr: 'pipe' });
    expect(result.exitCode).toBe(0);
    expect(result.stdout.toString().trim()).toBe('UPGRADE_AVAILABLE 0.3.3 0.4.0');
  });
});

// --- Part 7: Cross-skill path consistency (A1) ---

describe('Cross-skill path consistency', () => {
  test('REMOTE_SLUG derivation pattern is identical across files that use it', () => {
    const patterns = extractRemoteSlugPatterns(ROOT, ['qa', 'review']);
    const allPatterns: string[] = [];

    for (const [, filePatterns] of patterns) {
      allPatterns.push(...filePatterns);
    }

    // Should find at least 2 occurrences (qa/SKILL.md + review/greptile-triage.md)
    expect(allPatterns.length).toBeGreaterThanOrEqual(2);

    // All occurrences must be character-for-character identical
    const unique = new Set(allPatterns);
    if (unique.size > 1) {
      const variants = Array.from(unique);
      throw new Error(
        `REMOTE_SLUG pattern differs across files:\n` +
        variants.map((v, i) => `  ${i + 1}: ${v}`).join('\n')
      );
    }
  });

  test('all greptile-history write references specify both per-project and global paths', () => {
    const filesToCheck = [
      'review/SKILL.md',
      'ship/SKILL.md',
      'review/greptile-triage.md',
    ];

    for (const file of filesToCheck) {
      const filePath = path.join(ROOT, file);
      if (!fs.existsSync(filePath)) continue;
      const content = fs.readFileSync(filePath, 'utf-8');

      const hasBoth = (content.includes('per-project') && content.includes('global')) ||
        (content.includes('$REMOTE_SLUG/greptile-history') && content.includes('~/.gstack/greptile-history'));

      expect(hasBoth).toBe(true);
    }
  });

  test('greptile-triage.md contains both project and global history paths', () => {
    const content = fs.readFileSync(path.join(ROOT, 'review', 'greptile-triage.md'), 'utf-8');
    expect(content).toContain('$REMOTE_SLUG/greptile-history.md');
    expect(content).toContain('~/.gstack/greptile-history.md');
  });

  test('retro/SKILL.md reads global greptile-history (not per-project)', () => {
    const content = fs.readFileSync(path.join(ROOT, 'retro', 'SKILL.md'), 'utf-8');
    expect(content).toContain('~/.gstack/greptile-history.md');
    // Should NOT reference per-project path for reads
    expect(content).not.toContain('$REMOTE_SLUG/greptile-history.md');
  });
});

// --- Part 7: QA skill structure validation (A2) ---

describe('QA skill structure validation', () => {
  const qaContent = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');

  test('qa/SKILL.md has all 11 phases', () => {
    const phases = [
      'Phase 1', 'Initialize',
      'Phase 2', 'Authenticate',
      'Phase 3', 'Orient',
      'Phase 4', 'Explore',
      'Phase 5', 'Document',
      'Phase 6', 'Wrap Up',
      'Phase 7', 'Triage',
      'Phase 8', 'Fix Loop',
      'Phase 9', 'Final QA',
      'Phase 10', 'Report',
      'Phase 11', 'TODOS',
    ];
    for (const phase of phases) {
      expect(qaContent).toContain(phase);
    }
  });

  test('has all four QA modes defined', () => {
    const modes = [
      'Diff-aware',
      'Full',
      'Quick',
      'Regression',
    ];
    for (const mode of modes) {
      expect(qaContent).toContain(mode);
    }

    // Mode triggers/flags
    expect(qaContent).toContain('--quick');
    expect(qaContent).toContain('--regression');
  });

  test('has all three tiers defined', () => {
    const tiers = ['Quick', 'Standard', 'Exhaustive'];
    for (const tier of tiers) {
      expect(qaContent).toContain(tier);
    }
  });

  test('health score weights sum to 100%', () => {
    const weights = extractWeightsFromTable(qaContent);
    expect(weights.size).toBeGreaterThan(0);

    let sum = 0;
    for (const pct of weights.values()) {
      sum += pct;
    }
    expect(sum).toBe(100);
  });

  test('health score has all 8 categories', () => {
    const weights = extractWeightsFromTable(qaContent);
    const expectedCategories = [
      'Console', 'Links', 'Visual', 'Functional',
      'UX', 'Performance', 'Content', 'Accessibility',
    ];
    for (const cat of expectedCategories) {
      expect(weights.has(cat)).toBe(true);
    }
    expect(weights.size).toBe(8);
  });

  test('has four mode definitions (Diff-aware/Full/Quick/Regression)', () => {
    expect(qaContent).toContain('### Diff-aware');
    expect(qaContent).toContain('### Full');
    expect(qaContent).toContain('### Quick');
    expect(qaContent).toContain('### Regression');
  });

  test('output structure references report directory layout', () => {
    expect(qaContent).toContain('qa-report-');
    expect(qaContent).toContain('baseline.json');
    expect(qaContent).toContain('screenshots/');
    expect(qaContent).toContain('.gstack/qa-reports/');
  });
});

// --- Part 7: Greptile history format consistency (A3) ---

describe('Greptile history format consistency', () => {
  test('greptile-triage.md defines the canonical history format', () => {
    const content = fs.readFileSync(path.join(ROOT, 'review', 'greptile-triage.md'), 'utf-8');
    expect(content).toContain('<YYYY-MM-DD>');
    expect(content).toContain('<owner/repo>');
    expect(content).toContain('<type');
    expect(content).toContain('<file-pattern>');
    expect(content).toContain('<category>');
  });

  test('review/SKILL.md and ship/SKILL.md both reference greptile-triage.md for write details', () => {
    const reviewContent = fs.readFileSync(path.join(ROOT, 'review', 'SKILL.md'), 'utf-8');
    const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');

    expect(reviewContent.toLowerCase()).toContain('greptile-triage.md');
    expect(shipContent.toLowerCase()).toContain('greptile-triage.md');
  });

  test('greptile-triage.md defines all 9 valid categories', () => {
    const content = fs.readFileSync(path.join(ROOT, 'review', 'greptile-triage.md'), 'utf-8');
    const categories = [
      'race-condition', 'null-check', 'error-handling', 'style',
      'type-safety', 'security', 'performance', 'correctness', 'other',
    ];
    for (const cat of categories) {
      expect(content).toContain(cat);
    }
  });
});

// --- Hardcoded branch name detection in templates ---

describe('No hardcoded branch names in SKILL templates', () => {
  const tmplFiles = [
    'ship/SKILL.md.tmpl',
    'review/SKILL.md.tmpl',
    'qa/SKILL.md.tmpl',
    'plan-ceo-review/SKILL.md.tmpl',
    'retro/SKILL.md.tmpl',
    'document-release/SKILL.md.tmpl',
    'plan-eng-review/SKILL.md.tmpl',
    'plan-design-review/SKILL.md.tmpl',
    'codex/SKILL.md.tmpl',
  ];

  // Patterns that indicate hardcoded 'main' in git commands
  const gitMainPatterns = [
    /\bgit\s+diff\s+(?:origin\/)?main\b/,
    /\bgit\s+log\s+(?:origin\/)?main\b/,
    /\bgit\s+fetch\s+origin\s+main\b/,
    /\bgit\s+merge\s+origin\/main\b/,
    /\borigin\/main\b/,
  ];

  // Lines that are allowed to mention 'main' (fallback logic, prose).
  // The optional-backtick fallback pattern also covers the literal-backtick form.
  const allowlist = [
    /fall\s*back\s+to\s+`?main`?/i,
    /typically\s+`?main`?/i,
    /If\s+on\s+`main`/i, // old pattern, should not exist
  ];

  for (const tmplFile of tmplFiles) {
    test(`${tmplFile} has no hardcoded 'main' in git commands`, () => {
      const filePath = path.join(ROOT, tmplFile);
      if (!fs.existsSync(filePath)) return;
      const lines = fs.readFileSync(filePath, 'utf-8').split('\n');
      const violations: string[] = [];

      for (let i = 0; i < lines.length; i++) {
        const line = lines[i];
        const isAllowlisted = allowlist.some(p => p.test(line));
        if (isAllowlisted) continue;

        for (const pattern of gitMainPatterns) {
          if (pattern.test(line)) {
            violations.push(`Line ${i + 1}: ${line.trim()}`);
            break;
          }
        }
      }

      if (violations.length > 0) {
        throw new Error(
          `${tmplFile} has hardcoded 'main' in git commands:\n` +
          violations.map(v => `  ${v}`).join('\n')
        );
      }
    });
  }
});

// --- Part 7b: TODOS-format.md reference consistency ---

describe('TODOS-format.md reference consistency', () => {
  test('review/TODOS-format.md exists and defines canonical format', () => {
    const content = fs.readFileSync(path.join(ROOT, 'review', 'TODOS-format.md'), 'utf-8');
    expect(content).toContain('**What:**');
    expect(content).toContain('**Why:**');
    expect(content).toContain('**Priority:**');
    expect(content).toContain('**Effort:**');
    expect(content).toContain('## Completed');
  });

  test('skills that write TODOs reference TODOS-format.md', () => {
    const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    const ceoPlanContent = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md'), 'utf-8');
    const engPlanContent = fs.readFileSync(path.join(ROOT, 'plan-eng-review', 'SKILL.md'), 'utf-8');

    expect(shipContent).toContain('TODOS-format.md');
    expect(ceoPlanContent).toContain('TODOS-format.md');
    expect(engPlanContent).toContain('TODOS-format.md');
  });
});

// --- v0.4.1 feature coverage: RECOMMENDATION format, session awareness, enum completeness ---

describe('v0.4.1 preamble features', () => {
  // Tier 1 skills have core preamble only (no AskUserQuestion format)
  const tier1Skills = ['SKILL.md', 'browse/SKILL.md', 'setup-browser-cookies/SKILL.md', 'benchmark/SKILL.md'];

  // Tier 2+ skills have AskUserQuestion format with RECOMMENDATION
  const tier2PlusSkills = [
    'qa/SKILL.md', 'qa-only/SKILL.md',
    'ship/SKILL.md', 'review/SKILL.md',
    'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md',
    'retro/SKILL.md',
    'office-hours/SKILL.md', 'investigate/SKILL.md',
    'plan-design-review/SKILL.md',
    'design-review/SKILL.md',
    'design-consultation/SKILL.md',
    'document-release/SKILL.md',
    'canary/SKILL.md',
    'land-and-deploy/SKILL.md',
    'setup-deploy/SKILL.md',
    'cso/SKILL.md',
  ];

  const skillsWithPreamble = [...tier1Skills, ...tier2PlusSkills];

  for (const skill of tier2PlusSkills) {
    test(`${skill} contains AskUserQuestion Pros/Cons format`, () => {
      const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
      // v1.7.0.0 Pros/Cons format tokens. The preamble resolver
      // (generate-ask-user-format.ts) injects all of these into every
      // tier-2+ skill. Drop any of them and the test catches it on the
      // next `bun test` run.
      expect(content).toContain('AskUserQuestion');
      expect(content).toContain('Pros / cons:');
      expect(content).toContain('Recommendation: <choice>');
      expect(content).toContain('Net:');
      expect(content).toContain('ELI10');
      expect(content).toContain('Stakes if we pick wrong:');
      // Concrete format markers must be documented in the resolver text
      expect(content).toMatch(/✅/);
      expect(content).toMatch(/❌/);
    });
  }

  for (const skill of skillsWithPreamble) {
    test(`${skill} contains session awareness`, () => {
      const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
      expect(content).toContain('_SESSIONS');
    });
  }

  for (const skill of skillsWithPreamble) {
    test(`${skill} contains escalation protocol`, () => {
      const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
      expect(content).toContain('DONE_WITH_CONCERNS');
      expect(content).toContain('BLOCKED');
      expect(content).toContain('NEEDS_CONTEXT');
    });
  }
});

// --- Structural tests for new skills ---

describe('office-hours skill structure', () => {
  const content = fs.readFileSync(path.join(ROOT, 'office-hours', 'SKILL.md'), 'utf-8');

  // Original structural assertions
  for (const section of ['Phase 1', 'Phase 2', 'Phase 3', 'Phase 4', 'Phase 5', 'Phase 6',
    'Design Doc', 'Supersedes', 'APPROVED', 'Premise Challenge',
    'Alternatives', 'Smart-skip']) {
    test(`contains ${section}`, () => expect(content).toContain(section));
  }

  // Dual-mode structure
  for (const section of ['Startup mode', 'Builder mode']) {
    test(`contains ${section}`, () => expect(content).toContain(section));
  }

  // Mode detection question
  test('contains explicit mode detection question', () => {
    expect(content).toContain("what's your goal");
  });

  // Six forcing questions (startup mode)
  for (const question of ['Demand Reality', 'Status Quo', 'Desperate Specificity',
    'Narrowest Wedge', 'Observation & Surprise', 'Future-Fit']) {
    test(`contains forcing question: ${question}`, () => expect(content).toContain(question));
  }

  // Builder mode questions
  test('contains builder brainstorming questions', () => {
    expect(content).toContain('coolest version');
    expect(content).toContain('delightful');
  });

  // Intrapreneurship adaptation
  test('contains intrapreneurship adaptation', () => {
    expect(content).toContain('Intrapreneurship');
  });

  // YC founder discovery engine
  test('contains YC apply CTA with ref tracking', () => {
    expect(content).toContain('ycombinator.com/apply?ref=gstack');
  });

  test('contains "What I noticed" design doc section', () => {
    expect(content).toContain('What I noticed about how you think');
  });

  test('contains golden age framing', () => {
    expect(content).toContain('golden age');
  });

  test('contains Garry Tan personal plea', () => {
    expect(content).toContain('Garry Tan, the creator of GStack');
  });

  test('contains founder signal synthesis phase', () => {
    expect(content).toContain('Founder Signal Synthesis');
  });

  test('contains three-tier decision rubric', () => {
    expect(content).toContain('Top tier');
    expect(content).toContain('Middle tier');
    expect(content).toContain('Base tier');
  });

  test('contains anti-slop examples', () => {
    expect(content).toContain('GOOD:');
    expect(content).toContain('BAD:');
  });

  test('contains "One more thing" transition beat', () => {
    expect(content).toContain('One more thing');
  });

  // Operating principles per mode
  test('contains startup operating principles', () => {
    expect(content).toContain('Specificity is the only currency');
  });

  test('contains builder operating principles', () => {
    expect(content).toContain('Delight is the currency');
  });

  // Spec Review Loop (Phase 5.5)
  test('contains spec review loop', () => {
    expect(content).toContain('Spec Review Loop');
  });

  test('contains adversarial review dimensions', () => {
    for (const dim of ['Completeness', 'Consistency', 'Clarity', 'Scope', 'Feasibility']) {
      expect(content).toContain(dim);
    }
  });

  test('contains subagent dispatch instruction', () => {
    expect(content).toMatch(/Agent.*tool|subagent/i);
  });

  test('contains max 3 iterations', () => {
    expect(content).toMatch(/3.*iteration|maximum.*3/i);
  });

  test('contains quality score', () => {
    expect(content).toContain('quality score');
  });

  test('contains spec review metrics path', () => {
    expect(content).toContain('spec-review.jsonl');
  });

  test('contains convergence guard', () => {
    expect(content).toMatch(/convergence/i);
  });

  // Visual Sketch (Phase 4.5)
  test('contains visual sketch section', () => {
    expect(content).toContain('Visual Sketch');
  });

  test('contains wireframe generation', () => {
    expect(content).toMatch(/wireframe|sketch/i);
  });

  test('contains DESIGN.md awareness', () => {
    expect(content).toContain('DESIGN.md');
  });

  test('contains browse rendering', () => {
    expect(content).toContain('$B goto');
    expect(content).toContain('$B screenshot');
  });

  test('contains rough aesthetic instruction', () => {
    expect(content).toMatch(/rough|hand-drawn/i);
  });
});

describe('investigate skill structure', () => {
  const content = fs.readFileSync(path.join(ROOT, 'investigate', 'SKILL.md'), 'utf-8');
  for (const section of ['Iron Law', 'Root Cause', 'Pattern Analysis', 'Hypothesis',
    'DEBUG REPORT', '3-strike', 'BLOCKED']) {
    test(`contains ${section}`, () => expect(content).toContain(section));
  }
});

// Contributor mode was removed in v0.13.10.0, replaced by operational self-improvement.
// Tests for contributor mode preamble structure are no longer applicable.

describe('Enum & Value Completeness in review checklist', () => {
  const checklist = fs.readFileSync(path.join(ROOT, 'review', 'checklist.md'), 'utf-8');

  test('checklist has Enum & Value Completeness section', () => {
    expect(checklist).toContain('Enum & Value Completeness');
  });

  test('Enum & Value Completeness is classified as CRITICAL', () => {
    // It should appear under Pass 1 (CRITICAL), not Pass 2
    const pass1Start = checklist.indexOf('### Pass 1');
    const pass2Start = checklist.indexOf('### Pass 2');
    const enumStart = checklist.indexOf('Enum & Value Completeness');
    expect(enumStart).toBeGreaterThan(pass1Start);
    expect(enumStart).toBeLessThan(pass2Start);
  });

  test('Enum & Value Completeness mentions tracing through consumers', () => {
    expect(checklist).toContain('Trace it through every consumer');
    expect(checklist).toContain('case');
    expect(checklist).toContain('allowlist');
  });

  test('Enum & Value Completeness is in the severity classification as CRITICAL', () => {
    const gateSection = checklist.slice(checklist.indexOf('## Severity Classification'));
    // The ASCII art has CRITICAL on the left and INFORMATIONAL on the right.
    // Enum & Value Completeness should appear on a line with the CRITICAL tree (├─ or └─)
    const enumLine = gateSection.split('\n').find(l => l.includes('Enum & Value Completeness'));
    expect(enumLine).toBeDefined();
    // It's on the left (CRITICAL) side: starts with ├─ or └─
    expect(enumLine!.trimStart().startsWith('├─') || enumLine!.trimStart().startsWith('└─')).toBe(true);
  });

  test('Fix-First Heuristic exists in checklist and is referenced by review + ship', () => {
    expect(checklist).toContain('## Fix-First Heuristic');
    expect(checklist).toContain('AUTO-FIX');
    expect(checklist).toContain('ASK');

    const reviewSkill = fs.readFileSync(path.join(ROOT, 'review/SKILL.md'), 'utf-8');
    const shipSkill = fs.readFileSync(path.join(ROOT, 'ship/SKILL.md'), 'utf-8');
    expect(reviewSkill).toContain('AUTO-FIX');
    expect(reviewSkill).toContain('[AUTO-FIXED]');
    expect(shipSkill).toContain('AUTO-FIX');
    expect(shipSkill).toContain('[AUTO-FIXED]');
  });
});

// --- Completeness Principle spot-check ---

describe('Completeness Principle in generated SKILL.md files', () => {
  const skillsWithPreamble = [
    'qa/SKILL.md', 'qa-only/SKILL.md',
    'ship/SKILL.md', 'review/SKILL.md',
    'plan-ceo-review/SKILL.md', 'plan-eng-review/SKILL.md',
    'retro/SKILL.md',
    'plan-design-review/SKILL.md',
    'design-review/SKILL.md',
    'design-consultation/SKILL.md',
    'document-release/SKILL.md',
    'cso/SKILL.md',
  ];

  for (const skill of skillsWithPreamble) {
    test(`${skill} contains Completeness Principle section`, () => {
      const content = fs.readFileSync(path.join(ROOT, skill), 'utf-8');
      expect(content).toContain('Completeness Principle');
      expect(content).toContain('Boil the Lake');
    });
  }

  test('Completeness Principle keeps compact scoring guidance in tier 2+ skills', () => {
    const content = fs.readFileSync(path.join(ROOT, 'cso', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Completeness: X/10');
    expect(content).toContain('10 = all edge cases');
    expect(content).toContain('Note: options differ in kind, not coverage');
    expect(content).toContain('Do not fabricate scores');
  });
});

// --- Part 7: Planted-bug fixture validation (A4) ---

describe('Planted-bug fixture validation', () => {
  // The three eval fixtures share the same ground-truth contract.
  for (const fixture of ['qa-eval', 'qa-eval-spa', 'qa-eval-checkout']) {
    test(`${fixture} ground truth has exactly 5 planted bugs`, () => {
      const groundTruth = JSON.parse(
        fs.readFileSync(path.join(ROOT, 'test', 'fixtures', `${fixture}-ground-truth.json`), 'utf-8')
      );
      expect(groundTruth.bugs).toHaveLength(5);
      expect(groundTruth.total_bugs).toBe(5);
    });
  }

  test('qa-eval.html contains the planted bugs', () => {
    const html = fs.readFileSync(path.join(ROOT, 'browse', 'test', 'fixtures', 'qa-eval.html'), 'utf-8');
    // BUG 1: broken link
    expect(html).toContain('/nonexistent-404-page');
    // BUG 2: disabled submit
    expect(html).toContain('disabled');
    // BUG 3: overflow
    expect(html).toContain('overflow: hidden');
    // BUG 4: missing alt
    expect(html).toMatch(/<img[^>]*src="\/logo\.png"[^>]*>/);
    expect(html).not.toMatch(/<img[^>]*src="\/logo\.png"[^>]*alt=/);
    // BUG 5: console error
    expect(html).toContain("Cannot read properties of undefined");
  });

  test('review-eval-vuln.rb contains expected vulnerability patterns', () => {
    const content = fs.readFileSync(path.join(ROOT, 'test', 'fixtures', 'review-eval-vuln.rb'), 'utf-8');
    expect(content).toContain('params[:id]');
    expect(content).toContain('update_column');
  });
});

// --- CEO review mode validation ---

describe('CEO review mode validation', () => {
  const content = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md'), 'utf-8');

  test('has all four CEO review modes defined', () => {
    const modes = ['SCOPE EXPANSION', 'SELECTIVE EXPANSION', 'HOLD SCOPE', 'SCOPE REDUCTION'];
    for (const mode of modes) {
      expect(content).toContain(mode);
    }
  });

  test('has CEO plan persistence step', () => {
    expect(content).toContain('ceo-plans');
    expect(content).toContain('status: ACTIVE');
  });

  test('has docs/designs promotion section', () => {
    expect(content).toContain('docs/designs');
    expect(content).toContain('PROMOTED');
  });

  test('mode quick reference has four columns', () => {
    expect(content).toContain('EXPANSION');
    expect(content).toContain('SELECTIVE');
    expect(content).toContain('HOLD SCOPE');
    expect(content).toContain('REDUCTION');
  });

  // Skill chaining (benefits-from)
  test('contains prerequisite skill offer for office-hours', () => {
    expect(content).toContain('Prerequisite Skill Offer');
    expect(content).toContain('/office-hours');
  });

  test('contains mid-session detection', () => {
    expect(content).toContain('Mid-session detection');
    expect(content).toMatch(/still figuring out|seems lost/i);
  });

  // Spec review on CEO plans
  test('contains spec review loop for CEO plan documents', () => {
    expect(content).toContain('Spec Review Loop');
  });
});

// --- gstack-slug helper ---

describe('gstack-slug', () => {
  const SLUG_BIN = path.join(ROOT, 'bin', 'gstack-slug');

  test('binary exists and is executable', () => {
    expect(fs.existsSync(SLUG_BIN)).toBe(true);
    const stat = fs.statSync(SLUG_BIN);
    expect(stat.mode & 0o111).toBeGreaterThan(0);
  });

  test('outputs SLUG and BRANCH lines in a git repo', () => {
    const result = Bun.spawnSync([SLUG_BIN], { cwd: ROOT, stdout: 'pipe', stderr: 'pipe' });
    expect(result.exitCode).toBe(0);
    const output = result.stdout.toString();
    expect(output).toContain('SLUG=');
    expect(output).toContain('BRANCH=');
  });

  test('SLUG does not contain forward slashes', () => {
    const result = Bun.spawnSync([SLUG_BIN], { cwd: ROOT, stdout: 'pipe', stderr: 'pipe' });
    const slug = result.stdout.toString().match(/SLUG=(.*)/)?.[1] ?? '';
    expect(slug).not.toContain('/');
    expect(slug.length).toBeGreaterThan(0);
  });

  test('BRANCH does not contain forward slashes', () => {
    const result = Bun.spawnSync([SLUG_BIN], { cwd: ROOT, stdout: 'pipe', stderr: 'pipe' });
    const branch = result.stdout.toString().match(/BRANCH=(.*)/)?.[1] ?? '';
    expect(branch).not.toContain('/');
    expect(branch.length).toBeGreaterThan(0);
  });

  test('output is eval-compatible (KEY=VALUE format)', () => {
    const result = Bun.spawnSync([SLUG_BIN], { cwd: ROOT, stdout: 'pipe', stderr: 'pipe' });
    const lines = result.stdout.toString().trim().split('\n');
    expect(lines.length).toBe(2);
    expect(lines[0]).toMatch(/^SLUG=.+/);
    expect(lines[1]).toMatch(/^BRANCH=.+/);
  });

  test('output values contain only safe characters (no shell metacharacters)', () => {
    const result = Bun.spawnSync([SLUG_BIN], { cwd: ROOT, stdout: 'pipe', stderr: 'pipe' });
    const slug = result.stdout.toString().match(/SLUG=(.*)/)?.[1] ?? '';
    const branch = result.stdout.toString().match(/BRANCH=(.*)/)?.[1] ?? '';
    // Only alphanumeric, dot, dash, underscore are allowed (#133)
    expect(slug).toMatch(/^[a-zA-Z0-9._-]+$/);
    expect(branch).toMatch(/^[a-zA-Z0-9._-]+$/);
  });

test('eval sets variables under bash with set -euo pipefail', () => {
|
|
const result = Bun.spawnSync(
|
|
['bash', '-c', 'set -euo pipefail; eval "$(./bin/gstack-slug 2>/dev/null)"; echo "SLUG=$SLUG"; echo "BRANCH=$BRANCH"'],
|
|
{ cwd: ROOT, stdout: 'pipe', stderr: 'pipe' }
|
|
);
|
|
expect(result.exitCode).toBe(0);
|
|
const output = result.stdout.toString();
|
|
expect(output).toMatch(/^SLUG=.+/m);
|
|
expect(output).toMatch(/^BRANCH=.+/m);
|
|
});
|
|
|
|
test('no templates or bin scripts use source process substitution for gstack-slug', () => {
|
|
const result = Bun.spawnSync(
|
|
['grep', '-r', 'source <(.*gstack-slug', '--include=*.tmpl', '--include=gstack-review-*', '.'],
|
|
{ cwd: ROOT, stdout: 'pipe', stderr: 'pipe' }
|
|
);
|
|
// grep returns exit code 1 when no matches found — that's what we want
|
|
expect(result.stdout.toString().trim()).toBe('');
|
|
});
|
|
});
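
// The KEY=VALUE contract asserted above can also be consumed from TypeScript.
// A minimal sketch, assuming the two-line SLUG=/BRANCH= output and the #133
// safe-character rule; parseSlugOutput is a hypothetical helper, not repo API.

```typescript
// Hypothetical helper (not part of the repo): parses gstack-slug stdout
// under the same safety rules the tests above enforce.
function parseSlugOutput(stdout: string): { slug: string; branch: string } {
  const get = (key: string): string => {
    // Values are restricted to [a-zA-Z0-9._-], mirroring the #133 rule.
    const m = stdout.match(new RegExp(`^${key}=([a-zA-Z0-9._-]+)$`, 'm'));
    if (!m) throw new Error(`missing or unsafe ${key} in gstack-slug output`);
    return m[1];
  };
  return { slug: get('SLUG'), branch: get('BRANCH') };
}
```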

// --- Test Bootstrap validation ---

describe('Test Bootstrap ({{TEST_BOOTSTRAP}}) integration', () => {
  test('TEST_BOOTSTRAP resolver produces valid content', () => {
    const qaContent = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(qaContent).toContain('Test Framework Bootstrap');
    expect(qaContent).toContain('RUNTIME:ruby');
    expect(qaContent).toContain('RUNTIME:node');
    expect(qaContent).toContain('RUNTIME:python');
    expect(qaContent).toContain('no-test-bootstrap');
    expect(qaContent).toContain('BOOTSTRAP_DECLINED');
  });

  test('TEST_BOOTSTRAP appears in qa/SKILL.md', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Test Framework Bootstrap');
    expect(content).toContain('TESTING.md');
    expect(content).toContain('CLAUDE.md');
  });

  test('TEST_BOOTSTRAP appears in ship/SKILL.md', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Test Framework Bootstrap');
    expect(content).toContain('Step 4');
  });

  test('TEST_BOOTSTRAP appears in design-review/SKILL.md', () => {
    const content = fs.readFileSync(path.join(ROOT, 'design-review', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Test Framework Bootstrap');
  });

  test('TEST_BOOTSTRAP does NOT appear in qa-only/SKILL.md', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa-only', 'SKILL.md'), 'utf-8');
    expect(content).not.toContain('Test Framework Bootstrap');
    // But should have the recommendation note
    expect(content).toContain('No test framework detected');
    expect(content).toContain('Run `/qa` to bootstrap');
  });

  test('bootstrap includes framework knowledge table', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('vitest');
    expect(content).toContain('minitest');
    expect(content).toContain('pytest');
    expect(content).toContain('cargo test');
    expect(content).toContain('phpunit');
    expect(content).toContain('ExUnit');
  });

  test('bootstrap includes CI/CD pipeline generation', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('.github/workflows/test.yml');
    expect(content).toContain('GitHub Actions');
  });

  test('bootstrap includes first real tests step', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('First real tests');
    expect(content).toContain('git log --since=30.days');
    expect(content).toContain('Prioritize by risk');
  });

  test('bootstrap includes vibe coding philosophy', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('vibe coding');
    expect(content).toContain('100% test coverage');
  });

  test('WebSearch is in allowed-tools for qa, ship, design-review', () => {
    const qa = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    const ship = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    const qaDesign = fs.readFileSync(path.join(ROOT, 'design-review', 'SKILL.md'), 'utf-8');
    expect(qa).toContain('WebSearch');
    expect(ship).toContain('WebSearch');
    expect(qaDesign).toContain('WebSearch');
  });
});
// --- Phase 8e.5 regression test validation ---

describe('Phase 8e.5 regression test generation', () => {
  test('qa/SKILL.md contains Phase 8e.5', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('8e.5. Regression Test');
    expect(content).toContain('test(qa): regression test');
    expect(content).toContain('WTF-likelihood exclusion');
  });

  test('qa/SKILL.md Rule 13 is amended for regression tests', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Only modify tests when generating regression tests in Phase 8e.5');
    expect(content).not.toContain('Never modify tests or CI configuration');
  });

  test('design-review has CSS-aware Phase 8e.5 variant', () => {
    const content = fs.readFileSync(path.join(ROOT, 'design-review', 'SKILL.md'), 'utf-8');
    expect(content).toContain('8e.5. Regression Test (design-review variant)');
    expect(content).toContain('CSS-only');
    expect(content).toContain('test(design): regression test');
  });

  test('regression test includes full attribution comment format', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('// Regression: ISSUE-NNN');
    expect(content).toContain('// Found by /qa on');
    expect(content).toContain('// Report: .gstack/qa-reports/');
  });

  test('regression test uses auto-incrementing names', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'SKILL.md'), 'utf-8');
    expect(content).toContain('auto-incrementing');
    expect(content).toContain('max number + 1');
  });
});
// --- Step 3.4 coverage audit validation ---

describe('Step 3.4 test coverage audit', () => {
  test('ship/SKILL.md contains Step 7', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Step 7: Test Coverage Audit');
    // The coverage diagram collapses code-path and user-flow counts onto one
    // summary line. Verify that summary is present (labels are stable).
    expect(content).toContain('Code paths:');
  });

  test('Step 3.4 includes quality scoring rubric', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('★★★');
    expect(content).toContain('★★');
    expect(content).toContain('edge cases AND error paths');
    expect(content).toContain('happy path only');
  });

  test('Step 3.4 includes before/after test count', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Count test files before');
    expect(content).toContain('Count test files after');
  });

  test('ship PR body includes Test Coverage section', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('## Test Coverage');
  });

  test('ship rules include test generation rule', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Step 7 generates coverage tests');
    expect(content).toContain('Never commit failing tests');
  });

  test('Step 3.4 includes vibe coding philosophy', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('vibe coding becomes yolo coding');
  });

  test('Step 3.4 traces actual codepaths, not just syntax', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Trace every codepath');
    expect(content).toContain('Trace data flow');
    expect(content).toContain('Diagram the execution');
  });

  test('Step 3.4 maps user flows and interaction edge cases', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Map user flows');
    expect(content).toContain('Interaction edge cases');
    expect(content).toContain('Double-click');
    expect(content).toContain('Navigate away');
    expect(content).toContain('Error states the user can see');
    expect(content).toContain('Empty/zero/boundary states');
  });

  test('Step 3.4 diagram includes user-flow coverage summary', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    // The diagram was compressed from separate CODE PATH COVERAGE / USER FLOW
    // COVERAGE section headers into a single summary line. Assert on the
    // labels that still appear on that summary line.
    expect(content).toContain('Code paths:');
    expect(content).toContain('User flows:');
  });
});
// --- Ship step numbering regression guard ---

describe('ship step numbering', () => {
  // Allowed sub-steps that are resolver-generated and intentionally nested:
  // 8.1 (Plan Verification), 8.2 (Scope Drift), 9.1 (Review Army), 9.2 (Findings Merge),
  // 9.3 (Cross-review dedup), 15.0 (WIP squash — continuous checkpoint), 15.1 (Bisectable commits).
  const ALLOWED_SUBSTEPS = new Set(['8.1', '8.2', '9.1', '9.2', '9.3', '15.0', '15.1']);

  test('ship/SKILL.md.tmpl contains no unexpected fractional step numbers', () => {
    const tmpl = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md.tmpl'), 'utf-8');
    // Match "Step X.Y" where X.Y is a decimal step reference (e.g., "Step 3.47", "Step 8.1")
    const matches = Array.from(tmpl.matchAll(/Step (\d+\.\d+)/g));
    const violations = matches
      .map((m) => m[1])
      .filter((n) => !ALLOWED_SUBSTEPS.has(n));
    if (violations.length > 0) {
      const unique = Array.from(new Set(violations)).sort();
      throw new Error(
        `ship/SKILL.md.tmpl contains fractional step numbers that are not in the allowed sub-step list.\n` +
        ` Found: ${unique.join(', ')}\n` +
        ` Allowed sub-steps: ${Array.from(ALLOWED_SUBSTEPS).sort().join(', ')}\n` +
        ` Fix: use clean integer step numbers (1-20), or add to ALLOWED_SUBSTEPS if intentional.`
      );
    }
  });

  test('ship/SKILL.md main headings use clean integer step numbers', () => {
    const skill = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    // Headings like "## Step 7: Test Coverage Audit" — NOT sub-steps like "## Step 8.1:"
    const headings = Array.from(skill.matchAll(/^## Step (\d+(?:\.\d+)?):/gm)).map(
      (m) => m[1]
    );
    const fractional = headings.filter((n) => n.includes('.'));
    const unexpected = fractional.filter((n) => !ALLOWED_SUBSTEPS.has(n));
    expect(unexpected).toEqual([]);
  });

  test('review/SKILL.md step numbers unchanged (regression guard for resolver conditionals)', () => {
    const skill = fs.readFileSync(path.join(ROOT, 'review', 'SKILL.md'), 'utf-8');
    // /review uses its own fractional numbering: 1.5, 2.5, 4.5, 5.5, 5.6, 5.7, 5.8
    // If the ship-side renumber accidentally touched the review-side of resolver conditionals,
    // these would vanish. This test catches that.
    expect(skill).toContain('## Step 1.5: Scope Drift Detection');
    expect(skill).toContain('## Step 4.5: Review Army');
    expect(skill).toContain('## Step 5.7: Adversarial review');
  });
});
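
// The fractional-step scan above can be factored into a standalone helper.
// A sketch under the same assumptions (findUnexpectedSubsteps is
// hypothetical, not repo API):

```typescript
// Collect every "Step X.Y" reference in a document that is not in the
// allow-list, deduplicated and sorted: the same shape the guard reports.
function findUnexpectedSubsteps(doc: string, allowed: Set<string>): string[] {
  const found = Array.from(doc.matchAll(/Step (\d+\.\d+)/g), (m) => m[1]);
  return Array.from(new Set(found.filter((n) => !allowed.has(n)))).sort();
}
```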

// --- Retro test health validation ---

describe('Retro test health tracking', () => {
  test('retro/SKILL.md has test health data gathering commands', () => {
    const content = fs.readFileSync(path.join(ROOT, 'retro', 'SKILL.md'), 'utf-8');
    expect(content).toContain('# 10. Test file count');
    expect(content).toContain('# 11. Regression test commits');
    expect(content).toContain('# 12. Test files changed');
  });

  test('retro/SKILL.md has Test Health metrics row', () => {
    const content = fs.readFileSync(path.join(ROOT, 'retro', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Test Health');
    expect(content).toContain('regression tests');
  });

  test('retro/SKILL.md has Test Health narrative section', () => {
    const content = fs.readFileSync(path.join(ROOT, 'retro', 'SKILL.md'), 'utf-8');
    expect(content).toContain('### Test Health');
    expect(content).toContain('Total test files');
    expect(content).toContain('vibe coding safe');
  });

  test('retro JSON schema includes test_health field', () => {
    const content = fs.readFileSync(path.join(ROOT, 'retro', 'SKILL.md'), 'utf-8');
    expect(content).toContain('test_health');
    expect(content).toContain('total_test_files');
    expect(content).toContain('regression_test_commits');
  });
});
// --- QA report template regression tests section ---

describe('QA report template', () => {
  test('qa-report-template.md has Regression Tests section', () => {
    const content = fs.readFileSync(path.join(ROOT, 'qa', 'templates', 'qa-report-template.md'), 'utf-8');
    expect(content).toContain('## Regression Tests');
    expect(content).toContain('committed / deferred / skipped');
    expect(content).toContain('### Deferred Tests');
    expect(content).toContain('**Precondition:**');
  });
});
// --- Codex skill validation ---

describe('Codex skill', () => {
  test('codex/SKILL.md exists and has correct frontmatter', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('name: codex');
    expect(content).toContain('version: 1.0.0');
    expect(content).toContain('allowed-tools:');
  });

  test('codex/SKILL.md contains all three modes', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Step 2A: Review Mode');
    expect(content).toContain('Step 2B: Challenge');
    expect(content).toContain('Step 2C: Consult Mode');
  });

  test('codex/SKILL.md contains gate verdict logic', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('[P1]');
    expect(content).toContain('GATE: PASS');
    expect(content).toContain('GATE: FAIL');
  });

  test('codex/SKILL.md contains session continuity', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('codex-session-id');
    expect(content).toContain('codex exec resume');
  });

  test('codex/SKILL.md resume command only uses resume-supported flags', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    const match = content.match(/codex exec resume[^\n]+/);
    expect(match).not.toBeNull();
    const resumeCommand = match![0];
    expect(resumeCommand).not.toContain(' -C ');
    expect(resumeCommand).not.toContain(' -s read-only');
    expect(resumeCommand).toContain("-c 'sandbox_mode=\"read-only\"'");
  });

  test('codex/SKILL.md contains cost tracking', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('tokens used');
    expect(content).toContain('Est. cost');
  });

  test('codex/SKILL.md contains cross-model comparison', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('CROSS-MODEL ANALYSIS');
    expect(content).toContain('Agreement rate');
  });

  test('codex/SKILL.md contains review log persistence', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('codex-review');
    expect(content).toContain('gstack-review-log');
  });

  test('codex/SKILL.md uses which for binary discovery, not hardcoded path', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('which codex');
    expect(content).not.toContain('/opt/homebrew/bin/codex');
  });

  test('codex/SKILL.md contains error handling for missing binary and auth', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('NOT_FOUND');
    expect(content).toContain('codex login');
  });

  test('codex/SKILL.md uses mktemp for temp files', () => {
    const content = fs.readFileSync(path.join(ROOT, 'codex', 'SKILL.md'), 'utf-8');
    expect(content).toContain('mktemp');
  });

  test('codex JSON stream parser uses portable Python discovery', () => {
    const files = ['codex/SKILL.md.tmpl', 'codex/SKILL.md'];

    for (const rel of files) {
      const content = fs.readFileSync(path.join(ROOT, rel), 'utf-8');
      expect(content).toContain('PYTHON_CMD=$(command -v python3 2>/dev/null || command -v python 2>/dev/null || true)');
      expect(content).toContain('PYTHONUNBUFFERED=1 "$PYTHON_CMD" -u -c');
      expect(content).not.toContain('PYTHONUNBUFFERED=1 python3 -u -c');
    }
  });

  test('adversarial review in /review always runs both passes', () => {
    const content = fs.readFileSync(path.join(ROOT, 'review', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Adversarial review (always-on)');
    // Always-on: both Claude and Codex adversarial
    expect(content).toContain('Claude adversarial subagent (always runs)');
    expect(content).toContain('Codex adversarial challenge (always runs when available)');
    // Claude adversarial subagent dispatch
    expect(content).toContain('Agent tool');
    expect(content).toContain('FIXABLE');
    expect(content).toContain('INVESTIGATE');
    // Codex availability check
    expect(content).toContain('CODEX_NOT_AVAILABLE');
    // OLD_CFG only gates Codex, not Claude
    expect(content).toContain('skip Codex passes only');
    // Review log
    expect(content).toContain('adversarial-review');
    expect(content).toContain('reasoning_effort="high"');
    expect(content).toContain('ADVERSARIAL REVIEW SYNTHESIS');
    // Large diff structured review still gated
    expect(content).toContain('Codex structured review (large diffs only');
    expect(content).toContain('200');
  });

  test('adversarial review in /ship always runs both passes', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Adversarial review (always-on)');
    expect(content).toContain('adversarial-review');
    expect(content).toContain('reasoning_effort="high"');
    expect(content).toContain('Investigate and fix');
    expect(content).toContain('Claude adversarial subagent (always runs)');
  });

  test('scope drift detection in /review and /ship', () => {
    const reviewContent = fs.readFileSync(path.join(ROOT, 'review', 'SKILL.md'), 'utf-8');
    const shipContent = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    // Both should contain scope drift from the shared resolver
    for (const content of [reviewContent, shipContent]) {
      expect(content).toContain('Scope Check:');
      expect(content).toContain('DRIFT DETECTED');
      expect(content).toContain('SCOPE CREEP');
      expect(content).toContain('MISSING REQUIREMENTS');
      expect(content).toContain('stated intent');
    }
  });

  test('codex-host ship/review do NOT contain adversarial review step', () => {
    // .agents/ is gitignored — generate on demand
    Bun.spawnSync(['bun', 'run', 'scripts/gen-skill-docs.ts', '--host', 'codex'], {
      cwd: ROOT, stdout: 'pipe', stderr: 'pipe',
    });
    const shipContent = fs.readFileSync(path.join(ROOT, '.agents', 'skills', 'gstack-ship', 'SKILL.md'), 'utf-8');
    expect(shipContent).not.toContain('codex review --base');
    expect(shipContent).not.toContain('CODEX_REVIEWS');

    const reviewContent = fs.readFileSync(path.join(ROOT, '.agents', 'skills', 'gstack-review', 'SKILL.md'), 'utf-8');
    expect(reviewContent).not.toContain('codex review --base');
    expect(reviewContent).not.toContain('codex_reviews');
    expect(reviewContent).not.toContain('CODEX_REVIEWS');
    expect(reviewContent).not.toContain('adversarial-review');
    expect(reviewContent).not.toContain('Investigate and fix');
  });

  test('codex integration in /plan-eng-review offers plan critique', () => {
    const content = fs.readFileSync(path.join(ROOT, 'plan-eng-review', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Codex');
    expect(content).toContain('codex exec');
  });

  test('/review persists a review-log entry for ship readiness', () => {
    const content = fs.readFileSync(path.join(ROOT, 'review', 'SKILL.md'), 'utf-8');
    expect(content).toContain('"skill":"review"');
    expect(content).toContain('"issues_found":N');
    expect(content).toContain('Persist Eng Review result');
  });

  test('Review Readiness Dashboard includes Adversarial Review row', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Adversarial');
    expect(content).toContain('codex-review');
  });
});
// --- Trigger phrase validation ---

describe('Skill trigger phrases', () => {
  // Skills that must have "Use when" trigger phrases in their description.
  // Excluded: root gstack (browser tool), gstack-upgrade (gstack-specific),
  // humanizer (text tool)
  const SKILLS_REQUIRING_TRIGGERS = [
    'qa', 'qa-only', 'ship', 'review', 'investigate', 'office-hours',
    'plan-ceo-review', 'plan-eng-review', 'plan-design-review',
    'design-review', 'design-consultation', 'retro', 'document-release',
    'codex', 'browse', 'setup-browser-cookies',
  ];

  for (const skill of SKILLS_REQUIRING_TRIGGERS) {
    test(`${skill}/SKILL.md has "Use when" trigger phrases`, () => {
      const skillPath = path.join(ROOT, skill, 'SKILL.md');
      if (!fs.existsSync(skillPath)) return;
      const content = fs.readFileSync(skillPath, 'utf-8');
      // Extract description from frontmatter
      const frontmatterEnd = content.indexOf('---', 4);
      const frontmatter = content.slice(0, frontmatterEnd);
      expect(frontmatter).toMatch(/Use when/i);
    });
  }

  // Skills with proactive triggers should have "Proactively suggest" in description
  const SKILLS_REQUIRING_PROACTIVE = [
    'qa', 'qa-only', 'ship', 'review', 'investigate', 'office-hours',
    'plan-ceo-review', 'plan-eng-review', 'plan-design-review',
    'design-review', 'design-consultation', 'retro', 'document-release',
  ];

  for (const skill of SKILLS_REQUIRING_PROACTIVE) {
    test(`${skill}/SKILL.md has proactive routing phrase`, () => {
      const skillPath = path.join(ROOT, skill, 'SKILL.md');
      if (!fs.existsSync(skillPath)) return;
      const content = fs.readFileSync(skillPath, 'utf-8');
      const frontmatterEnd = content.indexOf('---', 4);
      const frontmatter = content.slice(0, frontmatterEnd);
      expect(frontmatter).toMatch(/Proactively (suggest|invoke)/i);
    });
  }
});
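
// Both loops above slice frontmatter with indexOf('---', 4). A standalone
// sketch of that logic (sliceFrontmatter is hypothetical, not repo API),
// assuming the file opens with a `---` line and the closing `---` starts a line:

```typescript
// Returns the YAML frontmatter block including its delimiters, or '' when
// the document has no leading frontmatter.
function sliceFrontmatter(content: string): string {
  if (!content.startsWith('---')) return '';
  const end = content.indexOf('\n---', 3);
  return end === -1 ? '' : content.slice(0, end + 4);
}
```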

// ─── Private-path leak detector ──────────────────────────────
//
// Catches accidental references to maintainer-private files in skill output.
// Adapted from the McGluut fork's skill-contract-audit.ts (we don't take the
// whole script — these are the unique checks not already covered by
// test/gen-skill-docs.test.ts:1668-2074 .claude/skills leakage tests).

describe('Private-path leak detection', () => {
  const PRIVATE_PATTERNS: Array<{ pattern: RegExp; label: string }> = [
    { pattern: /coordination-board\.md/i, label: 'coordination-board.md' },
    { pattern: /SEEKING_LOG\.md/, label: 'SEEKING_LOG.md' },
    { pattern: /RATIONAL_SUBJECT\.md/, label: 'RATIONAL_SUBJECT.md' },
    { pattern: /VALUE_SIGNAL_LOOP\.md/, label: 'VALUE_SIGNAL_LOOP.md' },
    // \\+ matches one or more literal backslashes, so both the raw path and
    // its escaped (e.g. JSON) form are caught.
    { pattern: /C:\\+LLM Playground\\+go/i, label: 'C:\\LLM Playground\\go' },
  ];

  // Walk every SKILL.md and SKILL.md.tmpl in the repo (excluding node_modules,
  // generated host outputs, and .git).
  function discoverSkillSurface(): string[] {
    const results: string[] = [];
    function walk(dir: string) {
      for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
        if (entry.name.startsWith('.') && entry.name !== '.agents') continue;
        if (entry.name === 'node_modules' || entry.name === 'dist') continue;
        const full = path.join(dir, entry.name);
        if (entry.isDirectory()) {
          walk(full);
        } else if (entry.name === 'SKILL.md' || entry.name === 'SKILL.md.tmpl') {
          results.push(full);
        }
      }
    }
    walk(ROOT);
    return results;
  }

  test('no SKILL.md or SKILL.md.tmpl references private maintainer files', () => {
    const files = discoverSkillSurface();
    expect(files.length).toBeGreaterThan(0);
    const leaks: string[] = [];
    for (const file of files) {
      const content = fs.readFileSync(file, 'utf-8');
      for (const { pattern, label } of PRIVATE_PATTERNS) {
        if (pattern.test(content)) {
          leaks.push(`${path.relative(ROOT, file)} mentions ${label}`);
        }
      }
    }
    expect(leaks).toEqual([]);
  });
});
// ─── Doc-inventory cross-check ───────────────────────────────
//
// Every skill directory (with a SKILL.md.tmpl) must appear in both AGENTS.md
// and docs/skills.md. Catches the inventory drift codex flagged (/debug
// → /investigate; missing /autoplan, /context-save, /plan-devex-review, etc.).

describe('Doc inventory cross-check', () => {
  // Skills that don't get user-invocation lines in agent-facing docs.
  // - 'qa-only' is a sub-mode of /qa with shared docs.
  // - The directories listed below are infrastructure (model overlays,
  //   shipped binary, hosts) that don't show up in the user-facing skill table.
  const DOC_INVENTORY_EXCLUDE = new Set([
    // Infra / non-skills
    'agents', 'claude', 'connect-chrome', 'contrib', 'hosts',
    'lib', 'model-overlays', 'openclaw', 'supabase', 'scripts', 'test',
  ]);

  function discoverSkillDirs(): string[] {
    const dirs: string[] = [];
    for (const entry of fs.readdirSync(ROOT, { withFileTypes: true })) {
      if (!entry.isDirectory()) continue;
      if (entry.name.startsWith('.')) continue;
      if (DOC_INVENTORY_EXCLUDE.has(entry.name)) continue;
      const tmplPath = path.join(ROOT, entry.name, 'SKILL.md.tmpl');
      if (fs.existsSync(tmplPath)) dirs.push(entry.name);
    }
    return dirs.sort();
  }

  test('every skill is documented in AGENTS.md', () => {
    const agents = fs.readFileSync(path.join(ROOT, 'AGENTS.md'), 'utf-8');
    const missing: string[] = [];
    for (const skill of discoverSkillDirs()) {
      // Match `/skill-name` as a token boundary.
      if (!new RegExp(`/${skill}\\b`).test(agents)) missing.push(skill);
    }
    expect(missing).toEqual([]);
  });

  test('every skill is documented in docs/skills.md', () => {
    const docs = fs.readFileSync(path.join(ROOT, 'docs', 'skills.md'), 'utf-8');
    const missing: string[] = [];
    for (const skill of discoverSkillDirs()) {
      if (!new RegExp(`/${skill}\\b`).test(docs)) missing.push(skill);
    }
    expect(missing).toEqual([]);
  });
});
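
// A caveat on the `/${skill}\b` check above: \b also matches before a
// hyphen, so `/plan` would be satisfied by a doc that only mentions
// `/plan-ceo-review`. A stricter sketch (mentionsSkill is hypothetical,
// not repo API):

```typescript
// True only when /skill is followed by neither a word character nor a
// hyphen, so a longer skill name cannot mask a missing shorter one.
function mentionsSkill(doc: string, skill: string): boolean {
  return new RegExp(`/${skill}(?![\\w-])`).test(doc);
}
```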

// ─── Codex Skill Validation ──────────────────────────────────

describe('Codex skill validation', () => {
  const AGENTS_DIR = path.join(ROOT, '.agents', 'skills');

  // .agents/ is gitignored (v0.11.2.0) — generate on demand for tests
  Bun.spawnSync(['bun', 'run', 'scripts/gen-skill-docs.ts', '--host', 'codex'], {
    cwd: ROOT, stdout: 'pipe', stderr: 'pipe',
  });

  // Discover all shared skills with templates.
  // Host-exclusive outside-voice skills are intentionally omitted here:
  // - /codex is Claude-only
  // - /claude is external-host-only
  const CLAUDE_SKILLS_WITH_TEMPLATES = (() => {
    const skills: string[] = [];
    for (const entry of fs.readdirSync(ROOT, { withFileTypes: true })) {
      if (!entry.isDirectory() || entry.name.startsWith('.') || entry.name === 'node_modules') continue;
      if (entry.name === 'codex') continue; // Claude-only skill
      if (entry.name === 'claude') continue; // External-host-only skill
      if (fs.existsSync(path.join(ROOT, entry.name, 'SKILL.md.tmpl'))) {
        skills.push(entry.name);
      }
    }
    return skills;
  })();

  test('all skills (except /codex) have both Claude and Codex variants', () => {
    for (const skillDir of CLAUDE_SKILLS_WITH_TEMPLATES) {
      // Claude variant
      const claudeMd = path.join(ROOT, skillDir, 'SKILL.md');
      expect(fs.existsSync(claudeMd)).toBe(true);

      // Codex variant
      const codexName = skillDir.startsWith('gstack-') ? skillDir : `gstack-${skillDir}`;
      const codexMd = path.join(AGENTS_DIR, codexName, 'SKILL.md');
      expect(fs.existsSync(codexMd)).toBe(true);
    }
    // Root template has both too
    expect(fs.existsSync(path.join(ROOT, 'SKILL.md'))).toBe(true);
    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack', 'SKILL.md'))).toBe(true);
  });

  test('/codex skill is Claude-only — no Codex variant', () => {
    // Claude variant should exist
    expect(fs.existsSync(path.join(ROOT, 'codex', 'SKILL.md'))).toBe(true);
    // Codex variant must NOT exist
    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-codex', 'SKILL.md'))).toBe(false);
  });

  test('/claude skill is external-host-only — no Claude-host variant', () => {
    // Claude host should not get an outside-voice skill that shells into Claude.
    expect(fs.existsSync(path.join(ROOT, 'claude', 'SKILL.md'))).toBe(false);
    // Codex/external hosts should get the generated wrapper.
    expect(fs.existsSync(path.join(AGENTS_DIR, 'gstack-claude', 'SKILL.md'))).toBe(true);
  });

  test('Codex skill names follow gstack-{name} convention', () => {
    const codexDirs = fs.readdirSync(AGENTS_DIR);
    for (const dir of codexDirs) {
      // Every directory should start with gstack
      expect(dir.startsWith('gstack')).toBe(true);
      // Root is just 'gstack', others are 'gstack-{name}'
      if (dir !== 'gstack') {
        expect(dir.startsWith('gstack-')).toBe(true);
      }
    }
  });

  test('$B commands in Codex SKILL.md files are valid browse commands', () => {
    const codexDirs = fs.readdirSync(AGENTS_DIR);
    for (const dir of codexDirs) {
      const skillMd = path.join(AGENTS_DIR, dir, 'SKILL.md');
      if (!fs.existsSync(skillMd)) continue;
      const content = fs.readFileSync(skillMd, 'utf-8');
      // Only validate if the skill contains $B commands
      if (!content.includes('$B ')) continue;
      const result = validateSkill(skillMd);
      expect(result.invalid).toHaveLength(0);
    }
  });
});

// --- Repo mode and test failure triage validation ---

describe('Repo mode preamble validation', () => {
  test('generated SKILL.md preamble contains REPO_MODE output', () => {
    const content = fs.readFileSync(path.join(ROOT, 'SKILL.md'), 'utf-8');
    expect(content).toContain('REPO_MODE:');
    expect(content).toContain('gstack-repo-mode');
  });

  test('tier 3+ skills contain See Something Say Something section', () => {
    // Root SKILL.md is tier 1 (no Repo Mode). Check a tier 3 skill instead.
    const content = fs.readFileSync(path.join(ROOT, 'plan-ceo-review', 'SKILL.md'), 'utf-8');
    expect(content).toContain('See Something, Say Something');
    expect(content).toContain('REPO_MODE');
    expect(content).toContain('solo');
    expect(content).toContain('collaborative');
  });
});

describe('Test failure triage in ship skill', () => {
  test('ship/SKILL.md contains Test Failure Ownership Triage', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('Test Failure Ownership Triage');
  });

  test('ship/SKILL.md triage uses git diff for classification', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('git diff origin/<base>...HEAD --name-only');
  });

  test('ship/SKILL.md triage has solo and collaborative paths', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('REPO_MODE');
    expect(content).toContain('solo');
    expect(content).toContain('collaborative');
    expect(content).toContain('Investigate and fix now');
    expect(content).toContain('Add as P0 TODO');
  });

  test('ship/SKILL.md triage has GitHub issue assignment for collaborative mode', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('gh issue create');
    expect(content).toContain('--assignee');
  });

  test('{{TEST_FAILURE_TRIAGE}} placeholder is fully resolved in ship/SKILL.md', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).not.toContain('{{TEST_FAILURE_TRIAGE}}');
  });

  test('ship/SKILL.md uses in-branch language for stop condition', () => {
    const content = fs.readFileSync(path.join(ROOT, 'ship', 'SKILL.md'), 'utf-8');
    expect(content).toContain('In-branch test failures');
  });
});

describe('no compiled binaries in git', () => {
  // Tracked files enumerated once and reused by both assertions. A single
  // `git ls-files -z` plus a split runs in milliseconds; the previous
  // per-file xargs shell loops blew past the 5s timeout on CI.
  const trackedFiles: string[] = require('child_process')
    .execSync('git ls-files -z', { cwd: ROOT, encoding: 'utf-8' })
    .split('\0')
    .filter(Boolean);

  test('git tracks no Mach-O or ELF binaries', () => {
    // Only mode 100755 (executable) files can be binaries we care about. Pre-filter
    // via git ls-files -s to avoid running `file` on every text file.
    const lsOut: string = require('child_process').execSync('git ls-files -s', {
      cwd: ROOT,
      encoding: 'utf-8',
    });
    const executableFiles = lsOut
      .split('\n')
      .filter(Boolean)
      .map((line: string) => {
        const parts = line.split(/\s+/);
        return { mode: parts[0], file: line.split('\t')[1] };
      })
      .filter((e: { mode: string; file: string }) => e.mode === '100755')
      .map((e: { mode: string; file: string }) => e.file);

    if (executableFiles.length === 0) return;

    // Batch-invoke `file --mime-type` across all executable files at once.
    const result: string = require('child_process')
      .execSync(`file --mime-type -- ${executableFiles.map((f: string) => `'${f.replace(/'/g, "'\\''")}'`).join(' ')}`, {
        cwd: ROOT,
        encoding: 'utf-8',
      })
      .trim();

    const binaries = result
      .split('\n')
      .filter((l: string) =>
        /application\/(x-mach-binary|x-executable|x-pie-executable|x-sharedlib)/.test(l)
      )
      .map((l: string) => l.split(':')[0].trim());

    expect(binaries).toEqual([]);
  });

  test('warns about tracked files larger than 2MB', () => {
    // Large fixtures can be legitimate test infrastructure. Keep visibility on
    // repository size without blocking those fixtures from living in git.
    // Known-good fixtures are exempted from the warning to keep CI logs clean.
    const MAX_BYTES = 2 * 1024 * 1024;
    const knownLargeFixtures = new Set([
      // Deterministic replay fixture for BrowseSafe-Bench. The live bench is
      // expensive; this file is intentionally committed so the gate is free.
      'browse/test/fixtures/security-bench-haiku-responses.json',
    ]);
    const oversized = trackedFiles.flatMap((f: string) => {
      if (knownLargeFixtures.has(f)) return [];
      const full = path.join(ROOT, f);
      try {
        const size = fs.statSync(full).size;
        return size > MAX_BYTES ? [{ file: f, size }] : [];
      } catch {
        return [];
      }
    });

    if (oversized.length > 0) {
      const formatted = oversized
        .map(({ file, size }: { file: string; size: number }) => {
          const mib = (size / (1024 * 1024)).toFixed(1);
          return `${file} (${mib} MiB)`;
        })
        .join(', ');
      console.warn(`[size-warning] tracked files over 2 MiB: ${formatted}`);
    }

    expect(Array.isArray(oversized)).toBe(true);
  });
});

// The `sidebar agent (#584)` describe block was here. sidebar-agent.ts and
// the entire chat-queue path were ripped out in favor of the interactive
// claude PTY (terminal-agent.ts); these assertions had no target file left.
// Terminal-pane invariants are covered by browse/test/sidebar-tabs.test.ts
// and browse/test/terminal-agent.test.ts.

// ─── Browser-skills validation ──────────────────────────────────
//
// Browser-skills are bundled in <gstack-root>/browser-skills/<name>/. Each
// must have a SKILL.md whose frontmatter satisfies the contract enforced by
// browse/src/browser-skills.ts:parseSkillFile (host required, args + triggers
// parseable as the right shape). This test catches malformed bundled skills
// at CI time, before they ship.

describe('Bundled browser-skills frontmatter contract', () => {
  const browserSkillsRoot = path.join(ROOT, 'browser-skills');

  function listBundledSkillDirs(): string[] {
    if (!fs.existsSync(browserSkillsRoot)) return [];
    return fs.readdirSync(browserSkillsRoot)
      .filter(name => !name.startsWith('.'))
      .map(name => path.join(browserSkillsRoot, name))
      .filter(dir => {
        try { return fs.statSync(dir).isDirectory(); } catch { return false; }
      });
  }

  test('each bundled skill has a SKILL.md', () => {
    for (const dir of listBundledSkillDirs()) {
      const skillFile = path.join(dir, 'SKILL.md');
      expect(fs.existsSync(skillFile)).toBe(true);
    }
  });

  test('each bundled skill SKILL.md frontmatter parses with required fields', async () => {
    const { parseSkillFile } = await import('../browse/src/browser-skills');
    for (const dir of listBundledSkillDirs()) {
      const name = path.basename(dir);
      const content = fs.readFileSync(path.join(dir, 'SKILL.md'), 'utf-8');
      // parseSkillFile throws on missing required fields; we just want to
      // make sure none of our shipped skills trips it.
      const { frontmatter } = parseSkillFile(content, { skillName: name });
      expect(frontmatter.name).toBe(name);
      expect(typeof frontmatter.host).toBe('string');
      expect(frontmatter.host.length).toBeGreaterThan(0);
      expect(Array.isArray(frontmatter.triggers)).toBe(true);
      expect(Array.isArray(frontmatter.args)).toBe(true);
    }
  });

  test('each bundled skill has a script.ts', () => {
    for (const dir of listBundledSkillDirs()) {
      expect(fs.existsSync(path.join(dir, 'script.ts'))).toBe(true);
    }
  });

  test('each bundled skill ships a sibling SDK at _lib/browse-client.ts', () => {
    for (const dir of listBundledSkillDirs()) {
      expect(fs.existsSync(path.join(dir, '_lib', 'browse-client.ts'))).toBe(true);
    }
  });

  test('each bundled skill has a script.test.ts', () => {
    for (const dir of listBundledSkillDirs()) {
      expect(fs.existsSync(path.join(dir, 'script.test.ts'))).toBe(true);
    }
  });

  test("each bundled skill's _lib/browse-client.ts matches the canonical SDK", () => {
    // If the canonical SDK changes, the bundled copy must be updated. This
    // test enforces that — the _lib copy should be byte-identical.
    const canonical = fs.readFileSync(path.join(ROOT, 'browse', 'src', 'browse-client.ts'), 'utf-8');
    for (const dir of listBundledSkillDirs()) {
      const sibling = fs.readFileSync(path.join(dir, '_lib', 'browse-client.ts'), 'utf-8');
      expect(sibling).toBe(canonical);
    }
  });

  test('script.ts imports browse from ./_lib/browse-client', () => {
    for (const dir of listBundledSkillDirs()) {
      const content = fs.readFileSync(path.join(dir, 'script.ts'), 'utf-8');
      expect(content).toMatch(/from\s+['"]\.\/_lib\/browse-client['"]/);
    }
  });
});