Merge remote-tracking branch 'origin/main' into garrytan/plan-tune-skill

Conflicts resolved:
- VERSION / package.json: keep 1.0.0.0 (our v1 milestone bump from
  commit 18fc95c0, supersedes main's 0.18.4.0).
- CHANGELOG.md: preserved both entries in chronological order — v1.0.0.0
  (writing-style + throughput receipts) above v0.18.4.0 (codex + Apple
  Silicon hardening). No gaps in sequence: 1.0.0.0 → 0.19.0.0 → 0.18.4.0
  → 0.18.3.0 → ...

Main added v0.18.4.0 (#1056): codex + Apple Silicon hardening wave.

Regenerated all SKILL.md files after merge so they pick up both main's
template changes AND our v1 writing-style + question-tuning additions.

Full free test suite: 1288 pass, 0 fail, 113 skip across 37 files, 8555
expect() calls.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-04-18 13:59:54 +08:00
27 changed files with 1056 additions and 72 deletions

View File

@@ -1034,6 +1034,39 @@ Loaded review skills from disk. Starting full review pipeline with auto-decision
---
## Phase 0.5: Codex auth + version preflight
Before invoking any Codex voice, preflight the CLI: verify auth (multi-signal) and
warn on known-bad CLI versions. This is infrastructure for all 4 phases below —
source it once here and the helper functions stay in scope for the rest of the
workflow.
```bash
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
source ~/.claude/skills/gstack/bin/gstack-codex-probe
# Check Codex binary. If missing, tag the degradation matrix and continue
# with Claude subagent only (autoplan's existing degradation fallback).
if ! command -v codex >/dev/null 2>&1; then
_gstack_codex_log_event "codex_cli_missing"
echo "[codex-unavailable: binary not found] — proceeding with Claude subagent only"
_CODEX_AVAILABLE=false
elif ! _gstack_codex_auth_probe >/dev/null; then
_gstack_codex_log_event "codex_auth_failed"
echo "[codex-unavailable: auth missing] — proceeding with Claude subagent only. Run \`codex login\` or set \$CODEX_API_KEY to enable dual-voice review."
_CODEX_AVAILABLE=false
else
_gstack_codex_version_check # non-blocking warn if known-bad
_CODEX_AVAILABLE=true
fi
```
If `_CODEX_AVAILABLE=false`, all Phase 1-3.5 Codex voices below degrade to
`[codex-unavailable]` in the degradation matrix. /autoplan completes with
Claude subagent only — saves token spend on Codex prompts we can't use.
---
## Phase 1: CEO Review (Strategy & Scope)
Follow plan-ceo-review/SKILL.md — all sections, full depth.
@@ -1057,7 +1090,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
**Codex CEO voice** (via Bash):
```bash
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
_gstack_codex_timeout_wrapper 600 codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
You are a CEO/founder advisor reviewing a development plan.
Challenge the strategic foundations: Are the premises valid or assumed? Is this the
@@ -1065,9 +1098,15 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
What alternatives were dismissed too quickly? What competitive or market risks are
unaddressed? What scope decisions will look foolish in 6 months? Be adversarial.
No compliments. Just the strategic blind spots.
File: <plan_path>" -C "$_REPO_ROOT" -s read-only --enable web_search_cached
File: <plan_path>" -C "$_REPO_ROOT" -s read-only --enable web_search_cached < /dev/null
_CODEX_EXIT=$?
if [ "$_CODEX_EXIT" = "124" ]; then
_gstack_codex_log_event "codex_timeout" "600"
_gstack_codex_log_hang "autoplan" "0"
echo "[codex stalled past 10 minutes — tagging as [codex-unavailable] for this phase and proceeding with Claude subagent only]"
fi
```
Timeout: 10 minutes
Timeout: 10 minutes (shell-wrapper) + 12 minutes (Bash outer gate). On hang, auto-degrades this phase's Codex voice.
**Claude CEO subagent** (via Agent tool):
"Read the plan file at <plan_path>. You are an independent CEO/strategist
@@ -1168,7 +1207,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
**Codex design voice** (via Bash):
```bash
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
_gstack_codex_timeout_wrapper 600 codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
Read the plan file at <plan_path>. Evaluate this plan's
UI/UX design decisions.
@@ -1182,9 +1221,15 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
accessibility requirements (keyboard nav, contrast, touch targets) specified or
aspirational? Does the plan describe specific UI decisions or generic patterns?
What design decisions will haunt the implementer if left ambiguous?
Be opinionated. No hedging." -C "$_REPO_ROOT" -s read-only --enable web_search_cached
Be opinionated. No hedging." -C "$_REPO_ROOT" -s read-only --enable web_search_cached < /dev/null
_CODEX_EXIT=$?
if [ "$_CODEX_EXIT" = "124" ]; then
_gstack_codex_log_event "codex_timeout" "600"
_gstack_codex_log_hang "autoplan" "0"
echo "[codex stalled past 10 minutes — tagging as [codex-unavailable] for this phase and proceeding with Claude subagent only]"
fi
```
Timeout: 10 minutes
Timeout: 10 minutes (shell-wrapper) + 12 minutes (Bash outer gate). On hang, auto-degrades this phase's Codex voice.
**Claude design subagent** (via Agent tool):
"Read the plan file at <plan_path>. You are an independent senior product designer
@@ -1243,7 +1288,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
**Codex eng voice** (via Bash):
```bash
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
_gstack_codex_timeout_wrapper 600 codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
Review this plan for architectural issues, missing edge cases,
and hidden complexity. Be adversarial.
@@ -1252,9 +1297,15 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
CEO: <insert CEO consensus table summary — key concerns, DISAGREEs>
Design: <insert Design consensus table summary, or 'skipped, no UI scope'>
File: <plan_path>" -C "$_REPO_ROOT" -s read-only --enable web_search_cached
File: <plan_path>" -C "$_REPO_ROOT" -s read-only --enable web_search_cached < /dev/null
_CODEX_EXIT=$?
if [ "$_CODEX_EXIT" = "124" ]; then
_gstack_codex_log_event "codex_timeout" "600"
_gstack_codex_log_hang "autoplan" "0"
echo "[codex stalled past 10 minutes — tagging as [codex-unavailable] for this phase and proceeding with Claude subagent only]"
fi
```
Timeout: 10 minutes
Timeout: 10 minutes (shell-wrapper) + 12 minutes (Bash outer gate). On hang, auto-degrades this phase's Codex voice.
**Claude eng subagent** (via Agent tool):
"Read the plan file at <plan_path>. You are an independent senior engineer
@@ -1358,7 +1409,7 @@ Log: "Phase 3.5 skipped — no developer-facing scope detected."
**Codex DX voice** (via Bash):
```bash
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
_gstack_codex_timeout_wrapper 600 codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
Read the plan file at <plan_path>. Evaluate this plan's developer experience.
@@ -1372,9 +1423,15 @@ Log: "Phase 3.5 skipped — no developer-facing scope detected."
3. API/CLI design: are names guessable? Are defaults sensible? Is it consistent?
4. Docs: can a dev find what they need in under 2 minutes? Are examples copy-paste-complete?
5. Upgrade path: can devs upgrade without fear? Migration guides? Deprecation warnings?
Be adversarial. Think like a developer who is evaluating this against 3 competitors." -C "$_REPO_ROOT" -s read-only --enable web_search_cached
Be adversarial. Think like a developer who is evaluating this against 3 competitors." -C "$_REPO_ROOT" -s read-only --enable web_search_cached < /dev/null
_CODEX_EXIT=$?
if [ "$_CODEX_EXIT" = "124" ]; then
_gstack_codex_log_event "codex_timeout" "600"
_gstack_codex_log_hang "autoplan" "0"
echo "[codex stalled past 10 minutes — tagging as [codex-unavailable] for this phase and proceeding with Claude subagent only]"
fi
```
Timeout: 10 minutes
Timeout: 10 minutes (shell-wrapper) + 12 minutes (Bash outer gate). On hang, auto-degrades this phase's Codex voice.
**Claude DX subagent** (via Agent tool):
"Read the plan file at <plan_path>. You are an independent DX engineer

View File

@@ -234,6 +234,39 @@ Loaded review skills from disk. Starting full review pipeline with auto-decision
---
## Phase 0.5: Codex auth + version preflight
Before invoking any Codex voice, preflight the CLI: verify auth (multi-signal) and
warn on known-bad CLI versions. This is infrastructure for all 4 phases below —
source it once here and the helper functions stay in scope for the rest of the
workflow.
```bash
_TEL=$(~/.claude/skills/gstack/bin/gstack-config get telemetry 2>/dev/null || echo off)
source ~/.claude/skills/gstack/bin/gstack-codex-probe
# Check Codex binary. If missing, tag the degradation matrix and continue
# with Claude subagent only (autoplan's existing degradation fallback).
if ! command -v codex >/dev/null 2>&1; then
_gstack_codex_log_event "codex_cli_missing"
echo "[codex-unavailable: binary not found] — proceeding with Claude subagent only"
_CODEX_AVAILABLE=false
elif ! _gstack_codex_auth_probe >/dev/null; then
_gstack_codex_log_event "codex_auth_failed"
echo "[codex-unavailable: auth missing] — proceeding with Claude subagent only. Run \`codex login\` or set \$CODEX_API_KEY to enable dual-voice review."
_CODEX_AVAILABLE=false
else
_gstack_codex_version_check # non-blocking warn if known-bad
_CODEX_AVAILABLE=true
fi
```
If `_CODEX_AVAILABLE=false`, all Phase 1-3.5 Codex voices below degrade to
`[codex-unavailable]` in the degradation matrix. /autoplan completes with
Claude subagent only — saves token spend on Codex prompts we can't use.
---
## Phase 1: CEO Review (Strategy & Scope)
Follow plan-ceo-review/SKILL.md — all sections, full depth.
@@ -257,7 +290,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
**Codex CEO voice** (via Bash):
```bash
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
_gstack_codex_timeout_wrapper 600 codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
You are a CEO/founder advisor reviewing a development plan.
Challenge the strategic foundations: Are the premises valid or assumed? Is this the
@@ -265,9 +298,15 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
What alternatives were dismissed too quickly? What competitive or market risks are
unaddressed? What scope decisions will look foolish in 6 months? Be adversarial.
No compliments. Just the strategic blind spots.
File: <plan_path>" -C "$_REPO_ROOT" -s read-only --enable web_search_cached
File: <plan_path>" -C "$_REPO_ROOT" -s read-only --enable web_search_cached < /dev/null
_CODEX_EXIT=$?
if [ "$_CODEX_EXIT" = "124" ]; then
_gstack_codex_log_event "codex_timeout" "600"
_gstack_codex_log_hang "autoplan" "0"
echo "[codex stalled past 10 minutes — tagging as [codex-unavailable] for this phase and proceeding with Claude subagent only]"
fi
```
Timeout: 10 minutes
Timeout: 10 minutes (shell-wrapper) + 12 minutes (Bash outer gate). On hang, auto-degrades this phase's Codex voice.
**Claude CEO subagent** (via Agent tool):
"Read the plan file at <plan_path>. You are an independent CEO/strategist
@@ -368,7 +407,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
**Codex design voice** (via Bash):
```bash
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
_gstack_codex_timeout_wrapper 600 codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
Read the plan file at <plan_path>. Evaluate this plan's
UI/UX design decisions.
@@ -382,9 +421,15 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
accessibility requirements (keyboard nav, contrast, touch targets) specified or
aspirational? Does the plan describe specific UI decisions or generic patterns?
What design decisions will haunt the implementer if left ambiguous?
Be opinionated. No hedging." -C "$_REPO_ROOT" -s read-only --enable web_search_cached
Be opinionated. No hedging." -C "$_REPO_ROOT" -s read-only --enable web_search_cached < /dev/null
_CODEX_EXIT=$?
if [ "$_CODEX_EXIT" = "124" ]; then
_gstack_codex_log_event "codex_timeout" "600"
_gstack_codex_log_hang "autoplan" "0"
echo "[codex stalled past 10 minutes — tagging as [codex-unavailable] for this phase and proceeding with Claude subagent only]"
fi
```
Timeout: 10 minutes
Timeout: 10 minutes (shell-wrapper) + 12 minutes (Bash outer gate). On hang, auto-degrades this phase's Codex voice.
**Claude design subagent** (via Agent tool):
"Read the plan file at <plan_path>. You are an independent senior product designer
@@ -443,7 +488,7 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
**Codex eng voice** (via Bash):
```bash
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
_gstack_codex_timeout_wrapper 600 codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
Review this plan for architectural issues, missing edge cases,
and hidden complexity. Be adversarial.
@@ -452,9 +497,15 @@ Override: every AskUserQuestion → auto-decide using the 6 principles.
CEO: <insert CEO consensus table summary — key concerns, DISAGREEs>
Design: <insert Design consensus table summary, or 'skipped, no UI scope'>
File: <plan_path>" -C "$_REPO_ROOT" -s read-only --enable web_search_cached
File: <plan_path>" -C "$_REPO_ROOT" -s read-only --enable web_search_cached < /dev/null
_CODEX_EXIT=$?
if [ "$_CODEX_EXIT" = "124" ]; then
_gstack_codex_log_event "codex_timeout" "600"
_gstack_codex_log_hang "autoplan" "0"
echo "[codex stalled past 10 minutes — tagging as [codex-unavailable] for this phase and proceeding with Claude subagent only]"
fi
```
Timeout: 10 minutes
Timeout: 10 minutes (shell-wrapper) + 12 minutes (Bash outer gate). On hang, auto-degrades this phase's Codex voice.
**Claude eng subagent** (via Agent tool):
"Read the plan file at <plan_path>. You are an independent senior engineer
@@ -558,7 +609,7 @@ Log: "Phase 3.5 skipped — no developer-facing scope detected."
**Codex DX voice** (via Bash):
```bash
_REPO_ROOT=$(git rev-parse --show-toplevel) || { echo "ERROR: not in a git repo" >&2; exit 1; }
codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
_gstack_codex_timeout_wrapper 600 codex exec "IMPORTANT: Do NOT read or execute any SKILL.md files or files in skill definition directories (paths containing skills/gstack). These are AI assistant skill definitions meant for a different system. Stay focused on repository code only.
Read the plan file at <plan_path>. Evaluate this plan's developer experience.
@@ -572,9 +623,15 @@ Log: "Phase 3.5 skipped — no developer-facing scope detected."
3. API/CLI design: are names guessable? Are defaults sensible? Is it consistent?
4. Docs: can a dev find what they need in under 2 minutes? Are examples copy-paste-complete?
5. Upgrade path: can devs upgrade without fear? Migration guides? Deprecation warnings?
Be adversarial. Think like a developer who is evaluating this against 3 competitors." -C "$_REPO_ROOT" -s read-only --enable web_search_cached
Be adversarial. Think like a developer who is evaluating this against 3 competitors." -C "$_REPO_ROOT" -s read-only --enable web_search_cached < /dev/null
_CODEX_EXIT=$?
if [ "$_CODEX_EXIT" = "124" ]; then
_gstack_codex_log_event "codex_timeout" "600"
_gstack_codex_log_hang "autoplan" "0"
echo "[codex stalled past 10 minutes — tagging as [codex-unavailable] for this phase and proceeding with Claude subagent only]"
fi
```
Timeout: 10 minutes
Timeout: 10 minutes (shell-wrapper) + 12 minutes (Bash outer gate). On hang, auto-degrades this phase's Codex voice.
**Claude DX subagent** (via Agent tool):
"Read the plan file at <plan_path>. You are an independent DX engineer