chore: bump version and changelog (v1.26.2.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-05-03 20:08:41 -07:00
parent 9df3e140d5
commit e213c10da8
3 changed files with 53 additions and 2 deletions

View File

@@ -1,5 +1,56 @@
# Changelog
## [1.26.2.0] - 2026-05-03
## **`/plan-eng-review` always asks. Never silently writes findings to your plan first.**
Plan-mode review skills now have a hard STOP gate before any AskUserQuestion. The bug
this closes: a `/plan-eng-review` session would do Step 0 scope challenge, find real
issues, write the findings into the plan file as prose, then call `ExitPlanMode`
never invoking AskUserQuestion. The user only saw "ready to execute" with the model's
opinions already baked in. The tool to surface the question existed, the prompt
told the model to use it, and the model still routed around it.
Five sites in `plan-eng-review/SKILL.md.tmpl` now use the office-hours `b512be71`
pattern verbatim: "the AskUserQuestion call is a tool_use, not prose — call the
tool directly," named blockers ("do not edit the plan file, do not call
ExitPlanMode"), and an anti-rationalization clause ("loading the schema via
ToolSearch and writing the recommendation as chat prose is the failure mode this
gate exists to prevent"). The four review-section gates (Architecture, Code
Quality, Test, Performance) and the Step 0 complexity-check trigger all use the
same language.
### What you can now do
- **Trust that any plan-* review skill that produces a plan file ends with the review report.** All four plan-mode E2E tests (`plan-eng`, `plan-ceo`, `plan-design`, `plan-devex`) now assert `## GSTACK REVIEW REPORT` is the last `## ` section of the plan file whenever one was written. The `{{PLAN_FILE_REVIEW_REPORT}}` resolver mandated this contract; nothing tested it until now.
- **Seed a draft plan into a `runPlanSkillObservation` test run.** New `initialPlanContent` option in the runner pre-pumps a user message before invoking the skill, so STOP-gate regression tests have guaranteed-finding-triggering complexity (8+ files, custom-vs-builtin smell) to react to. Without this, plan mode created an empty plan and the skill had nothing to find.
- **Catch the "writes findings to plan as prose before asking" failure mode.** New `wrote_findings_before_asking` classifier outcome fires when a `Write`/`Edit` to `.claude/plans/*` precedes any AskUserQuestion render in the session window. Opt-in via `strictPlanWrites: true` so existing tests where zero-findings → write plan → plan_ready stays legitimate.
- **Run `plan-design-review-plan-mode` on PR CI again.** The touchfiles entry was duplicated — `plan-design-review-plan-mode` appeared at line 94 (gate, full deps) and line 243 (smaller deps). JS object literals: later wins. The effective tier was `periodic`, not `gate`. Three of four plan-mode siblings ran on every PR; design didn't.
### Itemized changes
#### Added
- `runPlanSkillObservation`'s `initialPlanContent?: string` option. Pre-pumps a user message containing the seeded plan before invoking the skill, with a 3s gap so the message renders before the slash command.
- `ClassifyResult` outcome `wrote_findings_before_asking` with companion `strictPlanWrites?` opt on `classifyVisible`. Six new unit tests in `claude-pty-runner.unit.test.ts` cover before/after-AUQ ordering plus the strict-off legacy path.
- Shared test helper `assertReportAtBottomIfPlanWritten(obs)` in `claude-pty-runner.ts`. Wraps the existing `assertReviewReportAtBottom(content)` and gates on `obs.planFile` (artifact existing), so the assertion fires under `'asked'` and `'plan_ready'` both — wherever a plan file was actually written.
- New seeded-plan test case in `skill-e2e-plan-eng-plan-mode.test.ts`: `STOP gate fires when seeded plan forces Step 0 findings`. Combines `initialPlanContent` + `--disallowedTools AskUserQuestion` to force the Conductor MCP-variant path through `mcp__*__AskUserQuestion`.
#### Changed
- `plan-eng-review/SKILL.md.tmpl` lines 116, 139, 152, 160, 169 ported from soft "STOP." prose to the office-hours pattern. Adds tool_use reminder, names blocked next steps explicitly, anti-rationalization clause.
- `runPlanSkillObservation` now captures `obs.planFile` on every classifier outcome (was: only `'plan_ready'`). Catches the case where the skill wrote a plan partway through then paused on a question.
#### Fixed
- `test/helpers/touchfiles.ts` duplicate `plan-design-review-plan-mode` keys deleted (line 243 in `E2E_TOUCHFILES`, line 524 in `E2E_TIERS`). Effective tier is now `gate` again, matching the other three siblings.
- `scripts/resolvers/review.ts` added to all four plan-mode-test touchfiles entries so changes to the `{{PLAN_FILE_REVIEW_REPORT}}` resolver text trigger all four sibling tests in `bun run eval:select`.
#### For contributors
- 6 new classifier unit tests in `test/helpers/claude-pty-runner.unit.test.ts` (70 → 76).
- Outside-voice review by Codex (gpt-5-codex / medium) caught three blockers in the planning stage: a duplicate weaker `planFileEndsWithReviewReport` helper proposal (rejected — reuse existing `assertReviewReportAtBottom`), the seeded-plan-invisibility issue (resolved via `initialPlanContent`), and the touchfiles duplicate-key bug (resolved here). Plan iteration tracked as D3/D4/D5 in `~/.claude/plans/system-instruction-you-are-working-fancy-aho.md`.
## [1.26.0.0] - 2026-05-02
## **Your coding agent now remembers everything. Every gstack skill auto-loads what you actually did.**

View File

@@ -1 +1 @@
1.26.0.0
1.26.2.0

View File

@@ -1,6 +1,6 @@
{
"name": "gstack",
"version": "1.26.0.0",
"version": "1.26.2.0",
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
"license": "MIT",
"type": "module",