feat: interactive /plan-design-review + CEO invokes designer + 100% coverage (v0.6.4) (#149)

* refactor: rename qa-design-review → design-review

The "qa-" prefix was confusing — this is the live-site design audit with
fix loop, not a QA-only report. Rename directory and update all references
across docs, tests, scripts, and skill templates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: interactive /plan-design-review + CEO invokes designer

Rewrite /plan-design-review from report-only grading to an interactive
plan-fixer that rates each design dimension 0-10, explains what a 10
looks like, and edits the plan to get there. Parallel structure with
/plan-ceo-review and /plan-eng-review — one issue = one AskUserQuestion.

CEO review now detects UI scope and invokes the designer perspective
when the plan has frontend/UX work, so you get design review
automatically when it matters.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: validation + touchfile entries for 100% coverage

Add design-consultation to command/snapshot flag validation. Add 4
skills to contributor mode validation (plan-design-review,
design-review, design-consultation, document-release). Add 2 templates
to hardcoded branch check. Register touchfile entries for 10 new
LLM-judge tests and 1 new E2E test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: LLM-judge for 10 skills + gstack-upgrade E2E

Add LLM-judge quality evals for all uncovered skills using a DRY
runWorkflowJudge helper with section marker guards. Add real E2E
test for gstack-upgrade using mock git remote (replaces test.todo).
Add plan-edit assertion to plan-design-review E2E.

14/15 skills now at full coverage. setup-browser-cookies remains
deferred (needs real browser).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add bisect commit style to CLAUDE.md

All commits should be single logical changes, split before pushing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.6.4.0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Garry Tan
2026-03-17 22:48:48 -05:00
committed by GitHub
parent f91222f5bd
commit 78c207efb4
24 changed files with 1120 additions and 765 deletions

View File

@@ -53,7 +53,7 @@ gstack/
│ └── skill-e2e.test.ts # Tier 2: E2E via claude -p (~$3.85/run)
├── qa-only/ # /qa-only skill (report-only QA, no fixes)
├── plan-design-review/ # /plan-design-review skill (report-only design audit)
├── qa-design-review/ # /qa-design-review skill (design audit + fix loop)
├── design-review/ # /design-review skill (design audit + fix loop)
├── ship/ # Ship workflow skill
├── review/ # PR review skill
├── plan-ceo-review/ # /plan-ceo-review skill
@@ -119,6 +119,22 @@ symlink or a real copy. If it's a symlink to your working directory, be aware th
gen-skill-docs pipeline, consider whether the changes should be tested in isolation
before going live (especially if the user is actively using gstack in other windows).
## Commit style
**Always bisect commits.** Every commit should be a single logical change. When
you've made multiple changes (e.g., a rename + a rewrite + new tests), split them
into separate commits before pushing. Each commit should be independently
understandable and revertable.
Examples of good bisection:
- Rename/move separate from behavior changes
- Test infrastructure (touchfiles, helpers) separate from test implementations
- Template changes separate from generated file regeneration
- Mechanical refactors separate from new features
When the user says "bisect commit" or "bisect and push," split staged/unstaged
changes into logical commits and push.
## CHANGELOG style
CHANGELOG.md is **for users**, not contributors. Write it like product release notes: