Merge origin/main into garrytan/tilde-fix-design

main shipped #1030 (context rot defense for /ship — subagent isolation + clean step numbering) under v0.18.2.0 while this wave's v0.18.2.0 was open. Per CLAUDE.md branch-scoped CHANGELOG rules, bumped this branch to v0.18.3.0 with its own entry on top, kept main's v0.18.2.0 entry intact. Auto-merged from main (no behavioral conflicts): - /ship resolver renumbering (review-army.ts, review.ts, testing.ts, utility.ts) - ship/SKILL.md.tmpl + regenerated ship/SKILL.md - design-review/SKILL.md, qa/SKILL.md (touched by ship resolver pass) - test/skill-validation.test.ts (regression guards) - test/gen-skill-docs.test.ts (golden baseline updates) - Golden test fixtures (test/fixtures/golden/*-ship-SKILL.md) - .gitignore, CLAUDE.md, TODOS.md (auto-merged cleanly) Conflict resolved: - CHANGELOG.md: kept both v0.18.2.0 entries by promoting this wave to v0.18.3.0 on top, leaving main's v0.18.2.0 (/ship subagent isolation) directly below. Verified contiguous version sequence: 0.18.3.0 → 0.18.2.0 → 0.18.1.0 → 0.18.0.1 → 0.18.0.0. Synced VERSION + package.json to 0.18.3.0. Verification: bun test passes (EXIT=0). Build clean.
2026-05-22 04:38:24 +08:00 · 2026-04-17 14:27:08 +08:00
parent 2473e6d5f9 b3eaffce07
commit 922f3ba42b
19 changed files with 901 additions and 553 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,6 +1,6 @@
 # Changelog

-## [0.18.2.0] - 2026-04-16
+## [0.18.3.0] - 2026-04-17

 ### Added
 - **Windows cookie import.** `/setup-browser-cookies` now works on Windows. Point it at Chrome, Edge, Brave, or Chromium, pick a profile, and gstack will pull your real browser cookies into the headless session. Handles AES-256-GCM (Chrome 80+), DPAPI key unwrap via PowerShell, and falls back to a headless CDP session for v20 App-Bound Encryption on Chrome 127+. Windows users can now do authenticated QA testing with `/qa` and `/design-review` for the first time.
@@ -18,6 +18,20 @@
 - Windows v20 App-Bound Encryption CDP fallback now logs the Chrome version on entry and has an inline comment documenting the debug-port security posture (127.0.0.1-only, random port in [9222, 9321] for collision avoidance, always killed in finally).
 - New regression test `test/openclaw-native-skills.test.ts` pins OpenClaw skill frontmatter to `name` + `description` only — catches version/metadata drift at PR time.

+## [0.18.2.0] - 2026-04-17
+
+### Fixed
+- **`/ship` stops skipping `/document-release` ~80% of the time.** The old Step 8.5 told Claude to `cat` a 2500-line external skill file *after* the PR URL was already output, at which point the model had 500-1,750 lines of intermediate tool output in context and was at its least intelligent. Now `/ship` dispatches `/document-release` as a subagent that runs in a fresh context window, *before* creating the PR, so the `## Documentation` section gets baked into the initial PR body instead of a create-then-re-edit dance. The result: documentation actually syncs on every ship.
+
+### Changed
+- **`/ship`'s 4 heaviest sub-workflows now run in isolated subagent contexts.** Coverage audit (Step 7), plan completion audit (Step 8), Greptile triage (Step 10), and documentation sync (Step 18) each dispatch a subagent that gets a fresh context window. The parent only sees the conclusion (structured JSON), not the intermediate file reads. This is the pattern Anthropic's "Using Claude Code: Session Management and 1M Context" blog post recommends for fighting context rot: "Will I need this tool output again, or just the conclusion? If just the conclusion, use a subagent."
+- **`/ship` step numbers are clean integers 1-20 instead of fractional (`3.47`, `8.5`, `8.75`).** Fractional step numbers signaled "optional appendix" to the model and contributed to late-stage steps getting skipped. Clean integers feel mandatory. Resolver sub-steps that are genuinely nested (Plan Verification 8.1, Scope Drift 8.2, Review Army 9.1/9.2, Cross-review dedup 9.3) are preserved.
+- **`/ship` now prints "You are NOT done" after push.** Breaks the natural stopping point where the model was treating a pushed branch as mission-accomplished and skipping doc sync + PR creation.
+
+### For contributors
+- New regression guards in `test/skill-validation.test.ts` prevent drift back to fractional step numbers and catch cross-contamination between `/ship` and `/review` resolver conditionals.
+- Ship template restructure: old Step 8.5 (post-PR doc sync with `cat` delegation) replaced by new Step 18 (pre-PR subagent dispatch that invokes full `/document-release` skill with its CHANGELOG clobber protections, doc exclusions, risky-change gates, and race-safe PR body editing). Codex caught that the original plan's reimplementation dropped those protections; this version reuses the real `/document-release`.
+
 ## [0.18.1.0] - 2026-04-16

 ### Fixed