fix: resolve merge conflicts with origin/main (v0.6.0 + v0.6.0.1 + v0.6.1)

Merge main's test bootstrap, boil-the-lake completeness principle, selective expansion, ship gate overrides, and gstack-upgrade vendor sync.

Conflicts resolved:
- CHANGELOG: keep main's 0.6.1/0.6.0.1/0.6.0/0.5.4/0.5.3 entries
- VERSION: take main's 0.6.1
- design-consultation: office-hours naming + main's "what's out there" phrasing
- ship: keep both verification rules (fresh evidence + coverage tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-19 19:02:29 +08:00 · 2026-03-17 14:37:22 -07:00
parent 46b6ded584 9d47619e4c
commit b071df3579
42 changed files with 3926 additions and 1054 deletions
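
Note on the preamble change in design-consultation/SKILL.md: the `LAKE_INTRO` check brought in by this merge uses a marker file so the Completeness Principle intro is shown only once. Below is a minimal sketch of that show-once pattern, not the code gstack ships. The helper names `lake_intro_seen` and `mark_lake_intro_seen` are hypothetical, the state directory is passed as an argument so the sketch can run against a scratch directory instead of `~/.gstack`, and the `mkdir -p` is an added assumption (the diff itself only runs `touch`).

```bash
#!/bin/sh
# Sketch of the show-once marker pattern behind LAKE_INTRO.
# Helper names are hypothetical; they do not exist in gstack.

# Prints "yes" if the intro marker exists in the given state dir, else "no".
lake_intro_seen() {
  if [ -f "$1/.completeness-intro-seen" ]; then
    echo "yes"
  else
    echo "no"
  fi
}

# Records that the intro was shown. Safe to call repeatedly.
# Assumption: create the state dir if it is missing (the diff only runs touch).
mark_lake_intro_seen() {
  mkdir -p "$1" && touch "$1/.completeness-intro-seen"
}

# Demo against a scratch directory so the real ~/.gstack is untouched.
state_dir=$(mktemp -d)
echo "LAKE_INTRO: $(lake_intro_seen "$state_dir")"   # prints "LAKE_INTRO: no"
mark_lake_intro_seen "$state_dir"
echo "LAKE_INTRO: $(lake_intro_seen "$state_dir")"   # prints "LAKE_INTRO: yes"
rm -rf "$state_dir"
```

As in the diff, marking is unconditional: the marker is written whether or not the user chooses to open the essay, so the intro never repeats.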
--- a/design-consultation/SKILL.md
+++ b/design-consultation/SKILL.md
@@ -2,7 +2,7 @@
 name: design-consultation
 version: 1.0.0
 description: |
-  Design consultation: understands your product, researches competitors, proposes a
+  Design consultation: understands your product, researches the landscape, proposes a
  complete design system (aesthetic, typography, color, layout, spacing, motion), and
  generates font+color preview pages. Creates DESIGN.md as your project's design source
  of truth. For existing sites, use /plan-design-review to infer the system instead.
@@ -31,22 +31,61 @@ find ~/.gstack/sessions -mmin +120 -type f -delete 2>/dev/null || true
 _CONTRIB=$(~/.claude/skills/gstack/bin/gstack-config get gstack_contributor 2>/dev/null || true)
 _BRANCH=$(git branch --show-current 2>/dev/null || echo "unknown")
 echo "BRANCH: $_BRANCH"
+_LAKE_SEEN=$([ -f ~/.gstack/.completeness-intro-seen ] && echo "yes" || echo "no")
+echo "LAKE_INTRO: $_LAKE_SEEN"
 ```

 If output shows `UPGRADE_AVAILABLE <old> <new>`: read `~/.claude/skills/gstack/gstack-upgrade/SKILL.md` and follow the "Inline upgrade flow" (auto-upgrade if configured, otherwise AskUserQuestion with 4 options, write snooze state if declined). If `JUST_UPGRADED <from> <to>`: tell user "Running gstack v{to} (just updated!)" and continue.

+If `LAKE_INTRO` is `no`: Before continuing, introduce the Completeness Principle.
+Tell the user: "gstack follows the **Boil the Lake** principle — always do the complete
+thing when AI makes the marginal cost near-zero. Read more: https://garryslist.org/posts/boil-the-ocean"
+Then offer to open the essay in their default browser:
+
+```bash
+open https://garryslist.org/posts/boil-the-ocean
+touch ~/.gstack/.completeness-intro-seen
+```
+
+Only run `open` if the user says yes. Always run `touch` to mark as seen. This only happens once.
+
 ## AskUserQuestion Format

 **ALWAYS follow this structure for every AskUserQuestion call:**
 1. **Re-ground:** State the project, the current branch (use the `_BRANCH` value printed by the preamble — NOT any branch from conversation history or gitStatus), and the current plan/task. (1-2 sentences)
 2. **Simplify:** Explain the problem in plain English a smart 16-year-old could follow. No raw function names, no internal jargon, no implementation details. Use concrete examples and analogies. Say what it DOES, not what it's called.
-3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]`
-4. **Options:** Lettered options: `A) ... B) ... C) ...`
+3. **Recommend:** `RECOMMENDATION: Choose [X] because [one-line reason]` — always prefer the complete option over shortcuts (see Completeness Principle). Include `Completeness: X/10` for each option. Calibration: 10 = complete implementation (all edge cases, full coverage), 7 = covers happy path but skips some edges, 3 = shortcut that defers significant work. If both options are 8+, pick the higher; if one is ≤5, flag it.
+4. **Options:** Lettered options: `A) ... B) ... C) ...` — when an option involves effort, show both scales: `(human: ~X / CC: ~Y)`

 Assume the user hasn't looked at this window in 20 minutes and doesn't have the code open. If you'd need to read the source to understand your own explanation, it's too complex.

 Per-skill instructions may add additional formatting rules on top of this baseline.

+## Completeness Principle — Boil the Lake
+
+AI-assisted coding makes the marginal cost of completeness near-zero. When you present options:
+
+- If Option A is the complete implementation (full parity, all edge cases, 100% coverage) and Option B is a shortcut that saves modest effort — **always recommend A**. The delta between 80 lines and 150 lines is meaningless with CC+gstack. "Good enough" is the wrong instinct when "complete" costs minutes more.
+- **Lake vs. ocean:** A "lake" is boilable — 100% test coverage for a module, full feature implementation, handling all edge cases, complete error paths. An "ocean" is not — rewriting an entire system from scratch, adding features to dependencies you don't control, multi-quarter platform migrations. Recommend boiling lakes. Flag oceans as out of scope.
+- **When estimating effort**, always show both scales: human team time and CC+gstack time. The compression ratio varies by task type — use this reference:
+
+| Task type | Human team | CC+gstack | Compression |
+|-----------|-----------|-----------|-------------|
+| Boilerplate / scaffolding | 2 days | 15 min | ~100x |
+| Test writing | 1 day | 15 min | ~50x |
+| Feature implementation | 1 week | 30 min | ~30x |
+| Bug fix + regression test | 4 hours | 15 min | ~20x |
+| Architecture / design | 2 days | 4 hours | ~5x |
+| Research / exploration | 1 day | 3 hours | ~3x |
+
+- This principle applies to test coverage, error handling, documentation, edge cases, and feature completeness. Don't skip the last 10% to "save time" — with AI, that 10% costs seconds.
+
+**Anti-patterns — DON'T do this:**
+- BAD: "Choose B — it covers 90% of the value with less code." (If A is only 70 lines more, choose A.)
+- BAD: "We can skip edge case handling to save time." (Edge case handling costs minutes with CC.)
+- BAD: "Let's defer test coverage to a follow-up PR." (Tests are the cheapest lake to boil.)
+- BAD: Quoting only human-team effort: "This would take 2 weeks." (Say: "2 weeks human / ~1 hour CC.")
+
 ## Contributor Mode

 If `_CONTRIB` is `true`: you are in **contributor mode**. You're a gstack user who also helps make it better.
@@ -183,7 +222,7 @@ Ask the user a single question that covers everything you need to know. Pre-fill
 3. "Want me to research what top products in your space are doing for design, or should I work from my design knowledge?"
 4. **Explicitly say:** "At any point you can just drop into chat and we'll talk through anything — this isn't a rigid form, it's a conversation."

-If the README or office-hours output gives you enough context, pre-fill and confirm: *"From what I can see, this is [X] for [Y] in the [Z] space. Sound right? And would you like me to research competitors, or should I work from what I know?"*
+If the README or office-hours output gives you enough context, pre-fill and confirm: *"From what I can see, this is [X] for [Y] in the [Z] space. Sound right? And would you like me to research what's out there in this space, or should I work from what I know?"*

 ---

@@ -191,7 +230,7 @@ If the README or office-hours output gives you enough context, pre-fill and conf

 If the user wants competitive research:

-**Step 1: Identify competitors via WebSearch**
+**Step 1: Identify what's out there via WebSearch**

 Use WebSearch to find 5-10 products in their space. Search for:
 - "[product category] website design"
@@ -200,17 +239,17 @@ Use WebSearch to find 5-10 products in their space. Search for:

 **Step 2: Visual research via browse (if available)**

-If the browse binary is available (`$B` is set), visit the top 3-5 competitor sites and capture visual evidence:
+If the browse binary is available (`$B` is set), visit the top 3-5 sites in the space and capture visual evidence:

 ```bash
-$B goto "https://competitor-site.com"
-$B screenshot "/tmp/design-research-competitor-name.png"
+$B goto "https://example-site.com"
+$B screenshot "/tmp/design-research-site-name.png"
 $B snapshot
 ```

-For each competitor, analyze: fonts actually used, color palette, layout approach, spacing density, aesthetic direction. The screenshot gives you the feel; the snapshot gives you structural data.
+For each site, analyze: fonts actually used, color palette, layout approach, spacing density, aesthetic direction. The screenshot gives you the feel; the snapshot gives you structural data.

-If a competitor site blocks the headless browser or requires login, skip it and note why.
+If a site blocks the headless browser or requires login, skip it and note why.

 If browse is not available, rely on WebSearch results and your built-in design knowledge — this is fine.

@@ -219,7 +258,7 @@ If browse is not available, rely on WebSearch results and your built-in design k
 The goal of research is NOT to copy. It is to get in the ballpark — to understand the visual language users in this category already expect. This gives you the baseline. The interesting design work starts after you have the baseline: deciding where to follow conventions (so the product feels literate) and where to break from them (so the product is memorable).

 Summarize conversationally:
-> "I looked at [competitors]. Here's the landscape: they converge on [patterns]. Most of them feel [observation — e.g., interchangeable, polished but generic, etc.]. The opportunity to stand out is [gap]. Here's where I'd play it safe and where I'd take a risk..."
+> "I looked at what's out there. Here's the landscape: they converge on [patterns]. Most of them feel [observation — e.g., interchangeable, polished but generic, etc.]. The opportunity to stand out is [gap]. Here's where I'd play it safe and where I'd take a risk..."

 **Graceful degradation:**
 - Browse available → screenshots + snapshots + WebSearch (richest research)
--- a/design-consultation/SKILL.md.tmpl
+++ b/design-consultation/SKILL.md.tmpl
@@ -2,7 +2,7 @@
 name: design-consultation
 version: 1.0.0
 description: |
-  Design consultation: understands your product, researches competitors, proposes a
+  Design consultation: understands your product, researches the landscape, proposes a
  complete design system (aesthetic, typography, color, layout, spacing, motion), and
  generates font+color preview pages. Creates DESIGN.md as your project's design source
  of truth. For existing sites, use /plan-design-review to infer the system instead.
@@ -76,7 +76,7 @@ Ask the user a single question that covers everything you need to know. Pre-fill
 3. "Want me to research what top products in your space are doing for design, or should I work from my design knowledge?"
 4. **Explicitly say:** "At any point you can just drop into chat and we'll talk through anything — this isn't a rigid form, it's a conversation."

-If the README or office-hours output gives you enough context, pre-fill and confirm: *"From what I can see, this is [X] for [Y] in the [Z] space. Sound right? And would you like me to research competitors, or should I work from what I know?"*
+If the README or office-hours output gives you enough context, pre-fill and confirm: *"From what I can see, this is [X] for [Y] in the [Z] space. Sound right? And would you like me to research what's out there in this space, or should I work from what I know?"*

 ---

@@ -84,7 +84,7 @@ If the README or office-hours output gives you enough context, pre-fill and conf

 If the user wants competitive research:

-**Step 1: Identify competitors via WebSearch**
+**Step 1: Identify what's out there via WebSearch**

 Use WebSearch to find 5-10 products in their space. Search for:
 - "[product category] website design"
@@ -93,17 +93,17 @@ Use WebSearch to find 5-10 products in their space. Search for:

 **Step 2: Visual research via browse (if available)**

-If the browse binary is available (`$B` is set), visit the top 3-5 competitor sites and capture visual evidence:
+If the browse binary is available (`$B` is set), visit the top 3-5 sites in the space and capture visual evidence:

 ```bash
-$B goto "https://competitor-site.com"
-$B screenshot "/tmp/design-research-competitor-name.png"
+$B goto "https://example-site.com"
+$B screenshot "/tmp/design-research-site-name.png"
 $B snapshot
 ```

-For each competitor, analyze: fonts actually used, color palette, layout approach, spacing density, aesthetic direction. The screenshot gives you the feel; the snapshot gives you structural data.
+For each site, analyze: fonts actually used, color palette, layout approach, spacing density, aesthetic direction. The screenshot gives you the feel; the snapshot gives you structural data.

-If a competitor site blocks the headless browser or requires login, skip it and note why.
+If a site blocks the headless browser or requires login, skip it and note why.

 If browse is not available, rely on WebSearch results and your built-in design knowledge — this is fine.

@@ -112,7 +112,7 @@ If browse is not available, rely on WebSearch results and your built-in design k
 The goal of research is NOT to copy. It is to get in the ballpark — to understand the visual language users in this category already expect. This gives you the baseline. The interesting design work starts after you have the baseline: deciding where to follow conventions (so the product feels literate) and where to break from them (so the product is memorable).

 Summarize conversationally:
-> "I looked at [competitors]. Here's the landscape: they converge on [patterns]. Most of them feel [observation — e.g., interchangeable, polished but generic, etc.]. The opportunity to stand out is [gap]. Here's where I'd play it safe and where I'd take a risk..."
+> "I looked at what's out there. Here's the landscape: they converge on [patterns]. Most of them feel [observation — e.g., interchangeable, polished but generic, etc.]. The opportunity to stand out is [gap]. Here's where I'd play it safe and where I'd take a risk..."

 **Graceful degradation:**
 - Browse available → screenshots + snapshots + WebSearch (richest research)