Files
gstack/design-shotgun/SKILL.md.tmpl
Garry Tan b805aa0113 feat: Confusion Protocol, Hermes + GBrain hosts, brain-first resolver (v0.18.0.0) (#1005)
* feat: add Confusion Protocol to preamble resolver

Injects a high-stakes ambiguity gate at preamble tier >= 2 so all
workflow skills get it. Fires when Claude encounters architectural
decisions, data model changes, destructive operations, or contradictory
requirements. Does NOT fire on routine coding.

Addresses Karpathy failure mode #1 (wrong assumptions) with an
inline STOP gate instead of relying on workflow skill invocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add Hermes and GBrain host configs

Hermes: tool rewrites for terminal/read_file/patch/delegate_task,
paths to ~/.hermes/skills/gstack, AGENTS.md config file.

GBrain: coding skills become brain-aware when GBrain mod is installed.
Same tool rewrites as OpenClaw (agents spawn Claude Code via ACP).
GBRAIN_CONTEXT_LOAD and GBRAIN_SAVE_RESULTS NOT suppressed on gbrain
host, enabling brain-first lookup and save-to-brain behavior.

Both registered in hosts/index.ts with setup script redirect messages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: GBrain resolver — brain-first lookup and save-to-brain

New scripts/resolvers/gbrain.ts with two resolver functions:
- GBRAIN_CONTEXT_LOAD: search brain for context before skill starts
- GBRAIN_SAVE_RESULTS: save skill output to brain after completion

Placeholders added to 4 thinking skill templates (office-hours,
investigate, plan-ceo-review, retro). Resolves to empty string on
all hosts except gbrain via suppressedResolvers.

GBRAIN suppression added to all 9 non-gbrain host configs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: wire slop:diff into /review as advisory diagnostic

Adds Step 3.5 to the review template: runs bun run slop:diff against
the base branch to catch AI code quality issues (empty catches,
redundant return await, overcomplicated abstractions). Advisory only,
never blocking. Skips silently if slop-scan is not installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: add Karpathy compatibility note to README

Positions gstack as the workflow enforcement layer for Karpathy-style
CLAUDE.md rules (17K stars). Links to forrestchang/andrej-karpathy-skills.
Maps each Karpathy failure mode to the gstack skill that addresses it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: improve native OpenClaw thinking skills

office-hours: add design doc path visibility message after writing
ceo-review: add HARD GATE reminder at review section transitions
retro: add non-git context support (check memory for meeting notes)

Mirrors template improvements to hand-crafted native skills.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update tests and golden fixtures for new hosts

- Host count: 8 → 10 (hermes, gbrain)
- OpenClaw adapter test: expects undefined (dead code removed)
- Golden ship fixtures: updated with Confusion Protocol + vendoring

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate all SKILL.md files

Regenerated from templates after Confusion Protocol, GBrain resolver
placeholders, slop:diff in review, HARD GATE reminders, investigation
learnings, design doc visibility, and retro non-git context changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.18.0.0

- CHANGELOG: add v0.18.0.0 entry (Confusion Protocol, Hermes, GBrain,
  slop in review, Karpathy note, skill improvements)
- CLAUDE.md: add hermes.ts and gbrain.ts to hosts listing
- README.md: update agent count 8→10, add Hermes + GBrain to table
- VERSION: bump to 0.18.0.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: sync package.json version to 0.18.0.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: extract Step 0 from review SKILL.md in E2E test

The review-base-branch E2E test was copying the full 1493-line
review/SKILL.md into the test fixture. The agent spent 8+ turns
reading it in chunks, leaving only 7 turns for actual work, causing
error_max_turns on every attempt.

Now extracts only Step 0 (base branch detection, ~50 lines) which is
all the test actually needs. Follows the CLAUDE.md rule: "NEVER copy
a full SKILL.md file into an E2E test fixture."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: update GBrain and Hermes host configs for v0.10.0 integration

GBrain: add 'triggers' to keepFields so generated skills pass
checkResolvable() validation. Add version compat comment.

Hermes: un-suppress GBRAIN_CONTEXT_LOAD and GBRAIN_SAVE_RESULTS.
The resolvers handle GBrain-not-installed gracefully, so Hermes
agents with GBrain as a mod get brain features automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: GBrain resolver DX improvements and preamble health check

Resolver changes:
- gbrain query → gbrain search (fast keyword search, not expensive hybrid)
- Add keyword extraction guidance for agents
- Show explicit gbrain put_page syntax with --title, --tags, heredoc
- Add entity enrichment with false-positive filter
- Name throttle error patterns (exit code 1, stderr keywords)
- Add data-research routing for investigate skill
- Expand skillSaveMap from 4 to 8 entries
- Add brain operation telemetry summary

Preamble changes:
- Add gbrain doctor --fast --json health check for gbrain/hermes hosts
- Parse check failures/warnings count
- Show failing check details when score < 50

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: preserve keepFields in allowlist frontmatter mode

The allowlist mode hard-coded name + description reconstruction but
never iterated keepFields for additional fields. Adding 'triggers'
to keepFields was a no-op because the field was silently stripped.

Now iterates keepFields and preserves any field beyond name/description
from the source template frontmatter, including YAML arrays.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add triggers to all 38 skill templates

Multi-word, skill-specific trigger keywords for GBrain's RESOLVER.md
router. Each skill gets 3-6 triggers derived from its "Use when asked
to..." description text. Avoids single generic words that would collide
across skills (e.g., "debug this" not "debug").

These are distinct from voice-triggers (speech-to-text aliases) and
serve GBrain's checkResolvable() validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: regenerate all SKILL.md files and update golden fixtures

Regenerated from updated templates (triggers, brain placeholders,
resolver DX improvements, preamble health check). Golden fixtures
updated to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: settings-hook remove exits 1 when nothing to remove

gstack-settings-hook remove was exiting 0 when settings.json didn't
exist, causing gstack-uninstall to report "SessionStart hook" as
removed on clean systems where nothing was installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for GBrain v0.10.0 integration

ARCHITECTURE.md: added GBRAIN_CONTEXT_LOAD and GBRAIN_SAVE_RESULTS
to resolver table.

CHANGELOG.md: expanded v0.18.0.0 entry with GBrain v0.10.0 integration
details (triggers, expanded brain-awareness, DX improvements, Hermes
brain support), updated date.

CLAUDE.md: added gbrain to resolvers/ directory comment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: routing E2E stops writing to user's ~/.claude/skills/

installSkills() was copying SKILL.md files to both project-level
(.claude/skills/ in tmpDir) and user-level (~/.claude/skills/).
Writing to the user's real install fails when symlinks point to
different worktrees or dangling targets (ENOENT on copyFileSync).

Now installs to project-level only. The test already sets cwd to
the tmpDir, so project-level discovery works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: scale Gemini E2E back to smoke test

Gemini CLI gets lost in worktrees on complex tasks (review times out
at 600s, discover-skill hits exit 124). Nobody uses Gemini for gstack
skill execution. Replace the two failing tests (gemini-discover-skill
and gemini-review-findings) with a single smoke test that verifies
Gemini can start and read the README. 90s timeout, no skill invocation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 10:41:38 -07:00

306 lines
11 KiB
Cheetah

---
name: design-shotgun
preamble-tier: 2
version: 1.0.0
description: |
Design shotgun: generate multiple AI design variants, open a comparison board,
collect structured feedback, and iterate. Standalone design exploration you can
run anytime. Use when: "explore designs", "show me options", "design variants",
"visual brainstorm", or "I don't like how this looks".
Proactively suggest when the user describes a UI feature but hasn't seen
what it could look like. (gstack)
triggers:
- explore design variants
- show me design options
- visual design brainstorm
allowed-tools:
- Bash
- Read
- Glob
- Grep
- Agent
- AskUserQuestion
---
{{PREAMBLE}}
# /design-shotgun: Visual Design Exploration
You are a design brainstorming partner. Generate multiple AI design variants, open them
side-by-side in the user's browser, and iterate until they approve a direction. This is
visual brainstorming, not a review process.
{{DESIGN_SETUP}}
{{UX_PRINCIPLES}}
## Step 0: Session Detection
Check for prior design exploration sessions for this project:
```bash
eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
setopt +o nomatch 2>/dev/null || true
_PREV=$(find ~/.gstack/projects/$SLUG/designs/ -name "approved.json" -maxdepth 2 2>/dev/null | sort -r | head -5)
[ -n "$_PREV" ] && echo "PREVIOUS_SESSIONS_FOUND" || echo "NO_PREVIOUS_SESSIONS"
echo "$_PREV"
```
**If `PREVIOUS_SESSIONS_FOUND`:** Read each `approved.json`, display a summary, then
AskUserQuestion:
> "Previous design explorations for this project:
> - [date]: [screen] — chose variant [X], feedback: '[summary]'
>
> A) Revisit — reopen the comparison board to adjust your choices
> B) New exploration — start fresh with new or updated instructions
> C) Something else"
If A: regenerate the board from existing variant PNGs, reopen, and resume the feedback loop.
If B: proceed to Step 1.
**If `NO_PREVIOUS_SESSIONS`:** Show the first-time message:
"This is /design-shotgun — your visual brainstorming tool. I'll generate multiple AI
design directions, open them side-by-side in your browser, and you pick your favorite.
You can run /design-shotgun anytime during development to explore design directions for
any part of your product. Let's start."
## Step 1: Context Gathering
When design-shotgun is invoked from plan-design-review, design-consultation, or another
skill, the calling skill has already gathered context. Check for `$_DESIGN_BRIEF` — if
it's set, skip to Step 2.
When run standalone, gather context to build a proper design brief.
**Required context (5 dimensions):**
1. **Who** — who is the design for? (persona, audience, expertise level)
2. **Job to be done** — what is the user trying to accomplish on this screen/page?
3. **What exists** — what's already in the codebase? (existing components, pages, patterns)
4. **User flow** — how do users arrive at this screen and where do they go next?
5. **Edge cases** — long names, zero results, error states, mobile, first-time vs power user
**Auto-gather first:**
```bash
cat DESIGN.md 2>/dev/null | head -80 || echo "NO_DESIGN_MD"
```
```bash
ls src/ app/ pages/ components/ 2>/dev/null | head -30
```
```bash
setopt +o nomatch 2>/dev/null || true
ls ~/.gstack/projects/$SLUG/*office-hours* 2>/dev/null | head -5
```
If DESIGN.md exists, tell the user: "I'll follow your design system in DESIGN.md by
default. If you want to go off the reservation on visual direction, just say so —
design-shotgun will follow your lead, but won't diverge by default."
**Check for a live site to screenshot** (for the "I don't like THIS" use case):
```bash
curl -s -o /dev/null -w "%{http_code}" http://localhost:3000 2>/dev/null || echo "NO_LOCAL_SITE"
```
If a local site is running AND the user referenced a URL or said something like "I don't
like how this looks," screenshot the current page and use `$D evolve` instead of
`$D variants` to generate improvement variants from the existing design.
**AskUserQuestion with pre-filled context:** Pre-fill what you inferred from the codebase,
DESIGN.md, and office-hours output. Then ask for what's missing. Frame as ONE question
covering all gaps:
> "Here's what I know: [pre-filled context]. I'm missing [gaps].
> Tell me: [specific questions about the gaps].
> How many variants? (default 3, up to 8 for important screens)"
Two rounds max of context gathering, then proceed with what you have and note assumptions.
## Step 2: Taste Memory
Read prior approved designs to bias generation toward the user's demonstrated taste:
```bash
setopt +o nomatch 2>/dev/null || true
_TASTE=$(find ~/.gstack/projects/$SLUG/designs/ -name "approved.json" -maxdepth 2 2>/dev/null | sort -r | head -10)
```
If prior sessions exist, read each `approved.json` and extract patterns from the
approved variants. Include a taste summary in the design brief:
"The user previously approved designs with these characteristics: [high contrast,
generous whitespace, modern sans-serif typography, etc.]. Bias toward this aesthetic
unless the user explicitly requests a different direction."
Limit to last 10 sessions. Try/catch JSON parse on each (skip corrupted files).
## Step 3: Generate Variants
Set up the output directory:
```bash
eval "$(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)"
_DESIGN_DIR=~/.gstack/projects/$SLUG/designs/<screen-name>-$(date +%Y%m%d)
mkdir -p "$_DESIGN_DIR"
echo "DESIGN_DIR: $_DESIGN_DIR"
```
Replace `<screen-name>` with a descriptive kebab-case name from the context gathering.
### Step 3a: Concept Generation
Before any API calls, generate N text concepts describing each variant's design direction.
Each concept should be a distinct creative direction, not a minor variation. Present them
as a lettered list:
```
I'll explore 3 directions:
A) "Name" — one-line visual description of this direction
B) "Name" — one-line visual description of this direction
C) "Name" — one-line visual description of this direction
```
Draw on DESIGN.md, taste memory, and the user's request to make each concept distinct.
### Step 3b: Concept Confirmation
Use AskUserQuestion to confirm before spending API credits:
> "These are the {N} directions I'll generate. Each takes ~60s, but I'll run them all
> in parallel so total time is ~60 seconds regardless of count."
Options:
- A) Generate all {N} — looks good
- B) I want to change some concepts (tell me which)
- C) Add more variants (I'll suggest additional directions)
- D) Fewer variants (tell me which to drop)
If B: incorporate feedback, re-present concepts, re-confirm. Max 2 rounds.
If C: add concepts, re-present, re-confirm.
If D: drop specified concepts, re-present, re-confirm.
### Step 3c: Parallel Generation
**If evolving from a screenshot** (user said "I don't like THIS"), take ONE screenshot
first:
```bash
$B screenshot "$_DESIGN_DIR/current.png"
```
**Launch N Agent subagents in a single message** (parallel execution). Use the Agent
tool with `subagent_type: "general-purpose"` for each variant. Each agent is independent
and handles its own generation, quality check, verification, and retry.
**Important: $D path propagation.** The `$D` variable from DESIGN SETUP is a shell
variable that agents do NOT inherit. Substitute the resolved absolute path (from the
`DESIGN_READY: /path/to/design` output in Step 0) into each agent prompt.
**Agent prompt template** (one per variant, substitute all `{...}` values):
```
Generate a design variant and save it.
Design binary: {absolute path to $D binary}
Brief: {the full variant-specific brief for this direction}
Output: /tmp/variant-{letter}.png
Final location: {_DESIGN_DIR absolute path}/variant-{letter}.png
Steps:
1. Run: {$D path} generate --brief "{brief}" --output /tmp/variant-{letter}.png
2. If the command fails with a rate limit error (429 or "rate limit"), wait 5 seconds
and retry. Up to 3 retries.
3. If the output file is missing or empty after the command succeeds, retry once.
4. Copy: cp /tmp/variant-{letter}.png {_DESIGN_DIR}/variant-{letter}.png
5. Quality check: {$D path} check --image {_DESIGN_DIR}/variant-{letter}.png --brief "{brief}"
If quality check fails, retry generation once.
6. Verify: ls -lh {_DESIGN_DIR}/variant-{letter}.png
7. Report exactly one of:
VARIANT_{letter}_DONE: {file size}
VARIANT_{letter}_FAILED: {error description}
VARIANT_{letter}_RATE_LIMITED: exhausted retries
```
For the evolve path, replace step 1 with:
```
{$D path} evolve --screenshot {_DESIGN_DIR}/current.png --brief "{brief}" --output /tmp/variant-{letter}.png
```
**Why /tmp/ then cp?** In observed sessions, `$D generate --output ~/.gstack/...`
failed with "The operation was aborted" while `--output /tmp/...` succeeded. This is
a sandbox restriction. Always generate to `/tmp/` first, then `cp`.
### Step 3d: Results
After all agents complete:
1. Read each generated PNG inline (Read tool) so the user sees all variants at once.
2. Report status: "All {N} variants generated in ~{actual time}. {successes} succeeded,
{failures} failed."
3. For any failures: report explicitly with the error. Do NOT silently skip.
4. If zero variants succeeded: fall back to sequential generation (one at a time with
`$D generate`, showing each as it lands). Tell the user: "Parallel generation failed
(likely rate limiting). Falling back to sequential..."
5. Proceed to Step 4 (comparison board).
**Dynamic image list for comparison board:** When proceeding to Step 4, construct the
image list from whatever variant files actually exist, not a hardcoded A/B/C list:
```bash
setopt +o nomatch 2>/dev/null || true # zsh compat
_IMAGES=$(ls "$_DESIGN_DIR"/variant-*.png 2>/dev/null | tr '\n' ',' | sed 's/,$//')
```
Use `$_IMAGES` in the `$D compare --images` command.
## Step 4: Comparison Board + Feedback Loop
{{DESIGN_SHOTGUN_LOOP}}
## Step 5: Feedback Confirmation
After receiving feedback (via HTTP POST or AskUserQuestion fallback), output a clear
summary confirming what was understood:
"Here's what I understood from your feedback:
PREFERRED: Variant [X]
RATINGS: A: 4/5, B: 3/5, C: 2/5
YOUR NOTES: [full text of per-variant and overall comments]
DIRECTION: [regenerate action if any]
Is this right?"
Use AskUserQuestion to confirm before saving.
## Step 6: Save & Next Steps
Write `approved.json` to `$_DESIGN_DIR/` (handled by the loop above).
If invoked from another skill: return the structured feedback for that skill to consume.
The calling skill reads `approved.json` and the approved variant PNG.
If standalone, offer next steps via AskUserQuestion:
> "Design direction locked in. What's next?
> A) Iterate more — refine the approved variant with specific feedback
> B) Finalize — generate production Pretext-native HTML/CSS with /design-html
> C) Save to plan — add this as an approved mockup reference in the current plan
> D) Done — I'll use this later"
## Important Rules
1. **Never save to `.context/`, `docs/designs/`, or `/tmp/`.** All design artifacts go
to `~/.gstack/projects/$SLUG/designs/`. This is enforced. See DESIGN_SETUP above.
2. **Show variants inline before opening the board.** The user should see designs
immediately in their terminal. The browser board is for detailed feedback.
3. **Confirm feedback before saving.** Always summarize what you understood and verify.
4. **Taste memory is automatic.** Prior approved designs inform new generations by default.
5. **Two rounds max on context gathering.** Don't over-interrogate. Proceed with assumptions.
6. **DESIGN.md is the default constraint.** Unless the user says otherwise.