Phase 2: Rewrite SKILL.md as QA playbook + command reference

Reorient SKILL.md files from raw command reference to QA-first playbook with 10 workflow patterns (test user flows, verify deployments, dogfood features, responsive layouts, file upload, forms, dialogs, compare pages). Compact command reference tables at the bottom. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-17 01:31:26 +08:00 · 2026-03-12 13:33:54 -07:00
parent f3ebd0adbf
commit d827276a8d
3 changed files with 383 additions and 428 deletions
--- a/BROWSER.md
+++ b/BROWSER.md
@@ -8,15 +8,16 @@ This document covers the command reference and internals of gstack's headless br
 |----------|----------|----------|
 | Navigate | `goto`, `back`, `forward`, `reload`, `url` | Get to a page |
 | Read | `text`, `html`, `links`, `forms`, `accessibility` | Extract content |
-| Snapshot | `snapshot [-i] [-c] [-d N] [-s sel]` | Get refs for interaction |
+| Snapshot | `snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o] [-C]` | Get refs, diff, annotate |
-| Interact | `click`, `fill`, `select`, `hover`, `type`, `press`, `scroll`, `wait`, `viewport` | Use the page |
+| Interact | `click`, `fill`, `select`, `hover`, `type`, `press`, `scroll`, `wait`, `viewport`, `upload` | Use the page |
-| Inspect | `js`, `eval`, `css`, `attrs`, `console`, `network`, `cookies`, `storage`, `perf` | Debug and verify |
+| Inspect | `js`, `eval`, `css`, `attrs`, `is`, `console`, `network`, `dialog`, `cookies`, `storage`, `perf` | Debug and verify |
 | Visual | `screenshot`, `pdf`, `responsive` | See what Claude sees |
 | Compare | `diff <url1> <url2>` | Spot differences between environments |
 | Dialogs | `dialog-accept [text]`, `dialog-dismiss` | Control alert/confirm/prompt handling |
 | Tabs | `tabs`, `tab`, `newtab`, `closetab` | Multi-page workflows |
 | Multi-step | `chain` (JSON from stdin) | Batch commands in one call |
-All selector arguments accept CSS selectors or `@ref` after `snapshot`. 40+ commands total.
+All selector arguments accept CSS selectors, `@e` refs after `snapshot`, or `@c` refs after `snapshot -C`. 50+ commands total.
 ## How it works
@@ -60,11 +61,11 @@ browse/
 │   ├── cli.ts              # Thin client — reads state file, sends HTTP, prints response
 │   ├── server.ts           # Bun.serve HTTP server — routes commands to Playwright
 │   ├── browser-manager.ts  # Chromium lifecycle — launch, tabs, ref map, crash handling
-│   ├── snapshot.ts         # Accessibility tree → @ref assignment → Locator map
+│   ├── snapshot.ts         # Accessibility tree → @ref assignment → Locator map + diff/annotate/-C
-│   ├── read-commands.ts    # Non-mutating commands (text, html, links, js, css, etc.)
+│   ├── read-commands.ts    # Non-mutating commands (text, html, links, js, css, is, dialog, etc.)
-│   ├── write-commands.ts   # Mutating commands (click, fill, select, navigate, etc.)
+│   ├── write-commands.ts   # Mutating commands (click, fill, select, upload, dialog-accept, etc.)
-│   ├── meta-commands.ts    # Server management (status, stop, restart)
+│   ├── meta-commands.ts    # Server management, chain, diff, snapshot routing
-│   └── buffers.ts          # Console + network log capture (in-memory + disk flush)
+│   └── buffers.ts          # CircularBuffer<T> + console/network/dialog capture
 ├── test/                   # Integration tests + HTML fixtures
 └── dist/
    └── browse              # Compiled binary (~58MB, Bun --compile)
@@ -82,18 +83,28 @@ The browser's key innovation is ref-based element selection, built on Playwright
 No DOM mutation. No injected scripts. Just Playwright's native accessibility API.
 **Extended snapshot features:**
 - `--diff` (`-D`): Stores each snapshot as a baseline. On the next `-D` call, returns a unified diff showing what changed. Use this to verify that an action (click, fill, etc.) actually worked.
 - `--annotate` (`-a`): Injects temporary overlay divs at each ref's bounding box, takes a screenshot with ref labels visible, then removes the overlays. Use `-o <path>` to control the output path.
 - `--cursor-interactive` (`-C`): Scans for non-ARIA interactive elements (divs with `cursor:pointer`, `onclick`, `tabindex>=0`) using `page.evaluate`. Assigns `@c1`, `@c2`... refs with deterministic `nth-child` CSS selectors. These are elements the ARIA tree misses but users can still click.
 ### Authentication
 Each server session generates a random UUID as a bearer token. The token is written to the state file (`/tmp/browse-server.json`) with chmod 600. Every HTTP request must include `Authorization: Bearer <token>`. This prevents other processes on the machine from controlling the browser.
-### Console and network capture
+### Console, network, and dialog capture
-The server hooks into Playwright's `page.on('console')` and `page.on('response')` events. All entries are kept in memory and flushed to disk every second:
+The server hooks into Playwright's `page.on('console')`, `page.on('response')`, and `page.on('dialog')` events. All entries are kept in O(1) circular buffers (50,000 capacity each) and flushed to disk asynchronously via `Bun.write()`:
 - Console: `/tmp/browse-console.log`
 - Network: `/tmp/browse-network.log`
 - Dialog: `/tmp/browse-dialog.log`
-The `console` and `network` commands read from the in-memory buffers, not disk.
+The `console`, `network`, and `dialog` commands read from the in-memory buffers, not disk.
 ### Dialog handling
 Dialogs (alert, confirm, prompt) are auto-accepted by default to prevent browser lockup. The `dialog-accept` and `dialog-dismiss` commands control this behavior. For prompts, `dialog-accept <text>` provides the response text. All dialogs are logged to the dialog buffer with type, message, and action taken.
 ### Multi-workspace support
@@ -184,7 +195,7 @@ bun test browse/test/commands    # run command integration tests only
 bun test browse/test/snapshot    # run snapshot tests only
 ```
-Tests spin up a local HTTP server (`browse/test/test-server.ts`) serving HTML fixtures from `browse/test/fixtures/`, then exercise the CLI commands against those pages. Tests take ~3 seconds.
+Tests spin up a local HTTP server (`browse/test/test-server.ts`) serving HTML fixtures from `browse/test/fixtures/`, then exercise the CLI commands against those pages. 148 tests across 2 files, ~15 seconds total.
 ### Source map
@@ -193,11 +204,11 @@ Tests spin up a local HTTP server (`browse/test/test-server.ts`) serving HTML fi
 | `browse/src/cli.ts` | Entry point. Reads `/tmp/browse-server.json`, sends HTTP to the server, prints response. |
 | `browse/src/server.ts` | Bun HTTP server. Routes commands to the right handler. Manages idle timeout. |
 | `browse/src/browser-manager.ts` | Chromium lifecycle — launch, tab management, ref map, crash detection. |
-| `browse/src/snapshot.ts` | Parses Playwright's accessibility tree, assigns `@ref` labels, builds Locator map. |
+| `browse/src/snapshot.ts` | Parses accessibility tree, assigns `@e`/`@c` refs, builds Locator map. Handles `--diff`, `--annotate`, `-C`. |
-| `browse/src/read-commands.ts` | Non-mutating commands: `text`, `html`, `links`, `js`, `css`, `forms`, etc. |
+| `browse/src/read-commands.ts` | Non-mutating commands: `text`, `html`, `links`, `js`, `css`, `is`, `dialog`, `forms`, etc. Exports `getCleanText()`. |
-| `browse/src/write-commands.ts` | Mutating commands: `goto`, `click`, `fill`, `select`, `scroll`, etc. |
+| `browse/src/write-commands.ts` | Mutating commands: `goto`, `click`, `fill`, `upload`, `dialog-accept`, `useragent` (with context recreation), etc. |
-| `browse/src/meta-commands.ts` | Server management: `status`, `stop`, `restart`. |
+| `browse/src/meta-commands.ts` | Server management, chain routing, diff (DRY via `getCleanText`), snapshot delegation. |
-| `browse/src/buffers.ts` | In-memory + disk capture for console and network logs. |
+| `browse/src/buffers.ts` | `CircularBuffer<T>` (O(1) ring buffer) + console/network/dialog capture with async disk flush. |
 ### Deploying to the active skill
--- a/SKILL.md
+++ b/SKILL.md
@@ -1,254 +1,324 @@
 ---
 name: gstack
-version: 1.0.0
+version: 1.1.0
 description: |
-  Fast web browsing for Claude Code via persistent headless Chromium daemon. Navigate to any URL,
+  Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with
-  read page content, click elements, fill forms, run JavaScript, take screenshots,
+  elements, verify page state, diff before/after actions, take annotated screenshots, check
-  inspect CSS/DOM, capture console/network logs, and more. ~100ms per command after
+  responsive layouts, test forms and uploads, handle dialogs, and assert element states.
-  first call. Use when you need to check a website, verify a deployment, read docs,
+  ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a
-  or interact with any web page. No MCP, no Chrome extension — just fast CLI.
+  user flow, or file a bug with evidence.
 allowed-tools:
  - Bash
  - Read
 ---
-# gstack: Persistent Browser for Claude Code
+# gstack browse: QA Testing & Dogfooding
-Persistent headless Chromium daemon. First call auto-starts the server (~3s).
+Persistent headless Chromium. First call auto-starts (~3s), then ~100-200ms per command.
-Every subsequent call: ~100-200ms. Auto-shuts down after 30 min idle.
+Auto-shuts down after 30 min idle. State persists between calls (cookies, tabs, sessions).
 ## SETUP (run this check BEFORE any browse command)
 Before using any browse command, find the skill and check if the binary exists:
 ```bash
-# Check project-level first, then user-level
+B=$(browse/bin/find-browse 2>/dev/null || ~/.claude/skills/gstack/browse/bin/find-browse 2>/dev/null)
-if test -x .claude/skills/gstack/browse/dist/browse; then
+if [ -n "$B" ]; then
-  echo "READY_PROJECT"
+  echo "READY: $B"
 elif test -x ~/.claude/skills/gstack/browse/dist/browse; then
  echo "READY_USER"
 else
  echo "NEEDS_SETUP"
 fi
 ```
 Set `B` to whichever path is READY and use it for all commands. Prefer project-level if both exist.
 If `NEEDS_SETUP`:
-1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait for their response.
+1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait.
-2. If they approve, determine the skill directory (project-level `.claude/skills/gstack` or user-level `~/.claude/skills/gstack`) and run:
+2. Run: `cd <SKILL_DIR> && ./setup`
-```bash
+3. If `bun` is not installed: `curl -fsSL https://bun.sh/install | bash`
 cd <SKILL_DIR> && ./setup
 ```
 3. If `bun` is not installed, tell the user to install it: `curl -fsSL https://bun.sh/install | bash`
 4. Verify the `.gitignore` in the skill directory contains `browse/dist/` and `node_modules/`. If either line is missing, add it.
 Once setup is done, it never needs to run again (the compiled binary persists).
 ## IMPORTANT
- Use the compiled binary via Bash: `.claude/skills/gstack/browse/dist/browse` (project) or `~/.claude/skills/gstack/browse/dist/browse` (user).
+- Use the compiled binary via Bash: `$B <command>`
 - NEVER use `mcp__claude-in-chrome__*` tools. They are slow and unreliable.
- The browser persists between calls — cookies, tabs, and state carry over.
+- Browser persists between calls — cookies, login sessions, and tabs carry over.
- The server auto-starts on first command. No setup needed.
+- Dialogs (alert/confirm/prompt) are auto-accepted by default — no browser lockup.
-## Quick Reference
+## QA Workflows
 ### Test a user flow (login, signup, checkout, etc.)
 ```bash
 B=~/.claude/skills/gstack/browse/dist/browse
-# Navigate to a page
+# 1. Go to the page
-$B goto https://example.com
+$B goto https://app.example.com/login
-# Read cleaned page text
+# 2. See what's interactive
 $B text
 # Take a screenshot (then Read the image)
 $B screenshot /tmp/page.png
 # Snapshot: accessibility tree with refs
 $B snapshot -i
-# Click by ref (after snapshot)
+# 3. Fill the form using refs
-$B click @e3
+$B fill @e3 "test@example.com"
 $B fill @e4 "password123"
 $B click @e5
-# Fill by ref
+# 4. Verify it worked
-$B fill @e4 "test@test.com"
+$B snapshot -D              # diff shows what changed after clicking
-
+$B is visible ".dashboard"  # assert the dashboard appeared
-# Run JavaScript
+$B screenshot /tmp/after-login.png
 $B js "document.title"
 # Get all links
 $B links
 # Click by CSS selector
 $B click "button.submit"
 # Fill a form by CSS selector
 $B fill "#email" "test@test.com"
 $B fill "#password" "abc123"
 $B click "button[type=submit]"
 # Get HTML of an element
 $B html "main"
 # Get computed CSS
 $B css "body" "font-family"
 # Get element attributes
 $B attrs "nav"
 # Wait for element to appear
 $B wait ".loaded"
 # Accessibility tree
 $B accessibility
 # Set viewport
 $B viewport 375x812
 # Set cookies / headers
 $B cookie "session=abc123"
 $B header "Authorization:Bearer token123"
 ```
-## Command Reference
+### Verify a deployment / check prod
-### Navigation
+```bash
-```
+$B goto https://yourapp.com
-browse goto <url>         Navigate current tab
+$B text                          # read the page — does it load?
-browse back               Go back
+$B console                       # any JS errors?
-browse forward            Go forward
+$B network                       # any failed requests?
-browse reload             Reload page
+$B js "document.title"           # correct title?
-browse url                Print current URL
+$B is visible ".hero-section"    # key elements present?
 $B screenshot /tmp/prod-check.png
 ```
-### Content extraction
+### Dogfood a feature end-to-end
-```
+
-browse text               Cleaned page text (no scripts/styles)
+```bash
-browse html [selector]    innerHTML of element, or full page HTML
+# Navigate to the feature
-browse links              All links as "text → href"
+$B goto https://app.example.com/new-feature
-browse forms              All forms + fields as JSON
+
-browse accessibility      Accessibility tree snapshot (ARIA)
+# Take annotated screenshot — shows every interactive element with labels
 $B snapshot -i -a -o /tmp/feature-annotated.png
 # Find ALL clickable things (including divs with cursor:pointer)
 $B snapshot -C
 # Walk through the flow
 $B snapshot -i          # baseline
 $B click @e3            # interact
 $B snapshot -D          # what changed? (unified diff)
 # Check element states
 $B is visible ".success-toast"
 $B is enabled "#next-step-btn"
 $B is checked "#agree-checkbox"
 # Check console for errors after interactions
 $B console
 ```
-### Snapshot (ref-based element selection)
+### Test responsive layouts
-```
+
-browse snapshot           Full accessibility tree with @refs
+```bash
-browse snapshot -i        Interactive elements only (buttons, links, inputs)
+# Quick: 3 screenshots at mobile/tablet/desktop
-browse snapshot -c        Compact (no empty structural elements)
+$B goto https://yourapp.com
-browse snapshot -d <N>    Limit depth to N levels
+$B responsive /tmp/layout
-browse snapshot -s <sel>  Scope to CSS selector
+
 # Manual: specific viewport
 $B viewport 375x812     # iPhone
 $B screenshot /tmp/mobile.png
 $B viewport 1440x900    # Desktop
 $B screenshot /tmp/desktop.png
 ```
-After snapshot, use @refs as selectors in any command:
+### Test file upload
 ```bash
 $B goto https://app.example.com/upload
 $B snapshot -i
 $B upload @e3 /path/to/test-file.pdf
 $B is visible ".upload-success"
 $B screenshot /tmp/upload-result.png
 ```
-browse click @e3          Click the element assigned ref @e3
+
-browse fill @e4 "value"   Fill the input assigned ref @e4
+### Test forms with validation
-browse hover @e1          Hover the element assigned ref @e1
+
-browse html @e2           Get innerHTML of ref @e2
+```bash
-browse css @e5 "color"    Get computed CSS of ref @e5
+$B goto https://app.example.com/form
-browse attrs @e6          Get attributes of ref @e6
+$B snapshot -i
 # Submit empty — check validation errors appear
 $B click @e10                        # submit button
 $B snapshot -D                       # diff shows error messages appeared
 $B is visible ".error-message"
 # Fill and resubmit
 $B fill @e3 "valid input"
 $B click @e10
 $B snapshot -D                       # diff shows errors gone, success state
 ```
 ### Test dialogs (delete confirmations, prompts)
 ```bash
 # Set up dialog handling BEFORE triggering
 $B dialog-accept              # will auto-accept next alert/confirm
 $B click "#delete-button"     # triggers confirmation dialog
 $B dialog                     # see what dialog appeared
 $B snapshot -D                # verify the item was deleted
 # For prompts that need input
 $B dialog-accept "my answer"  # accept with text
 $B click "#rename-button"     # triggers prompt
 ```
 ### Compare two pages / environments
 ```bash
 $B diff https://staging.app.com https://prod.app.com
 ```
 ### Multi-step chain (efficient for long flows)
 ```bash
 echo '[
  ["goto","https://app.example.com"],
  ["snapshot","-i"],
  ["fill","@e3","test@test.com"],
  ["fill","@e4","password"],
  ["click","@e5"],
  ["snapshot","-D"],
  ["screenshot","/tmp/result.png"]
 ]' | $B chain
 ```
 ## Quick Assertion Patterns
 ```bash
 # Element exists and is visible
 $B is visible ".modal"
 # Button is enabled/disabled
 $B is enabled "#submit-btn"
 $B is disabled "#submit-btn"
 # Checkbox state
 $B is checked "#agree"
 # Input is editable
 $B is editable "#name-field"
 # Element has focus
 $B is focused "#search-input"
 # Page contains text
 $B js "document.body.textContent.includes('Success')"
 # Element count
 $B js "document.querySelectorAll('.list-item').length"
 # Specific attribute value
 $B attrs "#logo"    # returns all attributes as JSON
 # CSS property
 $B css ".button" "background-color"
 ```
 ## Snapshot System
 The snapshot is your primary tool for understanding and interacting with pages.
 ```bash
 $B snapshot -i           # Interactive elements only (buttons, links, inputs) with @e refs
 $B snapshot -c           # Compact (no empty structural elements)
 $B snapshot -d 3         # Limit depth to 3 levels
 $B snapshot -s "main"    # Scope to CSS selector
 $B snapshot -D           # Diff against previous snapshot (what changed?)
 $B snapshot -a           # Annotated screenshot with ref labels
 $B snapshot -o /tmp/x.png  # Output path for annotated screenshot
 $B snapshot -C           # Cursor-interactive elements (@c refs — divs with pointer, onclick)
 ```
 Combine flags: `$B snapshot -i -a -C -o /tmp/annotated.png`
 After snapshot, use @refs everywhere:
 ```bash
 $B click @e3       $B fill @e4 "value"     $B hover @e1
 $B html @e2        $B css @e5 "color"      $B attrs @e6
 $B click @c1       # cursor-interactive ref (from -C)
 ```
 Refs are invalidated on navigation — run `snapshot` again after `goto`.
 ## Command Reference
 ### Navigation
 | Command | Description |
 |---------|-------------|
 | `goto <url>` | Navigate to URL |
 | `back` / `forward` | History navigation |
 | `reload` | Reload page |
 | `url` | Print current URL |
 ### Reading
 | Command | Description |
 |---------|-------------|
 | `text` | Cleaned page text |
 | `html [selector]` | innerHTML |
 | `links` | All links as "text -> href" |
 | `forms` | Forms + fields as JSON |
 | `accessibility` | Full ARIA tree |
 ### Interaction
-```
+| Command | Description |
-browse click <selector>        Click element (CSS selector or @ref)
+|---------|-------------|
-browse fill <selector> <value> Fill input field
+| `click <sel>` | Click element |
-browse select <selector> <val> Select dropdown value
+| `fill <sel> <val>` | Fill input |
-browse hover <selector>        Hover over element
+| `select <sel> <val>` | Select dropdown |
-browse type <text>             Type into focused element
+| `hover <sel>` | Hover element |
-browse press <key>             Press key (Enter, Tab, Escape, etc.)
+| `type <text>` | Type into focused element |
-browse scroll [selector]       Scroll element into view, or page bottom
+| `press <key>` | Press key (Enter, Tab, Escape) |
-browse wait <selector>         Wait for element to appear (max 10s)
+| `scroll [sel]` | Scroll element into view |
-browse viewport <WxH>          Set viewport size (e.g. 375x812)
+| `wait <sel>` | Wait for element (max 10s) |
-```
+| `wait --networkidle` | Wait for network to be idle |
 | `wait --load` | Wait for page load event |
 | `upload <sel> <file...>` | Upload file(s) |
 | `cookie-import <json>` | Import cookies from JSON file |
 | `dialog-accept [text]` | Auto-accept dialogs |
 | `dialog-dismiss` | Auto-dismiss dialogs |
 | `viewport <WxH>` | Set viewport size |
 ### Inspection
-```
+| Command | Description |
-browse js <expression>         Run JS, print result
+|---------|-------------|
-browse eval <js-file>          Run JS file against page
+| `js <expr>` | Run JavaScript |
-browse css <selector> <prop>   Get computed CSS property
+| `eval <file>` | Run JS file |
-browse attrs <selector>        Get element attributes as JSON
+| `css <sel> <prop>` | Computed CSS |
-browse console                 Dump captured console messages
+| `attrs <sel>` | Element attributes |
-browse console --clear         Clear console buffer
+| `is <prop> <sel>` | State check (visible/hidden/enabled/disabled/checked/editable/focused) |
-browse network                 Dump captured network requests
+| `console [--clear\|--errors]` | Console messages (--errors filters to error/warning) |
-browse network --clear         Clear network buffer
+| `network [--clear]` | Network requests |
-browse cookies                 Dump all cookies as JSON
+| `dialog [--clear]` | Dialog messages |
-browse storage                 localStorage + sessionStorage as JSON
+| `cookies` | All cookies |
-browse storage set <key> <val> Set localStorage value
+| `storage` | localStorage + sessionStorage |
-browse perf                    Page load performance timings
+| `perf` | Page load timings |
 ```
 ### Visual
-```
+| Command | Description |
-browse screenshot [path]       Screenshot (default: /tmp/browse-screenshot.png)
+|---------|-------------|
-browse pdf [path]              Save as PDF
+| `screenshot [path]` | Screenshot |
-browse responsive [prefix]     Screenshots at mobile/tablet/desktop
+| `pdf [path]` | Save as PDF |
-```
+| `responsive [prefix]` | Mobile/tablet/desktop screenshots |
-
+| `diff <url1> <url2>` | Text diff between pages |
 ### Compare
 ```
 browse diff <url1> <url2>      Text diff between two pages
 ```
 ### Multi-step (chain)
 ```
 echo '[["goto","https://example.com"],["snapshot","-i"],["click","@e1"],["screenshot","/tmp/result.png"]]' | browse chain
 ```
 ### Tabs
-```
+| Command | Description |
-browse tabs                    List tabs (id, url, title)
+|---------|-------------|
-browse tab <id>                Switch to tab
+| `tabs` | List tabs |
-browse newtab [url]            Open new tab
+| `tab <id>` | Switch tab |
-browse closetab [id]           Close tab
+| `newtab [url]` | Open tab |
-```
+| `closetab [id]` | Close tab |
-### Server management
+### Server
-```
+| Command | Description |
-browse status                  Server health, uptime, tab count
+|---------|-------------|
-browse stop                    Shutdown server
+| `status` | Health check |
-browse restart                 Kill + restart server
+| `stop` | Shutdown |
-```
+| `restart` | Restart |
-## Speed Rules
+## Tips
-1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `css`, `screenshot` all run against the loaded page instantly.
+1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `screenshot` all hit the loaded page instantly.
-2. **Use `snapshot -i` for interaction.** Get refs for all interactive elements, then click/fill by ref. No need to guess CSS selectors.
+2. **Use `snapshot -i` first.** See all interactive elements, then click/fill by ref. No CSS selector guessing.
-3. **Use `js` for precision.** `js "document.querySelector('.price').textContent"` is faster than parsing full page text.
+3. **Use `snapshot -D` to verify.** Baseline → action → diff. See exactly what changed.
-4. **Use `links` to survey.** Faster than `text` when you just need navigation structure.
+4. **Use `is` for assertions.** `is visible .modal` is faster and more reliable than parsing page text.
-5. **Use `chain` for multi-step flows.** Avoids CLI overhead per step.
+5. **Use `snapshot -a` for evidence.** Annotated screenshots are great for bug reports.
-6. **Use `responsive` for layout checks.** One command = 3 viewport screenshots.
+6. **Use `snapshot -C` for tricky UIs.** Finds clickable divs that the accessibility tree misses.
-
+7. **Check `console` after actions.** Catch JS errors that don't surface visually.
-## When to Use What
+8. **Use `chain` for long flows.** Single command, no per-step CLI overhead.
 | Task | Commands |
 |------|----------|
 | Read a page | `goto <url>` then `text` |
 | Interact with elements | `snapshot -i` then `click @e3` |
 | Check if element exists | `js "!!document.querySelector('.thing')"` |
 | Extract specific data | `js "document.querySelector('.price').textContent"` |
 | Visual check | `screenshot /tmp/x.png` then Read the image |
 | Fill and submit form | `snapshot -i` → `fill @e4 "val"` → `click @e5` → `screenshot` |
 | Check CSS | `css "selector" "property"` or `css @e3 "property"` |
 | Inspect DOM | `html "selector"` or `attrs @e3` |
 | Debug console errors | `console` |
 | Check network requests | `network` |
 | Check local dev | `goto http://127.0.0.1:3000` |
 | Compare two pages | `diff <url1> <url2>` |
 | Mobile layout check | `responsive /tmp/prefix` |
 | Multi-step flow | `echo '[...]' \| browse chain` |
 ## Architecture
 - Persistent Chromium daemon on localhost (port 9400-9410)
 - Bearer token auth per session
 - State file: `/tmp/browse-server.json`
 - Console log: `/tmp/browse-console.log`
 - Network log: `/tmp/browse-network.log`
 - Auto-shutdown after 30 min idle
 - Chromium crash → server exits → auto-restarts on next command
--- a/browse/SKILL.md
+++ b/browse/SKILL.md
@@ -1,254 +1,128 @@
 ---
 name: browse
-version: 1.0.0
+version: 1.1.0
 description: |
-  Fast web browsing for Claude Code via persistent headless Chromium daemon. Navigate to any URL,
+  Fast headless browser for QA testing and site dogfooding. Navigate any URL, interact with
-  read page content, click elements, fill forms, run JavaScript, take screenshots,
+  elements, verify page state, diff before/after actions, take annotated screenshots, check
-  inspect CSS/DOM, capture console/network logs, and more. ~100ms per command after
+  responsive layouts, test forms and uploads, handle dialogs, and assert element states.
-  first call. Use when you need to check a website, verify a deployment, read docs,
+  ~100ms per command. Use when you need to test a feature, verify a deployment, dogfood a
-  or interact with any web page. No MCP, no Chrome extension — just fast CLI.
+  user flow, or file a bug with evidence.
 allowed-tools:
  - Bash
  - Read
 ---
-# gstack: Persistent Browser for Claude Code
+# browse: QA Testing & Dogfooding
-Persistent headless Chromium daemon. First call auto-starts the server (~3s).
+Persistent headless Chromium. First call auto-starts (~3s), then ~100ms per command.
-Every subsequent call: ~100-200ms. Auto-shuts down after 30 min idle.
+State persists between calls (cookies, tabs, login sessions).
-## SETUP (run this check BEFORE any browse command)
+## Core QA Patterns
 Before using any browse command, find the skill and check if the binary exists:
 ### 1. Verify a page loads correctly
 ```bash
-# Check project-level first, then user-level
+$B goto https://yourapp.com
-if test -x .claude/skills/gstack/browse/dist/browse; then
+$B text                          # content loads?
-  echo "READY_PROJECT"
+$B console                       # JS errors?
-elif test -x ~/.claude/skills/gstack/browse/dist/browse; then
+$B network                       # failed requests?
-  echo "READY_USER"
+$B is visible ".main-content"    # key elements present?
 else
  echo "NEEDS_SETUP"
 fi
 ```
-Set `B` to whichever path is READY and use it for all commands. Prefer project-level if both exist.
+### 2. Test a user flow
 If `NEEDS_SETUP`:
 1. Tell the user: "gstack browse needs a one-time build (~10 seconds). OK to proceed?" Then STOP and wait for their response.
 2. If they approve, determine the skill directory (project-level `.claude/skills/gstack` or user-level `~/.claude/skills/gstack`) and run:
 ```bash
-cd <SKILL_DIR> && ./setup
+$B goto https://app.com/login
 $B snapshot -i                   # see all interactive elements
 $B fill @e3 "user@test.com"
 $B fill @e4 "password"
 $B click @e5                     # submit
 $B snapshot -D                   # diff: what changed after submit?
 $B is visible ".dashboard"       # success state present?
 ```
 3. If `bun` is not installed, tell the user to install it: `curl -fsSL https://bun.sh/install | bash`
 4. Verify the `.gitignore` in the skill directory contains `browse/dist/` and `node_modules/`. If either line is missing, add it.
 Once setup is done, it never needs to run again (the compiled binary persists).
 ## IMPORTANT
 - Use the compiled binary via Bash: `.claude/skills/gstack/browse/dist/browse` (project) or `~/.claude/skills/gstack/browse/dist/browse` (user).
 - NEVER use `mcp__claude-in-chrome__*` tools. They are slow and unreliable.
 - The browser persists between calls — cookies, tabs, and state carry over.
 - The server auto-starts on first command. No setup needed.
 ## Quick Reference
 ### 3. Verify an action worked
 ```bash
-B=~/.claude/skills/gstack/browse/dist/browse
+$B snapshot                      # baseline
-
+$B click @e3                     # do something
-# Navigate to a page
+$B snapshot -D                   # unified diff shows exactly what changed
 $B goto https://example.com
 # Read cleaned page text
 $B text
 # Take a screenshot (then Read the image)
 $B screenshot /tmp/page.png
 # Snapshot: accessibility tree with refs
 $B snapshot -i
 # Click by ref (after snapshot)
 $B click @e3
 # Fill by ref
 $B fill @e4 "test@test.com"
 # Run JavaScript
 $B js "document.title"
 # Get all links
 $B links
 # Click by CSS selector
 $B click "button.submit"
 # Fill a form by CSS selector
 $B fill "#email" "test@test.com"
 $B fill "#password" "abc123"
 $B click "button[type=submit]"
 # Get HTML of an element
 $B html "main"
 # Get computed CSS
 $B css "body" "font-family"
 # Get element attributes
 $B attrs "nav"
 # Wait for element to appear
 $B wait ".loaded"
 # Accessibility tree
 $B accessibility
 # Set viewport
 $B viewport 375x812
 # Set cookies / headers
 $B cookie "session=abc123"
 $B header "Authorization:Bearer token123"
 ```
-## Command Reference
+### 4. Visual evidence for bug reports
-
+```bash
-### Navigation
+$B snapshot -i -a -o /tmp/annotated.png   # labeled screenshot
-```
+$B screenshot /tmp/bug.png                # plain screenshot
-browse goto <url>         Navigate current tab
+$B console                                # error log
 browse back               Go back
 browse forward            Go forward
 browse reload             Reload page
 browse url                Print current URL
 ```
-### Content extraction
+### 5. Find all clickable elements (including non-ARIA)
-```
+```bash
-browse text               Cleaned page text (no scripts/styles)
+$B snapshot -C                   # finds divs with cursor:pointer, onclick, tabindex
-browse html [selector]    innerHTML of element, or full page HTML
+$B click @c1                     # interact with them
 browse links              All links as "text → href"
 browse forms              All forms + fields as JSON
 browse accessibility      Accessibility tree snapshot (ARIA)
 ```
-### Snapshot (ref-based element selection)
+### 6. Assert element states
-```
+```bash
-browse snapshot           Full accessibility tree with @refs
+$B is visible ".modal"
-browse snapshot -i        Interactive elements only (buttons, links, inputs)
+$B is enabled "#submit-btn"
-browse snapshot -c        Compact (no empty structural elements)
+$B is disabled "#submit-btn"
-browse snapshot -d <N>    Limit depth to N levels
+$B is checked "#agree-checkbox"
-browse snapshot -s <sel>  Scope to CSS selector
+$B is editable "#name-field"
 $B is focused "#search-input"
 $B js "document.body.textContent.includes('Success')"
 ```
-After snapshot, use @refs as selectors in any command:
+### 7. Test responsive layouts
-```
+```bash
-browse click @e3          Click the element assigned ref @e3
+$B responsive /tmp/layout        # mobile + tablet + desktop screenshots
-browse fill @e4 "value"   Fill the input assigned ref @e4
+$B viewport 375x812              # or set specific viewport
-browse hover @e1          Hover the element assigned ref @e1
+$B screenshot /tmp/mobile.png
 browse html @e2           Get innerHTML of ref @e2
 browse css @e5 "color"    Get computed CSS of ref @e5
 browse attrs @e6          Get attributes of ref @e6
 ```
-Refs are invalidated on navigation — run `snapshot` again after `goto`.
+### 8. Test file uploads
-
+```bash
-### Interaction
+$B upload "#file-input" /path/to/file.pdf
-```
+$B is visible ".upload-success"
 browse click <selector>        Click element (CSS selector or @ref)
 browse fill <selector> <value> Fill input field
 browse select <selector> <val> Select dropdown value
 browse hover <selector>        Hover over element
 browse type <text>             Type into focused element
 browse press <key>             Press key (Enter, Tab, Escape, etc.)
 browse scroll [selector]       Scroll element into view, or page bottom
 browse wait <selector>         Wait for element to appear (max 10s)
 browse viewport <WxH>          Set viewport size (e.g. 375x812)
 ```
-### Inspection
+### 9. Test dialogs
-```
+```bash
-browse js <expression>         Run JS, print result
+$B dialog-accept "yes"           # set up handler
-browse eval <js-file>          Run JS file against page
+$B click "#delete-button"        # trigger dialog
-browse css <selector> <prop>   Get computed CSS property
+$B dialog                        # see what appeared
-browse attrs <selector>        Get element attributes as JSON
+$B snapshot -D                   # verify deletion happened
 browse console                 Dump captured console messages
 browse console --clear         Clear console buffer
 browse network                 Dump captured network requests
 browse network --clear         Clear network buffer
 browse cookies                 Dump all cookies as JSON
 browse storage                 localStorage + sessionStorage as JSON
 browse storage set <key> <val> Set localStorage value
 browse perf                    Page load performance timings
 ```
-### Visual
+### 10. Compare environments
-```
+```bash
-browse screenshot [path]       Screenshot (default: /tmp/browse-screenshot.png)
+$B diff https://staging.app.com https://prod.app.com
 browse pdf [path]              Save as PDF
 browse responsive [prefix]     Screenshots at mobile/tablet/desktop
 ```
-### Compare
+## Snapshot Flags
 ```
-browse diff <url1> <url2>      Text diff between two pages
+-i        Interactive elements only (buttons, links, inputs)
 -c        Compact (no empty structural nodes)
 -d <N>    Limit depth
 -s <sel>  Scope to CSS selector
 -D        Diff against previous snapshot
 -a        Annotated screenshot with ref labels
 -o <path> Output path for screenshot
 -C        Cursor-interactive elements (@c refs)
 ```
-### Multi-step (chain)
+Combine: `$B snapshot -i -a -C -o /tmp/annotated.png`
 ```
 echo '[["goto","https://example.com"],["snapshot","-i"],["click","@e1"],["screenshot","/tmp/result.png"]]' | browse chain
 ```
-### Tabs
+Use @refs after snapshot: `$B click @e3`, `$B fill @e4 "value"`, `$B click @c1`
 ```
 browse tabs                    List tabs (id, url, title)
 browse tab <id>                Switch to tab
 browse newtab [url]            Open new tab
 browse closetab [id]           Close tab
 ```
-### Server management
+## Full Command List
 ```
 browse status                  Server health, uptime, tab count
 browse stop                    Shutdown server
 browse restart                 Kill + restart server
 ```
-## Speed Rules
+**Navigate:** goto, back, forward, reload, url
-
+**Read:** text, html, links, forms, accessibility
-1. **Navigate once, query many times.** `goto` loads the page; then `text`, `js`, `css`, `screenshot` all run against the loaded page instantly.
+**Snapshot:** snapshot (with flags above)
-2. **Use `snapshot -i` for interaction.** Get refs for all interactive elements, then click/fill by ref. No need to guess CSS selectors.
+**Interact:** click, fill, select, hover, type, press, scroll, wait, wait --networkidle, wait --load, viewport, upload, cookie-import, dialog-accept, dialog-dismiss
-3. **Use `js` for precision.** `js "document.querySelector('.price').textContent"` is faster than parsing full page text.
+**Inspect:** js, eval, css, attrs, is, console, console --errors, network, dialog, cookies, storage, perf
-4. **Use `links` to survey.** Faster than `text` when you just need navigation structure.
+**Visual:** screenshot, pdf, responsive
-5. **Use `chain` for multi-step flows.** Avoids CLI overhead per step.
+**Compare:** diff
-6. **Use `responsive` for layout checks.** One command = 3 viewport screenshots.
+**Multi-step:** chain (pipe JSON array)
-
+**Tabs:** tabs, tab, newtab, closetab
-## When to Use What
+**Server:** status, stop, restart
 | Task | Commands |
 |------|----------|
 | Read a page | `goto <url>` then `text` |
 | Interact with elements | `snapshot -i` then `click @e3` |
 | Check if element exists | `js "!!document.querySelector('.thing')"` |
 | Extract specific data | `js "document.querySelector('.price').textContent"` |
 | Visual check | `screenshot /tmp/x.png` then Read the image |
 | Fill and submit form | `snapshot -i` → `fill @e4 "val"` → `click @e5` → `screenshot` |
 | Check CSS | `css "selector" "property"` or `css @e3 "property"` |
 | Inspect DOM | `html "selector"` or `attrs @e3` |
 | Debug console errors | `console` |
 | Check network requests | `network` |
 | Check local dev | `goto http://127.0.0.1:3000` |
 | Compare two pages | `diff <url1> <url2>` |
 | Mobile layout check | `responsive /tmp/prefix` |
 | Multi-step flow | `echo '[...]' \| browse chain` |
 ## Architecture
 - Persistent Chromium daemon on localhost (port 9400-9410)
 - Bearer token auth per session
 - State file: `/tmp/browse-server.json`
 - Console log: `/tmp/browse-console.log`
 - Network log: `/tmp/browse-network.log`
 - Auto-shutdown after 30 min idle
 - Chromium crash → server exits → auto-restarts on next command