mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-08 21:49:45 +08:00
chore: bump version and changelog (v1.1.0.0)
- VERSION: 1.0.0.0 → 1.1.0.0 (MINOR bump: new user-facing commands)
- package.json: matching version bump
- CHANGELOG.md: new 1.1.0.0 entry describing load-html, screenshot --selector, viewport --scale, file:// support, setContent replay, and DX polish in user voice, with a dedicated Security section for the file:// safe-dirs policy
- browse/SKILL.md.tmpl: adds pattern #12 "Render local HTML", pattern #13 "Retina screenshots", and a full Puppeteer → browse cheatsheet with side-by-side API mapping and a worked tweet-renderer migration example
- browse/SKILL.md + SKILL.md: regenerated from templates via `bun run gen:skill-docs` to reflect the new command descriptions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
24
CHANGELOG.md
@@ -1,5 +1,29 @@
 # Changelog

+## [1.1.0.0] - 2026-04-18
+
+### Added
+- **Browse can now render local HTML without an HTTP server.** Two ways: `$B goto file:///tmp/report.html` navigates to a local file (including cwd-relative `file://./x` and home-relative `file://~/x` forms, smart-parsed so you don't have to think about URL grammar), or `$B load-html /tmp/tweet.html` reads the file and loads it via `page.setContent()`. Both are scoped to cwd + temp dir for safety. If you're migrating a Puppeteer script that generates HTML in memory, this kills your Python-HTTP-server workaround.
+- **Element screenshots with an explicit flag.** `$B screenshot out.png --selector .card` is now the unambiguous way to screenshot a single element. Positional selectors still work, but tag selectors like `button` weren't recognized positionally, so the flag form fixes that. `--selector` composes with `--base64` and is rejected when combined with `--clip` (choose one).
+- **Retina screenshots via `--scale`.** `$B viewport 480x2000 --scale 2` sets `deviceScaleFactor: 2` and produces pixel-doubled screenshots. `$B viewport --scale 2` alone changes just the scale factor and keeps the current size. Scale must be between 1 and 3 (gstack policy). Headed mode rejects the flag since scale is controlled by the real browser window.
+- **Load-HTML content survives scale changes.** Changing `--scale` rebuilds the browser context (that's how Playwright works), which previously would have wiped pages loaded via `load-html`. Now the HTML is cached in tab state and replayed into the new context automatically. In-memory only; never persisted to disk.
+- **Puppeteer → browse cheatsheet in SKILL.md.** Side-by-side table of Puppeteer APIs mapped to browse commands, plus a full worked example (tweet-renderer flow: viewport + scale + load-html + element screenshot).
+- **Guess-friendly aliases.** Type `setcontent` or `set-content` and it routes to `load-html`. Canonicalization happens before scope checks, so read-scoped tokens can't use the alias to bypass write-scope enforcement.
+- **`Did you mean ...?` on unknown commands.** `$B load-htm` returns `Unknown command: 'load-htm'. Did you mean 'load-html'?`. Levenshtein match within distance 2, gated on input length ≥ 4 so 2-letter typos don't produce noise.
+- **Rich, actionable errors on `load-html`.** Every rejection path (file not found, directory, oversize, outside safe dirs, binary content, frame context) names the input, explains the cause, and says what to do next. Extension allowlist `.html/.htm/.xhtml/.svg` + magic-byte sniff (with UTF-8 BOM strip) catches mis-renamed binaries before they render as garbage.
+
+### Security
+- `goto` now accepts the `file://` scheme, scoped to cwd + temp dir via the existing `validateReadPath()` policy. UNC/network hosts (`file://host.example.com/...`), IP hosts, IPv6 hosts, and Windows drive-letter hosts are all rejected with explicit errors.
+
+### For contributors
+- `validateNavigationUrl()` now returns the normalized URL (previously void). All four callers (goto, diff, newTab, restoreState) were updated to consume the return value so smart-parsing takes effect at every navigation site.
+- New `normalizeFileUrl()` helper uses `fileURLToPath()` + `pathToFileURL()` from `node:url` rather than string concatenation, so URL escapes like `%20` decode correctly and encoded-slash traversal (`%2F..%2F`) is rejected by Node outright.
+- New `TabSession.loadedHtml` field + `setTabContent()` / `getLoadedHtml()` / `clearLoadedHtml()` methods. ASCII lifecycle diagram in the source. The `clear` call happens BEFORE navigation starts (not after) so a goto that times out post-commit doesn't leave stale metadata that could resurrect on a later context recreation.
+- `BrowserManager.setDeviceScaleFactor(scale, w, h)` is atomic: validates input, stores new values, calls `recreateContext()`, rolls back the fields on failure. `currentViewport` tracking means recreateContext preserves your size instead of hardcoding 1280×720.
+- `COMMAND_ALIASES` + `canonicalizeCommand()` + `buildUnknownCommandError()` + `NEW_IN_VERSION` are exported from `browse/src/commands.ts`. Single source of truth: both the server dispatcher and `chain` prevalidation import from the same place. Chain uses `{ rawName, name }` shape per step so audit logs preserve what the user typed while dispatch uses the canonical name.
+- `load-html` is registered in `SCOPE_WRITE` in `browse/src/token-registry.ts`.
+- Review history for the curious: 3 Codex consults (20 + 10 + 6 gaps), DX review (TTHW ~4min → <60s, Champion tier), 2 Eng review passes. Third Codex pass caught the 4-caller bug for `validateNavigationUrl` that the eng passes missed. All findings folded into the plan.
+
 ## [1.0.0.0] - 2026-04-18

 ### Added
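
The `Did you mean ...?` entry in this changelog describes a Levenshtein match within distance 2, gated on input length ≥ 4. A minimal TypeScript sketch of that rule is shown here. The names `levenshtein`, `suggestCommand`, and `unknownCommandMessage` are hypothetical and are not gstack's actual `buildUnknownCommandError()`; how ties between equally close candidates are broken is also an assumption (the first candidate wins).

```typescript
// Hypothetical sketch of the "Did you mean" rule; not gstack's real code.

/** Levenshtein edit distance, computed with two rolling rows. */
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(Math.min(prev[j]! + 1, curr[j - 1]! + 1, prev[j - 1]! + cost));
    }
    prev = curr;
  }
  return prev[b.length]!;
}

/** Closest known command within distance 2, or null. Inputs under 4 chars get no suggestion. */
function suggestCommand(input: string, known: readonly string[]): string | null {
  if (input.length < 4) return null;
  let best: string | null = null;
  let bestDistance = 3; // only distances 0..2 qualify
  for (const name of known) {
    const d = levenshtein(input, name);
    if (d < bestDistance) {
      best = name;
      bestDistance = d;
    }
  }
  return best;
}

/** Builds the error text in the format quoted in the changelog. */
function unknownCommandMessage(input: string, known: readonly string[]): string {
  const hint = suggestCommand(input, known);
  const base = `Unknown command: '${input}'.`;
  return hint === null ? base : `${base} Did you mean '${hint}'?`;
}

const exampleMessage = unknownCommandMessage("load-htm", ["goto", "load-html", "reload"]);
// exampleMessage === "Unknown command: 'load-htm'. Did you mean 'load-html'?"
```

The length gate comes first so that very short inputs never reach the distance computation, which matches the stated goal of keeping short typos from producing noisy suggestions.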
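
The changelog notes that alias canonicalization happens before scope checks, so a read-scoped token cannot reach `load-html` through `setcontent`. The ordering can be sketched as follows. Everything here is illustrative: `ALIASES`, `WRITE_COMMANDS`, and `authorize` are hypothetical stand-ins for the real `COMMAND_ALIASES`, `SCOPE_WRITE`, and dispatcher, and treating every non-write command as read-scoped is an assumption made only to keep the example short.

```typescript
// Hypothetical sketch: resolve the alias first, then check scope on the canonical name.
type Scope = "read" | "write";

const ALIASES = new Map<string, string>([
  ["setcontent", "load-html"],
  ["set-content", "load-html"],
]);

// Only load-html is confirmed as write-scoped by the changelog.
const WRITE_COMMANDS = new Set<string>(["load-html"]);

interface Step {
  rawName: string; // what the user typed (kept for audit logs)
  name: string;    // canonical command used for dispatch
}

function canonicalize(rawName: string): string {
  return ALIASES.get(rawName) ?? rawName;
}

function authorize(rawName: string, tokenScopes: ReadonlySet<Scope>): Step {
  const name = canonicalize(rawName); // must run before the scope check
  const needed: Scope = WRITE_COMMANDS.has(name) ? "write" : "read";
  if (!tokenScopes.has(needed)) {
    throw new Error(`Command '${name}' (typed as '${rawName}') requires ${needed} scope`);
  }
  return { rawName, name };
}

const allowedStep = authorize("set-content", new Set<Scope>(["read", "write"]));
// allowedStep.name === "load-html", allowedStep.rawName === "set-content"
```

If the scope lookup used `rawName` instead of the canonical name, `setcontent` would be absent from the write set and would pass as a read command, which is the bypass the changelog says is prevented.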
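
The contributor notes describe `BrowserManager.setDeviceScaleFactor(scale, w, h)` as atomic: validate, store, recreate the context, roll back on failure. A small sketch of that pattern is given here under stated assumptions. The class `ScaleState` is hypothetical, the 1280x720 starting size is assumed from the changelog's mention of the old hardcoded value, and the context rebuild is modelled as a synchronous callback to keep the example short (a real Playwright context rebuild is asynchronous and would be awaited).

```typescript
// Hypothetical sketch of validate -> store -> recreate -> roll back on failure.
interface Viewport {
  width: number;
  height: number;
}

class ScaleState {
  scale = 1;
  viewport: Viewport = { width: 1280, height: 720 };
  private readonly recreateContext: () => void;

  constructor(recreateContext: () => void) {
    this.recreateContext = recreateContext;
  }

  setDeviceScaleFactor(scale: number, width: number, height: number): void {
    // 1. Validate before touching any state.
    if (!Number.isFinite(scale) || scale < 1 || scale > 3) {
      throw new Error(`scale must be between 1 and 3, got ${scale}`);
    }
    // 2. Remember the old values, then store the new ones.
    const previousScale = this.scale;
    const previousViewport = this.viewport;
    this.scale = scale;
    this.viewport = { width, height };
    // 3. Rebuild the context; restore the old fields if that fails.
    try {
      this.recreateContext();
    } catch (err) {
      this.scale = previousScale;
      this.viewport = previousViewport;
      throw err;
    }
  }
}

const failing = new ScaleState(() => {
  throw new Error("context rebuild failed");
});
let rebuildError = "";
try {
  failing.setDeviceScaleFactor(2, 480, 600);
} catch (err) {
  rebuildError = err instanceof Error ? err.message : String(err);
}
// failing.scale is still 1 and failing.viewport is still 1280x720 here.
```

Storing the new values before the rebuild lets the rebuild read them from the same fields it always uses, and the rollback keeps the tracked state in sync with the context that is actually running.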
7
SKILL.md
@@ -797,7 +797,8 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
 |---------|-------------|
 | `back` | History back |
 | `forward` | History forward |
-| `goto <url>` | Navigate to URL |
+| `goto <url>` | Navigate to URL (http://, https://, or file:// scoped to cwd/TEMP_DIR) |
+| `load-html <file> [--wait-until load|domcontentloaded|networkidle]` | Load a local HTML file via setContent (no HTTP server needed). For self-contained HTML (inline CSS/JS, data URIs). For HTML on disk, goto file://... is often cleaner. |
 | `reload` | Reload page |
 | `url` | Print current URL |
@@ -848,7 +849,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
 | `type <text>` | Type into focused element |
 | `upload <sel> <file> [file2...]` | Upload file(s) |
 | `useragent <string>` | Set user agent |
-| `viewport <WxH>` | Set viewport size |
+| `viewport [<WxH>] [--scale <n>]` | Set viewport size and optional deviceScaleFactor (1-3, for retina screenshots). --scale requires a context rebuild. |
 | `wait <sel|--networkidle|--load>` | Wait for element, network idle, or page load (timeout: 15s) |

 ### Inspection
@@ -875,7 +876,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
 | `pdf [path]` | Save as PDF |
 | `prettyscreenshot [--scroll-to sel|text] [--cleanup] [--hide sel...] [--width px] [path]` | Clean screenshot with optional cleanup, scroll positioning, and element hiding |
 | `responsive [prefix]` | Screenshots at mobile (375x812), tablet (768x1024), desktop (1280x720). Saves as {prefix}-mobile.png etc. |
-| `screenshot [--viewport] [--clip x,y,w,h] [selector|@ref] [path]` | Save screenshot (supports element crop via CSS/@ref, --clip region, --viewport) |
+| `screenshot [--selector <css>] [--viewport] [--clip x,y,w,h] [--base64] [selector|@ref] [path]` | Save screenshot. --selector targets a specific element (explicit flag form). Positional selectors starting with ./#/@/[ still work. |

 ### Snapshot
 | Command | Description |
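
The new `viewport [<WxH>] [--scale <n>]` signature makes both parts optional, with the scale limited to 1-3. A sketch of how such arguments might be parsed is shown here. `parseViewportArgs` and its error messages are hypothetical, and accepting fractional scales such as 1.5 is an assumption (the diff only states the 1-3 range).

```typescript
// Hypothetical parser for `viewport [<WxH>] [--scale <n>]`; not gstack's real code.
interface ViewportArgs {
  width?: number;
  height?: number;
  scale?: number;
}

function parseViewportArgs(argv: readonly string[]): ViewportArgs {
  const result: ViewportArgs = {};
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i]!;
    if (arg === "--scale") {
      const raw = argv[i + 1];
      i += 1; // consume the flag's value
      const scale = Number(raw);
      if (raw === undefined || !Number.isFinite(scale) || scale < 1 || scale > 3) {
        throw new Error(`--scale must be a number from 1 to 3, got '${raw ?? ""}'`);
      }
      result.scale = scale;
      continue;
    }
    const match = /^(\d+)x(\d+)$/.exec(arg);
    if (match === null) {
      throw new Error(`Expected <WxH> or --scale <n>, got '${arg}'`);
    }
    result.width = Number(match[1]);
    result.height = Number(match[2]);
  }
  if (result.width === undefined && result.scale === undefined) {
    throw new Error("Usage: viewport [<WxH>] [--scale <n>]");
  }
  return result;
}

const sizeAndScale = parseViewportArgs(["480x2000", "--scale", "2"]);
// sizeAndScale: { width: 480, height: 2000, scale: 2 }
const scaleOnly = parseViewportArgs(["--scale", "2"]);
// scaleOnly: { scale: 2 }  (size omitted, so the caller would keep the current viewport)
```

Leaving `width` and `height` undefined when only `--scale` is given is what lets a caller keep the current size, as the changelog describes for `$B viewport --scale 2`.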
@@ -584,6 +584,57 @@ $B diff https://staging.app.com https://prod.app.com
 ### 11. Show screenshots to the user
 After `$B screenshot`, `$B snapshot -a -o`, or `$B responsive`, always use the Read tool on the output PNG(s) so the user can see them. Without this, screenshots are invisible.

+### 12. Render local HTML (no HTTP server needed)
+Two paths, pick the cleaner one:
+```bash
+# HTML file on disk → goto file:// (absolute, or cwd-relative)
+$B goto file:///tmp/report.html
+$B goto file://./docs/page.html # cwd-relative
+$B goto file://~/Documents/page.html # home-relative
+
+# HTML generated in memory → load-html reads the file into setContent
+echo '<div class="tweet">hello</div>' > /tmp/tweet.html
+$B load-html /tmp/tweet.html
+```
+
+`goto file://...` is usually cleaner (URL is saved in state, relative asset URLs resolve against the file's dir, scale changes replay naturally). `load-html` uses `page.setContent()`: the URL stays `about:blank`, but the content survives `viewport --scale` via in-memory replay. Both are scoped to files under cwd or `$TMPDIR`.
+
+### 13. Retina screenshots (deviceScaleFactor)
+```bash
+$B viewport 480x600 --scale 2 # 2x deviceScaleFactor
+$B load-html /tmp/tweet.html # or: $B goto file://./tweet.html
+$B screenshot /tmp/out.png --selector .tweet-card
+# → /tmp/out.png is 2x the pixel dimensions of the element
+```
+Scale must be 1-3 (gstack policy cap). Changing `--scale` recreates the browser context; refs from `snapshot` are invalidated (rerun `snapshot`), but `load-html` content is replayed automatically. Not supported in headed mode.
+
+## Puppeteer → browse cheatsheet
+
+Migrating from Puppeteer? Here's the 1:1 mapping for the core workflow:
+
+| Puppeteer | browse |
+|---|---|
+| `await page.goto(url)` | `$B goto <url>` |
+| `await page.setContent(html)` | `$B load-html <file>` (or `$B goto file://<abs>`) |
+| `await page.setViewport({width, height})` | `$B viewport WxH` |
+| `await page.setViewport({width, height, deviceScaleFactor: 2})` | `$B viewport WxH --scale 2` |
+| `await (await page.$('.x')).screenshot({path})` | `$B screenshot <path> --selector .x` |
+| `await page.screenshot({fullPage: true, path})` | `$B screenshot <path>` (full page default) |
+| `await page.screenshot({clip: {x, y, width, height}, path})` | `$B screenshot <path> --clip x,y,w,h` |
+
+Worked example (the tweet-renderer flow, Puppeteer → browse):
+
+```bash
+# Generate HTML in memory, render at 2x scale, screenshot the tweet card.
+echo '<div class="tweet-card" style="width:400px;height:200px;background:#1da1f2;color:white;padding:20px">hello</div>' > /tmp/tweet.html
+$B viewport 480x600 --scale 2
+$B load-html /tmp/tweet.html
+$B screenshot /tmp/out.png --selector .tweet-card
+# /tmp/out.png is 880x480 px, crisp: the 440x240 CSS px border box (400x200 content plus 20px padding per side) at 2x deviceScaleFactor.
+```
+
+Aliases: typing `setcontent` or `set-content` routes to `load-html` automatically. Typing a typo (`load-htm`) returns `Did you mean 'load-html'?`.
+
 ## User Handoff

 When you hit something you can't handle in headless mode (CAPTCHA, complex auth, multi-factor
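
The retina pattern says the output PNG has 2x the pixel dimensions of the element. The arithmetic can be sketched as follows, assuming the element screenshot captures the element's border box (content plus padding plus border under the default `content-box` sizing) and ignoring sub-pixel rounding by the browser. `ElementBox`, `borderBox`, and `screenshotSize` are hypothetical helpers for illustration only.

```typescript
// Hypothetical arithmetic sketch: CSS box size -> screenshot pixel size.
interface ElementBox {
  width: number;   // CSS `width` in px
  height: number;  // CSS `height` in px
  padding: number; // uniform padding in px
  border: number;  // uniform border width in px
  boxSizing: "content-box" | "border-box";
}

interface Size {
  width: number;
  height: number;
}

/** Border-box size in CSS px: under content-box, padding and border add to width/height. */
function borderBox(el: ElementBox): Size {
  const extra = el.boxSizing === "content-box" ? 2 * (el.padding + el.border) : 0;
  return { width: el.width + extra, height: el.height + extra };
}

/** Pixel size of an element screenshot at a given deviceScaleFactor. */
function screenshotSize(el: ElementBox, scale: number): Size {
  const box = borderBox(el);
  return { width: Math.round(box.width * scale), height: Math.round(box.height * scale) };
}

// A 400x200 card with 20px padding and default content-box sizing, at --scale 2.
const contentBoxCard = screenshotSize(
  { width: 400, height: 200, padding: 20, border: 0, boxSizing: "content-box" },
  2,
);
// contentBoxCard: { width: 880, height: 480 }

// The same declaration with box-sizing: border-box.
const borderBoxCard = screenshotSize(
  { width: 400, height: 200, padding: 20, border: 0, boxSizing: "border-box" },
  2,
);
// borderBoxCard: { width: 800, height: 400 }
```

When an exact output size matters, setting `box-sizing: border-box` on the card makes the declared width and height equal the captured box, so the PNG is simply the declared size times the scale.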
@@ -1,6 +1,6 @@
 {
 "name": "gstack",
-"version": "1.0.0.0",
+"version": "1.1.0.0",
"description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.",
 "license": "MIT",
 "type": "module",