docs: v1.28.0.0 — browse SKILL section + VERSION + CHANGELOG

VERSION 1.27.1.0 → 1.28.0.0 (MINOR — substantial new capability:
five new flags/features, ~600 LOC added, new socks dep, multiple
new modules).

browse/SKILL.md.tmpl: new "Headed Mode + Proxy + Anti-Bot Sites"
section between User Handoff and Snapshot Flags. Documents
--headed (auto-Xvfb on Linux), --proxy (with embedded SOCKS5
bridge for auth), download --navigate, the cred-mixing policy,
daemon-discipline (refuse-on-mismatch), the narrowed
webdriver-only stealth, container support caveats, and the
fail-fast/no-retry failure modes.

CHANGELOG entry follows the release-summary format from CLAUDE.md:
two-line headline, lead paragraph, "The numbers that matter"
table tied to specific test files that prove each capability,
"What this means for AI agents" closing tied to a real workflow
shift, then itemized Added/Changed/Fixed/For-contributors
sections.

Browse SKILL.md regenerated via bun run gen:skill-docs.
gstack/llms.txt regenerated automatically from the same pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Garry Tan
2026-05-07 13:40:05 -07:00
parent 9cb98a7103
commit 0947f0f935
5 changed files with 191 additions and 3 deletions


@@ -1,5 +1,121 @@
# Changelog
## [1.28.0.0] - 2026-05-07
## **Browse handles real-world automation now: SOCKS5 with auth, container Xvfb, browser-native downloads. Plus a single-file `llms.txt` index agents can crawl in one read.**
Five capabilities ship in one PR. Browse picks up `--proxy` (with an
embedded SOCKS5 bridge so Chromium can speak to authenticated
upstreams it can't speak to natively), `--headed` (auto-spawns Xvfb
on Linux containers without DISPLAY), and `download --navigate` (uses
the browser's native download handler for Content-Disposition,
multi-hop CDN redirects, and anti-bot CDN chains where
`page.request.fetch()` falls over). Stealth is narrowed to
`navigator.webdriver` masking only — modern fingerprinters punish
inconsistent fakes, so faking plugins/languages was making
detection easier, not harder. And `gstack/llms.txt` is now
auto-generated from the same source as every SKILL.md, so any agent
that reads `llms.txt` boots into the full surface (47 skills, 75
browse commands) in one fetch.
### The numbers that matter
End-to-end verified via `bun test browse/test/{socks-bridge,proxy-config,proxy-redact,xvfb,stealth-webdriver,bridge-chromium-e2e}.test.ts test/llms-txt-shape.test.ts`:
| Surface | Before | After | Δ |
|---|---|---|---|
| `browse --proxy` (SOCKS5 with auth) | not supported | works end-to-end | new capability |
| `browse --headed` on Linux without DISPLAY | not supported | auto-Xvfb on first free display | new capability |
| `download --navigate` (browser-native) | only `page.request.fetch()` | added native download path | new capability |
| `gstack/llms.txt` index for agents | none | 47 skills + 75 commands in 11KB | new capability |
| Bridge PID validation defenses | n/a | both `/proc/<pid>/cmdline` AND start-time | guards against PID reuse |
| Tests covering proxy + headed + navigate | 0 | 70+ tests across 7 files | from zero to comprehensive |
The `bridge-chromium-e2e.test.ts` is the one that proves the feature
actually works: real Chromium launches with `proxy.server =
socks5://127.0.0.1:<bridgePort>`, navigates to a local HTTP fixture,
and we assert the auth upstream's connect counter and the HTTP
fixture's hit counter both increment. Without that test we could
ship a working byte-relay and a broken Chromium integration and never
notice.
### What this means for AI agents
Agents can now reach many sites that previously blocked browse.
DDoS-Guard'd CDN reached through an auth-required residential
SOCKS5 → `browse --proxy
socks5://user:pass@host:1080 --headed download <url> /tmp/file
--navigate` and the file lands. Linux container without DISPLAY →
`--headed` auto-spawns Xvfb, no manual setup. The `llms.txt` index
makes discovery a one-fetch operation: agents stop scanning 47
SKILL.md files and start with the right skill on the first try.
### Itemized changes
#### Added
- `browse --proxy <url>` flag. Supports SOCKS5 with username/password
auth, HTTP, and HTTPS. SOCKS5+auth runs through an embedded local
bridge (`browse/src/socks-bridge.ts`, ~250 LOC) bound to 127.0.0.1
on an ephemeral port. The bridge handles the SOCKS5 auth handshake
so Chromium (which can't prompt for SOCKS5 creds) can still use
authenticated upstreams.
- Pre-flight `testUpstream()` runs before Chromium launches: 5s total
budget, 3 retries with 500ms backoff (handles VPN warm-up race).
On failure, exits 1 with a redacted error message — no confusing
"connection refused" on first navigation.
- `browse --headed` flag with auto-Xvfb on Linux. Walks the display
range (`:99`, `:100`, ...) until `xdpyinfo` says free; never
hardcodes `:99` and never unlinks `/tmp/.X<n>-lock` for displays
it didn't create. Xvfb child PID + start-time + display recorded
in `~/.gstack/browse.json` so cleanup-on-disconnect can validate
ownership before signaling. Skips spawn when `WAYLAND_DISPLAY` is
set (Chromium uses Wayland natively).
- `download --navigate` flag (community PR #1355, attribution preserved).
Uses `page.waitForEvent('download')` and `page.goto(url, {
waitUntil: 'commit' })` instead of `page.request.fetch()`.
Required for sites where the download is triggered by browser
navigation (Content-Disposition headers, redirect chains, anti-bot
CDNs).
- `gstack/llms.txt` auto-generated from skill frontmatter and the
browse `COMMAND_DESCRIPTIONS` registry. Regenerates on every
`bun run gen:skill-docs`. Strict mode (used in tests) refuses any
skill missing `name` or `description` in its frontmatter.
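The pre-flight budget described above (5s total, 3 retries, 500ms backoff) could be sketched roughly as below. This is an illustration only, not the code in `browse/src`: `probeWithBudget`, its option names, and the injected `probe` callback are hypothetical, and reading "3 retries" as three attempts in total is an assumption.

```typescript
// Hypothetical sketch: bounded retry under a total time budget.
// All names and defaults here are illustrative assumptions.
type Probe = () => Promise<void>;

interface BudgetOptions {
  attempts: number;  // assumed: 3 attempts in total
  backoffMs: number; // assumed: fixed 500ms pause between attempts
  budgetMs: number;  // assumed: 5000ms across all attempts
  sleep?: (ms: number) => Promise<void>; // injectable for tests
  now?: () => number;                    // injectable clock
}

async function probeWithBudget(probe: Probe, opts: BudgetOptions): Promise<number> {
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  const now = opts.now ?? (() => Date.now());
  const deadline = now() + opts.budgetMs;
  let lastError: unknown = new Error("no attempts made");

  for (let attempt = 1; attempt <= opts.attempts; attempt++) {
    try {
      await probe();
      return attempt; // how many attempts it took
    } catch (err) {
      lastError = err;
    }
    // Give up when attempts are exhausted or another pause would overrun the budget.
    // Note: this sketch does not time out a probe that hangs on its own.
    if (attempt === opts.attempts || now() + opts.backoffMs > deadline) break;
    await sleep(opts.backoffMs);
  }

  const reason = lastError instanceof Error ? lastError.message : String(lastError);
  throw new Error(`upstream check failed: ${reason}`);
}
```

Injecting `sleep` and `now` keeps the retry logic testable without real delays; a real probe would also want its own per-attempt timeout.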
#### Changed
- Stealth narrowed to `navigator.webdriver` masking only. The
pre-existing `launchHeaded` patches that faked `navigator.plugins`
and `navigator.languages` were removed because modern
fingerprinters check those for consistency with `userAgent`/
`platform`, and synthesized fixed values can look MORE bot-like,
not less. The cdc_/__webdriver runtime cleanup and Permissions API
patch are kept — those remove ChromeDriver-injected artifacts
rather than synthesize natural-browser values.
- Browse daemon refuses to silently restart on `--proxy`/`--headed`
flag mismatch. Existing daemon with config A + new invocation with
config B → exits 1 with a `browse disconnect` hint. No silent
state loss.
- Cred policy: passing creds in BOTH the URL and `BROWSE_PROXY_USER`/
`BROWSE_PROXY_PASS` env vars now fails fast with a clear error.
Silent override was a debugging trap.
#### Fixed
- N/A — all-new code paths.
#### For contributors
- New module boundary: `browse/src/socks-bridge.ts`,
`browse/src/proxy-config.ts`, `browse/src/proxy-redact.ts`,
`browse/src/xvfb.ts`, `browse/src/stealth.ts`. Each is small,
testable in isolation, and has matching `*.test.ts` coverage.
- 70+ new tests across 7 files. The `bridge-chromium-e2e.test.ts`
test launches real Chromium through the bridge and asserts the
request actually traversed it (upstream connect counter + HTTP
fixture hit counter both increment).
- `socks` npm dependency added (~30KB).
- Xvfb + x11-utils added to `.github/docker/Dockerfile.ci` so
`headed-xvfb`/`headed-orphan-cleanup` exercise the Linux container
path on every CI run instead of only manual smoke tests.
- Community PR #1355 from @garrytan-agents merged; attribution
preserved on the merging commit.
## [1.27.1.0] - 2026-05-06
## **Plan-mode reviews now refuse to dump findings without asking. Four gate-tier tests catch the regression on every PR.**


@@ -862,7 +862,7 @@ Refs are invalidated on navigation — run `snapshot` again after `goto`.
| Command | Description |
|---------|-------------|
| `archive [path]` | Save complete page as MHTML via CDP |
-| `download <url|@ref> [path] [--base64]` | Download URL or media element to disk using browser cookies |
+| `download <url|@ref> [path] [--base64] [--navigate]` | Download URL or media element to disk using browser cookies. Use --navigate for URLs that trigger browser downloads (CDN redirects, Content-Disposition, anti-bot protected sites) |
| `scrape <images|videos|media> [--selector sel] [--dir path] [--limit N]` | Bulk download all media from page. Writes manifest.json |
### Interaction


@@ -1 +1 @@
-1.27.1.0
+1.28.0.0


@@ -679,6 +679,42 @@ $B resume
The browser preserves all state (cookies, localStorage, tabs) across the handoff.
After `resume`, you get a fresh snapshot of wherever the user left off.
## Headed Mode + Proxy + Anti-Bot Sites
For sites that block headless browsers, fingerprint Playwright defaults, or require routing through an authenticated SOCKS5 proxy (residential VPN, etc.), browse exposes three coordinated flags:
```bash
# Headed mode — visible Chromium window. Auto-spawns Xvfb on Linux
# containers without DISPLAY (no extra setup needed on Debian/Ubuntu).
browse --headed goto https://example.com
# SOCKS5 with auth (Chromium can't prompt for SOCKS5 creds itself —
# browse runs a local 127.0.0.1 bridge that handles the auth handshake).
browse --proxy socks5://user:pass@residential.proxy.host:1080 goto https://example.com
# HTTP/HTTPS proxy (passes through to Chromium directly):
browse --proxy http://corp-proxy:3128 goto https://example.com
# Browser-triggered file download (Content-Disposition, redirect chain,
# anti-bot CDN). --navigate uses the browser's native download handler
# instead of page.request.fetch():
browse download "https://protected.example.com/file" /tmp/file.bin --navigate
# Combined: headed + proxy + navigate-download
browse --headed --proxy socks5://user:pass@host:1080 \
download "https://protected.example.com/file" /tmp/file.bin --navigate
```
**Credential policy.** Pass creds via either the URL (`socks5://user:pass@host`) OR the env vars `BROWSE_PROXY_USER` and `BROWSE_PROXY_PASS` — never both. Browse refuses with a clear hint when both are set, because silent override creates "works on my machine" debugging traps.
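The either/or rule above can be illustrated with a small resolver. This is a sketch under assumptions, not browse's actual implementation: `resolveProxyCreds` and `ProxyCreds` are hypothetical names, and treating a single env var on its own as "env credentials present" is a guess.

```typescript
// Hypothetical sketch of the "URL or env vars, never both" policy.
interface ProxyCreds {
  username: string;
  password: string;
}

function resolveProxyCreds(
  proxyUrl: string,
  env: Record<string, string | undefined>,
): ProxyCreds | null {
  const url = new URL(proxyUrl);
  const fromUrl = url.username !== "" || url.password !== "";
  // Assumption: either env var on its own counts as an env-provided credential.
  const fromEnv =
    env.BROWSE_PROXY_USER !== undefined || env.BROWSE_PROXY_PASS !== undefined;

  if (fromUrl && fromEnv) {
    // Fail fast rather than silently preferring one source.
    throw new Error(
      "proxy credentials set in both the URL and BROWSE_PROXY_USER/BROWSE_PROXY_PASS; use only one",
    );
  }
  if (fromUrl) {
    // URL components are percent-encoded, so decode before use.
    return {
      username: decodeURIComponent(url.username),
      password: decodeURIComponent(url.password),
    };
  }
  if (fromEnv) {
    return {
      username: env.BROWSE_PROXY_USER ?? "",
      password: env.BROWSE_PROXY_PASS ?? "",
    };
  }
  return null; // unauthenticated proxy
}
```

A caller would typically pass `process.env` as the second argument; taking it as a parameter keeps the function easy to test.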
**Daemon discipline.** Browse runs as a long-lived daemon. `--proxy` and `--headed` change daemon-startup config, so they only apply on a fresh daemon. If a daemon is already running with different config, browse refuses and tells you to `browse disconnect` first. No silent restart that would drop tab state, cookies, or logged-in sessions.
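A refuse-on-mismatch check like the one described above might look roughly like this. The names `DaemonConfig` and `decideDaemonAction` are hypothetical, and the sketch assumes that any difference in `--headed` or `--proxy` counts as a mismatch; how browse treats an invocation that passes neither flag is not stated here.

```typescript
// Hypothetical sketch: compare the running daemon's startup config with the request.
interface DaemonConfig {
  headed: boolean;
  proxy: string | null;
}

type Decision =
  | { action: "start" }
  | { action: "reuse" }
  | { action: "refuse"; message: string };

function decideDaemonAction(
  running: DaemonConfig | null,
  requested: DaemonConfig,
): Decision {
  if (running === null) return { action: "start" }; // no daemon yet: fresh start
  const same =
    running.headed === requested.headed && running.proxy === requested.proxy;
  if (same) return { action: "reuse" };
  // Never restart silently: that would drop tabs, cookies, and sessions.
  return {
    action: "refuse",
    message:
      "daemon already running with a different --proxy/--headed config; run `browse disconnect` first",
  };
}
```

Returning a decision value instead of exiting inside the function leaves the exit code and message formatting to the caller.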
**Stealth.** When `--headed` or `--proxy` is set, browse masks `navigator.webdriver` (the obvious automation tell) via Chromium's `--disable-blink-features=AutomationControlled` plus a small init script. We do NOT fake `navigator.plugins`, `navigator.languages`, or `window.chrome`: modern fingerprinters check those for consistency, and synthesized fixed values can look MORE bot-like, not less.
**Container support.** `--headed` on Linux without `DISPLAY` automatically picks a free X display (`:99`, `:100`, ...) and spawns Xvfb. Cleanup on `browse disconnect` validates the recorded PID's `/proc/<pid>/cmdline` matches `Xvfb` AND start-time matches before sending any signal — no PID-reuse footguns. Standard Debian/Ubuntu containers work out of the box; minimal images (alpine, distroless) may also need fonts/dbus/gtk libs for headed Chromium to render.
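The display walk and the ownership check before signaling can be sketched as two pure functions. Everything named here is hypothetical (`pickFreeDisplay`, `isOurXvfb`, the record shapes, the `:199` upper bound), and matching on the basename of `argv[0]` is only one plausible reading of "matches `Xvfb`".

```typescript
// Hypothetical sketch: choose a free X display and validate PID ownership.
interface RecordedXvfb {
  pid: number;
  startTime: string; // start-time recorded when Xvfb was spawned
  display: string;
}

interface ObservedProcess {
  cmdline: string;   // raw /proc/<pid>/cmdline contents (NUL-separated argv)
  startTime: string; // start-time read back for the same PID
}

// Walk :first, :first+1, ... and return the first display the probe reports free.
function pickFreeDisplay(
  isFree: (display: string) => boolean,
  first = 99,
  last = 199, // assumed upper bound for the walk
): string | null {
  for (let n = first; n <= last; n++) {
    const display = `:${n}`;
    if (isFree(display)) return display;
  }
  return null;
}

// Only signal a PID that still looks like the Xvfb we started.
function isOurXvfb(recorded: RecordedXvfb, observed: ObservedProcess | null): boolean {
  if (observed === null) return false; // process is already gone
  const argv0 = observed.cmdline.split("\0")[0] ?? "";
  const binary = argv0.split("/").pop() ?? "";
  return binary === "Xvfb" && observed.startTime === recorded.startTime;
}
```

Checking both the command line and the start-time matters because a recycled PID can belong to an unrelated process, or even to a different Xvfb started by someone else.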
**Failure modes.** SOCKS5 upstream rejected or unreachable → fail-fast at startup with a redacted error after 3 retries (5s budget). Mid-stream upstream drop → browse kills the affected client connection only; no transport retries (which could corrupt browser traffic). Mismatched daemon config → exit 1 with a `browse disconnect` hint.
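The "redacted error" in the startup failure path implies that proxy credentials never reach logs or stderr. One way to do that is sketched below; `redactProxyUrl` is a hypothetical helper, not the contents of `browse/src/proxy-redact.ts`, and masking the username as well as the password is an assumption.

```typescript
// Hypothetical sketch: strip credentials from a proxy URL before printing it.
function redactProxyUrl(raw: string): string {
  try {
    const url = new URL(raw);
    if (url.username !== "") url.username = "***";
    if (url.password !== "") url.password = "***";
    return url.toString();
  } catch {
    // Never echo an unparseable value: it may still contain a secret.
    return "<unparseable proxy url>";
  }
}

// Example: build a failure message that is safe to print.
function upstreamFailureMessage(proxyUrl: string, reason: string): string {
  return `proxy upstream check failed for ${redactProxyUrl(proxyUrl)}: ${reason}`;
}
```

The host and port stay visible so the message remains useful for debugging, while the userinfo part is masked.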
## Snapshot Flags
The snapshot is your primary tool for understanding and interacting with pages.
@@ -786,7 +822,7 @@ $B prettyscreenshot --cleanup --scroll-to ".pricing" --width 1440 ~/Desktop/hero
| Command | Description |
|---------|-------------|
| `archive [path]` | Save complete page as MHTML via CDP |
-| `download <url|@ref> [path] [--base64]` | Download URL or media element to disk using browser cookies |
+| `download <url|@ref> [path] [--base64] [--navigate]` | Download URL or media element to disk using browser cookies. Use --navigate for URLs that trigger browser downloads (CDN redirects, Content-Disposition, anti-bot protected sites) |
| `scrape <images|videos|media> [--selector sel] [--dir path] [--limit N]` | Bulk download all media from page. Writes manifest.json |
### Interaction


@@ -188,6 +188,42 @@ $B resume
The browser preserves all state (cookies, localStorage, tabs) across the handoff.
After `resume`, you get a fresh snapshot of wherever the user left off.
## Headed Mode + Proxy + Anti-Bot Sites
For sites that block headless browsers, fingerprint Playwright defaults, or require routing through an authenticated SOCKS5 proxy (residential VPN, etc.), browse exposes three coordinated flags:
```bash
# Headed mode — visible Chromium window. Auto-spawns Xvfb on Linux
# containers without DISPLAY (no extra setup needed on Debian/Ubuntu).
browse --headed goto https://example.com
# SOCKS5 with auth (Chromium can't prompt for SOCKS5 creds itself —
# browse runs a local 127.0.0.1 bridge that handles the auth handshake).
browse --proxy socks5://user:pass@residential.proxy.host:1080 goto https://example.com
# HTTP/HTTPS proxy (passes through to Chromium directly):
browse --proxy http://corp-proxy:3128 goto https://example.com
# Browser-triggered file download (Content-Disposition, redirect chain,
# anti-bot CDN). --navigate uses the browser's native download handler
# instead of page.request.fetch():
browse download "https://protected.example.com/file" /tmp/file.bin --navigate
# Combined: headed + proxy + navigate-download
browse --headed --proxy socks5://user:pass@host:1080 \
download "https://protected.example.com/file" /tmp/file.bin --navigate
```
**Credential policy.** Pass creds via either the URL (`socks5://user:pass@host`) OR the env vars `BROWSE_PROXY_USER` and `BROWSE_PROXY_PASS` — never both. Browse refuses with a clear hint when both are set, because silent override creates "works on my machine" debugging traps.
**Daemon discipline.** Browse runs as a long-lived daemon. `--proxy` and `--headed` change daemon-startup config, so they only apply on a fresh daemon. If a daemon is already running with different config, browse refuses and tells you to `browse disconnect` first. No silent restart that would drop tab state, cookies, or logged-in sessions.
**Stealth.** When `--headed` or `--proxy` is set, browse masks `navigator.webdriver` (the obvious automation tell) via Chromium's `--disable-blink-features=AutomationControlled` plus a small init script. We do NOT fake `navigator.plugins`, `navigator.languages`, or `window.chrome`: modern fingerprinters check those for consistency, and synthesized fixed values can look MORE bot-like, not less.
**Container support.** `--headed` on Linux without `DISPLAY` automatically picks a free X display (`:99`, `:100`, ...) and spawns Xvfb. Cleanup on `browse disconnect` validates the recorded PID's `/proc/<pid>/cmdline` matches `Xvfb` AND start-time matches before sending any signal — no PID-reuse footguns. Standard Debian/Ubuntu containers work out of the box; minimal images (alpine, distroless) may also need fonts/dbus/gtk libs for headed Chromium to render.
**Failure modes.** SOCKS5 upstream rejected or unreachable → fail-fast at startup with a redacted error after 3 retries (5s budget). Mid-stream upstream drop → browse kills the affected client connection only; no transport retries (which could corrupt browser traffic). Mismatched daemon config → exit 1 with a `browse disconnect` hint.
## Snapshot Flags
{{SNAPSHOT_FLAGS}}