gstack/browse/src/cli.ts
Garry Tan a1a933614c feat: sidebar CSS inspector + per-tab agents (v0.13.9.0) (#650)
* feat: CDP inspector module — persistent sessions, CSS cascade, style modification

New browse/src/cdp-inspector.ts with full CDP inspection engine:
- inspectElement() via CSS.getMatchedStylesForNode + DOM.getBoxModel
- modifyStyle() via CSS.setStyleTexts with headless page.evaluate fallback
- Persistent CDP session lifecycle (create, reuse, detach on nav, re-create)
- Specificity sorting, overridden property detection, UA rule filtering
- Modification history with undo support
- formatInspectorResult() for CLI output

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: browse server inspector endpoints + inspect/style/cleanup/prettyscreenshot CLI

Server endpoints: POST /inspector/pick, GET /inspector, POST /inspector/apply,
POST /inspector/reset, GET /inspector/history, GET /inspector/events (SSE).
CLI commands: inspect (CDP cascade), style (live CSS mod), cleanup (page clutter
removal), prettyscreenshot (clean screenshot pipeline).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar CSS inspector — element picker, box model, rule cascade, quick edit

Extension changes for the visual CSS inspector:
- inspector.js: element picker with hover highlight, CSS selector generation,
  basic mode fallback (getComputedStyle + CSSOM), page alteration handlers
- inspector.css: picker overlay styles (blue highlight + tooltip)
- background.js: inspector message routing (picker <-> server <-> sidepanel)
- sidepanel: Inspector tab with box model viz (gstack palette), matched rules
  with specificity badges, computed styles, click-to-edit quick edit,
  Send to Agent/Code button, empty/loading/error states

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: document inspect, style, cleanup, prettyscreenshot browse commands

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: auto-track user-created tabs and handle tab close

browser-manager.ts changes:
- context.on('page') listener: automatically tracks tabs opened by the user
  (Cmd+T, right-click open in new tab, window.open). Previously only
  programmatic newTab() was tracked, so user tabs were invisible.
- page.on('close') handler in wirePageEvents: removes closed tabs from the
  pages map and switches activeTabId to the last remaining tab.
- syncActiveTabByUrl: match Chrome extension's active tab URL to the correct
  Playwright page for accurate tab identity.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: per-tab agent isolation via BROWSE_TAB environment variable

Prevents parallel sidebar agents from interfering with each other's tab context.

Three-layer fix:
- sidebar-agent.ts: passes BROWSE_TAB=<tabId> env var to each claude process,
  per-tab processing set allows concurrent agents across tabs
- cli.ts: reads process.env.BROWSE_TAB and includes tabId in command request body
- server.ts: handleCommand() temporarily switches activeTabId when tabId is present,
  restores after command completes (safe: Bun event loop is single-threaded)

Also: per-tab agent state (TabAgentState map), per-tab message queuing,
per-tab chat buffers, verbose streaming narration, stop button endpoint.
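
A minimal sketch of the pinning flow described above, under stated assumptions: `buildCommandBody`, `withPinnedTab`, and the in-memory `server` object are hypothetical stand-ins, and the command runs synchronously. It is not the actual cli.ts or server.ts code.

```typescript
// Hypothetical, simplified model of the BROWSE_TAB flow.
interface CommandBody {
  command: string;
  args: string[];
  tabId?: number;
}

// CLI layer: include tabId only when BROWSE_TAB holds a valid integer.
function buildCommandBody(
  command: string,
  args: string[],
  env: Record<string, string | undefined>,
): CommandBody {
  const raw = env.BROWSE_TAB;
  const tabId = raw ? Number.parseInt(raw, 10) : NaN;
  return Number.isInteger(tabId) ? { command, args, tabId } : { command, args };
}

// Server layer (assumed shape): the currently active tab.
const server = { activeTabId: 1 };

// Switch the active tab for one command, then restore the previous one.
function withPinnedTab<T>(tabId: number | undefined, run: () => T): T {
  const saved = server.activeTabId;
  if (tabId !== undefined) server.activeTabId = tabId;
  try {
    return run();
  } finally {
    server.activeTabId = saved; // restore even if the command throws
  }
}
```

Omitting `tabId` when the variable is unset leaves the server on its normal active tab, which matches the note that pinning only activates when a tab ID is provided.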

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: sidebar per-tab chat context, tab bar sync, stop button, UX polish

Extension changes:
- sidepanel.js: per-tab chat history (tabChatHistories map), switchChatTab()
  swaps entire chat view, browserTabActivated handler for instant tab sync,
  stop button wired to /sidebar-agent/stop, pollTabs renders tab bar
- sidepanel.html: updated banner text ("Browser co-pilot"), stop button markup,
  input placeholder "Ask about this page..."
- sidepanel.css: tab bar styles, stop button styles, loading state fixes
- background.js: chrome.tabs.onActivated sends browserTabActivated to sidepanel
  with tab URL for instant tab switch detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: per-tab isolation, BROWSE_TAB pinning, tab tracking, sidebar UX

sidebar-agent.test.ts (new tests):
- BROWSE_TAB env var passed to claude process
- CLI reads BROWSE_TAB and sends tabId in body
- handleCommand accepts tabId, saves/restores activeTabId
- Tab pinning only activates when tabId provided
- Per-tab agent state, queue, concurrency
- processingTabs set for parallel agents

sidebar-ux.test.ts (new tests):
- context.on('page') tracks user-created tabs
- page.on('close') removes tabs from pages map
- Tab isolation uses BROWSE_TAB not system prompt hack
- Per-tab chat context in sidepanel
- Tab bar rendering, stop button, banner text

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: resolve merge conflicts — keep security defenses + per-tab isolation

Merged main's security improvements (XML escaping, prompt injection defense,
allowed commands whitelist, --model opus, Write tool, stderr capture) with
our branch's per-tab isolation (BROWSE_TAB env var, processingTabs set,
no --resume). Updated test expectations for expanded system prompt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v0.13.9.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add inspector message types to background.js allowlist

Pre-existing bug found by Codex: ALLOWED_TYPES in background.js was missing
all inspector message types (startInspector, stopInspector, elementPicked,
pickerCancelled, applyStyle, toggleClass, injectCSS, resetAll, inspectResult).
Messages were silently rejected, making the inspector broken on ALL pages.
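
A small sketch of what such an allowlist check could look like. The type names are the ones listed above; `isAllowedMessage` is a hypothetical helper, and the real `ALLOWED_TYPES` in background.js presumably contains other, non-inspector types as well.

```typescript
// Inspector message types named in this commit message.
const INSPECTOR_TYPES = [
  'startInspector', 'stopInspector', 'elementPicked', 'pickerCancelled',
  'applyStyle', 'toggleClass', 'injectCSS', 'resetAll', 'inspectResult',
] as const;

const ALLOWED_TYPES = new Set<string>(INSPECTOR_TYPES);

// Hypothetical guard: returns false for anything not on the allowlist,
// including messages with a missing or non-string type.
function isAllowedMessage(msg: { type?: unknown }): boolean {
  return typeof msg.type === 'string' && ALLOWED_TYPES.has(msg.type);
}
```

With a check of this shape, a type missing from the set is dropped without any error, which is consistent with the silent failure described above.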

Also: separate executeScript and insertCSS into individual try blocks in
injectInspector(), store inspectorMode for routing, and add content.js
fallback when script injection fails (CSP, chrome:// pages).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: basic element picker in content.js for CSP-restricted pages

When inspector.js can't be injected (CSP, chrome:// pages), content.js
provides a basic picker using getComputedStyle + CSSOM:
- startBasicPicker/stopBasicPicker message handlers
- captureBasicData() with ~30 key CSS properties, box model, matched rules
- Hover highlight with outline save/restore (never leaves artifacts)
- Click uses e.target directly (no re-querying by selector)
- Sends inspectResult with mode:'basic' for sidebar rendering
- Escape key cancels picker and restores outlines

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cleanup + screenshot buttons in sidebar inspector toolbar

Two action buttons in the inspector toolbar:
- Cleanup (🧹): POSTs cleanup --all to server, shows spinner, chat
  notification on success, resets inspector state (element may be removed)
- Screenshot (📸): POSTs screenshot to server, shows spinner, chat
  notification with saved file path

Shared infrastructure:
- .inspector-action-btn CSS with loading spinner via ::after pseudo-element
- chat-notification type in addChatEntry() for system messages
- package.json version bump to 0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: inspector allowlist, CSP fallback, cleanup/screenshot buttons

16 new tests in sidebar-ux.test.ts:
- Inspector message allowlist includes all inspector types
- content.js basic picker (startBasicPicker, captureBasicData, CSSOM,
  outline save/restore, inspectResult with mode basic, Escape cleanup)
- background.js CSP fallback (separate try blocks, inspectorMode, fallback)
- Cleanup button (POST /command, inspector reset after success)
- Screenshot button (POST /command, notification rendering)
- Chat notification type and CSS styles

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* docs: update project documentation for v0.13.9.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: cleanup + screenshot buttons in chat toolbar (not just inspector)

Quick actions toolbar (🧹 Cleanup, 📸 Screenshot) now appears above the chat
input, always visible. Both inspector and chat buttons share runCleanup() and
runScreenshot() helper functions. Clicking either set shows loading state on
both simultaneously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: chat toolbar buttons, shared helpers, quick-action-btn styles

Tests that chat toolbar exists (chat-cleanup-btn, chat-screenshot-btn,
quick-actions container), CSS styles (.quick-action-btn, .quick-action-btn.loading),
shared runCleanup/runScreenshot helper functions, and cleanup inspector reset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: aggressive cleanup heuristics — overlays, scroll unlock, blur removal

Massively expanded CLEANUP_SELECTORS with patterns from uBlock Origin and
Readability.js research:
- ads: 30+ selectors (Google, Amazon, Outbrain, Taboola, Criteo, etc.)
- cookies: OneTrust, Cookiebot, TrustArc, Quantcast + generic patterns
- overlays (NEW): paywalls, newsletter popups, interstitials, push prompts,
  app download banners, survey modals
- social: follow prompts, share tools
- Cleanup now defaults to --all when no args (sidebar button fix)
- Uses !important on all display:none (overrides inline styles)
- Unlocks body/html scroll (overflow:hidden from modal lockout)
- Removes blur/filter effects (paywall content blur)
- Removes max-height truncation (article teaser truncation)
- Collapses empty ad placeholder whitespace (empty divs after ad removal)
- Skips gstack-ctrl indicator in sticky removal

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: disable action buttons when disconnected, no error spam

- setActionButtonsEnabled() toggles .disabled class on all cleanup/screenshot
  buttons (both chat toolbar and inspector toolbar)
- Called with false in updateConnection when server URL is null
- Called with true when connection established
- runCleanup/runScreenshot silently return when disconnected instead of
  showing 'Not connected' error notifications
- CSS .disabled style: pointer-events:none, opacity:0.3, cursor:not-allowed

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: cleanup heuristics, button disabled state, overlay selectors

17 new tests:
- cleanup defaults to --all on empty args
- CLEANUP_SELECTORS overlays category (paywall, newsletter, interstitial)
- Major ad networks in selectors (doubleclick, taboola, criteo, etc.)
- Major consent frameworks (OneTrust, Cookiebot, TrustArc, Quantcast)
- !important override for inline styles
- Scroll unlock (body overflow:hidden)
- Blur removal (paywall content blur)
- Article truncation removal (max-height)
- Empty placeholder collapse
- gstack-ctrl indicator skip in sticky cleanup
- setActionButtonsEnabled function
- Buttons disabled when disconnected
- No error spam from cleanup/screenshot when disconnected
- CSS disabled styles for action buttons

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: LLM-based page cleanup — agent analyzes page semantically

Instead of brittle CSS selectors, the cleanup button now sends a prompt to
the sidebar agent (which IS an LLM). The agent:
1. Runs deterministic $B cleanup --all as a quick first pass
2. Takes a snapshot to see what's left
3. Analyzes the page semantically to identify remaining clutter
4. Removes elements intelligently, preserving site branding

This means cleanup works correctly on any site without site-specific selectors.
The LLM understands that "Your Daily Puzzles" is clutter, "ADVERTISEMENT" is
junk, but the SF Chronicle masthead should stay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: aggressive cleanup heuristics + preserve top nav bar

Deterministic cleanup improvements (used as first pass before LLM analysis):
- New 'clutter' category: audio players, podcast widgets, sidebar puzzles/games,
  recirculation widgets (taboola, outbrain, nativo), cross-promotion banners
- Text-content detection: removes "ADVERTISEMENT", "Article continues below",
  "Sponsored", "Paid content" labels and their parent wrappers
- Sticky fix: preserves the topmost full-width element near viewport top (site
  nav bar) instead of hiding all sticky/fixed elements. Sorts by vertical
  position, preserves the first one that spans >80% viewport width.
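
The nav bar rule can be sketched roughly as follows. `StickyBox`, `pickNavBar`, and the `maxTop` threshold for "near viewport top" are assumptions for illustration; only the sort-by-position and >80% width rule come from the description above.

```typescript
// Hypothetical geometry record for one sticky/fixed element.
interface StickyBox {
  id: string;
  top: number;   // distance from viewport top, in px
  width: number; // rendered width, in px
}

// Returns the id of the element to preserve, or null if none qualifies.
function pickNavBar(
  boxes: StickyBox[],
  viewportWidth: number,
  maxTop = 50, // assumed cutoff for "near viewport top"
): string | null {
  // Sort a copy by vertical position so the topmost candidate comes first.
  const sorted = [...boxes].sort((a, b) => a.top - b.top);
  const nav = sorted.find(
    (b) => b.top <= maxTop && b.width > viewportWidth * 0.8,
  );
  return nav ? nav.id : null;
}
```

Every other sticky or fixed element would still be hidden; only the returned element is skipped.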

Tests: clutter category, ad label removal, nav bar preservation logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: LLM-based cleanup architecture, deterministic heuristics, sticky nav

22 new tests covering:
- Cleanup button uses /sidebar-command (agent) not /command (deterministic)
- Cleanup prompt includes deterministic first pass + agent snapshot analysis
- Cleanup prompt lists specific clutter categories for agent guidance
- Cleanup prompt preserves site identity (masthead, headline, body, byline)
- Cleanup prompt instructs scroll unlock and $B eval removal
- Loading state management (async agent, setTimeout)
- Deterministic clutter: audio/podcast, games/puzzles, recirculation
- Ad label text patterns (ADVERTISEMENT, Sponsored, Article continues)
- Ad label parent wrapper hiding for small containers
- Sticky nav preservation (sort by position, first full-width near top)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: prevent repeat chat message rendering on reconnect/replay

Root cause: server persists chat to disk (chat.jsonl) and replays on restart.
Client had no dedup, so every reconnect re-rendered the entire history.
Messages from an old HN session would repeat endlessly on the SF Chronicle tab.

Fix: renderedEntryIds Set tracks which entry IDs have been rendered. addChatEntry
skips entries already in the set. Entries without an id (local notifications)
bypass the check. Clear chat resets the set.
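
A minimal sketch of the dedup logic, assuming a simplified `ChatEntry` shape and an array standing in for the DOM. The names `renderedEntryIds` and `addChatEntry` come from the description above; `clearChat` and the `rendered` array are hypothetical.

```typescript
interface ChatEntry {
  id?: number; // absent for local notifications
  text: string;
}

const renderedEntryIds = new Set<number>();
const rendered: string[] = []; // stand-in for the rendered chat view

// Returns true if the entry was rendered, false if it was a replayed duplicate.
function addChatEntry(entry: ChatEntry): boolean {
  if (entry.id !== undefined) {
    if (renderedEntryIds.has(entry.id)) return false;
    renderedEntryIds.add(entry.id);
  }
  rendered.push(entry.text);
  return true;
}

// Clearing the chat also forgets which ids were rendered.
function clearChat(): void {
  rendered.length = 0;
  renderedEntryIds.clear();
}
```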

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: agent stops when done, no focus stealing, opus for prompt injection safety

Three fixes for sidebar agent UX:
- System prompt: "Be CONCISE. STOP as soon as the task is done. Do NOT keep
  exploring or doing bonus work." Prevents agent from endlessly taking
  screenshots and highlighting elements after answering the question.
- switchTab(id, opts): new bringToFront option. Internal tab pinning
  (BROWSE_TAB) uses bringToFront: false so agent commands never steal
  window focus from the user's active app.
- Keep opus model (not sonnet) for prompt injection resistance on untrusted
  web pages. Remove Write from allowedTools (agent only needs Bash for $B).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: agent conciseness, focus stealing, opus model, switchTab opts

Tests for the three UX fixes:
- System prompt contains STOP/CONCISE/Do NOT keep exploring
- sidebar agent uses opus (not sonnet) for prompt injection resistance
- switchTab has bringToFront option, defaults to true (opt-out)
- handleCommand tab pinning uses bringToFront: false (no focus steal)
- Updated stale tests: switchTab signature, allowedTools excludes Write,
  narration -> conciseness, tab pinning restore calls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test: sidebar CSS interaction E2E — HN comment highlight round-trip

New E2E test (periodic tier, ~$2/run) that exercises the full sidebar
agent pipeline with CSS interaction:
1. Agent navigates to Hacker News
2. Clicks into the top story's comments
3. Reads comments and identifies the most insightful one
4. Highlights it with a 4px solid orange outline via style injection

Tests: navigation, snapshot, text reading, LLM judgment, CSS modification.
Requires real browser + real Claude (ANTHROPIC_API_KEY).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: sidebar CSS E2E test — correct idle timeout (ms not s), pipe stdio

Root cause of test failure: BROWSE_IDLE_TIMEOUT is in milliseconds, not
seconds. '600' = 0.6 seconds, server died immediately after health check.
Fixed to '600000' (10 minutes).
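
One way to avoid this class of mistake is to derive the value from minutes instead of writing the literal. This is only a sketch: `idleTimeoutEnv` is a hypothetical helper, not part of the test suite.

```typescript
// BROWSE_IDLE_TIMEOUT is read as milliseconds, so convert explicitly.
const minutesToMs = (minutes: number): number => minutes * 60 * 1000;

// Hypothetical helper that builds the env entry for a spawned server.
function idleTimeoutEnv(minutes: number): Record<string, string> {
  return { BROWSE_IDLE_TIMEOUT: String(minutesToMs(minutes)) };
}
```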

Also: use 'pipe' stdio instead of file descriptors (closing fds kills child
on macOS/bun), catch ConnectionRefused on poll retry, 4 min poll timeout
for the multi-step opus task.

Test passes: agent navigates to HN, reads comments, identifies most
insightful one, highlights it with orange CSS, stops. 114s, $0.00.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 12:51:05 -06:00


/**
 * gstack CLI: thin wrapper that talks to the persistent server
 *
 * Flow:
 *   1. Read .gstack/browse.json for port + token
 *   2. If missing or stale PID → start server in background
 *   3. Health check + version mismatch detection
 *   4. Send command via HTTP POST
 *   5. Print response to stdout (or stderr for errors)
 */

import * as fs from 'fs';
import * as path from 'path';
import { resolveConfig, ensureStateDir, readVersionHash } from './config';

const config = resolveConfig();
const IS_WINDOWS = process.platform === 'win32';
const MAX_START_WAIT = IS_WINDOWS ? 15000 : (process.env.CI ? 30000 : 8000); // Node+Chromium takes longer on Windows

export function resolveServerScript(
  env: Record<string, string | undefined> = process.env,
  metaDir: string = import.meta.dir,
  execPath: string = process.execPath
): string {
  if (env.BROWSE_SERVER_SCRIPT) {
    return env.BROWSE_SERVER_SCRIPT;
  }

  // Dev mode: cli.ts runs directly from browse/src
  // On macOS/Linux, import.meta.dir starts with /
  // On Windows, it starts with a drive letter (e.g., C:\...)
  if (!metaDir.includes('$bunfs')) {
    const direct = path.resolve(metaDir, 'server.ts');
    if (fs.existsSync(direct)) {
      return direct;
    }
  }

  // Compiled binary: derive the source tree from browse/dist/browse
  if (execPath) {
    const adjacent = path.resolve(path.dirname(execPath), '..', 'src', 'server.ts');
    if (fs.existsSync(adjacent)) {
      return adjacent;
    }
  }

  throw new Error(
    'Cannot find server.ts. Set BROWSE_SERVER_SCRIPT env or run from the browse source tree.'
  );
}

const SERVER_SCRIPT = resolveServerScript();

/**
 * On Windows, resolve the Node.js-compatible server bundle.
 * Falls back to null if not found (server will use Bun instead).
 */
export function resolveNodeServerScript(
  metaDir: string = import.meta.dir,
  execPath: string = process.execPath
): string | null {
  // Dev mode
  if (!metaDir.includes('$bunfs')) {
    const distScript = path.resolve(metaDir, '..', 'dist', 'server-node.mjs');
    if (fs.existsSync(distScript)) return distScript;
  }

  // Compiled binary: browse/dist/browse → browse/dist/server-node.mjs
  if (execPath) {
    const adjacent = path.resolve(path.dirname(execPath), 'server-node.mjs');
    if (fs.existsSync(adjacent)) return adjacent;
  }

  return null;
}

const NODE_SERVER_SCRIPT = IS_WINDOWS ? resolveNodeServerScript() : null;

// On Windows, hard-fail if server-node.mjs is missing; the Bun path is known broken.
if (IS_WINDOWS && !NODE_SERVER_SCRIPT) {
  throw new Error(
    'server-node.mjs not found. Run `bun run build` to generate the Windows server bundle.'
  );
}

interface ServerState {
  pid: number;
  port: number;
  token: string;
  startedAt: string;
  serverPath: string;
  binaryVersion?: string;
  mode?: 'launched' | 'headed';
}

// ─── State File ────────────────────────────────────────────────

function readState(): ServerState | null {
  try {
    const data = fs.readFileSync(config.stateFile, 'utf-8');
    return JSON.parse(data);
  } catch {
    return null;
  }
}

function isProcessAlive(pid: number): boolean {
  if (IS_WINDOWS) {
    // Bun's compiled binary can't signal Windows PIDs (always throws ESRCH).
    // Use tasklist as a fallback. Only for one-shot calls; too slow for polling loops.
    try {
      const result = Bun.spawnSync(
        ['tasklist', '/FI', `PID eq ${pid}`, '/NH', '/FO', 'CSV'],
        { stdout: 'pipe', stderr: 'pipe', timeout: 3000 }
      );
      return result.stdout.toString().includes(`"${pid}"`);
    } catch {
      return false;
    }
  }

  try {
    process.kill(pid, 0);
    return true;
  } catch {
    return false;
  }
}

/**
 * HTTP health check: definitive proof the server is alive and responsive.
 * Used in all polling loops instead of isProcessAlive() (which is slow on Windows).
 */
export async function isServerHealthy(port: number): Promise<boolean> {
  try {
    const resp = await fetch(`http://127.0.0.1:${port}/health`, {
      signal: AbortSignal.timeout(2000),
    });
    if (!resp.ok) return false;
    const health = await resp.json() as any;
    return health.status === 'healthy';
  } catch {
    return false;
  }
}

// ─── Process Management ─────────────────────────────────────────

async function killServer(pid: number): Promise<void> {
  if (!isProcessAlive(pid)) return;

  if (IS_WINDOWS) {
    // taskkill /T /F kills the process tree (Node + Chromium)
    try {
      Bun.spawnSync(
        ['taskkill', '/PID', String(pid), '/T', '/F'],
        { stdout: 'pipe', stderr: 'pipe', timeout: 5000 }
      );
    } catch {}
    const deadline = Date.now() + 2000;
    while (Date.now() < deadline && isProcessAlive(pid)) {
      await Bun.sleep(100);
    }
    return;
  }

  try { process.kill(pid, 'SIGTERM'); } catch { return; }

  // Wait up to 2s for graceful shutdown
  const deadline = Date.now() + 2000;
  while (Date.now() < deadline && isProcessAlive(pid)) {
    await Bun.sleep(100);
  }

  // Force kill if still alive
  if (isProcessAlive(pid)) {
    try { process.kill(pid, 'SIGKILL'); } catch {}
  }
}

/**
 * Clean up legacy /tmp/browse-server*.json files from before project-local state.
 * Verifies PID ownership before sending signals.
 */
function cleanupLegacyState(): void {
  // No legacy state on Windows: /tmp and `ps` don't exist, and gstack
  // never ran on Windows before the Node.js fallback was added.
  if (IS_WINDOWS) return;

  try {
    const files = fs.readdirSync('/tmp').filter(f => f.startsWith('browse-server') && f.endsWith('.json'));
    for (const file of files) {
      const fullPath = `/tmp/${file}`;
      try {
        const data = JSON.parse(fs.readFileSync(fullPath, 'utf-8'));
        if (data.pid && isProcessAlive(data.pid)) {
          // Verify this is actually a browse server before killing
          const check = Bun.spawnSync(['ps', '-p', String(data.pid), '-o', 'command='], {
            stdout: 'pipe', stderr: 'pipe', timeout: 2000,
          });
          const cmd = check.stdout.toString().trim();
          if (cmd.includes('bun') || cmd.includes('server.ts')) {
            try { process.kill(data.pid, 'SIGTERM'); } catch {}
          }
        }
        fs.unlinkSync(fullPath);
      } catch {
        // Best effort: skip files we can't parse or clean up
      }
    }

    // Clean up legacy log files too
    const logFiles = fs.readdirSync('/tmp').filter(f =>
      f.startsWith('browse-console') || f.startsWith('browse-network') || f.startsWith('browse-dialog')
    );
    for (const file of logFiles) {
      try { fs.unlinkSync(`/tmp/${file}`); } catch {}
    }
  } catch {
    // /tmp read failed: skip legacy cleanup
  }
}

// ─── Server Lifecycle ──────────────────────────────────────────

async function startServer(extraEnv?: Record<string, string>): Promise<ServerState> {
  ensureStateDir(config);

  // Clean up stale state file and error log
  try { fs.unlinkSync(config.stateFile); } catch {}
  try { fs.unlinkSync(path.join(config.stateDir, 'browse-startup-error.log')); } catch {}

  let proc: any = null;

  if (IS_WINDOWS && NODE_SERVER_SCRIPT) {
    // Windows: Bun.spawn() + proc.unref() doesn't truly detach on Windows:
    // when the CLI exits, the server dies with it. Use Node's child_process.spawn
    // with { detached: true } instead, which is the gold standard for Windows
    // process independence. Credit: PR #191 by @fqueiro.
    //
    // extraEnv is merged in here as well so that callers' overrides reach the
    // server on Windows, matching the macOS/Linux branch below.
    const launcherEnv = JSON.stringify({ BROWSE_STATE_FILE: config.stateFile, ...extraEnv });
    const launcherCode =
      `const{spawn}=require('child_process');` +
      `spawn(process.execPath,[${JSON.stringify(NODE_SERVER_SCRIPT)}],` +
      `{detached:true,stdio:['ignore','ignore','ignore'],env:Object.assign({},process.env,` +
      `${launcherEnv})}).unref()`;
    Bun.spawnSync(['node', '-e', launcherCode], { stdio: ['ignore', 'ignore', 'ignore'] });
  } else {
    // macOS/Linux: Bun.spawn + unref works correctly
    proc = Bun.spawn(['bun', 'run', SERVER_SCRIPT], {
      stdio: ['ignore', 'pipe', 'pipe'],
      env: { ...process.env, BROWSE_STATE_FILE: config.stateFile, ...extraEnv },
    });
    proc.unref();
  }

  // Wait for server to become healthy.
  // Use HTTP health check (not isProcessAlive); it's fast (~instant ECONNREFUSED)
  // and works reliably on all platforms including Windows.
  const start = Date.now();
  while (Date.now() - start < MAX_START_WAIT) {
    const state = readState();
    if (state && await isServerHealthy(state.port)) {
      return state;
    }
    await Bun.sleep(100);
  }

  // Server didn't start in time; try to get error details
  if (proc?.stderr) {
    // macOS/Linux: read stderr from the spawned process
    const reader = proc.stderr.getReader();
    const { value } = await reader.read();
    if (value) {
      const errText = new TextDecoder().decode(value);
      throw new Error(`Server failed to start:\n${errText}`);
    }
  } else {
    // Windows: check startup error log (server writes errors to disk since
    // stderr is unavailable due to stdio: 'ignore' for detachment)
    const errorLogPath = path.join(config.stateDir, 'browse-startup-error.log');
    try {
      const errorLog = fs.readFileSync(errorLogPath, 'utf-8').trim();
      if (errorLog) {
        throw new Error(`Server failed to start:\n${errorLog}`);
      }
    } catch (e: any) {
      // A missing log file is expected; anything else (including the error
      // thrown just above) propagates to the caller.
      if (e.code !== 'ENOENT') throw e;
    }
  }

  throw new Error(`Server failed to start within ${MAX_START_WAIT / 1000}s`);
}

/**
 * Acquire an exclusive lockfile to prevent concurrent ensureServer() races (TOCTOU).
 * Returns a cleanup function that releases the lock.
 */
function acquireServerLock(): (() => void) | null {
  const lockPath = `${config.stateFile}.lock`;
  try {
    // 'wx': create exclusively, fails if file already exists (atomic check-and-create)
    // Using string flag instead of numeric constants for Bun Windows compatibility
    const fd = fs.openSync(lockPath, 'wx');
    fs.writeSync(fd, `${process.pid}\n`);
    fs.closeSync(fd);
    return () => { try { fs.unlinkSync(lockPath); } catch {} };
  } catch {
    // Lock already held: check if the holder is still alive
    try {
      const holderPid = parseInt(fs.readFileSync(lockPath, 'utf8').trim(), 10);
      if (holderPid && isProcessAlive(holderPid)) {
        return null; // Another live process holds the lock
      }
      // Stale lock: remove and retry
      fs.unlinkSync(lockPath);
      return acquireServerLock();
    } catch {
      return null;
    }
  }
}

async function ensureServer(): Promise<ServerState> {
  const state = readState();

  // Health-check-first: HTTP is definitive proof the server is alive and responsive.
  // This replaces the PID-gated approach which breaks on Windows (Bun's process.kill
  // always throws ESRCH for Windows PIDs in compiled binaries).
  if (state && await isServerHealthy(state.port)) {
    // Check for binary version mismatch (auto-restart on update)
    const currentVersion = readVersionHash();
    if (currentVersion && state.binaryVersion && currentVersion !== state.binaryVersion) {
      console.error('[browse] Binary updated, restarting server...');
      await killServer(state.pid);
      return startServer();
    }
    return state;
  }

  // Guard: never silently replace a headed server with a headless one.
  // Headed mode means a user-visible Chrome window is (or was) controlled.
  // Silently replacing it would be confusing; tell the user to reconnect.
  if (state && state.mode === 'headed' && isProcessAlive(state.pid)) {
    console.error(`[browse] Headed server running (PID ${state.pid}) but not responding.`);
    console.error(`[browse] Run '$B connect' to restart.`);
    process.exit(1);
  }

  // Ensure state directory exists before lock acquisition (lock file lives there)
  ensureStateDir(config);

  // Acquire lock to prevent concurrent restart races (TOCTOU)
  const releaseLock = acquireServerLock();
  if (!releaseLock) {
    // Another process is starting the server: wait for it
    console.error('[browse] Another instance is starting the server, waiting...');
    const start = Date.now();
    while (Date.now() - start < MAX_START_WAIT) {
      const freshState = readState();
      if (freshState && await isServerHealthy(freshState.port)) return freshState;
      await Bun.sleep(200);
    }
    throw new Error('Timed out waiting for another instance to start the server');
  }

  try {
    // Re-read state under lock in case another process just started the server
    const freshState = readState();
    if (freshState && await isServerHealthy(freshState.port)) {
      return freshState;
    }

    // Kill the old server to avoid orphaned chromium processes
    if (state && state.pid) {
      await killServer(state.pid);
    }

    console.error('[browse] Starting server...');
    return await startServer();
  } finally {
    releaseLock();
  }
}

// ─── Command Dispatch ──────────────────────────────────────────

async function sendCommand(state: ServerState, command: string, args: string[], retries = 0): Promise<void> {
  // BROWSE_TAB env var pins commands to a specific tab (set by sidebar-agent per-tab)
  const browseTab = process.env.BROWSE_TAB;
  const body = JSON.stringify({ command, args, ...(browseTab ? { tabId: parseInt(browseTab, 10) } : {}) });

  try {
    const resp = await fetch(`http://127.0.0.1:${state.port}/command`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${state.token}`,
      },
      body,
      signal: AbortSignal.timeout(30000),
    });

    if (resp.status === 401) {
      // Token mismatch: server may have restarted
      console.error('[browse] Auth failed — server may have restarted. Retrying...');
      const newState = readState();
      if (newState && newState.token !== state.token) {
        return sendCommand(newState, command, args);
      }
      throw new Error('Authentication failed');
    }

    const text = await resp.text();

    if (resp.ok) {
      process.stdout.write(text);
      if (!text.endsWith('\n')) process.stdout.write('\n');
    } else {
      // Try to parse as JSON error
      try {
        const err = JSON.parse(text);
        console.error(err.error || text);
        if (err.hint) console.error(err.hint);
      } catch {
        console.error(text);
      }
      process.exit(1);
    }
  } catch (err: any) {
    // AbortSignal.timeout() aborts with a DOMException named 'TimeoutError',
    // not 'AbortError', so check for both names.
    if (err.name === 'AbortError' || err.name === 'TimeoutError') {
      console.error('[browse] Command timed out after 30s');
      process.exit(1);
    }
    // Connection error: server may have crashed
    if (err.code === 'ECONNREFUSED' || err.code === 'ECONNRESET' || err.message?.includes('fetch failed')) {
      if (retries >= 1) throw new Error('[browse] Server crashed twice in a row — aborting');
      console.error('[browse] Server connection lost. Restarting...');
      // Kill the old server to avoid orphaned chromium processes
      const oldState = readState();
      if (oldState && oldState.pid) {
        await killServer(oldState.pid);
      }
      const newState = await startServer();
      return sendCommand(newState, command, args, retries + 1);
    }
    throw err;
  }
}

// ─── Main ──────────────────────────────────────────────────────

async function main() {
  const args = process.argv.slice(2);

  if (args.length === 0 || args[0] === '--help' || args[0] === '-h') {
    console.log(`gstack browse — Fast headless browser for AI coding agents
Usage: browse <command> [args...]
Navigation: goto <url> | back | forward | reload | url
Content: text | html [sel] | links | forms | accessibility
Interaction: click <sel> | fill <sel> <val> | select <sel> <val>
hover <sel> | type <text> | press <key>
scroll [sel] | wait <sel|--networkidle|--load> | viewport <WxH>
upload <sel> <file1> [file2...]
cookie-import <json-file>
cookie-import-browser [browser] [--domain <d>]
Inspection: js <expr> | eval <file> | css <sel> <prop> | attrs <sel>
console [--clear|--errors] | network [--clear] | dialog [--clear]
cookies | storage [set <k> <v>] | perf
is <prop> <sel> (visible|hidden|enabled|disabled|checked|editable|focused)
Visual: screenshot [--viewport] [--clip x,y,w,h] [@ref|sel] [path]
pdf [path] | responsive [prefix]
Snapshot: snapshot [-i] [-c] [-d N] [-s sel] [-D] [-a] [-o path] [-C]
-D/--diff: diff against previous snapshot
-a/--annotate: annotated screenshot with ref labels
-C/--cursor-interactive: find non-ARIA clickable elements
Compare: diff <url1> <url2>
Multi-step: chain (reads JSON from stdin)
Tabs: tabs | tab <id> | newtab [url] | closetab [id]
Server: status | cookie <n>=<v> | header <n>:<v>
useragent <str> | stop | restart
Dialogs: dialog-accept [text] | dialog-dismiss
Refs: After 'snapshot', use @e1, @e2... as selectors:
click @e3 | fill @e4 "value" | hover @e1
@c refs from -C: click @c1`);
    process.exit(0);
  }

  // One-time cleanup of legacy /tmp state files
  cleanupLegacyState();

  const command = args[0];
  const commandArgs = args.slice(1);

  // ─── Headed Connect (pre-server command) ────────────────────
  // connect must be handled BEFORE ensureServer() because it needs
  // to restart the server in headed mode with the Chrome extension.
  if (command === 'connect') {
    // Check if already in headed mode and healthy
    const existingState = readState();
    if (existingState && existingState.mode === 'headed' && isProcessAlive(existingState.pid)) {
      try {
        const resp = await fetch(`http://127.0.0.1:${existingState.port}/health`, {
          signal: AbortSignal.timeout(2000),
        });
        if (resp.ok) {
          console.log('Already connected in headed mode.');
          process.exit(0);
        }
        // Non-OK health response: fall through to kill and restart
      } catch {
        // Headed server alive but not responding — kill and restart
      }
    }

    // Kill ANY existing server (SIGTERM → wait 2s → SIGKILL)
    if (existingState && isProcessAlive(existingState.pid)) {
      try { process.kill(existingState.pid, 'SIGTERM'); } catch {}
      await new Promise(resolve => setTimeout(resolve, 2000));
      if (isProcessAlive(existingState.pid)) {
        try { process.kill(existingState.pid, 'SIGKILL'); } catch {}
        await new Promise(resolve => setTimeout(resolve, 1000));
      }
    }

    // Kill orphaned Chromium processes that may still hold the profile lock.
    // The server PID is the Bun process; Chromium is a child that can outlive it
    // if the server is killed abruptly (SIGKILL, crash, manual rm of state file).
    const profileDir = path.join(process.env.HOME || '/tmp', '.gstack', 'chromium-profile');
    try {
      const singletonLock = path.join(profileDir, 'SingletonLock');
      const lockTarget = fs.readlinkSync(singletonLock); // e.g. "hostname-12345"
      const orphanPid = parseInt(lockTarget.split('-').pop() || '', 10);
      if (orphanPid && isProcessAlive(orphanPid)) {
        try { process.kill(orphanPid, 'SIGTERM'); } catch {}
        await new Promise(resolve => setTimeout(resolve, 1000));
        if (isProcessAlive(orphanPid)) {
          try { process.kill(orphanPid, 'SIGKILL'); } catch {}
          await new Promise(resolve => setTimeout(resolve, 500));
        }
      }
    } catch {
      // No lock symlink or not readable — nothing to kill
    }

    // Clean up Chromium profile locks (can persist after crashes)
    for (const lockFile of ['SingletonLock', 'SingletonSocket', 'SingletonCookie']) {
      try { fs.unlinkSync(path.join(profileDir, lockFile)); } catch {}
    }

    // Delete stale state file
    try { fs.unlinkSync(config.stateFile); } catch {}

    console.log('Launching headed Chromium with extension + sidebar agent...');
    try {
      // Start server in headed mode with extension auto-loaded
      // Use a well-known port so the Chrome extension auto-connects
      const serverEnv: Record<string, string> = {
        BROWSE_HEADED: '1',
        BROWSE_PORT: '34567',
        BROWSE_SIDEBAR_CHAT: '1',
      };
      const newState = await startServer(serverEnv);

      // Print connected status
      const resp = await fetch(`http://127.0.0.1:${newState.port}/command`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Authorization': `Bearer ${newState.token}`,
        },
        body: JSON.stringify({ command: 'status', args: [] }),
        signal: AbortSignal.timeout(5000),
      });
      const status = await resp.text();
      console.log(`Connected to real Chrome\n${status}`);

      // Auto-start sidebar agent
      // __dirname is inside $bunfs in compiled binaries — resolve from execPath instead
      let agentScript = path.resolve(__dirname, 'sidebar-agent.ts');
      if (!fs.existsSync(agentScript)) {
        agentScript = path.resolve(path.dirname(process.execPath), '..', 'src', 'sidebar-agent.ts');
      }

      try {
        if (!fs.existsSync(agentScript)) {
          throw new Error(`sidebar-agent.ts not found at ${agentScript}`);
        }

        // Clear old agent queue
        const agentQueue = path.join(process.env.HOME || '/tmp', '.gstack', 'sidebar-agent-queue.jsonl');
        try { fs.writeFileSync(agentQueue, ''); } catch {}

        // Resolve the browse binary: prefer dist/browse relative to __dirname,
        // otherwise fall back to the running compiled binary (process.execPath)
        let browseBin = path.resolve(__dirname, '..', 'dist', 'browse');
        if (!fs.existsSync(browseBin)) {
          browseBin = process.execPath; // the compiled binary itself
        }

        // Kill any existing sidebar-agent processes before starting a new one.
        // Old agents have stale auth tokens and will silently fail to relay events,
        // causing the server to mark the agent as "hung".
        try {
          const { spawnSync } = require('child_process');
          spawnSync('pkill', ['-f', 'sidebar-agent\\.ts'], { stdio: 'ignore', timeout: 3000 });
        } catch {}

        const agentProc = Bun.spawn(['bun', 'run', agentScript], {
          cwd: config.projectDir,
          env: {
            ...process.env,
            BROWSE_BIN: browseBin,
            BROWSE_STATE_FILE: config.stateFile,
            BROWSE_SERVER_PORT: String(newState.port),
          },
          stdio: ['ignore', 'ignore', 'ignore'],
        });
        agentProc.unref();
        console.log(`[browse] Sidebar agent started (PID: ${agentProc.pid})`);
      } catch (err: any) {
        console.error(`[browse] Sidebar agent failed to start: ${err.message}`);
        console.error(`[browse] Run manually: bun run ${agentScript}`);
      }
    } catch (err: any) {
      console.error(`[browse] Connect failed: ${err.message}`);
      process.exit(1);
    }
    process.exit(0);
  }

  // ─── Headed Disconnect (pre-server command) ─────────────────
  // disconnect must be handled BEFORE ensureServer() because the headed
  // guard blocks all commands when the server is unresponsive.
  if (command === 'disconnect') {
    const existingState = readState();
    if (!existingState || existingState.mode !== 'headed') {
      console.log('Not in headed mode — nothing to disconnect.');
      process.exit(0);
    }

    // Try graceful shutdown via server
    try {
      const resp = await fetch(`http://127.0.0.1:${existingState.port}/command`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Authorization': `Bearer ${existingState.token}`,
        },
        body: JSON.stringify({ command: 'disconnect', args: [] }),
        signal: AbortSignal.timeout(3000),
      });
      if (resp.ok) {
        console.log('Disconnected from real browser.');
        process.exit(0);
      }
      // Non-OK response: fall through to force cleanup
    } catch {
      // Server not responding — force cleanup
    }

    // Force kill + cleanup
    if (isProcessAlive(existingState.pid)) {
      try { process.kill(existingState.pid, 'SIGTERM'); } catch {}
      await new Promise(resolve => setTimeout(resolve, 2000));
      if (isProcessAlive(existingState.pid)) {
        try { process.kill(existingState.pid, 'SIGKILL'); } catch {}
      }
    }

    // Clean profile locks and state file
    const profileDir = path.join(process.env.HOME || '/tmp', '.gstack', 'chromium-profile');
    for (const lockFile of ['SingletonLock', 'SingletonSocket', 'SingletonCookie']) {
      try { fs.unlinkSync(path.join(profileDir, lockFile)); } catch {}
    }
    try { fs.unlinkSync(config.stateFile); } catch {}
    console.log('Disconnected (server was unresponsive — force cleaned).');
    process.exit(0);
  }

  // Special case: with no arguments, chain reads its JSON from stdin
  if (command === 'chain' && commandArgs.length === 0) {
    const stdin = await Bun.stdin.text();
    commandArgs.push(stdin.trim());
  }

  const state = await ensureServer();
  await sendCommand(state, command, commandArgs);
}
if (import.meta.main) {
  main().catch((err) => {
    console.error(`[browse] ${err.message}`);
    process.exit(1);
  });
}
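
Note on the kill sequences above: the connect and disconnect paths repeat the same SIGTERM, wait, SIGKILL escalation three times, and the orphan cleanup parses a PID out of the `SingletonLock` symlink target. The sketch below shows one way those two steps could be factored out. The names `parseLockPid`, `isAlive`, and `terminate` are hypothetical and do not exist in `cli.ts`; `isAlive` is only an assumption about what `isProcessAlive` does, and the polling loop is a substitution for the fixed sleeps used in the file.

```typescript
// Hypothetical helpers; not part of cli.ts.

/** Extract the PID from a lock target such as "hostname-12345". Returns null if no PID is present. */
function parseLockPid(lockTarget: string): number | null {
  const tail = lockTarget.split('-').pop() ?? '';
  if (!/^\d+$/.test(tail)) return null; // reject anything that is not purely digits
  const pid = Number.parseInt(tail, 10);
  return pid > 0 ? pid : null;
}

/** Probe a PID with signal 0, which checks for existence without delivering a signal. */
function isAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch (err: any) {
    // EPERM: the process exists but is owned by another user.
    return err?.code === 'EPERM';
  }
}

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

/**
 * Send SIGTERM, poll until the grace period ends, then send SIGKILL if the
 * process is still running. Polling returns early when the process exits
 * quickly, unlike a fixed sleep.
 */
async function terminate(
  pid: number,
  graceMs = 2000,
  pollMs = 100,
): Promise<'not-running' | 'exited' | 'killed'> {
  if (!isAlive(pid)) return 'not-running';
  try { process.kill(pid, 'SIGTERM'); } catch {}
  const deadline = Date.now() + graceMs;
  while (Date.now() < deadline) {
    if (!isAlive(pid)) return 'exited';
    await sleep(pollMs);
  }
  if (!isAlive(pid)) return 'exited';
  try { process.kill(pid, 'SIGKILL'); } catch {}
  return 'killed';
}
```

Under these assumptions, the orphan cleanup would reduce to reading the symlink, calling `parseLockPid(lockTarget)`, and awaiting `terminate(pid, 1000)` when the result is not null.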