feat(extension): Terminal-only sidebar — auth fix, UX polish, chat rip

The chat queue path is gone. The Chrome side panel is now just an interactive claude PTY in xterm.js. Activity / Refs / Inspector still exist behind the `debug` toggle in the footer.

Three threads of change, all from dogfood iteration on top of cc-pty-import:

1. fix(server): cross-port WS auth via Sec-WebSocket-Protocol
   - Browsers can't set Authorization on a WebSocket upgrade. We had been minting an HttpOnly gstack_pty cookie via /pty-session, but SameSite=Strict cookies don't survive the cross-port jump from server.ts:34567 to the agent's random port from a chrome-extension origin. The WS opened then immediately closed → "Session ended."
   - /pty-session now also returns ptySessionToken in the JSON body.
   - Extension calls `new WebSocket(url, [`gstack-pty.<token>`])`. Browser sends Sec-WebSocket-Protocol on the upgrade.
   - Agent reads the protocol header, validates against validTokens, and MUST echo the protocol back (Chromium closes the connection immediately if a server doesn't pick one of the offered protocols).
   - Cookie path is kept as a fallback for non-browser callers (curl, integration tests).
   - New integration test exercises the full protocol-auth round-trip via raw fetch+Upgrade so a future regression of this exact class fails in CI.

2. fix(extension): UX polish on the Terminal pane
   - Eager auto-connect when the sidebar opens — no "Press any key to start" friction every reload.
   - Always-visible ↻ Restart button in the terminal toolbar (not gated on the ENDED state) so the user can force a fresh claude mid-session.
   - MutationObserver on #tab-terminal's class attribute drives a fitAddon.fit() + term.refresh() when the pane becomes visible again — xterm doesn't auto-redraw after display:none → display:flex.

3. feat(extension): rip the chat tab + sidebar-agent.ts
   - Sidebar is Terminal-only. No more Terminal | Chat primary nav.
   - sidebar-agent.ts deleted. /sidebar-command, /sidebar-chat, /sidebar-agent/event, /sidebar-tabs* and friends all deleted.
   - The pickSidebarModel router (sonnet vs opus) is gone — the live PTY uses whatever model the user's `claude` CLI is configured with.
   - Quick-actions (🧹 Cleanup / 📸 Screenshot / 🍪 Cookies) survive in the Terminal toolbar. Cleanup now injects its prompt into the live PTY via window.gstackInjectToTerminal — no more /sidebar-command POST. The Inspector "Send to Code" action uses the same injection path.
   - clear-chat button removed from the footer.
   - sidepanel.js shed ~900 lines of chat polling, optimistic UI, stop-agent, etc.

Net diff: -3.4k lines across 16 files. CLAUDE.md, TODOS.md, and docs/designs/SIDEBAR_MESSAGE_FLOW.md rewritten to match. The sidebar regression test (browse/test/sidebar-tabs.test.ts) is rewritten as 27 structural assertions locking the new layout — Terminal sole pane, no chat input, quick-actions in toolbar, eager-connect, MutationObserver repaint, restart helper.
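
For reviewers unfamiliar with the subprotocol trick in thread 1, the handshake logic can be sketched roughly as below. This is an illustrative sketch, not the code in terminal-agent.ts: the helper names `buildProtocol` and `selectProtocol` and the `PROTOCOL_PREFIX` constant are hypothetical, and only the `gstack-pty.<token>` shape and the `validTokens` set come from the message above. It assumes tokens contain only characters that are legal in a subprotocol name (for example hex or base64url), since the browser rejects protocol strings containing separators such as spaces, commas, `/` or `=`.

```typescript
// Hypothetical constant: the commit message only shows the `gstack-pty.<token>` shape.
const PROTOCOL_PREFIX = "gstack-pty.";

/** Client side: the string the extension would pass as `new WebSocket(url, [proto])`. */
function buildProtocol(token: string): string {
  return `${PROTOCOL_PREFIX}${token}`;
}

/**
 * Server side: given the raw Sec-WebSocket-Protocol request header, return the
 * offered protocol to echo back in the upgrade response, or null to reject.
 * The header may list several comma-separated protocols.
 */
function selectProtocol(
  header: string | null,
  validTokens: ReadonlySet<string>,
): string | null {
  if (!header) return null;
  for (const offered of header.split(",").map((s) => s.trim())) {
    if (!offered.startsWith(PROTOCOL_PREFIX)) continue;
    const token = offered.slice(PROTOCOL_PREFIX.length);
    // Echo the exact offered string; the browser only accepts a protocol it offered.
    if (token.length > 0 && validTokens.has(token)) return offered;
  }
  return null;
}
```

The value returned by `selectProtocol` is what would go into the `Sec-WebSocket-Protocol` header of the 101 response; a `null` result means the upgrade should be refused (or fall through to the cookie path for non-browser callers).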
2026-05-18 18:32:28 +08:00 · 2026-04-25 21:03:04 -07:00
parent 0361acfb6a
commit 006dbe19f1
16 changed files with 771 additions and 4229 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -225,24 +225,35 @@ When you need to interact with a browser (QA, dogfooding, cookie setup), use the
 project uses.
 **Sidebar architecture:** Before modifying `sidepanel.js`, `background.js`,
-`content.js`, `sidebar-agent.ts`, `terminal-agent.ts`, or sidebar-related
+`content.js`, `terminal-agent.ts`, or sidebar-related server endpoints,
-server endpoints, read `docs/designs/SIDEBAR_MESSAGE_FLOW.md`. It documents
+read `docs/designs/SIDEBAR_MESSAGE_FLOW.md`. The sidebar has one primary
-the full initialization timeline, message flow, auth token chain, tab
+surface — the **Terminal** pane (interactive `claude` PTY) — with
-concurrency model, the Terminal-tab PTY flow, and known failure modes.
+Activity / Refs / Inspector as debug overlays behind the footer's
-The sidebar spans 6 files across 2 codebases (extension + server) with
+`debug` toggle. The chat queue path was ripped once the PTY proved out;
-non-obvious ordering dependencies. The doc exists to prevent the kind of
+`sidebar-agent.ts` and the `/sidebar-command` / `/sidebar-chat` /
-silent failures that come from not understanding the cross-component flow.
+`/sidebar-agent/event` endpoints are gone. The doc covers the WS auth
 flow, dual-token model, and threat-model boundary — silent failures
 here usually trace to not understanding the cross-component flow.
-**Terminal tab is its own process.** `terminal-agent.ts` is a separate
+**WebSocket auth uses Sec-WebSocket-Protocol, not cookies.** Browsers
-non-compiled bun process from `sidebar-agent.ts`. Do not bolt PTY logic
+can't set `Authorization` on a WebSocket upgrade, but they CAN set
-onto sidebar-agent — codex confirmed it would couple chat reliability to
+`Sec-WebSocket-Protocol` via `new WebSocket(url, [token])`. The agent
-PTY framing bugs. Cookie minting (`pty-session-cookie.ts`) lives in the
+reads it, validates against `validTokens`, and MUST echo the protocol
-server; the cookie travels via `Set-Cookie` and back via `Cookie:` on the
+back in the upgrade response — without the echo, Chromium closes the
-WebSocket upgrade. The WS upgrade gates on Origin AND cookie; both are
+connection immediately. `Set-Cookie: gstack_pty=...` is kept as a
-load-bearing for the Terminal tab to be safe. `/health` MUST NOT surface
+fallback for non-browser callers (the cross-port `SameSite=Strict`
-the cookie value or any shell-grant token (codex finding: existing
+cookie path doesn't survive from a chrome-extension origin).
-`AUTH_TOKEN` is already exposed there in headed mode; that's a separate
+
-v1.1+ TODO, not something to widen).
+**Cross-pane PTY injection.** The toolbar's Cleanup button and the
 Inspector's "Send to Code" action both pipe text into the live claude
 PTY via `window.gstackInjectToTerminal(text)`, exposed by
 `sidepanel-terminal.js`. No `/sidebar-command` POST — the live REPL is
 the only execution surface in the sidebar now.
 **`/health` MUST NOT surface any shell-grant token.** It already leaks
 `AUTH_TOKEN` to localhost callers in headed mode (a v1.1+ TODO). Don't
 make that worse by adding the PTY session token there. PTY auth flows
 through `POST /pty-session` only.
 **Transport-layer security** (v1.6.0.0+). When `pair-agent` starts an ngrok tunnel,
 the daemon binds two HTTP listeners: a local listener (127.0.0.1, full command
--- a/TODOS.md
+++ b/TODOS.md
@@ -52,28 +52,6 @@ scope of that PR; deliberately deferred to keep PTY-import small.
 ---
 ### v1.1+: Apply terminal-agent's exception handlers to sidebar-agent
 **What:** While reviewing cc-pty-import, codex noted that `sidebar-agent.ts`
 has no `process.on('uncaughtException'|'unhandledRejection')` handlers.
 A bug in claude stream parsing or queue I/O can take down the chat path
 silently. terminal-agent.ts ships with these handlers; sidebar-agent
 should get them too.
 **Why:** Today a single uncaught exception in chat = entire sidebar chat
 dies and nothing tells the user. The CLI doesn't supervise the agent.
 **Pros:** Chat survives transient bugs. **Cons:** Catching uncaught
 exceptions can hide real failures — pair the handlers with structured
 logging so we still see the bug.
 **Context:** codex finding #4 on cc-pty-import plan-eng review.
 **Priority:** P2.
 **Effort:** S.
 ---
 ## Testing
 ### Pre-existing test failures surfaced during v1.12.0.0 ship
--- a/browse/src/cli.ts
+++ b/browse/src/cli.ts
@@ -853,7 +853,7 @@ Refs:           After 'snapshot', use @e1, @e2... as selectors:
    // Delete stale state file
    safeUnlinkQuiet(config.stateFile);
-    console.log('Launching headed Chromium with extension + sidebar agent...');
+    console.log('Launching headed Chromium with extension + terminal agent...');
    try {
      // Start server in headed mode with extension auto-loaded
      // Use a well-known port so the Chrome extension auto-connects
@@ -882,61 +882,12 @@ Refs:           After 'snapshot', use @e1, @e2... as selectors:
      const status = await resp.text();
      console.log(`Connected to real Chrome\n${status}`);
-      // Auto-start sidebar agent
+      // sidebar-agent.ts spawn was here. Ripped alongside the chat queue —
-      // __dirname is inside $bunfs in compiled binaries — resolve from execPath instead
+      // the Terminal pane runs an interactive PTY now, no more one-shot
-      let agentScript = path.resolve(__dirname, 'sidebar-agent.ts');
+      // claude -p subprocesses to multiplex.
      if (!fs.existsSync(agentScript)) {
        agentScript = path.resolve(path.dirname(process.execPath), '..', 'src', 'sidebar-agent.ts');
      }
      try {
        if (!fs.existsSync(agentScript)) {
          throw new Error(`sidebar-agent.ts not found at ${agentScript}`);
        }
        // Clear old agent queue
        const agentQueue = path.join(process.env.HOME || '/tmp', '.gstack', 'sidebar-agent-queue.jsonl');
        try {
          fs.mkdirSync(path.dirname(agentQueue), { recursive: true, mode: 0o700 });
          fs.writeFileSync(agentQueue, '', { mode: 0o600 });
        } catch (err: any) {
          if (err?.code !== 'EACCES') throw err;
        }
-        // Resolve browse binary path the same way — execPath-relative
+      // Auto-start terminal agent (non-compiled bun process). Owns the PTY
-        let browseBin = path.resolve(__dirname, '..', 'dist', 'browse');
+      // WebSocket for the sidebar Terminal pane.
        if (!fs.existsSync(browseBin)) {
          browseBin = process.execPath; // the compiled binary itself
        }
        // Kill any existing sidebar-agent processes before starting a new one.
        // Old agents have stale auth tokens and will silently fail to relay events,
        // causing the server to mark the agent as "hung".
        try {
          const { spawnSync } = require('child_process');
          spawnSync('pkill', ['-f', 'sidebar-agent\\.ts'], { stdio: 'ignore', timeout: 3000 });
        } catch (err: any) {
          if (err?.code !== 'ENOENT') throw err;
        }
        const agentProc = Bun.spawn(['bun', 'run', agentScript], {
          cwd: config.projectDir,
          env: {
            ...process.env,
            BROWSE_BIN: browseBin,
            BROWSE_STATE_FILE: config.stateFile,
            BROWSE_SERVER_PORT: String(newState.port),
          },
          stdio: ['ignore', 'ignore', 'ignore'],
        });
        agentProc.unref();
        console.log(`[browse] Sidebar agent started (PID: ${agentProc.pid})`);
      } catch (err: any) {
        console.error(`[browse] Sidebar agent failed to start: ${err.message}`);
        console.error(`[browse] Run manually: bun run ${agentScript}`);
      }
      // Auto-start terminal agent (non-compiled, parallel to sidebar-agent).
      // Owns the PTY WebSocket for the Terminal sidebar tab. Crash-isolated
      // from the chat agent per codex outside-voice review.
      let termAgentScript = path.resolve(__dirname, 'terminal-agent.ts');
      if (!fs.existsSync(termAgentScript)) {
        termAgentScript = path.resolve(path.dirname(process.execPath), '..', 'src', 'terminal-agent.ts');
--- a/browse/src/server.ts
+++ b/browse/src/server.ts
--- a/browse/src/sidebar-agent.ts
+++ b/browse/src/sidebar-agent.ts
@@ -1,947 +0,0 @@
 /**
 * Sidebar Agent — polls agent-queue from server, spawns claude -p for each
 * message, streams live events back to the server via /sidebar-agent/event.
 *
 * This runs as a NON-COMPILED bun process because compiled bun binaries
 * cannot posix_spawn external executables. The server writes to the queue
 * file, this process reads it and spawns claude.
 *
 * Usage: BROWSE_BIN=/path/to/browse bun run browse/src/sidebar-agent.ts
 */
 import { spawn } from 'child_process';
 import * as fs from 'fs';
 import * as path from 'path';
 import { safeUnlink } from './error-handling';
 import {
  checkCanaryInStructure, logAttempt, hashPayload, extractDomain,
  combineVerdict, writeSessionState, readSessionState, THRESHOLDS,
  readDecision, clearDecision, excerptForReview,
  type LayerSignal,
 } from './security';
 import {
  loadTestsavant, scanPageContent, checkTranscript,
  shouldRunTranscriptCheck, getClassifierStatus,
  loadDeberta, scanPageContentDeberta,
  type ToolCallInput,
 } from './security-classifier';
 const QUEUE = process.env.SIDEBAR_QUEUE_PATH || path.join(process.env.HOME || '/tmp', '.gstack', 'sidebar-agent-queue.jsonl');
 const KILL_FILE = path.join(path.dirname(QUEUE), 'sidebar-agent-kill');
 const SERVER_PORT = parseInt(process.env.BROWSE_SERVER_PORT || '34567', 10);
 const SERVER_URL = `http://127.0.0.1:${SERVER_PORT}`;
 const POLL_MS = 200;  // 200ms poll — keeps time-to-first-token low
 const B = process.env.BROWSE_BIN || path.resolve(__dirname, '../../.claude/skills/gstack/browse/dist/browse');
 const CANCEL_DIR = path.join(process.env.HOME || '/tmp', '.gstack');
 function cancelFileForTab(tabId: number): string {
  return path.join(CANCEL_DIR, `sidebar-agent-cancel-${tabId}`);
 }
 interface QueueEntry {
  prompt: string;
  args?: string[];
  stateFile?: string;
  cwd?: string;
  tabId?: number | null;
  message?: string | null;
  pageUrl?: string | null;
  sessionId?: string | null;
  ts?: string;
  canary?: string; // session-scoped token; leak = prompt injection evidence
 }
 function isValidQueueEntry(e: unknown): e is QueueEntry {
  if (typeof e !== 'object' || e === null) return false;
  const obj = e as Record<string, unknown>;
  if (typeof obj.prompt !== 'string' || obj.prompt.length === 0) return false;
  if (obj.args !== undefined && (!Array.isArray(obj.args) || !obj.args.every(a => typeof a === 'string'))) return false;
  if (obj.stateFile !== undefined) {
    if (typeof obj.stateFile !== 'string') return false;
    if (obj.stateFile.includes('..')) return false;
  }
  if (obj.cwd !== undefined) {
    if (typeof obj.cwd !== 'string') return false;
    if (obj.cwd.includes('..')) return false;
  }
  if (obj.tabId !== undefined && obj.tabId !== null && typeof obj.tabId !== 'number') return false;
  if (obj.message !== undefined && obj.message !== null && typeof obj.message !== 'string') return false;
  if (obj.pageUrl !== undefined && obj.pageUrl !== null && typeof obj.pageUrl !== 'string') return false;
  if (obj.sessionId !== undefined && obj.sessionId !== null && typeof obj.sessionId !== 'string') return false;
  if (obj.canary !== undefined && typeof obj.canary !== 'string') return false;
  return true;
 }
 let lastLine = 0;
 let authToken: string | null = null;
 // Per-tab processing — each tab can run its own agent concurrently
 const processingTabs = new Set<number>();
 // Active claude subprocesses — keyed by tabId for targeted kill
 const activeProcs = new Map<number, ReturnType<typeof spawn>>();
 let activeProc: ReturnType<typeof spawn> | null = null;
 // Kill-file timestamp last seen — avoids double-kill on same write
 let lastKillTs = 0;
 // ─── File drop relay ──────────────────────────────────────────
 function getGitRoot(): string | null {
  try {
    const { execSync } = require('child_process');
    return execSync('git rev-parse --show-toplevel', { encoding: 'utf-8', stdio: ['pipe', 'pipe', 'pipe'] }).trim();
  } catch (err: any) {
    console.debug('[sidebar-agent] Not in a git repo:', err.message);
    return null;
  }
 }
 function writeToInbox(message: string, pageUrl?: string, sessionId?: string): void {
  const gitRoot = getGitRoot();
  if (!gitRoot) {
    console.error('[sidebar-agent] Cannot write to inbox — not in a git repo');
    return;
  }
  const inboxDir = path.join(gitRoot, '.context', 'sidebar-inbox');
  fs.mkdirSync(inboxDir, { recursive: true, mode: 0o700 });
  const now = new Date();
  const timestamp = now.toISOString().replace(/:/g, '-');
  const filename = `${timestamp}-observation.json`;
  const tmpFile = path.join(inboxDir, `.${filename}.tmp`);
  const finalFile = path.join(inboxDir, filename);
  const inboxMessage = {
    type: 'observation',
    timestamp: now.toISOString(),
    page: { url: pageUrl || 'unknown', title: '' },
    userMessage: message,
    sidebarSessionId: sessionId || 'unknown',
  };
  fs.writeFileSync(tmpFile, JSON.stringify(inboxMessage, null, 2), { mode: 0o600 });
  fs.renameSync(tmpFile, finalFile);
  console.log(`[sidebar-agent] Wrote inbox message: ${filename}`);
 }
 // ─── Auth ────────────────────────────────────────────────────────
 async function refreshToken(): Promise<string | null> {
  // Read token from state file (same-user, mode 0o600) instead of /health
  try {
    const stateFile = process.env.BROWSE_STATE_FILE ||
      path.join(process.env.HOME || '/tmp', '.gstack', 'browse.json');
    const data = JSON.parse(fs.readFileSync(stateFile, 'utf-8'));
    authToken = data.token || null;
    return authToken;
  } catch (err: any) {
    console.error('[sidebar-agent] Failed to refresh auth token:', err.message);
    return null;
  }
 }
 // ─── Event relay to server ──────────────────────────────────────
 async function sendEvent(event: Record<string, any>, tabId?: number): Promise<void> {
  if (!authToken) await refreshToken();
  if (!authToken) return;
  try {
    await fetch(`${SERVER_URL}/sidebar-agent/event`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${authToken}`,
      },
      body: JSON.stringify({ ...event, tabId: tabId ?? null }),
    });
  } catch (err) {
    console.error('[sidebar-agent] Failed to send event:', err);
  }
 }
 // ─── Claude subprocess ──────────────────────────────────────────
 function shorten(str: string): string {
  return str
    .replace(new RegExp(B.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'g'), '$B')
    .replace(/\/Users\/[^/]+/g, '~')
    .replace(/\/conductor\/workspaces\/[^/]+\/[^/]+/g, '')
    .replace(/\.claude\/skills\/gstack\//g, '')
    .replace(/browse\/dist\/browse/g, '$B');
 }
 function describeToolCall(tool: string, input: any): string {
  if (!input) return '';
  // For Bash commands, generate a plain-English description
  if (tool === 'Bash' && input.command) {
    const cmd = input.command;
    // Browse binary commands — the most common case
    const browseMatch = cmd.match(/\$B\s+(\w+)|browse[^\s]*\s+(\w+)/);
    if (browseMatch) {
      const browseCmd = browseMatch[1] || browseMatch[2];
      const args = cmd.split(/\s+/).slice(2).join(' ');
      switch (browseCmd) {
        case 'goto': return `Opening ${args.replace(/['"]/g, '')}`;
        case 'snapshot': return args.includes('-i') ? 'Scanning for interactive elements' : args.includes('-D') ? 'Checking what changed' : 'Taking a snapshot of the page';
        case 'screenshot': return `Saving screenshot${args ? ` to ${shorten(args)}` : ''}`;
        case 'click': return `Clicking ${args}`;
        case 'fill': { const parts = args.split(/\s+/); return `Typing "${parts.slice(1).join(' ')}" into ${parts[0]}`; }
        case 'text': return 'Reading page text';
        case 'html': return args ? `Reading HTML of ${args}` : 'Reading full page HTML';
        case 'links': return 'Finding all links on the page';
        case 'forms': return 'Looking for forms';
        case 'console': return 'Checking browser console for errors';
        case 'network': return 'Checking network requests';
        case 'url': return 'Checking current URL';
        case 'back': return 'Going back';
        case 'forward': return 'Going forward';
        case 'reload': return 'Reloading the page';
        case 'scroll': return args ? `Scrolling to ${args}` : 'Scrolling down';
        case 'wait': return `Waiting for ${args}`;
        case 'inspect': return args ? `Inspecting CSS of ${args}` : 'Getting CSS for last picked element';
        case 'style': return `Changing CSS: ${args}`;
        case 'cleanup': return 'Removing page clutter (ads, popups, banners)';
        case 'prettyscreenshot': return 'Taking a clean screenshot';
        case 'css': return `Checking CSS property: ${args}`;
        case 'is': return `Checking if element is ${args}`;
        case 'diff': return `Comparing ${args}`;
        case 'responsive': return 'Taking screenshots at mobile, tablet, and desktop sizes';
        case 'status': return 'Checking browser status';
        case 'tabs': return 'Listing open tabs';
        case 'focus': return 'Bringing browser to front';
        case 'select': return `Selecting option in ${args}`;
        case 'hover': return `Hovering over ${args}`;
        case 'viewport': return `Setting viewport to ${args}`;
        case 'upload': return `Uploading file to ${args.split(/\s+/)[0]}`;
        default: return `Running browse ${browseCmd} ${args}`.trim();
      }
    }
    // Non-browse bash commands
    if (cmd.includes('git ')) return `Running: ${shorten(cmd)}`;
    let short = shorten(cmd);
    return short.length > 100 ? short.slice(0, 100) + '…' : short;
  }
  if (tool === 'Read' && input.file_path) {
    // Skip Claude's internal tool-result file reads — they're plumbing, not user-facing
    if (input.file_path.includes('/tool-results/') || input.file_path.includes('/.claude/projects/')) return '';
    return `Reading ${shorten(input.file_path)}`;
  }
  if (tool === 'Edit' && input.file_path) return `Editing ${shorten(input.file_path)}`;
  if (tool === 'Write' && input.file_path) return `Writing ${shorten(input.file_path)}`;
  if (tool === 'Grep' && input.pattern) return `Searching for "${input.pattern}"`;
  if (tool === 'Glob' && input.pattern) return `Finding files matching ${input.pattern}`;
  try { return shorten(JSON.stringify(input)).slice(0, 80); } catch { return ''; }
 }
 // Keep the old name as an alias for backward compat
 function summarizeToolInput(tool: string, input: any): string {
  return describeToolCall(tool, input);
 }
 /**
 * Scan a Claude stream event for the session canary. Returns the channel where
 * it leaked, or null if clean. Covers every outbound channel: text blocks,
 * text deltas, tool_use arguments (including nested URL/path/command strings),
 * and result payloads.
 */
 function detectCanaryLeak(event: any, canary: string, buf?: DeltaBuffer): string | null {
  if (!canary) return null;
  if (event.type === 'assistant' && event.message?.content) {
    for (const block of event.message.content) {
      if (block.type === 'text' && typeof block.text === 'string' && block.text.includes(canary)) {
        return 'assistant_text';
      }
      if (block.type === 'tool_use' && checkCanaryInStructure(block.input, canary)) {
        return `tool_use:${block.name}`;
      }
    }
  }
  if (event.type === 'content_block_start' && event.content_block?.type === 'tool_use') {
    if (checkCanaryInStructure(event.content_block.input, canary)) {
      return `tool_use:${event.content_block.name}`;
    }
  }
  if (event.type === 'content_block_delta' && event.delta?.type === 'text_delta') {
    if (typeof event.delta.text === 'string') {
      // Rolling buffer: an attacker can ask Claude to emit the canary split
      // across two deltas (e.g., "CANARY-" then "ABCDEF"). A per-delta
      // substring check misses this. Concatenate the previous tail with
      // this chunk and search, then trim the tail to last canary.length-1
      // chars for the next event.
      const combined = buf ? buf.text_delta + event.delta.text : event.delta.text;
      if (combined.includes(canary)) return 'text_delta';
      if (buf) buf.text_delta = combined.slice(-(canary.length - 1));
    }
  }
  if (event.type === 'content_block_delta' && event.delta?.type === 'input_json_delta') {
    if (typeof event.delta.partial_json === 'string') {
      const combined = buf ? buf.input_json_delta + event.delta.partial_json : event.delta.partial_json;
      if (combined.includes(canary)) return 'tool_input_delta';
      if (buf) buf.input_json_delta = combined.slice(-(canary.length - 1));
    }
  }
  if (event.type === 'content_block_stop' && buf) {
    // Block boundary — reset the rolling buffer so a canary straddling
    // two independent tool_use blocks isn't inferred.
    buf.text_delta = '';
    buf.input_json_delta = '';
  }
  if (event.type === 'result' && typeof event.result === 'string' && event.result.includes(canary)) {
    return 'result';
  }
  return null;
 }
 /** Rolling-window tails for delta canary detection. See detectCanaryLeak. */
 interface DeltaBuffer {
  text_delta: string;
  input_json_delta: string;
 }
 interface CanaryContext {
  canary: string;
  pageUrl: string;
  onLeak: (channel: string) => void;
  deltaBuf: DeltaBuffer;
 }
 interface ToolResultScanContext {
  scan: (toolName: string, text: string) => Promise<void>;
 }
 /**
 * Per-tab map of tool_use_id → tool name. Lets the tool_result handler
 * know what tool produced the content (Read, Grep, Glob, Bash $B ...) so
 * we can tag attack logs with the ingress source.
 */
 const toolUseRegistry = new Map<string, { toolName: string; toolInput: unknown }>();
 /**
 * Extract plain-text content from a tool_result block. The Claude stream
 * encodes it as either a string or an array of content blocks (text, image).
 * We care about text — images can't carry prompt injection at this layer.
 */
 function extractToolResultText(content: unknown): string {
  if (typeof content === 'string') return content;
  if (!Array.isArray(content)) return '';
  const parts: string[] = [];
  for (const block of content) {
    if (block && typeof block === 'object') {
      const b = block as Record<string, unknown>;
      if (b.type === 'text' && typeof b.text === 'string') parts.push(b.text);
    }
  }
  return parts.join('\n');
 }
 /**
 * Tools whose outputs should be ML-scanned. Bash/$B outputs already get
 * scanned via the page-content flow. Read/Glob/Grep outputs have been
 * uncovered — Codex review flagged this gap. Adding coverage here closes it.
 */
 const SCANNED_TOOLS = new Set(['Read', 'Grep', 'Glob', 'Bash', 'WebFetch']);
 async function handleStreamEvent(event: any, tabId?: number, canaryCtx?: CanaryContext, toolResultScanCtx?: ToolResultScanContext): Promise<void> {
  // Canary check runs BEFORE any outbound send — we never want to relay
  // a leaked token to the sidepanel UI.
  if (canaryCtx) {
    const channel = detectCanaryLeak(event, canaryCtx.canary, canaryCtx.deltaBuf);
    if (channel) {
      canaryCtx.onLeak(channel);
      return; // drop the event — never relay content that leaked the canary
    }
  }
  if (event.type === 'system' && event.session_id) {
    // Relay claude session ID for --resume support
    await sendEvent({ type: 'system', claudeSessionId: event.session_id }, tabId);
  }
  if (event.type === 'assistant' && event.message?.content) {
    for (const block of event.message.content) {
      if (block.type === 'tool_use') {
        // Register the tool_use so we can correlate tool_results back to
        // the originating tool when they arrive in the next user-role message.
        if (block.id) toolUseRegistry.set(block.id, { toolName: block.name, toolInput: block.input });
        await sendEvent({ type: 'tool_use', tool: block.name, input: summarizeToolInput(block.name, block.input) }, tabId);
      } else if (block.type === 'text' && block.text) {
        await sendEvent({ type: 'text', text: block.text }, tabId);
      }
    }
  }
  // Tool results come back in user-role messages. Content can be a string
  // or an array of typed content blocks.
  if (event.type === 'user' && event.message?.content) {
    for (const block of event.message.content) {
      if (block && typeof block === 'object' && block.type === 'tool_result') {
        const meta = block.tool_use_id ? toolUseRegistry.get(block.tool_use_id) : null;
        const toolName = meta?.toolName ?? 'Unknown';
        const text = extractToolResultText(block.content);
        // Scan this tool output with the ML classifier if the tool is in
        // the SCANNED_TOOLS set and the content is non-trivial.
        if (SCANNED_TOOLS.has(toolName) && text.length >= 32 && toolResultScanCtx) {
          // Fire-and-forget — never block the stream handler. If BLOCK
          // fires, onToolResultBlock handles kill + emit.
          toolResultScanCtx.scan(toolName, text).catch(() => {});
        }
        if (block.tool_use_id) toolUseRegistry.delete(block.tool_use_id);
      }
    }
  }
  if (event.type === 'content_block_start' && event.content_block?.type === 'tool_use') {
    if (event.content_block.id) {
      toolUseRegistry.set(event.content_block.id, {
        toolName: event.content_block.name,
        toolInput: event.content_block.input,
      });
    }
    await sendEvent({ type: 'tool_use', tool: event.content_block.name, input: summarizeToolInput(event.content_block.name, event.content_block.input) }, tabId);
  }
  if (event.type === 'content_block_delta' && event.delta?.type === 'text_delta' && event.delta.text) {
    await sendEvent({ type: 'text_delta', text: event.delta.text }, tabId);
  }
  // Relay tool results so the sidebar can show what happened
  if (event.type === 'content_block_delta' && event.delta?.type === 'input_json_delta') {
    // Tool input streaming — skip, we already announced the tool
  }
  if (event.type === 'result') {
    await sendEvent({ type: 'result', text: event.result || '' }, tabId);
  }
  // Tool result events — summarize and relay
  if (event.type === 'tool_result' || (event.type === 'assistant' && event.message?.content)) {
    // Tool results come in the next assistant turn — handled above
  }
 }
 /**
 * Fire the prompt-injection-detected event to the server. This terminates
 * the session from the sidepanel's perspective and renders the canary leak
 * banner. Also logs locally (salted hash + domain only) and fires telemetry
 * if configured.
 */
 async function onCanaryLeaked(params: {
  tabId: number;
  channel: string;
  canary: string;
  pageUrl: string;
 }): Promise<void> {
  const { tabId, channel, canary, pageUrl } = params;
  const domain = extractDomain(pageUrl);
  console.warn(`[sidebar-agent] CANARY LEAK detected on ${channel} for tab ${tabId} (domain=${domain || 'unknown'})`);
  // Local log — salted hash + domain only, never the payload
  logAttempt({
    ts: new Date().toISOString(),
    urlDomain: domain,
    payloadHash: hashPayload(canary), // hash the canary, not the payload (which might be leaked content)
    confidence: 1.0,
    layer: 'canary',
    verdict: 'block',
  });
  // Broadcast to sidepanel so it can render the approved banner
  await sendEvent({
    type: 'security_event',
    verdict: 'block',
    reason: 'canary_leaked',
    layer: 'canary',
    channel,
    domain,
  }, tabId);
  // Also emit agent_error so the sidepanel's existing error surface
  // reflects that the session terminated. Keeps old clients working.
  await sendEvent({
    type: 'agent_error',
    error: `Session terminated — prompt injection detected${domain ? ` from ${domain}` : ''}`,
  }, tabId);
 }
 /**
 * Pre-spawn ML scan of the user message. If the classifier fires at BLOCK,
 * we log the attempt, emit a security_event to the sidepanel, and DO NOT
 * spawn claude. Returns true if the scan blocked the session.
 *
 * Fail-open: any classifier error or degraded state returns false (safe) so
 * the sidebar keeps working. The architectural controls (XML framing +
 * command allowlist, live in server.ts:554-577) still defend.
 */
 async function preSpawnSecurityCheck(entry: QueueEntry): Promise<boolean> {
  const { message, canary, pageUrl, tabId } = entry;
  if (!message || message.length === 0) return false;
  const tid = tabId ?? 0;
  // L4: scan the user message for direct injection patterns (TestSavantAI)
  // L4c: also scan with DeBERTa-v3 when ensemble is enabled (opt-in)
  const [contentSignal, debertaSignal] = await Promise.all([
    scanPageContent(message),
    scanPageContentDeberta(message),
  ]);
  const signals: LayerSignal[] = [contentSignal, debertaSignal];
  // L4b: only bother with Haiku if another layer already lit up at >= LOG_ONLY.
  // Saves ~70% of Haiku calls per plan §E1 "gating optimization".
  if (shouldRunTranscriptCheck(signals)) {
    const transcriptSignal = await checkTranscript({
      user_message: message,
      tool_calls: [], // no tool calls yet at session start
    });
    signals.push(transcriptSignal);
  }
  const result = combineVerdict(signals);
  if (result.verdict !== 'block') return false;
  // BLOCK verdict. Log + emit + refuse to spawn.
  const domain = extractDomain(pageUrl ?? '');
  const leaderSignal = signals.reduce((a, b) => (a.confidence > b.confidence ? a : b));
  logAttempt({
    ts: new Date().toISOString(),
    urlDomain: domain,
    payloadHash: hashPayload(message),
    confidence: result.confidence,
    layer: leaderSignal.layer,
    verdict: 'block',
  });
  console.warn(`[sidebar-agent] Pre-spawn BLOCK (${result.reason}) for tab ${tid}, confidence=${result.confidence.toFixed(3)}`);
  await sendEvent({
    type: 'security_event',
    verdict: 'block',
    reason: result.reason ?? 'ml_classifier',
    layer: leaderSignal.layer,
    confidence: result.confidence,
    domain,
  }, tid);
  await sendEvent({
    type: 'agent_error',
    error: `Session blocked — prompt injection detected${domain ? ` from ${domain}` : ' in your message'}`,
  }, tid);
  return true;
 }
 async function askClaude(queueEntry: QueueEntry): Promise<void> {
  const { prompt, args, stateFile, cwd, tabId, canary, pageUrl } = queueEntry;
  const tid = tabId ?? 0;
  processingTabs.add(tid);
  await sendEvent({ type: 'agent_start' }, tid);
  // Pre-spawn ML scan: if the user message trips the ensemble, refuse to
  // spawn claude. Fail-open on classifier errors.
  if (await preSpawnSecurityCheck(queueEntry)) {
    processingTabs.delete(tid);
    return;
  }
  return new Promise((resolve) => {
    // Canary context is set after proc is spawned (needs proc reference for kill).
    let canaryCtx: CanaryContext | undefined;
    let canaryTriggered = false;
    // Use args from queue entry (server sets --model, --allowedTools, prompt framing).
    // Fall back to defaults only if queue entry has no args (backward compat).
    // Write doesn't expand attack surface beyond what Bash already provides.
    // The security boundary is the localhost-only message path, not the tool allowlist.
    let claudeArgs = args || ['-p', prompt, '--output-format', 'stream-json', '--verbose',
      '--allowedTools', 'Bash,Read,Glob,Grep,Write'];
    // Validate cwd exists — queue may reference a stale worktree
    let effectiveCwd = cwd || process.cwd();
    try { fs.accessSync(effectiveCwd); } catch (err: any) {
      console.warn('[sidebar-agent] Worktree path inaccessible, falling back to cwd:', effectiveCwd, err.message);
      effectiveCwd = process.cwd();
    }
    // Clear any stale cancel signal for this tab before starting
    const cancelFile = cancelFileForTab(tid);
    safeUnlink(cancelFile);
    const proc = spawn('claude', claudeArgs, {
      stdio: ['pipe', 'pipe', 'pipe'],
      cwd: effectiveCwd,
      env: {
        ...process.env,
        BROWSE_STATE_FILE: stateFile || '',
        // Connect to the existing headed browse server, never start a new one.
        // BROWSE_PORT tells the CLI which port to check.
        // BROWSE_NO_AUTOSTART prevents spawning an invisible headless browser
        // if the headed server is down — fail fast with a clear error instead.
        BROWSE_PORT: process.env.BROWSE_PORT || '34567',
        BROWSE_NO_AUTOSTART: '1',
        // Pin this agent to its tab — prevents cross-tab interference
        // when multiple agents run simultaneously
        BROWSE_TAB: String(tid),
      },
    });
    // Track active procs so kill-file polling can terminate them
    activeProcs.set(tid, proc);
    activeProc = proc;
    proc.stdin.end();
    // Now that proc exists, set up the canary-leak handler. It fires at most
    // once; on fire we kill the subprocess, emit security_event + agent_error,
    // and let the normal close handler resolve the promise.
    if (canary) {
      canaryCtx = {
        canary,
        pageUrl: pageUrl ?? '',
        deltaBuf: { text_delta: '', input_json_delta: '' },
        onLeak: (channel: string) => {
          if (canaryTriggered) return;
          canaryTriggered = true;
          onCanaryLeaked({ tabId: tid, channel, canary, pageUrl: pageUrl ?? '' });
          try { proc.kill('SIGTERM'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; }
          setTimeout(() => {
            try { proc.kill('SIGKILL'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; }
          }, 2000);
        },
      };
    }
    // Tool-result ML scan context. Addresses the Codex review gap: Read,
    // Grep, Glob, and WebFetch outputs enter Claude's context without
    // passing through the Bash $B pipeline that content-security.ts
    // already wraps. Scan them here.
    let toolResultBlockFired = false;
    const toolResultScanCtx: ToolResultScanContext = {
      scan: async (toolName: string, text: string) => {
        if (toolResultBlockFired) return;
        // Parallel L4 + L4c ensemble scan (DeBERTa no-op when disabled).
        // We run L4/L4c AND Haiku in parallel on tool outputs regardless of
        // L4's score, because BrowseSafe-Bench shows L4 (TestSavantAI) has
        // low recall on browser-agent-specific attacks (~15% at v1). Gating
        // Haiku on L4 meant our best signal almost never ran. The cost is
        // ~$0.002 + ~300ms per tool output, bounded by the Haiku timeout
        // and offset by Haiku actually seeing the real attack context.
        //
        // Haiku only runs when the Claude CLI is available (checkHaikuAvailable
        // caches the probe). In environments without it, the call returns a
        // degraded signal and the verdict falls back to L4 alone.
        const [contentSignal, debertaSignal, transcriptSignal] = await Promise.all([
          scanPageContent(text),
          scanPageContentDeberta(text),
          checkTranscript({
            user_message: queueEntry.message ?? '',
            tool_calls: [{ tool_name: toolName, tool_input: {} }],
            tool_output: text,
          }),
        ]);
        const signals: LayerSignal[] = [contentSignal, debertaSignal, transcriptSignal];
        const result = combineVerdict(signals, { toolOutput: true });
        if (result.verdict !== 'block') return;
        toolResultBlockFired = true;
        const domain = extractDomain(pageUrl ?? '');
        const payloadHash = hashPayload(text.slice(0, 4096));
        // Log pending — if the user overrides, we'll update via a separate
        // log line. The attempts.jsonl is append-only so both entries survive.
        logAttempt({
          ts: new Date().toISOString(),
          urlDomain: domain,
          payloadHash,
          confidence: result.confidence,
          layer: 'testsavant_content',
          verdict: 'block',
        });
        console.warn(`[sidebar-agent] Tool-result BLOCK on ${toolName} for tab ${tid} (confidence=${result.confidence.toFixed(3)}) — awaiting user decision`);
        // Surface a REVIEWABLE block event. Sidepanel renders the suspected
        // text + layer scores + [Allow and continue] / [Block session] buttons.
        // The user has 60s to decide; default is BLOCK (safe fallback).
        const layerScores = signals
          .filter((s) => s.confidence > 0)
          .map((s) => ({ layer: s.layer, confidence: s.confidence }));
        await sendEvent({
          type: 'security_event',
          verdict: 'block',
          reason: 'tool_result_ml',
          layer: 'testsavant_content',
          confidence: result.confidence,
          domain,
          tool: toolName,
          reviewable: true,
          suspected_text: excerptForReview(text),
          signals: layerScores,
        }, tid);
        // Poll for the user's decision. Default to BLOCK on timeout.
        const REVIEW_TIMEOUT_MS = 60_000;
        const POLL_MS = 500;
        clearDecision(tid); // clear any stale decision from a prior session
        const deadline = Date.now() + REVIEW_TIMEOUT_MS;
        let decision: 'allow' | 'block' = 'block';
        let decisionReason = 'timeout';
        while (Date.now() < deadline) {
          const rec = readDecision(tid);
          if (rec?.decision === 'allow' || rec?.decision === 'block') {
            decision = rec.decision;
            decisionReason = rec.reason ?? 'user';
            break;
          }
          await new Promise((r) => setTimeout(r, POLL_MS));
        }
        clearDecision(tid);
        if (decision === 'allow') {
          // User overrode. Log the override so the audit trail captures it.
          // toolResultBlockFired is reset further down, so later tool results
          // in the same message are scanned (and may prompt) again.
          logAttempt({
            ts: new Date().toISOString(),
            urlDomain: domain,
            payloadHash,
            confidence: result.confidence,
            layer: 'testsavant_content',
            verdict: 'user_overrode',
          });
          await sendEvent({
            type: 'security_event',
            verdict: 'user_overrode',
            reason: 'tool_result_ml',
            layer: 'testsavant_content',
            confidence: result.confidence,
            domain,
            tool: toolName,
          }, tid);
          console.warn(`[sidebar-agent] Tab ${tid}: user overrode BLOCK — session continues`);
          // Let the block stay consumed; reset the flag so subsequent tool
          // results get scanned fresh.
          toolResultBlockFired = false;
          return;
        }
        // User chose BLOCK (or timed out). Kill the session as before.
        await sendEvent({
          type: 'agent_error',
          error: `Session terminated — prompt injection detected in ${toolName} output${decisionReason === 'timeout' ? ' (review timeout)' : ''}`,
        }, tid);
        try { proc.kill('SIGTERM'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; }
        setTimeout(() => {
          try { proc.kill('SIGKILL'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; }
        }, 2000);
      },
    };
    // Poll for per-tab cancel signal from server's killAgent()
    const cancelCheck = setInterval(() => {
      try {
        if (fs.existsSync(cancelFile)) {
          console.log(`[sidebar-agent] Cancel signal received for tab ${tid} — killing claude subprocess`);
          try { proc.kill('SIGTERM'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; }
          setTimeout(() => { try { proc.kill('SIGKILL'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; } }, 3000);
          fs.unlinkSync(cancelFile);
          clearInterval(cancelCheck);
        }
      } catch (err: any) { if (err?.code !== 'ENOENT') throw err; }
    }, 500);
    let buffer = '';
    proc.stdout.on('data', (data: Buffer) => {
      buffer += data.toString();
      const lines = buffer.split('\n');
      buffer = lines.pop() || '';
      for (const line of lines) {
        if (!line.trim()) continue;
        try { handleStreamEvent(JSON.parse(line), tid, canaryCtx, toolResultScanCtx); } catch (err: any) {
          console.error(`[sidebar-agent] Tab ${tid}: Failed to parse stream line:`, line.slice(0, 100), err.message);
        }
      }
    });
    let stderrBuffer = '';
    proc.stderr.on('data', (data: Buffer) => {
      stderrBuffer += data.toString();
    });
    proc.on('close', (code) => {
      clearInterval(cancelCheck);
      activeProc = null;
      activeProcs.delete(tid);
      if (buffer.trim()) {
        try { handleStreamEvent(JSON.parse(buffer), tid, canaryCtx, toolResultScanCtx); } catch (err: any) {
          console.error(`[sidebar-agent] Tab ${tid}: Failed to parse final buffer:`, buffer.slice(0, 100), err.message);
        }
      }
      const doneEvent: Record<string, any> = { type: 'agent_done' };
      if (code !== 0 && stderrBuffer.trim()) {
        doneEvent.stderr = stderrBuffer.trim().slice(-500);
      }
      sendEvent(doneEvent, tid).then(() => {
        processingTabs.delete(tid);
        resolve();
      });
    });
    proc.on('error', (err) => {
      clearInterval(cancelCheck);
      activeProc = null;
      const errorMsg = stderrBuffer.trim()
        ? `${err.message}\nstderr: ${stderrBuffer.trim().slice(-500)}`
        : err.message;
      sendEvent({ type: 'agent_error', error: errorMsg }, tid).then(() => {
        processingTabs.delete(tid);
        resolve();
      });
    });
    // Timeout (default 300s / 5 min — multi-page tasks need time)
    const timeoutMs = parseInt(process.env.SIDEBAR_AGENT_TIMEOUT || '300000', 10);
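    setTimeout(() => {
      // Skip if this process already exited: 'close' removes it from
      // activeProcs, and nothing clears this timer, so without the guard a
      // finished run would still emit a spurious timeout error.
      if (activeProcs.get(tid) !== proc) return;
      try { proc.kill('SIGTERM'); } catch (killErr: any) {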
        console.warn(`[sidebar-agent] Tab ${tid}: Failed to kill timed-out process:`, killErr.message);
      }
      setTimeout(() => { try { proc.kill('SIGKILL'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; } }, 3000);
      const timeoutMsg = stderrBuffer.trim()
        ? `Timed out after ${timeoutMs / 1000}s\nstderr: ${stderrBuffer.trim().slice(-500)}`
        : `Timed out after ${timeoutMs / 1000}s`;
      sendEvent({ type: 'agent_error', error: timeoutMsg }, tid).then(() => {
        processingTabs.delete(tid);
        resolve();
      });
    }, timeoutMs);
  });
 }
 // ─── Poll loop ───────────────────────────────────────────────────
 function countLines(): number {
  try {
    return fs.readFileSync(QUEUE, 'utf-8').split('\n').filter(Boolean).length;
  } catch (err: any) {
    console.error('[sidebar-agent] Failed to read queue file:', err.message);
    return 0;
  }
 }
 function readLine(n: number): string | null {
  try {
    const lines = fs.readFileSync(QUEUE, 'utf-8').split('\n').filter(Boolean);
    return lines[n - 1] || null;
  } catch (err: any) {
    console.error(`[sidebar-agent] Failed to read queue line ${n}:`, err.message);
    return null;
  }
 }
 async function poll() {
  const current = countLines();
  if (current <= lastLine) return;
  while (lastLine < current) {
    lastLine++;
    const line = readLine(lastLine);
    if (!line) continue;
    let parsed: unknown;
    try { parsed = JSON.parse(line); } catch (err: any) {
      console.warn(`[sidebar-agent] Skipping malformed queue entry at line ${lastLine}:`, line.slice(0, 80), err.message);
      continue;
    }
    if (!isValidQueueEntry(parsed)) {
      console.warn(`[sidebar-agent] Skipping invalid queue entry at line ${lastLine}: failed schema validation`);
      continue;
    }
    const entry = parsed;
    const tid = entry.tabId ?? 0;
    // Skip if this tab already has an agent running — server queues per-tab
    if (processingTabs.has(tid)) continue;
    console.log(`[sidebar-agent] Processing tab ${tid}: "${entry.message}"`);
    // Write to inbox so workspace agent can pick it up
    writeToInbox(entry.message || entry.prompt, entry.pageUrl, entry.sessionId);
    // Fire and forget — each tab's agent runs concurrently
    askClaude(entry).catch((err) => {
      console.error(`[sidebar-agent] Error on tab ${tid}:`, err);
      sendEvent({ type: 'agent_error', error: String(err) }, tid);
    });
  }
 }
 // ─── Main ────────────────────────────────────────────────────────
 function pollKillFile(): void {
  try {
    const stat = fs.statSync(KILL_FILE);
    const mtime = stat.mtimeMs;
    if (mtime > lastKillTs) {
      lastKillTs = mtime;
      if (activeProcs.size > 0) {
        console.log(`[sidebar-agent] Kill signal received — terminating ${activeProcs.size} active agent(s)`);
        for (const [tid, proc] of activeProcs) {
          try { proc.kill('SIGTERM'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; }
          setTimeout(() => { try { proc.kill('SIGKILL'); } catch (err: any) { if (err?.code !== 'ESRCH') throw err; } }, 2000);
          processingTabs.delete(tid);
        }
        activeProcs.clear();
      }
    }
  } catch {
    // Kill file doesn't exist yet — normal state
  }
 }
 async function main() {
  const dir = path.dirname(QUEUE);
  fs.mkdirSync(dir, { recursive: true, mode: 0o700 });
  if (!fs.existsSync(QUEUE)) fs.writeFileSync(QUEUE, '', { mode: 0o600 });
  try { fs.chmodSync(QUEUE, 0o600); } catch (err: any) { if (err?.code !== 'ENOENT') throw err; }
  lastLine = countLines();
  await refreshToken();
  console.log(`[sidebar-agent] Started. Watching ${QUEUE} from line ${lastLine}`);
  console.log(`[sidebar-agent] Server: ${SERVER_URL}`);
  console.log(`[sidebar-agent] Browse binary: ${B}`);
  // If GSTACK_SECURITY_ENSEMBLE=deberta is set, also warm the DeBERTa-v3
  // ensemble classifier. Fire-and-forget alongside TestSavantAI — they
  // warm in parallel. No-op when the env var is unset.
  loadDeberta((msg) => console.log(`[security-classifier] ${msg}`))
    .catch((err) => console.warn('[sidebar-agent] DeBERTa warmup failed:', err?.message));
  // Warm up the ML classifier in the background. First call triggers a 112MB
  // download (~30s on average broadband). Non-blocking — the sidebar stays
  // functional on cold start; classifier just reports 'off' until warmed.
  //
  // On warmup completion (success or failure), write the classifier status to
  // ~/.gstack/security/session-state.json so server.ts's /health endpoint can
  // report it to the sidepanel for shield icon rendering.
  loadTestsavant((msg) => console.log(`[security-classifier] ${msg}`))
    .then(() => {
      const s = getClassifierStatus();
      console.log(`[sidebar-agent] Classifier warmup complete: ${JSON.stringify(s)}`);
      const existing = readSessionState();
      writeSessionState({
        sessionId: existing?.sessionId ?? String(process.pid),
        canary: existing?.canary ?? '',
        warnedDomains: existing?.warnedDomains ?? [],
        classifierStatus: s,
        lastUpdated: new Date().toISOString(),
      });
    })
    .catch((err) => console.warn('[sidebar-agent] Classifier warmup failed (degraded mode):', err?.message));
  setInterval(poll, POLL_MS);
  setInterval(pollKillFile, POLL_MS);
 }
 main().catch(console.error);
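
The stdout handler in the deleted askClaude() above accumulates chunks, splits on newlines, and carries the trailing partial line over to the next chunk, then parses whatever remains on close. A minimal sketch of that buffering as a standalone class, assuming newline-delimited JSON input. `NdjsonBuffer`, `ParseErrorHandler`, and the method names are hypothetical and do not appear in sidebar-agent.ts:

```typescript
// Hypothetical helper for illustration only; not part of sidebar-agent.ts.
type ParseErrorHandler = (line: string, err: Error) => void;
const ignoreError: ParseErrorHandler = () => {};

class NdjsonBuffer {
  private pending = '';

  /** Feed a raw chunk; returns the parsed value of every complete line. */
  push(chunk: string, onError: ParseErrorHandler = ignoreError): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split('\n');
    // The last element is '' (chunk ended on a newline) or a partial line.
    this.pending = lines.pop() ?? '';
    return this.parseLines(lines, onError);
  }

  /** Call once when the stream closes to parse any leftover text. */
  flush(onError: ParseErrorHandler = ignoreError): unknown[] {
    const rest = this.pending;
    this.pending = '';
    return this.parseLines([rest], onError);
  }

  private parseLines(lines: string[], onError: ParseErrorHandler): unknown[] {
    const out: unknown[] = [];
    for (const line of lines) {
      if (!line.trim()) continue; // skip blank lines, as the original does
      try {
        out.push(JSON.parse(line));
      } catch (err) {
        onError(line, err as Error); // report and keep going
      }
    }
    return out;
  }
}
```

In this sketch a malformed line is reported through the callback and skipped, mirroring the original's log-and-continue behavior. The original decodes each `Buffer` chunk with `toString()`, which can split a multi-byte UTF-8 character across chunks; the sketch takes strings and leaves decoding to the caller.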
--- a/browse/src/terminal-agent.ts
+++ b/browse/src/terminal-agent.ts
@@ -200,10 +200,18 @@ function buildServer() {
      // /ws — WebSocket upgrade. CRITICAL gates:
      //   (1) Origin must be chrome-extension://<id>. Cross-site WS hijacking
-      //       defense per codex finding #9.
-      //   (2) Cookie gstack_pty must be in validTokens. The cookie was
-      //       minted by the parent server's /pty-session route under a
-      //       valid AUTH_TOKEN, so a request without it can't get a shell.
+      //       defense — required, not optional.
+      //   (2) Token must be in validTokens. We accept the token via two
+      //       transports for compatibility:
+      //         - Sec-WebSocket-Protocol (preferred for browsers — the only
+      //           auth header settable from the browser WebSocket API)
+      //         - Cookie gstack_pty (works for non-browser callers and
+      //           same-port browser callers; doesn't survive the cross-port
+      //           jump from server.ts:34567 to the agent's random port
+      //           when SameSite=Strict is set)
+      //       Either path works; both verify against the same in-memory
+      //       validTokens Set, populated by the parent server's
+      //       authenticated /pty-session → /internal/grant chain.
      if (url.pathname === '/ws') {
        const origin = req.headers.get('origin') || '';
        const isExtensionOrigin = origin.startsWith('chrome-extension://');
@@ -214,18 +222,48 @@ function buildServer() {
          return new Response('forbidden origin', { status: 403 });
        }
-        const cookieHeader = req.headers.get('cookie') || '';
+        // Try Sec-WebSocket-Protocol first. Format: a single token, possibly
-        let cookieToken: string | null = null;
+        // with a `gstack-pty.` prefix (which we strip). Browsers send a
-        for (const part of cookieHeader.split(';')) {
+        // comma-separated list when multiple were requested; we pick the
-          const [name, ...rest] = part.trim().split('=');
+        // first that matches a known token.
-          if (name === 'gstack_pty') { cookieToken = rest.join('=') || null; break; }
+        const protoHeader = req.headers.get('sec-websocket-protocol') || '';
        let token: string | null = null;
        let acceptedProtocol: string | null = null;
        for (const raw of protoHeader.split(',').map(s => s.trim()).filter(Boolean)) {
          const candidate = raw.startsWith('gstack-pty.') ? raw.slice('gstack-pty.'.length) : raw;
          if (validTokens.has(candidate)) {
            token = candidate;
            acceptedProtocol = raw;
            break;
          }
        }
-        if (!cookieToken || !validTokens.has(cookieToken)) {
+
        // Fallback: Cookie gstack_pty (legacy / non-browser callers).
        if (!token) {
          const cookieHeader = req.headers.get('cookie') || '';
          for (const part of cookieHeader.split(';')) {
            const [name, ...rest] = part.trim().split('=');
            if (name === 'gstack_pty') {
              const candidate = rest.join('=') || null;
              if (candidate && validTokens.has(candidate)) {
                token = candidate;
              }
              break;
            }
          }
        }
        if (!token) {
          return new Response('unauthorized', { status: 401 });
        }
        const upgraded = server.upgrade(req, {
-          data: { cookie: cookieToken },
+          data: { cookie: token },
          // Echo the protocol back so the browser accepts the upgrade.
          // Required when the client sends Sec-WebSocket-Protocol — the
          // server MUST select one of the offered protocols, otherwise
          // the browser closes the connection immediately.
          ...(acceptedProtocol ? { headers: { 'Sec-WebSocket-Protocol': acceptedProtocol } } : {}),
        });
        return upgraded ? undefined : new Response('upgrade failed', { status: 500 });
      }
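
The token lookup in the /ws handler above tries each offered subprotocol first, then falls back to the gstack_pty cookie. A sketch of the same selection logic as a pure function, which may make the two transports easier to unit-test in isolation. `selectToken`, `TokenMatch`, and `PROTOCOL_PREFIX` are hypothetical names introduced here, not identifiers from terminal-agent.ts:

```typescript
// Hypothetical extraction for illustration; terminal-agent.ts inlines this logic.
interface TokenMatch {
  token: string;
  // The exact offered protocol string to echo back, or null for cookie auth.
  acceptedProtocol: string | null;
}

const PROTOCOL_PREFIX = 'gstack-pty.';

function selectToken(
  protoHeader: string,
  cookieHeader: string,
  validTokens: Set<string>,
): TokenMatch | null {
  // Preferred transport: Sec-WebSocket-Protocol, a comma-separated list.
  for (const raw of protoHeader.split(',').map((s) => s.trim()).filter(Boolean)) {
    const candidate = raw.startsWith(PROTOCOL_PREFIX)
      ? raw.slice(PROTOCOL_PREFIX.length)
      : raw;
    if (validTokens.has(candidate)) {
      return { token: candidate, acceptedProtocol: raw };
    }
  }
  // Fallback transport: the gstack_pty cookie (non-browser callers).
  for (const part of cookieHeader.split(';')) {
    const [name, ...rest] = part.trim().split('=');
    if (name !== 'gstack_pty') continue;
    const candidate = rest.join('=');
    // Only the first gstack_pty cookie is considered, as in the handler.
    return candidate && validTokens.has(candidate)
      ? { token: candidate, acceptedProtocol: null }
      : null;
  }
  return null;
}
```

With this shape, a null result maps to the 401 response, and a non-null `acceptedProtocol` is the value echoed in the upgrade's Sec-WebSocket-Protocol header. On the extension side, the commit message describes the matching call as `new WebSocket(url, [`gstack-pty.<token>`])`.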
--- a/browse/test/sidebar-agent-roundtrip.test.ts
+++ b/browse/test/sidebar-agent-roundtrip.test.ts
@@ -1,226 +0,0 @@
 /**
 * Layer 3: Sidebar agent round-trip tests.
 * Starts server + sidebar-agent together. Mocks the `claude` binary with a shell
 * script that outputs canned stream-json. Verifies events flow end-to-end:
 * POST /sidebar-command → queue → sidebar-agent → mock claude → events → /sidebar-chat
 */
 import { describe, test, expect, beforeAll, afterAll } from 'bun:test';
 import { spawn, type Subprocess } from 'bun';
 import * as fs from 'fs';
 import * as os from 'os';
 import * as path from 'path';
 let serverProc: Subprocess | null = null;
 let agentProc: Subprocess | null = null;
 let serverPort: number = 0;
 let authToken: string = '';
 let tmpDir: string = '';
 let stateFile: string = '';
 let queueFile: string = '';
 let mockBinDir: string = '';
 async function api(pathname: string, opts: RequestInit = {}): Promise<Response> {
  const headers: Record<string, string> = {
    'Content-Type': 'application/json',
    ...(opts.headers as Record<string, string> || {}),
  };
  if (!headers['Authorization'] && authToken) {
    headers['Authorization'] = `Bearer ${authToken}`;
  }
  return fetch(`http://127.0.0.1:${serverPort}${pathname}`, { ...opts, headers });
 }
 async function resetState() {
  await api('/sidebar-session/new', { method: 'POST' });
  fs.writeFileSync(queueFile, '');
 }
 async function pollChatUntil(
  predicate: (entries: any[]) => boolean,
  timeoutMs = 10000,
 ): Promise<any[]> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    const resp = await api('/sidebar-chat?after=0');
    const data = await resp.json();
    if (predicate(data.entries)) return data.entries;
    await new Promise(r => setTimeout(r, 300));
  }
  // Return whatever we have on timeout
  const resp = await api('/sidebar-chat?after=0');
  return (await resp.json()).entries;
 }
 function writeMockClaude(script: string) {
  const mockPath = path.join(mockBinDir, 'claude');
  fs.writeFileSync(mockPath, script, { mode: 0o755 });
 }
 beforeAll(async () => {
  tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'sidebar-roundtrip-'));
  stateFile = path.join(tmpDir, 'browse.json');
  queueFile = path.join(tmpDir, 'sidebar-queue.jsonl');
  mockBinDir = path.join(tmpDir, 'bin');
  fs.mkdirSync(mockBinDir, { recursive: true });
  fs.mkdirSync(path.dirname(queueFile), { recursive: true });
  // Write default mock claude that outputs canned events
  writeMockClaude(`#!/bin/bash
 echo '{"type":"system","session_id":"mock-session-123"}'
 echo '{"type":"assistant","message":{"content":[{"type":"text","text":"I can see the page. It looks like a test fixture."}]}}'
 echo '{"type":"result","result":"Done."}'
 `);
  // Start server (no browser)
  const serverScript = path.resolve(__dirname, '..', 'src', 'server.ts');
  serverProc = spawn(['bun', 'run', serverScript], {
    env: {
      ...process.env,
      BROWSE_STATE_FILE: stateFile,
      BROWSE_HEADLESS_SKIP: '1',
      BROWSE_PORT: '0',
      SIDEBAR_QUEUE_PATH: queueFile,
      BROWSE_IDLE_TIMEOUT: '300',
    },
    stdio: ['ignore', 'pipe', 'pipe'],
  });
  // Wait for server
  const deadline = Date.now() + 15000;
  while (Date.now() < deadline) {
    if (fs.existsSync(stateFile)) {
      try {
        const state = JSON.parse(fs.readFileSync(stateFile, 'utf-8'));
        if (state.port && state.token) {
          serverPort = state.port;
          authToken = state.token;
          break;
        }
      } catch {}
    }
    await new Promise(r => setTimeout(r, 100));
  }
  if (!serverPort) throw new Error('Server did not start in time');
  // Start sidebar-agent with mock claude on PATH
  const agentScript = path.resolve(__dirname, '..', 'src', 'sidebar-agent.ts');
  agentProc = spawn(['bun', 'run', agentScript], {
    env: {
      ...process.env,
      PATH: `${mockBinDir}:${process.env.PATH}`,
      BROWSE_SERVER_PORT: String(serverPort),
      BROWSE_STATE_FILE: stateFile,
      SIDEBAR_QUEUE_PATH: queueFile,
      SIDEBAR_AGENT_TIMEOUT: '10000',
      BROWSE_BIN: 'browse',  // doesn't matter, mock claude doesn't use it
    },
    stdio: ['ignore', 'pipe', 'pipe'],
  });
  // Give sidebar-agent time to start polling
  await new Promise(r => setTimeout(r, 1000));
 }, 20000);
 afterAll(() => {
  if (agentProc) { try { agentProc.kill(); } catch {} }
  if (serverProc) { try { serverProc.kill(); } catch {} }
  try { fs.rmSync(tmpDir, { recursive: true, force: true }); } catch {}
 });
 describe('sidebar-agent round-trip', () => {
  test('full message round-trip with mock claude', async () => {
    await resetState();
    // Send a command
    const resp = await api('/sidebar-command', {
      method: 'POST',
      body: JSON.stringify({
        message: 'what is on this page?',
        activeTabUrl: 'https://example.com/test',
      }),
    });
    expect(resp.status).toBe(200);
    // Wait for mock claude to process and events to arrive
    const entries = await pollChatUntil(
      (entries) => entries.some((e: any) => e.type === 'agent_done'),
      15000,
    );
    // Verify the flow: user message → agent_start → text → agent_done
    const userEntry = entries.find((e: any) => e.role === 'user');
    expect(userEntry).toBeDefined();
    expect(userEntry.message).toBe('what is on this page?');
    // The mock claude outputs text — check for any agent text entry
    const textEntries = entries.filter((e: any) => e.role === 'agent' && (e.type === 'text' || e.type === 'result'));
    expect(textEntries.length).toBeGreaterThan(0);
    const doneEntry = entries.find((e: any) => e.type === 'agent_done');
    expect(doneEntry).toBeDefined();
    // Agent should be back to idle
    const session = await (await api('/sidebar-session')).json();
    expect(session.agent.status).toBe('idle');
  }, 20000);
  test('claude crash produces agent_error', async () => {
    await resetState();
    // Replace mock claude with one that crashes
    writeMockClaude(`#!/bin/bash
 echo '{"type":"system","session_id":"crash-test"}' >&2
 exit 1
 `);
    await api('/sidebar-command', {
      method: 'POST',
      body: JSON.stringify({ message: 'crash test' }),
    });
    // Wait for agent_done (sidebar-agent sends agent_done even on crash via proc.on('close'))
    const entries = await pollChatUntil(
      (entries) => entries.some((e: any) => e.type === 'agent_done' || e.type === 'agent_error'),
      15000,
    );
    // Agent should recover to idle
    const session = await (await api('/sidebar-session')).json();
    expect(session.agent.status).toBe('idle');
    // Restore working mock
    writeMockClaude(`#!/bin/bash
 echo '{"type":"assistant","message":{"content":[{"type":"text","text":"recovered"}]}}'
 `);
  }, 20000);
  test('sequential queue drain', async () => {
    await resetState();
    // Restore working mock
    writeMockClaude(`#!/bin/bash
 echo '{"type":"assistant","message":{"content":[{"type":"text","text":"response to: '"'"'$*'"'"'"}]}}'
 `);
    // Send two messages rapidly — first processes, second queues
    await api('/sidebar-command', {
      method: 'POST',
      body: JSON.stringify({ message: 'first message' }),
    });
    await api('/sidebar-command', {
      method: 'POST',
      body: JSON.stringify({ message: 'second message' }),
    });
    // Wait for both to complete (two agent_done events)
    const entries = await pollChatUntil(
      (entries) => entries.filter((e: any) => e.type === 'agent_done').length >= 2,
      20000,
    );
    // Both user messages should be in chat
    const userEntries = entries.filter((e: any) => e.role === 'user');
    expect(userEntries.length).toBeGreaterThanOrEqual(2);
  }, 25000);
 });
--- a/browse/test/sidebar-agent.test.ts
+++ b/browse/test/sidebar-agent.test.ts
@@ -1,562 +0,0 @@
 /**
 * Tests for sidebar agent queue parsing and inbox writing.
 *
 * sidebar-agent.ts functions are not exported (it's an entry-point script),
 * so we test the same logic inline: JSONL parsing, writeToInbox filesystem
 * behavior, and edge cases.
 */
 import { describe, test, expect, beforeEach, afterEach } from 'bun:test';
 import * as fs from 'fs';
 import * as path from 'path';
 import * as os from 'os';
 // ─── Helpers: replicate sidebar-agent logic for unit testing ──────
 /** Parse a single JSONL line — same logic as sidebar-agent poll() */
 function parseQueueLine(line: string): any | null {
  if (!line.trim()) return null;
  try {
    const entry = JSON.parse(line);
    if (!entry.message && !entry.prompt) return null;
    return entry;
  } catch {
    return null;
  }
 }
 /** Read all valid entries from a JSONL string — same as countLines + readLine loop */
 function parseQueueFile(content: string): any[] {
  const entries: any[] = [];
  const lines = content.split('\n').filter(Boolean);
  for (const line of lines) {
    const entry = parseQueueLine(line);
    if (entry) entries.push(entry);
  }
  return entries;
 }
 /** Write to inbox — extracted logic from sidebar-agent.ts writeToInbox() */
 function writeToInbox(
  gitRoot: string,
  message: string,
  pageUrl?: string,
  sessionId?: string,
 ): string | null {
  if (!gitRoot) return null;
  const inboxDir = path.join(gitRoot, '.context', 'sidebar-inbox');
  fs.mkdirSync(inboxDir, { recursive: true });
  const now = new Date();
  const timestamp = now.toISOString().replace(/:/g, '-');
  const filename = `${timestamp}-observation.json`;
  const tmpFile = path.join(inboxDir, `.${filename}.tmp`);
  const finalFile = path.join(inboxDir, filename);
  const inboxMessage = {
    type: 'observation',
    timestamp: now.toISOString(),
    page: { url: pageUrl || 'unknown', title: '' },
    userMessage: message,
    sidebarSessionId: sessionId || 'unknown',
  };
  fs.writeFileSync(tmpFile, JSON.stringify(inboxMessage, null, 2));
  fs.renameSync(tmpFile, finalFile);
  return finalFile;
 }
 /** Shorten paths — same logic as sidebar-agent.ts shorten() */
 function shorten(str: string): string {
  return str
    .replace(/\/Users\/[^/]+/g, '~')
    .replace(/\/conductor\/workspaces\/[^/]+\/[^/]+/g, '')
    .replace(/\.claude\/skills\/gstack\//g, '')
    .replace(/browse\/dist\/browse/g, '$B');
 }
 /** describeToolCall — replicated from sidebar-agent.ts for unit testing */
 function describeToolCall(tool: string, input: any): string {
  if (!input) return '';
  if (tool === 'Bash' && input.command) {
    const cmd = input.command;
    const browseMatch = cmd.match(/\$B\s+(\w+)|browse[^\s]*\s+(\w+)/);
    if (browseMatch) {
      const browseCmd = browseMatch[1] || browseMatch[2];
      const args = cmd.split(/\s+/).slice(2).join(' ');
      switch (browseCmd) {
        case 'goto': return `Opening ${args.replace(/['"]/g, '')}`;
        case 'snapshot': return args.includes('-i') ? 'Scanning for interactive elements' : args.includes('-D') ? 'Checking what changed' : 'Taking a snapshot of the page';
        case 'screenshot': return `Saving screenshot${args ? ` to ${shorten(args)}` : ''}`;
        case 'click': return `Clicking ${args}`;
        case 'fill': { const parts = args.split(/\s+/); return `Typing "${parts.slice(1).join(' ')}" into ${parts[0]}`; }
        case 'text': return 'Reading page text';
        case 'html': return args ? `Reading HTML of ${args}` : 'Reading full page HTML';
        case 'links': return 'Finding all links on the page';
        case 'forms': return 'Looking for forms';
        case 'console': return 'Checking browser console for errors';
        case 'network': return 'Checking network requests';
        case 'url': return 'Checking current URL';
        case 'back': return 'Going back';
        case 'forward': return 'Going forward';
        case 'reload': return 'Reloading the page';
        case 'scroll': return args ? `Scrolling to ${args}` : 'Scrolling down';
        case 'wait': return `Waiting for ${args}`;
        case 'inspect': return args ? `Inspecting CSS of ${args}` : 'Getting CSS for last picked element';
        case 'style': return `Changing CSS: ${args}`;
        case 'cleanup': return 'Removing page clutter (ads, popups, banners)';
        case 'prettyscreenshot': return 'Taking a clean screenshot';
        case 'css': return `Checking CSS property: ${args}`;
        case 'is': return `Checking if element is ${args}`;
        case 'diff': return `Comparing ${args}`;
        case 'responsive': return 'Taking screenshots at mobile, tablet, and desktop sizes';
        case 'status': return 'Checking browser status';
        case 'tabs': return 'Listing open tabs';
        case 'focus': return 'Bringing browser to front';
        case 'select': return `Selecting option in ${args}`;
        case 'hover': return `Hovering over ${args}`;
        case 'viewport': return `Setting viewport to ${args}`;
        case 'upload': return `Uploading file to ${args.split(/\s+/)[0]}`;
        default: return `Running browse ${browseCmd} ${args}`.trim();
      }
    }
    if (cmd.includes('git ')) return `Running: ${shorten(cmd)}`;
    let short = shorten(cmd);
    return short.length > 100 ? short.slice(0, 100) + '…' : short;
  }
  if (tool === 'Read' && input.file_path) return `Reading ${shorten(input.file_path)}`;
  if (tool === 'Edit' && input.file_path) return `Editing ${shorten(input.file_path)}`;
  if (tool === 'Write' && input.file_path) return `Writing ${shorten(input.file_path)}`;
  if (tool === 'Grep' && input.pattern) return `Searching for "${input.pattern}"`;
  if (tool === 'Glob' && input.pattern) return `Finding files matching ${input.pattern}`;
  try { return shorten(JSON.stringify(input)).slice(0, 80); } catch { return ''; }
 }
 // ─── Test setup ──────────────────────────────────────────────────
 let tmpDir: string;
 beforeEach(() => {
  tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), 'sidebar-agent-test-'));
 });
 afterEach(() => {
  fs.rmSync(tmpDir, { recursive: true, force: true });
 });
 // ─── Queue File Parsing ─────────────────────────────────────────
 describe('queue file parsing', () => {
  test('valid JSONL line parsed correctly', () => {
    const line = JSON.stringify({ message: 'hello', prompt: 'check this', pageUrl: 'https://example.com' });
    const entry = parseQueueLine(line);
    expect(entry).not.toBeNull();
    expect(entry.message).toBe('hello');
    expect(entry.prompt).toBe('check this');
    expect(entry.pageUrl).toBe('https://example.com');
  });
  test('malformed JSON line skipped without crash', () => {
    const entry = parseQueueLine('this is not json {{{');
    expect(entry).toBeNull();
  });
  test('valid JSON without message or prompt is skipped', () => {
    const line = JSON.stringify({ foo: 'bar' });
    const entry = parseQueueLine(line);
    expect(entry).toBeNull();
  });
  test('empty file returns no entries', () => {
    const entries = parseQueueFile('');
    expect(entries).toEqual([]);
  });
  test('file with blank lines returns no entries', () => {
    const entries = parseQueueFile('\n\n\n');
    expect(entries).toEqual([]);
  });
  test('mixed valid and invalid lines', () => {
    const content = [
      JSON.stringify({ message: 'first' }),
      'not json',
      JSON.stringify({ unrelated: true }),
      JSON.stringify({ message: 'second', prompt: 'do stuff' }),
    ].join('\n');
    const entries = parseQueueFile(content);
    expect(entries.length).toBe(2);
    expect(entries[0].message).toBe('first');
    expect(entries[1].message).toBe('second');
  });
 });
 // ─── writeToInbox ────────────────────────────────────────────────
 describe('writeToInbox', () => {
  test('creates .context/sidebar-inbox/ directory', () => {
    writeToInbox(tmpDir, 'test message');
    const inboxDir = path.join(tmpDir, '.context', 'sidebar-inbox');
    expect(fs.existsSync(inboxDir)).toBe(true);
    expect(fs.statSync(inboxDir).isDirectory()).toBe(true);
  });
  test('writes valid JSON file', () => {
    const filePath = writeToInbox(tmpDir, 'test message', 'https://example.com', 'session-123');
    expect(filePath).not.toBeNull();
    expect(fs.existsSync(filePath!)).toBe(true);
    const data = JSON.parse(fs.readFileSync(filePath!, 'utf-8'));
    expect(data.type).toBe('observation');
    expect(data.userMessage).toBe('test message');
    expect(data.page.url).toBe('https://example.com');
    expect(data.sidebarSessionId).toBe('session-123');
    expect(data.timestamp).toBeTruthy();
  });
  test('atomic write — final file exists, no .tmp left', () => {
    const filePath = writeToInbox(tmpDir, 'atomic test');
    expect(filePath).not.toBeNull();
    expect(fs.existsSync(filePath!)).toBe(true);
    // Check no .tmp files remain in the inbox directory
    const inboxDir = path.join(tmpDir, '.context', 'sidebar-inbox');
    const files = fs.readdirSync(inboxDir);
    const tmpFiles = files.filter(f => f.endsWith('.tmp'));
    expect(tmpFiles.length).toBe(0);
    // Final file should end with -observation.json
    const jsonFiles = files.filter(f => f.endsWith('-observation.json') && !f.startsWith('.'));
    expect(jsonFiles.length).toBe(1);
  });
  test('handles missing git root gracefully', () => {
    const result = writeToInbox('', 'test');
    expect(result).toBeNull();
  });
  test('defaults pageUrl to unknown when not provided', () => {
    const filePath = writeToInbox(tmpDir, 'no url provided');
    expect(filePath).not.toBeNull();
    const data = JSON.parse(fs.readFileSync(filePath!, 'utf-8'));
    expect(data.page.url).toBe('unknown');
  });
  test('defaults sessionId to unknown when not provided', () => {
    const filePath = writeToInbox(tmpDir, 'no session');
    expect(filePath).not.toBeNull();
    const data = JSON.parse(fs.readFileSync(filePath!, 'utf-8'));
    expect(data.sidebarSessionId).toBe('unknown');
  });
  test('multiple writes create separate files', () => {
    writeToInbox(tmpDir, 'message 1');
    // Tiny delay to ensure different timestamps
    const t = Date.now();
    while (Date.now() === t) {} // spin until next ms
    writeToInbox(tmpDir, 'message 2');
    const inboxDir = path.join(tmpDir, '.context', 'sidebar-inbox');
    const files = fs.readdirSync(inboxDir).filter(f => f.endsWith('.json') && !f.startsWith('.'));
    expect(files.length).toBe(2);
  });
 });
 // ─── describeToolCall (verbose narration) ────────────────────────
 describe('describeToolCall', () => {
  // Browse navigation commands
  test('goto → plain English with URL', () => {
    const result = describeToolCall('Bash', { command: '$B goto https://example.com' });
    expect(result).toBe('Opening https://example.com');
  });
  test('goto strips quotes from URL', () => {
    const result = describeToolCall('Bash', { command: '$B goto "https://example.com"' });
    expect(result).toBe('Opening https://example.com');
  });
  test('url → checking current URL', () => {
    expect(describeToolCall('Bash', { command: '$B url' })).toBe('Checking current URL');
  });
  test('back/forward/reload → plain English', () => {
    expect(describeToolCall('Bash', { command: '$B back' })).toBe('Going back');
    expect(describeToolCall('Bash', { command: '$B forward' })).toBe('Going forward');
    expect(describeToolCall('Bash', { command: '$B reload' })).toBe('Reloading the page');
  });
  // Snapshot variants
  test('snapshot -i → scanning for interactive elements', () => {
    expect(describeToolCall('Bash', { command: '$B snapshot -i' })).toBe('Scanning for interactive elements');
  });
  test('snapshot -D → checking what changed', () => {
    expect(describeToolCall('Bash', { command: '$B snapshot -D' })).toBe('Checking what changed');
  });
  test('snapshot (plain) → taking a snapshot', () => {
    expect(describeToolCall('Bash', { command: '$B snapshot' })).toBe('Taking a snapshot of the page');
  });
  // Interaction commands
  test('click → clicking element', () => {
    expect(describeToolCall('Bash', { command: '$B click @e3' })).toBe('Clicking @e3');
  });
  test('fill → typing into element', () => {
    expect(describeToolCall('Bash', { command: '$B fill @e4 "hello world"' })).toBe('Typing ""hello world"" into @e4');
  });
  test('scroll with selector → scrolling to element', () => {
    expect(describeToolCall('Bash', { command: '$B scroll .footer' })).toBe('Scrolling to .footer');
  });
  test('scroll without args → scrolling down', () => {
    expect(describeToolCall('Bash', { command: '$B scroll' })).toBe('Scrolling down');
  });
  // Reading commands
  test('text → reading page text', () => {
    expect(describeToolCall('Bash', { command: '$B text' })).toBe('Reading page text');
  });
  test('html with selector → reading HTML of element', () => {
    expect(describeToolCall('Bash', { command: '$B html .header' })).toBe('Reading HTML of .header');
  });
  test('html without selector → reading full page HTML', () => {
    expect(describeToolCall('Bash', { command: '$B html' })).toBe('Reading full page HTML');
  });
  test('links → finding all links', () => {
    expect(describeToolCall('Bash', { command: '$B links' })).toBe('Finding all links on the page');
  });
  test('console → checking console', () => {
    expect(describeToolCall('Bash', { command: '$B console' })).toBe('Checking browser console for errors');
  });
  // Inspector commands
  test('inspect with selector → inspecting CSS', () => {
    expect(describeToolCall('Bash', { command: '$B inspect .header' })).toBe('Inspecting CSS of .header');
  });
  test('inspect without args → getting last picked element', () => {
    expect(describeToolCall('Bash', { command: '$B inspect' })).toBe('Getting CSS for last picked element');
  });
  test('style → changing CSS', () => {
    expect(describeToolCall('Bash', { command: '$B style .header color red' })).toBe('Changing CSS: .header color red');
  });
  test('cleanup → removing page clutter', () => {
    expect(describeToolCall('Bash', { command: '$B cleanup --all' })).toBe('Removing page clutter (ads, popups, banners)');
  });
  // Visual commands
  test('screenshot → saving screenshot', () => {
    expect(describeToolCall('Bash', { command: '$B screenshot /tmp/shot.png' })).toBe('Saving screenshot to /tmp/shot.png');
  });
  test('screenshot without path', () => {
    expect(describeToolCall('Bash', { command: '$B screenshot' })).toBe('Saving screenshot');
  });
  test('responsive → multi-size screenshots', () => {
    expect(describeToolCall('Bash', { command: '$B responsive' })).toBe('Taking screenshots at mobile, tablet, and desktop sizes');
  });
  // Non-browse tools
  test('Read tool → reading file', () => {
    expect(describeToolCall('Read', { file_path: '/Users/foo/project/src/app.ts' })).toBe('Reading ~/project/src/app.ts');
  });
  test('Grep tool → searching for pattern', () => {
    expect(describeToolCall('Grep', { pattern: 'handleClick' })).toBe('Searching for "handleClick"');
  });
  test('Glob tool → finding files', () => {
    expect(describeToolCall('Glob', { pattern: '**/*.tsx' })).toBe('Finding files matching **/*.tsx');
  });
  test('Edit tool → editing file', () => {
    expect(describeToolCall('Edit', { file_path: '/Users/foo/src/main.ts' })).toBe('Editing ~/src/main.ts');
  });
  // Edge cases
  test('null input → empty string', () => {
    expect(describeToolCall('Bash', null)).toBe('');
  });
  test('unknown browse command → generic description', () => {
    expect(describeToolCall('Bash', { command: '$B newtab https://foo.com' })).toContain('newtab');
  });
  test('non-browse bash → shortened command', () => {
    expect(describeToolCall('Bash', { command: 'echo hello' })).toBe('echo hello');
  });
  test('full browse binary path recognized', () => {
    const result = describeToolCall('Bash', { command: '/Users/garrytan/.claude/skills/gstack/browse/dist/browse goto https://example.com' });
    expect(result).toBe('Opening https://example.com');
  });
  test('tab command → switching tab', () => {
    expect(describeToolCall('Bash', { command: '$B tab 2' })).toContain('tab');
  });
 });
 // ─── Per-tab agent concurrency (source code validation) ──────────
 describe('per-tab agent concurrency', () => {
  const serverSrc = fs.readFileSync(path.join(__dirname, '..', 'src', 'server.ts'), 'utf-8');
  const agentSrc = fs.readFileSync(path.join(__dirname, '..', 'src', 'sidebar-agent.ts'), 'utf-8');
  test('server has per-tab agent state map', () => {
    expect(serverSrc).toContain('tabAgents');
    expect(serverSrc).toContain('TabAgentState');
    expect(serverSrc).toContain('getTabAgent');
  });
  test('server returns per-tab agent status in /sidebar-chat', () => {
    expect(serverSrc).toContain('getTabAgentStatus');
    expect(serverSrc).toContain('tabAgentStatus');
  });
  test('spawnClaude accepts forTabId parameter', () => {
    const spawnFn = serverSrc.slice(
      serverSrc.indexOf('function spawnClaude('),
      serverSrc.indexOf('\nfunction ', serverSrc.indexOf('function spawnClaude(') + 1),
    );
    expect(spawnFn).toContain('forTabId');
    expect(spawnFn).toContain('tabState.status');
  });
  test('sidebar-command endpoint uses per-tab agent state', () => {
    expect(serverSrc).toContain('msgTabId');
    expect(serverSrc).toContain('tabState.status');
    expect(serverSrc).toContain('tabState.queue');
  });
  test('agent event handler resets per-tab state', () => {
    expect(serverSrc).toContain('eventTabId');
    expect(serverSrc).toContain('tabState.status = \'idle\'');
  });
  test('agent event handler processes per-tab queue', () => {
    // After agent_done, should process next message from THIS tab's queue
    expect(serverSrc).toContain('tabState.queue.length > 0');
    expect(serverSrc).toContain('tabState.queue.shift');
  });
  test('sidebar-agent uses per-tab processing set', () => {
    expect(agentSrc).toContain('processingTabs');
    expect(agentSrc).not.toContain('isProcessing');
  });
  test('sidebar-agent sends tabId with all events', () => {
    // sendEvent should accept tabId parameter
    expect(agentSrc).toContain('async function sendEvent(event: Record<string, any>, tabId?: number)');
    // askClaude destructures tabId from queue entry (regex tolerates
    // additional fields like `canary` and `pageUrl` from security module).
    expect(agentSrc).toMatch(
      /const \{[^}]*\bprompt\b[^}]*\bargs\b[^}]*\bstateFile\b[^}]*\bcwd\b[^}]*\btabId\b[^}]*\}/
    );
  });
  test('sidebar-agent allows concurrent agents across tabs', () => {
    // poll() should not block globally — it should check per-tab
    expect(agentSrc).toContain('processingTabs.has(tid)');
    // askClaude should be fire-and-forget (no await blocking the loop)
    expect(agentSrc).toContain('askClaude(entry).catch');
  });
  test('queue entries include tabId', () => {
    const spawnFn = serverSrc.slice(
      serverSrc.indexOf('function spawnClaude('),
      serverSrc.indexOf('\nfunction ', serverSrc.indexOf('function spawnClaude(') + 1),
    );
    expect(spawnFn).toContain('tabId: agentTabId');
  });
  test('health check monitors all per-tab agents', () => {
    expect(serverSrc).toContain('for (const [tid, state] of tabAgents)');
  });
 });
 describe('BROWSE_TAB tab pinning (cross-tab isolation)', () => {
  const serverSrc = fs.readFileSync(path.join(__dirname, '..', 'src', 'server.ts'), 'utf-8');
  const agentSrc = fs.readFileSync(path.join(__dirname, '..', 'src', 'sidebar-agent.ts'), 'utf-8');
  const cliSrc = fs.readFileSync(path.join(__dirname, '..', 'src', 'cli.ts'), 'utf-8');
  test('sidebar-agent passes BROWSE_TAB env var to claude process', () => {
    // The env block should include BROWSE_TAB set to the tab ID
    expect(agentSrc).toContain('BROWSE_TAB');
    expect(agentSrc).toContain('String(tid)');
  });
  test('CLI reads BROWSE_TAB and sends tabId in command body', () => {
    // BROWSE_TAB env var is still honored (sidebar-agent path). After the
    // make-pdf refactor, the CLI layer now also accepts --tab-id <N>, with
    // the CLI flag taking precedence over the env var. Both resolve to the
    // same `tabId` body field.
    expect(cliSrc).toContain('process.env.BROWSE_TAB');
    expect(cliSrc).toContain('parseInt(envTab, 10)');
  });
  test('handleCommandInternal accepts tabId from request body', () => {
    const handleFn = serverSrc.slice(
      serverSrc.indexOf('async function handleCommandInternal('),
      serverSrc.indexOf('\n/** HTTP wrapper', serverSrc.indexOf('async function handleCommandInternal(') + 1) > 0
        ? serverSrc.indexOf('\n/** HTTP wrapper', serverSrc.indexOf('async function handleCommandInternal(') + 1)
        : serverSrc.indexOf('\nasync function ', serverSrc.indexOf('async function handleCommandInternal(') + 200),
    );
    // Should destructure tabId from body
    expect(handleFn).toContain('tabId');
    // Should save and restore the active tab
    expect(handleFn).toContain('savedTabId');
    expect(handleFn).toContain('switchTab(tabId');
  });
  test('handleCommandInternal restores active tab after command (success path)', () => {
    // On success, should restore savedTabId without stealing focus
    const handleFn = serverSrc.slice(
      serverSrc.indexOf('async function handleCommandInternal('),
      serverSrc.length,
    );
    // Count restore calls — should appear in both success and error paths
    const restoreCount = (handleFn.match(/switchTab\(savedTabId/g) || []).length;
    expect(restoreCount).toBeGreaterThanOrEqual(2); // success + error paths
  });
  test('handleCommandInternal restores active tab on error path', () => {
    // The catch block should also restore
    const catchBlock = serverSrc.slice(
      serverSrc.indexOf('} catch (err: any) {', serverSrc.indexOf('async function handleCommandInternal(')),
    );
    expect(catchBlock).toContain('switchTab(savedTabId');
  });
  test('tab pinning only activates when tabId is provided', () => {
    const handleFn = serverSrc.slice(
      serverSrc.indexOf('async function handleCommandInternal('),
      serverSrc.indexOf('try {', serverSrc.indexOf('async function handleCommandInternal(') + 1),
    );
    // Should check tabId is not undefined/null before switching
    expect(handleFn).toContain('tabId !== undefined');
    expect(handleFn).toContain('tabId !== null');
  });
  test('CLI only sends tabId when it is a valid number', () => {
    // Body should conditionally include tabId. Historically that was keyed off
    // the BROWSE_TAB env var. After the make-pdf refactor, the CLI also honors
    // a --tab-id <N> flag on the CLI itself, so the check is "tabId defined
    // AND not NaN" rather than literally inspecting the env var.
    expect(cliSrc).toContain('tabId !== undefined && !isNaN(tabId)');
  });
 });
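Several of the source-level tests in this diff carve a function body out of a file with `src.slice(src.indexOf(start), src.indexOf(end))`. When a marker is missing, `indexOf` returns -1 and `slice(-1)` yields only the last character, so a following `not.toContain` assertion can pass vacuously. A minimal sketch of a stricter helper is shown below; `sliceBetween` and the `sample` string are hypothetical names used for illustration and are not part of the repository.

```typescript
// Hypothetical helper (an assumption, not code from this PR): return the text
// from startMarker up to (not including) endMarker, and fail loudly when
// either marker cannot be found instead of silently slicing from -1.
function sliceBetween(src: string, startMarker: string, endMarker?: string): string {
  const start = src.indexOf(startMarker);
  if (start === -1) {
    throw new Error(`start marker not found: ${startMarker}`);
  }
  // Without an end marker, return everything from the start marker onward.
  if (endMarker === undefined) return src.slice(start);
  // Search for the end marker only after the start marker itself.
  const end = src.indexOf(endMarker, start + startMarker.length);
  if (end === -1) {
    throw new Error(`end marker not found: ${endMarker}`);
  }
  return src.slice(start, end);
}

// Illustrative input standing in for a source file read with fs.readFileSync.
const sample = [
  'function spawnClaude(forTabId) {',
  '  tabState.status = "busy";',
  '}',
  'function other() {}',
].join('\n');

// Extract only the body of spawnClaude, stopping at the next top-level function.
const spawnFn = sliceBetween(sample, 'function spawnClaude(', '\nfunction ');
```

With a helper like this, a renamed or deleted function makes the test throw rather than pass against a one-character slice, which seems closer to what the structural assertions are meant to guarantee.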
--- a/browse/test/sidebar-tabs.test.ts
+++ b/browse/test/sidebar-tabs.test.ts
@@ -1,26 +1,15 @@
 /**
- * Regression: changing the default sidebar tab to Terminal must NOT break
- * the existing Chat path or the debug-tab return-to logic.
+ * Regression: sidebar layout invariants after the chat-tab rip.
 *
- * Original /plan-eng-review Issue 3A asked for a Playwright + extension
- * E2E test. The codebase doesn't ship Playwright extension launcher
- * infrastructure (extension tests here are source-level), so this regression
- * is implemented as a structural assertion suite over the extension files.
- * That's enough to lock the load-bearing invariants:
+ * The Chrome side panel used to host two surfaces: Chat (one-shot
+ * `claude -p` queue) and Terminal (interactive PTY). Chat was ripped
+ * once the PTY proved out — sidebar-agent.ts is gone, the chat queue
+ * endpoints are gone, and the primary-tab nav (Terminal | Chat) is
+ * gone. Terminal is now the sole primary surface.
 *
- *   1. Terminal is the default-active primary tab.
- *   2. Chat exists as a non-active primary tab.
- *   3. The xterm assets are loaded.
- *   4. The debug-close path no longer hardcodes `tab-chat` (uses the
- *      activePrimaryPaneId helper that respects whichever primary tab
- *      the user has selected).
- *   5. Manifest declares the ws://127.0.0.1 host permission so MV3
- *      doesn't block the WebSocket upgrade.
- *   6. The chat surface (chat-messages, chat input wiring) still exists
- *      and was not accidentally deleted alongside the default-tab change.
- *
- * If a future refactor regresses any of these, this test fails BEFORE the
- * change ships.
+ * This file locks the load-bearing invariants of that layout so a
+ * future refactor can't silently re-introduce the old surface or break
+ * the new one.
 */
 import { describe, test, expect } from 'bun:test';
@@ -32,84 +21,220 @@ const JS = fs.readFileSync(path.join(import.meta.dir, '../../extension/sidepanel
 const TERM_JS = fs.readFileSync(path.join(import.meta.dir, '../../extension/sidepanel-terminal.js'), 'utf-8');
 const MANIFEST = JSON.parse(fs.readFileSync(path.join(import.meta.dir, '../../extension/manifest.json'), 'utf-8'));
-describe('sidebar tabs regression: Terminal is default, Chat survives', () => {
-  test('primary tab bar declares Terminal and Chat with Terminal active', () => {
-    // Terminal is the active button.
-    expect(HTML).toMatch(/<button[^>]*class="primary-tab active"[^>]*data-pane="terminal"/);
-    // Chat is a primary tab, present and non-active.
-    expect(HTML).toMatch(/<button[^>]*class="primary-tab"[^>]*data-pane="chat"/);
+describe('sidebar: chat tab + nav are removed, Terminal is sole primary surface', () => {
+  test('No primary-tab nav element exists', () => {
+    expect(HTML).not.toContain('class="primary-tabs"');
+    expect(HTML).not.toContain('data-pane="chat"');
+    expect(HTML).not.toContain('data-pane="terminal"');
  });
-  test('Terminal pane is active and Chat pane is not active', () => {
-    // tab-terminal has the .active class on its <main>.
-    expect(HTML).toMatch(/<main id="tab-terminal" class="tab-content active"/);
-    // tab-chat is present but NOT active.
-    expect(HTML).toMatch(/<main id="tab-chat" class="tab-content"(?! active)/);
+  test('No <main id="tab-chat"> pane', () => {
+    expect(HTML).not.toMatch(/<main[^>]*id="tab-chat"/);
+    expect(HTML).not.toContain('id="chat-messages"');
+    expect(HTML).not.toContain('id="chat-loading"');
+    expect(HTML).not.toContain('id="chat-welcome"');
  });
-  test('xterm assets are loaded for the Terminal pane', () => {
-    expect(HTML).toContain('lib/xterm.css');
-    expect(HTML).toContain('lib/xterm.js');
-    expect(HTML).toContain('lib/xterm-addon-fit.js');
-    expect(HTML).toContain('sidepanel-terminal.js');
+  test('No chat input / send button / experimental banner', () => {
+    expect(HTML).not.toContain('class="command-bar"');
+    expect(HTML).not.toContain('id="command-input"');
+    expect(HTML).not.toContain('id="send-btn"');
+    expect(HTML).not.toContain('id="stop-agent-btn"');
+    expect(HTML).not.toContain('id="experimental-banner"');
  });
-  test('chat surface still exists (no accidental deletion)', () => {
-    // The chat input and chat-messages containers are load-bearing for the
-    // existing sidebar-agent flow. If the default-tab change accidentally
-    // removed them, this catches it before users do.
-    expect(HTML).toContain('id="chat-messages"');
-    expect(HTML).toContain('id="chat-loading"');
+  test('No clear-chat button in footer', () => {
+    expect(HTML).not.toContain('id="clear-chat"');
  });
-  test('debug-close path no longer hardcodes tab-chat', () => {
-    // Before the Terminal default flip, sidepanel.js had two literal
-    // `getElementById('tab-chat').classList.add('active')` calls inside the
-    // debug-close handlers. Both must now go through activePrimaryPaneId()
-    // so closing debug returns to whichever primary tab is selected.
-    expect(JS).toContain('function activePrimaryPaneId');
-    // Old hardcoded form is gone (don't ban the string everywhere — there
-    // are legitimate references elsewhere in the file).
-    const debugToggleBlock = JS.slice(
-      JS.indexOf("debugToggle.addEventListener('click'"),
-      JS.indexOf("closeDebug.addEventListener('click'"),
-    );
-    expect(debugToggleBlock).not.toContain("'tab-chat'");
-    expect(debugToggleBlock).toContain('activePrimaryPaneId');
+  test('Terminal pane is .active by default and has the toolbar', () => {
+    expect(HTML).toMatch(/<main[^>]*id="tab-terminal"[^>]*class="tab-content active"/);
+    expect(HTML).toContain('id="terminal-toolbar"');
+    expect(HTML).toContain('id="terminal-restart-now"');
  });
-  test('primary-tab click handler exists and toggles classes', () => {
-    expect(JS).toContain("querySelectorAll('.primary-tab')");
-    expect(JS).toContain('aria-selected');
+  test('Quick-actions buttons (Cleanup / Screenshot / Cookies) survive in the terminal toolbar', () => {
+    // Garry explicitly wanted these kept after the chat rip — they drive
+    // browser actions, not chat.
+    expect(HTML).toContain('id="chat-cleanup-btn"');
+    expect(HTML).toContain('id="chat-screenshot-btn"');
+    expect(HTML).toContain('id="chat-cookies-btn"');
+    // They live inside the terminal toolbar now (siblings of the Restart
+    // button), not as a separate strip below all panes.
+    const toolbarStart = HTML.indexOf('id="terminal-toolbar"');
+    const toolbarEnd = HTML.indexOf('</div>', toolbarStart);
+    const toolbarBlock = HTML.slice(toolbarStart, toolbarEnd + 6);
+    expect(toolbarBlock).toContain('id="chat-cleanup-btn"');
+    expect(toolbarBlock).toContain('id="chat-screenshot-btn"');
+    expect(toolbarBlock).toContain('id="chat-cookies-btn"');
  });
 });
-describe('sidebar terminal: lazy spawn + auth chain', () => {
-  test('terminal JS waits for first key to start (lazy-spawn)', () => {
-    expect(TERM_JS).toContain('function onAnyKey');
-    expect(TERM_JS).toContain('terminalActive');
-    expect(TERM_JS).toContain('connect()');
+describe('sidepanel.js: chat helpers ripped, terminal-injection helper survives', () => {
+  test('No primary-tab click handler', () => {
+    expect(JS).not.toContain("querySelectorAll('.primary-tab')");
+    expect(JS).not.toContain('activePrimaryPaneId');
  });
-  test('terminal JS does NOT auto-reconnect on close (codex finding #8)', () => {
-    // Close handler transitions to ENDED and shows a restart button,
-    // not a reconnect timer.
-    const closeBlock = TERM_JS.slice(TERM_JS.indexOf("addEventListener('close'"));
-    expect(closeBlock).toContain('ENDED');
-    // Forbid bare setTimeout(...connect... patterns inside this file's
-    // close handler — would indicate auto-reconnect crept back in.
-    expect(TERM_JS).not.toMatch(/close[\s\S]{0,200}setTimeout\([^)]*connect/);
+  test('No chat polling, sendMessage, sendChat, stopAgent, or pollTabs', () => {
+    expect(JS).not.toContain('chatPollInterval');
+    expect(JS).not.toContain('function sendMessage');
+    expect(JS).not.toContain('function pollChat');
+    expect(JS).not.toContain('function pollTabs');
+    expect(JS).not.toContain('function switchChatTab');
+    expect(JS).not.toContain('function stopAgent');
+    expect(JS).not.toContain('function applyChatEnabled');
+    expect(JS).not.toContain('function showSecurityBanner');
  });
-  test('terminal JS reaches /pty-session with the bootstrap auth token', () => {
-    expect(TERM_JS).toContain('/pty-session');
-    expect(TERM_JS).toContain('Bearer ${token}');
-    expect(TERM_JS).toContain('credentials');
+  test('Cleanup runs through the live PTY (no /sidebar-command POST)', () => {
+    // The new Cleanup handler injects the prompt straight into claude's
+    // PTY via gstackInjectToTerminal. The dead code path was a POST to
+    // /sidebar-command which kicked off a fresh claude -p subprocess.
+    const cleanup = JS.slice(JS.indexOf('async function runCleanup'));
+    expect(cleanup).toContain('window.gstackInjectToTerminal');
+    expect(cleanup).not.toContain('/sidebar-command');
+    expect(cleanup).not.toContain('addChatEntry');
  });
-  test('terminal JS opens ws://127.0.0.1 (not wss)', () => {
-    expect(TERM_JS).toContain('new WebSocket(`ws://127.0.0.1:');
-    // Origin is implicit (browser sets chrome-extension://<id>); no manual override.
+  test('Inspector "Send to Code" routes through the live PTY', () => {
+    const sendBtn = JS.slice(JS.indexOf('inspectorSendBtn.addEventListener'));
+    expect(sendBtn).toContain('window.gstackInjectToTerminal');
+    expect(sendBtn).not.toContain("type: 'sidebar-command'");
  });
+  test('updateConnection no longer kicks off chat / tab polling', () => {
+    const update = JS.slice(JS.indexOf('function updateConnection'), JS.indexOf('function updateConnection') + 1500);
+    expect(update).not.toContain('chatPollInterval');
+    expect(update).not.toContain('tabPollInterval');
+    expect(update).not.toContain('pollChat');
+    expect(update).not.toContain('pollTabs');
+    // BUT must still expose the bootstrap globals for sidepanel-terminal.js.
+    expect(update).toContain('window.gstackServerPort');
+    expect(update).toContain('window.gstackAuthToken');
+  });
 });
+describe('sidepanel-terminal.js: eager auto-connect + injection API', () => {
+  test('Exposes window.gstackInjectToTerminal for cross-pane use', () => {
+    expect(TERM_JS).toContain('window.gstackInjectToTerminal');
+    // Returns false when no live session, true when bytes go out.
+    const inject = TERM_JS.slice(TERM_JS.indexOf('window.gstackInjectToTerminal'));
+    expect(inject).toContain('return false');
+    expect(inject).toContain('return true');
+    expect(inject).toContain('ws.readyState !== WebSocket.OPEN');
+  });
+  test('Auto-connects on init (no keypress required)', () => {
+    expect(TERM_JS).not.toContain('function onAnyKey');
+    expect(TERM_JS).not.toContain("addEventListener('keydown'");
+    expect(TERM_JS).toContain('function tryAutoConnect');
+  });
+  test('Repaint hook fires when Terminal pane becomes visible', () => {
+    // The chat-tab rip removed gstack:primary-tab-changed; we use a
+    // MutationObserver on #tab-terminal's class attr instead. The
+    // observer must call repaintIfLive when the .active class returns.
+    expect(TERM_JS).toContain('MutationObserver');
+    expect(TERM_JS).toContain("attributeFilter: ['class']");
+    expect(TERM_JS).toContain('repaintIfLive');
+    const repaint = TERM_JS.slice(TERM_JS.indexOf('function repaintIfLive'));
+    expect(repaint).toContain('fitAddon && fitAddon.fit()');
+    expect(repaint).toContain('term.refresh');
+    expect(repaint).toContain("type: 'resize'");
+  });
+  test('No auto-reconnect on close (Restart is user-initiated)', () => {
+    const closeOnly = TERM_JS.slice(
+      TERM_JS.indexOf("ws.addEventListener('close'"),
+      TERM_JS.indexOf("ws.addEventListener('error'"),
+    );
+    expect(closeOnly).not.toContain('setTimeout');
+    expect(closeOnly).not.toContain('tryAutoConnect');
+    expect(closeOnly).not.toContain('connect()');
+  });
+  test('forceRestart helper closes ws, disposes xterm, returns to IDLE', () => {
+    expect(TERM_JS).toContain('function forceRestart');
+    const fn = TERM_JS.slice(TERM_JS.indexOf('function forceRestart'));
+    expect(fn).toContain('ws && ws.close()');
+    expect(fn).toContain('term.dispose()');
+    expect(fn).toContain('STATE.IDLE');
+    expect(fn).toContain('tryAutoConnect()');
+  });
+  test('Both restart buttons (mid-session and ENDED) call forceRestart', () => {
+    expect(TERM_JS).toContain("els.restart?.addEventListener('click', forceRestart)");
+    expect(TERM_JS).toContain("els.restartNow?.addEventListener('click', forceRestart)");
+  });
+});
+describe('server.ts: chat / sidebar-agent endpoints are gone', () => {
+  const SERVER_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/server.ts'), 'utf-8');
+  test('No /sidebar-command, /sidebar-chat, /sidebar-agent/* routes', () => {
+    expect(SERVER_SRC).not.toMatch(/url\.pathname === ['"]\/sidebar-command['"]/);
+    expect(SERVER_SRC).not.toMatch(/url\.pathname === ['"]\/sidebar-chat['"]/);
    expect(SERVER_SRC).not.toMatch(/url\.pathname\.startsWith\(['"]\/sidebar-agent\//);
    expect(SERVER_SRC).not.toMatch(/url\.pathname === ['"]\/sidebar-agent\/event['"]/);
    expect(SERVER_SRC).not.toMatch(/url\.pathname === ['"]\/sidebar-tabs['"]/);
    expect(SERVER_SRC).not.toMatch(/url\.pathname === ['"]\/sidebar-session['"]/);
  });
  test('No chat-related state declarations or helpers', () => {
    // Allow the symbol names inside the rip-marker comments — but no
    // `let`, `const`, `function`, or `interface` declarations of them.
    expect(SERVER_SRC).not.toMatch(/^let agentProcess/m);
    expect(SERVER_SRC).not.toMatch(/^let agentStatus/m);
    expect(SERVER_SRC).not.toMatch(/^let messageQueue/m);
    expect(SERVER_SRC).not.toMatch(/^let sidebarSession/m);
    expect(SERVER_SRC).not.toMatch(/^const tabAgents/m);
    expect(SERVER_SRC).not.toMatch(/^function pickSidebarModel/m);
    expect(SERVER_SRC).not.toMatch(/^function processAgentEvent/m);
    expect(SERVER_SRC).not.toMatch(/^function killAgent/m);
    expect(SERVER_SRC).not.toMatch(/^function addChatEntry/m);
    expect(SERVER_SRC).not.toMatch(/^interface ChatEntry/m);
    expect(SERVER_SRC).not.toMatch(/^interface SidebarSession/m);
  });
  test('/health no longer surfaces agentStatus or messageQueue length', () => {
    const health = SERVER_SRC.slice(SERVER_SRC.indexOf("url.pathname === '/health'"));
    const slice = health.slice(0, 2000);
    expect(slice).not.toContain('agentStatus');
    expect(slice).not.toContain('messageQueue');
    expect(slice).not.toContain('agentStartTime');
    // chatEnabled is hardcoded false now (older clients still see the field).
    expect(slice).toMatch(/chatEnabled:\s*false/);
    // terminalPort survives.
    expect(slice).toContain('terminalPort');
  });
 });
 describe('cli.ts: sidebar-agent is no longer spawned', () => {
  const CLI_SRC = fs.readFileSync(path.join(import.meta.dir, '../src/cli.ts'), 'utf-8');
  test('No Bun.spawn of sidebar-agent.ts', () => {
    expect(CLI_SRC).not.toMatch(/Bun\.spawn\(\s*\['bun',\s*'run',\s*\w*[Aa]gent[Ss]cript\][\s\S]{0,300}sidebar-agent/);
    // The variable name `agentScript` was for sidebar-agent. After the
    // rip there's only termAgentScript. Allow comments to mention the
    // history but not active spawn calls.
    expect(CLI_SRC).not.toMatch(/^\s*let agentScript = path\.resolve/m);
  });
  test('Terminal-agent spawn survives', () => {
    expect(CLI_SRC).toContain('terminal-agent.ts');
    expect(CLI_SRC).toMatch(/Bun\.spawn\(\['bun',\s*'run',\s*termAgentScript\]/);
  });
 });
 describe('files: sidebar-agent.ts and its tests are deleted', () => {
  test('browse/src/sidebar-agent.ts is gone', () => {
    expect(fs.existsSync(path.join(import.meta.dir, '../src/sidebar-agent.ts'))).toBe(false);
  });
  test('sidebar-agent test files are gone', () => {
    expect(fs.existsSync(path.join(import.meta.dir, 'sidebar-agent.test.ts'))).toBe(false);
    expect(fs.existsSync(path.join(import.meta.dir, 'sidebar-agent-roundtrip.test.ts'))).toBe(false);
  });
 });
@@ -123,8 +248,6 @@ describe('manifest: ws permission + xterm-safe CSP', () => {
  });
  test('manifest does NOT add unsafe-eval to extension_pages CSP', () => {
    // xterm@5 is eval-free (verified at vendor time). If a future xterm
    // upgrade requires unsafe-eval, this test fires and forces a decision.
    const csp = MANIFEST.content_security_policy;
    if (csp && csp.extension_pages) {
      expect(csp.extension_pages).not.toContain('unsafe-eval');
--- a/browse/test/terminal-agent-integration.test.ts
+++ b/browse/test/terminal-agent-integration.test.ts
@@ -127,7 +127,7 @@ describe('terminal-agent: /ws gates', () => {
  });
 });
-describe('terminal-agent: PTY round-trip via real WebSocket', () => {
+describe('terminal-agent: PTY round-trip via real WebSocket (Cookie auth)', () => {
  test('binary writes go to PTY stdin, output streams back', async () => {
    const cookie = 'rt-token-must-be-at-least-seventeen-chars-long';
    const granted = await grantToken(cookie);
@@ -182,6 +182,65 @@ describe('terminal-agent: PTY round-trip via real WebSocket', () => {
    await Bun.sleep(200);
  });
  test('Sec-WebSocket-Protocol auth path: browser-style upgrade with token in protocol', async () => {
    // This is the path the actual browser extension takes. Cross-port
    // SameSite=Strict cookies don't reliably survive the jump from the
    // browse server (port A) to the agent (port B) when initiated from a
    // chrome-extension origin, so we send the token via the only auth
    // header the browser WebSocket API lets us set: Sec-WebSocket-Protocol.
    //
    // The browser sends `gstack-pty.<token>` and the agent must:
    //   1) strip the gstack-pty. prefix
    //   2) validate the token
    //   3) ECHO the protocol back in the upgrade response
    // Without (3) the browser closes the connection immediately, which
    // is the exact bug the original cookie-only implementation hit in
    // manual dogfood. This test catches that regression in CI.
    const token = 'sec-protocol-token-must-be-at-least-seventeen-chars';
    await grantToken(token);
    // We exercise the protocol path by raw-handshaking via fetch+Upgrade,
    // because Bun's test-client WebSocket constructor doesn't propagate
    // `protocols` cleanly when also passed `headers` (the constructor
    // detects the third-arg form unreliably). Real browsers (Chromium)
    // use the standard protocols arg fine — the server-side handler is
    // identical either way, so this test still locks the load-bearing
    // invariant: the agent accepts a token via Sec-WebSocket-Protocol
    // and echoes the protocol back so a browser would accept the upgrade.
    const handshakeKey = 'dGhlIHNhbXBsZSBub25jZQ==';
    const resp = await fetch(`http://127.0.0.1:${agentPort}/ws`, {
      headers: {
        'Connection': 'Upgrade',
        'Upgrade': 'websocket',
        'Sec-WebSocket-Version': '13',
        'Sec-WebSocket-Key': handshakeKey,
        'Sec-WebSocket-Protocol': `gstack-pty.${token}`,
        'Origin': 'chrome-extension://test-extension-id',
      },
    });
    // 101 Switching Protocols + protocol echoed back = browser would accept.
    // 401/403/anything else = browser would close the connection immediately
    // (the bug we hit in manual dogfood).
    expect(resp.status).toBe(101);
    expect(resp.headers.get('upgrade')?.toLowerCase()).toBe('websocket');
    expect(resp.headers.get('sec-websocket-protocol')).toBe(`gstack-pty.${token}`);
  });
  test('Sec-WebSocket-Protocol auth: rejects unknown token even with valid Origin', async () => {
    const resp = await fetch(`http://127.0.0.1:${agentPort}/ws`, {
      headers: {
        'Connection': 'Upgrade',
        'Upgrade': 'websocket',
        'Sec-WebSocket-Version': '13',
        'Sec-WebSocket-Key': 'dGhlIHNhbXBsZSBub25jZQ==',
        'Sec-WebSocket-Protocol': 'gstack-pty.never-granted-token',
        'Origin': 'chrome-extension://test-extension-id',
      },
    });
    expect(resp.status).toBe(401);
  });
  test('text frame {type:"resize"} is accepted (no crash, ws stays open)', async () => {
    const cookie = 'resize-token-must-be-at-least-seventeen-chars';
    await grantToken(cookie);
--- a/browse/test/terminal-agent.test.ts
+++ b/browse/test/terminal-agent.test.ts
@@ -122,12 +122,26 @@ describe('Source-level guard: terminal-agent', () => {
    expect(wsHandler).toContain('forbidden origin');
  });
-  test('validates gstack_pty cookie against an in-memory token set', () => {
+  test('validates the session token against an in-memory token set', () => {
    const wsHandler = AGENT_SRC.slice(AGENT_SRC.indexOf("if (url.pathname === '/ws')"));
    // Two transports: Sec-WebSocket-Protocol (preferred for browsers) and
    // Cookie gstack_pty (fallback). Both verify against validTokens.
    expect(wsHandler).toContain('sec-websocket-protocol');
    expect(wsHandler).toContain('gstack_pty');
    expect(wsHandler).toContain('validTokens.has');
  });
  test('Sec-WebSocket-Protocol auth: strips gstack-pty. prefix and echoes back', () => {
    const wsHandler = AGENT_SRC.slice(AGENT_SRC.indexOf("if (url.pathname === '/ws')"));
    // Browsers send `Sec-WebSocket-Protocol: gstack-pty.<token>`. The agent
    // must strip the prefix before checking validTokens, AND echo the
    // protocol back in the upgrade response — without the echo, the
    // browser closes the connection immediately.
    expect(wsHandler).toContain("'gstack-pty.'");
    expect(wsHandler).toContain('Sec-WebSocket-Protocol');
    expect(wsHandler).toContain('acceptedProtocol');
  });
  test('lazy spawn: claude PTY is spawned in message handler, not on upgrade', () => {
    // The whole point of lazy-spawn (codex finding #8) is that the WS
    // upgrade itself does NOT call spawnClaude. Spawn happens on first
@@ -158,14 +172,19 @@ describe('Source-level guard: terminal-agent', () => {
 });
 describe('Source-level guard: server.ts /pty-session route', () => {
-  test('validates AUTH_TOKEN and uses cookie-based grant', () => {
+  test('validates AUTH_TOKEN, grants over loopback, returns token + Set-Cookie', () => {
    const route = SERVER_SRC.slice(SERVER_SRC.indexOf("url.pathname === '/pty-session'"));
    // Must check auth before minting.
    const beforeMint = route.slice(0, route.indexOf('mintPtySessionToken'));
    expect(beforeMint).toContain('validateAuth');
-    // Must call the loopback grant before responding.
+    // Must call the loopback grant before responding (otherwise the
    // agent's validTokens Set never sees the token and /ws would 401).
    expect(route).toContain('grantPtyToken');
-    // Must Set-Cookie with the minted token.
+    // Must return the token in the JSON body for the
    // Sec-WebSocket-Protocol auth path (cross-port cookies don't survive
    // SameSite=Strict from a chrome-extension origin).
    expect(route).toContain('ptySessionToken');
    // Set-Cookie is kept as a fallback for non-browser callers.
    expect(route).toContain('Set-Cookie');
    expect(route).toContain('buildPtySetCookie');
  });
--- a/docs/designs/SIDEBAR_MESSAGE_FLOW.md
+++ b/docs/designs/SIDEBAR_MESSAGE_FLOW.md
@@ -1,211 +1,27 @@
-# Sidebar Message Flow
+# Sidebar Flow
 How the GStack Browser sidebar actually works. Read this before touching
-sidepanel.js, background.js, content.js, server.ts sidebar endpoints,
+`sidepanel.js`, `background.js`, `content.js`, `terminal-agent.ts`, or
-or sidebar-agent.ts.
+sidebar-related server endpoints.
 The sidebar has one primary surface — the **Terminal** pane, an interactive
 `claude` PTY. Activity / Refs / Inspector survive as debug overlays behind
 the `debug` toggle in the footer. The chat queue path (one-shot `claude -p`,
 sidebar-agent.ts) was ripped once the PTY proved out — the Terminal pane is
 strictly more capable.
 ## Components
 ```
 ┌─────────────────┐     ┌──────────────┐     ┌─────────────┐     ┌────────────────┐
 │  sidepanel.js   │────▶│ background.js│────▶│  server.ts   │────▶│sidebar-agent.ts│
 │  (Chrome panel) │     │ (svc worker) │     │  (Bun HTTP)  │     │  (Bun process) │
 └─────────────────┘     └──────────────┘     └─────────────┘     └────────────────┘
        ▲                                           │                      │
        │           polls /sidebar-chat             │    polls queue file   │
        └───────────────────────────────────────────┘                      │
                                                    ◀──────────────────────┘
                                                    POST /sidebar-agent/event
 ```
 ## Startup Timeline
 ```
 T+0ms     CLI runs `$B connect`
            ├── Server starts on port 34567
            ├── Writes state to .gstack/browse.json (pid, port, token)
            ├── Launches headed Chromium with extension
            └── Clears sidebar-agent-queue.jsonl
 T+500ms   sidebar-agent.ts spawned by CLI
            ├── Reads auth token from .gstack/browse.json
            ├── Creates queue file if missing
            ├── Sets lastLine = current line count
            └── Starts polling every 200ms
 T+1-3s    Extension loads in Chromium
            ├── background.js: health poll every 1s (fast startup)
            │     └── GET /health → gets auth token
            ├── content.js: injects on welcome page
            │     └── Does NOT fire gstack-extension-ready (waits for sidebar)
            └── Side panel: may auto-open via chrome.sidePanel.open()
 T+2-10s   Side panel connects
            ├── tryConnect() → asks background for port/token
            ├── Fallback: direct GET /health for token
            ├── updateConnection(url, token)
            │     ├── Starts chat polling (1s interval)
            │     ├── Starts tab polling (2s interval)
            │     ├── Connects SSE activity stream
            │     └── Sends { type: 'sidebarOpened' } to background
            └── background relays to content script → hides welcome arrow
 T+10s+    Ready for messages
 ```
 ## Message Flow: User Types → Claude Responds
 ```
 1. User types "go to hn" in sidebar, hits Enter
 2. sidepanel.js sendMessage()
   ├── Renders user bubble immediately (optimistic)
   ├── Renders thinking dots immediately
   ├── Switches to fast poll (300ms)
   └── chrome.runtime.sendMessage({ type: 'sidebar-command', message, tabId })
 3. background.js
   ├── Gets active Chrome tab URL
   └── POST /sidebar-command { message, activeTabUrl }
       with Authorization: Bearer ${authToken}
 4. server.ts /sidebar-command handler
   ├── validateAuth(req)
   ├── syncActiveTabByUrl(extensionUrl) — syncs Playwright tab to Chrome tab
   ├── pickSidebarModel(message) — 'sonnet' for actions, 'opus' for analysis
   ├── Adds user message to chat buffer
   ├── Builds system prompt + args
   └── Appends JSON to ~/.gstack/sidebar-agent-queue.jsonl
 5. sidebar-agent.ts poll() (within 200ms)
   ├── Reads new line from queue file
   ├── Parses JSON entry
   ├── Checks processingTabs — skips if tab already has agent running
   └── askClaude(entry) — fire and forget
 6. sidebar-agent.ts askClaude()
   ├── spawn('claude', ['-p', prompt, '--model', model, ...])
   ├── Streams stdout line-by-line (stream-json format)
   ├── For each event: POST /sidebar-agent/event { type, tool, text, tabId }
   └── On close: POST /sidebar-agent/event { type: 'agent_done' }
 7. server.ts processAgentEvent()
   ├── Adds entry to chat buffer (in-memory + disk)
   ├── On agent_done: sets tab status to 'idle'
   └── On agent_done: processes next queued message for that tab
 8. sidepanel.js pollChat() (every 300ms during fast poll)
   ├── GET /sidebar-chat?after=${chatLineCount}&tabId=${tabId}
   ├── Renders new entries (text, tool_use, agent_done)
   └── On agent idle: removes thinking dots, stops fast poll
 ```
 ## Arrow Hint Hide Flow (4-step signal chain)
 The welcome page shows a right-pointing arrow until the sidebar opens.
 ```
 1. sidepanel.js updateConnection()
   └── chrome.runtime.sendMessage({ type: 'sidebarOpened' })
 2. background.js
   └── chrome.tabs.sendMessage(activeTabId, { type: 'sidebarOpened' })
 3. content.js onMessage handler
   └── document.dispatchEvent(new CustomEvent('gstack-extension-ready'))
 4. welcome.html script
   └── addEventListener('gstack-extension-ready', () => arrow.classList.add('hidden'))
 ```
 The arrow does NOT hide when the extension loads. Only when the sidebar connects.
 ## Auth Token Flow
 ```
 Server starts → AUTH_TOKEN = crypto.randomUUID()
    │
    ├── GET /health (no auth) → returns { token: AUTH_TOKEN }
    │
    ├── background.js checkHealth() → authToken = data.token
    │     └── Refreshes on EVERY health poll (fixes stale token on restart)
    │
    ├── sidepanel.js tryConnect() → serverToken from background or /health
    │     └── Used for chat polling: Authorization: Bearer ${serverToken}
    │
    └── sidebar-agent.ts refreshToken() → reads from .gstack/browse.json
          └── Used for event relay: Authorization: Bearer ${authToken}
 ```
 If the server restarts, all three components get fresh tokens within 10s
 (background health poll interval).
 ## Model Routing
 `pickSidebarModel(message)` in server.ts classifies messages:
 | Pattern | Model | Why |
 |---------|-------|-----|
 | "click @e24", "go to hn", "screenshot" | sonnet | Deterministic tool calls, no thinking needed |
 | "what does this page say?", "summarize" | opus | Needs comprehension |
 | "find bugs", "check for broken links" | opus | Analysis task |
 | "navigate to X and fill the form" | sonnet | Action-oriented, no analysis words |
 Analysis words (`what`, `why`, `how`, `summarize`, `describe`, `analyze`, `read X and Y`)
 always override action verbs and force opus.
 ## Known Failure Modes
 | Failure | Symptom | Root Cause | Fix |
 |---------|---------|------------|-----|
 | Stale auth token | "Unauthorized" in input | Server restarted, background had old token | background.js refreshes token on every health poll |
 | Tab ID mismatch | Message sent, no response visible | Server assigned tabId 1, sidebar polling tabId 0 | switchChatTab preserves optimistic UI during switch |
 | Sidebar agent not running | Messages queue forever | Agent process failed to spawn or crashed | Check `ps aux | grep sidebar-agent` |
 | Agent stale token | Agent runs but no events appear in sidebar | sidebar-agent has old token from .gstack/browse.json | Agent re-reads token before each event POST |
 | Queue file missing | spawnClaude fails | Race between server start and agent start | Both sides create file if missing |
 | Optimistic UI blown away | User bubble + dots vanish | switchChatTab replaced DOM with welcome screen | Preserved DOM when lastOptimisticMsg is set |
 ## Per-Tab Concurrency
 Each browser tab can run its own agent simultaneously:
 - Server: `tabAgents: Map<number, TabAgentState>` with per-tab queue (max 5)
 - sidebar-agent: `processingTabs: Set<number>` prevents duplicate spawns
 - Two messages on same tab: queued sequentially, processed in order
 - Two messages on different tabs: run concurrently
 ## File Locations
 | Component | File | Runs in |
 |-----------|------|---------|
 | Sidebar UI | `extension/sidepanel.js` | Chrome side panel |
 | Service worker | `extension/background.js` | Chrome background |
 | Content script | `extension/content.js` | Page context |
 | Welcome page | `browse/src/welcome.html` | Page context |
 | HTTP server | `browse/src/server.ts` | Bun (compiled binary) |
 | Agent process | `browse/src/sidebar-agent.ts` | Bun (non-compiled, can spawn) |
 | CLI entry | `browse/src/cli.ts` | Bun (compiled binary) |
 | Queue file | `~/.gstack/sidebar-agent-queue.jsonl` | Filesystem |
 | State file | `.gstack/browse.json` | Filesystem |
 | Chat log | `~/.gstack/sessions/<id>/chat.jsonl` | Filesystem |
 ## Terminal flow
 The sidebar has a second primary tab next to Chat: **Terminal**. Where Chat
 spawns one-shot `claude -p` per message, Terminal runs **interactive
 `claude` in a real PTY** with xterm.js as the renderer.
 ### Components
 ```
 ┌─────────────────┐     ┌──────────────┐     ┌──────────────────┐
-│ sidepanel.js +  │────▶│  server.ts   │────▶│terminal-agent.ts │
+│  sidepanel.js + │────▶│  server.ts   │────▶│terminal-agent.ts │
 │  -terminal.js   │     │  (compiled)  │     │  (non-compiled)  │
 │  (xterm.js)     │     │              │     │  PTY listener    │
 └─────────────────┘     └──────────────┘     └──────────────────┘
        ▲                       │                      │
-        │  ws://127.0.0.1:<termPort>/ws (cookie auth)   │ Bun.spawn(claude)
+        │  ws://127.0.0.1:<termPort>/ws (Sec-WebSocket-Protocol auth)
-        └───────────────────────┼──────────────────────▶│ terminal: {data}
+        └───────────────────────┼──────────────────────▶│ Bun.spawn(claude)
                                │                      │  terminal: {data}
                                │                      ▼
                                │              ┌──────────────────┐
                                │              │  claude PTY      │
@@ -216,7 +32,8 @@ spawns one-shot `claude -p` per message, Terminal runs **interactive
                       ┌──────────────────┐
                       │ pty-session-     │
                       │ cookie.ts        │
-                       │ (HttpOnly cookie)│
+                       │ (in-memory token │
                       │  registry)       │
                       └──────────────────┘
                                │
                                │ POST /internal/grant (loopback)
@@ -227,7 +44,11 @@ spawns one-shot `claude -p` per message, Terminal runs **interactive
                       └──────────────────┘
 ```
-### Startup + first-key timeline
+The compiled browse server can't `posix_spawn` external executables —
 `terminal-agent.ts` runs as a separate non-compiled `bun run` process and
 owns the `claude` subprocess.
 ## Startup + first-keystroke timeline
 ```
 T+0ms     CLI runs `$B connect`
@@ -241,81 +62,139 @@ T+500ms   terminal-agent.ts boots
            └── Probes claude → writes claude-available.json
 T+1-3s    Extension loads, sidebar opens
-            ├── Terminal tab is default-active
+            ├── sidepanel-terminal.js: setState(IDLE), shows "Starting Claude Code..."
-            ├── sidepanel-terminal.js: setState(IDLE), shows "Press any key"
+            └── tryAutoConnect() polls until window.gstackServerPort + token are set
            └── No PTY spawned yet (lazy)
-T+user-keys  First keystroke fires onAnyKey
+T+ready   tryAutoConnect calls connect()
            ├── POST /pty-session (Authorization: Bearer AUTH_TOKEN)
-            │   └── server mints cookie, posts /internal/grant to agent
+            │   └── server mints session token, posts /internal/grant to agent
-            │   └── responds with Set-Cookie: gstack_pty=<HttpOnly>
+            │   └── responds with {terminalPort, ptySessionToken}
            │   └── responds with terminalPort
            ├── GET /claude-available (preflight)
-            ├── new WebSocket(ws://127.0.0.1:<terminalPort>/ws)
+            ├── new WebSocket(`ws://127.0.0.1:<terminalPort>/ws`,
-            │   └── Browser carries gstack_pty cookie + Origin automatically
+            │                 [`gstack-pty.<token>`])
-            │   └── Agent validates Origin AND cookie BEFORE upgrading
+            │   └── Browser sends Sec-WebSocket-Protocol + Origin
-            ├── On upgrade success, send {type:"resize"} then a single byte
+            │   └── Agent validates Origin AND token BEFORE upgrading
-            └── Agent message handler sees first byte → spawnClaude()
+            │   └── Agent echoes the protocol back (REQUIRED — browser
            │       closes the connection without it)
            ├── On open: send {type:"resize"} then a single \n byte
            └── Agent message handler sees the byte → spawnClaude()
 ```
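The connect step in the timeline above can be sketched as a small helper that builds the two arguments handed to `new WebSocket`. This is a minimal sketch under stated assumptions, not the code in `sidepanel-terminal.js`: `buildPtySocketArgs`, `PtySocketArgs`, and the token-character check are hypothetical names introduced here. Only the `ws://127.0.0.1:<terminalPort>/ws` URL shape and the `gstack-pty.` prefix come from the document.

```typescript
// Hypothetical helper; the real extension code may be structured differently.
const PTY_PROTOCOL_PREFIX = 'gstack-pty.';

// Subprotocol names must consist of HTTP "token" characters. A browser's
// WebSocket constructor throws a SyntaxError for anything else, so it is
// cheaper to fail early with a readable message.
const HTTP_TOKEN = /^[!#$%&'*+\-.^_`|~0-9A-Za-z]+$/;

interface PtySocketArgs {
  url: string;
  protocols: string[];
}

function buildPtySocketArgs(terminalPort: number, ptySessionToken: string): PtySocketArgs {
  if (!Number.isInteger(terminalPort) || terminalPort < 1 || terminalPort > 65535) {
    throw new Error(`invalid terminal port: ${terminalPort}`);
  }
  if (!HTTP_TOKEN.test(ptySessionToken)) {
    throw new Error('session token contains characters not allowed in a subprotocol');
  }
  return {
    // Plain ws:// on loopback, as in the timeline above.
    url: `ws://127.0.0.1:${terminalPort}/ws`,
    // The browser turns this array into the Sec-WebSocket-Protocol header.
    protocols: [`${PTY_PROTOCOL_PREFIX}${ptySessionToken}`],
  };
}
```

In the side panel this would be consumed as `const { url, protocols } = buildPtySocketArgs(port, token); new WebSocket(url, protocols);`, where `port` and `token` are the values returned by `POST /pty-session`.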
 ## Auth: WebSocket can't send Authorization headers
 Browser WebSocket clients can't set `Authorization`. They CAN set
 `Sec-WebSocket-Protocol` via the second arg of `new WebSocket(url,
 protocols)`. We exploit that:
 1. `POST /pty-session` (auth: Bearer AUTH_TOKEN) → server mints a
   short-lived session token, pushes it to the agent over loopback,
   returns it in the JSON body.
 2. Extension calls `new WebSocket(url, ['gstack-pty.<token>'])`.
 3. Agent reads `Sec-WebSocket-Protocol`, strips `gstack-pty.`, validates
   against `validTokens`, echoes the protocol back. Echo is mandatory —
   without it Chromium closes the connection on receipt of the upgrade
   response.
 A `Set-Cookie: gstack_pty=...` header is also returned for non-browser
 callers (curl, integration tests). The cookie path was the original v1
 design but `SameSite=Strict` cookies don't survive the cross-port jump
 from server.ts:34567 → agent:<random> from a chrome-extension origin.
 The protocol-token path is what the browser actually uses.
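Step 3 above (read the header, strip the prefix, validate, echo) can be sketched as one pure function. This is an illustrative sketch, not the handler in `terminal-agent.ts`: `pickAcceptedProtocol` is a hypothetical name, and the comma-splitting reflects how `Sec-WebSocket-Protocol` carries a list of offered subprotocols. `validTokens` and the `gstack-pty.` prefix are the names used in the document.

```typescript
const PROTOCOL_PREFIX = 'gstack-pty.';

// Hypothetical helper: returns the exact protocol string to echo back in the
// upgrade response, or null when no offered protocol carries a granted token.
function pickAcceptedProtocol(
  headerValue: string | null,
  validTokens: Set<string>,
): string | null {
  if (!headerValue) return null;
  // The header may list several offered subprotocols, comma-separated.
  for (const offered of headerValue.split(',')) {
    const protocol = offered.trim();
    if (!protocol.startsWith(PROTOCOL_PREFIX)) continue;
    const token = protocol.slice(PROTOCOL_PREFIX.length);
    // Echo the full offered value (prefix included), not the bare token:
    // the browser only accepts a protocol it actually offered.
    if (validTokens.has(token)) return protocol;
  }
  return null;
}
```

A `null` result corresponds to the rejection case the integration test expects (status 401). A non-null result is the value that would go into the response's `Sec-WebSocket-Protocol` header.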
 ### Dual-token model
 | Token | Lives in | Used for | Lifetime |
 |-------|----------|----------|----------|
-| `AUTH_TOKEN` | `<stateDir>/browse.json`; in-memory in server.ts | `/pty-session` POST (mint cookie) | server lifetime |
+| `AUTH_TOKEN` | `<stateDir>/browse.json`; in-memory in server.ts | `/pty-session` POST (mint cookie + token) | server lifetime |
-| `gstack_pty` cookie | Browser HttpOnly jar; agent `validTokens` Set | `/ws` upgrade auth | 30 min, dies on WS close |
+| `gstack-pty.<...>` (Sec-WebSocket-Protocol) | Browser memory only; agent `validTokens` Set | `/ws` upgrade auth | 30 min, auto-revoked on WS close |
 | `INTERNAL_TOKEN` | `<stateDir>/terminal-internal-token`; in agent memory | server → agent loopback `/internal/grant` | agent lifetime |
-`AUTH_TOKEN` is **never** valid for `/ws` directly. The cookie is **never**
+`AUTH_TOKEN` is **never** valid for `/ws` directly. The session token is
-valid for `/pty-session` or `/command`. Strict separation prevents an SSE
+**never** valid for `/pty-session` or `/command`. Strict separation
-or sidebar-chat token leak from escalating into shell access.
+prevents an SSE or page-content token leak from escalating into shell
 access.
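The session-token lifetime in the table (30 min, revoked on WS close) could be modeled with a small registry like the one below. This is a sketch under assumptions: the document only says the agent keeps a `validTokens` Set, so the `SessionTokenRegistry` class, its `Map`-based expiry, and the injected `now` parameter are hypothetical and exist here only to make the lifetime rule concrete and testable.

```typescript
const SESSION_TTL_MS = 30 * 60 * 1000; // 30 min, per the table above

// Hypothetical registry; the real agent may track expiry differently.
class SessionTokenRegistry {
  private expiresAt = new Map<string, number>();

  // Called when the server pushes a freshly minted token over loopback.
  grant(token: string, now: number = Date.now()): void {
    this.expiresAt.set(token, now + SESSION_TTL_MS);
  }

  // Called during the /ws upgrade. Expired tokens are dropped on lookup.
  isValid(token: string, now: number = Date.now()): boolean {
    const expiry = this.expiresAt.get(token);
    if (expiry === undefined) return false;
    if (now >= expiry) {
      this.expiresAt.delete(token);
      return false;
    }
    return true;
  }

  // Called when the WebSocket closes so the token cannot be replayed.
  revoke(token: string): void {
    this.expiresAt.delete(token);
  }
}
```

Passing `now` explicitly keeps the expiry logic deterministic in tests. Production callers would rely on the `Date.now()` default.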
-### Threat model
+## Threat model
-The Terminal tab **bypasses the entire prompt-injection security stack**
+The Terminal pane **bypasses the prompt-injection security stack** on
-(`content-security.ts` datamarking, `security-classifier.ts` ML scoring,
+purpose — the user is typing directly to claude, there's no untrusted
-canary detection, ensemble verdicts). On the Terminal tab the user is
+page content in the loop. Trust source is the keyboard, same as any
-typing directly to claude — there is no untrusted page content in the
+local terminal.
 loop, so the threat model is "user trusts themselves," same as opening
 a terminal locally.
-That trust assumption is load-bearing on three transport-layer guarantees:
+That trust assumption is load-bearing on three transport guarantees:
-1. **Local-only listener.** `terminal-agent.ts` binds `127.0.0.1` only.
+1. **Local-only listener.** terminal-agent.ts binds `127.0.0.1` only.
-   The dual-listener tunnel surface (server.ts:95 `TUNNEL_PATHS`) does
+   The dual-listener tunnel surface (server.ts `TUNNEL_PATHS`) does
-   **not** include `/pty-session` or `/terminal/*`, so the tunnel returns
+   not include `/pty-session` or `/terminal/*`, so the tunnel returns
   404 by default-deny.
 2. **Origin gate.** `/ws` upgrades require
-   `Origin: chrome-extension://<id>`. A localhost web page cannot mount a
+   `Origin: chrome-extension://<id>`. A localhost web page can't mount
-   cross-site WebSocket hijack against the shell because its Origin is
+   a cross-site WebSocket hijack against the shell because its Origin
-   a regular `http(s)://...`.
+   is a regular `http(s)://...`.
-3. **Cookie auth.** `gstack_pty` is HttpOnly + SameSite=Strict, scoped to
+3. **Session token auth.** Minted only by an authenticated
-   the local listener, minted only by an authenticated `/pty-session`
+   `/pty-session` POST, scoped to one WS, auto-revoked on close.
   POST. JS injected into a page can't read it; cross-site requests
   can't send it.
-Drop any of those three and the whole tab becomes unsafe.
+Drop any one of those three and the whole pane becomes unsafe.
-### Lifecycle
+## Lifecycle
- **Lazy spawn**: claude is not started until the user types a key. Idle
+- **Eager auto-connect.** Sidebar opens → tryAutoConnect polls for the
-  sidebar opens cost nothing.
+  bootstrap globals and connects as soon as they're set. No keypress
- **One PTY per WS**: closing the WebSocket SIGINTs claude, then SIGKILLs
+  required.
-  after 3s. The `gstack_pty` cookie is also revoked so a stolen cookie
+- **One PTY per WS.** Closing the WebSocket SIGINTs claude, then SIGKILLs
-  can't be replayed against a new PTY.
+  after 3s. The session token is revoked so a stolen token can't be
- **No auto-reconnect**: when the WS closes the user sees "Session ended,
+  replayed.
-  click to start a new session." Auto-reconnect would burn a fresh
+- **No auto-reconnect on close.** The user sees "Session ended, click to
-  claude session every reload. v1.1 may add session resumption keyed on
+  start a new session." Auto-reconnect would burn a fresh claude session
-  tab/session id (see TODOS).
+  on every reload. v1.1 may add session resumption keyed on tab/session
  id (see TODOS).
 - **Manual restart anytime.** A `↻ Restart` button lives in the always-
  visible terminal toolbar — works mid-session, not just from the ENDED
  state.
-### Files
+## Quick-action toolbar
 Three browser-action buttons live next to the Restart button at the top
 of the Terminal pane:
 | Button | Behavior |
 |--------|----------|
 | 🧹 Cleanup | `window.gstackInjectToTerminal(prompt)` — pipes a "remove ads/banners" instruction into the live PTY. claude in the terminal sees it and acts. |
 | 📸 Screenshot | `POST /command screenshot` — direct browse-server call, no PTY involvement. |
 | 🍪 Cookies | Navigates to the `/cookie-picker` page. |
 The Inspector's "Send to Code" button uses the same `gstackInjectToTerminal`
 path to forward CSS inspector data into claude.
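The contract of `gstackInjectToTerminal` that the tests pin down (return `false` when there is no live session, `true` once bytes go out) can be sketched as follows. This is an assumption-laden sketch, not the code in `sidepanel-terminal.js`: `makeInjector`, `SocketLike`, and `WS_OPEN` are hypothetical names, and whether the real function encodes the text or appends a newline before sending is not stated in the document.

```typescript
const WS_OPEN = 1; // numeric value of WebSocket.OPEN

// Minimal structural type so the sketch runs without a browser WebSocket.
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

// Hypothetical factory: getSocket returns the current socket, or null when
// no session has been started (or the previous one was torn down).
function makeInjector(getSocket: () => SocketLike | null): (text: string) => boolean {
  return (text: string): boolean => {
    const ws = getSocket();
    // No live session: report failure so the caller can surface it.
    if (!ws || ws.readyState !== WS_OPEN) return false;
    ws.send(text);
    return true;
  };
}
```

A toolbar button or the Inspector could then check the boolean result and show a hint to restart the session when injection fails, rather than silently dropping the prompt.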
 ## Debug surfaces (Activity / Refs / Inspector)
 Behind the `debug` toggle in the footer. SSE-driven, independent of the
 Terminal pane:
 - **Activity** — streams every browse command via `/activity/stream` SSE.
 - **Refs** — REST: `GET /refs` — current page's `@ref` element labels.
 - **Inspector** — CDP-based element picker; SSE on `/inspector/events`.
 When the debug strip closes, the Terminal pane becomes visible again.
 xterm.js doesn't auto-redraw when its container flips from `display:none`
 to `display:flex`, so sidepanel-terminal.js runs a `MutationObserver` on
 `#tab-terminal`'s class attribute and forces a fit + refresh when
 `.active` returns.
 ## Files
 | Component | File | Runs in |
 |-----------|------|---------|
-| Terminal UI | `extension/sidepanel-terminal.js` + xterm.js in `extension/lib/` | Chrome side panel |
-| PTY agent | `browse/src/terminal-agent.ts` | Bun (non-compiled, can spawn) |
-| Cookie store | `browse/src/pty-session-cookie.ts` | Bun (compiled, in server.ts) |
-| Port file | `<stateDir>/terminal-port` | Filesystem |
+| Sidebar UI shell | `extension/sidepanel.html` + `sidepanel.js` + `sidepanel.css` | Chrome side panel |
+| Terminal UI | `extension/sidepanel-terminal.js` + `extension/lib/xterm.js` | Chrome side panel |
+| Service worker | `extension/background.js` | Chrome background |
+| Content script | `extension/content.js` | Page context |
 | HTTP server | `browse/src/server.ts` | Bun (compiled binary) |
 | PTY agent | `browse/src/terminal-agent.ts` | Bun (non-compiled) |
 | PTY token store | `browse/src/pty-session-cookie.ts` | Bun (compiled, in server.ts) |
 | CLI entry | `browse/src/cli.ts` | Bun (compiled binary) |
 | State file | `<stateDir>/browse.json` | Filesystem |
 | Terminal port | `<stateDir>/terminal-port` | Filesystem |
 | Internal token | `<stateDir>/terminal-internal-token` | Filesystem |
 | Claude probe | `<stateDir>/claude-available.json` | Filesystem |
 | Active tab | `<stateDir>/active-tab.json` | Filesystem (claude reads) |
--- a/extension/sidepanel-terminal.js
+++ b/extension/sidepanel-terminal.js
@@ -38,6 +38,7 @@
    mount: document.getElementById('terminal-mount'),
    ended: document.getElementById('terminal-ended'),
    restart: document.getElementById('terminal-restart'),
    restartNow: document.getElementById('terminal-restart-now'),
  };
  /** State machine. */
@@ -109,10 +110,12 @@
  }
  /**
-   * POST /pty-session to mint the HttpOnly cookie. Returns { terminalPort,
-   * expiresAt } on success, or null with reason on failure. Note: we do
-   * NOT receive the cookie value; it lives in the browser's HttpOnly jar
-   * and travels with the next same-origin request automatically.
+   * POST /pty-session to mint a fresh terminal session. Returns
+   * { terminalPort, ptySessionToken, expiresAt } on success, or
+   * { error } on failure. The token rides on the WebSocket
+   * Sec-WebSocket-Protocol header, which is the only auth header
   * the browser WebSocket API lets us set. The token is NOT persisted —
   * each sidebar load mints a fresh one and discards it on close.
   */
  async function mintSession() {
    const serverPort = getServerPort();
@@ -183,6 +186,22 @@
    });
  }
  /**
   * Inject a string into the live PTY (the same way a real keystroke would).
   * Used by the toolbar's Cleanup button and the Inspector's "Send to Code"
   * action so the user can drive claude from outside-the-keyboard surfaces.
   * Returns true if the bytes went out, false if no live session.
   */
  window.gstackInjectToTerminal = function (text) {
    if (!text || !ws || ws.readyState !== WebSocket.OPEN) return false;
    try {
      ws.send(new TextEncoder().encode(text));
      return true;
    } catch {
      return false;
    }
  };
  async function connect() {
    if (state !== STATE.IDLE) return; // already connecting/live
    setState(STATE.CONNECTING);
@@ -192,7 +211,11 @@
      setState(STATE.IDLE, { message: `Cannot start: ${minted.error}` });
      return;
    }
-    const { terminalPort } = minted;
+    const { terminalPort, ptySessionToken } = minted;
    if (!ptySessionToken) {
      setState(STATE.IDLE, { message: 'Cannot start: no session token returned' });
      return;
    }
    // Pre-flight: does claude even exist on PATH?
    const claudeStatus = await checkClaudeAvailable(terminalPort);
@@ -205,7 +228,12 @@
    setState(STATE.LIVE);
    fitAddon && fitAddon.fit();
-    ws = new WebSocket(`ws://127.0.0.1:${terminalPort}/ws`);
+    // Token rides on Sec-WebSocket-Protocol — the only auth header the
    // browser WebSocket API lets us set. Cross-port HttpOnly cookies with
    // SameSite=Strict don't survive the jump from server.ts:34567 to the
    // agent's random port from a chrome-extension origin, so cookies
    // alone weren't reliable.
    ws = new WebSocket(`ws://127.0.0.1:${terminalPort}/ws`, [`gstack-pty.${ptySessionToken}`]);
    ws.binaryType = 'arraybuffer';
    ws.addEventListener('open', () => {
@@ -256,66 +284,101 @@
  // ─── Wiring ───────────────────────────────────────────────────
-  function init() {
-    // First-keystroke trigger on the bootstrap card.
-    document.addEventListener('keydown', onAnyKey, { once: false, capture: true });
+  /**
+   * Force a fresh session: close any open WS, dispose xterm, return to
+   * IDLE, kick off auto-connect. Safe to call from any state.
   */
  function forceRestart() {
    try { ws && ws.close(); } catch {}
    ws = null;
    if (term) {
      try { term.dispose(); } catch {}
      term = null;
      fitAddon = null;
    }
    setState(STATE.IDLE, { message: 'Starting Claude Code...' });
    tryAutoConnect();
  }
-    els.installRetry?.addEventListener('click', async () => {
-      // Re-probe and try connecting again.
-      const minted = await mintSession();
-      if (!minted.error) {
-        const claudeStatus = await checkClaudeAvailable(minted.terminalPort);
-        if (claudeStatus.available) {
-          setState(STATE.IDLE);
-          // Auto-trigger reconnect on next key
-        }
-      }
-    });
+  /**
+   * Repaint xterm when the Terminal pane becomes visible. xterm.js has a
+   * known issue where its renderer doesn't redraw after a display:none →
+   * display:flex flip — the canvas/DOM stays blank until something forces
+   * a layout pass. fit() recomputes dimensions, refresh() redraws.
+   */
+  function repaintIfLive() {
+    if (state !== STATE.LIVE || !term) return;
+    try { fitAddon && fitAddon.fit(); } catch {}
+    try { term.refresh(0, term.rows - 1); } catch {}
+    try {
    els.restart?.addEventListener('click', () => {
      // Clean restart. Drop xterm state too — codex 1C: each session is fresh.
      if (term) {
        try { term.dispose(); } catch {}
        term = null;
        fitAddon = null;
      }
      setState(STATE.IDLE);
    });
    // Tab switching: tell the agent which browser tab is active so claude's
    // active-tab.json stays in sync. sidepanel.js owns the active-tab state;
    // we listen for its "tab activated" event.
    document.addEventListener('gstack:active-tab-changed', (ev) => {
      if (ws && ws.readyState === WebSocket.OPEN) {
-        try {
+        ws.send(JSON.stringify({ type: 'resize', cols: term.cols, rows: term.rows }));
          ws.send(JSON.stringify({
            type: 'tabSwitch',
            tabId: ev.detail?.tabId,
            url: ev.detail?.url,
            title: ev.detail?.title,
          }));
        } catch {}
      }
    } catch {}
  }
  function init() {
    setState(STATE.IDLE, { message: 'Starting Claude Code...' });
    els.installRetry?.addEventListener('click', () => {
      // Re-probe claude on PATH, then try a connect.
      setState(STATE.IDLE, { message: 'Starting Claude Code...' });
      tryAutoConnect();
    });
-    // Initial state
-    setState(STATE.IDLE);
+    // Two restart buttons:
+    //   - els.restart lives inside the ENDED state card (visible only after
    //     a session has ended).
    //   - els.restartNow lives in the always-visible toolbar (lets the user
    //     force a fresh claude mid-session without waiting for it to exit).
    els.restart?.addEventListener('click', forceRestart);
    els.restartNow?.addEventListener('click', forceRestart);
    // Repaint after a debug-tab → primary-pane transition. The debug
    // tabs (Activity / Refs / Inspector) hide the Terminal pane via
    // .tab-content { display: none }; xterm doesn't auto-redraw when its
    // container flips back to visible, so we watch #tab-terminal's class
    // attribute and force a fit + refresh once .active returns.
    const observer = new MutationObserver(() => {
      // Named `pane` so it doesn't shadow the module-level xterm `term`.
      const pane = document.getElementById('tab-terminal');
      if (pane?.classList.contains('active')) {
        requestAnimationFrame(repaintIfLive);
      }
    });
    const target = document.getElementById('tab-terminal');
    if (target) observer.observe(target, { attributes: true, attributeFilter: ['class'] });
    tryAutoConnect();
  }
-  function onAnyKey(ev) {
-    // Only trigger if Terminal pane is the active one and we're idle.
-    const terminalActive = document.getElementById('tab-terminal')?.classList.contains('active');
-    if (!terminalActive) return;
+  /**
+   * Eager-connect when the sidebar opens. Polls for sidepanel.js to populate
+   * window.gstackServerPort + window.gstackAuthToken (which it does as soon
+   * as /health succeeds), then fires connect() automatically. The user
   * doesn't have to press a key — Terminal is the default tab and "tap to
   * start" was a needless paper cut on every reload.
   */
  function tryAutoConnect() {
    if (state !== STATE.IDLE) return;
-    // Ignore pure modifier keys.
-    if (['Shift', 'Control', 'Alt', 'Meta', 'CapsLock'].includes(ev.key)) return;
-    connect();
+    let waited = 0;
+    const tick = () => {
+      // If a connect has already started (state is no longer IDLE), drop out.
      if (state !== STATE.IDLE) return;
      if (getServerPort() && getAuthToken()) {
        connect();
        return;
      }
      waited += 200;
      if (waited > 15000) {
        setState(STATE.IDLE, { message: 'Browse server not ready. Reload sidebar to retry.' });
        return;
      }
      setTimeout(tick, 200);
    };
    tick();
  }
  // Wait for sidepanel.js to populate window.gstackServerPort + window.gstackAuthToken.
  // sidepanel.js already polls /health and resolves the connection; we just need
  // to wait for it. If those globals aren't available within 15s, surface a
  // "browse server not ready" message; the user can reload the sidebar.
  if (document.readyState === 'loading') {
    document.addEventListener('DOMContentLoaded', init);
  } else {
--- a/extension/sidepanel.css
+++ b/extension/sidepanel.css
@@ -675,36 +675,40 @@ body::after {
 }
 .tab-content.active { display: flex; flex-direction: column; }
 /* ─── Primary surface tabs (Terminal | Chat) ──────────────────── */
 .primary-tabs {
  display: flex;
  border-bottom: 1px solid var(--border);
  background: #0f0f0f;
  padding: 0 8px;
  flex-shrink: 0;
 }
 .primary-tab {
  background: transparent;
  border: none;
  color: #71717a;
  padding: 8px 14px;
  font-size: 12px;
  font-family: 'JetBrains Mono', monospace;
  cursor: pointer;
  border-bottom: 2px solid transparent;
  margin-bottom: -1px;
 }
 .primary-tab:hover { color: #e5e5e5; }
 .primary-tab.active {
  color: #e5e5e5;
  border-bottom-color: #f59e0b;
 }
 /* ─── Terminal Tab ────────────────────────────────────────────── */
 #tab-terminal {
  background: #0a0a0a;
  padding: 0;
 }
 .terminal-toolbar {
  display: flex;
  align-items: center;
  justify-content: space-between;
  gap: 6px;
  padding: 4px 8px;
  border-bottom: 1px solid #1a1a1a;
  background: #0a0a0a;
  flex-shrink: 0;
 }
 .terminal-toolbar-actions {
  display: flex;
  gap: 4px;
  flex-wrap: wrap;
 }
 .terminal-toolbar-btn {
  background: transparent;
  border: 1px solid #27272a;
  color: #a1a1aa;
  padding: 3px 10px;
  font-size: 11px;
  font-family: 'JetBrains Mono', monospace;
  border-radius: 3px;
  cursor: pointer;
 }
 .terminal-toolbar-btn:hover {
  color: #f59e0b;
  border-color: #f59e0b;
 }
 .terminal-bootstrap {
  flex: 1;
  display: flex;
--- a/extension/sidepanel.html
+++ b/extension/sidepanel.html
@@ -25,57 +25,28 @@
    </div>
  </div>
  <!-- Security event banner — fires on prompt injection detection.
       Variant A from /plan-design-review 2026-04-19: centered alert-heavy,
       big red error icon, mono layer scores in expandable details. -->
  <div class="security-banner" id="security-banner" role="alert" aria-live="assertive" style="display:none">
    <button class="security-banner-close" id="security-banner-close" aria-label="Dismiss">&times;</button>
    <div class="security-banner-icon" aria-hidden="true">
      <svg width="28" height="28" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
        <circle cx="12" cy="12" r="10"></circle>
        <line x1="12" y1="8" x2="12" y2="12"></line>
        <line x1="12" y1="16" x2="12.01" y2="16"></line>
      </svg>
    </div>
    <div class="security-banner-title" id="security-banner-title">Session terminated</div>
    <div class="security-banner-subtitle" id="security-banner-subtitle">prompt injection detected</div>
    <button class="security-banner-expand" id="security-banner-expand" aria-expanded="false" aria-controls="security-banner-details">
      <span>What happened</span>
      <svg class="security-banner-chevron" width="12" height="12" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round">
        <polyline points="6 9 12 15 18 9"></polyline>
      </svg>
    </button>
    <div class="security-banner-details" id="security-banner-details" hidden>
      <div class="security-banner-section-label">SECURITY LAYERS</div>
      <div class="security-banner-layers" id="security-banner-layers"></div>
      <div class="security-banner-section-label" id="security-banner-suspect-label" hidden>SUSPECTED TEXT</div>
      <pre class="security-banner-suspect" id="security-banner-suspect" hidden></pre>
    </div>
    <div class="security-banner-actions" id="security-banner-actions" hidden>
      <button type="button" class="security-banner-btn security-banner-btn-block" id="security-banner-btn-block">Block session</button>
      <button type="button" class="security-banner-btn security-banner-btn-allow" id="security-banner-btn-allow">Allow and continue</button>
    </div>
  </div>
  <!-- Browser tab bar -->
  <div class="browser-tabs" id="browser-tabs" style="display:none"></div>
-  <!-- Primary surface tabs: Terminal (default) | Chat. Activity / Refs /
-       Inspector still exist as a separate debug-tabs strip below. The
-       Terminal tab is default-active per /plan-eng-review Issue 1B
-       (subsequently informed by codex's spawn-waste finding: PTY only
-       spawns when the user types, so default-active is cheap). -->
-  <nav class="primary-tabs" id="primary-tabs" role="tablist">
-    <button class="primary-tab active" role="tab" data-pane="terminal" aria-selected="true">Terminal</button>
-    <button class="primary-tab" role="tab" data-pane="chat" aria-selected="false">Chat</button>
-  </nav>
+  <!-- Terminal pane is now the sole primary surface. Activity / Refs /
+       Inspector still exist behind the `debug` toggle in the footer. -->
  <!-- Terminal Tab (default-active) -->
  <main id="tab-terminal" class="tab-content active" role="tabpanel" aria-label="Terminal">
    <!-- Toolbar with browser quick-actions on the left, Restart on the right.
         Restart is always visible so the user can force a fresh claude any
         time, not just from the ENDED state. -->
    <div class="terminal-toolbar" id="terminal-toolbar">
      <div class="terminal-toolbar-actions">
        <button id="chat-cleanup-btn" class="terminal-toolbar-btn" title="Remove ads, banners, popups">🧹 Cleanup</button>
        <button id="chat-screenshot-btn" class="terminal-toolbar-btn" title="Take a screenshot">📸 Screenshot</button>
        <button id="chat-cookies-btn" class="terminal-toolbar-btn" title="Import cookies from your browser">🍪 Cookies</button>
      </div>
      <button class="terminal-toolbar-btn" id="terminal-restart-now" title="Restart Claude Code session">↻ Restart</button>
    </div>
    <div class="terminal-bootstrap" id="terminal-bootstrap">
      <div class="terminal-bootstrap-icon">▸</div>
-      <p id="terminal-bootstrap-status">Press any key to start Claude Code.</p>
+      <p id="terminal-bootstrap-status">Starting Claude Code...</p>
      <p class="muted" id="terminal-bootstrap-hint">Real PTY. Real terminal. Real claude.</p>
      <pre id="loading-debug" class="muted" style="font-size:11px; font-family:'JetBrains Mono',monospace; white-space:pre-wrap; margin-top:8px; color:#71717A;"></pre>
    </div>
    <div class="terminal-install-card" id="terminal-install-card" style="display:none">
      <p><strong>Claude Code not found</strong></p>
@@ -89,22 +60,6 @@
    </div>
  </main>
  <!-- Chat Tab (the existing claude -p one-shot chat path; preserved verbatim) -->
  <main id="tab-chat" class="tab-content" role="tabpanel" aria-label="Chat">
    <div class="chat-messages" id="chat-messages">
      <div class="chat-loading" id="chat-loading">
        <div class="chat-loading-spinner"></div>
        <p id="loading-status">Looking for browse server...</p>
        <pre id="loading-debug" class="muted" style="font-size:11px; font-family:'JetBrains Mono',monospace; white-space:pre-wrap; margin-top:8px; color:#71717A;"></pre>
      </div>
      <div class="chat-welcome" id="chat-welcome" style="display:none">
        <div class="chat-welcome-icon">G</div>
        <p>Send a message to Claude Code.</p>
        <p class="muted">Your agent will see it and act on it.</p>
      </div>
    </div>
  </main>
  <!-- Debug: Activity Tab (hidden by default) -->
  <main id="tab-activity" class="tab-content" role="log" aria-live="polite">
    <div class="empty-state" id="empty-state">
@@ -204,30 +159,10 @@
    </div>
  </main>
  <!-- Experimental chat banner (shown when chatEnabled) -->
  <div id="experimental-banner" class="experimental-banner" style="display: none;">
    Browser co-pilot &mdash; controls this browser, reports back to your workspace
  </div>
  <!-- Quick Actions Toolbar -->
  <div class="quick-actions" id="quick-actions">
    <button id="chat-cleanup-btn" class="quick-action-btn" title="Remove ads, banners, popups">🧹 Cleanup</button>
    <button id="chat-screenshot-btn" class="quick-action-btn" title="Take a screenshot">📸 Screenshot</button>
    <button id="chat-cookies-btn" class="quick-action-btn" title="Import cookies from your browser">🍪 Cookies</button>
  </div>
  <!-- Command Bar -->
  <div class="command-bar">
    <button class="stop-btn" id="stop-agent-btn" title="Stop agent" style="display: none;">&#x25A0;</button>
    <input type="text" class="command-input" id="command-input" placeholder="Ask about this page..." autocomplete="off" spellcheck="false">
    <button class="send-btn" id="send-btn" title="Send">&#x2191;</button>
  </div>
  <!-- Footer with connection + debug toggle -->
  <footer>
    <div class="footer-left">
      <button class="debug-toggle" id="debug-toggle" title="Toggle debug panels">debug</button>
      <button class="footer-btn" id="clear-chat" title="Clear chat">clear</button>
      <button class="footer-btn" id="reload-sidebar" title="Reload sidebar">reload</button>
    </div>
    <div class="footer-right">
--- a/extension/sidepanel.js
+++ b/extension/sidepanel.js