v1.16.0.0 feat: tunnel allowlist 17→26 + canDispatchOverTunnel pure function (#1253)

* feat: extend tunnel allowlist to 26 commands + extract canDispatchOverTunnel

Adds newtab, tabs, back, forward, reload, snapshot, fill, url, closetab to TUNNEL_COMMANDS (matching what cli.ts and REMOTE_BROWSER_ACCESS.md already documented). Each new command is bounded by the existing per-tab ownership check at server.ts:613-624: scoped tokens default to tabPolicy: 'own-only', so paired agents still can't operate on tabs they don't own.

Refactors the inline gate check at server.ts:1771-1783 into a pure exported function canDispatchOverTunnel(command). Same behavior as the inline check; the difference is unit-testability without HTTP.

Adds a BROWSE_TUNNEL_LOCAL_ONLY=1 test-mode flag that binds the second Bun.serve listener with makeFetchHandler('tunnel') on 127.0.0.1, with no ngrok needed. Production tunnel still requires BROWSE_TUNNEL=1 + valid NGROK_AUTHTOKEN.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: source-level guards + pure-function unit test + dual-listener behavioral eval

Three layers of regression coverage for the tunnel allowlist:

1. dual-listener.test.ts: replaces must-include/must-exclude with exact-set equality on the 26-command literal (the prior intersection-only style let new commands sneak into the source without test updates). Adds a regex assertion that the `command !== 'newtab'` ownership exemption at server.ts:613 still exists, which catches refactors that re-introduce the catch-22 from the other side. Updates the /command handler test to look for canDispatchOverTunnel(body?.command) instead of the inline check.
2. tunnel-gate-unit.test.ts (new): 53 expects covering all 26 allowed, 20 blocked, null/undefined/empty/non-string defensive handling, and alias canonicalization (e.g. 'set-content' resolves to 'load-html', which is correctly rejected since 'load-html' isn't tunnel-allowed).
3. pair-agent-tunnel-eval.test.ts (new): 4 behavioral tests that spawn the daemon under BROWSE_HEADLESS_SKIP=1 BROWSE_TUNNEL_LOCAL_ONLY=1, bind both listeners on 127.0.0.1, mint a scoped token via /pair → /connect, and assert:
   (a) newtab over tunnel passes the gate;
   (b) pair over tunnel 403s with disallowed_command:pair AND writes a denial-log entry;
   (c) pair over local does NOT trigger the tunnel gate (proves the gate is surface-scoped);
   (d) regression for the catch-22: newtab + goto on the resulting tab does not 403 with "Tab not owned by your agent".

All four tests run free under bun test (no API spend, no ngrok).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: bump tunnel allowlist count 17 -> 26 in CLAUDE.md and REMOTE_BROWSER_ACCESS.md

Both docs already named the 9 new commands as remote-accessible (the operator guide's per-command sections at lines 86-119 and 168, plus cli.ts:546-586's instruction blocks). The allowlist count was the only place the drift was visible.

Also corrected REMOTE_BROWSER_ACCESS.md's denied-commands list: 'eval' is in the allowlist, not the denied list; the prior doc was wrong.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: bump version and changelog (v1.21.0.0)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: re-version v1.21.0.0 -> v1.16.0.0 (lowest unclaimed slot)

The previous bump landed at v1.21.0.0 because gstack-next-version advances past the highest claimed slot (v1.20.0.0 from #1252) rather than picking the lowest unclaimed. v1.16-v1.18 are unclaimed and v1.16.0.0 preserves monotonic version ordering on main once #1234 (v1.17), #1233 (v1.19), and #1252 (v1.20) merge after us.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(ci): version-gate enforces collisions, allows lower-but-unclaimed slots

The gate was rejecting any PR VERSION below the util's next-slot recommendation, even when the lower slot was unclaimed. This blocked PRs that legitimately want to land at an unclaimed slot below the queue max, which is what /ship should pick when the goal is monotonic version ordering on main (lower-numbered PRs landing first preserves order; the util's "advance past max claimed" semantics only optimizes for fresh runs picking unique slots, not for queue ordering on merge).

New gate logic:

1. Hard-fail if PR VERSION <= base VERSION (no actual bump).
2. Hard-fail if PR VERSION exactly matches another open PR's VERSION (real collision).
3. Pass otherwise. If the PR is below the util's suggestion, emit an informational ::notice:: explaining the slot is unclaimed.

The util's output stays informational: it tells fresh /ship runs what the next-up slot should be, but the gate only blocks actual conflicts. This is a strict relaxation: every PR that passed the old gate also passes the new one.

Confirmed by dry-run against the current queue (4 open PRs claiming 1.17.0.0, 1.19.0.0, 1.21.1.0, 1.22.0.0):

- v1.16.0.0 → pass with informational notice (unclaimed)
- v1.17.0.0 → fail (collision with #1234)
- v1.15.0.0 → fail (no bump from base)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
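
The three-rule version gate described in the fix(ci) entry can be sketched as a small pure function. This is an illustration only, not the CI script from this commit (which is not shown here): the names `checkVersionGate`, `compareVersions`, and `GateResult` are hypothetical, the base version 1.15.0.0 is inferred from the dry-run's "no bump from base" result, and the util's suggested slot 1.23.0.0 is an assumed placeholder.

```typescript
// Hypothetical sketch of the relaxed version gate; names are illustrative.
type GateResult =
  | { ok: true; notice?: string }
  | { ok: false; reason: string };

// Parse "v1.16.0.0" or "1.16.0.0" into numeric parts.
function parseVersion(v: string): number[] {
  return v.replace(/^v/, '').split('.').map((part) => Number.parseInt(part, 10));
}

// Returns -1, 0, or 1, comparing part by part (missing parts count as 0).
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  const len = Math.max(pa.length, pb.length);
  for (let i = 0; i < len; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}

function checkVersionGate(
  prVersion: string,
  baseVersion: string,
  claimedByOtherPrs: string[],
  suggested: string,
): GateResult {
  // Rule 1: the PR must actually bump the version.
  if (compareVersions(prVersion, baseVersion) <= 0) {
    return { ok: false, reason: 'no bump from base' };
  }
  // Rule 2: an exact match with another open PR is a real collision.
  if (claimedByOtherPrs.some((v) => compareVersions(v, prVersion) === 0)) {
    return { ok: false, reason: 'collision' };
  }
  // Rule 3: pass; below the suggestion is allowed, with a notice.
  if (compareVersions(prVersion, suggested) < 0) {
    return { ok: true, notice: `${prVersion} is below ${suggested} but unclaimed` };
  }
  return { ok: true };
}

// Queue from the dry-run; base and suggested slot are assumptions.
const queue = ['1.17.0.0', '1.19.0.0', '1.21.1.0', '1.22.0.0'];
console.log(checkVersionGate('v1.16.0.0', '1.15.0.0', queue, '1.23.0.0'));
console.log(checkVersionGate('v1.17.0.0', '1.15.0.0', queue, '1.23.0.0'));
console.log(checkVersionGate('v1.15.0.0', '1.15.0.0', queue, '1.23.0.0'));
```

Keeping the suggestion out of the failure paths is what makes this a strict relaxation: the suggested slot only influences whether a notice is attached, never whether the gate passes.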
2026-05-17 17:51:27 +08:00 · 2026-04-28 00:57:28 -07:00
parent dde55103fc
commit 8f3701b761
10 changed files with 489 additions and 35 deletions
--- a/browse/src/server.ts
+++ b/browse/src/server.ts
@@ -108,13 +108,31 @@ const TUNNEL_PATHS = new Set<string>([
 * extension-inspector state. This allowlist maps to the eng-review decision
 * logged in the CEO plan for sec-wave v1.6.0.0.
 */
-const TUNNEL_COMMANDS = new Set<string>([
+export const TUNNEL_COMMANDS = new Set<string>([
+  // Original 17
  'goto', 'click', 'text', 'screenshot',
  'html', 'links', 'forms', 'accessibility',
  'attrs', 'media', 'data',
  'scroll', 'press', 'type', 'select', 'wait', 'eval',
+  // Tab + navigation primitives operator docs and CLI hints already promised
+  'newtab', 'tabs', 'back', 'forward', 'reload',
+  // Read/inspect/write operators paired agents need to be useful
+  'snapshot', 'fill', 'url', 'closetab',
 ]);

+/**
+ * Pure gate: returns true iff the command is reachable over the tunnel surface.
+ * Extracted from the inline /command handler so the gate logic is unit-testable
+ * without standing up an HTTP listener. Behavior is identical to the inline
+ * check; the function canonicalizes the command (so aliases hit the same set)
+ * and returns false for null/undefined input.
+ */
+export function canDispatchOverTunnel(command: string | undefined | null): boolean {
+  if (typeof command !== 'string' || command.length === 0) return false;
+  const cmd = canonicalizeCommand(command);
+  return TUNNEL_COMMANDS.has(cmd);
+}
+
 /**
 * Read ngrok authtoken from env var, ~/.gstack/ngrok.env, or ngrok's native
 * config files.  Returns null if nothing found.  Shared between the
@@ -1772,8 +1790,7 @@ async function start() {
        // Paired remote agents drive the browser but cannot configure the
        // daemon, launch new browsers, import cookies, or rotate tokens.
        if (surface === 'tunnel') {
-          const cmd = canonicalizeCommand(body?.command);
-          if (!cmd || !TUNNEL_COMMANDS.has(cmd)) {
+          if (!canDispatchOverTunnel(body?.command)) {
            logTunnelDenial(req, url, `disallowed_command:${body?.command}`);
            return new Response(JSON.stringify({
              error: `Command '${body?.command}' is not allowed over the tunnel surface`,
@@ -2060,6 +2077,29 @@ async function start() {
        tunnelListener = null;
      }
    }
+  } else if (process.env.BROWSE_TUNNEL_LOCAL_ONLY === '1') {
+    // Test-only: bind the dual-listener tunnel surface on 127.0.0.1 with NO
+    // ngrok forwarding. Lets paid evals exercise the surface==='tunnel' gate
+    // without an ngrok authtoken or live network. Production tunneling still
+    // requires BROWSE_TUNNEL=1 + a valid authtoken above.
+    try {
+      const boundTunnel = Bun.serve({
+        port: 0,
+        hostname: '127.0.0.1',
+        fetch: makeFetchHandler('tunnel'),
+      });
+      tunnelServer = boundTunnel;
+      tunnelActive = true;
+      const tunnelPort = boundTunnel.port;
+      console.log(`[browse] Tunnel listener bound (local-only test mode) on 127.0.0.1:${tunnelPort}`);
+      const stateContent = JSON.parse(fs.readFileSync(config.stateFile, 'utf-8'));
+      stateContent.tunnelLocalPort = tunnelPort;
+      const tmpState = config.stateFile + '.tmp';
+      fs.writeFileSync(tmpState, JSON.stringify(stateContent, null, 2), { mode: 0o600 });
+      fs.renameSync(tmpState, config.stateFile);
+    } catch (err: any) {
+      console.error(`[browse] BROWSE_TUNNEL_LOCAL_ONLY=1 listener bind failed: ${err.message}`);
+    }
  }
 }
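
Because canDispatchOverTunnel is pure, it can be exercised without a daemon or HTTP listener. The sketch below reproduces the allowlist and the gate from the hunks above so it runs standalone. Note that `canonicalizeCommand` here is a stand-in: the real helper is not part of this diff, and the only alias assumed is the 'set-content' → 'load-html' mapping mentioned in the commit message.

```typescript
// Allowlist copied from the diff above (17 original + 9 new = 26).
const TUNNEL_COMMANDS = new Set<string>([
  'goto', 'click', 'text', 'screenshot',
  'html', 'links', 'forms', 'accessibility',
  'attrs', 'media', 'data',
  'scroll', 'press', 'type', 'select', 'wait', 'eval',
  'newtab', 'tabs', 'back', 'forward', 'reload',
  'snapshot', 'fill', 'url', 'closetab',
]);

// Stand-in for the real canonicalizeCommand (assumption): resolves one
// known alias and passes everything else through unchanged.
const ALIASES: Record<string, string> = { 'set-content': 'load-html' };
function canonicalizeCommand(command: string): string {
  return ALIASES[command] ?? command;
}

// Same body as the exported gate in server.ts.
function canDispatchOverTunnel(command: string | undefined | null): boolean {
  if (typeof command !== 'string' || command.length === 0) return false;
  const cmd = canonicalizeCommand(command);
  return TUNNEL_COMMANDS.has(cmd);
}

console.log(TUNNEL_COMMANDS.size);                  // 26
console.log(canDispatchOverTunnel('newtab'));       // true: newly allowed
console.log(canDispatchOverTunnel('pair'));         // false: not in the allowlist
console.log(canDispatchOverTunnel('set-content'));  // false: alias of 'load-html'
console.log(canDispatchOverTunnel(undefined));      // false: defensive handling
```

Checking the set size alongside individual commands mirrors the exact-set style the commit message describes: a command added to the literal without a matching test update changes the count and is caught.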