feat(browse): Puppeteer parity — load-html, screenshot --selector, viewport --scale, file:// (v1.1.0.0) (#1062)

* feat(browse): TabSession loadedHtml + command aliases + DX polish primitives

Adds the foundation layer for Puppeteer-parity features:

- TabSession.loadedHtml + setTabContent/getLoadedHtml/clearLoadedHtml —
  enables load-html content to survive context recreation (viewport --scale)
  via in-memory replay. ASCII lifecycle diagram in the source explains the
  clear-before-navigation contract.

- COMMAND_ALIASES + canonicalizeCommand() helper — single source of truth
  for name aliases (setcontent / set-content / setContent → load-html),
  consumed by server dispatch and chain prevalidation.

- buildUnknownCommandError() pure function — rich error messages with
  Levenshtein-based "Did you mean" suggestions (distance ≤ 2, input
  length ≥ 4 to skip 2-letter noise) and NEW_IN_VERSION upgrade hints.

- load-html registered in WRITE_COMMANDS + SCOPE_WRITE so scoped write
  tokens can use it.

- screenshot and viewport descriptions updated for upcoming flags.

- New browse/test/dx-polish.test.ts (15 tests): alias canonicalization,
  Levenshtein threshold + alphabetical tiebreak, short-input guard,
  NEW_IN_VERSION upgrade hint, alias + scope integration invariants.

No consumers yet — pure additive foundation. Safe to bisect on its own.
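
The alias and suggestion rules above can be sketched as follows. This is a minimal illustration, not the code in commands.ts: the function bodies are assumptions, and suggestCommand is a hypothetical stand-in for the suggestion step inside buildUnknownCommandError.

```typescript
// Alias table: the three spellings come from the commit message above.
const COMMAND_ALIASES: Record<string, string> = {
  setcontent: 'load-html',
  'set-content': 'load-html',
  setContent: 'load-html',
};

// Unknown names pass through unchanged; the dispatcher decides what is valid.
function canonicalizeCommand(name: string): string {
  return Object.prototype.hasOwnProperty.call(COMMAND_ALIASES, name)
    ? COMMAND_ALIASES[name]
    : name;
}

// Two-row dynamic-programming Levenshtein distance.
function levenshtein(a: string, b: string): number {
  let prev: number[] = [];
  for (let j = 0; j <= b.length; j++) prev.push(j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: closest command within distance 2, or undefined.
function suggestCommand(input: string, commands: Iterable<string>): string | undefined {
  if (input.length < 4) return undefined; // short-input guard
  let best: string | undefined;
  let bestDist = Infinity;
  // Sorting first means a strict "<" keeps the alphabetically first tie.
  for (const cand of Array.from(commands).sort()) {
    const d = levenshtein(input, cand);
    if (d <= 2 && d < bestDist) {
      best = cand;
      bestDist = d;
    }
  }
  return best;
}
```

Sorting the candidates before the scan is what makes the tiebreak deterministic without a separate comparison clause.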

* feat(browse): accept file:// in goto with smart cwd/home-relative parsing

Extends validateNavigationUrl to accept file:// URLs scoped to safe dirs
(cwd + TEMP_DIR) via the existing validateReadPath policy. The workhorse is a
new normalizeFileUrl() helper that handles non-standard relative forms BEFORE
the WHATWG URL parser sees them:

    file:///abs/path.html       → unchanged
    file://./docs/page.html     → file://<cwd>/docs/page.html
    file://~/Documents/page.html → file://<HOME>/Documents/page.html
    file://docs/page.html       → file://<cwd>/docs/page.html
    file://localhost/abs/path   → unchanged
    file://host.example.com/... → rejected (UNC/network)
    file:// and file:///        → rejected (would list a directory)

Host heuristic rejects segments with '.', ':', '\\', '%', IPv6 brackets, or
Windows drive-letter patterns — so file://docs.v1/page.html, file://127.0.0.1/x,
file://[::1]/x, and file://C:/Users/x are explicit errors.

Uses fileURLToPath() + pathToFileURL() from node:url (never string-concat) so
URL escapes like %20 decode correctly and Node rejects encoded-slash traversal
(%2F..%2F) outright.
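
The shape table and host heuristic above can be read as a classifier over the text after "file://". The sketch below is an assumption-laden illustration: classifyFileUrl and the FileUrlShape labels are hypothetical names, and the real normalizeFileUrl additionally resolves paths through fileURLToPath() and pathToFileURL(), which is omitted here.

```typescript
type FileUrlShape =
  | 'absolute'
  | 'localhost'
  | 'cwd-relative'
  | 'home-relative'
  | 'bare-segment'
  | 'rejected-host'
  | 'rejected-empty';

// Hypothetical classifier for the forms listed in the table above.
function classifyFileUrl(raw: string): FileUrlShape {
  const prefix = 'file://';
  if (raw.slice(0, prefix.length).toLowerCase() !== prefix) {
    throw new Error('not a file:// URL');
  }
  const rest = raw.slice(prefix.length);
  if (rest === '' || rest === '/') return 'rejected-empty'; // would list a directory
  if (rest.charAt(0) === '/') return 'absolute';
  if (rest.slice(0, 2) === './') return 'cwd-relative';
  if (rest.slice(0, 2) === '~/') return 'home-relative';

  const slash = rest.indexOf('/');
  const first = slash === -1 ? rest : rest.slice(0, slash);
  if (first.toLowerCase() === 'localhost') return 'localhost';
  // Host heuristic: dots, colons, backslashes, percent signs, IPv6 brackets,
  // and drive-letter prefixes such as "C:" all look like hosts or Windows paths.
  if (/[.:\\%\[\]]/.test(first) || /^[a-zA-Z]:/.test(first)) return 'rejected-host';
  return 'bare-segment';
}
```

Under these assumptions, file://docs/page.html classifies as a bare segment (resolved against cwd), while file://docs.v1/page.html is rejected because the first segment contains a dot.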

Signature change: validateNavigationUrl now returns Promise<string> (the
normalized URL) instead of Promise<void>. Existing callers that ignore the
return value still compile; they just don't benefit from smart parsing until
they are migrated in the next few commits (goto, diff, newTab, restoreState).

Rewrites the url-validation test file: updates existing tests for the new
return type, adds 20+ new tests covering every normalizeFileUrl shape variant,
URL-encoding edge cases, and path-traversal rejection.

References: codex consult v3 P1 findings on URL parser semantics and fileURLToPath.

* feat(browse): BrowserManager deviceScaleFactor + setContent replay + file:// plumbing

Three tightly-coupled changes to BrowserManager, all in service of the
Puppeteer-parity workflow:

1. deviceScaleFactor + currentViewport tracking. New private fields (default
   scale=1, viewport=1280x720) + setDeviceScaleFactor(scale, w, h) method.
   deviceScaleFactor is a context-level Playwright option — changing it
   requires recreateContext(). The method validates (finite number, 1-3 cap,
   headed-mode rejected), stores new values, calls recreateContext(), and
   rolls back the fields on failure so a bad call doesn't leave inconsistent
   state. Context options at all three sites (launch, recreate happy path,
   recreate fallback) now honor the stored values instead of hardcoding
   1280x720.

2. BrowserState.loadedHtml + loadedHtmlWaitUntil. saveState captures per-tab
   loadedHtml from the session; restoreState replays it via newSession.
   setTabContent() — NOT bare page.setContent() — so TabSession.loadedHtml
   is rehydrated and survives *subsequent* scale changes. In-memory only,
   never persisted to disk (HTML may contain secrets or customer data).

3. newTab + restoreState now consume validateNavigationUrl's normalized
   return value. file://./x, file://~/x, and bare-segment forms now take
   effect at every navigation site, not just the top-level goto command.

Together these enable: load-html → viewport --scale 2 → viewport --scale 1.5
→ screenshot, with content surviving both context recreations. Codex v2 P0
flagged that bare page.setContent in restoreState would lose content on the
second scale change — this commit implements the rehydration path.

References: codex v2 P0 (TabSession rehydration), codex v3 P1 (4-caller
return value), plan Feature 3 + Feature 4.
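
The validate, store, recreate, roll-back sequence in item 1 can be sketched like this. ScaleState and the injected recreate callback are hypothetical stand-ins for BrowserManager and recreateContext(); the headed-mode check is omitted. The sketch already includes the forced second recreate that the pre-landing review entry later in this message adds to the rollback path.

```typescript
interface Viewport { width: number; height: number }
type Recreate = (scale: number, viewport: Viewport) => Promise<void>;

class ScaleState {
  private deviceScaleFactor = 1;
  private currentViewport: Viewport = { width: 1280, height: 720 };
  private readonly recreate: Recreate;

  // `recreate` is injected so the sketch runs without a browser.
  constructor(recreate: Recreate) {
    this.recreate = recreate;
  }

  getDeviceScaleFactor(): number {
    return this.deviceScaleFactor;
  }

  async setDeviceScaleFactor(scale: number, width: number, height: number): Promise<void> {
    if (!Number.isFinite(scale)) throw new Error(`--scale ${scale} is not a finite number`);
    if (scale < 1 || scale > 3) throw new Error(`--scale ${scale} must be between 1 and 3`);

    const prevScale = this.deviceScaleFactor;
    const prevViewport = this.currentViewport;
    this.deviceScaleFactor = scale;
    this.currentViewport = { width, height };
    try {
      await this.recreate(this.deviceScaleFactor, this.currentViewport);
    } catch (err) {
      // Restore the tracked fields, then rebuild the context with the old
      // values so the live context and the tracked state agree again.
      this.deviceScaleFactor = prevScale;
      this.currentViewport = prevViewport;
      try {
        await this.recreate(prevScale, prevViewport);
      } catch (rollbackErr) {
        throw new Error(
          `scale change failed (${String(err)}); rollback also failed (${String(rollbackErr)})`,
        );
      }
      throw err;
    }
  }
}
```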

* feat(browse): load-html, screenshot --selector, viewport --scale, alias dispatch

Wires the new handlers and dispatch logic that the previous commits made
possible:

write-commands.ts
- New 'load-html' case: validateReadPath for safe-dir scoping, stat-based
  actionable errors (not found, directory, oversize), extension allowlist
  (.html/.htm/.xhtml/.svg), magic-byte sniff with UTF-8 BOM strip accepting
  any <[a-zA-Z!?] markup opener (not just <!doctype — bare fragments like
  <div>...</div> work for setContent), 50MB cap via GSTACK_BROWSE_MAX_HTML_BYTES
  override, frame-context rejection. Calls session.setTabContent() so replay
  metadata is rehydrated.
- viewport command extended: optional [<WxH>], optional [--scale <n>],
  scale-only variant reads current size via page.viewportSize(). Invalid
  scale (NaN, Infinity, empty, out of 1-3) throws with named value. Headed
  mode rejected explicitly.
- clearLoadedHtml() called BEFORE goto/back/forward/reload navigation
  (not after) so a timed-out goto post-commit doesn't leave stale metadata
  that could resurrect on a later context recreation. Codex v2 P1 catch.
- goto uses validateNavigationUrl's normalized return value.
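
The magic-byte sniff described for load-html can be sketched over raw bytes. looksLikeHtml is a hypothetical name, and skipping leading ASCII whitespace is an assumption; the commit message only states the BOM strip and the <[a-zA-Z!?] opener rule.

```typescript
// Returns true when the bytes start with something that looks like markup.
function looksLikeHtml(bytes: Uint8Array): boolean {
  let i = 0;
  // Strip a UTF-8 byte-order mark (EF BB BF) if present.
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    i = 3;
  }
  // Assumption: leading spaces, tabs, and newlines are skipped before the check.
  while (
    i < bytes.length &&
    (bytes[i] === 0x20 || bytes[i] === 0x09 || bytes[i] === 0x0a || bytes[i] === 0x0d)
  ) {
    i++;
  }
  // Need '<' followed by at least one more byte.
  if (i + 1 >= bytes.length || bytes[i] !== 0x3c) return false;
  const next = bytes[i + 1];
  const isLetter = (next >= 0x41 && next <= 0x5a) || (next >= 0x61 && next <= 0x7a);
  // '!' covers <!doctype and comments; '?' covers <?xml prologs.
  return isLetter || next === 0x21 || next === 0x3f;
}
```

A PNG header fails at the first byte (0x89), while a bare fragment such as <div>...</div> passes, which matches the behavior the bullet describes.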

meta-commands.ts
- screenshot --selector <css> flag: explicit element-screenshot form.
  Rejected when combined with a positional selector (both = error); preserves
  the --clip conflict at line 161 and composes with --base64 at lines 168-174.
- chain canonicalizes each step with canonicalizeCommand — step shape is
  now { rawName, name, args } so prevalidation, dispatch, WRITE_COMMANDS.has,
  watch blocking, and result labels all use canonical names while audit
  labels show 'rawName→name' when aliased. Codex v3 P2 catch — prior shape
  only canonicalized at prevalidation and diverged everywhere else.
- diff command consumes validateNavigationUrl return value for both URLs.
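
The chain step shape can be illustrated with a small sketch. toChainSteps, stepLabel, and the local ALIASES table are hypothetical; only the { rawName, name, args } shape and the 'rawName→name' label format come from the description above.

```typescript
interface ChainStep {
  rawName: string; // what the caller typed
  name: string;    // canonical command used for validation and dispatch
  args: string[];
}

// Local copy of the alias spellings named in this commit message.
const ALIASES: Record<string, string> = {
  setcontent: 'load-html',
  'set-content': 'load-html',
  setContent: 'load-html',
};

// Canonicalize once, up front, so every later check sees the same name.
function toChainSteps(commands: string[][], aliases: Record<string, string> = ALIASES): ChainStep[] {
  return commands.map((cmd) => {
    const rawName = cmd[0];
    if (typeof rawName !== 'string' || rawName === '') {
      throw new Error('chain step is missing a command name');
    }
    const name = Object.prototype.hasOwnProperty.call(aliases, rawName)
      ? aliases[rawName]
      : rawName;
    return { rawName, name, args: cmd.slice(1) };
  });
}

// Labels show the alias only when one was actually resolved.
function stepLabel(step: ChainStep): string {
  return step.rawName === step.name ? step.name : `${step.rawName}→${step.name}`;
}
```

Carrying both names on the step avoids the divergence the bullet mentions, where only prevalidation saw the canonical form.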

server.ts
- Command canonicalization inserted immediately after parse, before scope /
  watch / tab-ownership / content-wrapping checks. rawCommand preserved for
  future audit (not wired into audit log in this commit — follow-up).
- Unknown-command handler replaced with buildUnknownCommandError() from
  commands.ts — produces 'Unknown command: X. Did you mean Y?' with optional
  upgrade hint for NEW_IN_VERSION entries.

security-audit-r2.test.ts
- Updated chain-loop marker from 'for (const cmd of commands)' to
  'for (const c of commands)' to match the new chain step shape. Same
  isWatching + BLOCKED invariants still asserted.

* chore: bump version and changelog (v1.1.0.0)

- VERSION: 1.0.0.0 → 1.1.0.0 (MINOR bump — new user-facing commands)
- package.json: matching version bump
- CHANGELOG.md: new 1.1.0.0 entry describing load-html, screenshot --selector,
  viewport --scale, file:// support, setContent replay, and DX polish in user
  voice with a dedicated Security section for file:// safe-dirs policy
- browse/SKILL.md.tmpl: adds pattern #12 "Render local HTML", pattern #13
  "Retina screenshots", and a full Puppeteer → browse cheatsheet with side-by-
  side API mapping and a worked tweet-renderer migration example
- browse/SKILL.md + SKILL.md: regenerated from templates via `bun run gen:skill-docs`
  to reflect the new command descriptions

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: pre-landing review fixes (9 findings from specialist + adversarial review)

Adversarial review (Claude subagent + Codex) surfaced 9 bugs across
CRITICAL/HIGH severity. All fixed:

1. tab-session.ts:setTabContent — state mutation moved AFTER the setContent
   await. Prior order left phantom HTML in replay metadata if setContent
   threw (timeout, browser crash), which a later viewport --scale would
   silently replay. Now loadedHtml is only recorded on successful load.

2. browser-manager.ts:setDeviceScaleFactor — rollback now forces a second
   recreateContext after restoring the old fields. The fallback path in
   the original recreateContext builds a blank context using whatever
   this.deviceScaleFactor/currentViewport hold at that moment (which were
   the NEW values we were trying to apply). Rolling back the fields without
   a second recreate left the live context at new-scale while state tracked
   old-scale. Now: restore fields, force a re-recreate with the old values,
   and return a combined error only if that ALSO fails.

3. commands.ts:buildUnknownCommandError — Levenshtein tiebreak simplified
   to 'd <= 2 && d < bestDist' (strict less). Candidates are pre-sorted
   alphabetically, so first equal-distance wins by default. The prior
   '(d === bestDist && best !== undefined && cand < best)' clause was dead
   code.

4. tab-session.ts:onMainFrameNavigated — now clears loadedHtml, not just
   refs + frame. Without this, a user who load-html'd then clicked a link
   (or had a form submit / JS redirect / OAuth flow) would retain the stale
   replay metadata. The next viewport --scale would silently revert the
   tab to the ORIGINAL loaded HTML, losing whatever the post-navigation
   content was. Silent data corruption. Browser-emitted navigations trigger
   this path via wirePageEvents.

5. browser-manager.ts:saveState + restoreState — tab ownership now flows
   through BrowserState.owner. Without this, a scoped agent's viewport
   --scale would strand them: tab IDs change during recreate, ownership
   map held stale IDs, owner lookup failed. New IDs had no owner, so
   writes without tabId were denied (DoS). Worse, if the agent sent a
   stale tabId the server's swallowed-tab-switch-error path would let the
   command hit whatever tab was currently active (cross-tab authz bypass).
   Now: clear ownership before restore, re-add per-tab with new IDs.

6. meta-commands.ts:state load — disk-loaded state.pages is now explicit
   allowlist (url, isActive, storage:null) instead of object spread.
   Spreading accepted loadedHtml, loadedHtmlWaitUntil, and owner from a
   user-writable state file, letting a tampered state.json smuggle HTML
   past load-html's safe-dirs / extension / magic-byte / 50MB-cap
   validators, or forge tab ownership. Now stripped at the boundary.

7. url-validation.ts:normalizeFileUrl — preserves query string + fragment
   across normalization. file://./app.html?route=home#login previously
   resolved to a filesystem path that URL-encoded '?' as %3F and '#' as
   %23, or (for absolute forms) pathToFileURL dropped them entirely. SPAs
   and fixture URLs with query params 404'd or loaded the wrong route.
   Now: split on ?/# before path resolution, reattach after.

8. url-validation.ts:validateNavigationUrl — reattaches parsed.search +
   parsed.hash to the normalized file:// URL. Same fix at the main
   validator for absolute paths that go through fileURLToPath round-trip.

9. server.ts:writeAuditEntry — audit entries now include aliasOf when the
   user typed an alias ('setcontent' → cmd: 'load-html', aliasOf:
   'setcontent'). Previously the isAliased variable was computed but
   dropped, losing the raw input from the forensic trail. Completes the
   plan's codex v3 P2 requirement.
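
Fix 6 replaces an object spread with an explicit allowlist. A minimal sketch of that boundary, assuming a hypothetical sanitizeLoadedPages helper and assuming malformed entries are skipped (the commit message does not say how they are handled):

```typescript
interface SavedPage {
  url: string;
  isActive: boolean;
  storage: null;
}

// Copy only the allowlisted fields from untrusted, disk-loaded page entries.
function sanitizeLoadedPages(raw: unknown): SavedPage[] {
  const pages: SavedPage[] = [];
  if (!Array.isArray(raw)) return pages;
  for (const entry of raw) {
    if (typeof entry !== 'object' || entry === null) continue;
    const page = entry as Record<string, unknown>;
    const url = page['url'];
    if (typeof url !== 'string') continue; // assumption: drop malformed entries
    // loadedHtml, loadedHtmlWaitUntil, owner, and anything else are never copied.
    pages.push({ url, isActive: page['isActive'] === true, storage: null });
  }
  return pages;
}
```

With a spread, any extra key in state.json rides along; with the allowlist, a new field has to be added deliberately before it can cross the boundary.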

Also added bm.getCurrentViewport() and switched 'viewport --scale'-
without-size to read from it (more reliable than page.viewportSize() on
headed/transition contexts).

Tests pass: exit 0, no failures. Build clean.
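
Fixes 7 and 8 in the list above both come down to splitting off the query string and fragment before path resolution and reattaching them afterwards. A sketch under that reading; splitFileUrlSuffix and normalizeKeepingSuffix are hypothetical names, and resolvePath stands in for whatever path resolution the real code performs:

```typescript
// Split "path?query#hash" at the first '?' or '#', whichever comes first.
function splitFileUrlSuffix(rest: string): { pathPart: string; suffix: string } {
  const q = rest.indexOf('?');
  const h = rest.indexOf('#');
  const cut = q === -1 ? h : h === -1 ? q : Math.min(q, h);
  if (cut === -1) return { pathPart: rest, suffix: '' };
  return { pathPart: rest.slice(0, cut), suffix: rest.slice(cut) };
}

// Resolve only the path part, then reattach the untouched suffix.
function normalizeKeepingSuffix(rest: string, resolvePath: (p: string) => string): string {
  const { pathPart, suffix } = splitFileUrlSuffix(rest);
  return resolvePath(pathPart) + suffix;
}
```

Because the suffix never passes through the filesystem path step, '?' and '#' cannot be percent-encoded or dropped along the way.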

* test: integration coverage for load-html, screenshot --selector, viewport --scale, replay, aliases

Adds 28 Playwright-integration tests that close the coverage gap flagged
by the ship-workflow coverage audit (50% → expected ~80%+).

**load-html (12 tests):**
- happy path loads HTML file, page text matches
- bare HTML fragments (<div>...</div>) accepted, not just full documents
- missing file arg throws usage
- non-.html extension rejected by allowlist
- /etc/passwd.html rejected by safe-dirs policy
- ENOENT path rejected with actionable "not found" error
- directory target rejected
- binary file (PNG magic bytes) disguised as .html rejected by magic-byte check
- UTF-8 BOM stripped before magic-byte check — BOM-prefixed HTML accepted
- --wait-until networkidle exercises non-default branch
- invalid --wait-until value rejected
- unknown flag rejected

**screenshot --selector (5 tests):**
- --selector flag captures element, validates Screenshot saved (element)
- conflicts with positional selector (both = error)
- conflicts with --clip (mutually exclusive)
- composes with --base64 (returns data:image/png;base64,...)
- missing value throws usage

**viewport --scale (6 tests):**
- WxH --scale 2 produces PNG with 2x element dimensions (parses IHDR bytes 16-23)
- --scale without WxH keeps current size + applies scale
- non-finite value (abc) throws "not a finite number"
- out-of-range (4, 0.5) throws "between 1 and 3"
- missing value throws
- neither WxH nor --scale throws usage
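
The IHDR check used by the first viewport --scale test relies on the fixed PNG layout: an 8-byte signature, a 4-byte chunk length, the 4-byte type "IHDR", then width and height as big-endian 32-bit integers at byte offsets 16 and 20. A small sketch, with readPngSize as a hypothetical helper name:

```typescript
// Read a big-endian unsigned 32-bit integer at the given offset.
function u32be(bytes: Uint8Array, offset: number): number {
  return (
    ((bytes[offset] << 24) |
      (bytes[offset + 1] << 16) |
      (bytes[offset + 2] << 8) |
      bytes[offset + 3]) >>> 0
  );
}

// Hypothetical helper: pixel dimensions from a PNG's IHDR chunk.
function readPngSize(png: Uint8Array): { width: number; height: number } {
  const signature = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
  if (png.length < 24 || signature.some((b, i) => png[i] !== b)) {
    throw new Error('not a PNG');
  }
  return { width: u32be(png, 16), height: u32be(png, 20) };
}
```

A 100x50 CSS-pixel element captured at scale 2 should report 200x100 here, which is the assertion the test makes.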

**setContent replay across context recreation (3 tests):**
- load-html → viewport --scale 2: content survives (hits setTabContent replay path)
- double cycle 2x → 1.5x: content still survives (proves TabSession rehydration)
- goto after load-html clears replay: subsequent viewport --scale does NOT
  resurrect the stale HTML (validates the onMainFrameNavigated fix)

**Command aliases (2 tests):**
- setcontent routes to load-html via chain canonicalization
- set-content (hyphenated) also routes — both end-to-end through chain dispatch

Fixture paths use /tmp (SAFE_DIRECTORIES entry) instead of $TMPDIR which is
/var/folders/... on macOS and outside the safe-dirs boundary. Chain result
labels use rawName→name format when an alias is resolved (matches the
meta-commands.ts chain refactor).

Full suite: exit 0, 223/223 pass.

* docs: update BROWSER.md + CHANGELOG for v1.1.0.0

BROWSER.md:
- Command reference table updated: goto now lists file:// support,
  load-html added to Navigate row, viewport flagged with --scale
  option, screenshot row shows --selector + --base64 flags
- Screenshot modes table adds the fifth mode (element crop via
  --selector flag) and notes the tag-selector-not-caught-positionally
  gotcha
- New "Retina screenshots — viewport --scale" subsection explains
  deviceScaleFactor mechanics, context recreation side effects, and
  headed-mode rejection
- New "Loading local HTML — goto file:// vs load-html" subsection
  explains the two paths, their tradeoffs (URL state, relative asset
  resolution), the safe-dirs policy, extension allowlist + magic-byte
  sniff, 50MB cap, setContent replay across recreateContext, and the
  alias routing (setcontent → load-html before scope check)

CHANGELOG.md (v1.1.0.0 security section expanded, no existing content
removed):
- State files cannot smuggle HTML or forge tab ownership (allowlist
  on disk-loaded page fields)
- Audit log records aliasOf when a canonical command was reached via
  an alias (setcontent → load-html)
- load-html content clears on real navigations (clicks, form submits,
  JS redirects) — not just explicit goto. Also notes SPA query/fragment
  preservation for goto file://

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Garry Tan
2026-04-18 23:25:33 +08:00
committed by GitHub
parent 4d2c8d94d0
commit c15b805cd8
20 changed files with 1439 additions and 92 deletions


@@ -2088,3 +2088,340 @@ describe('Frame', () => {
await handleMetaCommand('frame', ['main'], bm, async () => {});
});
});
// ─── load-html ─────────────────────────────────────────────────
describe('load-html', () => {
const tmpDir = '/tmp';
const fixturePath = path.join(tmpDir, `browse-test-loadhtml-${Date.now()}.html`);
const fragmentPath = path.join(tmpDir, `browse-test-fragment-${Date.now()}.html`);
beforeAll(() => {
fs.writeFileSync(fixturePath, '<html><body><h1 id="loaded">loaded by load-html</h1></body></html>');
fs.writeFileSync(fragmentPath, '<div class="fragment" style="width:100px;height:50px">fragment</div>');
});
afterAll(() => {
try { fs.unlinkSync(fixturePath); } catch {}
try { fs.unlinkSync(fragmentPath); } catch {}
});
test('load-html loads HTML file into page', async () => {
const result = await handleWriteCommand('load-html', [fixturePath], bm);
expect(result).toContain('Loaded HTML:');
expect(result).toContain(fixturePath);
const text = await handleReadCommand('text', [], bm);
expect(text).toContain('loaded by load-html');
});
test('load-html accepts bare HTML fragments (no doctype)', async () => {
const result = await handleWriteCommand('load-html', [fragmentPath], bm);
expect(result).toContain('Loaded HTML:');
const html = await handleReadCommand('html', [], bm);
expect(html).toContain('fragment');
});
test('load-html rejects missing file arg', async () => {
try {
await handleWriteCommand('load-html', [], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/Usage: browse load-html/);
}
});
test('load-html rejects non-.html extension', async () => {
const txtPath = path.join(tmpDir, `load-html-test-${Date.now()}.txt`);
fs.writeFileSync(txtPath, '<html></html>');
try {
await handleWriteCommand('load-html', [txtPath], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/does not appear to be HTML/);
} finally {
try { fs.unlinkSync(txtPath); } catch {}
}
});
test('load-html rejects file outside safe dirs', async () => {
try {
await handleWriteCommand('load-html', ['/etc/passwd.html'], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/must be under|not found|security policy/);
}
});
test('load-html rejects missing file with actionable error', async () => {
try {
await handleWriteCommand('load-html', [path.join(tmpDir, 'does-not-exist.html')], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/not found|security policy/);
}
});
test('load-html rejects directory target', async () => {
try {
await handleWriteCommand('load-html', [path.join(tmpDir, 'browse-test-notafile.html') + '/'], bm);
expect(true).toBe(false);
} catch (err: any) {
// Either "not found" or "is a directory" — both valid rejections
expect(err.message).toMatch(/not found|directory|not a regular file|security policy/);
}
});
test('load-html rejects binary content disguised as .html', async () => {
const binPath = path.join(tmpDir, `load-html-binary-${Date.now()}.html`);
// PNG magic bytes: 0x89 0x50 0x4E 0x47
fs.writeFileSync(binPath, Buffer.from([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A]));
try {
await handleWriteCommand('load-html', [binPath], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/does not look like HTML/);
} finally {
try { fs.unlinkSync(binPath); } catch {}
}
});
test('load-html strips UTF-8 BOM before magic-byte check', async () => {
const bomPath = path.join(tmpDir, `load-html-bom-${Date.now()}.html`);
const bomBytes = Buffer.from([0xEF, 0xBB, 0xBF]);
fs.writeFileSync(bomPath, Buffer.concat([bomBytes, Buffer.from('<html><body>bom ok</body></html>')]));
try {
const result = await handleWriteCommand('load-html', [bomPath], bm);
expect(result).toContain('Loaded HTML:');
} finally {
try { fs.unlinkSync(bomPath); } catch {}
}
});
test('load-html --wait-until networkidle exercises non-default branch', async () => {
const result = await handleWriteCommand('load-html', [fixturePath, '--wait-until', 'networkidle'], bm);
expect(result).toContain('Loaded HTML:');
});
test('load-html rejects invalid --wait-until value', async () => {
try {
await handleWriteCommand('load-html', [fixturePath, '--wait-until', 'bogus'], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/Invalid --wait-until/);
}
});
test('load-html rejects unknown flag', async () => {
try {
await handleWriteCommand('load-html', [fixturePath, '--bogus'], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/Unknown flag/);
}
});
});
// ─── screenshot --selector ─────────────────────────────────────
describe('screenshot --selector', () => {
test('--selector flag with output path captures element', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
const p = `/tmp/browse-test-selector-${Date.now()}.png`;
const result = await handleMetaCommand('screenshot', ['--selector', '#title', p], bm, async () => {});
expect(result).toContain('Screenshot saved (element)');
expect(fs.existsSync(p)).toBe(true);
fs.unlinkSync(p);
});
test('--selector conflicts with positional selector', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--selector', '#title', '.other'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/conflicts with positional selector/);
}
});
test('--selector conflicts with --clip', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--selector', '#title', '--clip', '0,0,100,100'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/Cannot use --clip with a selector/);
}
});
test('--selector with --base64 returns element base64', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
const result = await handleMetaCommand('screenshot', ['--selector', '#title', '--base64'], bm, async () => {});
expect(result).toMatch(/^data:image\/png;base64,/);
});
test('--selector missing value throws', async () => {
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
try {
await handleMetaCommand('screenshot', ['--selector'], bm, async () => {});
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/Usage: screenshot --selector/);
}
});
});
// ─── viewport --scale ───────────────────────────────────────────
describe('viewport --scale', () => {
test('viewport WxH --scale 2 produces 2x dimension screenshot', async () => {
const tmpFix = path.join('/tmp', `scale-${Date.now()}.html`);
fs.writeFileSync(tmpFix, '<div id="box" style="width:100px;height:50px;background:#f00"></div>');
try {
await handleWriteCommand('viewport', ['200x200', '--scale', '2'], bm);
await handleWriteCommand('load-html', [tmpFix], bm);
const p = `/tmp/scale-${Date.now()}.png`;
await handleMetaCommand('screenshot', ['--selector', '#box', p], bm, async () => {});
// Parse PNG IHDR (bytes 16-23 are width/height big-endian u32)
const buf = fs.readFileSync(p);
const w = buf.readUInt32BE(16);
const h = buf.readUInt32BE(20);
// Box is 100x50 at 2x = 200x100
expect(w).toBe(200);
expect(h).toBe(100);
fs.unlinkSync(p);
// Reset scale for other tests
await handleWriteCommand('viewport', ['1280x720', '--scale', '1'], bm);
} finally {
try { fs.unlinkSync(tmpFix); } catch {}
}
});
test('viewport --scale without WxH keeps current size', async () => {
await handleWriteCommand('viewport', ['800x600'], bm);
const result = await handleWriteCommand('viewport', ['--scale', '2'], bm);
expect(result).toContain('800x600');
expect(result).toContain('2x');
expect(bm.getDeviceScaleFactor()).toBe(2);
await handleWriteCommand('viewport', ['1280x720', '--scale', '1'], bm);
});
test('--scale non-finite (NaN) throws', async () => {
try {
await handleWriteCommand('viewport', ['100x100', '--scale', 'abc'], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/not a finite number/);
}
});
test('--scale out of range throws', async () => {
try {
await handleWriteCommand('viewport', ['100x100', '--scale', '4'], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/between 1 and 3/);
}
try {
await handleWriteCommand('viewport', ['100x100', '--scale', '0.5'], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/between 1 and 3/);
}
});
test('--scale missing value throws', async () => {
try {
await handleWriteCommand('viewport', ['--scale'], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/missing value/);
}
});
test('viewport with neither arg nor flag throws usage', async () => {
try {
await handleWriteCommand('viewport', [], bm);
expect(true).toBe(false);
} catch (err: any) {
expect(err.message).toMatch(/Usage: browse viewport/);
}
});
});
// ─── setContent replay across context recreation ────────────────
describe('setContent replay (load-html survives viewport --scale)', () => {
const tmpDir = '/tmp';
test('load-html → viewport --scale 2 → content survives', async () => {
const fix = path.join(tmpDir, `replay-${Date.now()}.html`);
fs.writeFileSync(fix, '<h1 id="marker">replay-test-marker</h1>');
try {
await handleWriteCommand('load-html', [fix], bm);
await handleWriteCommand('viewport', ['400x300', '--scale', '2'], bm);
const text = await handleReadCommand('text', [], bm);
expect(text).toContain('replay-test-marker');
await handleWriteCommand('viewport', ['1280x720', '--scale', '1'], bm);
} finally {
try { fs.unlinkSync(fix); } catch {}
}
});
test('double scale cycle: 2x → 1.5x, content still survives', async () => {
const fix = path.join(tmpDir, `replay2-${Date.now()}.html`);
fs.writeFileSync(fix, '<h2 id="m">double-cycle-marker</h2>');
try {
await handleWriteCommand('load-html', [fix], bm);
await handleWriteCommand('viewport', ['400x300', '--scale', '2'], bm);
await handleWriteCommand('viewport', ['400x300', '--scale', '1.5'], bm);
const text = await handleReadCommand('text', [], bm);
expect(text).toContain('double-cycle-marker');
await handleWriteCommand('viewport', ['1280x720', '--scale', '1'], bm);
} finally {
try { fs.unlinkSync(fix); } catch {}
}
});
test('goto clears loadedHtml — subsequent viewport --scale does NOT resurrect old HTML', async () => {
const fix = path.join(tmpDir, `clear-${Date.now()}.html`);
fs.writeFileSync(fix, '<div id="stale">stale-content</div>');
try {
await handleWriteCommand('load-html', [fix], bm);
await handleWriteCommand('goto', [baseUrl + '/basic.html'], bm);
await handleWriteCommand('viewport', ['400x300', '--scale', '2'], bm);
const text = await handleReadCommand('text', [], bm);
// Should see basic.html content, NOT the stale load-html content
expect(text).not.toContain('stale-content');
await handleWriteCommand('viewport', ['1280x720', '--scale', '1'], bm);
} finally {
try { fs.unlinkSync(fix); } catch {}
}
});
});
// ─── Alias routing ─────────────────────────────────────────────
describe('Command aliases', () => {
const tmpDir = '/tmp';
const aliasFix = path.join(tmpDir, `alias-${Date.now()}.html`);
beforeAll(() => {
fs.writeFileSync(aliasFix, '<p id="alias">alias routing ok</p>');
});
afterAll(() => {
try { fs.unlinkSync(aliasFix); } catch {}
});
test('setcontent alias routes to load-html via chain', async () => {
// Chain canonicalizes aliases end-to-end; verifies the dispatch path
const result = await handleMetaCommand('chain', [JSON.stringify([['setcontent', aliasFix]])], bm, async () => {});
expect(result).toContain('Loaded HTML:');
const text = await handleReadCommand('text', [], bm);
expect(text).toContain('alias routing ok');
});
test('set-content (hyphenated) alias also routes', async () => {
const result = await handleMetaCommand('chain', [JSON.stringify([['set-content', aliasFix]])], bm, async () => {});
expect(result).toContain('Loaded HTML:');
});
});


@@ -0,0 +1,101 @@
import { describe, it, expect } from 'bun:test';
import {
canonicalizeCommand,
COMMAND_ALIASES,
NEW_IN_VERSION,
buildUnknownCommandError,
ALL_COMMANDS,
} from '../src/commands';
describe('canonicalizeCommand', () => {
it('resolves setcontent → load-html', () => {
expect(canonicalizeCommand('setcontent')).toBe('load-html');
});
it('resolves set-content → load-html', () => {
expect(canonicalizeCommand('set-content')).toBe('load-html');
});
it('resolves setContent → load-html (case-sensitive key)', () => {
expect(canonicalizeCommand('setContent')).toBe('load-html');
});
it('passes canonical names through unchanged', () => {
expect(canonicalizeCommand('load-html')).toBe('load-html');
expect(canonicalizeCommand('goto')).toBe('goto');
});
it('passes unknown names through unchanged (alias map is allowlist, not filter)', () => {
expect(canonicalizeCommand('totally-made-up')).toBe('totally-made-up');
});
});
describe('buildUnknownCommandError', () => {
it('names the input in every error', () => {
const msg = buildUnknownCommandError('xyz', ALL_COMMANDS);
expect(msg).toContain(`Unknown command: 'xyz'`);
});
it('suggests closest match within Levenshtein 2 when input length >= 4', () => {
const msg = buildUnknownCommandError('load-htm', ALL_COMMANDS);
expect(msg).toContain(`Did you mean 'load-html'?`);
});
it('does NOT suggest for short inputs (< 4 chars, avoids noise on js/is typos)', () => {
// 'j' is distance 1 from 'js' but only 1 char — suggestion would be noisy
const msg = buildUnknownCommandError('j', ALL_COMMANDS);
expect(msg).not.toContain('Did you mean');
});
it('uses alphabetical tiebreak for deterministic suggestions', () => {
// Synthetic command set where two commands tie on distance from the input:
// 'abcd' and 'abce' are both distance 1 from 'abcf'.
const ties = new Set(['abcd', 'abce']);
const msg = buildUnknownCommandError('abcf', ties, {}, {});
// Alphabetical first: 'abcd' comes before 'abce'
expect(msg).toContain(`Did you mean 'abcd'?`);
});
it('appends upgrade hint when command appears in NEW_IN_VERSION', () => {
// Synthetic: pretend load-html isn't in the command set (agent on older build)
const noLoadHtml = new Set([...ALL_COMMANDS].filter(c => c !== 'load-html'));
const msg = buildUnknownCommandError('load-html', noLoadHtml, COMMAND_ALIASES, NEW_IN_VERSION);
expect(msg).toContain('added in browse v');
expect(msg).toContain('Upgrade:');
});
it('omits upgrade hint for unknown commands not in NEW_IN_VERSION', () => {
const msg = buildUnknownCommandError('notarealcommand', ALL_COMMANDS);
expect(msg).not.toContain('added in browse v');
});
it('NEW_IN_VERSION has load-html entry', () => {
expect(NEW_IN_VERSION['load-html']).toBeTruthy();
});
it('COMMAND_ALIASES + command set are consistent — all alias targets exist', () => {
for (const target of Object.values(COMMAND_ALIASES)) {
expect(ALL_COMMANDS.has(target)).toBe(true);
}
});
});
describe('Alias + SCOPE_WRITE integration invariant', () => {
it('load-html is in SCOPE_WRITE (alias canonicalization happens before scope check)', async () => {
const { SCOPE_WRITE } = await import('../src/token-registry');
expect(SCOPE_WRITE.has('load-html')).toBe(true);
});
it('setcontent is NOT directly in any scope set (must canonicalize first)', async () => {
const { SCOPE_WRITE, SCOPE_READ, SCOPE_ADMIN, SCOPE_CONTROL } = await import('../src/token-registry');
// The alias itself must NOT appear in any scope set — only the canonical form.
// This proves scope enforcement relies on canonicalization at dispatch time,
// not on the alias leaking through as an acceptable command.
expect(SCOPE_WRITE.has('setcontent')).toBe(false);
expect(SCOPE_READ.has('setcontent')).toBe(false);
expect(SCOPE_ADMIN.has('setcontent')).toBe(false);
expect(SCOPE_CONTROL.has('setcontent')).toBe(false);
});
});
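
The invariant above (aliases never appear in a scope set; only the canonical name does) implies that dispatch has to canonicalize before checking scope. A rough sketch of that ordering follows. `ALIASES`, `WRITE_SCOPE`, `canonicalize`, and `isAllowedForWriteToken` are stand-ins invented for this example; the project's `COMMAND_ALIASES`, `canonicalizeCommand()`, and `SCOPE_WRITE` may be shaped differently.

```typescript
// Illustrative stand-ins; the real tables live in the project's source.
const ALIASES: Record<string, string> = {
  setcontent: 'load-html',
  'set-content': 'load-html',
  setContent: 'load-html',
};
const WRITE_SCOPE = new Set<string>(['load-html']);

/** Maps an alias to its canonical command; unknown names pass through unchanged. */
function canonicalize(name: string): string {
  // hasOwnProperty guards against inherited keys such as "toString".
  return Object.prototype.hasOwnProperty.call(ALIASES, name) ? ALIASES[name] : name;
}

/** Scope check always runs on the canonical name, never on the raw input. */
function isAllowedForWriteToken(rawName: string): boolean {
  return WRITE_SCOPE.has(canonicalize(rawName));
}
```

Keeping the scope sets free of aliases means a new alias needs only one table entry, and a missed canonicalization fails closed (the alias is rejected) rather than open.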

@@ -392,12 +392,13 @@ describe('frame --url ReDoS fix', () => {
describe('chain command watch-mode guard', () => {
it('chain loop contains isWatching() guard before write dispatch', () => {
- const block = sliceBetween(META_SRC, 'for (const cmd of commands)', 'Wait for network to settle');
+ // Post-alias refactor: loop iterates over canonicalized `c of commands`.
+ const block = sliceBetween(META_SRC, 'for (const c of commands)', 'Wait for network to settle');
expect(block).toContain('isWatching');
});
it('chain loop BLOCKED message appears for write commands in watch mode', () => {
- const block = sliceBetween(META_SRC, 'for (const cmd of commands)', 'Wait for network to settle');
+ const block = sliceBetween(META_SRC, 'for (const c of commands)', 'Wait for network to settle');
expect(block).toContain('BLOCKED: write commands disabled in watch mode');
});
});

@@ -1,29 +1,50 @@
import { describe, it, expect } from 'bun:test';
- import { validateNavigationUrl } from '../src/url-validation';
+ import { validateNavigationUrl, normalizeFileUrl } from '../src/url-validation';
import * as fs from 'fs';
import * as path from 'path';
import { TEMP_DIR } from '../src/platform';
describe('validateNavigationUrl', () => {
it('allows http URLs', async () => {
- await expect(validateNavigationUrl('http://example.com')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('http://example.com')).resolves.toBe('http://example.com');
});
it('allows https URLs', async () => {
- await expect(validateNavigationUrl('https://example.com/path?q=1')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('https://example.com/path?q=1')).resolves.toBe('https://example.com/path?q=1');
});
it('allows localhost', async () => {
- await expect(validateNavigationUrl('http://localhost:3000')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('http://localhost:3000')).resolves.toBe('http://localhost:3000');
});
it('allows 127.0.0.1', async () => {
- await expect(validateNavigationUrl('http://127.0.0.1:8080')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('http://127.0.0.1:8080')).resolves.toBe('http://127.0.0.1:8080');
});
it('allows private IPs', async () => {
- await expect(validateNavigationUrl('http://192.168.1.1')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('http://192.168.1.1')).resolves.toBe('http://192.168.1.1');
});
- it('blocks file:// scheme', async () => {
- await expect(validateNavigationUrl('file:///etc/passwd')).rejects.toThrow(/scheme.*not allowed/i);
+ it('rejects file:// paths outside safe dirs (cwd + TEMP_DIR)', async () => {
+ // file:// is accepted as a scheme now, but safe-dirs policy blocks /etc/passwd.
+ await expect(validateNavigationUrl('file:///etc/passwd')).rejects.toThrow(/Path must be within/i);
});
it('accepts file:// for files under TEMP_DIR', async () => {
const tmpHtml = path.join(TEMP_DIR, `browse-test-${Date.now()}.html`);
fs.writeFileSync(tmpHtml, '<html><body>ok</body></html>');
try {
const result = await validateNavigationUrl(`file://${tmpHtml}`);
// Result should be a canonical file:// URL (pathToFileURL form)
expect(result.startsWith('file://')).toBe(true);
expect(result.toLowerCase()).toContain('browse-test-');
} finally {
fs.unlinkSync(tmpHtml);
}
});
it('rejects unsupported file URL host (UNC/network paths)', async () => {
await expect(validateNavigationUrl('file://host.example.com/foo.html')).rejects.toThrow(/Unsupported file URL host/i);
});
it('blocks javascript: scheme', async () => {
@@ -79,11 +100,11 @@ describe('validateNavigationUrl', () => {
});
it('does not block hostnames starting with fd (e.g. fd.example.com)', async () => {
- await expect(validateNavigationUrl('https://fd.example.com/')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('https://fd.example.com/')).resolves.toBe('https://fd.example.com/');
});
it('does not block hostnames starting with fc (e.g. fcustomer.com)', async () => {
- await expect(validateNavigationUrl('https://fcustomer.com/')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('https://fcustomer.com/')).resolves.toBe('https://fcustomer.com/');
});
it('throws on malformed URLs', async () => {
@@ -92,8 +113,8 @@ describe('validateNavigationUrl', () => {
});
describe('validateNavigationUrl — restoreState coverage', () => {
- it('blocks file:// URLs that could appear in saved state', async () => {
- await expect(validateNavigationUrl('file:///etc/passwd')).rejects.toThrow(/scheme.*not allowed/i);
+ it('blocks file:// URLs outside safe dirs that could appear in saved state', async () => {
+ await expect(validateNavigationUrl('file:///etc/passwd')).rejects.toThrow(/Path must be within/i);
});
it('blocks chrome:// URLs that could appear in saved state', async () => {
@@ -105,10 +126,98 @@ describe('validateNavigationUrl — restoreState coverage', () => {
});
it('allows normal https URLs from saved state', async () => {
- await expect(validateNavigationUrl('https://example.com/page')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('https://example.com/page')).resolves.toBe('https://example.com/page');
});
it('allows localhost URLs from saved state', async () => {
- await expect(validateNavigationUrl('http://localhost:3000/app')).resolves.toBeUndefined();
+ await expect(validateNavigationUrl('http://localhost:3000/app')).resolves.toBe('http://localhost:3000/app');
});
});
describe('normalizeFileUrl', () => {
const cwd = process.cwd();
it('passes through absolute file:/// URLs unchanged', () => {
expect(normalizeFileUrl('file:///tmp/page.html')).toBe('file:///tmp/page.html');
});
it('expands file://./<rel> to absolute file://<cwd>/<rel>', () => {
const result = normalizeFileUrl('file://./docs/page.html');
expect(result.startsWith('file://')).toBe(true);
expect(result).toContain(cwd.replace(/\\/g, '/'));
expect(result.endsWith('/docs/page.html')).toBe(true);
});
it('expands file://~/<rel> to absolute file://<homedir>/<rel>', () => {
const result = normalizeFileUrl('file://~/Documents/page.html');
expect(result.startsWith('file://')).toBe(true);
expect(result.endsWith('/Documents/page.html')).toBe(true);
});
it('expands file://<simple-segment>/<rest> to cwd-relative', () => {
const result = normalizeFileUrl('file://docs/page.html');
expect(result.startsWith('file://')).toBe(true);
expect(result).toContain(cwd.replace(/\\/g, '/'));
expect(result.endsWith('/docs/page.html')).toBe(true);
});
it('passes through file://localhost/<abs> unchanged', () => {
expect(normalizeFileUrl('file://localhost/tmp/page.html')).toBe('file://localhost/tmp/page.html');
});
it('rejects empty file:// URL', () => {
expect(() => normalizeFileUrl('file://')).toThrow(/is empty/i);
});
it('rejects file:/// with no path', () => {
expect(() => normalizeFileUrl('file:///')).toThrow(/no path/i);
});
it('rejects file://./ (directory listing)', () => {
expect(() => normalizeFileUrl('file://./')).toThrow(/current directory/i);
});
it('rejects dotted host-like segment file://docs.v1/page.html', () => {
expect(() => normalizeFileUrl('file://docs.v1/page.html')).toThrow(/Unsupported file URL host/i);
});
it('rejects IP-like host file://127.0.0.1/foo', () => {
expect(() => normalizeFileUrl('file://127.0.0.1/tmp/x')).toThrow(/Unsupported file URL host/i);
});
it('rejects IPv6 host file://[::1]/foo', () => {
expect(() => normalizeFileUrl('file://[::1]/tmp/x')).toThrow(/Unsupported file URL host/i);
});
it('rejects Windows drive letter file://C:/Users/x', () => {
expect(() => normalizeFileUrl('file://C:/Users/x')).toThrow(/Unsupported file URL host/i);
});
it('passes through non-file URLs', () => {
expect(normalizeFileUrl('https://example.com')).toBe('https://example.com');
});
});
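
For orientation, here is one way the rules exercised by the `normalizeFileUrl` tests could be written. It is a sketch under assumptions, not the project's implementation: `normalizeFileUrlSketch` is a hypothetical name, the injectable `cwd`/`home` parameters exist only to make the example deterministic, the error wording is chosen merely to match the regexes in the tests, and the expected values in the usage note assume a POSIX host.

```typescript
import * as os from 'node:os';
import * as path from 'node:path';
import { pathToFileURL } from 'node:url';

// Hypothetical sketch of the normalization rules; not the project's source.
function normalizeFileUrlSketch(
  url: string,
  cwd: string = process.cwd(),
  home: string = os.homedir(),
): string {
  const prefix = 'file://';
  if (!url.startsWith(prefix)) return url; // non-file URLs pass through

  const rest = url.slice(prefix.length);
  if (rest === '') throw new Error('file:// URL is empty');
  if (rest === '/') throw new Error('file:/// URL has no path');
  if (rest.startsWith('/')) return url; // file:///abs/path is already absolute

  if (rest === '.' || rest === './') {
    throw new Error('file://./ would list the current directory');
  }
  if (rest.startsWith('./')) return pathToFileURL(path.resolve(cwd, rest.slice(2))).href;
  if (rest.startsWith('~/')) return pathToFileURL(path.resolve(home, rest.slice(2))).href;

  const slash = rest.indexOf('/');
  const first = slash === -1 ? rest : rest.slice(0, slash);
  if (first === 'localhost') return url; // file://localhost/abs is standard
  // Dots, colons, and brackets look like hosts (docs.v1, 127.0.0.1, [::1], C:).
  if (!/^[A-Za-z0-9_-]+$/.test(first)) {
    throw new Error(`Unsupported file URL host: ${first}`);
  }
  return pathToFileURL(path.resolve(cwd, rest)).href; // file://docs/x → cwd-relative
}
```

With `cwd` set to `/work/site`, both `file://./docs/page.html` and `file://docs/page.html` come out as `file:///work/site/docs/page.html`, while `file://docs.v1/page.html` throws because the first segment looks like a host name.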
describe('validateNavigationUrl — file:// URL-encoding', () => {
it('decodes %20 via fileURLToPath (space in filename)', async () => {
const tmpHtml = path.join(TEMP_DIR, `hello world ${Date.now()}.html`);
fs.writeFileSync(tmpHtml, '<html>ok</html>');
try {
// Build an escaped file:// URL and verify it validates against the actual path
const encodedPath = tmpHtml.split('/').map(encodeURIComponent).join('/');
const url = `file://${encodedPath}`;
const result = await validateNavigationUrl(url);
expect(result.startsWith('file://')).toBe(true);
} finally {
fs.unlinkSync(tmpHtml);
}
});
it('rejects path traversal via encoded slash (file:///tmp/safe%2F..%2Fetc/passwd)', async () => {
// Node's fileURLToPath rejects encoded slashes outright with a clear error.
// Either "encoded /" rejection OR "Path must be within" safe-dirs rejection is acceptable.
await expect(
validateNavigationUrl('file:///tmp/safe%2F..%2Fetc/passwd')
).rejects.toThrow(/encoded \/|Path must be within/i);
});
});
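
The two URL-encoding tests lean on how Node's `fileURLToPath` treats percent-escapes. The snippet below shows that behaviour in isolation; it assumes a POSIX host (on Windows a `file:///tmp/...` URL without a drive letter is rejected for a different reason), and the file names are made up for the example.

```typescript
import { fileURLToPath } from 'node:url';

// %20 is decoded into a literal space in the resulting filesystem path.
const decoded = fileURLToPath('file:///tmp/hello%20world.html');

// An encoded slash (%2F) is refused rather than decoded, so it cannot be
// used to smuggle a "../" segment past a later safe-directory check.
let rejected = false;
try {
  fileURLToPath('file:///tmp/safe%2F..%2Fetc/passwd');
} catch {
  rejected = true;
}

console.log(decoded, rejected);
```

This is why the traversal test accepts either error message: the encoded-slash rejection can fire before the safe-dirs policy is ever consulted.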