v1.38.0.0 fix wave: Windows install hardening + Unicode sanitization at server egress (4 community PRs) (#1505)

* fix(browse): single-point Unicode sanitization at server egress

Add sanitizeLoneSurrogates (regex-based UTF-16 lone-half cleaner) and
sanitizeReplacer (JSON.stringify replacer that runs the cleaner on every
string field during encoding).

Split handleCommandInternal into handleCommandInternalImpl (raw) plus a
thin sanitizing wrapper. The wrapper applies sanitizeLoneSurrogates to
cr.result so both single-command (handleCommand line 1034) and batch-loop
(line 1966) egress paths inherit it. Inline INVARIANT comment near the
wrapper documents the architectural constraint.

Both SSE producers (activity feed at /activity/stream and inspector
stream) stringify with sanitizeReplacer. Post-stringify regex is
ineffective on those paths because JSON.stringify has already converted
the lone surrogate into the escape sequence "\\ud800" before any regex
could match it; the replacer runs during stringify on the raw string
value, so the substitution lands.

Originated from @realcarsonterry PR #1463 (handleCommand-only wrap).
Architectural lift to handleCommandInternal + SSE coverage authored on
this branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(setup): _link_or_copy helper for Windows file-copy fallback

On Windows without Developer Mode (MSYS2/Git Bash), plain ln -snf
silently creates a frozen file copy that doesn't refresh on git pull.
Skill files become stale after every upgrade.

Add a _link_or_copy SRC DST helper near IS_WINDOWS detection (line ~33).
It auto-dispatches: on Unix it preserves ln -snf semantics, on Windows
it copies (cp -R for directories, cp -f for files). When the source is
a Unix-style name-only alias that doesn't resolve on disk (the
connect-chrome → gstack/open-gstack-browser pattern), the helper
returns 0 silently on Windows rather than aborting setup under set -e.
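
The dispatch described above, rendered in TypeScript purely for illustration.
The real helper is a shell function in `setup`; `linkOrCopy`, its `isWindows`
parameter, and the temp-dir file names below are all hypothetical:

```typescript
import * as fs from 'fs';
import * as os from 'os';
import * as path from 'path';

// Hypothetical TypeScript rendering of the helper's three branches.
function linkOrCopy(src: string, dst: string, isWindows: boolean): void {
  if (!isWindows) {
    // Unix: replace whatever sits at dst with a symlink (roughly ln -snf).
    fs.rmSync(dst, { recursive: true, force: true });
    fs.symlinkSync(src, dst);
    return;
  }
  // Windows: a name-only alias that does not resolve on disk is skipped
  // silently instead of failing the whole install.
  if (!fs.existsSync(src)) return;
  fs.rmSync(dst, { recursive: true, force: true });
  // cpSync copies a single file, or a directory tree when recursive is set.
  fs.cpSync(src, dst, { recursive: true });
}

// Usage against a scratch directory.
const tmp = fs.mkdtempSync(path.join(os.tmpdir(), 'link-or-copy-'));
const src = path.join(tmp, 'SKILL.md');
fs.writeFileSync(src, 'v1');

const copied = path.join(tmp, 'copied.md');
linkOrCopy(src, copied, true); // Windows branch: a real file copy

const alias = path.join(tmp, 'alias');
linkOrCopy(path.join(tmp, 'missing'), alias, true); // unresolved source: no-op
```

Because the Windows branch produces a snapshot rather than a link, later
changes to `src` do not reach `copied.md` until the helper runs again.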

Rewrite all 42 prior ln -snf call sites to route through the helper:
link_claude_skill_dirs (line 437), team-claude install paths (lines 556,
581, 592), Codex host adapter block (lines 618-640), Factory host
adapter block (lines 658-678), OpenCode host adapter block (lines
696-731), Kiro host adapter block (lines 939-953), plus migration and
alias sites.

Add _print_windows_copy_note_once helper and call it from
link_claude_skill_dirs after any linking work completes so Windows
users see one user-visible note explaining they must re-run ./setup
after every git pull.

Extend cleanup_old_claude_symlinks and cleanup_prefixed_claude_symlinks
with a Windows branch: when the target is a real directory containing a
real-file SKILL.md (no symlink to readlink), and IS_WINDOWS=1, treat
the name-matched directory as gstack-managed and remove it. This makes
--prefix / --no-prefix flips work on Windows instead of leaving stale
copies behind.

Originated from @realcarsonterry PR #1462 (1 of 42 sites). Helper
extraction, 42-site rewrite, alias-resolution edge case, and Windows
cleanup compat authored on this branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(docs): rename stale gbrain_sync_mode to artifacts_sync_mode + register /document-generate

Five stale gstack-config references in docs/ pointed to the deprecated
gbrain_sync_mode key (renamed to artifacts_sync_mode in v1.27.0.0):
- docs/gbrain-sync.md: lines 62, 110, 111, 173
- docs/gbrain-sync-errors.md: lines 26, 203

Users following the docs would set a key that gstack-brain-sync no
longer reads, silently breaking artifacts sync.

Originated from @realcarsonterry PR #1461 (verbatim).

Also register /document-generate in AGENTS.md (Operational + memory
table) and docs/skills.md (skill index). The skill shipped in v1.35.0.0
but the doc-inventory cross-check in test/skill-validation.test.ts was
failing because neither file mentioned it.

Allowlist the new test/docs-config-keys.test.ts file in
test/no-stale-gstack-brain-refs.test.ts — it intentionally lists the
deprecated keys in its DEPRECATED_KEYS denylist (defending the rename).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ci(windows): migrate windows-free-tests to paid faster runner + register wave tests

Move the Windows free-test job from GitHub-hosted windows-latest to
Blacksmith's paid Windows runner (blacksmith-2vcpu-windows-2022).
Spin-up drops from ~60s to ~10s and Bun installs land 3-4x faster. The
label can swap to namespace-profile-windows or ubicloud-windows-* if
this repo's Blacksmith installation isn't configured.

Register the four new wave tests in the workflow's curated test list:
  - browse/test/server-sanitize-surrogates.test.ts
  - test/setup-windows-fallback.test.ts
  - test/build-script-shell-compat.test.ts
  - test/docs-config-keys.test.ts

These tests cover the Windows-hardening surface that this wave ships
(sanitizer wiring, _link_or_copy helper, build-script subshells, doc-
config drift), so they need to run on Windows where the bug shapes
actually manifest.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test: wave coverage for sanitizer, link_or_copy, build script, doc drift

Four new test files (29 cases total):

browse/test/server-sanitize-surrogates.test.ts:
  - 11 unit cases for sanitizeLoneSurrogates (passthrough, valid pair,
    lone high/low mid-string, trailing/leading lone, adjacent doubles,
    pair-then-lone, lone-then-pair, empty)
  - 2 bug-repro tests pinning the regression intent (UTF-8 round-trip,
    JSON.parse round-trip with codepoint assertion)
  - 4 wiring invariants asserting the architectural choke points stay
    intact (handleCommandInternalImpl rename, central sanitization
    line, sanitizeReplacer function exists, SSE producers stringify
    with replacer)
  Function extracted from server.ts via regex + eval'd in test scope
  so no production-code export is needed.

test/setup-windows-fallback.test.ts:
  - Static invariant (D7): zero raw `ln` calls outside the
    _link_or_copy helper body and comments
  - Helper-existence assertions
  - 4-cell behavior matrix (file/dir × Windows/Unix) via awk-style
    helper extraction + bash -c sourcing
  - Windows-note printer registration check
  Mirrors test/setup-conductor-worktree.test.ts patterns.

test/build-script-shell-compat.test.ts:
  - Regex assertion that package.json scripts.* contain no bash brace
    groups (Bun-Windows-hostile)
  - Subshell-precedence check for `.version` redirects
  Strips single-quoted strings before regexing so embedded JS code
  inside echo '...' doesn't false-positive.
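
  A rough sketch of the strip-then-match idea. The real test's regexes are
  not reproduced in this message, so `hasBraceGroup` and both patterns below
  are assumptions:

```typescript
// Hypothetical helper: true when a script string contains a bash brace group.
function hasBraceGroup(script: string): boolean {
  // Drop single-quoted strings first so embedded JS such as echo '{ a; }'
  // cannot trigger a false positive.
  const stripped = script.replace(/'[^']*'/g, "''");
  // A brace group is `{`, whitespace, a command list ending in `;`, then `}`.
  // Requiring a separator before `{` keeps ${VAR} expansions from matching.
  return /(?:^|[\s;&|(])\{\s[^{}]*;\s*\}/.test(stripped);
}

const braceForm = hasBraceGroup('{ git rev-parse HEAD; } > .version');
const subshellForm = hasBraceGroup('( git rev-parse HEAD ) > .version');
const quotedJs = hasBraceGroup("echo '{ a; }' > out.js");
```

  Under these assumptions only the first call reports a brace group; the
  subshell form and the quoted JS both pass.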

test/docs-config-keys.test.ts:
  - DEPRECATED_KEYS denylist scanned across docs/**/*.md
  - Round-trip test for `gstack-config get artifacts_sync_mode`
  Defends the v1.27.0.0 rename from doc drift.
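
  The denylist scan could look roughly like this. `findDeprecatedKeys` and
  its return shape are hypothetical, and the list holds only the one
  deprecated key named in this commit:

```typescript
// The only deprecated key this commit message names.
const DEPRECATED_KEYS = ['gbrain_sync_mode'];

interface Hit {
  line: number; // 1-based line number within the scanned text
  key: string;
}

// Hypothetical scanner: report every line that mentions a deprecated key.
function findDeprecatedKeys(markdown: string, keys: string[] = DEPRECATED_KEYS): Hit[] {
  const hits: Hit[] = [];
  markdown.split('\n').forEach((text, index) => {
    for (const key of keys) {
      // Escape regex metacharacters, then match the key as a whole word.
      const escapedKey = key.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
      if (new RegExp(`\\b${escapedKey}\\b`).test(text)) {
        hits.push({ line: index + 1, key });
      }
    }
  });
  return hits;
}

const sampleDoc = [
  'Enable sync:',
  'gstack-config set gbrain_sync_mode full',
  'gstack-config get artifacts_sync_mode',
].join('\n');
const hits = findDeprecatedKeys(sampleDoc);
```

  A test built on this would assert that the hit list is empty for every
  file under docs/.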

Updates to two existing tests:
  - test/setup-conductor-worktree.test.ts: expect `_link_or_copy`
    instead of `ln -snf` at the Conductor-worktree guard call site
  - test/gen-skill-docs.test.ts: same swap at three assertion sites
    (Codex section, Claude link_claude_skill_dirs body, Codex
    link_codex_skill_dirs body)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore: bump v1.38.0.0 + build-script subshells + CHANGELOG

VERSION 1.35.0.0 → 1.38.0.0 (MINOR). PR #1500 (lyon-v2) claimed
v1.37.0.0 ahead of this branch; v1.38.0.0 is the next free MINOR slot
per bin/gstack-next-version queue check. Workspace-aware ship rule
applies — queue-advancing past a claimed version within the same
bump level is explicitly permitted.

package.json build script: three `{ git rev-parse HEAD ...; }` brace
groups → `( git rev-parse HEAD ... )` subshells. Bun's Windows shell
parser doesn't grok bash brace groups; subshells are POSIX-universal.
Originated from @realcarsonterry PR #1460.

CHANGELOG entry covers the full wave:
- Windows install hardening (42-site _link_or_copy + cleanup compat)
- Unicode sanitization architecture (handleCommandInternal + SSE
  replacer)
- Build script POSIX-shell compat (subshells)
- Doc rename (gbrain_sync_mode → artifacts_sync_mode)
- Windows CI on paid faster runner
- 4 new wave tests (29 cases)
Frames each item as a current system property, not a fix narrative.

Credits @realcarsonterry for PRs #1460, #1461, #1462, #1463 (the seed
of the wave). Scope expansion to all 42 setup sites, every server
egress path, Windows CI migration, and codex-flagged P0/P1 fixes
(connect-chrome alias on Windows, SSE replacer, prefix-cleanup
Windows compat) authored on this branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: post-ship sync for v1.38.0.0

Document the two architectural invariants that landed in v1.38.0.0 in
their persistent homes (not just CHANGELOG):

- README Windows section: add the `./setup` re-run-after-git-pull
  requirement that `_print_windows_copy_note_once` shows at runtime.
- CONTRIBUTING "Things to know": add the no-raw-`ln` invariant for
  contributors editing `setup`, with the test that enforces it.
- ARCHITECTURE: new "Unicode sanitization at server egress" section
  between Shell injection prevention and Prompt injection defense,
  with egress table (HTTP/batch/SSE) and the post-stringify-regex
  rationale.
- CLAUDE.md: cross-references for both invariants, matching the
  v1.6.0.0 dual-listener pattern (each constraint says which files
  to read before editing and which test pins it).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ci(windows): use windows-latest-8-cores instead of unregistered Blacksmith label

actionlint failed PR #1505 because `blacksmith-2vcpu-windows-2022` isn't
in the repo's approved runner-label list (actionlint.yaml only registers
`ubicloud-standard-2`, and Ubicloud doesn't ship a Windows pool).

Switch to GitHub's paid larger Windows runner `windows-latest-8-cores`:
twice the cores of the free `windows-latest` (8 vs 4) at the larger-runner
billing rate, no new third-party CI provider, no actionlint config changes.

CHANGELOG: replace "Blacksmith" / "blacksmith-2vcpu-windows-2022" /
"~6x faster spin-up" claims with the actual choice (8 cores vs 4, paid
larger runner).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ci(windows): switch from windows-latest-8-cores to ubicloud-standard-2-windows

`windows-latest-8-cores` sat queued indefinitely because the GitHub
larger-runner billing isn't enabled at the org level — the
"Queued — Waiting to run this check" status surfaced on PR #1505 with
no progress for the whole CI run.

Switch to Ubicloud Windows runners (`ubicloud-standard-2-windows`) so
Windows CI uses the same provider as the existing Linux evals
(`ubicloud-standard-2`). Billing stays under one account instead of
two.

Register the new label in actionlint.yaml alongside the existing
ubicloud-standard-2 entry so actionlint doesn't reject it as unknown.

CHANGELOG entry updated: runner row reflects the actual provider chosen,
"Itemized changes" mentions the actionlint.yaml registration, and the
narrative paragraph documents why `windows-latest-8-cores` failed first.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ci: migrate all workflows to Ubicloud (Linux + Windows, 8-core)

Switch every `runs-on` in this repo to Ubicloud so CI has a single billing
surface, consistent capacity, and 4x more cores on the workloads that were
previously stuck on free `ubuntu-latest` (2 cores). Windows uses Ubicloud's
Windows pool too — `ubicloud-standard-8-windows` — so the queued-forever
problem with GitHub's `windows-latest-8-cores` paid larger runner (org-level
larger-runner billing not enabled) goes away.

Workflows touched (9):
- evals.yml, evals-periodic.yml, ci-image.yml — bump default + matrix from
  `ubicloud-standard-2` to `ubicloud-standard-8`. The one matrix entry that
  was already on -8 stays.
- windows-free-tests.yml — `ubicloud-standard-2-windows` → `ubicloud-standard-8-windows`.
- make-pdf-gate.yml — matrix `ubuntu-latest` → `ubicloud-standard-8`. macOS
  entry preserved; the poppler-install `if: matrix.os` conditional swaps to
  match the new label.
- actionlint.yml, pr-title-sync.yml, skill-docs.yml, version-gate.yml —
  `ubuntu-latest` → `ubicloud-standard-8`.

.github/actionlint.yaml registers all four Ubicloud labels in one place:
- ubicloud-standard-2
- ubicloud-standard-8
- ubicloud-standard-2-windows  (the v1.38.0.0 windows-free-tests target)
- ubicloud-standard-8-windows  (this PR's windows-free-tests target)

Removed the duplicate `actionlint.yaml` at the repo root that I accidentally
created in the prior commit — actionlint only reads `.github/actionlint.yaml`,
so the root file was dead weight.

CHANGELOG entry updated: a single "all Ubicloud" sentence in the narrative
plus a metrics-row covering the runner pool change, and the itemized line
expanded to enumerate the 9 affected workflows. The previously-orphaned
"Itemized changes" line about just `windows-free-tests.yml` is replaced.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* ci(windows): revert to free `windows-latest`

Ubicloud doesn't ship Windows runners — confirmed via their docs. The
`ubicloud-standard-*-windows` labels I added do not exist and were causing
`windows-free-tests` to sit "Queued — Waiting to run this check" forever
(GitHub Actions can't tell a typoed label from a self-hosted runner that's
about to register; it just waits).

Three prior Windows-runner attempts all failed for different reasons:
- `blacksmith-2vcpu-windows-2022` — Blacksmith app not installed on the org
- `windows-latest-8-cores` — GitHub paid larger-runner billing not enabled
- `ubicloud-standard-2/8-windows` — Ubicloud doesn't offer Windows at all

The free `windows-latest` runner (4 cores, ~60s spin-up, $0) is the one
path that actually runs. The wave-coverage Windows tests are <30s of real
work; total job time stays under 2 minutes.

Cleaned up `.github/actionlint.yaml` to drop the bogus
`ubicloud-standard-*-windows` entries — kept only the two real Linux labels.

CHANGELOG: split the runner-pool row into Linux (migrated to Ubicloud-8)
vs Windows (stays on free windows-latest), with the why on each. Itemized
line for windows-free-tests rewritten to reflect the actual outcome.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(windows): skip Unix-only cases on Windows runner

windows-free-tests on GitHub free windows-latest fails three cases that
depend on Unix tooling the runner doesn't have:

1. `setup-windows-fallback.test.ts` behavior matrix — IS_WINDOWS=0 cells
   assert `ln -snf` produces a real symlink. On Windows-without-Developer-
   Mode (which the free `windows-latest` runner is), `ln -snf` silently
   creates a file copy. That's literally the bug `_link_or_copy` exists
   to work around, so the assertion can never pass there. Skip the whole
   describe block on win32. The static-invariant test (zero raw `ln`
   outside the helper body) above the matrix still runs and pins the
   shape the Windows install relies on.

2. `docs-config-keys.test.ts` round-trip — spawnSync(`bin/gstack-config`)
   on Windows doesn't read the bash shebang and fails to exec. Skip on
   win32; the deprecated-key denylist test in the same file still runs
   and is the actual invariant defending the v1.27.0.0 rename at the doc
   layer.

Use `describe.skipIf(process.platform === 'win32')(...)` and
`test.skipIf(process.platform === 'win32')(...)`. Tests still run on
macOS and Linux unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Garry Tan
2026-05-14 21:19:58 -07:00
committed by GitHub
parent e362b0ae2f
commit 3bf43766d5
28 changed files with 699 additions and 82 deletions

@@ -59,6 +59,43 @@ import * as net from 'net';
import * as path from 'path';
import * as crypto from 'crypto';
// ─── Unicode Sanitization ───────────────────────────────────────
// Remove unpaired UTF-16 surrogate halves (\uD800-\uDFFF). Page DOM text,
// OCR output, and other CDP-sourced strings can contain lone surrogates;
// JSON consumers downstream (Anthropic API in particular) reject them with
// "no low surrogate in string". Valid surrogate pairs (e.g. emoji) survive
// unchanged. Lone halves become U+FFFD (the replacement character).
//
// INVARIANT: every server egress path that ships page-content strings MUST
// route through this sanitizer. handleCommandInternal wraps the final
// cr.result string (text/plain bodies carry lone surrogates verbatim;
// JSON.stringify already escapes them). The two SSE producers below
// stringify with `sanitizeReplacer` so payload string fields get cleaned
// BEFORE escaping. Plain post-stringify regex is a no-op there because
// JSON.stringify converts \uD800 → "\\ud800" — the regex can't see the
// surrogate after that point.
function sanitizeLoneSurrogates(str: string): string {
return str.replace(/[\uD800-\uDFFF]/g, (match, offset) => {
const code = match.charCodeAt(0);
if (code >= 0xD800 && code <= 0xDBFF) {
const next = str.charCodeAt(offset + 1);
if (next >= 0xDC00 && next <= 0xDFFF) return match;
}
if (code >= 0xDC00 && code <= 0xDFFF) {
const prev = str.charCodeAt(offset - 1);
if (prev >= 0xD800 && prev <= 0xDBFF) return match;
}
return '\uFFFD';
});
}
// JSON.stringify replacer that sanitizes string values before they get
// escape-encoded. Pair with stringify when the consumer will JSON.parse the
// payload back into JS strings (SSE clients do this).
function sanitizeReplacer(_key: string, value: unknown): unknown {
return typeof value === 'string' ? sanitizeLoneSurrogates(value) : value;
}
// ─── Config ─────────────────────────────────────────────────────
const config = resolveConfig();
ensureStateDir(config);
@@ -683,7 +720,7 @@ interface CommandResult {
* skipActivity: true when called from chain (chain emits 1 event for all subcommands)
* chainDepth: recursion guard — reject nested chains (depth > 0 means inside a chain)
*/
-async function handleCommandInternal(
+async function handleCommandInternalImpl(
body: { command: string; args?: string[]; tabId?: number },
tokenInfo?: TokenInfo | null,
opts?: { skipRateCheck?: boolean; skipActivity?: boolean; chainDepth?: number },
@@ -1027,6 +1064,21 @@ async function handleCommandInternal(
}
}
/**
* Sanitizing wrapper around handleCommandInternalImpl. ALL callers (single-command
* HTTP, batch loop, scoped-token dispatch) go through this so the lone-surrogate
* sanitization happens once at the architectural choke point, not per-leaf.
* Do not bypass this by calling handleCommandInternalImpl directly.
*/
async function handleCommandInternal(
body: { command: string; args?: string[]; tabId?: number },
tokenInfo?: TokenInfo | null,
opts?: { skipRateCheck?: boolean; skipActivity?: boolean; chainDepth?: number },
): Promise<CommandResult> {
const cr = await handleCommandInternalImpl(body, tokenInfo, opts);
return { ...cr, result: sanitizeLoneSurrogates(cr.result) };
}
/** HTTP wrapper — converts CommandResult to Response */
async function handleCommand(body: any, tokenInfo?: TokenInfo | null): Promise<Response> {
const cr = await handleCommandInternal(body, tokenInfo);
@@ -1827,19 +1879,24 @@ export async function start() {
const stream = new ReadableStream({
start(controller) {
// SSE egress invariant: every JSON.stringify here ships page-content-derived
// fields (URLs, command args, errors) to the sidebar. Lone surrogates must
// be sanitized DURING stringify (via sanitizeReplacer) so they're cleaned
// before escape-encoding — post-stringify regex is ineffective because
// JSON.stringify has already converted \uD800 → "\\ud800".
// 1. Gap detection + replay
const { entries, gap, gapFrom, availableFrom } = getActivityAfter(afterId);
if (gap) {
-controller.enqueue(encoder.encode(`event: gap\ndata: ${JSON.stringify({ gapFrom, availableFrom })}\n\n`));
+controller.enqueue(encoder.encode(`event: gap\ndata: ${JSON.stringify({ gapFrom, availableFrom }, sanitizeReplacer)}\n\n`));
}
for (const entry of entries) {
-controller.enqueue(encoder.encode(`event: activity\ndata: ${JSON.stringify(entry)}\n\n`));
+controller.enqueue(encoder.encode(`event: activity\ndata: ${JSON.stringify(entry, sanitizeReplacer)}\n\n`));
}
// 2. Subscribe for live events
const unsubscribe = subscribe((entry) => {
try {
-controller.enqueue(encoder.encode(`event: activity\ndata: ${JSON.stringify(entry)}\n\n`));
+controller.enqueue(encoder.encode(`event: activity\ndata: ${JSON.stringify(entry, sanitizeReplacer)}\n\n`));
} catch (err: any) {
console.debug('[browse] Activity SSE stream error, unsubscribing:', err.message);
unsubscribe();
@@ -2188,10 +2245,15 @@ export async function start() {
const encoder = new TextEncoder();
const stream = new ReadableStream({
start(controller) {
// SSE egress invariant: inspectorData and CDP event payloads carry
// page-DOM strings (selectors, attribute values, console messages).
// sanitizeReplacer cleans lone surrogates DURING JSON.stringify so
// they're neutralized before escape-encoding (post-stringify regex
// is a no-op once \uD800 has become "\\ud800").
// Send current state immediately
if (inspectorData) {
controller.enqueue(encoder.encode(
-`event: state\ndata: ${JSON.stringify({ data: inspectorData, timestamp: inspectorTimestamp })}\n\n`
+`event: state\ndata: ${JSON.stringify({ data: inspectorData, timestamp: inspectorTimestamp }, sanitizeReplacer)}\n\n`
));
}
@@ -2199,7 +2261,7 @@ export async function start() {
const notify: InspectorSubscriber = (event) => {
try {
controller.enqueue(encoder.encode(
-`event: inspector\ndata: ${JSON.stringify(event)}\n\n`
+`event: inspector\ndata: ${JSON.stringify(event, sanitizeReplacer)}\n\n`
));
} catch (err: any) {
console.debug('[browse] Inspector SSE stream error:', err.message);

@@ -0,0 +1,129 @@
import { describe, test, expect } from 'bun:test';
import * as fs from 'fs';
import * as path from 'path';
// The sanitizer is module-private in server.ts. Rather than refactor it to a
// separate module just for testing, we extract its source via a regex slice and
// eval it in a fresh function scope. Keeps the production layout untouched.
const SERVER_PATH = path.resolve(import.meta.dir, '..', 'src', 'server.ts');
const SERVER_SRC = fs.readFileSync(SERVER_PATH, 'utf-8');
const fnMatch = SERVER_SRC.match(
/function sanitizeLoneSurrogates\(str: string\): string \{[\s\S]*?\n\}/
);
if (!fnMatch) throw new Error('Could not locate sanitizeLoneSurrogates in server.ts');
// Strip TS annotations so eval works under plain JS.
const jsSrc = fnMatch[0].replace('(str: string): string', '(str)');
const sanitizeLoneSurrogates = new Function(`${jsSrc}\nreturn sanitizeLoneSurrogates;`)() as (
s: string,
) => string;
describe('sanitizeLoneSurrogates — unit cases', () => {
test('passthrough ASCII', () => {
expect(sanitizeLoneSurrogates('hello')).toBe('hello');
});
test('passthrough empty string', () => {
expect(sanitizeLoneSurrogates('')).toBe('');
});
test('preserves valid surrogate pair (U+1F389 🎉)', () => {
expect(sanitizeLoneSurrogates('hi 🎉')).toBe('hi 🎉');
});
test('replaces lone high surrogate mid-string', () => {
expect(sanitizeLoneSurrogates('a\uD800b')).toBe('a\uFFFDb');
});
test('replaces lone low surrogate mid-string', () => {
expect(sanitizeLoneSurrogates('a\uDC00b')).toBe('a\uFFFDb');
});
test('replaces trailing lone high at end of string', () => {
expect(sanitizeLoneSurrogates('a\uD800')).toBe('a\uFFFD');
});
test('replaces leading lone low at start of string', () => {
expect(sanitizeLoneSurrogates('\uDC00b')).toBe('\uFFFDb');
});
test('replaces two adjacent lone highs', () => {
expect(sanitizeLoneSurrogates('\uD800\uD800')).toBe('\uFFFD\uFFFD');
});
test('replaces two adjacent lone lows', () => {
expect(sanitizeLoneSurrogates('\uDC00\uDC00')).toBe('\uFFFD\uFFFD');
});
test('preserves valid pair followed by lone low', () => {
// \uD800\uDC00 = U+10000 = 𐀀, then a separate lone low.
const input = '𐀀\uDC00';
const output = sanitizeLoneSurrogates(input);
// Valid pair intact, trailing lone low replaced.
expect(output).toBe('𐀀\uFFFD');
});
test('preserves valid pair preceded by lone low', () => {
const input = '\uDC00𐀀';
const output = sanitizeLoneSurrogates(input);
expect(output).toBe('\uFFFD𐀀');
});
});
describe('sanitizeLoneSurrogates — bug-repro (D5)', () => {
// Pin the regression intent: a future refactor that drops sanitization
// must fail this test even if happy-path tests still pass.
test('unsanitized lone surrogate causes UTF-8 encode to substitute, sanitized version is stable', () => {
const badPayload = 'page content\uD800more content';
// Buffer.from(str, 'utf-8') silently substitutes invalid sequences with
// EF BF BD (U+FFFD). Round-trip is therefore lossy for lone surrogates.
const roundTrippedRaw = Buffer.from(badPayload, 'utf-8').toString('utf-8');
expect(roundTrippedRaw).not.toBe(badPayload); // proves the bug exists pre-sanitize
// After sanitization the round-trip is stable.
const sanitized = sanitizeLoneSurrogates(badPayload);
const roundTrippedSanitized = Buffer.from(sanitized, 'utf-8').toString('utf-8');
expect(roundTrippedSanitized).toBe(sanitized);
});
test('JSON.parse(JSON.stringify(...)) round-trip is stable after sanitization', () => {
// Anthropic's API path wraps the response body in a tool_result JSON
// object. JSON.stringify CAN encode a lone surrogate (escapes it), but
// some downstream consumers reject the resulting body.
const badPayload = 'before\uD800after';
const sanitized = sanitizeLoneSurrogates(badPayload);
const wrapped = JSON.stringify({ content: sanitized });
const reparsed = JSON.parse(wrapped) as { content: string };
// .toBe(sanitized) already proves the surrogate was replaced; the
// additional explicit check below documents the specific code points.
expect(reparsed.content).toBe(sanitized);
expect(reparsed.content.charCodeAt(6)).toBe(0xfffd); // U+FFFD, not \uD800
});
});
describe('sanitizeLoneSurrogates — wiring invariants', () => {
test('server.ts wraps every command result through handleCommandInternal', () => {
// The architectural choice is to wrap once at handleCommandInternal so
// both single-command HTTP and the batch loop inherit. If a future
// refactor moves sanitization back to handleCommand only, this test
// fails by detecting the missing wrapper.
expect(SERVER_SRC).toContain('async function handleCommandInternalImpl(');
expect(SERVER_SRC).toContain('result: sanitizeLoneSurrogates(cr.result)');
});
test('SSE activity feed sanitizes outbound frames via sanitizeReplacer', () => {
// Replacer must run DURING stringify; post-stringify regex is ineffective
// because JSON.stringify converts \uD800 → "\\ud800" before our regex sees it.
expect(SERVER_SRC).toContain('JSON.stringify(entry, sanitizeReplacer)');
});
test('SSE inspector stream sanitizes outbound frames via sanitizeReplacer', () => {
expect(SERVER_SRC).toContain('JSON.stringify(event, sanitizeReplacer)');
});
test('sanitizeReplacer is a function defined in server.ts', () => {
expect(SERVER_SRC).toContain('function sanitizeReplacer(');
});
});