Two fixes to get the E2E actually running end-to-end (first attempt
failed at the SDK auth step, second at the assertion step):
1. Don't pass an explicit `env:` object to runAgentSdkTest. The SDK's
auth pipeline misses ANTHROPIC_API_KEY when env is supplied as an
object (verified against the plan-mode-no-op test, which passes no
env and auths cleanly). Mutate process.env before the call instead,
and restore the originals in finally so other tests don't inherit
the ambient mutation.
2. The "Run /learn with no arguments" user prompt was too narrow — the
model reduced it to a direct action and skipped the preamble
privacy-gate directives entirely, so zero AskUserQuestions fired.
Mirror the plan-mode-no-op pattern: point the model at the skill
file on disk and ask it to follow every preamble directive. Bumped
maxTurns from 6 to 10 to give the preamble room to execute.
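
The mutate-then-restore pattern from fix 1 can be sketched as below. This is
a minimal illustration, not the code in the test file: `withEnv` and the
`Env` type are hypothetical names, and the helper takes the env object as a
parameter so it can be exercised without touching the real process.env.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper (not from the repo): apply `overrides` to `env`
// for the duration of `fn`, then put the original values back.
async function withEnv<T>(
  env: Env,
  overrides: Env,
  fn: () => Promise<T>,
): Promise<T> {
  // Snapshot only the keys we are about to touch.
  const originals: Env = {};
  for (const key of Object.keys(overrides)) {
    originals[key] = env[key];
  }

  const apply = (values: Env): void => {
    for (const key of Object.keys(values)) {
      const value = values[key];
      if (value === undefined) delete env[key]; // undefined means "unset this key"
      else env[key] = value;
    }
  };

  apply(overrides);
  try {
    return await fn();
  } finally {
    // Runs on success and on throw, so later tests see the original env.
    apply(originals);
  }
}
```

In a test, the SDK runner call would go inside `fn` with `process.env`
passed as `env`. Keys that were absent before the call are deleted again
rather than left set to an empty value.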
Verified both tests pass under `EVALS=1 EVALS_TIER=periodic bun test
test/skill-e2e-brain-privacy-gate.test.ts` against a real ANTHROPIC_API_KEY.
Cost: ~$0.30-$0.50 per test per run.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add two periodic-tier E2E tests that exercise the preamble's privacy gate
end-to-end via the Agent SDK + canUseTool. Neither case had coverage before:
- Positive: stages a fake gbrain on PATH + gbrain_sync_mode_prompted=false
in config, runs a real skill, intercepts tool-use. Asserts the
preamble fires a 3-option AskUserQuestion matching the canonical
prose ("publish session memory" / "artifact" / "decline") and does
NOT fire a second time in the same run (idempotency within session).
- Negative: same staging but gbrain_sync_mode_prompted=true. Asserts the
gate stays silent even with gbrain detected on the host.
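
The two assertions can be sketched as a filter over the intercepted tool
calls. This is an illustration under stated assumptions: `RecordedToolCall`
and `findPrivacyGatePrompts` are hypothetical names, the record shape is
simplified, and matching the phrases against option labels is a guess at
how the real test compares against the canonical prose.

```typescript
// Hypothetical record of one intercepted tool call; the real shape
// delivered to canUseTool may differ.
interface RecordedToolCall {
  toolName: string;
  optionLabels: string[]; // labels of the options offered, if any
}

// Phrases quoted in the positive-test description.
const CANONICAL_PHRASES = ["publish session memory", "artifact", "decline"];

// Return the calls that look like the privacy-gate prompt: an
// AskUserQuestion with exactly three options that together mention
// every canonical phrase (case-insensitive).
function findPrivacyGatePrompts(calls: RecordedToolCall[]): RecordedToolCall[] {
  return calls.filter((call) => {
    if (call.toolName !== "AskUserQuestion") return false;
    if (call.optionLabels.length !== 3) return false;
    const text = call.optionLabels.join("\n").toLowerCase();
    return CANONICAL_PHRASES.every((phrase) => text.includes(phrase));
  });
}
```

Under this sketch the positive test expects exactly one match (fired once,
not twice) and the negative test expects none.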
Registered in test/helpers/touchfiles.ts as `brain-privacy-gate`
(periodic) with dependency tracking on generate-brain-sync-block.ts,
the three gstack-brain-* bins, gstack-config, and the Agent SDK runner.
Diff-based selection re-runs the E2E when any of those change.
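
Diff-based selection of this kind can be sketched as follows. The registry
shape, the `selectTests` and `matches` names, and every path below are
assumptions for illustration; the real test/helpers/touchfiles.ts may be
organized differently.

```typescript
// Hypothetical registry shape; the real touchfiles registry may differ.
interface TouchfileEntry {
  tier: "gate" | "periodic";
  deps: string[]; // exact paths, or prefixes ending in "*"
}

// Illustrative paths only: the commit names the files but not their locations.
const TOUCHFILES: Record<string, TouchfileEntry> = {
  "brain-privacy-gate": {
    tier: "periodic",
    deps: [
      "scripts/generate-brain-sync-block.ts",
      "bin/gstack-brain-*",
      "bin/gstack-config",
    ],
  },
};

// A dep ending in "*" is a prefix match; anything else must match exactly.
function matches(dep: string, file: string): boolean {
  return dep.endsWith("*") ? file.startsWith(dep.slice(0, -1)) : file === dep;
}

// Return the names of tests with at least one dependency among the changed files.
function selectTests(
  changed: string[],
  registry: Record<string, TouchfileEntry> = TOUCHFILES,
): string[] {
  return Object.keys(registry).filter((name) =>
    registry[name].deps.some((dep) => changed.some((file) => matches(dep, file))),
  );
}
```

With this shape, a diff touching any listed dependency selects
`brain-privacy-gate`, and a diff touching only unrelated files selects
nothing.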
Cost: ~$0.30-$0.50 per run. Only fires under EVALS=1 EVALS_TIER=periodic;
gate tier stays free.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>