mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-20 03:12:28 +08:00
feat(test/helpers): initialPlanContent + wrote_findings_before_asking + shared report-at-bottom assertion
Three additions to claude-pty-runner.ts: 1. runPlanSkillObservation gains initialPlanContent?: string. Pre-pumps a user message containing the seeded plan before invoking the skill, with a 3s gap so the message renders before the slash command. claude has no --plan-file flag (verified via claude --help), so message-pump is the route. Lets STOP-gate regression tests force complexity findings. 2. ClassifyResult gains wrote_findings_before_asking with companion strictPlanWrites?: boolean opt on classifyVisible. Fires when a Write/ Edit to .claude/plans/* precedes any AskUserQuestion render in the session window. Default off — preserves zero-findings → write plan → plan_ready as legitimate for unseeded smokes. Six new unit tests cover before/after-AUQ ordering, permission-dialog edge case, strict-off path. 3. assertReportAtBottomIfPlanWritten(obs) shared helper. Wraps the existing assertReviewReportAtBottom(content) and gates on obs.planFile (artifact existing), so the assertion fires under both 'asked' and 'plan_ready' when a plan was actually written. Also: runPlanSkillObservation now captures obs.planFile on every classifier outcome, not just 'plan_ready'. Catches the case where the skill wrote a plan partway through then paused on a question. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -295,6 +295,78 @@ describe('classifyVisible (runtime path through the runner classifier)', () => {
|
||||
// recent-tail window would surface here.
|
||||
expect(TAIL_SCAN_BYTES).toBe(1500);
|
||||
});
|
||||
|
||||
// D4-B: strictPlanWrites detector. Catches the transcript bug where the
|
||||
// model writes findings to the plan file before any AskUserQuestion fires.
|
||||
test('strictPlanWrites: plan write before any AUQ → wrote_findings_before_asking', () => {
|
||||
const visible = `
|
||||
⏺ Edit(/Users/me/.claude/plans/some-plan.md)
|
||||
⎿ Updated 12 lines
|
||||
`;
|
||||
const result = classifyVisible(visible, { strictPlanWrites: true });
|
||||
expect(result?.outcome).toBe('wrote_findings_before_asking');
|
||||
expect(result?.summary).toContain('.claude/plans/some-plan.md');
|
||||
});
|
||||
|
||||
test('strictPlanWrites: plan write AFTER an AUQ render → not flagged', () => {
|
||||
// AUQ renders first, then the model writes the plan post-answer. This is
|
||||
// the legitimate end-of-workflow flow and must NOT trigger the detector.
|
||||
const visible = `
|
||||
D1 — Some scope question
|
||||
|
||||
❯ 1. Option A
|
||||
2. Option B
|
||||
|
||||
⏺ Edit(/Users/me/.claude/plans/some-plan.md)
|
||||
⎿ Updated 12 lines
|
||||
`;
|
||||
const result = classifyVisible(visible, { strictPlanWrites: true });
|
||||
// Outcome is 'asked' (the numbered list rendered); the post-AUQ plan
|
||||
// write is ignored by the detector.
|
||||
expect(result?.outcome).toBe('asked');
|
||||
});
|
||||
|
||||
test('strictPlanWrites: AUQ first then plan write — write_pos > auq_pos → not flagged', () => {
|
||||
// Same scenario, more explicit ordering: the regex finds the write at a
|
||||
// position AFTER the numbered list. Detector lets it through.
|
||||
const visible = [
|
||||
'D1 — Choose your approach',
|
||||
'',
|
||||
'❯ 1. Approach A',
|
||||
' 2. Approach B',
|
||||
'',
|
||||
'⏺ Write(/Users/me/.claude/plans/draft.md)',
|
||||
'⎿ Wrote 42 lines',
|
||||
].join('\n');
|
||||
const result = classifyVisible(visible, { strictPlanWrites: true });
|
||||
expect(result?.outcome).toBe('asked');
|
||||
});
|
||||
|
||||
test('strictPlanWrites: only a permission dialog visible → plan write still flagged', () => {
|
||||
// A permission dialog ❯ 1./2. is NOT an AUQ; pre-AUQ plan writes still
|
||||
// hit the detector even when a permission prompt is on screen.
|
||||
const visible = `
|
||||
⏺ Edit(/Users/me/.claude/plans/some-plan.md)
|
||||
|
||||
Edit to /Users/me/.claude/plans/some-plan.md
|
||||
|
||||
Do you want to proceed?
|
||||
|
||||
❯ 1. Yes
|
||||
2. No
|
||||
`;
|
||||
const result = classifyVisible(visible, { strictPlanWrites: true });
|
||||
expect(result?.outcome).toBe('wrote_findings_before_asking');
|
||||
});
|
||||
|
||||
test('strictPlanWrites OFF: plan write before AUQ → returns null (legacy behavior preserved)', () => {
|
||||
const visible = `
|
||||
⏺ Edit(/Users/me/.claude/plans/some-plan.md)
|
||||
⎿ Updated 12 lines
|
||||
`;
|
||||
// Without strictPlanWrites, the sanctioned-path list lets this through.
|
||||
expect(classifyVisible(visible)).toBeNull();
|
||||
});
|
||||
});
|
||||
|
||||
describe('parseNumberedOptions', () => {
|
||||
|
||||
Reference in New Issue
Block a user