diff --git a/CHANGELOG.md b/CHANGELOG.md index e9f3049a..674d8b2a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,5 +1,131 @@ # Changelog +## [1.31.0.0] - 2026-05-09 + +## **AskUserQuestion stops getting silently buried in plan files.** +## **The forever-war contradiction in the preamble is deleted, the test harness sees prose-rendered questions, and 5 fictional test variants are gone.** + +After v1.31, `/plan-eng-review`, `/office-hours`, and the rest of the +plan-* skills surface every decision through AskUserQuestion. The +"fallback when neither variant is callable" clause that quietly +authorized a `## Decisions to confirm` plan-write + ExitPlanMode is +deleted, along with the "trivial fix" exception that survived the +prior tightening and the "outside plan mode, output as prose and stop" +escape hatch. Skill-text loses ~10 lines net across 8 inline sites +plus the 6 places the same fallback was repeated verbatim inside +`plan-eng-review/SKILL.md.tmpl`. + +Five test variants that simulated a Conductor configuration nobody +actually runs (`--disallowedTools AskUserQuestion` without a registered +MCP variant, i.e. "neither AUQ tool callable") are deleted. They +tested a state that doesn't exist in production: real Conductor +sessions register `mcp__conductor__AskUserQuestion`, so the model +always has the MCP variant. The deleted variants were a long-running +flake source. + +The harness gained three new primitives that survive the test cull: +`isProseAUQVisible` regex detector for lettered (A/B/C/D) and numbered +(1/2/3) prose AUQ rendering, an LLM judge using `claude-haiku-4-5` +that classifies TTY snapshots as `waiting` / `working` / `hung`, and +high-water-mark tracking on `PlanSkillObservation` so tests that +check "did the user see a question at SOME point" don't have to scan +the truncated 2KB evidence window. + +### The numbers that matter + +| Surface | Before | After | Δ | +|---|---|---|---| +| Fallback clause inline sites in skill-text | 8 | 0 | -8 | +| Surviving "trivial fix" / "prose-and-stop" escape hatches | 2 | 0 | -2 | +| Plan-mode test variants under fictional `--disallowedTools` | 5 | 0 | -5 | +| LLM judge classifications | 0 | 4 (waiting/working/hung/unknown) | +4 | +| Diff size on this branch (after merge with main) | — | -721 / +928 | net +207 | + +The deleted "fallback" clause was the load-bearing instruction the +model was rationalizing as a general escape hatch from "fanning out +round-trip AUQs." Once it's gone, the anti-shortcut clause and STOP +gates in `plan-eng-review` Sections 1-4 stand without a contradicting +instruction to lose to. `gate-tier plan-eng-finding-floor` passes on +every run since the architectural fix landed. + +### What this means for builders + +If you are running `/plan-eng-review` or any other plan-* skill, you +will see one AskUserQuestion per finding instead of four findings +quietly batched into a "## Decisions to confirm" plan-file write that +gets buried under ExitPlanMode. The harness improvements (prose-AUQ +detector, LLM judge, snapshot logs at `~/.gstack/analytics/pty-judge.jsonl` +and `~/.gstack/analytics/pty-snapshots/` when `GSTACK_PTY_LOG=1`) are +load-bearing for any future plan-mode regression test that needs to +distinguish "model is thinking" from "model is waiting for me." + +### Itemized changes + +#### Architectural fix +- Deleted `## Decisions to confirm` fallback clause from + `scripts/resolvers/preamble/generate-ask-user-format.ts:12` (both + branches: plan-file write AND prose-and-stop) +- Deleted same fallback clause from + `scripts/resolvers/preamble/generate-completion-status.ts:29` +- Deleted fallback inline sentences from + `plan-eng-review/SKILL.md.tmpl` (Step 0 + Sections 1-4: 5 instances) + and `office-hours/SKILL.md.tmpl` (1 instance) +- Deleted "Only skip AskUserQuestion when the decision is genuinely + trivial" exception from `plan-eng-review/SKILL.md.tmpl:204` +- Replaced with single hard rule: "If no AskUserQuestion variant + appears in your tool list, this skill is BLOCKED. Stop, report + `BLOCKED — AskUserQuestion unavailable`, and wait for the user." +- Regenerated all 47 generated SKILL.md files (default + 7 host adapters) + +#### Test harness primitives +- Added `isProseAUQVisible` regex detector with line-start anchoring + and tail-only native-cursor gate + (`test/helpers/claude-pty-runner.ts`); 8 unit tests cover lettered + and numbered formats, threshold edges, native-cursor exclusion, and + mid-prose false-positive guard +- Added `judgePtyState` LLM judge using `claude -p --model + claude-haiku-4-5 --max-turns 1` with subscription auth (no API key + env required), in-process cache by SHA-1 of normalized last-4KB + snapshot, JSONL log to `~/.gstack/analytics/pty-judge.jsonl` +- Added high-water-mark flags `proseAUQEverObserved` and + `waitingEverObserved` to `PlanSkillObservation`; tests check these + rather than re-running detectors against the truncated evidence + window +- Added snapshot logging via `GSTACK_PTY_LOG=1`, dumping last 4KB of + visible TTY at every judge tick to + `~/.gstack/analytics/pty-snapshots/-ms.txt` +- `assertReportAtBottomIfPlanWritten` now tolerates ENOENT (TTY-detected + path that didn't persist) and `outcome='asked'` smoke runs (workflow + exited at first AUQ, no review report yet) +- Wired LLM judge fallback into `runPlanSkillObservation` and + `runPlanSkillFloorCheck` polling loops: after 60s of no terminal + classification, snapshot every 30s and call the judge; on `waiting` + verdict, return `outcome='asked'` early + +#### Test surface changes +- Added `test/skill-e2e-plan-eng-multi-finding-batching.test.ts` + (periodic tier) using `runPlanSkillCounting` with a 4-finding seeded + fixture (`FORCING_BATCHING_ENG`) that mirrors the original transcript + bug shape; asserts at least 3 distinct review-phase AUQs +- Deleted `test/skill-e2e-autoplan-auto-mode.test.ts` entirely +- Deleted test 2 (`--disallowedTools AskUserQuestion`) from + `plan-ceo-plan-mode`, `plan-design-plan-mode`, `plan-eng-plan-mode` + (kept test 1 baseline plus plan-eng-plan-mode test 3 STOP-gate) +- Removed `autoplan-auto-mode` entry from `test/helpers/touchfiles.ts` + (E2E_TOUCHFILES and E2E_TIERS); updated `test/touchfiles.test.ts` + assertion count + +#### For contributors +- Three subagent investigations across the debugging cycle were the + load-bearing diagnostic step: the architectural fix, the prose-AUQ + detector design, and the test-fictional-state retraction. The + pattern that worked: have a fresh-context subagent verify the + parent's mental model against actual file contents before committing + to a fix. Codex review caught that "three places" was actually + eight, that the proposed multi-finding test would pass trivially + given how `runPlanSkillFloorCheck` exits on first AUQ, and that + three existing tests codified the deleted fallback as PASS. + ## [1.30.0.0] - 2026-05-09 ## **Twenty-one community fixes land in one wave, plus closing fixes that put the Windows + codex surfaces under CI for the first time.** diff --git a/SKILL.md b/SKILL.md index ad1a0e5b..5ee5de76 100644 --- a/SKILL.md +++ b/SKILL.md @@ -107,7 +107,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" diff --git a/VERSION b/VERSION index fbeae638..52c3b4a5 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -1.30.0.0 +1.31.0.0 diff --git a/autoplan/SKILL.md b/autoplan/SKILL.md index 8cfe0237..eb149a5a 100644 --- a/autoplan/SKILL.md +++ b/autoplan/SKILL.md @@ -116,7 +116,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -287,7 +287,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/benchmark-models/SKILL.md b/benchmark-models/SKILL.md index a1a15c18..5e5e6bd6 100644 --- a/benchmark-models/SKILL.md +++ b/benchmark-models/SKILL.md @@ -109,7 +109,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" diff --git a/benchmark/SKILL.md b/benchmark/SKILL.md index fb4a9fbc..46934ba3 100644 --- a/benchmark/SKILL.md +++ b/benchmark/SKILL.md @@ -109,7 +109,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" diff --git a/browse/SKILL.md b/browse/SKILL.md index b40a441c..1d544756 100644 --- a/browse/SKILL.md +++ b/browse/SKILL.md @@ -108,7 +108,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" diff --git a/canary/SKILL.md b/canary/SKILL.md index fd43d9f9..79ab0f07 100644 --- a/canary/SKILL.md +++ b/canary/SKILL.md @@ -108,7 +108,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -279,7 +279,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/codex/SKILL.md b/codex/SKILL.md index b71d1ed4..6be6ccdf 100644 --- a/codex/SKILL.md +++ b/codex/SKILL.md @@ -110,7 +110,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -281,7 +281,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/context-restore/SKILL.md b/context-restore/SKILL.md index ac458822..68e1fb5d 100644 --- a/context-restore/SKILL.md +++ b/context-restore/SKILL.md @@ -112,7 +112,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -283,7 +283,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/context-save/SKILL.md b/context-save/SKILL.md index 8cd8a32c..f260a6fa 100644 --- a/context-save/SKILL.md +++ b/context-save/SKILL.md @@ -112,7 +112,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -283,7 +283,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/cso/SKILL.md b/cso/SKILL.md index 9696e058..4ab69b53 100644 --- a/cso/SKILL.md +++ b/cso/SKILL.md @@ -113,7 +113,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -284,7 +284,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/design-consultation/SKILL.md b/design-consultation/SKILL.md index 01ac0d7b..b131a274 100644 --- a/design-consultation/SKILL.md +++ b/design-consultation/SKILL.md @@ -136,7 +136,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -307,7 +307,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/design-html/SKILL.md b/design-html/SKILL.md index 5b066f10..5e3961cf 100644 --- a/design-html/SKILL.md +++ b/design-html/SKILL.md @@ -115,7 +115,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -286,7 +286,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/design-review/SKILL.md b/design-review/SKILL.md index 34db25eb..96aa5054 100644 --- a/design-review/SKILL.md +++ b/design-review/SKILL.md @@ -113,7 +113,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -284,7 +284,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/design-shotgun/SKILL.md b/design-shotgun/SKILL.md index 5661a616..b20f46ea 100644 --- a/design-shotgun/SKILL.md +++ b/design-shotgun/SKILL.md @@ -130,7 +130,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -301,7 +301,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/devex-review/SKILL.md b/devex-review/SKILL.md index 26330842..d0ecec0c 100644 --- a/devex-review/SKILL.md +++ b/devex-review/SKILL.md @@ -113,7 +113,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -284,7 +284,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/document-release/SKILL.md b/document-release/SKILL.md index 7c362be7..9fa7f5ad 100644 --- a/document-release/SKILL.md +++ b/document-release/SKILL.md @@ -110,7 +110,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -281,7 +281,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/health/SKILL.md b/health/SKILL.md index 8bf85628..0f94f543 100644 --- a/health/SKILL.md +++ b/health/SKILL.md @@ -110,7 +110,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -281,7 +281,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/investigate/SKILL.md b/investigate/SKILL.md index e4435db6..18b6537e 100644 --- a/investigate/SKILL.md +++ b/investigate/SKILL.md @@ -149,7 +149,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -320,7 +320,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/land-and-deploy/SKILL.md b/land-and-deploy/SKILL.md index e16515a3..c914b1b9 100644 --- a/land-and-deploy/SKILL.md +++ b/land-and-deploy/SKILL.md @@ -107,7 +107,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -278,7 +278,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/landing-report/SKILL.md b/landing-report/SKILL.md index 712c9ae1..6600affd 100644 --- a/landing-report/SKILL.md +++ b/landing-report/SKILL.md @@ -108,7 +108,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -279,7 +279,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/learn/SKILL.md b/learn/SKILL.md index 0a1a34a8..66458eb7 100644 --- a/learn/SKILL.md +++ b/learn/SKILL.md @@ -110,7 +110,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -281,7 +281,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/make-pdf/SKILL.md b/make-pdf/SKILL.md index 6883ea16..f116687d 100644 --- a/make-pdf/SKILL.md +++ b/make-pdf/SKILL.md @@ -108,7 +108,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" diff --git a/office-hours/SKILL.md b/office-hours/SKILL.md index 4f2a83ff..4ada7e5b 100644 --- a/office-hours/SKILL.md +++ b/office-hours/SKILL.md @@ -145,7 +145,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -316,7 +316,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format @@ -1329,7 +1329,7 @@ Rules: **RECOMMENDATION:** Choose [X] because [one-line reason mapped to the founder's stated goal]. -Emit ONE AskUserQuestion that lists every alternative (A/B and optionally C) as numbered options, using the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — write the question text and call the tool. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +Emit ONE AskUserQuestion that lists every alternative (A/B and optionally C) as numbered options, using the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — write the question text and call the tool. **STOP.** Do NOT proceed to Phase 4.5 (Founder Signal Synthesis), Phase 5 (Design Doc), Phase 6 (Closing), or any design-doc generation until the user responds. A "clearly winning approach" is still an approach decision and still needs explicit user approval before it lands in the design doc. Writing the recommendation in chat prose and continuing forward is the failure mode this gate exists to prevent. diff --git a/office-hours/SKILL.md.tmpl b/office-hours/SKILL.md.tmpl index 2ab92c88..8546040e 100644 --- a/office-hours/SKILL.md.tmpl +++ b/office-hours/SKILL.md.tmpl @@ -440,7 +440,7 @@ Rules: **RECOMMENDATION:** Choose [X] because [one-line reason mapped to the founder's stated goal]. -Emit ONE AskUserQuestion that lists every alternative (A/B and optionally C) as numbered options, using the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — write the question text and call the tool. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +Emit ONE AskUserQuestion that lists every alternative (A/B and optionally C) as numbered options, using the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — write the question text and call the tool. **STOP.** Do NOT proceed to Phase 4.5 (Founder Signal Synthesis), Phase 5 (Design Doc), Phase 6 (Closing), or any design-doc generation until the user responds. A "clearly winning approach" is still an approach decision and still needs explicit user approval before it lands in the design doc. Writing the recommendation in chat prose and continuing forward is the failure mode this gate exists to prevent. diff --git a/open-gstack-browser/SKILL.md b/open-gstack-browser/SKILL.md index 099639ac..7f31f35e 100644 --- a/open-gstack-browser/SKILL.md +++ b/open-gstack-browser/SKILL.md @@ -107,7 +107,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -278,7 +278,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/package.json b/package.json index 482a541c..679bb503 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "gstack", - "version": "1.29.0.0", + "version": "1.31.0.0", "description": "Garry's Stack — Claude Code skills + fast headless browser. One repo, one install, entire AI engineering workflow.", "license": "MIT", "type": "module", diff --git a/pair-agent/SKILL.md b/pair-agent/SKILL.md index 7ef456ae..a67a9c6c 100644 --- a/pair-agent/SKILL.md +++ b/pair-agent/SKILL.md @@ -108,7 +108,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -279,7 +279,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/plan-ceo-review/SKILL.md b/plan-ceo-review/SKILL.md index 931678f2..7292ac31 100644 --- a/plan-ceo-review/SKILL.md +++ b/plan-ceo-review/SKILL.md @@ -139,7 +139,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -310,7 +310,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format @@ -1740,7 +1740,7 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for * For each option: effort, risk, and maintenance burden in one line. * **Map the reasoning to my engineering preferences above.** One sentence connecting your recommendation to a specific preference. * Label with issue NUMBER + option LETTER (e.g., "3A", "3B"). -* **Escape hatch (tightened):** If a section has zero findings, state "No issues, moving on" and proceed. If it has findings, use AskUserQuestion for each — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. Only skip AskUserQuestion when the decision is genuinely trivial (e.g., a typo fix) AND there are no meaningful alternatives. When in doubt, ask. +* **Zero findings:** if a section has zero findings, state "No issues, moving on" and proceed. Otherwise, use AskUserQuestion for each finding — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. ## Required Outputs diff --git a/plan-ceo-review/SKILL.md.tmpl b/plan-ceo-review/SKILL.md.tmpl index 9de3f458..7da8f576 100644 --- a/plan-ceo-review/SKILL.md.tmpl +++ b/plan-ceo-review/SKILL.md.tmpl @@ -681,7 +681,7 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for * For each option: effort, risk, and maintenance burden in one line. * **Map the reasoning to my engineering preferences above.** One sentence connecting your recommendation to a specific preference. * Label with issue NUMBER + option LETTER (e.g., "3A", "3B"). -* **Escape hatch (tightened):** If a section has zero findings, state "No issues, moving on" and proceed. If it has findings, use AskUserQuestion for each — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. Only skip AskUserQuestion when the decision is genuinely trivial (e.g., a typo fix) AND there are no meaningful alternatives. When in doubt, ask. +* **Zero findings:** if a section has zero findings, state "No issues, moving on" and proceed. Otherwise, use AskUserQuestion for each finding — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. ## Required Outputs diff --git a/plan-design-review/SKILL.md b/plan-design-review/SKILL.md index 05b7cb05..3cdabc11 100644 --- a/plan-design-review/SKILL.md +++ b/plan-design-review/SKILL.md @@ -112,7 +112,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -283,7 +283,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format @@ -1542,7 +1542,7 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for * Present 2-3 options. For each: effort to specify now, risk if deferred. * **Map to Design Principles above.** One sentence connecting your recommendation to a specific principle. * Label with issue NUMBER + option LETTER (e.g., "3A", "3B"). -* **Escape hatch (tightened):** If a section has zero findings, state "No issues, moving on" and proceed. If it has findings, use AskUserQuestion for each — a gap with an "obvious fix" is still a gap and still needs user approval before any change lands in the plan. Only skip AskUserQuestion when the fix is genuinely trivial AND there are no meaningful design alternatives. When in doubt, ask. +* **Zero findings:** if a section has zero findings, state "No issues, moving on" and proceed. Otherwise, use AskUserQuestion for each gap — a gap with an "obvious fix" is still a gap and still needs user approval before any change lands in the plan. * **NEVER use AskUserQuestion to ask which variant the user prefers.** Always create a comparison board first (`$D compare --serve`) and open it in the browser. The board has rating controls, comments, remix/regenerate buttons, and structured feedback output. Use AskUserQuestion ONLY to notify the user the board is open and wait for them to finish — not to present variants inline and ask "which do you prefer?" That is a degraded experience. ## Required Outputs diff --git a/plan-design-review/SKILL.md.tmpl b/plan-design-review/SKILL.md.tmpl index 1b84c2f0..24e89a01 100644 --- a/plan-design-review/SKILL.md.tmpl +++ b/plan-design-review/SKILL.md.tmpl @@ -348,7 +348,7 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for * Present 2-3 options. For each: effort to specify now, risk if deferred. * **Map to Design Principles above.** One sentence connecting your recommendation to a specific principle. * Label with issue NUMBER + option LETTER (e.g., "3A", "3B"). -* **Escape hatch (tightened):** If a section has zero findings, state "No issues, moving on" and proceed. If it has findings, use AskUserQuestion for each — a gap with an "obvious fix" is still a gap and still needs user approval before any change lands in the plan. Only skip AskUserQuestion when the fix is genuinely trivial AND there are no meaningful design alternatives. When in doubt, ask. +* **Zero findings:** if a section has zero findings, state "No issues, moving on" and proceed. Otherwise, use AskUserQuestion for each gap — a gap with an "obvious fix" is still a gap and still needs user approval before any change lands in the plan. * **NEVER use AskUserQuestion to ask which variant the user prefers.** Always create a comparison board first (`$D compare --serve`) and open it in the browser. The board has rating controls, comments, remix/regenerate buttons, and structured feedback output. Use AskUserQuestion ONLY to notify the user the board is open and wait for them to finish — not to present variants inline and ask "which do you prefer?" That is a degraded experience. ## Required Outputs diff --git a/plan-devex-review/SKILL.md b/plan-devex-review/SKILL.md index dda12c3f..90c0566b 100644 --- a/plan-devex-review/SKILL.md +++ b/plan-devex-review/SKILL.md @@ -116,7 +116,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -287,7 +287,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format @@ -1712,11 +1712,10 @@ DX reviews: * **Map to DX First Principles above.** One sentence connecting your recommendation to a specific principle (e.g., "This violates 'zero friction at T0' because [persona] needs 3 extra config steps before their first API call"). -* **Escape hatch (tightened):** If a section has zero findings, state "No issues, - moving on" and proceed. If it has findings, use AskUserQuestion for each — a - gap with an "obvious fix" is still a gap and still needs user approval before - any change lands in the plan. Only skip AskUserQuestion when the fix is - genuinely trivial AND there are no meaningful DX alternatives. When in doubt, ask. +* **Zero findings:** if a section has zero findings, state "No issues, moving on" + and proceed. Otherwise, use AskUserQuestion for each gap — a gap with an + "obvious fix" is still a gap and still needs user approval before any change + lands in the plan. * Assume the user hasn't looked at this window in 20 minutes. Re-ground every question. ## Required Outputs diff --git a/plan-devex-review/SKILL.md.tmpl b/plan-devex-review/SKILL.md.tmpl index 449b4132..5b68c2ea 100644 --- a/plan-devex-review/SKILL.md.tmpl +++ b/plan-devex-review/SKILL.md.tmpl @@ -669,11 +669,10 @@ DX reviews: * **Map to DX First Principles above.** One sentence connecting your recommendation to a specific principle (e.g., "This violates 'zero friction at T0' because [persona] needs 3 extra config steps before their first API call"). -* **Escape hatch (tightened):** If a section has zero findings, state "No issues, - moving on" and proceed. If it has findings, use AskUserQuestion for each — a - gap with an "obvious fix" is still a gap and still needs user approval before - any change lands in the plan. Only skip AskUserQuestion when the fix is - genuinely trivial AND there are no meaningful DX alternatives. When in doubt, ask. +* **Zero findings:** if a section has zero findings, state "No issues, moving on" + and proceed. Otherwise, use AskUserQuestion for each gap — a gap with an + "obvious fix" is still a gap and still needs user approval before any change + lands in the plan. * Assume the user hasn't looked at this window in 20 minutes. Re-ground every question. ## Required Outputs diff --git a/plan-eng-review/SKILL.md b/plan-eng-review/SKILL.md index 655e36b8..59ca19b3 100644 --- a/plan-eng-review/SKILL.md +++ b/plan-eng-review/SKILL.md @@ -114,7 +114,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -285,7 +285,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format @@ -884,7 +884,7 @@ Before reviewing anything, answer these questions: - How will users download or install it (GitHub Releases, package manager, container registry)? If the plan defers distribution, flag it explicitly in the "NOT in scope" section — don't let it silently drop. -If the complexity check triggers (8+ files or 2+ new classes/services), STOP before any review-section work. Call AskUserQuestion: name what's overbuilt, propose a minimal version that achieves the core goal, ask whether to reduce or proceed as-is. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +If the complexity check triggers (8+ files or 2+ new classes/services), STOP before any review-section work. Call AskUserQuestion: name what's overbuilt, propose a minimal version that achieves the core goal, ask whether to reduce or proceed as-is. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to Section 1 (Architecture review), edit the plan file with a proposed scope reduction, or call ExitPlanMode until the user responds. Naming the 80% solution in chat prose and continuing — or loading the AskUserQuestion schema via ToolSearch and then never invoking it — is the failure mode this gate exists to prevent. @@ -949,7 +949,7 @@ Evaluate: * For each new codepath or integration point, describe one realistic production failure scenario and whether the plan accounts for it. * **Distribution architecture:** If this introduces a new artifact (binary, package, container), how does it get built, published, and updated? Is the CI/CD pipeline part of the plan or deferred? -For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. @@ -987,7 +987,7 @@ Evaluate: * Areas that are over-engineered or under-engineered relative to my preferences. * Existing ASCII diagrams in touched files — are they still accurate after this change? -For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. @@ -1171,7 +1171,7 @@ This file is consumed by `/qa` and `/qa-only` as primary test input. Include onl For LLM/prompt changes: check the "Prompt/LLM changes" file patterns listed in CLAUDE.md. If this plan touches ANY of those patterns, state which eval suites must be run, which cases should be added, and what baselines to compare against. Then use AskUserQuestion to confirm the eval scope with the user. -For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. @@ -1182,7 +1182,7 @@ Evaluate: * Caching opportunities. * Slow or high-complexity code paths. -For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. @@ -1339,7 +1339,7 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for * **Map the reasoning to my engineering preferences above.** One sentence connecting your recommendation to a specific preference (DRY, explicit > clever, minimal diff, etc.). * Label with issue NUMBER + option LETTER (e.g., "3A", "3B"). * **Coverage vs kind:** for every per-issue AskUserQuestion you raise in this review, decide whether the options differ in coverage or in kind. If coverage (e.g., more tests vs fewer, complete error handling vs happy-path-only, full edge-case coverage vs shortcut), include `Completeness: N/10` on each option. If kind (e.g., architectural choice between two different systems, posture-over-posture, A/B/C where each is a different kind of thing), skip the score and add one line: `Note: options differ in kind, not coverage — no completeness score.` Do NOT fabricate scores on kind-differentiated questions — filler scores are worse than no score. -* **Escape hatch (tightened):** If a section has zero findings, state "No issues, moving on" and proceed. If it has findings, use AskUserQuestion for each — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. Only skip AskUserQuestion when the decision is genuinely trivial (e.g., a typo fix) AND there are no meaningful alternatives. When in doubt, ask. +* **Zero findings:** if a section has zero findings, state "No issues, moving on" and proceed. Otherwise, use AskUserQuestion for each finding — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. ## Required outputs diff --git a/plan-eng-review/SKILL.md.tmpl b/plan-eng-review/SKILL.md.tmpl index f473dd10..a950c060 100644 --- a/plan-eng-review/SKILL.md.tmpl +++ b/plan-eng-review/SKILL.md.tmpl @@ -113,7 +113,7 @@ Before reviewing anything, answer these questions: - How will users download or install it (GitHub Releases, package manager, container registry)? If the plan defers distribution, flag it explicitly in the "NOT in scope" section — don't let it silently drop. -If the complexity check triggers (8+ files or 2+ new classes/services), STOP before any review-section work. Call AskUserQuestion: name what's overbuilt, propose a minimal version that achieves the core goal, ask whether to reduce or proceed as-is. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +If the complexity check triggers (8+ files or 2+ new classes/services), STOP before any review-section work. Call AskUserQuestion: name what's overbuilt, propose a minimal version that achieves the core goal, ask whether to reduce or proceed as-is. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to Section 1 (Architecture review), edit the plan file with a proposed scope reduction, or call ExitPlanMode until the user responds. Naming the 80% solution in chat prose and continuing — or loading the AskUserQuestion schema via ToolSearch and then never invoking it — is the failure mode this gate exists to prevent. @@ -142,7 +142,7 @@ Evaluate: * For each new codepath or integration point, describe one realistic production failure scenario and whether the plan accounts for it. * **Distribution architecture:** If this introduces a new artifact (binary, package, container), how does it get built, published, and updated? Is the CI/CD pipeline part of the plan or deferred? -For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. @@ -157,7 +157,7 @@ Evaluate: * Areas that are over-engineered or under-engineered relative to my preferences. * Existing ASCII diagrams in touched files — are they still accurate after this change? -For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. @@ -167,7 +167,7 @@ For each issue found in this section, call AskUserQuestion individually. One iss For LLM/prompt changes: check the "Prompt/LLM changes" file patterns listed in CLAUDE.md. If this plan touches ANY of those patterns, state which eval suites must be run, which cases should be added, and what baselines to compare against. Then use AskUserQuestion to confirm the eval scope with the user. -For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. @@ -178,7 +178,7 @@ Evaluate: * Caching opportunities. * Slow or high-complexity code paths. -For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. If no AskUserQuestion variant is callable in this session, follow the preamble's "Tool resolution" fallback: in plan mode, write `## Decisions to confirm` into the plan file and ExitPlanMode; outside plan mode, output the decision brief as prose and stop. Never silently auto-decide. +For each issue found in this section, call AskUserQuestion individually. One issue per call. Present options, state your recommendation, explain WHY. Do NOT batch multiple issues into one AskUserQuestion. Use the preamble's AskUserQuestion Format section. The AskUserQuestion call is a tool_use, not prose — call the tool directly. **STOP.** Do NOT proceed to the next review section, edit the plan file with the proposed fix, or call ExitPlanMode until the user responds. An issue with an "obvious fix" is still an issue and still needs explicit user approval before it lands in the plan. Loading the AskUserQuestion schema via ToolSearch and then writing the recommendation as chat prose is the failure mode this gate exists to prevent. @@ -201,7 +201,7 @@ Follow the AskUserQuestion format from the Preamble above. Additional rules for * **Map the reasoning to my engineering preferences above.** One sentence connecting your recommendation to a specific preference (DRY, explicit > clever, minimal diff, etc.). * Label with issue NUMBER + option LETTER (e.g., "3A", "3B"). * **Coverage vs kind:** for every per-issue AskUserQuestion you raise in this review, decide whether the options differ in coverage or in kind. If coverage (e.g., more tests vs fewer, complete error handling vs happy-path-only, full edge-case coverage vs shortcut), include `Completeness: N/10` on each option. If kind (e.g., architectural choice between two different systems, posture-over-posture, A/B/C where each is a different kind of thing), skip the score and add one line: `Note: options differ in kind, not coverage — no completeness score.` Do NOT fabricate scores on kind-differentiated questions — filler scores are worse than no score. -* **Escape hatch (tightened):** If a section has zero findings, state "No issues, moving on" and proceed. If it has findings, use AskUserQuestion for each — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. Only skip AskUserQuestion when the decision is genuinely trivial (e.g., a typo fix) AND there are no meaningful alternatives. When in doubt, ask. +* **Zero findings:** if a section has zero findings, state "No issues, moving on" and proceed. Otherwise, use AskUserQuestion for each finding — a finding with an "obvious fix" is still a finding and still needs user approval before any change lands in the plan. ## Required outputs diff --git a/plan-tune/SKILL.md b/plan-tune/SKILL.md index 21572ec7..3a7f6949 100644 --- a/plan-tune/SKILL.md +++ b/plan-tune/SKILL.md @@ -121,7 +121,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -292,7 +292,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/qa-only/SKILL.md b/qa-only/SKILL.md index ee48e5fa..42deeeaa 100644 --- a/qa-only/SKILL.md +++ b/qa-only/SKILL.md @@ -109,7 +109,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -280,7 +280,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/qa/SKILL.md b/qa/SKILL.md index f853efd8..6e8fb2c0 100644 --- a/qa/SKILL.md +++ b/qa/SKILL.md @@ -115,7 +115,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -286,7 +286,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/retro/SKILL.md b/retro/SKILL.md index abdf3397..eb2e3fd0 100644 --- a/retro/SKILL.md +++ b/retro/SKILL.md @@ -127,7 +127,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -298,7 +298,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/review/SKILL.md b/review/SKILL.md index e29e4ad9..16b2ea4f 100644 --- a/review/SKILL.md +++ b/review/SKILL.md @@ -112,7 +112,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -283,7 +283,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/scrape/SKILL.md b/scrape/SKILL.md index e6605326..80b7247c 100644 --- a/scrape/SKILL.md +++ b/scrape/SKILL.md @@ -108,7 +108,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -279,7 +279,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/scripts/resolvers/preamble/generate-ask-user-format.ts b/scripts/resolvers/preamble/generate-ask-user-format.ts index 6fa3055d..859ed5b0 100644 --- a/scripts/resolvers/preamble/generate-ask-user-format.ts +++ b/scripts/resolvers/preamble/generate-ask-user-format.ts @@ -9,7 +9,7 @@ export function generateAskUserFormat(_ctx: TemplateContext): string { **Rule:** if any \`mcp__*__AskUserQuestion\` variant is in your tool list, prefer it. Hosts may disable native AUQ via \`--disallowedTools AskUserQuestion\` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a \`## Decisions to confirm\` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only \`/plan-tune\` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report \`BLOCKED — AskUserQuestion unavailable\`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only \`/plan-tune\` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/scripts/resolvers/preamble/generate-completion-status.ts b/scripts/resolvers/preamble/generate-completion-status.ts index 12909922..21c0bd5e 100644 --- a/scripts/resolvers/preamble/generate-completion-status.ts +++ b/scripts/resolvers/preamble/generate-completion-status.ts @@ -26,7 +26,7 @@ In plan mode, allowed because they inform the plan: \`$B\`, \`$D\`, \`codex exec ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — \`mcp__*__AskUserQuestion\` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a \`## Decisions to confirm\` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.`; +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — \`mcp__*__AskUserQuestion\` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report \`BLOCKED — AskUserQuestion unavailable\` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode.`; } export function generateCompletionStatus(ctx: TemplateContext): string { diff --git a/setup-browser-cookies/SKILL.md b/setup-browser-cookies/SKILL.md index 15996523..4e46c0b2 100644 --- a/setup-browser-cookies/SKILL.md +++ b/setup-browser-cookies/SKILL.md @@ -105,7 +105,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" diff --git a/setup-deploy/SKILL.md b/setup-deploy/SKILL.md index a77246eb..1f4c3acc 100644 --- a/setup-deploy/SKILL.md +++ b/setup-deploy/SKILL.md @@ -111,7 +111,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -282,7 +282,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/setup-gbrain/SKILL.md b/setup-gbrain/SKILL.md index 9f8f379d..40c59a1a 100644 --- a/setup-gbrain/SKILL.md +++ b/setup-gbrain/SKILL.md @@ -112,7 +112,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -283,7 +283,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/ship/SKILL.md b/ship/SKILL.md index 27b785e5..adf4e093 100644 --- a/ship/SKILL.md +++ b/ship/SKILL.md @@ -113,7 +113,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -284,7 +284,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/skillify/SKILL.md b/skillify/SKILL.md index 293d5ed4..530a9039 100644 --- a/skillify/SKILL.md +++ b/skillify/SKILL.md @@ -109,7 +109,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -280,7 +280,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/sync-gbrain/SKILL.md b/sync-gbrain/SKILL.md index d87e275e..6f957a61 100644 --- a/sync-gbrain/SKILL.md +++ b/sync-gbrain/SKILL.md @@ -112,7 +112,7 @@ In plan mode, allowed because they inform the plan: `$B`, `$D`, `codex exec`/`co ## Skill Invocation During Plan Mode -If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, fall back to writing the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode — never silently auto-decide. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. +If the user invokes a skill in plan mode, the skill takes precedence over generic plan mode behavior. **Treat the skill file as executable instructions, not reference.** Follow it step by step starting from Step 0; the first AskUserQuestion is the workflow entering plan mode, not a violation of it. AskUserQuestion (any variant — `mcp__*__AskUserQuestion` or native; see "AskUserQuestion Format → Tool resolution") satisfies plan mode's end-of-turn requirement. If no variant is callable, the skill is BLOCKED — stop and report `BLOCKED — AskUserQuestion unavailable` per the AskUserQuestion Format rule. At a STOP point, stop immediately. Do not continue the workflow or call ExitPlanMode there. Commands marked "PLAN MODE EXCEPTION — ALWAYS RUN" execute. Call ExitPlanMode only after the skill workflow completes, or if the user tells you to cancel the skill or leave plan mode. If `PROACTIVE` is `"false"`, do not auto-invoke or proactively suggest skills. If a skill seems useful, ask: "I think /skillname might help here — want me to run it?" @@ -283,7 +283,7 @@ AI orchestrator (e.g., OpenClaw). In spawned sessions: **Rule:** if any `mcp__*__AskUserQuestion` variant is in your tool list, prefer it. Hosts may disable native AUQ via `--disallowedTools AskUserQuestion` (Conductor does, by default) and route through their MCP variant; calling native there silently fails. Same questions/options shape; same decision-brief format applies. -**Fallback when neither variant is callable:** in plan mode, write the decision brief into the plan file as a `## Decisions to confirm` section + ExitPlanMode (the native "Ready to execute?" surfaces it). Outside plan mode, output the brief as prose and stop. **Never silently auto-decide** — only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking. +**If no AskUserQuestion variant appears in your tool list, this skill is BLOCKED.** Stop, report `BLOCKED — AskUserQuestion unavailable`, and wait for the user. Do not write decisions to the plan file as a substitute, do not emit them as prose and stop, and do not silently auto-decide (only `/plan-tune` AUTO_DECIDE opt-ins authorize auto-picking). ### Format diff --git a/test/fixtures/forcing-finding-seeds.ts b/test/fixtures/forcing-finding-seeds.ts index fec88793..8d52c858 100644 --- a/test/fixtures/forcing-finding-seeds.ts +++ b/test/fixtures/forcing-finding-seeds.ts @@ -81,3 +81,42 @@ export const FORCING_FLOOR_DEVEX = [ '', 'No quickstart command, no hosted sandbox, no copy-pasteable curl example.', ].join('\n'); + +/** + * Multi-finding batching regression seed (periodic tier). + * + * Mirrors the May 2026 transcript bug shape: 4 distinct non-trivial findings + * spread across plan-eng-review's standard sections (Architecture, Code + * Quality, Tests, Performance). Each finding is independent — there is no + * legitimate reason to batch them into a single AskUserQuestion. + * + * Used by test/skill-e2e-plan-eng-multi-finding-batching.test.ts to assert + * the agent fires >= 3 review-phase AUQs (i.e., does NOT batch them into a + * "## Decisions to confirm" section + ExitPlanMode). Floor of 3 (not 4) is + * the [N-1] tolerance from the existing finding-count band convention. + */ +export const FORCING_BATCHING_ENG = [ + 'Please review this plan thoroughly. As you go, write your plan-mode plan to /tmp/gstack-test-plan-eng-batching.md (use Edit/Write to that exact path).', + '', + '# Plan: Add background job retry framework', + '', + '## Architecture', + "We'll roll a custom exponential-backoff scheduler inline in each worker", + "rather than use the existing job library's built-in retry hooks. Same", + 'shape as the library version, but we want full control over the curve.', + '', + '## Code quality', + 'The retry envelope (compute delay, log attempt, dispatch) is duplicated', + 'across 5 worker files with copy-pasted bodies. We will leave the', + 'duplication for now and refactor "later."', + '', + '## Tests', + 'The existing `processWebhookJob()` flow gets rewritten as part of this', + 'change. No regression test for the prior at-most-once delivery guarantee', + 'is planned.', + '', + '## Performance', + 'On every retry we re-fetch the full job payload from the database, then', + 'iterate the payload to recompute the dependency graph. Could cache the', + 'graph on the first attempt; not planned.', +].join('\n'); diff --git a/test/helpers/claude-pty-runner.ts b/test/helpers/claude-pty-runner.ts index 2fb7b484..f06ba05f 100644 --- a/test/helpers/claude-pty-runner.ts +++ b/test/helpers/claude-pty-runner.ts @@ -297,6 +297,250 @@ export function isNumberedOptionListVisible(visible: string): boolean { return /❯\s*1\./.test(visible) && /(^|[^0-9])2\./.test(visible); } +// ──────────────────────────────────────────────────────────────────────────── +// LLM judge — "is the model waiting for user input, working, or hung?" +// +// Regex detectors (isNumberedOptionListVisible, isProseAUQVisible) are fast +// and deterministic but brittle to PTY rendering quirks (cursor-positioning +// escapes that collapse multi-line option lists onto a single logical line). +// When they miss, the polling loop times out at the full budget — even +// though the model is correctly surfacing a question via a format the regex +// can't reassemble. +// +// This LLM judge takes a TTY snapshot and answers a trichotomy: +// - 'waiting' — agent surfaced a question/options, sitting at input prompt +// - 'working' — agent is still generating (spinner, tool calls, "Musing") +// - 'hung' — agent stopped without surfacing anything (rare) +// +// Used by polling loops as a fallback after N seconds with no terminal +// classification. On 'waiting' verdict, return outcome='asked' early. +// +// Cost: ~$0.0005 per call using claude haiku 4.5. Cached by snapshot hash so +// identical TTY frames don't re-charge. All verdicts logged to +// ~/.gstack/analytics/pty-judge.jsonl for offline analysis. +// ──────────────────────────────────────────────────────────────────────────── + +import { spawnSync as nodeSpawnSync } from 'node:child_process'; +import { createHash } from 'node:crypto'; + +export interface PtyStateVerdict { + state: 'waiting' | 'working' | 'hung' | 'unknown'; + reasoning: string; + /** SHA-1 of the normalized snapshot input (for caching/dedup). */ + hash: string; + /** Wall time (ms) the judge call took. */ + elapsedMs: number; +} + +const PTY_VERDICT_CACHE = new Map(); + +/** + * Persist a verdict (or snapshot dump) to the analytics JSONL log. + * Best-effort — failures (disk full, permission denied, etc.) are swallowed + * so the harness never fails on logging. + */ +function logPtyJudge(record: Record): void { + try { + const dir = `${process.env.HOME}/.gstack/analytics`; + fs.mkdirSync(dir, { recursive: true }); + fs.appendFileSync(`${dir}/pty-judge.jsonl`, JSON.stringify(record) + '\n'); + } catch { + /* best-effort */ + } +} + +/** + * Snapshot dump for postmortem debugging when GSTACK_PTY_LOG=1. + * Writes the last 4KB of visible TTY plus context to + * ~/.gstack/analytics/pty-snapshots/-ms.txt. + */ +export function logPtySnapshot(visible: string, ctx: { testName: string; elapsedMs: number; tag?: string }): void { + if (process.env.GSTACK_PTY_LOG !== '1') return; + try { + const dir = `${process.env.HOME}/.gstack/analytics/pty-snapshots`; + fs.mkdirSync(dir, { recursive: true }); + const tag = ctx.tag ? `-${ctx.tag}` : ''; + const file = `${dir}/${ctx.testName}-${ctx.elapsedMs}ms${tag}.txt`; + fs.writeFileSync( + file, + `# testName: ${ctx.testName}\n# elapsedMs: ${ctx.elapsedMs}\n# tag: ${ctx.tag ?? ''}\n# visible.length: ${visible.length}\n\n${visible.slice(-4096)}`, + ); + } catch { + /* best-effort */ + } +} + +/** + * Ask Claude Haiku 4.5 to classify a TTY snapshot as waiting/working/hung. + * + * Implementation: spawns `claude -p --model claude-haiku-4-5` synchronously + * with the prompt piped via stdin. Uses subscription auth (no API key env + * required). 30-second timeout; returns 'unknown' on any failure mode + * (timeout, malformed JSON, missing claude binary). + * + * Cache: identical snapshot hashes return the cached verdict without + * re-calling. Cache lives in-process; resets between test runs. + */ +export function judgePtyState( + visible: string, + ctx?: { testName?: string }, +): PtyStateVerdict { + // Normalize: strip trailing whitespace lines + take last 4KB. Hash the + // normalized form so spinner-frame-only diffs (which all look "working") + // don't bust the cache and rack up cost. + const tail = visible.slice(-4096).replace(/[ \t]+$/gm, ''); + const hash = createHash('sha1').update(tail).digest('hex').slice(0, 16); + + const cached = PTY_VERDICT_CACHE.get(hash); + if (cached) return cached; + + const judgeStart = Date.now(); + const prompt = `You are reading a snapshot of a terminal where Claude Code is running in plan mode for an automated test. Your job: classify the agent's current state. + +Pick exactly ONE: +- WAITING — agent surfaced a question or option list and is sitting at the input prompt waiting for user reply. Signs: numbered/lettered options visible (1./2./3. or A)/B)/C)), "Recommendation:" line, cursor at empty input prompt with no recent generation activity. +- WORKING — agent is actively generating or running tools. Signs: spinner glyphs (✻ ✶ ✳ ✢ ✽), "Musing..." or "Churned for ..." text, recent tool-call blocks (Read/Edit/Bash/Grep), in-flight token output. +- HUNG — agent has stopped without surfacing a question and without any spinner/work activity. Rare; usually means a crash. + +Respond with strict JSON ONLY (no markdown fences, no prose): +{"state":"waiting","reasoning":"one short sentence"} + +Terminal snapshot (last 4KB): +\`\`\` +${tail} +\`\`\``; + + let verdict: PtyStateVerdict = { + state: 'unknown', + reasoning: 'judge call did not complete', + hash, + elapsedMs: 0, + }; + + try { + const result = nodeSpawnSync( + 'claude', + ['-p', '--model', 'claude-haiku-4-5', '--max-turns', '1'], + { + input: prompt, + stdio: ['pipe', 'pipe', 'pipe'], + timeout: 30_000, + encoding: 'utf-8', + }, + ); + const elapsedMs = Date.now() - judgeStart; + if (result.status === 0 && result.stdout) { + // Pull the first {...} JSON object out of stdout. Haiku occasionally + // wraps in ```json ...``` despite the prompt; tolerate that. + const match = result.stdout.match(/\{[\s\S]*?"state"[\s\S]*?\}/); + if (match) { + try { + const parsed = JSON.parse(match[0]); + const state = ['waiting', 'working', 'hung'].includes(parsed.state) + ? (parsed.state as 'waiting' | 'working' | 'hung') + : 'unknown'; + verdict = { + state, + reasoning: typeof parsed.reasoning === 'string' ? parsed.reasoning.slice(0, 200) : '', + hash, + elapsedMs, + }; + } catch { + verdict = { state: 'unknown', reasoning: 'malformed JSON', hash, elapsedMs }; + } + } else { + verdict = { state: 'unknown', reasoning: 'no JSON in response', hash, elapsedMs }; + } + } else { + verdict = { + state: 'unknown', + reasoning: `claude exited ${result.status} (${(result.stderr ?? '').slice(0, 80)})`, + hash, + elapsedMs, + }; + } + } catch (err) { + verdict = { + state: 'unknown', + reasoning: `judge spawn failed: ${(err as Error).message}`.slice(0, 200), + hash, + elapsedMs: Date.now() - judgeStart, + }; + } + + PTY_VERDICT_CACHE.set(hash, verdict); + logPtyJudge({ + ts: new Date().toISOString(), + testName: ctx?.testName ?? 'unknown', + state: verdict.state, + reasoning: verdict.reasoning, + hash: verdict.hash, + judgeMs: verdict.elapsedMs, + }); + return verdict; +} + +/** + * Detect a prose-rendered AskUserQuestion in plan mode. + * + * Plan-mode AUQs sometimes render as visible model output rather than via + * the native numbered-prompt UI — e.g., when --disallowedTools AskUserQuestion + * is set and no MCP variant is callable, the model surfaces the question as + * lettered or numbered options in plain text. isNumberedOptionListVisible + * doesn't catch these because the `❯` cursor sits on the empty input prompt, + * not on option 1. + * + * Detection patterns: + * - 2+ distinct lettered options (A) B) C) D)) at line starts — typical + * for plan-eng / plan-design / plan-devex prose AUQ + * - 3+ distinct numbered options (1. 2. 3.) at line starts WITHOUT a + * `❯1.` cursor — typical for autoplan / office-hours prose AUQ + * + * Used by classifyVisible and runPlanSkillFloorCheck to return outcome='asked' + * (or auq_observed) instead of letting the harness time out when the model + * is correctly surfacing the question and waiting for user input via prose. + * + * The 4KB tail window avoids matching stale options from earlier prompts in + * scrollback. Permission dialogs are filtered out by the caller (see + * isPermissionDialogVisible callers in classifyVisible). + */ +export function isProseAUQVisible(visible: string): boolean { + const tail = visible.length > 4096 ? visible.slice(-4096) : visible; + + // Pattern 1: 2+ distinct lettered options at line starts. Allow leading + // whitespace or `❯` cursor before the marker. PTY may collapse multiple + // option lines onto one logical line via stripped cursor-positioning + // escapes, but the NEWLINE before each option survives. + const letteredRe = /(?:^|\n)[ \t❯]*([A-D])\)/g; + const letteredHits = new Set(); + let lm: RegExpExecArray | null; + while ((lm = letteredRe.exec(tail)) !== null) { + if (lm[1]) letteredHits.add(lm[1]); + } + if (letteredHits.size >= 2) return true; + + // Pattern 2: 2+ distinct numbered options at line starts, AND no + // `❯1.` cursor IN THE RECENT TAIL (not the full buffer — a + // trust-dialog `❯ 1. Yes` at boot is in scrollback forever and + // would otherwise suppress this path for the rest of the run). + // The native-UI deferral only applies when the cursor list is + // currently rendered, not historically. + // + // Threshold 2 (matching the lettered branch): the tail is a 4KB window, + // and by the time the polling loop sees it, the model may have emitted + // option 1 several KB earlier and only 2/3/4 remain in tail. False + // positives on prose ("First, x. Second, y.") are extremely rare given + // the line-start anchor + the no-cursor gate. + if (/❯\s*1\./.test(tail)) return false; + const numberedRe = /(?:^|\n)[ \t❯]*([1-9])\./g; + const numberedHits = new Set(); + let nm: RegExpExecArray | null; + while ((nm = numberedRe.exec(tail)) !== null) { + if (nm[1]) numberedHits.add(nm[1]); + } + return numberedHits.size >= 2; +} + /** * Parse a rendered numbered-option list out of the visible TTY text. * @@ -570,6 +814,21 @@ export function classifyVisible( summary: 'skill fired a numbered-option prompt (AskUserQuestion or routing-injection)', }; } + // Prose-rendered AUQ: model surfaced the question as lettered or numbered + // options in plain text (typical under --disallowedTools AskUserQuestion + // when no MCP variant is callable). The model is waiting for user input + // via the plan-mode input prompt rather than via the AUQ tool UI; this + // is still a legitimate "asked" surface — semantically equivalent to a + // tool-call AUQ from the test's perspective. + if (isProseAUQVisible(visible)) { + if (isPermissionDialogVisible(visible.slice(-TAIL_SCAN_BYTES))) { + return null; + } + return { + outcome: 'asked', + summary: 'skill rendered a prose-style AskUserQuestion (model waiting for user input)', + }; + } return null; } @@ -784,9 +1043,22 @@ export function assertReviewReportAtBottom( * `'wrote_findings_before_asking'` when a plan was already written. */ export function assertReportAtBottomIfPlanWritten( - obs: { planFile?: string; evidence: string }, + obs: { planFile?: string; evidence: string; outcome?: string }, ): void { if (!obs.planFile) return; + // Skip when the plan file path was detected from TTY output but no file + // exists on disk. This happens when the model mentions a path mid-stream + // (e.g., as a tool-call argument that was interrupted, or in a draft that + // was never persisted). The report-at-bottom contract is for fully-written + // plan files; ENOENT means there's no file content to enforce against. + if (!fs.existsSync(obs.planFile)) return; + // Skip on 'asked' outcomes — these are smoke tests that exited at the + // first AUQ render (Step 0 only). The model never reached the workflow's + // report-writing step, so a partial plan file without the report section + // is the expected mid-flight state, not a contract violation. The + // report-at-bottom check applies to outcomes that imply the workflow + // ran end-to-end (plan_ready, completion_summary, etc.). + if (obs.outcome === 'asked') return; const content = fs.readFileSync(obs.planFile, 'utf-8'); const verdict = assertReviewReportAtBottom(content); if (!verdict.ok) { @@ -1130,6 +1402,27 @@ export interface PlanSkillObservation { * the section, and that's the regression we want to catch. */ planFile?: string; + /** + * High-water-mark flag: did the polling loop ever observe a + * prose-rendered AskUserQuestion (lettered or numbered options visible) + * during the run? Set true the first poll iteration that + * isProseAUQVisible returns true on the recent buffer; remains true + * for the rest of the observation. + * + * The 2KB `evidence` window often misses the prose-AUQ moment because + * by the time outcome=plan_ready fires, the ExitPlanMode "Ready to + * execute" UI has pushed the options out of the tail. Tests that need + * to assert "the user saw the question at SOME point" should check + * this flag rather than re-running isProseAUQVisible on the truncated + * evidence. + */ + proseAUQEverObserved?: boolean; + /** + * High-water-mark flag: did the LLM judge ever return state='waiting' + * during the run? Same shape as proseAUQEverObserved but driven by the + * Haiku judge fallback rather than the regex detector. + */ + waitingEverObserved?: boolean; } /** @@ -1220,6 +1513,17 @@ export async function runPlanSkillObservation(opts: { const budgetMs = opts.timeoutMs ?? 180_000; const start = Date.now(); + let lastJudgeAt = 0; + let lastJudgeVerdict: PtyStateVerdict | null = null; + // High-water marks: did we EVER see a prose-AUQ surface or a judge + // 'waiting' verdict during the run? Models may surface options + // briefly, then resume thinking when no user response comes (test + // env has no responder). At timeout we trust historical signals + // even if the current state is 'working'. + let proseAUQEverObserved = false; + let waitingEverObserved = false; + const JUDGE_AFTER_MS = 60_000; + const JUDGE_INTERVAL_MS = 30_000; while (Date.now() - start < budgetMs) { await Bun.sleep(2000); const visible = session.visibleSince(since); @@ -1240,6 +1544,18 @@ export async function runPlanSkillObservation(opts: { elapsedMs: Date.now() - startedAt, }; } + + // Cheap surface-tracking: did the model ever surface a prose AUQ in + // this tick's recent buffer? Track once-true (high water). + if (!proseAUQEverObserved && isProseAUQVisible(visible)) { + proseAUQEverObserved = true; + logPtySnapshot(visible, { + testName: opts.skillName, + elapsedMs: Date.now() - start, + tag: 'prose-auq-surfaced', + }); + } + const classified = classifyVisible(visible, { strictPlanWrites: !!opts.initialPlanContent, }); @@ -1248,6 +1564,8 @@ export async function runPlanSkillObservation(opts: { ...classified, evidence: visible.slice(-2000), elapsedMs: Date.now() - startedAt, + proseAUQEverObserved, + waitingEverObserved, }; // Capture the plan file path on any outcome where one may have been // written. Gating only on 'plan_ready' missed two cases: (1) the @@ -1260,13 +1578,60 @@ export async function runPlanSkillObservation(opts: { if (planFile) obs.planFile = planFile; return obs; } + + // LLM judge fallback: if regex detectors didn't classify and we've + // burned >60s with periodic ticks, ask Haiku "is the model waiting, + // working, or hung?" Treat 'waiting' as 'asked' (model surfaced a + // question via prose the regex couldn't reassemble). Snapshot the + // visible buffer at each judge call when GSTACK_PTY_LOG=1. + const elapsed = Date.now() - start; + if (elapsed > JUDGE_AFTER_MS && Date.now() - lastJudgeAt > JUDGE_INTERVAL_MS) { + lastJudgeAt = Date.now(); + logPtySnapshot(visible, { testName: opts.skillName, elapsedMs: elapsed, tag: 'judge-tick' }); + lastJudgeVerdict = judgePtyState(visible, { testName: opts.skillName }); + if (lastJudgeVerdict.state === 'waiting') { + waitingEverObserved = true; + return { + outcome: 'asked', + summary: `LLM judge: ${lastJudgeVerdict.reasoning} (state=waiting after ${Math.round(elapsed / 1000)}s)`, + evidence: visible.slice(-2000), + elapsedMs: Date.now() - startedAt, + }; + } + } } + // Timeout fallback: if we observed a prose-AUQ surface OR a judge + // 'waiting' verdict at any point during the run, treat as 'asked'. + // This catches the model-surfaced-then-resumed-thinking case where + // by the time the timeout fires, the buffer has moved past the + // options into spinner state but the question DID surface earlier. + const finalVisible = session.visibleSince(since); + if (proseAUQEverObserved || waitingEverObserved) { + return { + outcome: 'asked', + summary: + `prose-AUQ surface observed during run (proseAUQEverObserved=${proseAUQEverObserved}, waitingEverObserved=${waitingEverObserved}); model surfaced the question and the test budget elapsed without a follow-up classification` + + (lastJudgeVerdict + ? ` (last LLM judge: ${lastJudgeVerdict.state} — ${lastJudgeVerdict.reasoning})` + : ''), + evidence: finalVisible.slice(-2000), + elapsedMs: Date.now() - startedAt, + proseAUQEverObserved, + waitingEverObserved, + }; + } return { outcome: 'timeout', - summary: `no terminal outcome within ${budgetMs}ms`, - evidence: session.visibleSince(since).slice(-2000), + summary: + `no terminal outcome within ${budgetMs}ms` + + (lastJudgeVerdict + ? ` (last LLM judge: state=${lastJudgeVerdict.state} — ${lastJudgeVerdict.reasoning})` + : ''), + evidence: finalVisible.slice(-2000), elapsedMs: Date.now() - startedAt, + proseAUQEverObserved, + waitingEverObserved, }; } finally { await session.close(); @@ -1629,6 +1994,10 @@ export async function runPlanSkillFloorCheck(opts: { session.send(`${opts.followUpPrompt}\r`); const start = Date.now(); + let lastJudgeAt = 0; + let lastJudgeVerdict: PtyStateVerdict | null = null; + const JUDGE_AFTER_MS = 60_000; + const JUDGE_INTERVAL_MS = 30_000; while (Date.now() - start < timeoutMs) { await Bun.sleep(2000); const visible = session.visibleSince(since); @@ -1652,12 +2021,15 @@ export async function runPlanSkillFloorCheck(opts: { }; } - // Success: ANY non-permission numbered-option list is an AUQ render. - // The bug we're catching is "fired zero AUQs," so observing one is - // sufficient — we don't need to fingerprint or navigate past it. + // Success: ANY non-permission numbered-option list is an AUQ render — + // either via the native numbered-prompt UI (isNumberedOptionListVisible) + // OR via prose-rendered options under --disallowedTools when no MCP + // variant is callable (isProseAUQVisible). Both surface the question + // to the user; the bug we're catching is "fired zero AUQs." + const tail = visible.slice(-TAIL_SCAN_BYTES); if ( - isNumberedOptionListVisible(visible) && - !isPermissionDialogVisible(visible.slice(-TAIL_SCAN_BYTES)) + (isNumberedOptionListVisible(visible) || isProseAUQVisible(visible)) && + !isPermissionDialogVisible(tail) ) { return { auqObserved: true, @@ -1668,6 +2040,28 @@ export async function runPlanSkillFloorCheck(opts: { }; } + // LLM judge fallback: same shape as runPlanSkillObservation. After 60s + // of polling without a regex hit, ask Haiku to classify the snapshot. + // 'waiting' verdict counts as floor met (model surfaced a question via + // prose the regex couldn't catch). 'working' / 'hung' / 'unknown' don't + // change the outcome — they enrich the eventual timeout summary so the + // failure diagnostic is more actionable than "no AUQ render." + const elapsed = Date.now() - start; + if (elapsed > JUDGE_AFTER_MS && Date.now() - lastJudgeAt > JUDGE_INTERVAL_MS) { + lastJudgeAt = Date.now(); + logPtySnapshot(visible, { testName: opts.skillName, elapsedMs: elapsed, tag: 'floor-judge-tick' }); + lastJudgeVerdict = judgePtyState(visible, { testName: opts.skillName }); + if (lastJudgeVerdict.state === 'waiting') { + return { + auqObserved: true, + outcome: 'auq_observed', + summary: `LLM judge: ${lastJudgeVerdict.reasoning} (state=waiting after ${Math.round(elapsed / 1000)}s; floor met)`, + evidence: visible.slice(-3000), + elapsedMs: Date.now() - startedAt, + }; + } + } + // Silent write outside sanctioned dirs is the transcript-bug shape. const writeRe = /⏺\s*(?:Write|Edit)\(([^)]+)\)/g; let m: RegExpExecArray | null; diff --git a/test/helpers/claude-pty-runner.unit.test.ts b/test/helpers/claude-pty-runner.unit.test.ts index dcb4f5ef..ab1a89cb 100644 --- a/test/helpers/claude-pty-runner.unit.test.ts +++ b/test/helpers/claude-pty-runner.unit.test.ts @@ -26,6 +26,7 @@ import { describe, test, expect } from 'bun:test'; import { isPermissionDialogVisible, isNumberedOptionListVisible, + isProseAUQVisible, isPlanReadyVisible, parseNumberedOptions, classifyVisible, @@ -192,6 +193,105 @@ describe('isNumberedOptionListVisible', () => { }); }); +describe('isProseAUQVisible', () => { + test('matches 4 lettered options A) B) C) D) at line starts (plan-eng prose AUQ shape)', () => { + const sample = ` +What would you like me to review? Options: +A) Point me at an existing design doc or plan file (path). +B) Describe new work you're planning — I'll explore the codebase. +C) You meant /review for the diff already on this branch. +D) Something else (tell me). +Recommendation: A if you have a doc in mind, otherwise B. +❯ +`; + expect(isProseAUQVisible(sample)).toBe(true); + }); + + test('matches 2 lettered options (minimum threshold)', () => { + const sample = ` +A) First option +B) Second option +`; + expect(isProseAUQVisible(sample)).toBe(true); + }); + + test('matches 3 numbered options 1. 2. 3. without ❯ 1. cursor (autoplan prose AUQ shape)', () => { + const sample = ` +What's the task? A few options: + 1. You have a plan idea in mind — describe it. + 2. You want to review an existing plan elsewhere. + 3. You meant a different command — /plan-ceo-review etc. +❯ +`; + expect(isProseAUQVisible(sample)).toBe(true); + }); + + test('returns false when ❯ 1. cursor is present in the recent tail (native UI handled by isNumberedOptionListVisible)', () => { + const sample = ` +❯ 1. First option + 2. Second option + 3. Third option +`; + expect(isProseAUQVisible(sample)).toBe(false); + }); + + test('does NOT suppress numbered-prose detection when ❯ 1. is only in early scrollback (trust dialog)', () => { + // Boot trust dialog rendered ❯ 1. Yes at startup, then a long body of + // model output, then prose-rendered numbered options now. The historic + // ❯ 1. is in the full buffer but NOT in the recent tail. Should detect + // the prose AUQ. + const trustHeader = '❯ 1. Yes, trust\n 2. No\n'; + const filler = 'x'.repeat(5000); // pushes trust dialog out of last 4KB tail + const proseAUQ = `\n 1. Review the docs\n 2. Investigate the code\n 3. Defer to next session\n❯ \n`; + const sample = trustHeader + filler + proseAUQ; + expect(isProseAUQVisible(sample)).toBe(true); + }); + + test('returns false on single lettered option', () => { + const sample = ` +A) Only one option mentioned in passing. +`; + expect(isProseAUQVisible(sample)).toBe(false); + }); + + test('matches 2 numbered options (threshold matches lettered branch — tails miss option 1)', () => { + const sample = ` +1. First note. +2. Second note. +`; + expect(isProseAUQVisible(sample)).toBe(true); + }); + + test('returns false on a single numbered option', () => { + const sample = ` +1. Only one option mentioned. +`; + expect(isProseAUQVisible(sample)).toBe(false); + }); + + test('does not match mid-prose lettered text like "(see option B) above"', () => { + const sample = ` +This refers to (see option B) above and also to point A) earlier. +`; + // The B) and A) markers are mid-line, not at line starts, so they don't count. + expect(isProseAUQVisible(sample)).toBe(false); + }); + + test('matches with leading whitespace and ❯ prefix on options', () => { + const sample = ` + A) Option with whitespace prefix +❯ B) Option with cursor prefix + C) Another option +`; + expect(isProseAUQVisible(sample)).toBe(true); + }); + + test('returns false on plain text with no option markers', () => { + expect(isProseAUQVisible('Just some plain text output from the model.')).toBe(false); + expect(isProseAUQVisible('')).toBe(false); + }); +}); + describe('classifyVisible (runtime path through the runner classifier)', () => { // These tests call the actual classifier so a future contributor who // reorders branches (e.g. moves the permission short-circuit before diff --git a/test/helpers/touchfiles.ts b/test/helpers/touchfiles.ts index 18c25e0b..abd60c13 100644 --- a/test/helpers/touchfiles.ts +++ b/test/helpers/touchfiles.ts @@ -103,7 +103,6 @@ export const E2E_TOUCHFILES: Record = { // INSIDE the existing 4 plan-X-review-plan-mode test files (covered // transitively by the entries above). Two new standalone files exist for // skills with no prior plan-mode test: - 'autoplan-auto-mode': ['autoplan/**', 'plan-ceo-review/**', 'plan-design-review/**', 'plan-eng-review/**', 'plan-devex-review/**', 'scripts/resolvers/preamble/generate-completion-status.ts', 'scripts/resolvers/question-tuning.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'scripts/resolvers/preamble.ts', 'test/helpers/claude-pty-runner.ts'], 'office-hours-auto-mode': ['office-hours/**', 'scripts/resolvers/preamble/generate-completion-status.ts', 'scripts/resolvers/question-tuning.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'scripts/resolvers/preamble.ts', 'test/helpers/claude-pty-runner.ts'], 'office-hours-phase4-fork': ['office-hours/**', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'scripts/resolvers/preamble/generate-completion-status.ts', 'scripts/resolvers/preamble.ts', 'scripts/resolvers/question-tuning.ts', 'test/helpers/llm-judge.ts', 'test/skill-e2e-office-hours-phase4.test.ts'], 'llm-judge-recommendation': ['test/helpers/llm-judge.ts', 'test/llm-judge-recommendation.test.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'codex/SKILL.md.tmpl', 'scripts/resolvers/review.ts'], @@ -143,6 +142,13 @@ export const E2E_TOUCHFILES: Record = { 'plan-ceo-finding-floor': ['plan-ceo-review/**', 'scripts/resolvers/preamble.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'scripts/resolvers/preamble/generate-completion-status.ts', 'scripts/resolvers/review.ts', 'test/helpers/claude-pty-runner.ts', 'test/fixtures/forcing-finding-seeds.ts', 'test/skill-e2e-plan-ceo-finding-floor.test.ts'], 'plan-design-finding-floor': ['plan-design-review/**', 'scripts/resolvers/preamble.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'scripts/resolvers/preamble/generate-completion-status.ts', 'scripts/resolvers/review.ts', 'test/helpers/claude-pty-runner.ts', 'test/fixtures/forcing-finding-seeds.ts', 'test/skill-e2e-plan-design-finding-floor.test.ts'], 'plan-devex-finding-floor': ['plan-devex-review/**', 'scripts/resolvers/preamble.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'scripts/resolvers/preamble/generate-completion-status.ts', 'scripts/resolvers/review.ts', 'test/helpers/claude-pty-runner.ts', 'test/fixtures/forcing-finding-seeds.ts', 'test/skill-e2e-plan-devex-finding-floor.test.ts'], + + // Multi-finding batching regression — periodic tier complement to the + // gate-tier finding-floor. Catches the May 2026 transcript shape where + // a model fires one AUQ then batches the rest into a "## Decisions to + // confirm" plan write. runPlanSkillFloorCheck cannot detect that shape + // (it exits on first AUQ); runPlanSkillCounting can. + 'plan-eng-multi-finding-batching': ['plan-eng-review/**', 'scripts/resolvers/preamble.ts', 'scripts/resolvers/preamble/generate-ask-user-format.ts', 'scripts/resolvers/preamble/generate-completion-status.ts', 'scripts/resolvers/review.ts', 'test/helpers/claude-pty-runner.ts', 'test/fixtures/forcing-finding-seeds.ts', 'test/skill-e2e-plan-eng-multi-finding-batching.test.ts'], 'brain-privacy-gate': ['scripts/resolvers/preamble/generate-brain-sync-block.ts', 'scripts/resolvers/preamble.ts', 'bin/gstack-brain-sync', 'bin/gstack-artifacts-init', 'bin/gstack-config', 'test/helpers/agent-sdk-runner.ts'], // /setup-gbrain Path 4 (Remote MCP) — happy + bad-token end-to-end via @@ -416,7 +422,6 @@ export const E2E_TIERS: Record = { 'plan-devex-review-plan-mode': 'gate', 'plan-mode-no-op': 'gate', // v1.21+ auto-mode regression tests - 'autoplan-auto-mode': 'gate', 'office-hours-auto-mode': 'gate', 'auto-decide-preserved': 'periodic', 'e2e-harness-audit': 'gate', @@ -443,6 +448,7 @@ export const E2E_TIERS: Record = { 'plan-ceo-finding-floor': 'gate', 'plan-design-finding-floor': 'gate', 'plan-devex-finding-floor': 'gate', + 'plan-eng-multi-finding-batching': 'periodic', // Privacy gate for gstack-brain-sync — periodic (non-deterministic LLM call, // costs ~$0.30-$0.50 per run, not needed on every commit) diff --git a/test/skill-e2e-autoplan-auto-mode.test.ts b/test/skill-e2e-autoplan-auto-mode.test.ts deleted file mode 100644 index 0677917b..00000000 --- a/test/skill-e2e-autoplan-auto-mode.test.ts +++ /dev/null @@ -1,77 +0,0 @@ -/** - * autoplan AskUserQuestion-blocked regression (gate, paid, real-PTY). - * - * v1.21+ regression: Conductor launches Claude Code with - * `--disallowedTools AskUserQuestion --permission-mode default` (verified - * by inspecting the parent claude process via `ps`). The native - * AskUserQuestion tool is removed from the model's tool registry; without - * fallback guidance the model can't ask the user and silently proceeds. - * - * Autoplan auto-decides INTERMEDIATE questions BY DESIGN - * (autoplan/SKILL.md.tmpl:45), but Phase 1's premise confirmation gate is - * one of the few non-auto-decided AskUserQuestions and MUST surface to the - * user. This test asserts that gate still surfaces when AskUserQuestion is - * disallowed at the tool-registry level — the fix must route the question - * through a Conductor-side variant (mcp__conductor__AskUserQuestion) or - * through the plan-file + ExitPlanMode flow. - * - * Filename keeps `auto-mode` for branch-history continuity. Auto-mode (the - * AUTO_DECIDE preamble path when QUESTION_TUNING=true) is a related but - * distinct silencing mechanism; both share the same fix surface. - * - * Note on report-at-bottom contract: the GSTACK REVIEW REPORT delete-then- - * append flow lives in `scripts/resolvers/review.ts` and is exercised when - * reviews actually run. The PTY harness can't drive autoplan through its - * review phases without auto-progression of AUQs (see runPlanSkillCounting), - * and `--disallowedTools AskUserQuestion` makes autoplan bail at the - * premise gate via the plan-file fallback before any review runs. The - * report-at-bottom prompt change is verified statically in - * `test/gen-skill-docs.test.ts` instead — that's the load-bearing - * verification for the contradictory-prompt fix. - */ - -import { describe, test, expect } from 'bun:test'; -import { runPlanSkillObservation, planFileHasDecisionsSection } from './helpers/claude-pty-runner'; - -const shouldRun = !!process.env.EVALS && process.env.EVALS_TIER === 'gate'; -const describeE2E = shouldRun ? describe : describe.skip; - -describeE2E('autoplan AskUserQuestion-blocked smoke (gate)', () => { - // Pass envelope is ['asked', 'plan_ready']: model either renders the - // first non-auto-decided gate (Phase 1 premise confirmation) as numbered - // prose or surfaces it through the plan file + ExitPlanMode flow. - // Autoplan auto-decides intermediate questions BY DESIGN; the failure - // signal we care about is the AUTO_DECIDE preamble firing on a gate it - // shouldn't (caught explicitly via the 'auto_decided' outcome). - test('a non-auto-decided gate surfaces when AskUserQuestion is --disallowedTools', async () => { - const obs = await runPlanSkillObservation({ - skillName: 'autoplan', - inPlanMode: true, - extraArgs: ['--disallowedTools', 'AskUserQuestion'], - timeoutMs: 300_000, - }); - - if ( - obs.outcome === 'auto_decided' || - obs.outcome === 'silent_write' || - obs.outcome === 'exited' || - obs.outcome === 'timeout' - ) { - throw new Error( - `autoplan AskUserQuestion-blocked regression: outcome=${obs.outcome}\n` + - `summary: ${obs.summary}\n` + - `elapsed: ${obs.elapsedMs}ms\n` + - `--- evidence (last 2KB visible) ---\n${obs.evidence}`, - ); - } - if (obs.outcome === 'plan_ready') { - if (!obs.planFile || !planFileHasDecisionsSection(obs.planFile)) { - throw new Error( - `autoplan AskUserQuestion-blocked regression: plan_ready without a "## Decisions" section in ${obs.planFile ?? ''} — Phase 1 premise gate was silently skipped.\n` + - `--- evidence (last 2KB visible) ---\n${obs.evidence}`, - ); - } - } - expect(['asked', 'plan_ready']).toContain(obs.outcome); - }, 360_000); -}); diff --git a/test/skill-e2e-plan-ceo-plan-mode.test.ts b/test/skill-e2e-plan-ceo-plan-mode.test.ts index a99084e1..30e32fb2 100644 --- a/test/skill-e2e-plan-ceo-plan-mode.test.ts +++ b/test/skill-e2e-plan-ceo-plan-mode.test.ts @@ -33,10 +33,9 @@ * See test/helpers/claude-pty-runner.ts for runner internals. */ -import { describe, test, expect } from 'bun:test'; +import { describe, test } from 'bun:test'; import { runPlanSkillObservation, - planFileHasDecisionsSection, assertReportAtBottomIfPlanWritten, } from './helpers/claude-pty-runner'; @@ -74,66 +73,4 @@ describeE2E('plan-ceo-review plan-mode smoke (gate)', () => { } assertReportAtBottomIfPlanWritten(obs); }, 360_000); - - // v1.21+ regression: Conductor launches Claude Code with - // `--disallowedTools AskUserQuestion --permission-mode default` (verified - // via `ps` on the live Conductor claude process). Native AskUserQuestion - // is removed from the model's tool registry; without fallback guidance - // the model can't ask and silently proceeds. - // - // The fix (Tool resolution preamble) accepts two surface paths under - // --disallowedTools: - // - 'asked' — model emits a numbered-option prompt as prose (with - // the same D + Pros/cons format as a real AUQ) - // - 'plan_ready' — model writes the question into the plan file as a - // "## Decisions to confirm" section + ExitPlanMode; - // the native plan-mode "Ready to execute?" surfaces - // it through the TTY confirmation - // - // Both let the user see the decision. Failure signals are - // silent_write/exited/timeout (model never surfaced the question) and - // 'auto_decided' (the AUTO_DECIDE preamble fired without a /plan-tune - // opt-in — caught explicitly). - test('AskUserQuestion surfaces when --disallowedTools AskUserQuestion is set', async () => { - const obs = await runPlanSkillObservation({ - skillName: 'plan-ceo-review', - inPlanMode: true, - extraArgs: ['--disallowedTools', 'AskUserQuestion'], - timeoutMs: 300_000, - }); - - if ( - obs.outcome === 'auto_decided' || - obs.outcome === 'silent_write' || - obs.outcome === 'exited' || - obs.outcome === 'timeout' - ) { - throw new Error( - `plan-ceo-review AskUserQuestion-blocked regression: outcome=${obs.outcome}\n` + - `summary: ${obs.summary}\n` + - `elapsed: ${obs.elapsedMs}ms\n` + - `--- evidence (last 2KB visible) ---\n${obs.evidence}`, - ); - } - // plan_ready under --disallowedTools is only a pass when the model used - // the plan-file fallback (wrote a `## Decisions to confirm` section). - // Without that section, plan_ready means the model silently skipped Step 0 - // and went straight to ExitPlanMode — the regression we're catching. - if (obs.outcome === 'plan_ready') { - if (!obs.planFile) { - throw new Error( - `plan-ceo-review AskUserQuestion-blocked regression: outcome=plan_ready but no plan file path detected in TTY output. Cannot verify the model used the fallback flow.\n` + - `--- evidence (last 2KB visible) ---\n${obs.evidence}`, - ); - } - if (!planFileHasDecisionsSection(obs.planFile)) { - throw new Error( - `plan-ceo-review AskUserQuestion-blocked regression: model wrote ${obs.planFile} without a "## Decisions" section. Step 0 was silently skipped.\n` + - `--- evidence (last 2KB visible) ---\n${obs.evidence}`, - ); - } - } - expect(['asked', 'plan_ready']).toContain(obs.outcome); - assertReportAtBottomIfPlanWritten(obs); - }, 360_000); }); diff --git a/test/skill-e2e-plan-design-plan-mode.test.ts b/test/skill-e2e-plan-design-plan-mode.test.ts index da3d591a..80b98287 100644 --- a/test/skill-e2e-plan-design-plan-mode.test.ts +++ b/test/skill-e2e-plan-design-plan-mode.test.ts @@ -37,41 +37,4 @@ describeE2E('plan-design-review plan-mode smoke (gate)', () => { expect(['asked', 'plan_ready']).toContain(obs.outcome); assertReportAtBottomIfPlanWritten(obs); }, 360_000); - - // v1.21+ regression: see skill-e2e-plan-ceo-plan-mode.test.ts for the - // contract. plan-design-review legitimately short-circuits on no-UI-scope - // branches, so this case keeps the same ['asked', 'plan_ready'] envelope - // as the baseline. The discriminating regression signals are - // 'auto_decided' (AUTO_DECIDE preamble fired upstream) or any failure - // outcome — both mean the user never saw a question they should have. - test('does not silently auto-decide when --disallowedTools AskUserQuestion is set', async () => { - const obs = await runPlanSkillObservation({ - skillName: 'plan-design-review', - inPlanMode: true, - extraArgs: ['--disallowedTools', 'AskUserQuestion'], - timeoutMs: 300_000, - }); - - if ( - obs.outcome === 'auto_decided' || - obs.outcome === 'silent_write' || - obs.outcome === 'exited' || - obs.outcome === 'timeout' - ) { - throw new Error( - `plan-design-review AskUserQuestion-blocked regression: outcome=${obs.outcome}\n` + - `summary: ${obs.summary}\n` + - `elapsed: ${obs.elapsedMs}ms\n` + - `--- evidence (last 2KB visible) ---\n${obs.evidence}`, - ); - } - // plan-design-review legitimately short-circuits to plan_ready on no-UI - // branches. Allow plan_ready WITHOUT a decisions section ONLY if the - // plan file genuinely has no UI scope (we don't have a deterministic way - // to check this from the test, so this skill keeps the looser envelope). - // Other plan-mode skills require the decisions section under - // --disallowedTools; design is the special case. - expect(['asked', 'plan_ready']).toContain(obs.outcome); - assertReportAtBottomIfPlanWritten(obs); - }, 360_000); }); diff --git a/test/skill-e2e-plan-eng-multi-finding-batching.test.ts b/test/skill-e2e-plan-eng-multi-finding-batching.test.ts new file mode 100644 index 00000000..1eef660a --- /dev/null +++ b/test/skill-e2e-plan-eng-multi-finding-batching.test.ts @@ -0,0 +1,96 @@ +/** + * /plan-eng-review multi-finding batching regression (periodic, paid, real-PTY). + * + * Catches the specific shape of the May 2026 transcript bug that the + * single-finding gate-tier floor test cannot detect: a model that fires + * one AskUserQuestion and then batches the remaining findings into a + * single "## Decisions to confirm" plan write + ExitPlanMode. + * + * Why a separate test from skill-e2e-plan-eng-finding-floor: + * - The gate-tier floor (runPlanSkillFloorCheck) exits on the first AUQ + * render and returns success. A model that fires once-then-batches + * would pass that test trivially. + * - This test uses runPlanSkillCounting at periodic tier (~25 min budget, + * N-AUQ tracking, ceiling-bounded retries) to actually count distinct + * review-phase AUQs and assert the model fires one per finding. + * + * Why a separate test from skill-e2e-plan-eng-finding-count (the existing + * 5-finding count test): + * - The fixture here mirrors the D1-D4 transcript shape (4 findings) and + * the floor matches that exact threshold (3, the [N-1] tolerance band). + * This is the tightest regression test for the original bug class — + * not a band-around-N test, but a "did the agent batch?" test. + * + * Tier: periodic (~25 min, ~$5/run). Sequential by default. + */ + +import { describe, test } from 'bun:test'; +import * as fs from 'node:fs'; +import { + runPlanSkillCounting, + engStep0Boundary, +} from './helpers/claude-pty-runner'; +import { FORCING_BATCHING_ENG } from './fixtures/forcing-finding-seeds'; + +const shouldRun = !!process.env.EVALS && process.env.EVALS_TIER === 'periodic'; +const describeE2E = shouldRun ? describe : describe.skip; + +const N = 4; +const FLOOR = N - 1; // 3 — agent must fire at least one AUQ per non-batched finding + +const PLAN_PATH = '/tmp/gstack-test-plan-eng-batching.md'; + +describeE2E('/plan-eng-review multi-finding batching regression (periodic)', () => { + test( + `4-finding plan emits >= ${FLOOR} review-phase AskUserQuestions (no batching)`, + async () => { + try { + fs.rmSync(PLAN_PATH, { force: true }); + } catch { + /* best-effort */ + } + + const obs = await runPlanSkillCounting({ + skillName: 'plan-eng-review', + slashCommand: '/plan-eng-review', + followUpPrompt: FORCING_BATCHING_ENG, + isLastStep0AUQ: engStep0Boundary, + reviewCountCeiling: N + 3, // hard cap above floor + tolerance + cwd: process.cwd(), + timeoutMs: 1_500_000, // 25 min + env: { QUESTION_TUNING: 'false', EXPLAIN_LEVEL: 'default' }, + }); + + try { + if (!['plan_ready', 'completion_summary', 'ceiling_reached'].includes(obs.outcome)) { + throw new Error( + `multi-finding batching test FAILED: outcome=${obs.outcome}\n` + + `step0=${obs.step0Count} review=${obs.reviewCount} elapsed=${obs.elapsedMs}ms\n` + + `--- evidence (last 3KB) ---\n${obs.evidence}`, + ); + } + if (obs.reviewCount < FLOOR) { + throw new Error( + `BATCHING REGRESSION: reviewCount=${obs.reviewCount} < FLOOR=${FLOOR}.\n` + + `Agent surfaced fewer review-phase AUQs than findings — this is the\n` + + `May 2026 transcript bug shape: model batched multiple findings into\n` + + `a single plan write + ExitPlanMode instead of asking one per finding.\n` + + `Review-phase fingerprints:\n` + + obs.fingerprints + .filter((f) => !f.preReview) + .map((f) => ` - "${f.promptSnippet.slice(0, 80)}"`) + .join('\n') + + `\n--- evidence (last 3KB) ---\n${obs.evidence}`, + ); + } + } finally { + try { + fs.rmSync(PLAN_PATH, { force: true }); + } catch { + /* best-effort */ + } + } + }, + 1_700_000, + ); +}); diff --git a/test/skill-e2e-plan-eng-plan-mode.test.ts b/test/skill-e2e-plan-eng-plan-mode.test.ts index 3324022b..ec2adca4 100644 --- a/test/skill-e2e-plan-eng-plan-mode.test.ts +++ b/test/skill-e2e-plan-eng-plan-mode.test.ts @@ -65,43 +65,6 @@ describeE2E('plan-eng-review plan-mode smoke (gate)', () => { assertReportAtBottomIfPlanWritten(obs); }, 360_000); - // v1.21+ regression: see skill-e2e-plan-ceo-plan-mode.test.ts for the - // contract. Pass envelope is ['asked', 'plan_ready']; failure signals - // are 'auto_decided' (AUTO_DECIDE without opt-in) plus the standard - // silent_write/exited/timeout. - test('AskUserQuestion surfaces when --disallowedTools AskUserQuestion is set', async () => { - const obs = await runPlanSkillObservation({ - skillName: 'plan-eng-review', - inPlanMode: true, - extraArgs: ['--disallowedTools', 'AskUserQuestion'], - timeoutMs: 300_000, - }); - - if ( - obs.outcome === 'auto_decided' || - obs.outcome === 'silent_write' || - obs.outcome === 'exited' || - obs.outcome === 'timeout' - ) { - throw new Error( - `plan-eng-review AskUserQuestion-blocked regression: outcome=${obs.outcome}\n` + - `summary: ${obs.summary}\n` + - `elapsed: ${obs.elapsedMs}ms\n` + - `--- evidence (last 2KB visible) ---\n${obs.evidence}`, - ); - } - if (obs.outcome === 'plan_ready') { - if (!obs.planFile || !planFileHasDecisionsSection(obs.planFile)) { - throw new Error( - `plan-eng-review AskUserQuestion-blocked regression: plan_ready without a "## Decisions" section in ${obs.planFile ?? ''} — Step 0 was silently skipped.\n` + - `--- evidence (last 2KB visible) ---\n${obs.evidence}`, - ); - } - } - expect(['asked', 'plan_ready']).toContain(obs.outcome); - assertReportAtBottomIfPlanWritten(obs); - }, 360_000); - // D3-B / D4-B: when a plan with guaranteed-finding-triggering complexity // is seeded, the skill MUST fire AskUserQuestion (or fall back to a // Decisions section) before writing findings to the plan. The diff --git a/test/touchfiles.test.ts b/test/touchfiles.test.ts index 8cd5af2d..416ef057 100644 --- a/test/touchfiles.test.ts +++ b/test/touchfiles.test.ts @@ -99,14 +99,14 @@ describe('selectTests', () => { expect(result.selected).toContain('autoplan-chain-pty'); // Per-finding count + review-report-at-bottom (v1.21.x) expect(result.selected).toContain('plan-ceo-finding-count'); - // v1.22+ AskUserQuestion-blocked regression: autoplan-auto-mode + - // auto-decide-preserved also depend on plan-ceo-review/** - expect(result.selected).toContain('autoplan-auto-mode'); + // v1.22+ AskUserQuestion-blocked regression: auto-decide-preserved + // also depends on plan-ceo-review/** (autoplan-auto-mode test was + // removed in v1.28 — see commit message for the rationale). expect(result.selected).toContain('auto-decide-preserved'); // v1.27+ gate-tier reviewCount-floor regression for transcript bug expect(result.selected).toContain('plan-ceo-finding-floor'); - expect(result.selected.length).toBe(22); - expect(result.skipped.length).toBe(Object.keys(E2E_TOUCHFILES).length - 22); + expect(result.selected.length).toBe(21); + expect(result.skipped.length).toBe(Object.keys(E2E_TOUCHFILES).length - 21); }); test('global touchfile triggers ALL tests', () => {