mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-17 01:31:26 +08:00
merge: origin/main v1.1.2.0 (mode-posture energy fix) into fix-checkpoints
Main shipped /plan-ceo-review + /office-hours mode-posture fixes as v1.1.2.0 — same version slot this branch used. Bumped ours to v1.1.3.0. Resolved conflicts: - VERSION / package.json: 1.1.2.0 → 1.1.3.0 - CHANGELOG.md: our entry moved up to [1.1.3.0], main's mode-posture entry kept at [1.1.2.0]. Sequence: 1.1.3.0 → 1.1.2.0 → 1.1.1.0 → 1.1.0.0 → 1.0.0.0 - Migration renamed v1.1.2.0.sh → v1.1.3.0.sh (version string inside + test path reference + hardening test describe block all updated) Also expanded our CHANGELOG entry to reflect the testing work that landed over the debugging loop: 8 live-fire E2E tests, 21 free-tier hardening tests, collision sentinel, env: param on runSkillTest, and GSTACK_HOME storage-path fix. No code overlap with main's changes (main edited preamble writing-style rules + plan-ceo-review and office-hours templates; ours edited context-save / context-restore / migration / tests). SKILL.md files regenerated via bun run gen:skill-docs --host all. Golden fixtures updated. bun test: 0 failures after renaming v1.1.2.0 → v1.1.3.0 in the hardening test describe block and MIGRATION constant.
This commit is contained in:
12
test/fixtures/golden/claude-ship-SKILL.md
vendored
12
test/fixtures/golden/claude-ship-SKILL.md
vendored
@@ -410,9 +410,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
|
||||
These rules apply to every AskUserQuestion, every response you write to the user, and every review finding. They compose with the AskUserQuestion Format section above: Format = *how* a question is structured; Writing Style = *the prose quality of the content inside it*.
|
||||
|
||||
1. **Jargon gets a one-sentence gloss on first use per skill invocation.** Even if the user's own prompt already contained the term — users often paste jargon from someone else's plan. Gloss unconditionally on first use. No cross-invocation memory: a new skill fire is a new first-use opportunity. Example: "race condition (two things happen at the same time and step on each other)".
|
||||
2. **Frame questions in outcome terms, not implementation terms.** Bad: "Is this endpoint idempotent?" Good: "If someone double-clicks the button, is it OK for the action to run twice?" Ask the question the user would actually want to answer.
|
||||
3. **Short sentences. Concrete nouns. Active voice.** Standard advice from any good writing guide. Prefer "the cache stores the result for 60s" over "results will have been cached for a period of 60s."
|
||||
4. **Close every decision with user impact.** Connect the technical call back to who's affected. "If we skip this, your users will see a 3-second spinner on every page load." Make the user's user real.
|
||||
2. **Frame questions in outcome terms, not implementation terms.** Ask the question the user would actually want to answer. Outcome framing covers three families — match the framing to the mode:
|
||||
- **Pain reduction** (default for diagnostic / HOLD SCOPE / rigor review): "If someone double-clicks the button, is it OK for the action to run twice?" (instead of "Is this endpoint idempotent?")
|
||||
- **Upside / delight** (for expansion / builder / vision contexts): "When the workflow finishes, does the user see the result instantly, or are they still refreshing a dashboard?" (instead of "Should we add webhook notifications?")
|
||||
- **Interrogative pressure** (for forcing-question / founder-challenge contexts): "Can you name the actual person whose career gets better if this ships and whose career gets worse if it doesn't?" (instead of "Who's the target user?")
|
||||
3. **Short sentences. Concrete nouns. Active voice.** Standard advice from any good writing guide. Prefer "the cache stores the result for 60s" over "results will have been cached for a period of 60s." *Exception:* stacked, multi-part questions are a legitimate forcing device — "Title? Gets them promoted? Gets them fired? Keeps them up at night?" is longer than one short sentence, and it should be, because the pressure IS in the stacking. Don't collapse a stack into a single neutral ask when the skill's posture is forcing.
|
||||
4. **Close every decision with user impact.** Connect the technical call back to who's affected. Make the user's user real. Impact has three shapes — again, match the mode:
|
||||
- **Pain avoided:** "If we skip this, your users will see a 3-second spinner on every page load."
|
||||
- **Capability unlocked:** "If we ship this, users get instant feedback the moment a workflow finishes — no tabs to refresh, no polling."
|
||||
- **Consequence named** (for forcing questions): "If you can't name the person whose career this helps, you don't know who you're building for — and 'users' isn't an answer."
|
||||
5. **User-turn override.** If the user's current message says "be terse" / "no explanations" / "brutally honest, just the answer" / similar, skip this entire Writing Style block for your next response, regardless of config. User's in-turn request wins.
|
||||
6. **Glossary boundary is the curated list.** Terms below get glossed. Terms not on the list are assumed plain-English enough. If you see a term that genuinely needs glossing but isn't listed, note it (once) in your response so it can be added via PR.
|
||||
|
||||
|
||||
12
test/fixtures/golden/codex-ship-SKILL.md
vendored
12
test/fixtures/golden/codex-ship-SKILL.md
vendored
@@ -399,9 +399,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
|
||||
These rules apply to every AskUserQuestion, every response you write to the user, and every review finding. They compose with the AskUserQuestion Format section above: Format = *how* a question is structured; Writing Style = *the prose quality of the content inside it*.
|
||||
|
||||
1. **Jargon gets a one-sentence gloss on first use per skill invocation.** Even if the user's own prompt already contained the term — users often paste jargon from someone else's plan. Gloss unconditionally on first use. No cross-invocation memory: a new skill fire is a new first-use opportunity. Example: "race condition (two things happen at the same time and step on each other)".
|
||||
2. **Frame questions in outcome terms, not implementation terms.** Bad: "Is this endpoint idempotent?" Good: "If someone double-clicks the button, is it OK for the action to run twice?" Ask the question the user would actually want to answer.
|
||||
3. **Short sentences. Concrete nouns. Active voice.** Standard advice from any good writing guide. Prefer "the cache stores the result for 60s" over "results will have been cached for a period of 60s."
|
||||
4. **Close every decision with user impact.** Connect the technical call back to who's affected. "If we skip this, your users will see a 3-second spinner on every page load." Make the user's user real.
|
||||
2. **Frame questions in outcome terms, not implementation terms.** Ask the question the user would actually want to answer. Outcome framing covers three families — match the framing to the mode:
|
||||
- **Pain reduction** (default for diagnostic / HOLD SCOPE / rigor review): "If someone double-clicks the button, is it OK for the action to run twice?" (instead of "Is this endpoint idempotent?")
|
||||
- **Upside / delight** (for expansion / builder / vision contexts): "When the workflow finishes, does the user see the result instantly, or are they still refreshing a dashboard?" (instead of "Should we add webhook notifications?")
|
||||
- **Interrogative pressure** (for forcing-question / founder-challenge contexts): "Can you name the actual person whose career gets better if this ships and whose career gets worse if it doesn't?" (instead of "Who's the target user?")
|
||||
3. **Short sentences. Concrete nouns. Active voice.** Standard advice from any good writing guide. Prefer "the cache stores the result for 60s" over "results will have been cached for a period of 60s." *Exception:* stacked, multi-part questions are a legitimate forcing device — "Title? Gets them promoted? Gets them fired? Keeps them up at night?" is longer than one short sentence, and it should be, because the pressure IS in the stacking. Don't collapse a stack into a single neutral ask when the skill's posture is forcing.
|
||||
4. **Close every decision with user impact.** Connect the technical call back to who's affected. Make the user's user real. Impact has three shapes — again, match the mode:
|
||||
- **Pain avoided:** "If we skip this, your users will see a 3-second spinner on every page load."
|
||||
- **Capability unlocked:** "If we ship this, users get instant feedback the moment a workflow finishes — no tabs to refresh, no polling."
|
||||
- **Consequence named** (for forcing questions): "If you can't name the person whose career this helps, you don't know who you're building for — and 'users' isn't an answer."
|
||||
5. **User-turn override.** If the user's current message says "be terse" / "no explanations" / "brutally honest, just the answer" / similar, skip this entire Writing Style block for your next response, regardless of config. User's in-turn request wins.
|
||||
6. **Glossary boundary is the curated list.** Terms below get glossed. Terms not on the list are assumed plain-English enough. If you see a term that genuinely needs glossing but isn't listed, note it (once) in your response so it can be added via PR.
|
||||
|
||||
|
||||
12
test/fixtures/golden/factory-ship-SKILL.md
vendored
12
test/fixtures/golden/factory-ship-SKILL.md
vendored
@@ -401,9 +401,15 @@ Per-skill instructions may add additional formatting rules on top of this baseli
|
||||
These rules apply to every AskUserQuestion, every response you write to the user, and every review finding. They compose with the AskUserQuestion Format section above: Format = *how* a question is structured; Writing Style = *the prose quality of the content inside it*.
|
||||
|
||||
1. **Jargon gets a one-sentence gloss on first use per skill invocation.** Even if the user's own prompt already contained the term — users often paste jargon from someone else's plan. Gloss unconditionally on first use. No cross-invocation memory: a new skill fire is a new first-use opportunity. Example: "race condition (two things happen at the same time and step on each other)".
|
||||
2. **Frame questions in outcome terms, not implementation terms.** Bad: "Is this endpoint idempotent?" Good: "If someone double-clicks the button, is it OK for the action to run twice?" Ask the question the user would actually want to answer.
|
||||
3. **Short sentences. Concrete nouns. Active voice.** Standard advice from any good writing guide. Prefer "the cache stores the result for 60s" over "results will have been cached for a period of 60s."
|
||||
4. **Close every decision with user impact.** Connect the technical call back to who's affected. "If we skip this, your users will see a 3-second spinner on every page load." Make the user's user real.
|
||||
2. **Frame questions in outcome terms, not implementation terms.** Ask the question the user would actually want to answer. Outcome framing covers three families — match the framing to the mode:
|
||||
- **Pain reduction** (default for diagnostic / HOLD SCOPE / rigor review): "If someone double-clicks the button, is it OK for the action to run twice?" (instead of "Is this endpoint idempotent?")
|
||||
- **Upside / delight** (for expansion / builder / vision contexts): "When the workflow finishes, does the user see the result instantly, or are they still refreshing a dashboard?" (instead of "Should we add webhook notifications?")
|
||||
- **Interrogative pressure** (for forcing-question / founder-challenge contexts): "Can you name the actual person whose career gets better if this ships and whose career gets worse if it doesn't?" (instead of "Who's the target user?")
|
||||
3. **Short sentences. Concrete nouns. Active voice.** Standard advice from any good writing guide. Prefer "the cache stores the result for 60s" over "results will have been cached for a period of 60s." *Exception:* stacked, multi-part questions are a legitimate forcing device — "Title? Gets them promoted? Gets them fired? Keeps them up at night?" is longer than one short sentence, and it should be, because the pressure IS in the stacking. Don't collapse a stack into a single neutral ask when the skill's posture is forcing.
|
||||
4. **Close every decision with user impact.** Connect the technical call back to who's affected. Make the user's user real. Impact has three shapes — again, match the mode:
|
||||
- **Pain avoided:** "If we skip this, your users will see a 3-second spinner on every page load."
|
||||
- **Capability unlocked:** "If we ship this, users get instant feedback the moment a workflow finishes — no tabs to refresh, no polling."
|
||||
- **Consequence named** (for forcing questions): "If you can't name the person whose career this helps, you don't know who you're building for — and 'users' isn't an answer."
|
||||
5. **User-turn override.** If the user's current message says "be terse" / "no explanations" / "brutally honest, just the answer" / similar, skip this entire Writing Style block for your next response, regardless of config. User's in-turn request wins.
|
||||
6. **Glossary boundary is the curated list.** Terms below get glossed. Terms not on the list are assumed plain-English enough. If you see a term that genuinely needs glossing but isn't listed, note it (once) in your response so it can be added via PR.
|
||||
|
||||
|
||||
15
test/fixtures/mode-posture/builder-idea.md
vendored
Normal file
15
test/fixtures/mode-posture/builder-idea.md
vendored
Normal file
@@ -0,0 +1,15 @@
|
||||
# Weekend Project: Dependency Graph Visualizer
|
||||
|
||||
I want to build a tool that takes a codebase and visualizes its dependency graph — modules, imports, which files depend on which. For fun, for learning. Maybe open-source it.
|
||||
|
||||
## What I have so far
|
||||
|
||||
- Rough idea: point it at a repo, get an interactive graph
|
||||
- Stack I'm leaning toward: TypeScript + D3 or Cytoscape for rendering
|
||||
- Potential: could work for JS/TS first, maybe Python later
|
||||
|
||||
## What I don't know yet
|
||||
|
||||
- How to make the visualization actually useful vs just pretty
|
||||
- Whether this should be a CLI, a web tool, or a VS Code extension
|
||||
- What would make someone else want to use it
|
||||
23
test/fixtures/mode-posture/expansion-plan.md
vendored
Normal file
23
test/fixtures/mode-posture/expansion-plan.md
vendored
Normal file
@@ -0,0 +1,23 @@
|
||||
# Plan: Team Velocity Dashboard
|
||||
|
||||
## Context
|
||||
|
||||
We're building a dashboard for engineering managers to track team code velocity — commits per engineer, PR cycle time, review latency, CI pass rate. The data already lives in GitHub; we're just aggregating it for a manager's single-pane view.
|
||||
|
||||
## Changes
|
||||
|
||||
1. New React component `TeamVelocityDashboard` in `src/dashboard/`
|
||||
2. REST API endpoint `GET /api/team/velocity?days=30` returning aggregated metrics
|
||||
3. Background job pulling GitHub data every 15 minutes into Postgres
|
||||
4. Simple filter UI: team, date range, metric
|
||||
|
||||
## Architecture
|
||||
|
||||
- Frontend: React + shadcn/ui
|
||||
- Backend: Express + PostgreSQL
|
||||
- Data source: GitHub REST API (cached 15min)
|
||||
|
||||
## Open questions
|
||||
|
||||
- Should we support multiple repos per team?
|
||||
- Do we show individual engineer names or aggregate only?
|
||||
13
test/fixtures/mode-posture/forcing-pitch.md
vendored
Normal file
13
test/fixtures/mode-posture/forcing-pitch.md
vendored
Normal file
@@ -0,0 +1,13 @@
|
||||
# Our Idea: AI Tools for Product Managers
|
||||
|
||||
We're building AI tools for product managers at mid-market SaaS companies. The product combines a bunch of the things PMs already do — writing PRDs, gathering user feedback, analyzing usage data, drafting roadmaps — and uses LLMs to speed each of them up.
|
||||
|
||||
## Who we're targeting
|
||||
|
||||
Product managers at SaaS companies with 50-500 engineers. These PMs are stretched thin, juggle a lot of surface area, and would benefit from AI assistance.
|
||||
|
||||
## What we've done so far
|
||||
|
||||
- Talked to a few PMs we know from prior jobs
|
||||
- Built a prototype that summarizes Zoom customer calls into a PRD stub
|
||||
- Got on a waitlist of about 40 signups from LinkedIn posts
|
||||
Reference in New Issue
Block a user