mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-20 19:29:56 +08:00
fix: resolve merge conflicts with origin/main (v0.6.0 + v0.6.0.1 + v0.6.1)
Merge main's test bootstrap, boil-the-lake completeness principle, selective expansion, ship gate overrides, and gstack-upgrade vendor sync. Conflicts resolved: - CHANGELOG: keep main's 0.6.1/0.6.0.1/0.6.0/0.5.4/0.5.3 entries - VERSION: take main's 0.6.1 - design-consultation: office-hours naming + main's "what's out there" phrasing - ship: keep both verification rules (fresh evidence + coverage tests) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -11,6 +11,7 @@ allowed-tools:
|
||||
- Grep
|
||||
- Glob
|
||||
- AskUserQuestion
|
||||
- WebSearch
|
||||
---
|
||||
|
||||
{{PREAMBLE}}
|
||||
@@ -39,6 +40,7 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat
|
||||
- Multi-file changesets (auto-split into bisectable commits)
|
||||
- TODOS.md completed-item detection (auto-mark)
|
||||
- Auto-fixable review findings (dead code, N+1, stale comments — fixed automatically)
|
||||
- Test coverage gaps (auto-generate and commit, or flag in PR body)
|
||||
|
||||
---
|
||||
|
||||
@@ -54,10 +56,27 @@ You are running the `/ship` workflow. This is a **non-interactive, fully automat
|
||||
|
||||
{{REVIEW_DASHBOARD}}
|
||||
|
||||
If the verdict is NOT "CLEARED TO SHIP (3/3)", use AskUserQuestion:
|
||||
- Show which reviews are missing or have open issues
|
||||
- RECOMMENDATION: Choose B (run missing reviews first) unless the change is trivial
|
||||
- Options: A) Ship anyway B) Abort — run missing review(s) first C) Reviews not relevant for this change
|
||||
If the Eng Review is NOT "CLEAR":
|
||||
|
||||
1. **Check for a prior override on this branch:**
|
||||
```bash
|
||||
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
|
||||
grep '"skill":"ship-review-override"' ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl 2>/dev/null || echo "NO_OVERRIDE"
|
||||
```
|
||||
If an override exists, display the dashboard and note "Review gate previously accepted — continuing." Do NOT ask again.
|
||||
|
||||
2. **If no override exists,** use AskUserQuestion:
|
||||
- Show that Eng Review is missing or has open issues
|
||||
- RECOMMENDATION: Choose C if the change is obviously trivial (< 20 lines, typo fix, config-only); Choose B for larger changes
|
||||
- Options: A) Ship anyway B) Abort — run /plan-eng-review first C) Change is too small to need eng review
|
||||
- If CEO/Design reviews are missing, mention them as informational ("CEO Review not run — recommended for product changes") but do NOT block or recommend aborting for them
|
||||
|
||||
3. **If the user chooses A or C,** persist the decision so future `/ship` runs on this branch skip the gate:
|
||||
```bash
|
||||
eval $(~/.claude/skills/gstack/bin/gstack-slug 2>/dev/null)
|
||||
echo '{"skill":"ship-review-override","timestamp":"'"$(date -u +%Y-%m-%dT%H:%M:%SZ)"'","decision":"USER_CHOICE"}' >> ~/.gstack/projects/$SLUG/$BRANCH-reviews.jsonl
|
||||
```
|
||||
Substitute USER_CHOICE with "ship_anyway" or "not_relevant".
|
||||
|
||||
---
|
||||
|
||||
@@ -75,6 +94,12 @@ git fetch origin <base> && git merge origin/<base> --no-edit
|
||||
|
||||
---
|
||||
|
||||
## Step 2.5: Test Framework Bootstrap
|
||||
|
||||
{{TEST_BOOTSTRAP}}
|
||||
|
||||
---
|
||||
|
||||
## Step 3: Run tests (on merged code)
|
||||
|
||||
**Do NOT run `RAILS_ENV=test bin/rails db:migrate`** — `bin/test-lane` already calls
|
||||
@@ -159,6 +184,144 @@ If multiple suites need to run, run them sequentially (each needs a test lane).
|
||||
|
||||
---
|
||||
|
||||
## Step 3.4: Test Coverage Audit
|
||||
|
||||
100% coverage is the goal — every untested path is a path where bugs hide and vibe coding becomes yolo coding. Evaluate what was ACTUALLY coded (from the diff), not what was planned.
|
||||
|
||||
**0. Before/after test count:**
|
||||
|
||||
```bash
|
||||
# Count test files before any generation
|
||||
find . -name '*.test.*' -o -name '*.spec.*' -o -name '*_test.*' -o -name '*_spec.*' | grep -v node_modules | wc -l
|
||||
```
|
||||
|
||||
Store this number for the PR body.
|
||||
|
||||
**1. Trace every codepath changed** using `git diff origin/<base>...HEAD`:
|
||||
|
||||
Read every changed file. For each one, trace how data flows through the code — don't just list functions, actually follow the execution:
|
||||
|
||||
1. **Read the diff.** For each changed file, read the full file (not just the diff hunk) to understand context.
|
||||
2. **Trace data flow.** Starting from each entry point (route handler, exported function, event listener, component render), follow the data through every branch:
|
||||
- Where does input come from? (request params, props, database, API call)
|
||||
- What transforms it? (validation, mapping, computation)
|
||||
- Where does it go? (database write, API response, rendered output, side effect)
|
||||
- What can go wrong at each step? (null/undefined, invalid input, network failure, empty collection)
|
||||
3. **Diagram the execution.** For each changed file, draw an ASCII diagram showing:
|
||||
- Every function/method that was added or modified
|
||||
- Every conditional branch (if/else, switch, ternary, guard clause, early return)
|
||||
- Every error path (try/catch, rescue, error boundary, fallback)
|
||||
- Every call to another function (trace into it — does IT have untested branches?)
|
||||
- Every edge: what happens with null input? Empty array? Invalid type?
|
||||
|
||||
This is the critical step — you're building a map of every line of code that can execute differently based on input. Every branch in this diagram needs a test.
|
||||
|
||||
**2. Map user flows, interactions, and error states:**
|
||||
|
||||
Code coverage isn't enough — you need to cover how real users interact with the changed code. For each changed feature, think through:
|
||||
|
||||
- **User flows:** What sequence of actions does a user take that touches this code? Map the full journey (e.g., "user clicks 'Pay' → form validates → API call → success/failure screen"). Each step in the journey needs a test.
|
||||
- **Interaction edge cases:** What happens when the user does something unexpected?
|
||||
- Double-click/rapid resubmit
|
||||
- Navigate away mid-operation (back button, close tab, click another link)
|
||||
- Submit with stale data (page sat open for 30 minutes, session expired)
|
||||
- Slow connection (API takes 10 seconds — what does the user see?)
|
||||
- Concurrent actions (two tabs, same form)
|
||||
- **Error states the user can see:** For every error the code handles, what does the user actually experience?
|
||||
- Is there a clear error message or a silent failure?
|
||||
- Can the user recover (retry, go back, fix input) or are they stuck?
|
||||
- What happens with no network? With a 500 from the API? With invalid data from the server?
|
||||
- **Empty/zero/boundary states:** What does the UI show with zero results? With 10,000 results? With a single character input? With maximum-length input?
|
||||
|
||||
Add these to your diagram alongside the code branches. A user flow with no test is just as much a gap as an untested if/else.
|
||||
|
||||
**3. Check each branch against existing tests:**
|
||||
|
||||
Go through your diagram branch by branch — both code paths AND user flows. For each one, search for a test that exercises it:
|
||||
- Function `processPayment()` → look for `billing.test.ts`, `billing.spec.ts`, `test/billing_test.rb`
|
||||
- An if/else → look for tests covering BOTH the true AND false path
|
||||
- An error handler → look for a test that triggers that specific error condition
|
||||
- A call to `helperFn()` that has its own branches → those branches need tests too
|
||||
- A user flow → look for an integration or E2E test that walks through the journey
|
||||
- An interaction edge case → look for a test that simulates the unexpected action
|
||||
|
||||
Quality scoring rubric:
|
||||
- ★★★ Tests behavior with edge cases AND error paths
|
||||
- ★★ Tests correct behavior, happy path only
|
||||
- ★ Smoke test / existence check / trivial assertion (e.g., "it renders", "it doesn't throw")
|
||||
|
||||
**4. Output ASCII coverage diagram:**
|
||||
|
||||
Include BOTH code paths and user flows in the same diagram:
|
||||
|
||||
```
|
||||
CODE PATH COVERAGE
|
||||
===========================
|
||||
[+] src/services/billing.ts
|
||||
│
|
||||
├── processPayment()
|
||||
│ ├── [★★★ TESTED] Happy path + card declined + timeout — billing.test.ts:42
|
||||
│ ├── [GAP] Network timeout — NO TEST
|
||||
│ └── [GAP] Invalid currency — NO TEST
|
||||
│
|
||||
└── refundPayment()
|
||||
├── [★★ TESTED] Full refund — billing.test.ts:89
|
||||
└── [★ TESTED] Partial refund (checks non-throw only) — billing.test.ts:101
|
||||
|
||||
USER FLOW COVERAGE
|
||||
===========================
|
||||
[+] Payment checkout flow
|
||||
│
|
||||
├── [★★★ TESTED] Complete purchase — checkout.e2e.ts:15
|
||||
├── [GAP] Double-click submit — NO TEST
|
||||
├── [GAP] Navigate away during payment — NO TEST
|
||||
└── [★ TESTED] Form validation errors (checks render only) — checkout.test.ts:40
|
||||
|
||||
[+] Error states
|
||||
│
|
||||
├── [★★ TESTED] Card declined message — billing.test.ts:58
|
||||
├── [GAP] Network timeout UX (what does user see?) — NO TEST
|
||||
└── [GAP] Empty cart submission — NO TEST
|
||||
|
||||
─────────────────────────────────
|
||||
COVERAGE: 5/12 paths tested (42%)
|
||||
Code paths: 3/5 (60%)
|
||||
User flows: 2/7 (29%)
|
||||
QUALITY: ★★★: 2 ★★: 2 ★: 1
|
||||
GAPS: 7 paths need tests
|
||||
─────────────────────────────────
|
||||
```
|
||||
|
||||
**Fast path:** All paths covered → "Step 3.4: All new code paths have test coverage ✓" Continue.
|
||||
|
||||
**5. Generate tests for uncovered paths:**
|
||||
|
||||
If test framework detected (or bootstrapped in Step 2.5):
|
||||
- Prioritize error handlers and edge cases first (happy paths are more likely already tested)
|
||||
- Read 2-3 existing test files to match conventions exactly
|
||||
- Generate unit tests. Mock all external dependencies (DB, API, Redis).
|
||||
- Write tests that exercise the specific uncovered path with real assertions
|
||||
- Run each test. Passes → commit as `test: coverage for {feature}`
|
||||
- Fails → fix once. Still fails → revert, note gap in diagram.
|
||||
|
||||
Caps: 30 code paths max, 20 tests generated max (code + user flow combined), 2-min per-test exploration cap.
|
||||
|
||||
If no test framework AND user declined bootstrap → diagram only, no generation. Note: "Test generation skipped — no test framework configured."
|
||||
|
||||
**Diff is test-only changes:** Skip Step 3.4 entirely: "No new application code paths to audit."
|
||||
|
||||
**6. After-count and coverage summary:**
|
||||
|
||||
```bash
|
||||
# Count test files after generation
|
||||
find . -name '*.test.*' -o -name '*.spec.*' -o -name '*_test.*' -o -name '*_spec.*' | grep -v node_modules | wc -l
|
||||
```
|
||||
|
||||
For PR body: `Tests: {before} → {after} (+{delta} new)`
|
||||
Coverage line: `Test Coverage Audit: N new code paths. M covered (X%). K tests generated, J committed.`
|
||||
|
||||
---
|
||||
|
||||
## Step 3.5: Pre-Landing Review
|
||||
|
||||
Review the diff for structural issues that tests don't catch.
|
||||
@@ -409,6 +572,10 @@ gh pr create --base <base> --title "<type>: <summary>" --body "$(cat <<'EOF'
|
||||
## Summary
|
||||
<bullet points from CHANGELOG>
|
||||
|
||||
## Test Coverage
|
||||
<coverage diagram from Step 3.4, or "All new code paths have test coverage.">
|
||||
<If Step 3.4 ran: "Tests: {before} → {after} (+{delta} new)">
|
||||
|
||||
## Pre-Landing Review
|
||||
<findings from Step 3.5, or "No issues found.">
|
||||
|
||||
@@ -451,4 +618,5 @@ EOF
|
||||
- **TODOS.md completion detection must be conservative.** Only mark items as completed when the diff clearly shows the work is done.
|
||||
- **Use Greptile reply templates from greptile-triage.md.** Every reply includes evidence (inline diff, code references, re-rank suggestion). Never post vague replies.
|
||||
- **Never push without fresh verification evidence.** If code changed after Step 3 tests, re-run before pushing.
|
||||
- **Step 3.4 generates coverage tests.** They must pass before committing. Never commit failing tests.
|
||||
- **The goal is: user says `/ship`, next thing they see is the review + PR URL.**
|
||||
|
||||
Reference in New Issue
Block a user