mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-22 04:38:24 +08:00
v1.33.1.0 fix(learnings): token-OR query + task-shaped retrieval in 3 long skills (#1442)
* fix(learnings): use token-OR matching in gstack-learnings-search --query
Split the query on whitespace into tokens; a learning matches if ANY
token appears as a substring in ANY of key/insight/files. Previously
the whole query was a single substring, so multi-word queries like
"debug investigation" only matched learnings whose insight contained
that exact contiguous phrase, which is usually nothing.
Whitespace-only query falls through to no-query (matches today's no-flag
behavior). Single-word queries behave exactly as before.
Adds test/gstack-learnings-search.test.ts: 3 assertions covering
multi-token, single-token, and no-query backwards compat.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(resolver): parameterized LEARNINGS_SEARCH with shell-injection guard
The {{LEARNINGS_SEARCH}} macro now accepts a query=KEYWORD argument that
gets interpolated as --query "<keyword>" into the generated bash. Empty
value falls through to no-query (principle of least surprise: a stray
{{LEARNINGS_SEARCH:query=}} placeholder gets today's behavior, not a
build failure). Pattern reuses the parameterized-macro parsing from
composition.ts. The 13 templates that don't pass a query stay
byte-identical in their generated SKILL.md output.
Shell-injection guard: the query value is whitelisted to
^[A-Za-z0-9 _-]+$ at gen-skill-docs time. Any \$(), backticks,
semicolons, or quotes throw a loud build error instead of emitting
executable bash. Static template queries are safe by inspection;
this defends against future contributors writing dangerous values.
Adds 5 assertions to test/gen-skill-docs.test.ts covering no-args,
claude+query=foo bar on both cross-project and project-scoped branches,
codex host variant, empty value semantics, and shell-injection payloads
(\$(whoami), backticks, ;, &, ", \\, \$x) throwing build errors.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(skills): task-shaped queries + mid-flow refresh in /investigate /qa /ship
The three long skills now pull learnings keyed to their theme at the
top, then re-pull at phase boundaries as work shifts to new sub-tasks.
Top-of-skill queries (5-6 token unions, token-OR matched):
- investigate: "debug investigation root cause hypothesis bug fix"
- qa: "qa testing bug regression flake fixture"
- ship: "release ship version changelog merge pr"
Mid-flow refresh blocks (concrete keyword recipe + worked examples):
- investigate: between Phase 1 (hypothesis) and Phase 2 (analysis),
keyed to the hypothesis noun. Examples: auth-cookie, session-expiry.
- qa: between Phase 7 (triage) and Phase 8 (fix loop), keyed to the
buggy component name. Examples: checkout-button, signup-form.
- ship: just before Step 12 (VERSION bump), keyed to the headline
feature. Examples: learnings-search, pacing, worktree-ship.
Keyword recipe enforces alphanumeric+hyphen only (no quotes, slashes,
dots, colons) so dynamic queries cannot inject shell metacharacters.
The other 13 short-lived skills keep the bare {{LEARNINGS_SEARCH}} form.
Backwards-compat verified via diff: their generated SKILL.md output is
byte-identical to before this change.
Golden ship fixtures regenerated to match the new ship/SKILL.md output.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* chore: bump version and changelog (v1.33.1.0)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* test: refresh codex+factory ship golden fixtures
Follow-up to 513c9660 — the codex and factory host outputs needed
regeneration too, missed in the initial commit because gen:skill-docs
was only run for the claude host. Now matches gen:skill-docs --host all.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
18
qa/SKILL.md
18
qa/SKILL.md
@@ -1068,9 +1068,9 @@ Search for relevant learnings from previous sessions:
|
||||
_CROSS_PROJ=$(~/.claude/skills/gstack/bin/gstack-config get cross_project_learnings 2>/dev/null || echo "unset")
|
||||
echo "CROSS_PROJECT: $_CROSS_PROJ"
|
||||
if [ "$_CROSS_PROJ" = "true" ]; then
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --cross-project 2>/dev/null || true
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "qa testing bug regression flake fixture" --cross-project 2>/dev/null || true
|
||||
else
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 2>/dev/null || true
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --limit 10 --query "qa testing bug regression flake fixture" 2>/dev/null || true
|
||||
fi
|
||||
```
|
||||
|
||||
@@ -1426,6 +1426,20 @@ Sort all discovered issues by severity, then decide which to fix based on the se
|
||||
|
||||
Mark issues that cannot be fixed from source code (e.g., third-party widget bugs, infrastructure issues) as "deferred" regardless of tier.
|
||||
|
||||
### Refresh learnings for the component/page where the bug lives
|
||||
|
||||
The top-of-skill learnings pull was keyed to "qa testing" broadly. Before the fix loop, re-pull learnings keyed to the component or page where the bug you're about to fix lives so prior fixes for the same component-shape surface.
|
||||
|
||||
Pick ONE keyword that names the buggy component or page. The keyword should be a noun: the failing component name, the page route base, or the feature noun. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
|
||||
|
||||
Worked examples (qa-specific): good keywords are `checkout-button`, `signup-form`, `payment`. Bad: `tests are failing`, `<failing-test>`, `app/views/_checkout.html.erb`.
|
||||
|
||||
```bash
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
|
||||
```
|
||||
|
||||
If any learnings come back, name which one applies to the fix you're about to make in one sentence. If none come back, continue without reference — the absence is itself useful information.
|
||||
|
||||
---
|
||||
|
||||
## Phase 8: Fix Loop
|
||||
|
||||
@@ -100,7 +100,7 @@ mkdir -p .gstack/qa-reports/screenshots
|
||||
|
||||
---
|
||||
|
||||
{{LEARNINGS_SEARCH}}
|
||||
{{LEARNINGS_SEARCH:query=qa testing bug regression flake fixture}}
|
||||
|
||||
## Test Plan Context
|
||||
|
||||
@@ -154,6 +154,20 @@ Sort all discovered issues by severity, then decide which to fix based on the se
|
||||
|
||||
Mark issues that cannot be fixed from source code (e.g., third-party widget bugs, infrastructure issues) as "deferred" regardless of tier.
|
||||
|
||||
### Refresh learnings for the component/page where the bug lives
|
||||
|
||||
The top-of-skill learnings pull was keyed to "qa testing" broadly. Before the fix loop, re-pull learnings keyed to the component or page where the bug you're about to fix lives so prior fixes for the same component-shape surface.
|
||||
|
||||
Pick ONE keyword that names the buggy component or page. The keyword should be a noun: the failing component name, the page route base, or the feature noun. The keyword MUST be alphanumeric or hyphen only — no quotes, slashes, dots, colons, or whitespace. If your candidate has any of those, simplify to just the alphanumeric stem.
|
||||
|
||||
Worked examples (qa-specific): good keywords are `checkout-button`, `signup-form`, `payment`. Bad: `tests are failing`, `<failing-test>`, `app/views/_checkout.html.erb`.
|
||||
|
||||
```bash
|
||||
~/.claude/skills/gstack/bin/gstack-learnings-search --query "<your-keyword>" --limit 5 2>/dev/null || true
|
||||
```
|
||||
|
||||
If any learnings come back, name which one applies to the fix you're about to make in one sentence. If none come back, continue without reference — the absence is itself useful information.
|
||||
|
||||
---
|
||||
|
||||
## Phase 8: Fix Loop
|
||||
|
||||
Reference in New Issue
Block a user