refactor: reorganize codebase — move browse CLI to browse/ directory

Restructure project layout: src/ → browse/src/, test/ → browse/test/. Add snapshot testing. Update docs, package.json, and skills integration. Add setup script and TODO tracking. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-05-20 11:19:56 +08:00 · 2026-03-11 19:27:43 -07:00
parent bfa6102679
commit e9fbb664f8
29 changed files with 888 additions and 130 deletions
--- a/review/SKILL.md
+++ b/review/SKILL.md
@@ -0,0 +1,78 @@
+---
+name: review
+version: 1.0.0
+description: |
+  Pre-landing PR review. Analyzes diff against main for SQL safety, LLM trust
+  boundary violations, conditional side effects, and other structural issues.
+allowed-tools:
+  - Bash
+  - Read
+  - Edit
+  - Write
+  - Grep
+  - Glob
+  - AskUserQuestion
+---
+
+# Pre-Landing PR Review
+
+You are running the `/review` workflow. Analyze the current branch's diff against main for structural issues that tests don't catch.
+
+---
+
+## Step 1: Check branch
+
+1. Run `git branch --show-current` to get the current branch.
+2. If on `main`, output: **"Nothing to review — you're on main or have no changes against main."** and stop.
+3. Run `git fetch origin main --quiet && git diff origin/main --stat` to check if there's a diff. If no diff, output the same message and stop.
+
+---
+
+## Step 2: Read the checklist
+
+Read `.claude/skills/review/checklist.md`.
+
+**If the file cannot be read, STOP and report the error.** Do not proceed without the checklist.
+
+---
+
+## Step 3: Get the diff
+
+Fetch the latest main to avoid false positives from a stale local main:
+
+```bash
+git fetch origin main --quiet
+```
+
+Run `git diff origin/main` to get the full diff. This includes both committed and uncommitted changes against the latest main.
+
+---
+
+## Step 4: Two-pass review
+
+Apply the checklist against the diff in two passes:
+
+1. **Pass 1 (CRITICAL):** SQL & Data Safety, LLM Output Trust Boundary
+2. **Pass 2 (INFORMATIONAL):** Conditional Side Effects, Magic Numbers & String Coupling, Dead Code & Consistency, LLM Prompt Issues, Test Gaps, View/Frontend
+
+Follow the output format specified in the checklist. Respect the suppressions — do NOT flag items listed in the "DO NOT flag" section.
+
+---
+
+## Step 5: Output findings
+
+**Always output ALL findings** — both critical and informational. The user must see every issue.
+
+- If CRITICAL issues found: output all findings, then for EACH critical issue use a separate AskUserQuestion with the problem, your recommended fix, and options (A: Fix it now, B: Acknowledge, C: False positive — skip).
+  After all critical questions are answered, output a summary of what the user chose for each issue. If the user chose A (fix) on any issue, apply the recommended fixes. If only B/C were chosen, no action needed.
+- If only non-critical issues found: output findings. No further action needed.
+- If no issues found: output `Pre-Landing Review: No issues found.`
+
+---
+
+## Important Rules
+
+- **Read the FULL diff before commenting.** Do not flag issues already addressed in the diff.
+- **Read-only by default.** Only modify files if the user explicitly chooses "Fix it now" on a critical issue. Never commit, push, or create PRs.
+- **Be terse.** One line problem, one line fix. No preamble.
+- **Only flag real problems.** Skip anything that's fine.
--- a/review/checklist.md
+++ b/review/checklist.md
@@ -0,0 +1,125 @@
+# Pre-Landing Review Checklist
+
+## Instructions
+
+Review the `git diff origin/main` output for the issues listed below. Be specific — cite `file:line` and suggest fixes. Skip anything that's fine. Only flag real problems.
+
+**Two-pass review:**
+- **Pass 1 (CRITICAL):** Run SQL & Data Safety and LLM Output Trust Boundary first. These can block `/ship`.
+- **Pass 2 (INFORMATIONAL):** Run all remaining categories. These are included in the PR body but do not block.
+
+**Output format:**
+
+```
+Pre-Landing Review: N issues (X critical, Y informational)
+
+**CRITICAL** (blocking /ship):
+- [file:line] Problem description
+  Fix: suggested fix
+
+**Issues** (non-blocking):
+- [file:line] Problem description
+  Fix: suggested fix
+```
+
+If no issues found: `Pre-Landing Review: No issues found.`
+
+Be terse. For each issue: one line describing the problem, one line with the fix. No preamble, no summaries, no "looks good overall."
+
+---
+
+## Review Categories
+
+### Pass 1 — CRITICAL
+
+#### SQL & Data Safety
+- String interpolation in SQL (even if values are `.to_i`/`.to_f` — use `sanitize_sql_array` or Arel)
+- TOCTOU races: check-then-set patterns that should be atomic `WHERE` + `update_all`
+- `update_column`/`update_columns` bypassing validations on fields that have or should have constraints
+- N+1 queries: `.includes()` missing for associations used in loops/views (especially avatar, attachments)
+
+#### Race Conditions & Concurrency
+- Read-check-write without uniqueness constraint or `rescue RecordNotUnique; retry` (e.g., `where(hash:).first` then `save!` without handling concurrent insert)
+- `find_or_create_by` on columns without unique DB index — concurrent calls can create duplicates
+- Status transitions that don't use atomic `WHERE old_status = ? UPDATE SET new_status` — concurrent updates can skip or double-apply transitions
+- `html_safe` on user-controlled data (XSS) — check any `.html_safe`, `raw()`, or string interpolation into `html_safe` output
+
+#### LLM Output Trust Boundary
+- LLM-generated values (emails, URLs, names) written to DB or passed to mailers without format validation. Add lightweight guards (`EMAIL_REGEXP`, `URI.parse`, `.strip`) before persisting.
+- Structured tool output (arrays, hashes) accepted without type/shape checks before database writes.
+
+### Pass 2 — INFORMATIONAL
+
+#### Conditional Side Effects
+- Code paths that branch on a condition but forget to apply a side effect on one branch. Example: item promoted to verified but URL only attached when a secondary condition is true — the other branch promotes without the URL, creating an inconsistent record.
+- Log messages that claim an action happened but the action was conditionally skipped. The log should reflect what actually occurred.
+
+#### Magic Numbers & String Coupling
+- Bare numeric literals used in multiple files — should be named constants documented together
+- Error message strings used as query filters elsewhere (grep for the string — is anything matching on it?)
+
+#### Dead Code & Consistency
+- Variables assigned but never read
+- Version mismatch between PR title and VERSION/CHANGELOG files
+- CHANGELOG entries that describe changes inaccurately (e.g., "changed from X to Y" when X never existed)
+- Comments/docstrings that describe old behavior after the code changed
+
+#### LLM Prompt Issues
+- 0-indexed lists in prompts (LLMs reliably return 1-indexed)
+- Prompt text listing available tools/capabilities that don't match what's actually wired up in the `tool_classes`/`tools` array
+- Word/token limits stated in multiple places that could drift
+
+#### Test Gaps
+- Negative-path tests that assert type/status but not the side effects (URL attached? field populated? callback fired?)
+- Assertions on string content without checking format (e.g., asserting title present but not URL format)
+- `.expects(:something).never` missing when a code path should explicitly NOT call an external service
+- Security enforcement features (blocking, rate limiting, auth) without integration tests verifying the enforcement path works end-to-end
+
+#### Crypto & Entropy
+- Truncation of data instead of hashing (last N chars instead of SHA-256) — less entropy, easier collisions
+- `rand()` / `Random.rand` for security-sensitive values — use `SecureRandom` instead
+- Non-constant-time comparisons (`==`) on secrets or tokens — vulnerable to timing attacks
+
+#### Time Window Safety
+- Date-key lookups that assume "today" covers 24h — report at 8am PT only sees midnight→8am under today's key
+- Mismatched time windows between related features — one uses hourly buckets, another uses daily keys for the same data
+
+#### Type Coercion at Boundaries
+- Values crossing Ruby→JSON→JS boundaries where type could change (numeric vs string) — hash/digest inputs must normalize types
+- Hash/digest inputs that don't call `.to_s` or equivalent before serialization — `{ cores: 8 }` vs `{ cores: "8" }` produce different hashes
+
+#### View/Frontend
+- Inline `<style>` blocks in partials (re-parsed every render)
+- O(n*m) lookups in views (`Array#find` in a loop instead of `index_by` hash)
+- Ruby-side `.select{}` filtering on DB results that could be a `WHERE` clause (unless intentionally avoiding leading-wildcard `LIKE`)
+
+---
+
+## Gate Classification
+
+```
+CRITICAL (blocks /ship):          INFORMATIONAL (in PR body):
+├─ SQL & Data Safety              ├─ Conditional Side Effects
+├─ Race Conditions & Concurrency  ├─ Magic Numbers & String Coupling
+└─ LLM Output Trust Boundary      ├─ Dead Code & Consistency
+                                   ├─ LLM Prompt Issues
+                                   ├─ Test Gaps
+                                   ├─ Crypto & Entropy
+                                   ├─ Time Window Safety
+                                   ├─ Type Coercion at Boundaries
+                                   └─ View/Frontend
+```
+
+---
+
+## Suppressions — DO NOT flag these
+
+- "X is redundant with Y" when the redundancy is harmless and aids readability (e.g., `present?` redundant with `length > 20`)
+- "Add a comment explaining why this threshold/constant was chosen" — thresholds change during tuning, comments rot
+- "This assertion could be tighter" when the assertion already covers the behavior
+- Suggesting consistency-only changes (wrapping a value in a conditional to match how another constant is guarded)
+- "Regex doesn't handle edge case X" when the input is constrained and X never occurs in practice
+- "Test exercises multiple guards simultaneously" — that's fine, tests don't need to isolate every guard
+- Eval threshold changes (max_actionable, min scores) — these are tuned empirically and change constantly
+- Harmless no-ops (e.g., `.reject` on an element that's never in the array)
+- ANYTHING already addressed in the diff you're reviewing — read the FULL diff before commenting