mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-20 19:29:56 +08:00
feat: add 7 specialist checklist files for Review Army
- testing.md (always-on): coverage gaps, flaky patterns, security enforcement - maintainability.md (always-on): dead code, DRY, stale comments - security.md (conditional): OWASP deep analysis, auth bypass, injection - performance.md (conditional): N+1 queries, bundle impact, complexity - data-migration.md (conditional): reversibility, lock duration, backfill - api-contract.md (conditional): breaking changes, versioning, error format - red-team.md (conditional): adversarial analysis, cross-cutting concerns All use standard header with JSON output schema and NO FINDINGS fallback.
This commit is contained in:
48
review/specialists/api-contract.md
Normal file
48
review/specialists/api-contract.md
Normal file
@@ -0,0 +1,48 @@
|
|||||||
|
# API Contract Specialist Review Checklist
|
||||||
|
|
||||||
|
Scope: When SCOPE_API=true
|
||||||
|
Output: JSON objects, one finding per line. Schema:
|
||||||
|
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"api-contract","summary":"...","fix":"...","fingerprint":"path:line:api-contract","specialist":"api-contract"}
|
||||||
|
If no findings: output `NO FINDINGS` and nothing else.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Categories
|
||||||
|
|
||||||
|
### Breaking Changes
|
||||||
|
- Removed fields from response bodies (clients may depend on them)
|
||||||
|
- Changed field types (string → number, object → array)
|
||||||
|
- New required parameters added to existing endpoints
|
||||||
|
- Changed HTTP methods (GET → POST) or status codes (200 → 201)
|
||||||
|
- Renamed endpoints without maintaining the old path as a redirect/alias
|
||||||
|
- Changed authentication requirements (public → authenticated)
|
||||||
|
|
||||||
|
### Versioning Strategy
|
||||||
|
- Breaking changes made without a version bump (v1 → v2)
|
||||||
|
- Multiple versioning strategies mixed in the same API (URL vs header vs query param)
|
||||||
|
- Deprecated endpoints without a sunset timeline or migration guide
|
||||||
|
- Version-specific logic scattered across controllers instead of centralized
|
||||||
|
|
||||||
|
### Error Response Consistency
|
||||||
|
- New endpoints returning different error formats than existing ones
|
||||||
|
- Error responses missing standard fields (error code, message, details)
|
||||||
|
- HTTP status codes that don't match the error type (200 for errors, 500 for validation)
|
||||||
|
- Error messages that leak internal implementation details (stack traces, SQL)
|
||||||
|
|
||||||
|
### Rate Limiting & Pagination
|
||||||
|
- New endpoints missing rate limiting when similar endpoints have it
|
||||||
|
- Pagination changes (offset → cursor) without backwards compatibility
|
||||||
|
- Changed page sizes or default limits without documentation
|
||||||
|
- Missing total count or next-page indicators in paginated responses
|
||||||
|
|
||||||
|
### Documentation Drift
|
||||||
|
- OpenAPI/Swagger spec not updated to match new endpoints or changed params
|
||||||
|
- README or API docs describing old behavior after changes
|
||||||
|
- Example requests/responses that no longer work
|
||||||
|
- Missing documentation for new endpoints or changed parameters
|
||||||
|
|
||||||
|
### Backwards Compatibility
|
||||||
|
- Clients on older versions: will they break?
|
||||||
|
- Mobile apps that can't force-update: does the API still work for them?
|
||||||
|
- Webhook payloads changed without notifying subscribers
|
||||||
|
- SDK or client library changes needed to use new features
|
||||||
47
review/specialists/data-migration.md
Normal file
47
review/specialists/data-migration.md
Normal file
@@ -0,0 +1,47 @@
|
|||||||
|
# Data Migration Specialist Review Checklist
|
||||||
|
|
||||||
|
Scope: When SCOPE_MIGRATIONS=true
|
||||||
|
Output: JSON objects, one finding per line. Schema:
|
||||||
|
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"data-migration","summary":"...","fix":"...","fingerprint":"path:line:data-migration","specialist":"data-migration"}
|
||||||
|
If no findings: output `NO FINDINGS` and nothing else.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Categories
|
||||||
|
|
||||||
|
### Reversibility
|
||||||
|
- Can this migration be rolled back without data loss?
|
||||||
|
- Is there a corresponding down/rollback migration?
|
||||||
|
- Does the rollback actually undo the change or just no-op?
|
||||||
|
- Would rolling back break the current application code?
|
||||||
|
|
||||||
|
### Data Loss Risk
|
||||||
|
- Dropping columns that still contain data (add deprecation period first)
|
||||||
|
- Changing column types that truncate data (varchar(255) → varchar(50))
|
||||||
|
- Removing tables without verifying no code references them
|
||||||
|
- Renaming columns without updating all references (ORM, raw SQL, views)
|
||||||
|
- NOT NULL constraints added to columns with existing NULL values (needs backfill first)
|
||||||
|
|
||||||
|
### Lock Duration
|
||||||
|
- ALTER TABLE on large tables without CONCURRENTLY (PostgreSQL)
|
||||||
|
- Adding indexes without CONCURRENTLY on tables with >100K rows
|
||||||
|
- Multiple ALTER TABLE statements that could be combined into one lock acquisition
|
||||||
|
- Schema changes that acquire exclusive locks during peak traffic hours
|
||||||
|
|
||||||
|
### Backfill Strategy
|
||||||
|
- New NOT NULL columns without DEFAULT value (requires backfill before constraint)
|
||||||
|
- New columns with computed defaults that need batch population
|
||||||
|
- Missing backfill script or rake task for existing records
|
||||||
|
- Backfill that updates all rows at once instead of batching (locks table)
|
||||||
|
|
||||||
|
### Index Creation
|
||||||
|
- CREATE INDEX without CONCURRENTLY on production tables
|
||||||
|
- Duplicate indexes (new index covers same columns as existing one)
|
||||||
|
- Missing indexes on new foreign key columns
|
||||||
|
- Partial indexes where a full index would be more useful (or vice versa)
|
||||||
|
|
||||||
|
### Multi-Phase Safety
|
||||||
|
- Migrations that must be deployed in a specific order with application code
|
||||||
|
- Schema changes that break the current running code (deploy code first, then migrate)
|
||||||
|
- Migrations that assume a deploy boundary (old code + new schema = crash)
|
||||||
|
- Missing feature flag to handle mixed old/new code during rolling deploy
|
||||||
45
review/specialists/maintainability.md
Normal file
45
review/specialists/maintainability.md
Normal file
@@ -0,0 +1,45 @@
|
|||||||
|
# Maintainability Specialist Review Checklist
|
||||||
|
|
||||||
|
Scope: Always-on (every review)
|
||||||
|
Output: JSON objects, one finding per line. Schema:
|
||||||
|
{"severity":"INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"maintainability","summary":"...","fix":"...","fingerprint":"path:line:maintainability","specialist":"maintainability"}
|
||||||
|
If no findings: output `NO FINDINGS` and nothing else.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Categories
|
||||||
|
|
||||||
|
### Dead Code & Unused Imports
|
||||||
|
- Variables assigned but never read in the changed files
|
||||||
|
- Functions/methods defined but never called (check with Grep across the repo)
|
||||||
|
- Imports/requires that are no longer referenced after the change
|
||||||
|
- Commented-out code blocks (either remove or explain why they exist)
|
||||||
|
|
||||||
|
### Magic Numbers & String Coupling
|
||||||
|
- Bare numeric literals used in logic (thresholds, limits, retry counts) — should be named constants
|
||||||
|
- Error message strings used as query filters or conditionals elsewhere
|
||||||
|
- Hardcoded URLs, ports, or hostnames that should be config
|
||||||
|
- Duplicated literal values across multiple files
|
||||||
|
|
||||||
|
### Stale Comments & Docstrings
|
||||||
|
- Comments that describe old behavior after the code was changed in this diff
|
||||||
|
- TODO/FIXME comments that reference completed work
|
||||||
|
- Docstrings with parameter lists that don't match the current function signature
|
||||||
|
- ASCII diagrams in comments that no longer match the code flow
|
||||||
|
|
||||||
|
### DRY Violations
|
||||||
|
- Similar code blocks (3+ lines) appearing multiple times within the diff
|
||||||
|
- Copy-paste patterns where a shared helper would be cleaner
|
||||||
|
- Configuration or setup logic duplicated across test files
|
||||||
|
- Repeated conditional chains that could be a lookup table or map
|
||||||
|
|
||||||
|
### Conditional Side Effects
|
||||||
|
- Code paths that branch on a condition but forget a side effect on one branch
|
||||||
|
- Log messages that claim an action happened but the action was conditionally skipped
|
||||||
|
- State transitions where one branch updates related records but the other doesn't
|
||||||
|
- Event emissions that only fire on the happy path, missing error/edge paths
|
||||||
|
|
||||||
|
### Module Boundary Violations
|
||||||
|
- Reaching into another module's internal implementation (accessing private-by-convention methods)
|
||||||
|
- Direct database queries in controllers/views that should go through a service/model
|
||||||
|
- Tight coupling between components that should communicate through interfaces
|
||||||
51
review/specialists/performance.md
Normal file
51
review/specialists/performance.md
Normal file
@@ -0,0 +1,51 @@
|
|||||||
|
# Performance Specialist Review Checklist
|
||||||
|
|
||||||
|
Scope: When SCOPE_BACKEND=true OR SCOPE_FRONTEND=true
|
||||||
|
Output: JSON objects, one finding per line. Schema:
|
||||||
|
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"performance","summary":"...","fix":"...","fingerprint":"path:line:performance","specialist":"performance"}
|
||||||
|
If no findings: output `NO FINDINGS` and nothing else.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Categories
|
||||||
|
|
||||||
|
### N+1 Queries
|
||||||
|
- ActiveRecord/ORM associations traversed in loops without eager loading (.includes, joinedload, include)
|
||||||
|
- Database queries inside iteration blocks (each, map, forEach) that could be batched
|
||||||
|
- Nested serializers that trigger lazy-loaded associations
|
||||||
|
- GraphQL resolvers that query per-field instead of batching (check for DataLoader usage)
|
||||||
|
|
||||||
|
### Missing Database Indexes
|
||||||
|
- New WHERE clauses on columns without indexes (check migration files or schema)
|
||||||
|
- New ORDER BY on non-indexed columns
|
||||||
|
- Composite queries (WHERE a AND b) without composite indexes
|
||||||
|
- Foreign key columns added without indexes
|
||||||
|
|
||||||
|
### Algorithmic Complexity
|
||||||
|
- O(n^2) or worse patterns: nested loops over collections, Array.find inside Array.map
|
||||||
|
- Repeated linear searches that could use a hash/map/set lookup
|
||||||
|
- String concatenation in loops (use join or StringBuilder)
|
||||||
|
- Sorting or filtering large collections multiple times when once would suffice
|
||||||
|
|
||||||
|
### Bundle Size Impact (Frontend)
|
||||||
|
- New production dependencies that are known-heavy (moment.js, lodash full, jquery)
|
||||||
|
- Barrel imports (import from 'library') instead of deep imports (import from 'library/specific')
|
||||||
|
- Large static assets (images, fonts) committed without optimization
|
||||||
|
- Missing code splitting for route-level chunks
|
||||||
|
|
||||||
|
### Rendering Performance (Frontend)
|
||||||
|
- Fetch waterfalls: sequential API calls that could be parallel (Promise.all)
|
||||||
|
- Unnecessary re-renders from unstable references (new objects/arrays in render)
|
||||||
|
- Missing React.memo, useMemo, or useCallback on expensive computations
|
||||||
|
- Layout thrashing from reading then writing DOM properties in loops
|
||||||
|
- Missing loading="lazy" on below-fold images
|
||||||
|
|
||||||
|
### Missing Pagination
|
||||||
|
- List endpoints that return unbounded results (no LIMIT, no pagination params)
|
||||||
|
- Database queries without LIMIT that grow with data volume
|
||||||
|
- API responses that embed full nested objects instead of IDs with expansion
|
||||||
|
|
||||||
|
### Blocking in Async Contexts
|
||||||
|
- Synchronous I/O (file reads, subprocess, HTTP requests) inside async functions
|
||||||
|
- time.sleep() / Thread.sleep() inside event-loop-based handlers
|
||||||
|
- CPU-intensive computation blocking the main thread without worker offload
|
||||||
44
review/specialists/red-team.md
Normal file
44
review/specialists/red-team.md
Normal file
@@ -0,0 +1,44 @@
|
|||||||
|
# Red Team Review
|
||||||
|
|
||||||
|
Scope: When diff > 200 lines OR security specialist found CRITICAL findings. Runs AFTER other specialists.
|
||||||
|
Output: JSON objects, one finding per line. Schema:
|
||||||
|
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"red-team","summary":"...","fix":"...","fingerprint":"path:line:red-team","specialist":"red-team"}
|
||||||
|
If no findings: output `NO FINDINGS` and nothing else.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
This is NOT a checklist review. This is adversarial analysis.
|
||||||
|
|
||||||
|
You have access to the other specialists' findings (provided in your prompt). Your job is to find what they MISSED. Think like an attacker, a chaos engineer, and a hostile QA tester simultaneously.
|
||||||
|
|
||||||
|
## Approach
|
||||||
|
|
||||||
|
### 1. Attack the Happy Path
|
||||||
|
- What happens when the system is under 10x normal load?
|
||||||
|
- What happens when two requests hit the same resource simultaneously?
|
||||||
|
- What happens when the database is slow (>5s query time)?
|
||||||
|
- What happens when an external service returns garbage?
|
||||||
|
|
||||||
|
### 2. Find the Silent Failures
|
||||||
|
- Error handling that swallows exceptions (catch-all with just a log)
|
||||||
|
- Operations that can partially complete (3 of 5 items processed, then crash)
|
||||||
|
- State transitions that leave records in inconsistent states on failure
|
||||||
|
- Background jobs that fail without alerting anyone
|
||||||
|
|
||||||
|
### 3. Exploit Trust Assumptions
|
||||||
|
- Data validated on the frontend but not the backend
|
||||||
|
- Internal APIs called without authentication (assuming "only our code calls this")
|
||||||
|
- Configuration values assumed to be present but not validated
|
||||||
|
- File paths or URLs constructed from user input without sanitization
|
||||||
|
|
||||||
|
### 4. Break the Edge Cases
|
||||||
|
- What happens with the maximum possible input size?
|
||||||
|
- What happens with zero items, empty strings, null values?
|
||||||
|
- What happens on the first run ever (no existing data)?
|
||||||
|
- What happens when the user clicks the button twice in 100ms?
|
||||||
|
|
||||||
|
### 5. Find What the Other Specialists Missed
|
||||||
|
- Review each specialist's findings. What's the gap between their categories?
|
||||||
|
- Look for cross-category issues (e.g., a performance issue that's also a security issue)
|
||||||
|
- Look for issues at integration boundaries (where two systems meet)
|
||||||
|
- Look for issues that only manifest in specific deployment configurations
|
||||||
60
review/specialists/security.md
Normal file
60
review/specialists/security.md
Normal file
@@ -0,0 +1,60 @@
|
|||||||
|
# Security Specialist Review Checklist
|
||||||
|
|
||||||
|
Scope: When SCOPE_AUTH=true OR (SCOPE_BACKEND=true AND diff > 100 lines)
|
||||||
|
Output: JSON objects, one finding per line. Schema:
|
||||||
|
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"security","summary":"...","fix":"...","fingerprint":"path:line:security","specialist":"security"}
|
||||||
|
If no findings: output `NO FINDINGS` and nothing else.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
This checklist goes deeper than the main CRITICAL pass. The main agent already checks SQL injection, race conditions, LLM trust, and enum completeness. This specialist focuses on auth/authz patterns, cryptographic misuse, and attack surface expansion.
|
||||||
|
|
||||||
|
## Categories
|
||||||
|
|
||||||
|
### Input Validation at Trust Boundaries
|
||||||
|
- User input accepted without validation at controller/handler level
|
||||||
|
- Query parameters used directly in database queries or file paths
|
||||||
|
- Request body fields accepted without type checking or schema validation
|
||||||
|
- File uploads without type/size/content validation
|
||||||
|
- Webhook payloads processed without signature verification
|
||||||
|
|
||||||
|
### Auth & Authorization Bypass
|
||||||
|
- Endpoints missing authentication middleware (check route definitions)
|
||||||
|
- Authorization checks that default to "allow" instead of "deny"
|
||||||
|
- Role escalation paths (user can modify their own role/permissions)
|
||||||
|
- Direct object reference vulnerabilities (user A accesses user B's data by changing an ID)
|
||||||
|
- Session fixation or session hijacking opportunities
|
||||||
|
- Token/API key validation that doesn't check expiration
|
||||||
|
|
||||||
|
### Injection Vectors (beyond SQL)
|
||||||
|
- Command injection via subprocess calls with user-controlled arguments
|
||||||
|
- Template injection (Jinja2, ERB, Handlebars) with user input
|
||||||
|
- LDAP injection in directory queries
|
||||||
|
- SSRF via user-controlled URLs (fetch, redirect, webhook targets)
|
||||||
|
- Path traversal via user-controlled file paths (../../etc/passwd)
|
||||||
|
- Header injection via user-controlled values in HTTP headers
|
||||||
|
|
||||||
|
### Cryptographic Misuse
|
||||||
|
- Weak hashing algorithms (MD5, SHA1) for security-sensitive operations
|
||||||
|
- Predictable randomness (Math.random, rand()) for tokens or secrets
|
||||||
|
- Non-constant-time comparisons (==) on secrets, tokens, or digests
|
||||||
|
- Hardcoded encryption keys or IVs
|
||||||
|
- Missing salt in password hashing
|
||||||
|
|
||||||
|
### Secrets Exposure
|
||||||
|
- API keys, tokens, or passwords in source code (even in comments)
|
||||||
|
- Secrets logged in application logs or error messages
|
||||||
|
- Credentials in URLs (query parameters or basic auth in URL)
|
||||||
|
- Sensitive data in error responses returned to users
|
||||||
|
- PII stored in plaintext when encryption is expected
|
||||||
|
|
||||||
|
### XSS via Escape Hatches
|
||||||
|
- Rails: .html_safe, raw() on user-controlled data
|
||||||
|
- React: dangerouslySetInnerHTML with user content
|
||||||
|
- Vue: v-html with user content
|
||||||
|
- Django: |safe, mark_safe() on user input
|
||||||
|
- General: innerHTML assignment with unsanitized data
|
||||||
|
|
||||||
|
### Deserialization
|
||||||
|
- Deserializing untrusted data (pickle, Marshal, YAML.load, JSON.parse of executable types)
|
||||||
|
- Accepting serialized objects from user input or external APIs without schema validation
|
||||||
45
review/specialists/testing.md
Normal file
45
review/specialists/testing.md
Normal file
@@ -0,0 +1,45 @@
|
|||||||
|
# Testing Specialist Review Checklist
|
||||||
|
|
||||||
|
Scope: Always-on (every review)
|
||||||
|
Output: JSON objects, one finding per line. Schema:
|
||||||
|
{"severity":"CRITICAL|INFORMATIONAL","confidence":N,"path":"file","line":N,"category":"testing","summary":"...","fix":"...","fingerprint":"path:line:testing","specialist":"testing"}
|
||||||
|
If no findings: output `NO FINDINGS` and nothing else.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Categories
|
||||||
|
|
||||||
|
### Missing Negative-Path Tests
|
||||||
|
- New code paths that handle errors, rejections, or invalid input with NO corresponding test
|
||||||
|
- Guard clauses and early returns that are untested
|
||||||
|
- Error branches in try/catch, rescue, or error boundaries with no failure-path test
|
||||||
|
- Permission/auth checks that are asserted in code but never tested for the "denied" case
|
||||||
|
|
||||||
|
### Missing Edge-Case Coverage
|
||||||
|
- Boundary values: zero, negative, max-int, empty string, empty array, nil/null/undefined
|
||||||
|
- Single-element collections (off-by-one on loops)
|
||||||
|
- Unicode and special characters in user-facing inputs
|
||||||
|
- Concurrent access patterns with no race-condition test
|
||||||
|
|
||||||
|
### Test Isolation Violations
|
||||||
|
- Tests sharing mutable state (class variables, global singletons, DB records not cleaned up)
|
||||||
|
- Order-dependent tests (pass in sequence, fail when randomized)
|
||||||
|
- Tests that depend on system clock, timezone, or locale
|
||||||
|
- Tests that make real network calls instead of using stubs/mocks
|
||||||
|
|
||||||
|
### Flaky Test Patterns
|
||||||
|
- Timing-dependent assertions (sleep, setTimeout, waitFor with tight timeouts)
|
||||||
|
- Assertions on ordering of unordered results (hash keys, Set iteration, async resolution order)
|
||||||
|
- Tests that depend on external services (APIs, databases) without fallback
|
||||||
|
- Randomized test data without seed control
|
||||||
|
|
||||||
|
### Security Enforcement Tests Missing
|
||||||
|
- Auth/authz checks in controllers with no test for the "unauthorized" case
|
||||||
|
- Rate limiting logic with no test proving it actually blocks
|
||||||
|
- Input sanitization with no test for malicious input
|
||||||
|
- CSRF/CORS configuration with no integration test
|
||||||
|
|
||||||
|
### Coverage Gaps
|
||||||
|
- New public methods/functions with zero test coverage
|
||||||
|
- Changed methods where existing tests only cover the old behavior, not the new branch
|
||||||
|
- Utility functions called from multiple places but tested only indirectly
|
||||||
Reference in New Issue
Block a user