Purpose: Think through data flow across layers before implementing.
Most bugs happen at layer boundaries, not within layers.
Common cross-layer bugs:
Draw out how data moves:
Source → Transform → Store → Retrieve → Transform → Display
For each arrow, ask:
| Boundary | Common Issues |
|---|---|
| API ↔ Service | Type mismatches, missing fields |
| Service ↔ Database | Format conversions, null handling |
| Backend ↔ Frontend | Serialization, date formats |
| Component ↔ Component | Props shape changes |
For each boundary:
Bad: Assuming date format without checking
Good: Explicit format conversion at boundaries
Bad: Validating the same thing in multiple layers
Good: Validate once at the entry point
Bad: Component knows about database schema
Good: Each layer only knows its neighbors
Before implementation:
After implementation:
In Trellis, command templates (e.g., record-session.md) exist in multiple platforms with identical or near-identical content. This is a cross-layer boundary.
find src/templates/*/commands/trellis/ -name "<command>.*".md and TOML .toml)\\ vs \) and triple-quoted strings/trellis:check-cross-layer to verify nothing was missedReal-world example: Updated record-session.md in Claude to use --mode record, but forgot iFlow, Kilo, OpenCode, and Gemini — caught by cross-layer check.
Some generated files are both documentation and runtime input. In Trellis,
.trellis/workflow.md is parsed by get_context.py, workflow_phase.py,
SessionStart filters, and per-turn hooks. Template changes must be validated
against both fresh init and upgrade paths.
init output and a versioned update scenario that writes
the older .trellis/.versionReal-world example: Codex inline mode changed workflow platform markers from
[Codex] / [Kilo, Antigravity, Windsurf] to [codex-sub-agent] /
[codex-inline, Kilo, Antigravity, Windsurf]. Fresh init was correct, but
trellis update only merged [workflow-state:*] blocks and preserved stale
markers outside those blocks. Result: upgraded projects got new hook scripts
but old workflow routing, so get_context.py --mode phase --platform codex
could return empty Phase 2.1 detail.
When a CLI auto-detects a mode by probing a remote resource (e.g., checking if index.json exists to decide marketplace vs direct download):
-y, --flag combos)--template skipping picker) must have the same error-handling quality as the probed path — check that downstream functions don't call catch-all wrappersprovider:repo/path#ref not provider:repo#ref/path)Real-world example: Custom registry flow had 8 bugs across 3 review rounds: (1) probe only ran in interactive mode, (2) transient errors fell through to wrong mode, (3) giget URI had #ref in wrong position, (4) prefetched templates leaked across source switches, (5) --template shortcut bypassed probe but downloadTemplateById internally used catch-all fetchTemplateIndex, turning timeouts into "Template not found".
Real-world example: Agent-session update hints fetched npm latest metadata with response.read(4096) and then parsed it as complete JSON. The @mindfoldhq/trellis package metadata exceeded 4 KB, so the JSON was truncated, parse failed silently, and the first session injection showed no update hint. Fix: read the complete response before parsing, and add a regression where version is followed by an 8 KB metadata tail.
Create detailed flow docs when: