mirror of
https://github.com/garrytan/gstack.git
synced 2026-05-17 17:51:27 +08:00
The LLM judge consistently scores the completeness of the command reference table at 3/5 because the table is a terse quick reference. Detailed argument docs live in the per-command sections, not in the summary table. The baseline already expects 3, so align the direct test threshold with it.
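The threshold alignment described above can be illustrated with a minimal sketch. This is not the repository's actual test code: the function name `failing_dimensions`, the `clarity` dimension, and every score other than the completeness value of 3 are hypothetical, chosen only to show how a per-dimension minimum interacts with a judge score.

```python
# Hypothetical sketch: check LLM-judge scores against per-dimension minimums.
# Names and values other than "completeness = 3" are illustrative assumptions.

def failing_dimensions(scores: dict[str, int], minimums: dict[str, int]) -> list[str]:
    """Return the dimensions whose score falls below the required minimum."""
    return [
        name
        for name, minimum in minimums.items()
        if scores.get(name, 0) < minimum  # a missing dimension counts as 0
    ]


# The judge rates the terse summary table 3/5 on completeness.
table_scores = {"clarity": 4, "completeness": 3}

# A stricter threshold (assumed to be 4 here) fails on completeness.
strict = failing_dimensions(table_scores, {"clarity": 4, "completeness": 4})

# Aligning the threshold with the baseline expectation of 3 passes.
aligned = failing_dimensions(table_scores, {"clarity": 4, "completeness": 3})

print(strict)
print(aligned)
```

With the minimum lowered to match the baseline, the same judge output no longer trips the direct test, while a genuine regression below 3 would still be reported.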
34 KiB