mirror of https://github.com/garrytan/gstack.git
synced 2026-05-17 01:31:26 +08:00
Switch LLM-as-judge evals from Haiku to Sonnet 4.6 for more stable, nuanced scoring. Add changelog entry for all eval improvements.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
7.6 KiB