2 months ago · 508e34a233
--- a/README.md
+++ b/README.md
@@ -1,11 +1,29 @@
 
				+<div align="right">
			
 
				+
			
 
				+**[English](README_EN.md)** | 中文
			
 
				+
			
 
				+</div>
			
 
				+
			
 
				 ![达尔文.skill](assets/banner.svg)
			
 
				 
			
 
				+<div align="center">
			
 
				+
			
 
				 # 达尔文.skill
			
 
				 
			
 
				 **Optimize your Claude Code Skills the way you train models.**
			
 
				 
			
 
				 Inspired by [Andrej Karpathy's autoresearch](https://github.com/karpathy/autoresearch), this project moves the autonomous experiment loop from model training to Skill optimization. A ratchet that only turns forward.
			
 
				 
			
 
				+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
			
 
				+[![Claude Code](https://img.shields.io/badge/Claude%20Code-Skill-blueviolet)](https://claude.ai/code)
			
 
				+[![Skills](https://img.shields.io/badge/skills.sh-Compatible-green)](https://skills.sh)
			
 
				+
			
 
				+```
			
 
				+npx skills add alchaincyf/darwin-skill
			
 
				+```
			
 
				+
			
 
				+</div>
			
 
				+
			
 
				 ---
			
 
				 
			
 
				 ## The Core Loop
			
@@ -39,8 +57,6 @@ The Claude Code Skill ecosystem is expanding fast. When you have 10 Skills you can
 
				 | `test set` | test-prompts.json | Validates whether improvements are real |
			
 
				 | Fully autonomous | **Human in the loop** | Skill quality is subtler than loss and needs human judgment |
			
 
				 
			
 
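				-The key difference: autoresearch runs fully autonomously (loss can be compared automatically), while Skill optimization adds a **human in the loop**, because how "good" a Skill is cannot be judged purely by a number the way loss can.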
			
 
				-
			
 
				 ---
			
 
				 
			
 
				 ## Five Core Principles
			
@@ -94,43 +110,13 @@ The Claude Code Skill ecosystem is expanding fast. When you have 10 Skills you can
 
				 
			
 
				 ## Quick Start
			
 
				 
			
 
				-### Installation
			
 
				-
			
 
				 ```bash
			
 
				-# Put SKILL.md into the Claude Code Skills directory
			
 
				-mkdir -p ~/.claude/skills/darwin-skill
			
 
				-cp SKILL.md ~/.claude/skills/darwin-skill/SKILL.md
			
 
				-```
			
 
				-
			
 
				-### Usage
			
 
				-
			
 
				+npx skills add alchaincyf/darwin-skill
			
 
				 ```
			
 
				-# Evaluate all Skills (evaluate only, no changes)
			
 
				-> evaluate all skills
			
 
				 
			
 
				-# Optimize a specific Skill
			
 
				-> optimize the huashu-slides skill
			
 
				+After installation, just tell Claude Code "optimize all skills" or "optimize a specific skill".
			
 
				 
			
 
				-# Full optimization (recommended for first use)
			
 
				-> optimize all skills
			
 
				-
			
 
				-# View history
			
 
				-> show the skill optimization history
			
 
				-```
			
 
				-
			
 
				-### Sample Output
			
 
				-
			
 
				-```
			
 
				-┌──────────────────────────┬────────┬────────┬────────┐
			
 
				-│ Skill                    │ Before │ After  │ Δ      │
			
 
				-├──────────────────────────┼────────┼────────┼────────┤
			
 
				-│ huashu-proofreading      │ 78     │ 87     │ +9     │
			
 
				-│ huashu-slides            │ 72     │ 83     │ +11    │
			
 
				-│ huashu-publish           │ 81     │ 88     │ +7     │
			
 
				-├──────────────────────────┼────────┼────────┼────────┤
			
 
				-│ Average                  │ 77     │ 86     │ +9     │
			
 
				-└──────────────────────────┴────────┴────────┴────────┘
			
 
				-```
			
 
				+If you cannot access GitHub, download the zip directly: [darwin-skill.zip](https://pub-161ae4b5ed0644c4a43b5c6412287e03.r2.dev/skills/darwin-skill.zip), then extract it and place SKILL.md in the `~/.claude/skills/darwin-skill/` directory.
			
 
				 
			
 
				 ---
			
 
				 
			
@@ -138,54 +124,37 @@ cp SKILL.md ~/.claude/skills/darwin-skill/SKILL.md
 
				 
			
 
				 The design of this project is directly inspired by **Andrej Karpathy's [autoresearch](https://github.com/karpathy/autoresearch)**.
			
 
				 
			
 
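				-autoresearch proved an elegant idea: you can turn "writing papers" into an autonomous experiment loop. Define the goal (`program.md`), let an agent keep generating and testing changes (`train.py`), and use a quantifiable metric (`val_bpb`) to decide whether to keep or roll back.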
			
 
				-
			
 
				-darwin.skill brings the same idea to Claude Code Skill optimization. The differences are:
			
 
				-
			
 
				-1. **Evaluation is more complex**: it takes a weighted score across 8 dimensions; a single number cannot capture it
			
 
				-2. **Live testing is required**: the structure score is only half; the other half requires running real prompts and checking the results
			
 
				-3. **Human in the loop**: whether a Skill is "good" is subjective, so a human makes the final call
			
 
				-
			
 
				 The core mechanism is identical: **keep only measurable improvements, revert everything else.**
			
 
				 
			
 
				 ---
			
 
				 
			
 
				-## Constraints
			
 
				+## About the Author
			
 
				 
			
 
				-1. Do not change the Skill's core function or purpose
			
 
				-2. Do not introduce new dependencies
			
 
				-3. Change only one dimension per round, so that every change stays attributable
			
 
				-4. The optimized SKILL.md must not exceed 150% of its original size
			
 
				-5. All changes live on a git branch; roll back with git revert
			
 
				-6. Effectiveness dimensions must be scored by a sub-agent; the editor must not score its own changes
			
 
				+| | |
			
 
				+|:---|:---|
			
 
				+| 🌐 Website | [bookai.top](https://bookai.top) · [huasheng.ai](https://www.huasheng.ai) |
			
 
				+| 𝕏 Twitter | [@AlchainHust](https://x.com/AlchainHust) |
			
 
				+| 📺 Bilibili | [花叔](https://space.bilibili.com/14097567) |
			
 
				+| ▶️ YouTube | [@Alchain](https://www.youtube.com/@Alchain) |
			
 
				+| 📕 Xiaohongshu | [花叔](https://www.xiaohongshu.com/user/profile/5abc6f17e8ac2b109179dfdf) |
			
 
				+| 💬 WeChat Official Account | Search "花叔" on WeChat |
			
 
				 
			
 
				 ---
			
 
				 
			
 
				-## File Structure
			
 
				+## License
			
 
				 
			
 
				-```
			
 
				-darwin-skill/
			
 
				-├── README.md              # The file you are reading
			
 
				-├── SKILL.md               # Core: evaluation criteria + optimization flow + constraints
			
 
				-├── showcase.html          # Pentagram-style visual showcase page (can be opened locally)
			
 
				-├── docs/                  # GitHub Pages (accessible once the repo is public)
			
 
				-│   └── index.html
			
 
				-└── assets/
			
 
				-    ├── banner.svg         # README banner image
			
 
				-    ├── chart-loop.png     # Core loop flowchart
			
 
				-    ├── chart-rubric.png   # 8-dimension evaluation rubric
			
 
				-    ├── chart-phases.png   # 5-phase optimization timeline
			
 
				-    └── chart-ratchet.png  # Ratchet mechanism visualization
			
 
				-```
			
 
				+MIT
			
 
				 
			
 
				 ---
			
 
				 
			
 
				-## Acknowledgements
			
 
				+<div align="center">
			
 
				 
			
 
				-- [Andrej Karpathy](https://github.com/karpathy)'s [autoresearch](https://github.com/karpathy/autoresearch) provided the core design inspiration
			
 
				-- The Skill ecosystem of [Claude Code](https://claude.ai/code) provided the optimization scenario
			
 
				-- [花叔](https://x.com/AlchainHust)'s practice with 60+ Skills provided a real-world test environment
			
 
				+**[Nuwa](https://github.com/alchaincyf/nuwa-skill)** creates Skills.<br>
			
 
				+**Darwin** makes Skills evolve.<br><br>
			
 
				+*Keep only improvements, and time is on your side.*
			
 
				 
			
 
				----
			
 
				+<br>
			
 
				+
			
 
				+MIT License © [花叔 Huashu](https://github.com/alchaincyf)
			
 
				 
			
 
				-**License**: MIT
			
 
				+</div>
			
--- a/README_EN.md
+++ b/README_EN.md
@@ -0,0 +1,162 @@
 
				+<div align="right">
			
 
				+
			
 
				+English | **[中文](README.md)**
			
 
				+
			
 
				+</div>
			
 
				+
			
 
				+![darwin.skill](assets/banner.svg)
			
 
				+
			
 
				+<div align="center">
			
 
				+
			
 
				+# darwin.skill
			
 
				+
			
 
				+**Optimize your Claude Code Skills the way you train models.**
			
 
				+
			
 
				+Inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch). Autonomous experiment loops, applied to skill optimization. A ratchet that only turns forward.
			
 
				+
			
 
				+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
			
 
				+[![Claude Code](https://img.shields.io/badge/Claude%20Code-Skill-blueviolet)](https://claude.ai/code)
			
 
				+[![Skills](https://img.shields.io/badge/skills.sh-Compatible-green)](https://skills.sh)
			
 
				+
			
 
				+```
			
 
				+npx skills add alchaincyf/darwin-skill
			
 
				+```
			
 
				+
			
 
				+</div>
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## The Core Loop
			
 
				+
			
 
				+![Core Loop](assets/chart-loop.png)
			
 
				+
			
 
				+Evaluate → Improve → Test → Human Confirm → Keep or Revert. Repeat.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## Why This Exists
			
 
				+
			
 
				+When you have 10 skills, you can maintain them by hand. When you have 60+, you need a system.
			
 
				+
			
 
				+Traditional skill review is purely structural: does the frontmatter look right? Are the steps numbered? Do the file paths exist? But a perfectly formatted skill can still produce terrible output.
			
 
				+
			
 
				+darwin.skill evaluates both **structure** and **real-world effectiveness**, then keeps only the changes that actually improve things.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## From autoresearch to Skill Optimization
			
 
				+
			
 
				+This project maps Karpathy's autoresearch directly onto skill optimization:
			
 
				+
			
 
				+| autoresearch | darwin.skill | Why |
			
 
				+|:---|:---|:---|
			
 
				+| `program.md` | This SKILL.md | Defines evaluation criteria and constraints |
			
 
				+| `train.py` | Each target SKILL.md | The single editable asset per experiment |
			
 
				+| `val_bpb` | 8-dimension weighted score (max 100) | Quantifiable optimization target |
			
 
				+| `git ratchet` | keep / revert mechanism | Only improving commits survive |
			
 
				+| `test set` | test-prompts.json | Validates whether improvements are real |
			
 
				+| Fully autonomous | **Human in the loop** | Skill quality is more subjective than loss |
			
 
				+
			
 
				+The key difference: autoresearch is fully autonomous (loss is just a number). Skill quality sometimes needs human judgment. So darwin.skill pauses after each skill's optimization cycle, shows you the diff and score delta, and waits for your confirmation.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## Five Core Principles
			
 
				+
			
 
				+| # | Principle | Details |
			
 
				+|:---|:---|:---|
			
 
				+| 01 | **Single editable asset** | One SKILL.md per experiment. One change, one measurement, one decision |
			
 
				+| 02 | **Dual evaluation** | Structure scoring (static analysis) + effectiveness scoring (live test execution) |
			
 
				+| 03 | **Ratchet mechanism** | Score can only go up. Regressions are auto-reverted |
			
 
				+| 04 | **Independent scoring** | The agent that edits is never the agent that scores |
			
 
				+| 05 | **Human in the loop** | System pauses after each skill. You review, then continue |
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## 8-Dimension Evaluation Rubric
			
 
				+
			
 
				+Total: 100 points. Structure (60) + Effectiveness (40).
			
 
				+
			
 
				+![Evaluation Rubric](assets/chart-rubric.png)
			
 
				+
			
 
				+> Live test performance has the highest weight (25 points). A beautifully written skill that produces bad output is still a bad skill.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## The Optimization Cycle
			
 
				+
			
 
				+Five phases. Only one is the core.
			
 
				+
			
 
				+![Optimization Lifecycle](assets/chart-phases.png)
			
 
				+
			
 
				+**Phase 2 (the heart):**
			
 
				+
			
 
				+1. Find the lowest-scoring dimension
			
 
				+2. Generate one targeted improvement
			
 
				+3. Edit SKILL.md, git commit
			
 
				+4. Independent sub-agent re-scores
			
 
				+5. Score up → keep. Score down → git revert
			
 
				+6. Pause. Show diff + score delta. Wait for human confirmation
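Reviewer note: the six Phase 2 steps above can be sketched as a single round in Python. This is a minimal illustration, not the logic in SKILL.md. The names `lowest_dimension`, `run_round`, `toy_improve`, and `toy_rescore`, and the dimension names `clarity` and `examples`, are all hypothetical. The sketch assumes the total score is the plain sum of per-dimension scores and that a tie counts as "no improvement"; the README only says score up means keep and score down means revert. The git commit/revert and the human confirmation pause (steps 3, 5, and 6) are reduced to returning either the candidate or the original text.

```python
from typing import Callable, Dict, Tuple

Scores = Dict[str, float]


def lowest_dimension(scores: Scores) -> str:
    """Step 1: pick the dimension with the lowest score."""
    return min(scores, key=lambda name: scores[name])


def run_round(
    skill_text: str,
    scores: Scores,
    improve: Callable[[str, str], str],
    rescore: Callable[[str], Scores],
) -> Tuple[str, Scores, bool]:
    """Run one round and return (text, scores, kept)."""
    target = lowest_dimension(scores)
    candidate = improve(skill_text, target)  # steps 2-3: one targeted edit
    new_scores = rescore(candidate)          # step 4: independent re-scoring
    # Step 5: keep only a strict improvement (assumption: ties are reverted).
    if sum(new_scores.values()) > sum(scores.values()):
        return candidate, new_scores, True
    return skill_text, scores, False


# Toy stand-ins for the editing agent and the scoring sub-agent (hypothetical).
def toy_improve(text: str, dimension: str) -> str:
    return text + f"\n<!-- improved: {dimension} -->"


def toy_rescore(text: str) -> Scores:
    return {"clarity": 12.0, "examples": 9.0}


before = {"clarity": 12.0, "examples": 6.0}
text, after, kept = run_round("# demo skill", before, toy_improve, toy_rescore)
print(kept, after)
```

Passing `improve` and `rescore` in as separate callables mirrors the "independent scoring" principle: the function that edits never computes the score.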
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## The Ratchet
			
 
				+
			
 
				+Scores can only go up. Failed experiments are cleanly reverted. No regressions accumulate over time.
			
 
				+
			
 
				+![Ratchet Mechanism](assets/chart-ratchet.png)
			
 
				+
			
 
				+Round 2 scored 75, below the current best of 78. Auto-reverted. Effective baseline stays at 78. Subsequent improvements build from 78, not 75.
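Reviewer note: the ratchet example above (a round scoring 75 against a best of 78) can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name `ratchet` is hypothetical, and the third score of 81 is an illustrative value that does not come from the README. It tracks only the effective baseline, not the git history.

```python
from typing import Iterable, List


def ratchet(scores: Iterable[float]) -> List[float]:
    """Return the effective baseline after each round.

    A round that scores at or below the current best is treated as reverted,
    so the baseline never decreases.
    """
    baselines: List[float] = []
    best = float("-inf")
    for score in scores:
        if score > best:  # improvement: keep it as the new baseline
            best = score
        baselines.append(best)  # regression: baseline stays where it was
    return baselines


# 78 is kept, 75 is reverted, 81 (illustrative) builds on 78 rather than 75.
print(ratchet([78, 75, 81]))  # prints [78, 78, 81]
```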
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## Quick Start
			
 
				+
			
 
				+```bash
			
 
				+npx skills add alchaincyf/darwin-skill
			
 
				+```
			
 
				+
			
 
				+After installation, tell Claude Code: "optimize all skills" or "optimize [skill-name]".
			
 
				+
			
 
				+Can't access GitHub? Download the zip: [darwin-skill.zip](https://pub-161ae4b5ed0644c4a43b5c6412287e03.r2.dev/skills/darwin-skill.zip). Extract and place SKILL.md in `~/.claude/skills/darwin-skill/`.
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## Design Inspiration
			
 
				+
			
 
				+Directly inspired by **Andrej Karpathy's [autoresearch](https://github.com/karpathy/autoresearch)**.
			
 
				+
			
 
				+The core mechanism is identical: **keep only measurable improvements, revert everything else.**
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## About the Author
			
 
				+
			
 
				+| | |
			
 
				+|:---|:---|
			
 
				+| 🌐 Website | [bookai.top](https://bookai.top) · [huasheng.ai](https://www.huasheng.ai) |
			
 
				+| 𝕏 Twitter | [@AlchainHust](https://x.com/AlchainHust) |
			
 
				+| 📺 Bilibili | [花叔](https://space.bilibili.com/14097567) |
			
 
				+| ▶️ YouTube | [@Alchain](https://www.youtube.com/@Alchain) |
			
 
				+| 📕 Xiaohongshu | [花叔](https://www.xiaohongshu.com/user/profile/5abc6f17e8ac2b109179dfdf) |
			
 
				+| 💬 WeChat | Search "花叔" |
			
 
				+
			
 
				+---
			
 
				+
			
 
				+## License
			
 
				+
			
 
				+MIT
			
 
				+
			
 
				+---
			
 
				+
			
 
				+<div align="center">
			
 
				+
			
 
				+**[Nuwa](https://github.com/alchaincyf/nuwa-skill)** creates skills.<br>
			
 
				+**Darwin** makes them evolve.<br><br>
			
 
				+*Keep only improvements. Time is on your side.*
			
 
				+
			
 
				+<br>
			
 
				+
			
 
				+MIT License © [花叔 Huashu](https://github.com/alchaincyf)
			
 
				+
			
 
				+</div>