refactor: migrate to v5.1 SQLite-only schema

Breaking changes:
- Remove v4.0 compatibility code from state_manager.py
- EntityLinker now reads/writes from index.db aliases table
- Remove alias_index and entities_v3 initialization from init_project.py
- Delete deprecated scripts: extract_entities.py, structured_index.py, stress_test_*.py

Updated docs:
- entity-management-spec.md: v5.0 → v5.1, document SQLite schema
- tag-specification.md: v5.0 → v5.1, update alias resolution reference

Config changes:
- Remove alias_index_file property from config.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
lingfengQAQ 5 months ago
parent
commit
17c55ab027

+ 176 - 366
.claude/references/entity-management-spec.md

@@ -1,39 +1,92 @@
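 # Entity Management Specification
 
-> **Version**: 5.0
+> **Version**: 5.1
 > **Scope**: all entity types (角色/地点/物品/势力/招式, i.e. character/location/item/faction/skill)
 > **Core goals**: AI-driven entity extraction, alias management, version tracking
 
 ---
 
-## v5.0 Changes
+## v5.1 Changes
 
-1. **AI extraction replaces XML tags**: the Data Agent extracts entities semantically from the plain chapter text and no longer depends on `extract_entities.py`
-2. **One-to-many alias_index**: one alias can map to multiple entities; embedded in `state.json`
-3. **entities_v3 grouped format**: grouped by type (角色/地点/物品/势力/招式)
+1. **SQLite storage**: entities, aliases, state changes, and relationships move to `index.db`
+2. **Slimmed-down state.json**: keeps only progress, protagonist state, and pacing tracking (< 5KB)
+3. **AI extraction**: the Data Agent extracts entities semantically from the plain chapter text
 4. **Confidence-based disambiguation**: >0.8 adopted automatically, 0.5-0.8 adopted with a warning, <0.5 needs manual confirmation
-5. **No backward compatibility**: the legacy `entities` list format is not retained
-6. **Dual-agent architecture**: Context Agent (read) + Data Agent (write)
+5. **Dual-agent architecture**: Context Agent (read) + Data Agent (write)
 
-> **Note**: XML tags can still be used for manual annotation, but the v5.0 main flow no longer requires them.
+> **Note**: XML tags can still be used for manual annotation, but the main flow no longer requires them.
 
 ---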
 
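-## 1. Problem Analysis
-
-### 1.1 Current Problems
-
-1. **Alias problem**: the same character is addressed differently in different chapters
-   - Chapter 1: "废物" (derogatory nickname)
-   - Chapter 10: "林天" (real name)
-   - Chapter 50: "林宗主" (title by rank)
-   - Chapter 200: "不灭战神" (honorific title)
-
-2. **Create/update problem**: the current code uses `setdefault()`, which can only create entries, not update them
-
-3. **Version tracking problem**: attribute change history cannot be tracked
+## 1. Storage Architecture (v5.1)
+
+### 1.1 Data Distribution
+
+| Data type | Location | Notes |
+|---------|---------|------|
+| Entities (entities) | index.db | SQLite entities table |
+| Aliases (aliases) | index.db | SQLite aliases table (one-to-many) |
+| State changes | index.db | SQLite state_changes table |
+| Relationships | index.db | SQLite relationships table |
+| Chapter index | index.db | SQLite chapters table |
+| Scene index | index.db | SQLite scenes table |
+| Progress/config | state.json | Slimmed-down JSON (< 5KB) |
+| Protagonist state | state.json | protagonist_state snapshot |
+| Pacing tracking | state.json | strand_tracker |
+
+### 1.2 index.db Schema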
+
+```sql
+-- Entities table
+CREATE TABLE entities (
+    id TEXT PRIMARY KEY,
+    type TEXT NOT NULL,  -- 角色/地点/物品/势力/招式
+    canonical_name TEXT NOT NULL,
+    tier TEXT DEFAULT '装饰',  -- 核心/支线/装饰
+    desc TEXT,
+    current_json TEXT,  -- current state, stored as JSON
+    first_appearance INTEGER,
+    last_appearance INTEGER,
+    is_protagonist INTEGER DEFAULT 0,
+    created_at TEXT,
+    updated_at TEXT
+);
+
+-- Aliases table (one-to-many)
+CREATE TABLE aliases (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    alias TEXT NOT NULL,
+    entity_id TEXT NOT NULL,
+    entity_type TEXT NOT NULL,
+    UNIQUE(alias, entity_id)
+);
+
+-- State changes table
+CREATE TABLE state_changes (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    entity_id TEXT NOT NULL,
+    field TEXT NOT NULL,
+    old_value TEXT,
+    new_value TEXT,
+    reason TEXT,
+    chapter INTEGER,
+    created_at TEXT
+);
+
+-- Relationships table
+CREATE TABLE relationships (
+    id INTEGER PRIMARY KEY AUTOINCREMENT,
+    from_entity TEXT NOT NULL,
+    to_entity TEXT NOT NULL,
+    type TEXT NOT NULL,
+    description TEXT,
+    chapter INTEGER,
+    created_at TEXT,
+    UNIQUE(from_entity, to_entity, type)
+);
+```
 
-### 1.2 Characteristics by Entity Type
+### 1.3 Characteristics by Entity Type
 
 | Entity type | Alias complexity | Attribute changes | Hierarchy |
 |---------|-----------|---------|---------|
@@ -45,11 +98,66 @@
 
 ---
 
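-## 2. Tag System Design
+## 2. Processing Flow (v5.1)
+
+### 2.1 Automatic Extraction by the Data Agent
+
+```
+Chapter text
+    ↓
+Data Agent (AI semantic analysis)
+    ↓
+┌──────────────────────────────────────────────────────────────────┐
+│ 1. Identify entities that appear                                 │
+│    - Match existing entities (via the aliases table)             │
+│    - Detect new entities and generate a suggested_id             │
+│                                                                  │
+│ 2. Assess confidence                                             │
+│    ├─ > 0.8: adopt automatically                                 │
+│    ├─ 0.5-0.8: adopt with a warning                              │
+│    └─ < 0.5: flag for manual confirmation                        │
+│                                                                  │
+│ 3. Write to index.db                                             │
+│    - entities table: new entities / update appearance chapters   │
+│    - aliases table: register new aliases                         │
+│    - state_changes table: record attribute changes               │
+│    - relationships table: record new relationships               │
+│                                                                  │
+│ 4. Update state.json (slimmed down)                              │
+│    - protagonist_state: protagonist state snapshot               │
+│    - strand_tracker: pacing tracking                             │
+│    - disambiguation_warnings/pending: disambiguation log         │
+└──────────────────────────────────────────────────────────────────┘
+    ↓
+index.db updated
+```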
+
+### 2.2 Query Interface
+
+```bash
+# Look up an entity by id
+python -m data_modules.index_manager get-entity --id "xiaoyan" --project-root "."
 
-### 2.1 Create an Entity (`<entity>`)
+# List core entities
+python -m data_modules.index_manager get-core-entities --project-root "."
 
-Use on first appearance. Including an `id` attribute as the unique identifier is **recommended** (it simplifies later updates and alias additions); without an `id`, the script generates one automatically and registers `name/alias`.
+# Look up by alias
+python -m data_modules.index_manager get-by-alias --alias "萧炎" --project-root "."
+
+# Query state change history
+python -m data_modules.index_manager get-state-changes --entity "xiaoyan" --project-root "."
+
+# Query relationships
+python -m data_modules.index_manager get-relationships --entity "xiaoyan" --project-root "."
+```
+
+---
+
+## 3. Tag System (Optional)
+
+> The v5.1 main flow uses automatic extraction by the Data Agent. The tags below are only for **manual annotation**.
+
+### 3.1 Create an Entity (`<entity>`)
 
 ```xml
 <entity type="角色" id="lintian" name="林天" desc="主角,觉醒吞噬金手指" tier="核心">
@@ -59,149 +167,40 @@
 
 <entity type="地点" id="tianyunzong" name="天云宗" desc="东域三大宗门之一" tier="核心">
   <alias>宗门</alias>
-  <alias>天云</alias>
-</entity>
-
-<entity type="地点" id="tianyunzong_waimen" name="天云宗外门" parent="tianyunzong" desc="外门弟子修炼区" tier="支线">
-  <alias>外门</alias>
 </entity>
 ```
 
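-> Note: the current script does not parse nested child tags such as `<sub-location>`; express a sub-location as a separate `<entity>` with a `parent` field.
-
-### 2.2 Add an Alias (`<entity-alias>`)
-
-Use when a new form of address appears in a later chapter:
+### 3.2 Add an Alias (`<entity-alias>`)
 
 ```xml
-<!-- Option 1: reference by id -->
 <entity-alias id="lintian" alias="林宗主" context="成为天云宗主后"/>
-
-<!-- Option 2: reference by a known alias (resolved automatically) -->
 <entity-alias ref="林天" alias="不灭战神" context="晋升战神称号后"/>
 ```
 
-### 2.3 Update Attributes (`<entity-update>`)
-
-Use when an attribute changes significantly (v5.0 supports several operations):
+### 3.3 Update Attributes (`<entity-update>`)
 
 ```xml
-<!-- Basic operations -->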
 <entity-update id="lintian">
   <set key="realm" value="筑基期一层" reason="血煞秘境突破"/>
   <set key="location" value="天云宗"/>
 </entity-update>
-
-<!-- Delete an attribute -->
-<entity-update id="lintian">
-  <unset key="bottleneck"/>
-</entity-update>
-
-<!-- Array operations -->
-<entity-update id="lintian">
-  <add key="titles" value="不灭战神"/>
-  <remove key="allies" value="张三"/>
-</entity-update>
-
-<!-- Counter operations -->
-<entity-update id="lintian">
-  <inc key="kill_count" delta="1"/>
-</entity-update>
-
-<!-- Modify top-level fields -->
-<entity-update id="lintian">
-  <set key="tier" value="核心"/>
-  <set key="canonical_name" value="林不灭" reason="觉醒后改名"/>
-</entity-update>
-
-<!-- Reference by alias (type is required for disambiguation) -->
-<entity-update ref="林宗主" type="角色">
-  <set key="realm" value="金丹期"/>
-</entity-update>
 ```
 
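-**Top-level field whitelist**: `tier`, `desc`, `canonical_name`, `importance`, `status`, `parent`
-
 **Operation types**:
+
 | Operation | Syntax | Description |
 |------|------|------|
 | set | `<set key="k" value="v"/>` | Set an attribute value |
 | unset | `<unset key="k"/>` | Delete an attribute |
 | add | `<add key="k" value="v"/>` | Append an element to an array |
 | remove | `<remove key="k" value="v"/>` | Remove an element from an array |
-| inc | `<inc key="k" delta="1"/>` | Increment a number (default +1) |
-
-### 2.4 Shorthand (Auto-Detect Mode)
-
-For simple cases, the traditional tag format can be used and the system detects the mode automatically:
-
-```xml
-<!-- The system queries alias_index and decides whether to create or update -->
-<entity type="角色" name="林宗主" realm="金丹期"/>
-```
-
-**Auto-detect logic**:
-1. Query `alias_index` to check whether `name` is already an alias of some entity
-2. If found → update that entity
-3. If not found → treat it as a new entity, create it, and generate an `id`
+| inc | `<inc key="k" delta="1"/>` | Increment a number |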
 
 ---
 
-## 3. Storage Structure Design
-
-### 3.1 state.json Structure
-
-```json
-{
-  "entities_v3": {
-    "角色": {
-      "lintian": {
-        "id": "lintian",
-        "canonical_name": "林天",
-        "aliases": ["废物", "那个少年", "林宗主", "不灭战神"],
-        "tier": "核心",
-        "desc": "主角,觉醒吞噬金手指",
-        "current": {
-          "realm": "金丹期",
-          "location": "天云宗",
-          "last_chapter": 100
-        },
-        "history": [
-          {"chapter": 1, "changes": {"realm": "练气期一层"}, "reasons": {"realm": "初始状态"}, "added_at": "2026-01-01 00:00:00"},
-          {"chapter": 10, "changes": {"realm": "练气期九层"}, "reasons": {"realm": "吞噬突破"}, "added_at": "2026-01-01 00:00:00"},
-          {"chapter": 50, "changes": {"realm": "筑基期一层"}, "reasons": {"realm": "血煞秘境突破"}, "added_at": "2026-01-01 00:00:00"}
-        ],
-        "created_chapter": 1,
-        "first_appearance": "正文/第0001章.md"
-      }
-    },
-    "地点": {},
-    "物品": {},
-    "势力": {},
-    "招式": {}
-  },
-
-  "alias_index": {
-    "废物": [{"type": "角色", "id": "lintian"}],
-    "林天": [{"type": "角色", "id": "lintian"}],
-    "林宗主": [{"type": "角色", "id": "lintian"}],
-    "天云宗": [
-      {"type": "地点", "id": "loc_tianyunzong"},
-      {"type": "势力", "id": "faction_tianyunzong"}
-    ],
-    "外门": [{"type": "地点", "id": "tianyunzong_waimen"}]
-  }
-}
-```
-
-**Note**: in v5.0 each `alias_index` value is an array (one-to-many), no longer a single object.
-
-### 3.2 ID Generation Rules
+## 4. ID Generation Rules
 
 ```python
-import hashlib
-from pypinyin import lazy_pinyin
-
 def generate_entity_id(entity_type: str, name: str, existing_ids: set) -> str:
     """
     Generate a unique entity ID
@@ -209,9 +208,8 @@ def generate_entity_id(entity_type: str, name: str, existing_ids: set) -> str:
     Rules:
     1. Prefer pinyin (spaces removed, lowercase)
     2. Append a numeric suffix on conflict
-    3. Special prefix by type
+    3. Type prefixes: 物品→item_, 势力→faction_, 招式→skill_, 地点→loc_
     """
-    # Type prefix mapping
     prefix_map = {
         "物品": "item_",
         "势力": "faction_",
@@ -220,11 +218,9 @@ def generate_entity_id(entity_type: str, name: str, existing_ids: set) -> str:
         # 角色 (character) entities get no prefix
     }
 
-    # Build the base ID
     pinyin = ''.join(lazy_pinyin(name))
     base_id = prefix_map.get(entity_type, '') + pinyin.lower()
 
-    # Resolve conflicts
     final_id = base_id
     counter = 1
     while final_id in existing_ids:
@@ -236,252 +232,66 @@ def generate_entity_id(entity_type: str, name: str, existing_ids: set) -> str:
 
 ---
 
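-## 4. Processing Flow
-
-> **v5.0 note**: the flow below describes XML tag parsing and applies only to **manual annotation**.
-> The v5.0 main flow uses the Data Agent to extract entities from the plain text via AI; see `agents/data-agent.md`.
+## 5. Error Handling
 
-### 4.1 Full Flow Diagram (Manual Annotation)
+### 5.1 Alias Conflicts
 
-```
-Chapter content
-    ↓
-extract_entities.py
-    ↓
-┌──────────────────────────────────────────────────────────────────┐
-│ 1. Parse all XML tags                                            │
-│    - <entity> tags → new entity candidates                       │
-│    - <entity-alias> tags → alias registration                    │
-│    - <entity-update> tags → attribute updates                    │
-│                                                                  │
-│ 2. Load alias_index from state.json                              │
-│                                                                  │
-│ 3. For each <entity> tag:                                        │
-│    ├─ has an id attribute → use the given id                     │
-│    └─ no id attribute → look up alias_index:                     │
-│        ├─ found → update mode (use the found id)                 │
-│        └─ not found → create mode (generate a new id)            │
-│                                                                  │
-│ 4. Create mode:                                                  │
-│    - Generate a unique id                                        │
-│    - Initialize entity object (canonical_name, aliases, etc.)    │
-│    - Set initial attributes in current                           │
-│    - Record the initial state as history[0]                      │
-│    - Update alias_index (all aliases → id)                       │
-│                                                                  │
-│ 5. Update mode:                                                  │
-│    - Merge new attributes into current                           │
-│    - Append a history record (for significant changes)           │
-│    - Update last_chapter                                         │
-│    - Add new aliases to aliases and alias_index                  │
-│                                                                  │
-│ 6. Handle <entity-alias>:                                        │
-│    - Resolve id or ref                                           │
-│    - Add the alias to the aliases list                           │
-│    - Update alias_index                                          │
-│                                                                  │
-│ 7. Handle <entity-update>:                                       │
-│    - Resolve id or ref (via alias_index)                         │
-│    - Apply <set> updates to current                              │
-│    - Append a history record                                     │
-└──────────────────────────────────────────────────────────────────┘
-    ↓
-state.json updated
-```
+v5.1 allows **one-to-many aliases**: the same alias may point to multiple entities.
 
-### 4.2 Alias Resolution Function
+When `ref="alias"` matches multiple entities and cannot be disambiguated, an error is reported: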
 
-```python
-def resolve_entity_by_alias(alias: str, entity_type: str, state: dict) -> tuple:
-    """
-    Resolve an entity ID from an alias
-
-    Args:
-        alias: alias or name
-        entity_type: entity type (角色/地点/物品/势力/招式)
-        state: contents of state.json
-
-    Returns:
-        (entity_id, entity_data) or (None, None)
-    """
-    alias_index = state.get("alias_index", {})
-
-    # 1. Exact match
-    if alias in alias_index:
-        ref = alias_index[alias]
-        if ref["type"] == entity_type:
-            entity_id = ref["id"]
-            entity_data = state["entities_v3"].get(entity_type, {}).get(entity_id)
-            return (entity_id, entity_data)
-
-    # 2. Fuzzy match (optional; handles "云长老" vs "云长老(天云宗)")
-    for key, ref in alias_index.items():
-        if ref["type"] == entity_type and alias in key:
-            entity_id = ref["id"]
-            entity_data = state["entities_v3"].get(entity_type, {}).get(entity_id)
-            return (entity_id, entity_data)
-
-    return (None, None)
-```
-
----
-
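-## 5. Special Scenarios
-
-### 5.1 Character Renaming
-
-When a character is formally renamed (e.g. granted a name, or renamed after an awakening):
-
-```xml
-<!-- Keep the old aliases and set the new canonical_name -->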
-<entity-update id="lintian">
-  <set key="canonical_name" value="林不灭" reason="觉醒战神血脉后改名"/>
-</entity-update>
-<entity-alias id="lintian" alias="林不灭"/>
-```
-
-### 5.2 Location Hierarchy
-
-A sub-location is a separate entity that records its parent:
-
-```xml
-<entity type="地点" id="tianyunzong_neimen" name="天云宗内门"
-        parent="tianyunzong" desc="核心弟子修炼区域" tier="支线">
-  <alias>内门</alias>
-</entity>
-```
-
-### 5.3 Item Transfer
-
-When an item changes owner:
-
-```xml
-<entity-update ref="混沌珠">
-  <set key="owner" value="李雪" reason="林天将混沌珠赠予李雪"/>
-</entity-update>
 ```
+⚠️ Ambiguous alias: '宗主' matches 2 entities; use id or add a type attribute
 
-### 5.4 Faction Merger/Destruction
-
-```xml
-<entity-update id="xueshamen">
-  <set key="status" value="覆灭" reason="被天云宗剿灭"/>
-  <set key="destroyed_chapter" value="75"/>
-</entity-update>
+Solutions:
+  1. Use a stable id: <entity-update id="...">...</entity-update>
+  2. Add type (this only disambiguates across types; same-type duplicates still need id)
 ```
 
----
-
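-## 6. Migration Strategy (Removed)
+### 5.2 Confidence Handling
 
-This plugin no longer provides legacy-format migration or backward compatibility. Recommended approach for v5.0:
-
-1. Delete `.webnovel/index.db` (the index can be rebuilt)
-2. Leave the chapter files untouched (the plain text is the single source of truth)
-3. Run `python -m data_modules.index_manager rebuild --project-root .` to rebuild the index
-4. The Data Agent will extract entities automatically in subsequent chapters
-
-> **Note**: v5.0 no longer depends on `extract_entities.py`; entity extraction is done automatically by the Data Agent.
+| Confidence range | Handling |
+|-----------|---------|
+| > 0.8 | Adopted automatically, no confirmation needed |
+| 0.5 - 0.8 | Suggested value adopted, warning recorded |
+| < 0.5 | Flagged for manual confirmation, not written automatically |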
 
 ---
 
-## 7. Query Interface
+## 6. Migration Notes
 
-### 7.1 Query an Entity by Alias
+To migrate from v5.0 to v5.1:
 
-```python
-def query_entity(name_or_alias: str, entity_type: str = None) -> dict:
-    """
-    Look up an entity's full record by name or alias
-
-    Returns:
-    {
-        "id": "lintian",
-        "type": "角色",
-        "canonical_name": "林天",
-        "aliases": [...],
-        "current": {...},
-        "history": [...]
-    }
-    """
-```
+```bash
+# Run the migration script
+python -m data_modules.migrate_state_to_sqlite --project-root "." --backup
 
-### 7.2 Query an Entity's Change History
-
-```python
-def query_entity_history(entity_id: str, entity_type: str) -> list:
-    """
-    Query an entity's attribute change history
-
-    Returns:
-    [
-        {"chapter": 1, "changes": {"realm": "练气期一层"}, "reasons": {"realm": "初始"}, "added_at": "YYYY-MM-DD HH:MM:SS"},
-        {"chapter": 50, "changes": {"realm": "筑基期"}, "reasons": {"realm": "突破"}, "added_at": "YYYY-MM-DD HH:MM:SS"},
-        ...
-    ]
-    """
+# Verify the migration result
+python -m data_modules.index_manager stats --project-root "."
 ```
 
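-### 7.3 Query an Entity's State at a Chapter
-
-```python
-def query_entity_at_chapter(entity_id: str, entity_type: str, chapter: int) -> dict:
-    """
-    Query an entity's state as of a given chapter (by replaying its history)
-
-    Used for consistency checks: verify that a description matches the state at that time
-    """
-```
+After migration:
+- `index.db` contains all entities, aliases, state changes, and relationships
+- `state.json` keeps only progress, protagonist state, and pacing tracking
+- The legacy `entities_v3` and `alias_index` fields are removed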
 
 ---
 
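-## 8. Error Handling
-
-### 8.1 Alias Conflicts
+## 7. Summary
 
-v5.0 allows a **one-to-many alias_index**: the same alias may point to multiple entities (across types or within one type).
+### 7.1 Core Improvements in v5.1
 
-When you reference an entity with `ref="alias"` and it matches multiple entities that cannot be disambiguated, the script fails immediately:
+1. **SQLite storage**: fixes state.json bloat
+2. **Slim JSON**: state.json stays < 5KB
+3. **One-to-many aliases**: one alias can map to multiple entities
+4. **Automatic AI extraction**: Data Agent semantic analysis replaces XML tags
 
-```
-⚠️ Ambiguous alias: '宗主' matches 2 entities; use id or add a type attribute
-
-Solutions:
-  1. Use a stable id: <entity-update id="...">...</entity-update>
-  2. Add type (this only disambiguates across types; same-type duplicates still need id)
-  3. Add a more specific alias (to avoid recurring ambiguity)
-```
-
-### 8.2 Unknown References
-
-When `<entity-update ref="xxx">` finds no matching entity:
+### 7.2 Data Flow
 
 ```
-⚠️ Unknown entity reference: "xxx" not found in alias_index
-   Suggestion: create it with <entity> first, or check the spelling
+Chapter text → Data Agent → index.db (entities/aliases/relationships/state changes)
+                          → state.json (progress/protagonist state/pacing)
+                          → vectors.db (scene vectors)
+                                  ↓
+                          Context Agent → context for the next chapter
 ```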
-
----
-
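-## 9. Summary
-
-### 9.1 Core Improvements
-
-1. **Unified ID system**: every entity has a unique ID, and aliases map to IDs
-2. **Automatic detection**: no need to specify create vs. update; the system decides
-3. **Version tracking**: the history array records significant attribute changes
-4. **v5.0 architecture**: uses the `entities_v3` grouped format; XML tags are optional (manual annotation)
-
-### 9.2 New Tags
-
-| Tag | Purpose | Required attributes |
-|------|------|---------|
-| `<entity>` | Create/update an entity | type, name |
-| `<entity-alias>` | Add an alias | id/ref, alias |
-| `<entity-update>` | Update attributes | id/ref, `<set>`/`<unset>`/`<add>`/`<remove>`/`<inc>` |
-
-### 9.3 Implementation Priorities
-
-1. **P0**: alias_index and automatic detection (solves the core problems)
-2. **P1**: attribute updates and history records
-3. **P2**: index primary-key migration (entity_id) + Context Pack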
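
Reviewer sketch (not part of this commit): a minimal illustration, assuming the `aliases` table from section 1.2 of the spec, of how a `ref` lookup could surface the ambiguity error described in section 5.1. The names `AmbiguousAliasError`, `resolve_alias`, and `demo_connection` are hypothetical; the real lookup goes through `IndexManager`, whose internals this diff does not show.

```python
import sqlite3


class AmbiguousAliasError(LookupError):
    """Hypothetical error type: an alias matched more than one entity."""


def resolve_alias(conn, alias, entity_type=None):
    """Return the single entity id behind `alias`, optionally filtered by type."""
    sql = "SELECT entity_id FROM aliases WHERE alias = ?"
    params = [alias]
    if entity_type is not None:
        sql += " AND entity_type = ?"
        params.append(entity_type)
    ids = [row[0] for row in conn.execute(sql, params)]
    if not ids:
        raise LookupError(f"unknown alias: {alias!r}")
    if len(ids) > 1:
        # Mirrors the spec's advice: fall back to a stable id or add a type.
        raise AmbiguousAliasError(
            f"alias {alias!r} matches {len(ids)} entities; use id or add type"
        )
    return ids[0]


def demo_connection():
    """Build an in-memory database with the aliases schema from the spec."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE aliases (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               alias TEXT NOT NULL,
               entity_id TEXT NOT NULL,
               entity_type TEXT NOT NULL,
               UNIQUE(alias, entity_id))"""
    )
    conn.executemany(
        "INSERT INTO aliases (alias, entity_id, entity_type) VALUES (?, ?, ?)",
        [
            ("天云宗", "loc_tianyunzong", "地点"),
            ("天云宗", "faction_tianyunzong", "势力"),
            ("林宗主", "lintian", "角色"),
        ],
    )
    return conn
```

With this sample data, the alias 天云宗 only resolves once a type is supplied, which matches the spec's note that `type` disambiguates across types while same-type duplicates still need an `id`.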

+ 1 - 3
.claude/scripts/data_modules/config.py

@@ -58,9 +58,7 @@ class DataModulesConfig:
     def index_db(self) -> Path:
         return self.webnovel_dir / "index.db"
 
-    @property
-    def alias_index_file(self) -> Path:
-        return self.webnovel_dir / "alias_index.json"
+    # v5.1: alias_index_file is deprecated; aliases are stored in the index.db aliases table
 
     @property
     def chapters_dir(self) -> Path:
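
Reviewer sketch (not part of this commit): with `alias_index_file` gone, legacy alias data has to end up in the `aliases` table. The snippet below is a rough illustration, assuming the v5.0 one-to-many `alias_index` shape shown in the removed spec section, of what such an import could look like. `import_alias_index` and `ALIASES_DDL` are hypothetical names; the actual migration is performed by `data_modules.migrate_state_to_sqlite`, which this diff does not include.

```python
import sqlite3

# Schema copied from section 1.2 of the spec; IF NOT EXISTS is added for the sketch.
ALIASES_DDL = """
CREATE TABLE IF NOT EXISTS aliases (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    alias TEXT NOT NULL,
    entity_id TEXT NOT NULL,
    entity_type TEXT NOT NULL,
    UNIQUE(alias, entity_id)
)
"""


def _alias_count(conn):
    return conn.execute("SELECT COUNT(*) FROM aliases").fetchone()[0]


def import_alias_index(conn, alias_index):
    """Copy {alias: [{"type": ..., "id": ...}, ...]} into the aliases table.

    Returns the number of rows actually added; malformed entries are skipped
    and duplicates are ignored via the UNIQUE(alias, entity_id) constraint.
    """
    conn.execute(ALIASES_DDL)
    before = _alias_count(conn)
    for alias, entries in alias_index.items():
        if not alias or not isinstance(entries, list):
            continue
        for entry in entries:
            if not isinstance(entry, dict):
                continue
            entity_type, entity_id = entry.get("type"), entry.get("id")
            if not entity_type or not entity_id:
                continue
            conn.execute(
                "INSERT OR IGNORE INTO aliases (alias, entity_id, entity_type) "
                "VALUES (?, ?, ?)",
                (alias, entity_id, entity_type),
            )
    conn.commit()
    return _alias_count(conn) - before
```

Because duplicates are ignored rather than rejected, running the import twice over the same data adds nothing the second time.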

+ 22 - 110
.claude/scripts/data_modules/entity_linker.py

@@ -1,28 +1,24 @@
 #!/usr/bin/env python3
 # -*- coding: utf-8 -*-
 """
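-Entity Linker - entity disambiguation helper module
+Entity Linker - entity disambiguation helper module (v5.1)
 
 Provides disambiguation helpers for the Data Agent:
 - Confidence assessment
-- Alias index management
+- Alias index management (via the index.db aliases table)
 - Recording disambiguation results
+
+v5.1 changes:
+- Alias storage moved from state.json to the index.db aliases table
+- Alias reads and writes go through IndexManager
+- Direct manipulation of state.json removed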
 """
 
-import json
-from pathlib import Path
 from typing import Dict, List, Optional, Tuple
 from dataclasses import dataclass, field
-import filelock
 
 from .config import get_config
-
-try:
-    # Common case: run from scripts/, with security_utils at the top level of sys.path
-    from security_utils import atomic_write_json, read_json_safe
-except ImportError:  # pragma: no cover
-    # Fallback: run from the repo root as `python -m scripts...`
-    from scripts.security_utils import atomic_write_json, read_json_safe
+from .index_manager import IndexManager
 
 
 @dataclass
@@ -37,100 +33,23 @@ class DisambiguationResult:
 
 
 class EntityLinker:
-    """Entity linker - helps the Data Agent disambiguate entities (v5.0 one-to-many aliases)"""
+    """Entity linker - helps the Data Agent disambiguate entities (v5.1 SQLite)"""
 
     def __init__(self, config=None):
         self.config = config or get_config()
-        # v5.0: alias_index uses a one-to-many format {alias: [{"type": ..., "id": ...}, ...]}
-        self._alias_index: Dict[str, List[Dict]] = {}
-        self._state_file = self.config.state_file
-        self._load_alias_index()
-
-    def _load_alias_index(self):
-        """Load alias_index from state.json"""
-        if self._state_file.exists():
-            try:
-                with open(self._state_file, "r", encoding="utf-8") as f:
-                    state = json.load(f)
-                self._alias_index = state.get("alias_index", {})
-            except (json.JSONDecodeError, IOError):
-                self._alias_index = {}
-        else:
-            self._alias_index = {}
-
-    def save_alias_index(self):
-        """Save alias_index to state.json (v5.0 embedded format; merge under lock + atomic write)"""
-        if not self._state_file.exists():
-            return
-
-        lock_path = self._state_file.with_suffix(self._state_file.suffix + ".lock")
-        lock = filelock.FileLock(str(lock_path), timeout=10)
-        try:
-            with lock:
-                state = read_json_safe(self._state_file, default={})
-
-                disk_alias = state.get("alias_index", {})
-                if not isinstance(disk_alias, dict):
-                    disk_alias = {}
-
-                # One-to-many: merge and de-duplicate (avoid overwriting state fields/aliases another process just wrote)
-                for alias, entries in (self._alias_index or {}).items():
-                    if not alias or not isinstance(entries, list):
-                        continue
-
-                    existing = disk_alias.get(alias)
-                    if not isinstance(existing, list):
-                        existing = []
-                        disk_alias[alias] = existing
-
-                    for entry in entries:
-                        if not isinstance(entry, dict):
-                            continue
-                        et = entry.get("type")
-                        eid = entry.get("id")
-                        if not et or not eid:
-                            continue
-                        if any(
-                            isinstance(e, dict) and e.get("type") == et and e.get("id") == eid
-                            for e in existing
-                        ):
-                            continue
-                        existing.append({"type": et, "id": eid})
-
-                state["alias_index"] = disk_alias
-
-                self.config.ensure_dirs()
-                atomic_write_json(self._state_file, state, use_lock=False, backup=True)
-
-                # Sync the in-memory index to the latest on-disk snapshot
-                self._alias_index = disk_alias
-        except filelock.Timeout:
-            raise RuntimeError("无法获取 state.json 文件锁,请稍后重试")
-
-    # ==================== Alias management (v5.0 one-to-many) ====================
+        self._index_manager = IndexManager(self.config)
+
+    # ==================== Alias management (v5.1 SQLite) ====================
 
     def register_alias(self, entity_id: str, alias: str, entity_type: str = "角色") -> bool:
-        """Register a new alias (v5.0 one-to-many: one alias may map to multiple entities)"""
-        if not alias:
+        """Register a new alias (v5.1: written to the index.db aliases table)"""
+        if not alias or not entity_id:
             return False
-
-        if alias not in self._alias_index:
-            self._alias_index[alias] = []
-
-        # Check whether the same (type, id) pair already exists
-        for entry in self._alias_index[alias]:
-            if entry.get("type") == entity_type and entry.get("id") == entity_id:
-                return True  # already present; treated as success
-
-        self._alias_index[alias].append({
-            "type": entity_type,
-            "id": entity_id
-        })
-        return True
+        return self._index_manager.register_alias(alias, entity_id, entity_type)
 
     def lookup_alias(self, mention: str, entity_type: str = None) -> Optional[str]:
         """Look up the entity ID for an alias (returns the first match; optional type filter)"""
-        entries = self._alias_index.get(mention, [])
+        entries = self._index_manager.get_entities_by_alias(mention)
         if not entries:
             return None
 
@@ -144,18 +63,12 @@ class EntityLinker:
 
     def lookup_alias_all(self, mention: str) -> List[Dict]:
         """Look up all entities for an alias (one-to-many)"""
-        return self._alias_index.get(mention, [])
+        entries = self._index_manager.get_entities_by_alias(mention)
+        return [{"type": e.get("type"), "id": e.get("id")} for e in entries]
 
     def get_all_aliases(self, entity_id: str, entity_type: str = None) -> List[str]:
         """Get all aliases of an entity"""
-        aliases = []
-        for alias, entries in self._alias_index.items():
-            for entry in entries:
-                if entry.get("id") == entity_id:
-                    if entity_type is None or entry.get("type") == entity_type:
-                        aliases.append(alias)
-                        break
-        return aliases
+        return self._index_manager.get_entity_aliases(entity_id)
 
     # ==================== Confidence assessment ====================
 
@@ -234,7 +147,7 @@ class EntityLinker:
         new_entities: List[Dict]
     ) -> List[str]:
         """
-        Register aliases for new entities (v5.0)
+        Register aliases for new entities (v5.1)
 
         Returns the list of registered entity IDs
         """
@@ -267,7 +180,7 @@ class EntityLinker:
 def main():
     import argparse
 
-    parser = argparse.ArgumentParser(description="Entity Linker CLI (v5.0 一对多别名)")
+    parser = argparse.ArgumentParser(description="Entity Linker CLI (v5.1 SQLite)")
     parser.add_argument("--project-root", type=str, help="项目根目录")
 
     subparsers = parser.add_subparsers(dest="command")
@@ -306,10 +219,9 @@ def main():
         entity_type = getattr(args, "type", "角色")
         success = linker.register_alias(args.entity, args.alias, entity_type)
         if success:
-            linker.save_alias_index()
             print(f"✓ 已注册: {args.alias} → {args.entity} (类型: {entity_type})")
         else:
-            print(f"✗ 注册失败")
+            print(f"✗ 注册失败或已存在")
 
     elif args.command == "lookup":
         entity_type = getattr(args, "type", None)
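
Reviewer sketch (not part of this commit): the confidence tiers that `EntityLinker` is documented to apply (> 0.8 auto-adopt, 0.5-0.8 adopt with a warning, < 0.5 manual confirmation) can be written as a small pure function. `classify_confidence` and its return labels are hypothetical, and treating exactly 0.8 and 0.5 as the warning tier is an assumption, since the spec does not say which side the boundaries fall on.

```python
def classify_confidence(confidence, auto_threshold=0.8, warn_threshold=0.5):
    """Map a disambiguation confidence score to a handling tier.

    Returns "auto" (adopt silently), "warn" (adopt and record a warning),
    or "pending" (queue for manual confirmation, nothing written).
    """
    if confidence > auto_threshold:
        return "auto"
    if confidence >= warn_threshold:
        # Assumption: both boundary values land in the warning tier.
        return "warn"
    return "pending"
```

Keeping the thresholds as parameters makes the boundary choice easy to adjust if the actual module handles 0.8 or 0.5 differently.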

+ 32 - 182
.claude/scripts/data_modules/state_manager.py

@@ -165,21 +165,8 @@ class StateManager:
         )
 
         entities_v3 = state.get("entities_v3")
-        if not isinstance(entities_v3, dict):
-            entities_v3 = {}
-            state["entities_v3"] = entities_v3
-        for t in self.ENTITY_TYPES:
-            if not isinstance(entities_v3.get(t), dict):
-                entities_v3[t] = {}
-
-        if not isinstance(state.get("alias_index"), dict):
-            state["alias_index"] = {}
-
-        if not isinstance(state.get("state_changes"), list):
-            state["state_changes"] = []
-
-        if not isinstance(state.get("structured_relationships"), list):
-            state["structured_relationships"] = []
+        # v5.1: entities_v3, alias_index, state_changes, and structured_relationships have moved to index.db
+        # These fields are no longer initialized or maintained in state.json
 
         if not isinstance(state.get("disambiguation_warnings"), list):
             state["disambiguation_warnings"] = []
@@ -264,136 +251,12 @@ class StateManager:
 
                     progress["last_updated"] = self._now_progress_timestamp()
 
-                # v5.1: check whether the project has been migrated to SQLite
-                # If SQLite sync is enabled, large data fields are no longer written to state.json
-                _migrated = self._enable_sqlite_sync and self._sql_state_manager is not None
-
-                if not _migrated:
-                    # ==================== Legacy mode: write to state.json ====================
-                    # entities_v3 (applied as patches)
-                    entities_v3 = disk_state.get("entities_v3", {})
-                    if not isinstance(entities_v3, dict):
-                        entities_v3 = {}
-                        disk_state["entities_v3"] = entities_v3
-                    for t in self.ENTITY_TYPES:
-                        if not isinstance(entities_v3.get(t), dict):
-                            entities_v3[t] = {}
-
-                    for (entity_type, entity_id), patch in self._pending_entity_patches.items():
-                        bucket = entities_v3.setdefault(entity_type, {})
-                        if not isinstance(bucket, dict):
-                            bucket = {}
-                            entities_v3[entity_type] = bucket
-
-                        entity = bucket.get(entity_id)
-                        if not isinstance(entity, dict):
-                            entity = {}
-                            bucket[entity_id] = entity
-
-                        # For new entities: fill only missing fields to avoid overwriting more complete data from concurrent writes
-                        if patch.base_entity:
-                            for k, v in patch.base_entity.items():
-                                if k not in entity:
-                                    entity[k] = v
-                                elif isinstance(entity.get(k), dict) and isinstance(v, dict):
-                                    # Fill missing keys one level down
-                                    for kk, vv in v.items():
-                                        if kk not in entity[k]:
-                                            entity[k][kk] = vv
-
-                        # top-level updates (written explicitly)
-                        for k, v in patch.top_updates.items():
-                            entity[k] = v
-
-                        # current updates (written explicitly)
-                        if patch.current_updates:
-                            current = entity.get("current")
-                            if not isinstance(current, dict):
-                                current = {}
-                                entity["current"] = current
-                            current.update(patch.current_updates)
-
-                        # appearance updates(first=min(non-zero), last=max)
-                        if patch.appearance_chapter is not None:
-                            chapter = int(patch.appearance_chapter)
-                            try:
-                                first = int(entity.get("first_appearance", 0) or 0)
-                            except (TypeError, ValueError):
-                                first = 0
-                            try:
-                                last = int(entity.get("last_appearance", 0) or 0)
-                            except (TypeError, ValueError):
-                                last = 0
-
-                            if first <= 0:
-                                entity["first_appearance"] = chapter
-                            else:
-                                entity["first_appearance"] = min(first, chapter)
-                            entity["last_appearance"] = max(last, chapter)
-
-                    # alias_index (one-to-many: merge and de-duplicate)
-                    alias_index = disk_state.get("alias_index", {})
-                    if not isinstance(alias_index, dict):
-                        alias_index = {}
-                        disk_state["alias_index"] = alias_index
-
-                    for alias, entries in self._pending_alias_entries.items():
-                        if not alias:
-                            continue
-                        existing = alias_index.get(alias)
-                        if not isinstance(existing, list):
-                            existing = []
-                            alias_index[alias] = existing
-
-                        for entry in entries:
-                            et = entry.get("type")
-                            eid = entry.get("id")
-                            if not et or not eid:
-                                continue
-                            if any(e.get("type") == et and e.get("id") == eid for e in existing if isinstance(e, dict)):
-                                continue
-                            existing.append({"type": et, "id": eid})
-
-                    # state_changes (append)
-                    if self._pending_state_changes:
-                        changes = disk_state.get("state_changes")
-                        if not isinstance(changes, list):
-                            changes = []
-                            disk_state["state_changes"] = changes
-                        changes.extend(self._pending_state_changes)
-
-                    # structured_relationships (append with de-duplication)
-                    if self._pending_structured_relationships:
-                        rels = disk_state.get("structured_relationships")
-                        if not isinstance(rels, list):
-                            rels = []
-                            disk_state["structured_relationships"] = rels
-
-                        def _rel_key(r: Dict[str, Any]) -> tuple:
-                            return (
-                                r.get("from_entity"),
-                                r.get("to_entity"),
-                                r.get("type"),
-                                r.get("description"),
-                                r.get("chapter"),
-                            )
-
-                        existing_keys = {_rel_key(r) for r in rels if isinstance(r, dict)}
-                        for r in self._pending_structured_relationships:
-                            if not isinstance(r, dict):
-                                continue
-                            k = _rel_key(r)
-                            if k in existing_keys:
-                                continue
-                            rels.append(r)
-                            existing_keys.add(k)
-                else:
-                    # ==================== v5.1 mode: remove large data fields ====================
-                    # Ensure these bloat-prone fields are absent from state.json
-                    for field in ["entities_v3", "alias_index", "state_changes", "structured_relationships"]:
-                        disk_state.pop(field, None)
-                    # Mark as migrated
-                    disk_state["_migrated_to_sqlite"] = True
+                # v5.1: always use SQLite mode and remove large data fields
+                # Ensure these bloat-prone fields are absent from state.json
+                for field in ["entities_v3", "alias_index", "state_changes", "structured_relationships"]:
+                    disk_state.pop(field, None)
+                # Mark as migrated
+                disk_state["_migrated_to_sqlite"] = True
 
                 # disambiguation_warnings (append with dedup and truncation)
                 if self._pending_disambiguation_warnings:
@@ -774,37 +637,22 @@ class StateManager:
         patch.replace = True
         patch.base_entity = v3_entity
 
-        # Register aliases in alias_index
-        self._register_alias_internal(entity.id, entity_type, entity.name)
-        for alias in entity.aliases:
-            self._register_alias_internal(entity.id, entity_type, alias)
+        # v5.1: register aliases in index.db (via SQLStateManager)
+        if self._sql_state_manager:
+            self._sql_state_manager._index_manager.register_alias(entity.name, entity.id, entity_type)
+            for alias in entity.aliases:
+                if alias:
+                    self._sql_state_manager._index_manager.register_alias(alias, entity.id, entity_type)
 
         return True
 
     def _register_alias_internal(self, entity_id: str, entity_type: str, alias: str):
-        """Internal method: register an alias in state.json's alias_index"""
+        """Internal method: register an alias in index.db (v5.1)"""
         if not alias:
             return
-        if "alias_index" not in self._state:
-            self._state["alias_index"] = {}
-
-        if alias not in self._state["alias_index"]:
-            self._state["alias_index"][alias] = []
-
-        # Check whether the entry already exists
-        exists = any(
-            e.get("type") == entity_type and e.get("id") == entity_id
-            for e in self._state["alias_index"][alias]
-        )
-        if not exists:
-            self._state["alias_index"][alias].append({
-                "type": entity_type,
-                "id": entity_id
-            })
-            # Record the pending delta to merge, so a read-modify-write outside the lock cannot overwrite it
-            pending = self._pending_alias_entries.setdefault(alias, [])
-            if not any(e.get("type") == entity_type and e.get("id") == entity_id for e in pending):
-                pending.append({"type": entity_type, "id": entity_id})
+        # v5.1: write directly to SQLite
+        if self._sql_state_manager:
+            self._sql_state_manager._index_manager.register_alias(alias, entity_id, entity_type)
 
     def update_entity(self, entity_id: str, updates: Dict[str, Any], entity_type: str = None) -> bool:
         """Update entity attributes (v5.0)"""
@@ -1135,8 +983,9 @@ class StateManager:
         return {
             "progress": self._state.get("progress", {}),
             "entities": entities_flat,
-            "alias_index": self._state.get("alias_index", {}),
-            "recent_changes": self._state.get("state_changes", [])[-self.config.export_recent_changes_slice:],
+            # v5.1: alias_index has moved to index.db; return an empty dict here for compatibility
+            "alias_index": {},
+            "recent_changes": [],  # v5.1: query index.db instead
             "disambiguation": {
                 "warnings": self._state.get("disambiguation_warnings", [])[-self.config.export_disambiguation_slice:],
                 "pending": self._state.get("disambiguation_pending", [])[-self.config.export_disambiguation_slice:],
@@ -1146,17 +995,18 @@ class StateManager:
     # ==================== Protagonist sync ====================
 
     def get_protagonist_entity_id(self) -> Optional[str]:
-        """Get the protagonist entity ID (via the is_protagonist flag or a protagonist_state.name lookup)"""
-        # Method 1: look for the is_protagonist flag
-        for eid, e in self._state.get("entities_v3", {}).get("角色", {}).items():
-            if e.get("is_protagonist"):
-                return eid
-
-        # Method 2: look up by protagonist_state.name
+        """Get the protagonist entity ID (via the is_protagonist flag or a SQLite query)"""
+        # Method 1: query through SQLStateManager (v5.1)
+        if self._sql_state_manager:
+            protagonist = self._sql_state_manager.get_protagonist()
+            if protagonist:
+                return protagonist.get("id")
+
+        # Method 2: look up the alias from protagonist_state.name
         protag_name = self._state.get("protagonist_state", {}).get("name")
-        if protag_name:
-            alias_entries = self._state.get("alias_index", {}).get(protag_name, [])
-            for entry in alias_entries:
+        if protag_name and self._sql_state_manager:
+            entities = self._sql_state_manager._index_manager.get_entities_by_alias(protag_name)
+            for entry in entities:
                 if entry.get("type") == "角色":
                     return entry.get("id")
 

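Reviewer note: the hunks above route alias registration through `register_alias(alias, entity_id, entity_type)` and alias lookup through `get_entities_by_alias(name)` on the index manager. The following is a minimal sketch of how a one-to-many aliases table could back those two calls. The table layout, the `open_index` helper, and the standalone function signatures taking a connection are assumptions for illustration only; the real `index.db` schema and index manager implementation are not shown in this diff.

```python
import sqlite3

# Hypothetical schema: the actual index.db layout is not part of this diff.
# The composite primary key lets one alias map to several entities while
# still rejecting exact duplicates.
SCHEMA = """
CREATE TABLE IF NOT EXISTS aliases (
    alias       TEXT NOT NULL,
    entity_id   TEXT NOT NULL,
    entity_type TEXT NOT NULL,
    PRIMARY KEY (alias, entity_id, entity_type)
)
"""


def open_index(path=":memory:"):
    """Open (or create) the index database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def register_alias(conn, alias, entity_id, entity_type):
    """Store one alias mapping; empty aliases and duplicates are ignored."""
    if not alias:
        return
    conn.execute(
        "INSERT OR IGNORE INTO aliases (alias, entity_id, entity_type) "
        "VALUES (?, ?, ?)",
        (alias, entity_id, entity_type),
    )
    conn.commit()


def get_entities_by_alias(conn, alias):
    """Return every entity registered under the alias, ordered by id."""
    rows = conn.execute(
        "SELECT entity_type, entity_id FROM aliases "
        "WHERE alias = ? ORDER BY entity_id",
        (alias,),
    ).fetchall()
    return [{"type": t, "id": i} for t, i in rows]
```

With this layout, registering "天云宗" once as a 地点 and once as a 势力 yields two rows, and the caller filters by `type` the same way `get_protagonist_entity_id` does in the hunk above.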
+ 0 - 1816
.claude/scripts/extract_entities.py

@@ -1,1816 +0,0 @@
-#!/usr/bin/env python3
-"""
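-XML tag extraction and sync script (v4.0)
-
-> **v5.0 note**: This script is for **manual annotation scenarios** (optional).
-> The v5.0 main flow uses the Data Agent to extract entities semantically from plain prose and no longer depends on XML tags.
-> If a chapter contains XML tags, this script can still parse and sync them.
-
-Features:
-1. Scan the given chapter text and extract all XML-format tags
-2. Supported tag types:
-   - <entity>: entity (角色/地点/物品/势力/招式, i.e. character/location/item/faction/technique)
-   - <entity-alias>: entity alias registration
-   - <entity-update>: entity attribute update (supports set/unset/add/remove/inc)
-   - <skill>: golden-finger skill
-   - <foreshadow>: foreshadowing tag
-   - <deviation>: outline deviation marker
-   - <relationship>: character relationship
-3. Support entity tier classification (核心/支线/装饰, i.e. core/sub/decor)
-4. Sync to the corresponding files under 设定集
-5. Update state.json (entities_v3 + one-to-many alias_index)
-6. Support automated and interactive modes
-
-v4.0 changes:
-- alias_index is now one-to-many (one alias can map to multiple entities)
-- Removed legacy-format compatibility code
-- New operations: <unset>/<add>/<remove>/<inc>
-- Support for a top-level field whitelist
-
-Usage:
-  python extract_entities.py <chapter file> [--auto] [--dry-run]
-  python extract_entities.py --project-root "path" --chapter 1 --auto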
-"""
-
-import re
-import json
-import os
-import shutil
-import sys
-import argparse
-from pathlib import Path
-from datetime import datetime
-from typing import List, Dict, Tuple, Optional, Any
-
-# ============================================================================
-# Security fix: import security utility functions (P0 CRITICAL)
-# ============================================================================
-from security_utils import sanitize_filename, create_secure_directory, atomic_write_json
-from project_locator import resolve_project_root, resolve_state_file
-from chapter_paths import find_chapter_file, extract_chapter_num_from_filename
-
-# Windows encoding compatibility fix
-if sys.platform == 'win32':
-    import io
-    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
-    sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
-
-# Mapping from entity type to target file
-ENTITY_TYPE_MAP = {
-    "角色": "设定集/角色库/{category}/{name}.md",
-    "地点": "设定集/世界观.md",
-    "物品": "设定集/物品库/{name}.md",
-    "势力": "设定集/世界观.md",
-    "招式": "设定集/力量体系.md",
-    "其他": "设定集/其他设定/{name}.md"
-}
-
-# Valid entity types (v4.0 no longer accepts legacy aliases)
-VALID_ENTITY_TYPES = {"角色", "地点", "物品", "势力", "招式"}
-
-# Top-level field whitelist (can be modified directly via entity-update)
-TOP_LEVEL_FIELDS = {"tier", "desc", "canonical_name", "importance", "status", "parent"}
-
-
-class AmbiguousAliasError(RuntimeError):
-    """The alias matches multiple entities and cannot be disambiguated (use id or add type instead)."""
-
-
-def normalize_entity_type(raw: Any) -> str:
-    """Validate the entity type (v4.0 no longer supports alias conversion)."""
-    t = str(raw or "").strip()
-    if not t:
-        return ""
-    if t in VALID_ENTITY_TYPES:
-        return t
-    return ""  # invalid types return an empty string
-
-# Character category rules
-ROLE_CATEGORY_MAP = {
-    "主角": "主要角色",
-    "配角": "次要角色",
-    "反派": "反派角色",
-    "路人": "次要角色"
-}
-
-# Entity tier weights (matching the three-tier foreshadowing system)
-ENTITY_TIER_MAP = {
-    "核心": {"weight": 3.0, "desc": "必须追踪,影响主线"},
-    "core": {"weight": 3.0, "desc": "必须追踪,影响主线"},
-    "支线": {"weight": 2.0, "desc": "应该追踪,丰富剧情"},
-    "sub": {"weight": 2.0, "desc": "应该追踪,丰富剧情"},
-    "装饰": {"weight": 1.0, "desc": "可选追踪,增加真实感"},
-    "decor": {"weight": 1.0, "desc": "可选追踪,增加真实感"}
-}
-
-# ============================================================================
-# Core entity management functions (added in v3.0)
-# ============================================================================
-
-def generate_entity_id(entity_type: str, name: str, existing_ids: set) -> str:
-    """
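-    Generate a unique entity ID.
-
-    Rules:
-    1. Prefer pinyin (spaces removed, lowercase)
-    2. Append a numeric suffix on conflict
-    3. Type-specific prefixes
-
-    Args:
-        entity_type: entity type (角色/地点/物品/势力/招式)
-        name: entity name
-        existing_ids: set of IDs that already exist
-
-    Returns:
-        str: a unique entity ID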
-    """
-    # Type prefix map
-    prefix_map = {
-        "物品": "item_",
-        "势力": "faction_",
-        "招式": "skill_",
-        "地点": "loc_"
-        # 角色 (characters) have no prefix
-    }
-
-    # Try pypinyin; fall back to a simple hash if it is unavailable
-    try:
-        from pypinyin import lazy_pinyin
-        pinyin = ''.join(lazy_pinyin(name))
-        base_id = prefix_map.get(entity_type, '') + pinyin.lower()
-    except ImportError:
-        # When pypinyin is unavailable, use a simplified scheme
-        import hashlib
-        hash_suffix = hashlib.md5(name.encode('utf-8')).hexdigest()[:8]
-        base_id = prefix_map.get(entity_type, '') + hash_suffix
-
-    # Strip illegal characters
-    base_id = re.sub(r'[^a-z0-9_]', '', base_id)
-
-    # Resolve conflicts
-    final_id = base_id
-    counter = 1
-    while final_id in existing_ids:
-        final_id = f"{base_id}_{counter}"
-        counter += 1
-
-    return final_id
-
-
-def resolve_entity_by_alias(alias: str, entity_type: Optional[str], state: dict) -> Tuple[Optional[str], Optional[str], Optional[dict]]:
-    """
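-    Resolve an entity by alias (v4.0 one-to-many version).
-
-    Args:
-        alias: alias or name
-        entity_type: entity type hint (optional, used for disambiguation)
-        state: contents of state.json
-
-    Returns:
-        (entity_type, entity_id, entity_data) or (None, None, None)
-
-    Raises:
-        AmbiguousAliasError: the alias matches multiple entities and cannot be disambiguated (use id or add type instead)
-        ValueError: the alias_index data does not conform to the v4.0 format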
-    """
-    alias_index = state.get("alias_index", {})
-
-    # New alias_index format: {"alias": [{"type": "角色", "id": "xxx"}, ...]}
-    entries = alias_index.get(alias)
-    if not entries:
-        return (None, None, None)
-
-    if not isinstance(entries, list):
-        raise ValueError(
-            f"alias_index 数据格式错误:期望 alias_index[{alias!r}] 为 list[{{type,id,...}}],实际为 {type(entries).__name__}"
-        )
-
-    # Exactly one match -> return it directly
-    if len(entries) == 1:
-        ref = entries[0]
-        et = ref.get("type", "")
-        eid = ref.get("id", "")
-        entities_v3 = state.get("entities_v3", {})
-        entity_data = entities_v3.get(et, {}).get(eid)
-        return (et, eid, entity_data) if entity_data else (None, None, None)
-
-    # Multiple matches -> try to disambiguate by type
-    if entity_type:
-        matches = [e for e in entries if e.get("type") == entity_type]
-        if len(matches) == 1:
-            ref = matches[0]
-            et = ref.get("type", "")
-            eid = ref.get("id", "")
-            entities_v3 = state.get("entities_v3", {})
-            entity_data = entities_v3.get(et, {}).get(eid)
-            return (et, eid, entity_data) if entity_data else (None, None, None)
-
-    # The ambiguity cannot be resolved: raise an error to avoid writing to the wrong entity
-    raise AmbiguousAliasError(f"别名歧义: {alias!r} 命中 {len(entries)} 个实体,请改用 id 或补充 type 属性")
-
-
-def ensure_entities_v3_structure(state: dict) -> dict:
-    """
-    Ensure state.json has the entities_v3 and alias_index structures.
-
-    entities_v3 format:
-    {
-        "角色": {
-            "lintian": {
-                "id": "lintian",
-                "canonical_name": "林天",
-                "aliases": ["废物", "林天"],
-                "tier": "核心",
-                "current": {...},
-                "history": [...],
-                "created_chapter": 1
-            }
-        },
-        "地点": {...},
-        ...
-    }
-
-    alias_index format (v4.0 one-to-many):
-    {
-        "废物": [{"type": "角色", "id": "lintian"}],
-        "天云宗": [
-            {"type": "地点", "id": "loc_tianyunzong"},
-            {"type": "势力", "id": "faction_tianyunzong"}
-        ],
-        ...
-    }
-    """
-    if "entities_v3" not in state:
-        state["entities_v3"] = {
-            "角色": {},
-            "地点": {},
-            "物品": {},
-            "势力": {},
-            "招式": {}
-        }
-
-    if "alias_index" not in state:
-        state["alias_index"] = {}
-
-    return state
-
-
-_XML_ATTR_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_-]*)\s*=\s*(["\'])(.*?)\2', re.DOTALL)
-
-
-def parse_xml_attributes(tag: str) -> Dict[str, str]:
-    """Extract an attribute dict from a fragment such as `<tag a=\"1\" b='2'/>` (no XML semantic validation)."""
-    attrs: Dict[str, str] = {}
-    for m in _XML_ATTR_RE.finditer(tag):
-        key = m.group(1).strip()
-        value = m.group(3).strip()
-        if not key:
-            continue
-        attrs[key] = value
-    return attrs
-
-
-def _line_number_from_index(text: str, index: int) -> int:
-    return text[:index].count("\n") + 1
-
-
-def extract_new_entities(file_path: str) -> List[Dict]:
-    """
-    Extract all entity tags from a chapter file (v4.0 supports XML format only).
-
-    Supported XML forms:
-      1) Self-closing: <entity type="角色" name="林天" desc="..." tier="核心" [id="lintian"] [any attributes...]/>
-      2) Paired:
-         <entity type="角色" id="lintian" name="林天" desc="..." tier="核心">
-           <alias>废物</alias>
-           <alias>林宗主</alias>
-         </entity>
-
-    Returns:
-        List[Dict]: [{"type","name","desc","tier","id?","attrs","aliases","line","source_file"}, ...]
-    """
-    p = Path(file_path)
-    text = p.read_text(encoding="utf-8")
-
-    entities: List[Dict[str, Any]] = []
-
-    # ============================================================
-    # Paired XML form: <entity ...> ... </entity> (used for nested aliases)
-    # ============================================================
-    block_pattern = re.compile(r"(?s)(<entity\b[^>]*>)(.*?)</entity>")
-    for m in block_pattern.finditer(text):
-        open_tag = m.group(1)
-        body = m.group(2)
-        attrs = parse_xml_attributes(open_tag)
-
-        entity_type = str(attrs.get("type", "")).strip()
-        entity_name = str(attrs.get("name", "")).strip()
-        if not entity_type or not entity_name:
-            continue
-
-        # Validate entity_type
-        if entity_type not in VALID_ENTITY_TYPES:
-            print(f"⚠️ 无效实体类型: {entity_type}(第{_line_number_from_index(text, m.start())}行),跳过")
-            continue
-
-        entity_desc = str(attrs.get("desc", "")).strip()
-        entity_tier = str(attrs.get("tier", "支线")).strip() or "支线"
-        if entity_tier.lower() not in ENTITY_TIER_MAP:
-            entity_tier = "支线"
-
-        entity_id = str(attrs.get("id", "")).strip() or None
-        extra_attrs = {k: v for k, v in attrs.items() if k not in {"type", "id", "name", "desc", "tier"}}
-        aliases = [a.strip() for a in re.findall(r"(?s)<alias>(.*?)</alias>", body) if str(a).strip()]
-
-        entities.append(
-            {
-                "type": entity_type,
-                "id": entity_id,
-                "name": entity_name,
-                "desc": entity_desc,
-                "tier": entity_tier,
-                "attrs": extra_attrs,
-                "aliases": aliases,
-                "line": _line_number_from_index(text, m.start()),
-                "source_file": file_path,
-            }
-        )
-
-    # ============================================================
-    # Self-closing XML form: <entity .../>
-    # ============================================================
-    self_closing_pattern = re.compile(r"<entity\b[^>]*?/\s*>")
-    for m in self_closing_pattern.finditer(text):
-        tag = m.group(0)
-        attrs = parse_xml_attributes(tag)
-
-        entity_type = str(attrs.get("type", "")).strip()
-        entity_name = str(attrs.get("name", "")).strip()
-        if not entity_type or not entity_name:
-            continue
-
-        # Validate entity_type
-        if entity_type not in VALID_ENTITY_TYPES:
-            print(f"⚠️ 无效实体类型: {entity_type}(第{_line_number_from_index(text, m.start())}行),跳过")
-            continue
-
-        entity_desc = str(attrs.get("desc", "")).strip()
-        entity_tier = str(attrs.get("tier", "支线")).strip() or "支线"
-        if entity_tier.lower() not in ENTITY_TIER_MAP:
-            entity_tier = "支线"
-
-        entity_id = str(attrs.get("id", "")).strip() or None
-        extra_attrs = {k: v for k, v in attrs.items() if k not in {"type", "id", "name", "desc", "tier"}}
-
-        entities.append(
-            {
-                "type": entity_type,
-                "id": entity_id,
-                "name": entity_name,
-                "desc": entity_desc,
-                "tier": entity_tier,
-                "attrs": extra_attrs,
-                "aliases": [],
-                "line": _line_number_from_index(text, m.start()),
-                "source_file": file_path,
-            }
-        )
-
-    return entities
-
-
-def extract_entity_alias_ops(file_path: str) -> List[Dict[str, Any]]:
-    """
-    Extract entity alias operations:
-      <entity-alias id="lintian" alias="林宗主" context="成为宗主后"/>
-      <entity-alias ref="林天" alias="不灭战神" context="晋升称号后"/>
-
-    Optional: type="角色|地点|物品|势力|招式" for disambiguation.
-    """
-    p = Path(file_path)
-    text = p.read_text(encoding="utf-8")
-
-    results: List[Dict[str, Any]] = []
-    pattern = re.compile(r"<entity[-_]alias\b[^>]*?/\s*>", re.IGNORECASE)
-    for m in pattern.finditer(text):
-        tag = m.group(0)
-        attrs = parse_xml_attributes(tag)
-
-        alias = str(attrs.get("alias", "")).strip()
-        if not alias:
-            continue
-
-        results.append(
-            {
-                "id": str(attrs.get("id", "")).strip() or None,
-                "ref": str(attrs.get("ref", "")).strip() or None,
-                "type": str(attrs.get("type", "")).strip() or None,
-                "alias": alias,
-                "context": str(attrs.get("context", "")).strip(),
-                "line": _line_number_from_index(text, m.start()),
-                "source_file": file_path,
-            }
-        )
-
-    return results
-
-
-def extract_entity_update_ops(file_path: str) -> List[Dict[str, Any]]:
-    """
-    Extract entity update operations (v4.0 supports set/unset/add/remove/inc):
-      <entity-update id="lintian">
-        <set key="realm" value="筑基期一层" reason="突破"/>
-        <unset key="bottleneck"/>
-        <add key="titles" value="不灭战神"/>
-        <remove key="allies" value="张三"/>
-        <inc key="kill_count" delta="1"/>
-      </entity-update>
-
-      <entity-update ref="林宗主" type="角色">
-        <set key="realm" value="金丹期"/>
-      </entity-update>
-
-    Optional: type="角色|地点|物品|势力|招式" for disambiguation.
-    """
-    p = Path(file_path)
-    text = p.read_text(encoding="utf-8")
-
-    results: List[Dict[str, Any]] = []
-
-    block_pattern = re.compile(r"(?s)(<entity-update\b[^>]*>)(.*?)</entity-update>", re.IGNORECASE)
-    for m in block_pattern.finditer(text):
-        open_tag = m.group(1)
-        body = m.group(2)
-        attrs = parse_xml_attributes(open_tag)
-
-        operations: List[Dict[str, Any]] = []
-
-        # <set key="..." value="..." reason="..."/>
-        for sm in re.finditer(r"<set\b[^>]*?/\s*>", body, re.IGNORECASE):
-            set_attrs = parse_xml_attributes(sm.group(0))
-            key = str(set_attrs.get("key", "")).strip()
-            value = str(set_attrs.get("value", "")).strip()
-            if not key:
-                continue
-            operations.append({
-                "op": "set",
-                "key": key,
-                "value": value,
-                "reason": str(set_attrs.get("reason", "")).strip()
-            })
-
-        # <unset key="..."/>
-        for sm in re.finditer(r"<unset\b[^>]*?/\s*>", body, re.IGNORECASE):
-            set_attrs = parse_xml_attributes(sm.group(0))
-            key = str(set_attrs.get("key", "")).strip()
-            if not key:
-                continue
-            operations.append({
-                "op": "unset",
-                "key": key,
-                "reason": str(set_attrs.get("reason", "")).strip()
-            })
-
-        # <add key="..." value="..."/>
-        for sm in re.finditer(r"<add\b[^>]*?/\s*>", body, re.IGNORECASE):
-            set_attrs = parse_xml_attributes(sm.group(0))
-            key = str(set_attrs.get("key", "")).strip()
-            value = str(set_attrs.get("value", "")).strip()
-            if not key or not value:
-                continue
-            operations.append({
-                "op": "add",
-                "key": key,
-                "value": value,
-                "reason": str(set_attrs.get("reason", "")).strip()
-            })
-
-        # <remove key="..." value="..."/>
-        for sm in re.finditer(r"<remove\b[^>]*?/\s*>", body, re.IGNORECASE):
-            set_attrs = parse_xml_attributes(sm.group(0))
-            key = str(set_attrs.get("key", "")).strip()
-            value = str(set_attrs.get("value", "")).strip()
-            if not key or not value:
-                continue
-            operations.append({
-                "op": "remove",
-                "key": key,
-                "value": value,
-                "reason": str(set_attrs.get("reason", "")).strip()
-            })
-
-        # <inc key="..." delta="..."/>
-        for sm in re.finditer(r"<inc\b[^>]*?/\s*>", body, re.IGNORECASE):
-            set_attrs = parse_xml_attributes(sm.group(0))
-            key = str(set_attrs.get("key", "")).strip()
-            delta_str = str(set_attrs.get("delta", "1")).strip()
-            if not key:
-                continue
-            try:
-                delta = int(delta_str)
-            except ValueError:
-                delta = 1
-            operations.append({
-                "op": "inc",
-                "key": key,
-                "delta": delta,
-                "reason": str(set_attrs.get("reason", "")).strip()
-            })
-
-        if not operations:
-            continue
-
-        results.append(
-            {
-                "id": str(attrs.get("id", "")).strip() or None,
-                "ref": str(attrs.get("ref", "")).strip() or None,
-                "type": str(attrs.get("type", "")).strip() or None,
-                "operations": operations,
-                "line": _line_number_from_index(text, m.start()),
-                "source_file": file_path,
-            }
-        )
-
-    return results
-
-
-def extract_golden_finger_skills(file_path: str) -> List[Dict]:
-    """
-    Extract golden-finger skill tags from a chapter file (v4.0 supports XML format only).
-
-    XML format:
-      <skill name="skill name" level="level" desc="description" cooldown="cooldown"/>
-
-      Example:
-      <skill name="时间回溯" level="1" desc="回到10秒前的状态" cooldown="24小时"/>
-
-    Returns:
-        List[Dict]: [{"name": "吞噬", "level": "Lv1", "desc": "...", "cooldown": "10秒"}, ...]
-    """
-    skills = []
-
-    with open(file_path, 'r', encoding='utf-8') as f:
-        for line_num, line in enumerate(f, 1):
-            xml_matches = re.findall(
-                r'<skill\s+name=["\']([^"\']+)["\']\s+level=["\']([^"\']+)["\']\s+desc=["\']([^"\']+)["\']\s+cooldown=["\']([^"\']+)["\']\s*/?>',
-                line
-            )
-            for match in xml_matches:
-                skills.append({
-                    "name": match[0].strip(),
-                    "level": match[1].strip(),
-                    "desc": match[2].strip(),
-                    "cooldown": match[3].strip(),
-                    "line": line_num,
-                    "source_file": file_path
-                })
-
-    return skills
-
-
-def extract_foreshadowing_json(file_path: str) -> List[Dict[str, Any]]:
-    """
-    Extract foreshadowing tags from a chapter file (v4.0 supports XML format only).
-
-    XML format:
-      <foreshadow content="foreshadowing content" tier="tier" target="target chapter" location="location" characters="character1,character2"/>
-
-      Example:
-      <foreshadow content="神秘老者留下的玉佩开始发光" tier="核心" target="50" location="废弃实验室" characters="陆辰"/>
-
-    Fields:
-      - content (required)
-      - tier (optional: 核心/支线/装饰, default 支线)
-      - planted_chapter (optional: filled in by the caller by default)
-      - target_chapter / target (optional: default planted_chapter + 100)
-      - location (optional)
-      - characters (optional: comma-separated string)
-    """
-    p = Path(file_path)
-    text = p.read_text(encoding="utf-8")
-
-    results: List[Dict[str, Any]] = []
-
-    xml_pattern = re.compile(
-        r'<foreshadow\s+'
-        r'content=["\']([^"\']+)["\']\s+'
-        r'tier=["\']([^"\']+)["\']'
-        r'(?:\s+target=["\']([^"\']*)["\'])?'
-        r'(?:\s+location=["\']([^"\']*)["\'])?'
-        r'(?:\s+characters=["\']([^"\']*)["\'])?'
-        r'\s*/?>',
-        re.DOTALL
-    )
-
-    for m in xml_pattern.finditer(text):
-        line_num = text[: m.start()].count("\n") + 1
-        content = m.group(1).strip()
-        if not content:
-            continue
-
-        tier = m.group(2).strip() or "支线"
-        if tier.lower() not in ENTITY_TIER_MAP:
-            tier = "支线"
-
-        target_str = m.group(3)
-        target_chapter = None
-        if target_str:
-            try:
-                target_chapter = int(target_str.strip())
-            except (TypeError, ValueError):
-                pass
-
-        location = (m.group(4) or "").strip()
-
-        characters_str = m.group(5) or ""
-        characters_list = [c.strip() for c in re.split(r"[,,]", characters_str) if c.strip()]
-
-        results.append({
-            "content": content,
-            "tier": tier,
-            "planted_chapter": None,
-            "target_chapter": target_chapter,
-            "location": location,
-            "characters": characters_list,
-            "line": line_num,
-            "source_file": str(p),
-        })
-
-    return results
-
-
-def extract_deviations(file_path: str) -> List[Dict[str, Any]]:
-    """
-    Extract outline-deviation tags from a chapter file (v4.0 supports XML format only).
-
-    XML format:
-      <deviation reason="reason for the deviation"/>
-
-      Example:
-      <deviation reason="临时灵感,增加李薇与陆辰的情感互动,为后续感情线铺垫"/>
-
-    Returns:
-        List[Dict]: [{"reason": "...", "line": 123}, ...]
-    """
-    p = Path(file_path)
-    text = p.read_text(encoding="utf-8")
-
-    results: List[Dict[str, Any]] = []
-
-    xml_pattern = re.compile(
-        r'<deviation\s+reason=["\']([^"\']+)["\']\s*/?>',
-        re.DOTALL
-    )
-
-    for m in xml_pattern.finditer(text):
-        line_num = text[: m.start()].count("\n") + 1
-        reason = m.group(1).strip()
-        if reason:
-            results.append({
-                "reason": reason,
-                "line": line_num,
-                "source_file": str(p),
-            })
-
-    return results
-
-
-def extract_relationships(file_path: str) -> List[Dict[str, Any]]:
-    """
-    Extract character relationship tags from a chapter file.
-
-    XML format (entity_id is recommended, to avoid broken links after a rename):
-      <relationship char1_id="lintian" char2_id="lixue" type="romance" intensity="60" desc="暧昧中,互有好感"/>
-      <relationship char1="林天" char2="李雪" type="romance" intensity="60" desc="暧昧中,互有好感"/>
-
-      Examples:
-      <relationship char1="林天" char2="李雪" type="romance" intensity="60" desc="暧昧中,互有好感"/>
-      <relationship char1="林天" char2="王少" type="enemy" intensity="90" desc="杀父之仇"/>
-      <relationship char1="林天" char2="云长老" type="mentor" intensity="80" desc="师徒关系,受其指点"/>
-
-    Relationship types (type):
-      - ally: ally
-      - enemy: enemy
-      - romance: lover / budding romance
-      - mentor: mentor and disciple
-      - debtor: debts and grudges (owing or being owed a favor)
-      - family: family / blood relation
-      - rival: competitor
-
-    Intensity (intensity): 0-100; higher means a stronger relationship
-
-    Returns:
-        List[Dict]: [{"char1","char2","char1_id?","char2_id?","type","intensity","desc",...}, ...]
-    """
-    p = Path(file_path)
-    text = p.read_text(encoding="utf-8")
-
-    results: List[Dict[str, Any]] = []
-
-    valid_types = {"ally", "enemy", "romance", "mentor", "debtor", "family", "rival"}
-
-    # XML form: <relationship .../>
-    xml_pattern = re.compile(r"<relationship\b[^>]*?/\s*>", re.IGNORECASE)
-    for m in xml_pattern.finditer(text):
-        line_num = text[: m.start()].count("\n") + 1
-        attrs = parse_xml_attributes(m.group(0))
-
-        char1 = str(attrs.get("char1", "")).strip()
-        char2 = str(attrs.get("char2", "")).strip()
-        char1_id = str(attrs.get("char1_id", "")).strip() or None
-        char2_id = str(attrs.get("char2_id", "")).strip() or None
-        rel_type = str(attrs.get("type", "")).strip().lower() or "ally"
-        intensity_str = str(attrs.get("intensity", "")).strip() or "50"
-        desc = str(attrs.get("desc", "")).strip()
-
-        if not ((char1_id or char1) and (char2_id or char2)):
-            continue
-
-        # Validate the relationship type
-        if rel_type not in valid_types:
-            print(f"⚠️ 未知关系类型 '{rel_type}'(第{line_num}行),使用默认 'ally'")
-            rel_type = "ally"
-
-        # Parse the intensity
-        try:
-            intensity = int(intensity_str)
-            intensity = max(0, min(100, intensity))  # clamp to 0-100
-        except ValueError:
-            intensity = 50  # default to medium intensity
-
-        results.append({
-            "char1": char1,
-            "char2": char2,
-            "char1_id": char1_id,
-            "char2_id": char2_id,
-            "type": rel_type,
-            "intensity": intensity,
-            "desc": desc,
-            "line": line_num,
-            "source_file": str(p),
-        })
-
-    return results
-
-
-def categorize_character(desc: str) -> str:
-    """
-    Determine the character category from the description.
-
-    Rules:
-      - contains "主角"/"重要" → 主要角色
-      - contains "反派"/"敌对"/"血煞" → 反派角色
-      - otherwise → 次要角色
-    """
-    if "主角" in desc or "重要" in desc:
-        return "主要角色"
-    elif "反派" in desc or "敌对" in desc or "血煞" in desc:
-        return "反派角色"
-    else:
-        return "次要角色"
-
-def generate_character_card(entity: Dict, category: str) -> str:
-    """Generate the Markdown content of a character card"""
-    return f"""# {entity['name']}
-
-> **首次登场**: {entity.get('source_file', '未知')}(第 {entity.get('line', '?')} 行)
-> **创建时间**: {datetime.now().strftime('%Y-%m-%d')}
-
-## 基本信息
-
-- **姓名**: {entity['name']}
-- **性别**: 待补充
-- **年龄**: 待补充
-- **身份**: {entity['desc']}
-- **所属势力**: 待补充
-
-## 实力设定
-
-- **当前境界**: 待补充
-- **擅长招式**: 待补充
-- **特殊能力**: 待补充
-
-## 性格特点
-
-{entity['desc']}
-
-## 外貌描述
-
-待补充
-
-## 人际关系
-
-- **与主角**: 待补充
-
-## 重要剧情
-
-- 【第 X 章】{entity['desc']}
-
-## 备注
-
-自动提取自 `<entity/>` 标签,请补充完善。
-"""
-
-def update_world_view(entity: Dict, target_file: str, section: str):
-    """Update 世界观.md (append location/faction information)"""
-    if not os.path.exists(target_file):
-        # Create a basic template
-        content = f"""# 世界观
-
-## 地理
-
-## 势力
-
-## 历史背景
-
-"""
-        with open(target_file, 'w', encoding='utf-8') as f:
-            f.write(content)
-
-    # Read the existing content
-    with open(target_file, 'r', encoding='utf-8') as f:
-        content = f.read()
-
-    # Build the entry for the matching section
-    if section == "地理":
-        entry = f"""
-### {entity['name']}
-
-{entity['desc']}
-
-> 首次登场: {entity.get('source_file', '未知')}
-"""
-    elif section == "势力":
-        entry = f"""
-### {entity['name']}
-
-{entity['desc']}
-
-> 首次登场: {entity.get('source_file', '未知')}
-"""
-
-    # Append right after the matching section heading
-    pattern = f"## {section}"
-    if pattern in content:
-        content = content.replace(pattern, f"{pattern}\n{entry}")
-    else:
-        content += f"\n## {section}\n{entry}"
-
-    with open(target_file, 'w', encoding='utf-8') as f:
-        f.write(content)
-
-def update_power_system(entity: Dict, target_file: str):
-    """Update 力量体系.md (append techniques)"""
-    if not os.path.exists(target_file):
-        content = f"""# 力量体系
-
-## 境界划分
-
-## 修炼方法
-
-## 招式库
-
-"""
-        with open(target_file, 'w', encoding='utf-8') as f:
-            f.write(content)
-
-    with open(target_file, 'r', encoding='utf-8') as f:
-        content = f.read()
-
-    entry = f"""
-### {entity['name']}
-
-{entity['desc']}
-
-> 首次登场: {entity.get('source_file', '未知')}
-"""
-
-    if "## 招式库" in content:
-        content = content.replace("## 招式库", f"## 招式库\n{entry}")
-    else:
-        content += f"\n## 招式库\n{entry}"
-
-    with open(target_file, 'w', encoding='utf-8') as f:
-        f.write(content)
-
-def update_state_json(
-    entities: List[Dict],
-    state_file: str,
-    golden_finger_skills: Optional[List[Dict]] = None,
-    foreshadowing_items: Optional[List[Dict[str, Any]]] = None,
-    relationship_items: Optional[List[Dict[str, Any]]] = None,
-    entity_alias_ops: Optional[List[Dict[str, Any]]] = None,
-    entity_update_ops: Optional[List[Dict[str, Any]]] = None,
-    *,
-    default_planted_chapter: Optional[int] = None,
-):
-    """Update state.json (entities/aliases/attribute updates + golden-finger skills/foreshadowing/relationships)."""
-
-    def _to_int(value: Any, default: int = 0) -> int:
-        try:
-            return int(value)
-        except (TypeError, ValueError):
-            return default
-
-    with open(state_file, 'r', encoding='utf-8') as f:
-        state = json.load(f)
-
-    first_seen_chapter = _to_int(default_planted_chapter, 0)
-    project_root = Path(state_file).resolve().parent.parent
-
-    # Make sure the golden-finger skill list exists
-    if 'protagonist_state' not in state:
-        state['protagonist_state'] = {}
-    golden_finger = state['protagonist_state'].get('golden_finger')
-    if not isinstance(golden_finger, dict):
-        golden_finger = {}
-        state['protagonist_state']['golden_finger'] = golden_finger
-    golden_finger.setdefault("name", "")
-    golden_finger.setdefault("level", 1)
-    golden_finger.setdefault("cooldown", 0)
-    golden_finger.setdefault("skills", [])
-
-    # --- Entity alias/update system (entities_v3 + alias_index) ---
-    state = ensure_entities_v3_structure(state)
-
-    entity_alias_ops = entity_alias_ops or []
-    entity_update_ops = entity_update_ops or []
-
-    touched = set()
-
-    def _normalize_entity_type(raw: Any) -> str:
-        t = normalize_entity_type(raw)
-        if not t or t not in state.get("entities_v3", {}):
-            return ""
-        return t
-
-    def _normalize_first_appearance(source_file: Any) -> str:
-        raw = str(source_file or "").strip()
-        if not raw:
-            return ""
-        try:
-            p = Path(raw)
-            if not p.is_absolute():
-                p = (Path.cwd() / p).resolve()
-            if p == project_root or project_root in p.parents:
-                return str(p.relative_to(project_root)).replace("\\", "/")
-            return str(p).replace("\\", "/")
-        except Exception:
-            return raw.replace("\\", "/")
-
-    def _resolve_by_id(entity_id: Any, entity_type: Optional[str]) -> tuple[Optional[str], Optional[str], Optional[dict]]:
-        eid = str(entity_id or "").strip()
-        if not eid:
-            return (None, None, None)
-
-        if entity_type:
-            et = _normalize_entity_type(entity_type)
-            data = state.get("entities_v3", {}).get(et, {}).get(eid)
-            return (et, eid, data) if isinstance(data, dict) else (None, None, None)
-
-        hits: list[tuple[str, dict]] = []
-        for et, bucket in (state.get("entities_v3") or {}).items():
-            if isinstance(bucket, dict) and eid in bucket:
-                data = bucket.get(eid)
-                if isinstance(data, dict):
-                    hits.append((et, data))
-        if len(hits) == 1:
-            return (hits[0][0], eid, hits[0][1])
-        return (None, None, None)
-
-    def _resolve_ref(ref: Any, entity_type: Optional[str]) -> tuple[Optional[str], Optional[str], Optional[dict]]:
-        """Resolve an entity by alias/name (v4.0 uses a one-to-many alias_index)."""
-        r = str(ref or "").strip()
-        if not r:
-            return (None, None, None)
-
-        # Use the new resolve_entity_by_alias (one-to-many support + ambiguity detection)
-        et_hint = _normalize_entity_type(entity_type) if entity_type else None
-        et, eid, data = resolve_entity_by_alias(r, et_hint, state)
-        if et and eid and isinstance(data, dict):
-            return (et, eid, data)
-
-        return (None, None, None)
-
-    def _register_alias(entity_type: str, entity_id: str, alias: Any, *, context: str = "", first_seen: int = 0) -> None:
-        """Register an alias in alias_index (v4.0 one-to-many version)."""
-        a = str(alias or "").strip()
-        if not a:
-            return
-
-        state.setdefault("alias_index", {})
-        alias_index = state["alias_index"]
-
-        # New format: alias_index[alias] = [{type, id, first_seen_chapter?, context?}, ...]
-        entries = alias_index.get(a)
-        if entries is None:
-            entries = []
-        if not isinstance(entries, list):
-            raise ValueError(
-                f"alias_index 数据格式错误:期望 alias_index[{a!r}] 为 list[{{type,id,...}}],实际为 {type(entries).__name__}"
-            )
-
-        # Check whether the same (type, id) pair is already registered
-        new_entry: Dict[str, Any] = {"type": entity_type, "id": entity_id}
-        if first_seen:
-            new_entry["first_seen_chapter"] = int(first_seen)
-        if context:
-            new_entry["context"] = context
-        for existing in entries:
-            if existing.get("type") == entity_type and existing.get("id") == entity_id:
-                # Backfill first-seen chapter/context (only fill missing values)
-                if first_seen and not existing.get("first_seen_chapter"):
-                    existing["first_seen_chapter"] = int(first_seen)
-                if context and not existing.get("context"):
-                    existing["context"] = context
-                return  # Already registered; nothing more to do
-
-        # Add the new entry
-        entries.append(new_entry)
-        alias_index[a] = entries
-
-        # Also update the entity's own aliases list
-        data = state.get("entities_v3", {}).get(entity_type, {}).get(entity_id)
-        if not isinstance(data, dict):
-            return
-        data.setdefault("aliases", [])
-        if a not in data["aliases"]:
-            data["aliases"].append(a)
-
-    def _ensure_v3_entity(entity_type: str, entity_id: str, canonical_name: str, *, tier: str, desc: str, first_appearance: str) -> dict:
-        bucket = state.setdefault("entities_v3", {}).setdefault(entity_type, {})
-        data = bucket.get(entity_id)
-        if not isinstance(data, dict):
-            data = {
-                "id": entity_id,
-                "canonical_name": canonical_name,
-                "aliases": [],
-                "tier": tier or "支线",
-                "desc": desc or "",
-                "current": {},
-                "history": [],
-                "created_chapter": first_seen_chapter or 1,
-                "first_appearance": first_appearance or "",
-            }
-            bucket[entity_id] = data
-
-        if canonical_name and not data.get("canonical_name"):
-            data["canonical_name"] = canonical_name
-        if tier and str(tier).lower() in ENTITY_TIER_MAP:
-            data["tier"] = tier
-        if desc:
-            data["desc"] = desc
-        if first_appearance and not data.get("first_appearance"):
-            data["first_appearance"] = first_appearance
-
-        data.setdefault("current", {})
-        data.setdefault("history", [])
-        data.setdefault("aliases", [])
-        return data
-
-    def _apply_operations(entity_type: str, entity_id: str, data: dict, operations: List[Dict[str, Any]]) -> None:
-        """Apply entity update operations (v4.0 supports set/unset/add/remove/inc + top-level fields)."""
-        if not operations:
-            return
-
-        current = data.setdefault("current", {})
-        changes: Dict[str, Any] = {}
-        reasons: Dict[str, str] = {}
-
-        def _rename(new_name: str, reason: str = "") -> None:
-            new_name = str(new_name or "").strip()
-            if not new_name:
-                return
-            old_name = str(data.get("canonical_name", "")).strip()
-            if old_name and old_name != new_name:
-                _register_alias(entity_type, entity_id, old_name, first_seen=first_seen_chapter)
-            data["canonical_name"] = new_name
-            _register_alias(entity_type, entity_id, new_name, first_seen=first_seen_chapter)
-            changes["canonical_name"] = new_name
-            if reason:
-                reasons["canonical_name"] = reason
-
-        for op_item in operations:
-            op = str(op_item.get("op", "set")).strip().lower()
-            key = str(op_item.get("key", "")).strip()
-            reason = str(op_item.get("reason", "")).strip()
-            if not key:
-                continue
-
-            # Top-level field handling
-            if key in TOP_LEVEL_FIELDS:
-                if op == "set":
-                    value = str(op_item.get("value", "")).strip()
-                    if key == "canonical_name":
-                        _rename(value, reason)
-                    elif key == "tier":
-                        # Validate the tier value
-                        if value.lower() in ENTITY_TIER_MAP or value in {"核心", "支线", "装饰"}:
-                            if data.get("tier") != value:
-                                data["tier"] = value
-                                changes["tier"] = value
-                                if reason:
-                                    reasons["tier"] = reason
-                        else:
-                            print(f"⚠️ 无效 tier 值: {value},跳过")
-                    else:
-                        if data.get(key) != value:
-                            data[key] = value
-                            changes[key] = value
-                            if reason:
-                                reasons[key] = reason
-                elif op == "unset":
-                    if key in data:
-                        del data[key]
-                        changes[key] = None
-                        if reason:
-                            reasons[key] = reason
-                continue
-
-            # Special key aliases for canonical_name
-            if key in {"name", "canonical_name"} and op == "set":
-                value = str(op_item.get("value", "")).strip()
-                _rename(value, reason)
-                continue
-
-            # Operations on fields under current
-            if op == "set":
-                value = str(op_item.get("value", "")).strip()
-                prev = current.get(key)
-                if prev != value:
-                    current[key] = value
-                    changes[key] = value
-                    if reason:
-                        reasons[key] = reason
-
-            elif op == "unset":
-                if key in current:
-                    del current[key]
-                    changes[key] = None
-                    if reason:
-                        reasons[key] = reason
-
-            elif op == "add":
-                value = str(op_item.get("value", "")).strip()
-                if not value:
-                    continue
-                arr = current.get(key, [])
-                if not isinstance(arr, list):
-                    arr = [arr] if arr else []
-                if value not in arr:
-                    arr.append(value)
-                    current[key] = arr
-                    changes[key] = arr
-                    if reason:
-                        reasons[key] = reason
-
-            elif op == "remove":
-                value = str(op_item.get("value", "")).strip()
-                if not value:
-                    continue
-                arr = current.get(key, [])
-                if isinstance(arr, list) and value in arr:
-                    arr.remove(value)
-                    current[key] = arr
-                    changes[key] = arr
-                    if reason:
-                        reasons[key] = reason
-
-            elif op == "inc":
-                delta = op_item.get("delta", 1)
-                try:
-                    delta = int(delta)
-                except (TypeError, ValueError):
-                    delta = 1
-                prev = current.get(key, 0)
-                try:
-                    prev = int(prev)
-                except (TypeError, ValueError):
-                    prev = 0
-                new_val = prev + delta
-                current[key] = new_val
-                changes[key] = new_val
-                if reason:
-                    reasons[key] = reason
-
-        if first_seen_chapter:
-            current["last_chapter"] = max(_to_int(current.get("last_chapter"), 0), first_seen_chapter)
-
-        if changes:
-            entry: Dict[str, Any] = {"chapter": first_seen_chapter or 0, "changes": changes}
-            if reasons:
-                entry["reasons"] = reasons
-            entry["added_at"] = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
-            data.setdefault("history", []).append(entry)
-
-    # 1) Handle <entity .../> / <entity>...</entity>
-    for entity in entities or []:
-        entity_type = _normalize_entity_type(entity.get("type", ""))
-        name = str(entity.get("name", "")).strip()
-        if not name:
-            continue
-
-        raw_id = entity.get("id")
-        entity_id = (str(raw_id).strip() if raw_id is not None else "") or None
-        data: Optional[dict] = None
-
-        if entity_id:
-            _, _, data = _resolve_by_id(entity_id, entity_type)
-        else:
-            _, rid, rdata = _resolve_ref(name, entity_type)
-            if rid and isinstance(rdata, dict):
-                entity_id = rid
-                data = rdata
-
-        if not entity_id:
-            existing_ids = set((state.get("entities_v3") or {}).get(entity_type, {}).keys())
-            entity_id = generate_entity_id(entity_type, name, existing_ids)
-
-        first_appearance = _normalize_first_appearance(entity.get("source_file", ""))
-        tier = str(entity.get("tier", "支线")).strip() or "支线"
-        if tier.lower() not in ENTITY_TIER_MAP:
-            tier = "支线"
-        desc = str(entity.get("desc", "")).strip()
-
-        data = _ensure_v3_entity(entity_type, entity_id, name, tier=tier, desc=desc, first_appearance=first_appearance)
-
-        # canonical name & aliases
-        _register_alias(entity_type, entity_id, str(data.get("canonical_name", "")).strip() or name, first_seen=first_seen_chapter)
-        _register_alias(entity_type, entity_id, name, first_seen=first_seen_chapter)
-        for alias in (entity.get("aliases") or []):
-            _register_alias(entity_type, entity_id, alias, first_seen=first_seen_chapter)
-
-        # attribute updates (auto mode)
-        extra_attrs = entity.get("attrs") or {}
-        if isinstance(extra_attrs, dict) and extra_attrs:
-            ops = [{"op": "set", "key": k, "value": str(v), "reason": ""} for k, v in extra_attrs.items()]
-            _apply_operations(entity_type, entity_id, data, ops)
-
-        touched.add((entity_type, entity_id))
-
-    # 2) Handle <entity-alias .../>
-    for op in entity_alias_ops:
-        alias = str(op.get("alias", "")).strip()
-        if not alias:
-            continue
-
-        hint = op.get("type")
-        entity_type_hint = _normalize_entity_type(hint) if hint else None
-
-        et: Optional[str] = None
-        eid: Optional[str] = None
-        data: Optional[dict] = None
-
-        if op.get("id"):
-            et, eid, data = _resolve_by_id(op.get("id"), entity_type_hint)
-        elif op.get("ref"):
-            et, eid, data = _resolve_ref(op.get("ref"), entity_type_hint)
-
-        if not (et and eid and isinstance(data, dict)):
-            print(f"⚠️ entity-alias 无法解析引用: id={op.get('id')!r} ref={op.get('ref')!r}")
-            continue
-
-        _register_alias(et, eid, alias, context=str(op.get("context", "")).strip(), first_seen=first_seen_chapter)
-        touched.add((et, eid))
-
-    # 3) Handle <entity-update>...</entity-update>
-    for op in entity_update_ops:
-        operations = op.get("operations") or []
-        if not isinstance(operations, list) or not operations:
-            continue
-
-        hint = op.get("type")
-        entity_type_hint = _normalize_entity_type(hint) if hint else None
-
-        et: Optional[str] = None
-        eid: Optional[str] = None
-        data: Optional[dict] = None
-
-        if op.get("id"):
-            et, eid, data = _resolve_by_id(op.get("id"), entity_type_hint)
-        elif op.get("ref"):
-            et, eid, data = _resolve_ref(op.get("ref"), entity_type_hint)
-
-        if not (et and eid and isinstance(data, dict)):
-            print(f"⚠️ entity-update 无法解析引用: id={op.get('id')!r} ref={op.get('ref')!r}")
-            continue
-
-        _apply_operations(et, eid, data, operations)
-        touched.add((et, eid))
-
-    # 4) Update golden-finger skills
-    if golden_finger_skills:
-        existing = state['protagonist_state']['golden_finger'].get('skills', [])
-        if not isinstance(existing, list):
-            existing = []
-            state['protagonist_state']['golden_finger']['skills'] = existing
-
-        existing_by_name = {s.get("name"): s for s in existing if isinstance(s, dict) and s.get("name")}
-        for skill in golden_finger_skills:
-            if not isinstance(skill, dict):
-                continue
-
-            name = str(skill.get("name", "")).strip()
-            if not name:
-                continue
-
-            level = str(skill.get("level", "")).strip()
-            desc = str(skill.get("desc", "")).strip()
-            cooldown = str(skill.get("cooldown", "")).strip()
-            source_file = str(skill.get("source_file", "")).strip()
-
-            existing_skill = existing_by_name.get(name)
-            if existing_skill is None:
-                new_skill = {
-                    "name": name,
-                    "level": level,
-                    "desc": desc,
-                    "cooldown": cooldown,
-                    "unlocked_at": source_file,
-                    "added_at": datetime.now().strftime('%Y-%m-%d')
-                }
-                existing.append(new_skill)
-                existing_by_name[name] = new_skill
-                print(f"  ✨ 新增金手指技能: {name} ({level})")
-                continue
-
-            changed = False
-            if level and existing_skill.get("level") != level:
-                existing_skill["level"] = level
-                changed = True
-            if desc and existing_skill.get("desc") != desc:
-                existing_skill["desc"] = desc
-                changed = True
-            if cooldown and existing_skill.get("cooldown") != cooldown:
-                existing_skill["cooldown"] = cooldown
-                changed = True
-            if source_file and not existing_skill.get("unlocked_at"):
-                existing_skill["unlocked_at"] = source_file
-                changed = True
-
-            if changed:
-                existing_skill["updated_at"] = datetime.now().strftime('%Y-%m-%d')
-                print(f"  🔁 更新金手指技能: {name} ({existing_skill.get('level', level)})")
-
-    # Update foreshadowing (structured)
-    if foreshadowing_items:
-        state.setdefault("plot_threads", {"active_threads": [], "foreshadowing": []})
-        state["plot_threads"].setdefault("foreshadowing", [])
-
-        existing = state["plot_threads"]["foreshadowing"]
-
-        for item in foreshadowing_items:
-            content = str(item.get("content", "")).strip()
-            if not content:
-                continue
-
-            planted = item.get("planted_chapter") or default_planted_chapter or 1
-            try:
-                planted = int(planted)
-            except (TypeError, ValueError):
-                planted = default_planted_chapter or 1
-
-            target = item.get("target_chapter")
-            if target is None:
-                target = planted + 100
-            try:
-                target = int(target)
-            except (TypeError, ValueError):
-                target = planted + 100
-
-            tier = str(item.get("tier", "支线")).strip() or "支线"
-            if tier.lower() not in ENTITY_TIER_MAP:
-                tier = "支线"
-
-            location = str(item.get("location", "")).strip()
-            characters = item.get("characters", [])
-            if not isinstance(characters, list):
-                characters = []
-
-            found = None
-            for old in existing:
-                if old.get("content") == content:
-                    found = old
-                    break
-
-            if found is None:
-                existing.append({
-                    "content": content,
-                    "status": "未回收",
-                    "tier": tier,
-                    "planted_chapter": planted,
-                    "target_chapter": target,
-                    "location": location,
-                    "characters": characters,
-                    "added_at": datetime.now().strftime("%Y-%m-%d"),
-                })
-                print(f"  🧩 新增伏笔: {content[:30]}...")
-            else:
-                found["tier"] = tier
-                found["planted_chapter"] = planted
-                found["target_chapter"] = target
-                if location:
-                    found["location"] = location
-
-                old_chars = found.get("characters", [])
-                if not isinstance(old_chars, list):
-                    old_chars = []
-                merged = []
-                seen = set()
-                for n in [*old_chars, *characters]:
-                    s = str(n).strip()
-                    if not s or s in seen:
-                        continue
-                    merged.append(s)
-                    seen.add(s)
-                found["characters"] = merged
-
-    # Update relationships (structured; using entity_id is recommended)
-    if relationship_items:
-        state.setdefault("structured_relationships", [])
-        existing = state["structured_relationships"]
-
-        for item in relationship_items:
-            # Prefer an explicit entity_id; otherwise resolve by alias (disambiguation enforced)
-            char1_id = str(item.get("char1_id", "") or "").strip()
-            char2_id = str(item.get("char2_id", "") or "").strip()
-            char1_ref = str(item.get("char1", "")).strip()
-            char2_ref = str(item.get("char2", "")).strip()
-
-            # relationship only accepts character (角色) entities
-            if char1_id:
-                _, rid, rdata = _resolve_by_id(char1_id, "角色")
-                if not rid or not isinstance(rdata, dict):
-                    raise ValueError(f"relationship.char1_id 无法解析: {char1_id!r}")
-                char1_id = rid
-                char1_name = str(rdata.get("canonical_name", "")).strip() or char1_ref
-            else:
-                _, rid, rdata = _resolve_ref(char1_ref, "角色")
-                if not rid or not isinstance(rdata, dict):
-                    raise ValueError(f"relationship.char1 无法解析: {char1_ref!r}")
-                char1_id = rid
-                char1_name = str(rdata.get("canonical_name", "")).strip() or char1_ref
-
-            if char2_id:
-                _, rid, rdata = _resolve_by_id(char2_id, "角色")
-                if not rid or not isinstance(rdata, dict):
-                    raise ValueError(f"relationship.char2_id 无法解析: {char2_id!r}")
-                char2_id = rid
-                char2_name = str(rdata.get("canonical_name", "")).strip() or char2_ref
-            else:
-                _, rid, rdata = _resolve_ref(char2_ref, "角色")
-                if not rid or not isinstance(rdata, dict):
-                    raise ValueError(f"relationship.char2 无法解析: {char2_ref!r}")
-                char2_id = rid
-                char2_name = str(rdata.get("canonical_name", "")).strip() or char2_ref
-
-            rel_type = str(item.get("type", "ally")).strip().lower() or "ally"
-            intensity = item.get("intensity", 50)
-            desc = str(item.get("desc", "")).strip()
-
-            try:
-                intensity = int(intensity)
-                intensity = max(0, min(100, intensity))
-            except (TypeError, ValueError):
-                intensity = 50
-
-            # Look for an existing identical relationship
-            found = None
-            for old in existing:
-                if (
-                    old.get("char1_id") == char1_id
-                    and old.get("char2_id") == char2_id
-                    and old.get("type") == rel_type
-                ):
-                    found = old
-                    break
-
-            if found is None:
-                existing.append({
-                    "char1_id": char1_id,
-                    "char2_id": char2_id,
-                    "char1_name": char1_name,
-                    "char2_name": char2_name,
-                    "type": rel_type,
-                    "intensity": intensity,
-                    "description": desc,
-                    "last_update_chapter": default_planted_chapter or 1,
-                    "added_at": datetime.now().strftime("%Y-%m-%d"),
-                })
-                print(f"  💕 新增关系: {char1_name} ↔ {char2_name} ({rel_type}, 强度 {intensity})")
-            else:
-                # Update intensity and description
-                found["intensity"] = intensity
-                found["description"] = desc
-                found["last_update_chapter"] = default_planted_chapter or found.get("last_update_chapter", 1)
-                found.setdefault("char1_name", char1_name)
-                found.setdefault("char2_name", char2_name)
-                print(f"  💕 更新关系: {char1_name} ↔ {char2_name} ({rel_type}, 强度 {intensity})")
-
-    # Use the centralized atomic write (with filelock + automatic backup)
-    atomic_write_json(state_file, state, use_lock=True, backup=True)
-    print("✅ state.json 已原子化更新(带备份)")
-
-def sync_entity_to_settings(entity: Dict, project_root: str, auto_mode: bool = False) -> bool:
-    """
-    Sync an entity into the settings collection (设定集).
-
-    Returns:
-        bool: whether the sync succeeded
-    """
-    entity_type = normalize_entity_type(entity.get('type'))
-    entity_name = entity['name']
-
-    if entity_type == "角色":
-        category = categorize_character(entity['desc'])
-        category_dir = ROLE_CATEGORY_MAP.get(category.split('/')[0], "次要角色")
-
-        target_dir = Path(project_root) / f"设定集/角色库/{category_dir}"
-        # ============================================================================
-        # Security fix: use the secure directory-creation helper (file permission fix)
-        # ============================================================================
-        create_secure_directory(str(target_dir))
-
-        # ============================================================================
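-        # Security fix: sanitize the filename to prevent path traversal (CWE-22) - P0 CRITICAL
-        # Original code: target_file = target_dir / f"{entity_name}.md"
-        # Vulnerability: entity_name may contain "../", enabling a directory traversal attack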
-        # ============================================================================
-        safe_entity_name = sanitize_filename(entity_name)
-        target_file = target_dir / f"{safe_entity_name}.md"
-
-        if target_file.exists():
-            print(f"⚠️  角色卡已存在: {target_file}")
-            if not auto_mode:
-                choice = input("是否覆盖?(y/n): ")
-                if choice.lower() != 'y':
-                    return False
-
-        with open(target_file, 'w', encoding='utf-8') as f:
-            f.write(generate_character_card(entity, category))
-
-        print(f"✅ 已创建角色卡: {target_file}")
-        return True
-
-    elif entity_type == "地点":
-        target_file = Path(project_root) / "设定集/世界观.md"
-        update_world_view(entity, str(target_file), "地理")
-        print(f"✅ 已更新世界观(地理): {entity_name}")
-        return True
-
-    elif entity_type == "势力":
-        target_file = Path(project_root) / "设定集/世界观.md"
-        update_world_view(entity, str(target_file), "势力")
-        print(f"✅ 已更新世界观(势力): {entity_name}")
-        return True
-
-    elif entity_type == "招式":
-        target_file = Path(project_root) / "设定集/力量体系.md"
-        update_power_system(entity, str(target_file))
-        print(f"✅ 已更新力量体系(招式): {entity_name}")
-        return True
-
-    elif entity_type == "物品":
-        target_dir = Path(project_root) / "设定集/物品库"
-        # ============================================================================
-        # Security fix: use the secure directory-creation helper (file permission fix)
-        # ============================================================================
-        create_secure_directory(str(target_dir))
-
-        # ============================================================================
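-        # Security fix: sanitize the filename to prevent path traversal (CWE-22) - P0 CRITICAL
-        # Original code: target_file = target_dir / f"{entity_name}.md"
-        # Vulnerability: entity_name may contain "../", enabling a directory traversal attack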
-        # ============================================================================
-        safe_entity_name = sanitize_filename(entity_name)
-        target_file = target_dir / f"{safe_entity_name}.md"
-
-        if target_file.exists():
-            print(f"⚠️  物品卡已存在: {target_file}")
-            if not auto_mode:
-                choice = input("是否覆盖?(y/n): ")
-                if choice.lower() != 'y':
-                    return False
-
-        content = f"""# {entity_name}
-
-> **首次登场**: {entity.get('source_file', '未知')}
-> **创建时间**: {datetime.now().strftime('%Y-%m-%d')}
-
-## 基本信息
-
-{entity['desc']}
-
-## 详细设定
-
-待补充
-
-## 相关剧情
-
-- 【第 X 章】首次出现
-
-## 备注
-
-自动提取自 `<entity/>` 标签,请补充完善。
-"""
-
-        with open(target_file, 'w', encoding='utf-8') as f:
-            f.write(content)
-
-        print(f"✅ 已创建物品卡: {target_file}")
-        return True
-
-    else:
-        print(f"⚠️  未知实体类型: {entity_type}")
-        return False
-
-def main():
-    parser = argparse.ArgumentParser(
-        description="XML 标签提取与同步 (<entity/>, <entity-alias/>, <entity-update>, <skill/>, <foreshadow/>, <deviation/>, <relationship/>)",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-        epilog="""
-示例:
-  # 指定文件(兼容卷目录)
-  python extract_entities.py "webnovel-project/正文/第1卷/第001章-死亡降临.md" --auto
-
-  # 指定章节号(推荐)
-  python extract_entities.py --project-root "webnovel-project" --chapter 1 --auto
-""".strip(),
-    )
-
-    parser.add_argument("chapter_file", nargs="?", help="章节文件路径(或使用 --chapter)")
-    parser.add_argument("--chapter", type=int, help="章节号(与 --project-root 配合,自动定位章节文件)")
-    parser.add_argument("--project-root", default=None, help="项目根目录(包含 .webnovel/state.json)")
-    parser.add_argument("--auto", action="store_true", help="自动模式(非交互)")
-    parser.add_argument("--dry-run", action="store_true", help="仅预览,不写入文件/状态")
-
-    args = parser.parse_args()
-
-    auto_mode = args.auto
-    dry_run = args.dry_run
-
-    project_root: Optional[Path] = None
-    if args.project_root:
-        project_root = resolve_project_root(args.project_root)
-    else:
-        try:
-            project_root = resolve_project_root()
-        except FileNotFoundError:
-            project_root = None
-
-    chapter_file: Optional[str] = None
-    chapter_num: Optional[int] = None
-
-    if args.chapter is not None:
-        if not project_root:
-            print("❌ 未提供有效的 --project-root,无法用 --chapter 定位章节文件")
-            sys.exit(1)
-
-        chapter_num = int(args.chapter)
-        chapter_path = find_chapter_file(project_root, chapter_num)
-        if not chapter_path:
-            print(f"❌ 未找到第{chapter_num}章文件(请先生成/保存章节)")
-            sys.exit(1)
-        chapter_file = str(chapter_path)
-    else:
-        if not args.chapter_file:
-            parser.error("必须提供 chapter_file 或 --chapter")
-        chapter_file = args.chapter_file
-        if not os.path.exists(chapter_file):
-            print(f"❌ 文件不存在: {chapter_file}")
-            sys.exit(1)
-
-        chapter_num = extract_chapter_num_from_filename(Path(chapter_file).name)
-
-    print(f"📖 正在扫描: {chapter_file}")
-    entities = extract_new_entities(chapter_file)
-    entity_alias_ops = extract_entity_alias_ops(chapter_file)
-    entity_update_ops = extract_entity_update_ops(chapter_file)
-    golden_finger_skills = extract_golden_finger_skills(chapter_file)
-    foreshadowing_items = extract_foreshadowing_json(chapter_file)
-    deviations = extract_deviations(chapter_file)
-    relationship_items = extract_relationships(chapter_file)
-
-    if not entities and not entity_alias_ops and not entity_update_ops and not golden_finger_skills and not foreshadowing_items and not deviations and not relationship_items:
-        print("✅ 未发现任何 XML 标签(<entity>/<entity-alias>/<entity-update>/<skill>/<foreshadow>/<deviation>/<relationship>)")
-        return
-
-    if entities:
-        print(f"\n🔍 发现 {len(entities)} 个新实体:")
-        for i, entity in enumerate(entities, 1):
-            tier_emoji = {"核心": "🔴", "支线": "🟡", "装饰": "🟢"}.get(entity.get("tier", "支线"), "⚪")
-            print(
-                f"  {i}. [{entity['type']}] {entity['name']} {tier_emoji}{entity.get('tier', '支线')} - {entity['desc'][:25]}..."
-            )
-
-    if golden_finger_skills:
-        print(f"\n✨ 发现 {len(golden_finger_skills)} 个金手指技能:")
-        for i, skill in enumerate(golden_finger_skills, 1):
-            print(f"  {i}. {skill['name']} ({skill['level']}) - {skill['desc'][:25]}...")
-
-    if entity_alias_ops:
-        print(f"\n🏷️ 发现 {len(entity_alias_ops)} 条实体别名:")
-        for i, op in enumerate(entity_alias_ops, 1):
-            ref = op.get("id") or op.get("ref") or "?"
-            print(f"  {i}. {ref} -> {op.get('alias', '')}")
-
-    if entity_update_ops:
-        print(f"\n🛠️ 发现 {len(entity_update_ops)} 条实体更新:")
-        for i, op in enumerate(entity_update_ops, 1):
-            ref = op.get("id") or op.get("ref") or "?"
-            operations = op.get("operations") or []
-            ops_preview = []
-            for o in operations[:6]:
-                if isinstance(o, dict):
-                    op_type = o.get("op", "set")
-                    key = o.get("key", "")
-                    ops_preview.append(f"{op_type}:{key}")
-            preview = ", ".join(ops_preview) + ("..." if len(operations) > 6 else "")
-            print(f"  {i}. {ref}: {preview}")
-
-    if foreshadowing_items:
-        print(f"\n🧩 发现 {len(foreshadowing_items)} 条伏笔:")
-        for i, item in enumerate(foreshadowing_items, 1):
-            tier = item.get("tier", "支线")
-            target = item.get("target_chapter", "未设定")
-            print(f"  {i}. {tier} → 目标Ch{target}: {str(item.get('content', ''))[:40]}...")
-
-    if deviations:
-        print(f"\n⚡ 发现 {len(deviations)} 条大纲偏离:")
-        for i, dev in enumerate(deviations, 1):
-            print(f"  {i}. {dev.get('reason', '')[:50]}...")
-
-    if relationship_items:
-        print(f"\n💕 发现 {len(relationship_items)} 条关系:")
-        for i, rel in enumerate(relationship_items, 1):
-            char1 = str(rel.get("char1") or rel.get("char1_id") or "").strip() or "?"
-            char2 = str(rel.get("char2") or rel.get("char2_id") or "").strip() or "?"
-            print(f"  {i}. {char1} ↔ {char2} ({rel['type']}, 强度 {rel['intensity']})")
-
-    if dry_run:
-        print("\n⚠️  Dry-run 模式,不执行实际写入")
-        return
-
-    if not project_root:
-        chapter_path = Path(chapter_file).resolve()
-        for parent in [chapter_path.parent] + list(chapter_path.parents):
-            if (parent / ".webnovel" / "state.json").exists():
-                project_root = parent
-                break
-
-    if not project_root:
-        print("❌ 找不到项目根目录(缺少 .webnovel/state.json)")
-        print("请先运行 /webnovel-init 初始化项目,或使用 --project-root 指定路径")
-        sys.exit(1)
-
-    state_file = resolve_state_file(explicit_project_root=str(project_root))
-
-    print("\n📝 开始同步到设定集...")
-    success_count = 0
-    for entity in entities:
-        if sync_entity_to_settings(entity, str(project_root), auto_mode):
-            success_count += 1
-
-    print("\n💾 更新 state.json...")
-    try:
-        update_state_json(
-            entities=entities,
-            state_file=str(state_file),
-            golden_finger_skills=golden_finger_skills,
-            foreshadowing_items=foreshadowing_items,
-            relationship_items=relationship_items,
-            entity_alias_ops=entity_alias_ops,
-            entity_update_ops=entity_update_ops,
-            default_planted_chapter=chapter_num,
-        )
-    except (AmbiguousAliasError, ValueError) as e:
-        print(f"❌ {e}")
-        sys.exit(2)
-
-    print("\n✅ 完成!")
-    print(f"  - 实体同步: {success_count}/{len(entities)} 个")
-    if golden_finger_skills:
-        print(f"  - 金手指技能: {len(golden_finger_skills)} 个")
-    if foreshadowing_items:
-        print(f"  - 伏笔同步: {len(foreshadowing_items)} 条")
-    if relationship_items:
-        print(f"  - 关系同步: {len(relationship_items)} 条")
-    if deviations:
-        print(f"  - 大纲偏离: {len(deviations)} 条(仅记录,不同步到 state.json)")
-
-    if not auto_mode:
-        print("\n💡 建议:")
-        print("  1. 检查生成的角色卡/物品卡,补充详细设定")
-        print("  2. 查看 世界观.md 和 力量体系.md 的更新")
-        print("  3. 确认 .webnovel/state.json 中的实体记录")
-        if golden_finger_skills:
-            print("  4. 检查金手指技能是否正确记录在 protagonist_state.golden_finger.skills")
-        if foreshadowing_items:
-            print("  5. 检查 plot_threads.foreshadowing 的 planted/target/tier/location/characters 是否合理")
-        if relationship_items:
-            print("  6. 检查 structured_relationships 关系记录是否合理")
-        if deviations:
-            print("  7. 大纲偏离已记录,请在 plan.md 或大纲中同步调整")
-
-if __name__ == "__main__":
-    main()
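
The script removed above exited with code 2 on `AmbiguousAliasError`. The commit message says alias resolution now reads from the `aliases` table in `index.db`, which the spec describes as one-to-many. This part of the diff does not show the actual v5.1 schema, so the sketch below uses assumed column names (`alias`, `entity_id`, `entity_type`) and hypothetical helpers (`build_demo_db`, `lookup_alias`) to illustrate how a one-to-many lookup might look.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with an ASSUMED aliases schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE aliases ("
        " alias TEXT NOT NULL,"
        " entity_id TEXT NOT NULL,"
        " entity_type TEXT NOT NULL,"
        " PRIMARY KEY (alias, entity_id, entity_type))"
    )
    # Illustrative rows only; the IDs are made up for this example.
    conn.executemany(
        "INSERT INTO aliases (alias, entity_id, entity_type) VALUES (?, ?, ?)",
        [
            ("林天", "protagonist_lintian", "角色"),
            ("天云宗", "loc_tianyun", "地点"),
            ("天云宗", "faction_tianyun", "势力"),
        ],
    )
    return conn


def lookup_alias(conn: sqlite3.Connection, alias: str) -> List[Tuple[str, str]]:
    """Return every (entity_type, entity_id) pair registered for an alias."""
    rows = conn.execute(
        "SELECT entity_type, entity_id FROM aliases"
        " WHERE alias = ? ORDER BY entity_id",
        (alias,),
    )
    return [(row[0], row[1]) for row in rows]
```

A caller that gets more than one row back has an ambiguous alias and can fall back to the confidence thresholds described in the spec. A caller that gets an empty list has an unknown alias.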

+ 9 - 10
.claude/scripts/init_project.py

@@ -49,12 +49,16 @@ def _write_text_if_missing(path: Path, content: str) -> None:
 
 
 def _ensure_state_schema(state: Dict[str, Any]) -> Dict[str, Any]:
-    """Ensure state.json has the set of fields required by the v5.0 architecture."""
+    """Ensure state.json has the set of fields required by the v5.1 architecture.
+
+    v5.1 changes:
+    - entities_v3 and alias_index have moved to index.db and are no longer stored in state.json
+    - structured_relationships has moved to the relationships table in index.db
+    - state.json stays lean (< 5KB)
+    """
     state.setdefault("project_info", {})
     state.setdefault("progress", {})
     state.setdefault("protagonist_state", {})
-    state.setdefault("relationships", {})
-    state.setdefault("structured_relationships", [])
     state.setdefault("disambiguation_warnings", [])
     state.setdefault("disambiguation_pending", [])
     state.setdefault("world_settings", {"power_system": [], "factions": [], "locations": []})
@@ -71,13 +75,8 @@ def _ensure_state_schema(state: Dict[str, Any]) -> Dict[str, Any]:
             "history": [],
         },
     )
-    # v5.0: entities_v3 grouped format (by type)
-    state.setdefault(
-        "entities_v3",
-        {"角色": {}, "地点": {}, "物品": {}, "势力": {}, "招式": {}},
-    )
-    # v5.0: alias_index one-to-many mapping
-    state.setdefault("alias_index", {})
+    # v5.1: entities_v3, alias_index, and structured_relationships have moved to index.db
+    # These fields are no longer initialized in state.json
 
     # progress schema evolution
     state["progress"].setdefault("current_chapter", 0)
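
The hunk above only stops initializing the migrated fields. It does not show what happens to a `state.json` written by an older version that still carries them. A minimal sketch of one way to drop those keys follows, assuming their contents have already been copied into `index.db`. `strip_migrated_fields` and `MIGRATED_KEYS` are hypothetical names and are not part of the repository.

```python
from typing import Any, Dict, List

# Keys the v5.1 comment in this diff names as moved to index.db.
MIGRATED_KEYS = ("entities_v3", "alias_index", "structured_relationships")


def strip_migrated_fields(state: Dict[str, Any]) -> List[str]:
    """Remove migrated keys from a state dict in place.

    Returns the keys that were actually removed, in MIGRATED_KEYS order.
    Assumption: the caller has already persisted this data to index.db.
    """
    removed = [key for key in MIGRATED_KEYS if key in state]
    for key in removed:
        del state[key]
    return removed
```

Returning the removed keys lets a caller log what changed and skip rewriting the file when nothing was stripped.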

+ 0 - 520
.claude/scripts/stress_test_500chapters.py

@@ -1,520 +0,0 @@
-#!/usr/bin/env python3
-# -*- coding: utf-8 -*-
-"""
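-500-chapter writing sandbox simulation - data-chain stability stress test
-
-Test targets:
-1. state.json growth curve (file size as chapters accumulate)
-2. Growth in the number of entities_v3 entities
-3. Growth of the alias_index alias index
-4. Foreshadowing tracking (planted vs. resolved ratio)
-5. Atomic write performance
-6. index.db query performance
-
-Simulation parameters (based on a typical web novel):
-- 500 chapters, about 3,500 characters each
-- On average 0.8 new characters per chapter (dense in the first 100 chapters, sparse later)
-- On average 0.3 new locations per chapter
-- On average 0.5 foreshadowings planted and 0.3 resolved per chapter
-- The protagonist advances in realm once every 10 chapters
-- Relationships are updated every 5 chapters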
-"""
-
-import json
-import os
-import sys
-import time
-import random
-import shutil
-import tempfile
-from pathlib import Path
-from datetime import datetime
-from typing import Dict, Any, List
-
-# 添加脚本目录到路径
-script_dir = Path(__file__).resolve().parent
-sys.path.insert(0, str(script_dir))
-
-from security_utils import atomic_write_json, read_json_safe
-
-# Windows 编码修复
-if sys.platform == 'win32':
-    import io
-    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
-    sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
-
-
-# ============================================================================
-# 模拟配置
-# ============================================================================
-
-CONFIG = {
-    "total_chapters": 500,
-    "words_per_chapter": 3500,
-
-    # Entity generation probabilities (decreasing as chapters progress)
-    "new_character_base_rate": 0.8,  # first 50 chapters
-    "new_character_decay": 0.95,      # decay applied every 50 chapters
-    "new_location_rate": 0.3,
-    "new_item_rate": 0.2,
-    "new_faction_rate": 0.1,
-    "new_technique_rate": 0.15,
-
-    # 伏笔
-    "foreshadow_plant_rate": 0.5,
-    "foreshadow_resolve_rate": 0.3,
-    "foreshadow_tiers": ["核心", "支线", "装饰"],
-    "foreshadow_tier_weights": [0.1, 0.3, 0.6],
-
-    # 主角升级
-    "protagonist_upgrade_interval": 10,
-    "realms": ["练气", "筑基", "金丹", "元婴", "化神", "炼虚", "合体", "大乘", "渡劫"],
-    "layers_per_realm": 9,
-
-    # 关系更新
-    "relationship_update_interval": 5,
-    "relationship_types": ["ally", "enemy", "romance", "mentor", "rival", "family"],
-
-    # 别名生成
-    "alias_per_character": 2.5,  # 平均每个角色的别名数
-}
-
-# 随机名字池
-SURNAME_POOL = ["林", "陈", "王", "李", "张", "刘", "赵", "黄", "周", "吴", "徐", "孙", "马", "朱", "胡", "郭", "何", "高", "罗", "郑"]
-NAME_POOL = ["天", "云", "风", "雷", "火", "水", "月", "星", "龙", "凤", "虎", "鹤", "剑", "刀", "枪", "棍", "拳", "掌", "指", "心"]
-LOCATION_PREFIX = ["天", "云", "龙", "凤", "青", "白", "黑", "红", "金", "玉"]
-LOCATION_SUFFIX = ["山", "谷", "城", "峰", "洞", "海", "林", "湖", "殿", "宗"]
-
-
-class SimulationMetrics:
-    """模拟指标收集器"""
-
-    def __init__(self):
-        self.checkpoints: List[Dict] = []
-        self.write_times: List[float] = []
-        self.errors: List[str] = []
-
-    def record_checkpoint(self, chapter: int, state: Dict, state_file: Path):
-        """记录检查点"""
-        file_size = state_file.stat().st_size if state_file.exists() else 0
-
-        entities_v3 = state.get("entities_v3", {})
-        entity_counts = {
-            etype: len(entities)
-            for etype, entities in entities_v3.items()
-        }
-        total_entities = sum(entity_counts.values())
-
-        alias_count = len(state.get("alias_index", {}))
-
-        foreshadowing = state.get("foreshadowing", [])
-        active_foreshadow = len([f for f in foreshadowing if f.get("status") == "未回收"])
-        resolved_foreshadow = len([f for f in foreshadowing if f.get("status") == "已回收"])
-
-        relationships = state.get("relationships", [])
-        if isinstance(relationships, dict):
-            relationships = list(relationships.values())
-
-        self.checkpoints.append({
-            "chapter": chapter,
-            "file_size_kb": file_size / 1024,
-            "total_entities": total_entities,
-            "entity_counts": entity_counts,
-            "alias_count": alias_count,
-            "active_foreshadow": active_foreshadow,
-            "resolved_foreshadow": resolved_foreshadow,
-            "relationship_count": len(relationships) if isinstance(relationships, list) else 0,
-            "avg_write_time_ms": sum(self.write_times[-10:]) / max(len(self.write_times[-10:]), 1) * 1000,
-        })
-
-    def record_write_time(self, duration: float):
-        self.write_times.append(duration)
-
-    def record_error(self, error: str):
-        self.errors.append(error)
-
-    def generate_report(self) -> str:
-        """生成测试报告"""
-        if not self.checkpoints:
-            return "No data collected"
-
-        final = self.checkpoints[-1]
-        first = self.checkpoints[0]
-
-        lines = [
-            "=" * 60,
-            "📊 500章沙盘模拟测试报告",
-            "=" * 60,
-            "",
-            "## 基础指标",
-            f"- 总章节数: {final['chapter']}",
-            f"- 总字数: {final['chapter'] * CONFIG['words_per_chapter']:,}",
-            "",
-            "## state.json 增长",
-            f"- 初始大小: {first['file_size_kb']:.2f} KB",
-            f"- 最终大小: {final['file_size_kb']:.2f} KB",
-            f"- 增长倍数: {final['file_size_kb'] / max(first['file_size_kb'], 0.1):.1f}x",
-            "",
-            "## 实体统计",
-            f"- 总实体数: {final['total_entities']}",
-        ]
-
-        for etype, count in final['entity_counts'].items():
-            lines.append(f"  - {etype}: {count}")
-
-        lines.extend([
-            f"- 别名索引条目: {final['alias_count']}",
-            "",
-            "## 伏笔统计",
-            f"- 活跃伏笔: {final['active_foreshadow']}",
-            f"- 已回收伏笔: {final['resolved_foreshadow']}",
-            f"- 回收率: {final['resolved_foreshadow'] / max(final['active_foreshadow'] + final['resolved_foreshadow'], 1) * 100:.1f}%",
-            "",
-            "## 性能指标",
-            f"- 平均写入时间: {sum(self.write_times) / max(len(self.write_times), 1) * 1000:.2f} ms",
-            f"- 最大写入时间: {max(self.write_times) * 1000:.2f} ms" if self.write_times else "N/A",
-            f"- 最小写入时间: {min(self.write_times) * 1000:.2f} ms" if self.write_times else "N/A",
-            "",
-            "## 错误统计",
-            f"- 错误数: {len(self.errors)}",
-        ])
-
-        if self.errors:
-            lines.append("- 错误详情:")
-            for err in self.errors[:5]:
-                lines.append(f"  - {err}")
-
-        # 增长曲线(每100章采样)
-        lines.extend([
-            "",
-            "## 增长曲线(每100章)",
-            "| 章节 | 文件大小(KB) | 实体数 | 别名数 | 活跃伏笔 | 写入时间(ms) |",
-            "|------|-------------|-------|-------|---------|-------------|",
-        ])
-
-        for cp in self.checkpoints:
-            if cp['chapter'] % 100 == 0 or cp['chapter'] == final['chapter']:
-                lines.append(
-                    f"| {cp['chapter']} | {cp['file_size_kb']:.1f} | "
-                    f"{cp['total_entities']} | {cp['alias_count']} | "
-                    f"{cp['active_foreshadow']} | {cp['avg_write_time_ms']:.1f} |"
-                )
-
-        # 稳定性评估
-        lines.extend([
-            "",
-            "## 稳定性评估",
-        ])
-
-        # 检查文件大小是否在合理范围
-        if final['file_size_kb'] < 500:
-            lines.append("✅ 文件大小合理 (< 500KB)")
-        elif final['file_size_kb'] < 1024:
-            lines.append("⚠️ 文件大小偏大 (500KB-1MB),建议启用归档")
-        else:
-            lines.append("❌ 文件过大 (> 1MB),需要优化")
-
-        # 检查写入性能
-        avg_write = sum(self.write_times) / max(len(self.write_times), 1) * 1000
-        if avg_write < 50:
-            lines.append("✅ 写入性能良好 (< 50ms)")
-        elif avg_write < 200:
-            lines.append("⚠️ 写入性能一般 (50-200ms)")
-        else:
-            lines.append("❌ 写入性能差 (> 200ms)")
-
-        # 检查错误率
-        if not self.errors:
-            lines.append("✅ 无错误")
-        else:
-            lines.append(f"❌ 有 {len(self.errors)} 个错误")
-
-        lines.append("")
-        lines.append("=" * 60)
-
-        return "\n".join(lines)
-
-
-class ChapterSimulator:
-    """章节模拟器"""
-
-    def __init__(self, project_root: Path):
-        self.project_root = project_root
-        self.state_file = project_root / ".webnovel" / "state.json"
-        self.metrics = SimulationMetrics()
-        self.generated_names = set()
-        self.entity_id_counter = 0
-
-    def _generate_id(self, prefix: str) -> str:
-        self.entity_id_counter += 1
-        return f"{prefix}_{self.entity_id_counter:05d}"
-
-    def _generate_character_name(self) -> str:
-        for _ in range(100):
-            name = random.choice(SURNAME_POOL) + random.choice(NAME_POOL) + random.choice(NAME_POOL)
-            if name not in self.generated_names:
-                self.generated_names.add(name)
-                return name
-        return f"角色_{len(self.generated_names)}"
-
-    def _generate_location_name(self) -> str:
-        return random.choice(LOCATION_PREFIX) + random.choice(LOCATION_SUFFIX)
-
-    def _get_character_rate(self, chapter: int) -> float:
-        """根据章节获取角色生成概率(递减)"""
-        decay_periods = chapter // 50
-        rate = CONFIG["new_character_base_rate"] * (CONFIG["new_character_decay"] ** decay_periods)
-        return max(rate, 0.1)  # 最低 10%
-
-    def init_project(self):
-        """初始化模拟项目"""
-        self.project_root.mkdir(parents=True, exist_ok=True)
-        (self.project_root / ".webnovel").mkdir(exist_ok=True)
-        (self.project_root / "正文").mkdir(exist_ok=True)
-
-        # 初始 state.json
-        initial_state = {
-            "project_info": {
-                "title": "模拟测试小说",
-                "genre": "玄幻",
-                "created_at": datetime.now().strftime("%Y-%m-%d"),
-                "target_chapters": 500,
-            },
-            "progress": {
-                "current_chapter": 0,
-                "total_words": 0,
-            },
-            "protagonist_state": {
-                "name": "林天",
-                "realm": "练气",
-                "layer": 1,
-                "golden_finger": {"name": "混沌珠", "level": 1},
-            },
-            "entities_v3": {
-                "角色": {},
-                "地点": {},
-                "物品": {},
-                "势力": {},
-                "招式": {},
-            },
-            "alias_index": {},
-            "foreshadowing": [],
-            "relationships": [],
-        }
-
-        # 添加主角到实体
-        protagonist_id = "protagonist_lintian"
-        initial_state["entities_v3"]["角色"][protagonist_id] = {
-            "canonical_name": "林天",
-            "desc": "主角,拥有混沌珠",
-            "tier": "核心",
-            "aliases": ["林天", "天哥", "林少侠"],
-            "current": {"realm": "练气", "layer": 1},
-            "history": [],
-        }
-        initial_state["alias_index"]["林天"] = [{"type": "角色", "id": protagonist_id}]
-        initial_state["alias_index"]["天哥"] = [{"type": "角色", "id": protagonist_id}]
-
-        atomic_write_json(self.state_file, initial_state, backup=False)
-        return initial_state
-
-    def simulate_chapter(self, chapter: int, state: Dict) -> Dict:
-        """模拟一章的数据变化"""
-
-        # 1. 更新进度
-        state["progress"]["current_chapter"] = chapter
-        state["progress"]["total_words"] += CONFIG["words_per_chapter"]
-
-        entities_v3 = state["entities_v3"]
-        alias_index = state["alias_index"]
-
-        # 2. 新增角色(概率递减)
-        if random.random() < self._get_character_rate(chapter):
-            char_name = self._generate_character_name()
-            char_id = self._generate_id("char")
-            tier = random.choices(
-                ["核心", "支线", "装饰"],
-                weights=[0.1, 0.3, 0.6]
-            )[0]
-
-            entities_v3["角色"][char_id] = {
-                "canonical_name": char_name,
-                "desc": f"第{chapter}章出场的{tier}角色",
-                "tier": tier,
-                "aliases": [char_name],
-                "current": {"first_appearance": chapter},
-                "history": [],
-            }
-            alias_index[char_name] = [{"type": "角色", "id": char_id}]
-
-            # 生成额外别名
-            if random.random() < 0.5:
-                alias = char_name[0] + "兄" if random.random() < 0.5 else char_name + "前辈"
-                entities_v3["角色"][char_id]["aliases"].append(alias)
-                if alias not in alias_index:
-                    alias_index[alias] = []
-                alias_index[alias].append({"type": "角色", "id": char_id})
-
-        # 3. 新增地点
-        if random.random() < CONFIG["new_location_rate"]:
-            loc_name = self._generate_location_name()
-            loc_id = self._generate_id("loc")
-            entities_v3["地点"][loc_id] = {
-                "canonical_name": loc_name,
-                "desc": f"第{chapter}章出现的地点",
-                "tier": "装饰",
-                "aliases": [loc_name],
-                "current": {},
-                "history": [],
-            }
-            alias_index[loc_name] = [{"type": "地点", "id": loc_id}]
-
-        # 4. 新增物品
-        if random.random() < CONFIG["new_item_rate"]:
-            item_name = random.choice(["灵", "仙", "神", "圣"]) + random.choice(["剑", "丹", "符", "器"])
-            item_id = self._generate_id("item")
-            entities_v3["物品"][item_id] = {
-                "canonical_name": item_name,
-                "desc": f"第{chapter}章获得的物品",
-                "tier": "装饰",
-                "aliases": [item_name],
-                "current": {},
-                "history": [],
-            }
-            if item_name not in alias_index:
-                alias_index[item_name] = []
-            alias_index[item_name].append({"type": "物品", "id": item_id})
-
-        # 5. 埋设伏笔
-        if random.random() < CONFIG["foreshadow_plant_rate"]:
-            tier = random.choices(
-                CONFIG["foreshadow_tiers"],
-                weights=CONFIG["foreshadow_tier_weights"]
-            )[0]
-            target = chapter + random.randint(10, 100)
-
-            state["foreshadowing"].append({
-                "id": f"fs_{chapter}_{random.randint(1000, 9999)}",
-                "content": f"第{chapter}章埋设的{tier}伏笔",
-                "tier": tier,
-                "status": "未回收",
-                "planted_chapter": chapter,
-                "target_chapter": target,
-            })
-
-        # 6. 回收伏笔
-        active_foreshadows = [
-            f for f in state["foreshadowing"]
-            if f.get("status") == "未回收" and f.get("target_chapter", 999) <= chapter
-        ]
-        for fs in active_foreshadows:
-            if random.random() < CONFIG["foreshadow_resolve_rate"]:
-                fs["status"] = "已回收"
-                fs["resolved_chapter"] = chapter
-
-        # 7. 主角升级
-        if chapter % CONFIG["protagonist_upgrade_interval"] == 0:
-            ps = state["protagonist_state"]
-            current_layer = ps.get("layer", 1)
-            current_realm_idx = CONFIG["realms"].index(ps.get("realm", "练气"))
-
-            if current_layer < CONFIG["layers_per_realm"]:
-                ps["layer"] = current_layer + 1
-            elif current_realm_idx < len(CONFIG["realms"]) - 1:
-                ps["realm"] = CONFIG["realms"][current_realm_idx + 1]
-                ps["layer"] = 1
-
-        # 8. 更新关系
-        if chapter % CONFIG["relationship_update_interval"] == 0:
-            char_ids = list(entities_v3["角色"].keys())
-            if len(char_ids) >= 2:
-                char1, char2 = random.sample(char_ids, 2)
-                rel_type = random.choice(CONFIG["relationship_types"])
-
-                state["relationships"].append({
-                    "char1_id": char1,
-                    "char2_id": char2,
-                    "type": rel_type,
-                    "intensity": random.randint(30, 100),
-                    "established_chapter": chapter,
-                })
-
-        return state
-
-    def run_simulation(self, checkpoint_interval: int = 10):
-        """运行完整模拟"""
-        print("🚀 开始500章沙盘模拟...")
-        print(f"📁 测试目录: {self.project_root}")
-        print()
-
-        state = self.init_project()
-        self.metrics.record_checkpoint(0, state, self.state_file)
-
-        start_time = time.time()
-
-        for chapter in range(1, CONFIG["total_chapters"] + 1):
-            try:
-                # 模拟章节
-                state = self.simulate_chapter(chapter, state)
-
-                # 原子写入
-                write_start = time.time()
-                atomic_write_json(self.state_file, state, use_lock=True, backup=False)
-                write_duration = time.time() - write_start
-                self.metrics.record_write_time(write_duration)
-
-                # 记录检查点
-                if chapter % checkpoint_interval == 0:
-                    self.metrics.record_checkpoint(chapter, state, self.state_file)
-                    elapsed = time.time() - start_time
-                    eta = elapsed / chapter * (CONFIG["total_chapters"] - chapter)
-                    print(f"  第 {chapter:3d} 章完成 | "
-                          f"文件 {self.state_file.stat().st_size / 1024:.1f}KB | "
-                          f"实体 {sum(len(e) for e in state['entities_v3'].values())} | "
-                          f"写入 {write_duration*1000:.1f}ms | "
-                          f"ETA {eta:.0f}s")
-
-            except Exception as e:
-                self.metrics.record_error(f"Chapter {chapter}: {str(e)}")
-                print(f"  ❌ 第 {chapter} 章错误: {e}")
-
-        # 最终检查点
-        self.metrics.record_checkpoint(CONFIG["total_chapters"], state, self.state_file)
-
-        total_time = time.time() - start_time
-        print()
-        print(f"✅ 模拟完成!总耗时: {total_time:.1f}s")
-        print()
-
-        return self.metrics.generate_report()
-
-
-def main():
-    """主函数"""
-    # 创建临时测试目录
-    test_dir = Path(tempfile.mkdtemp(prefix="webnovel_stress_test_"))
-
-    try:
-        simulator = ChapterSimulator(test_dir)
-        report = simulator.run_simulation(checkpoint_interval=10)
-
-        print(report)
-
-        # 保存报告
-        report_file = test_dir / "stress_test_report.md"
-        report_file.write_text(report, encoding="utf-8")
-        print(f"\n📄 报告已保存: {report_file}")
-
-        # 询问是否保留测试数据
-        print(f"\n测试数据目录: {test_dir}")
-        print("(测试完成后可手动删除)")
-
-    except KeyboardInterrupt:
-        print("\n⚠️ 测试被中断")
-    except Exception as e:
-        print(f"\n❌ 测试失败: {e}")
-        raise
-
-
-if __name__ == "__main__":
-    main()
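
The removed simulator draws a new character each chapter with probability `max(0.8 * 0.95 ** (chapter // 50), 0.1)`. At most one character is added per chapter, so the expected total is the sum of those per-chapter rates. The sketch below reproduces that calculation on its own. The constants are copied from the deleted `CONFIG`, and the function names are illustrative.

```python
BASE_RATE = 0.8        # CONFIG["new_character_base_rate"]
DECAY = 0.95           # CONFIG["new_character_decay"], applied once per 50 chapters
FLOOR = 0.1            # lower bound used by _get_character_rate
TOTAL_CHAPTERS = 500   # CONFIG["total_chapters"]


def character_rate(chapter: int) -> float:
    """Probability that a given chapter introduces a new character."""
    decay_periods = chapter // 50
    return max(BASE_RATE * DECAY ** decay_periods, FLOOR)


def expected_new_characters(total_chapters: int = TOTAL_CHAPTERS) -> float:
    """Expected number of characters created over the whole run."""
    return sum(character_rate(ch) for ch in range(1, total_chapters + 1))
```

With these parameters the rate only falls from 0.8 to about 0.48 by chapter 500, so the 0.1 floor is never reached and the expected total is roughly 321 characters.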

+ 0 - 721
.claude/scripts/stress_test_index.py

@@ -1,721 +0,0 @@
-#!/usr/bin/env python3
-# -*- coding: utf-8 -*-
-"""
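-500-chapter index system stress test
-
-Test targets:
-1. index.db size growth curve
-2. Entity sync performance (entities_v3 → index.db)
-3. Alias lookup performance
-4. Fuzzy search performance
-5. Foreshadowing urgency calculation performance
-6. Relationship graph query performance
-7. Concurrent read/write stability
-
-Depends on: state.json generated by stress_test_500chapters.py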
-"""
-
-import json
-import os
-import sys
-import time
-import random
-import sqlite3
-import tempfile
-import shutil
-from pathlib import Path
-from datetime import datetime
-from typing import Dict, Any, List, Tuple
-
-# 添加脚本目录到路径
-script_dir = Path(__file__).resolve().parent
-sys.path.insert(0, str(script_dir))
-
-from security_utils import atomic_write_json, read_json_safe
-
-# Windows 编码修复
-if sys.platform == 'win32':
-    import io
-    sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
-    sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
-
-
-# ============================================================================
-# Simulation config (kept consistent with stress_test_500chapters.py)
-# ============================================================================
-
-CONFIG = {
-    "total_chapters": 500,
-    "words_per_chapter": 3500,
-    "new_character_base_rate": 0.8,
-    "new_character_decay": 0.95,
-    "new_location_rate": 0.3,
-    "new_item_rate": 0.2,
-    "foreshadow_plant_rate": 0.5,
-    "foreshadow_resolve_rate": 0.3,
-    "relationship_update_interval": 5,
-}
-
-SURNAME_POOL = ["林", "陈", "王", "李", "张", "刘", "赵", "黄", "周", "吴", "徐", "孙", "马", "朱", "胡", "郭", "何", "高", "罗", "郑"]
-NAME_POOL = ["天", "云", "风", "雷", "火", "水", "月", "星", "龙", "凤", "虎", "鹤", "剑", "刀", "枪", "棍", "拳", "掌", "指", "心"]
-
-
-class IndexMetrics:
-    """索引性能指标收集器"""
-
-    def __init__(self):
-        self.checkpoints: List[Dict] = []
-        self.sync_times: List[float] = []
-        self.query_times: Dict[str, List[float]] = {
-            "alias_lookup": [],
-            "fuzzy_search": [],
-            "foreshadow_urgency": [],
-            "relationship_query": [],
-            "entity_by_type": [],
-        }
-        self.errors: List[str] = []
-
-    def record_checkpoint(self, chapter: int, db_path: Path, state: Dict):
-        """记录检查点"""
-        db_size = db_path.stat().st_size if db_path.exists() else 0
-
-        # 统计各表行数
-        table_counts = {}
-        if db_path.exists():
-            try:
-                conn = sqlite3.connect(str(db_path))
-                cursor = conn.cursor()
-                for table in ["chapters", "entities", "entity_aliases", "entity_kv",
-                              "entity_history", "foreshadowing_index", "relationships"]:
-                    try:
-                        cursor.execute(f"SELECT COUNT(*) FROM {table}")
-                        table_counts[table] = cursor.fetchone()[0]
-                    except sqlite3.OperationalError:
-                        table_counts[table] = 0
-                conn.close()
-            except Exception as e:
-                self.errors.append(f"DB stats error: {e}")
-
-        self.checkpoints.append({
-            "chapter": chapter,
-            "db_size_kb": db_size / 1024,
-            "table_counts": table_counts,
-            "avg_sync_time_ms": sum(self.sync_times[-10:]) / max(len(self.sync_times[-10:]), 1) * 1000,
-            "query_performance": {
-                k: sum(v[-10:]) / max(len(v[-10:]), 1) * 1000
-                for k, v in self.query_times.items()
-            }
-        })
-
-    def record_sync_time(self, duration: float):
-        self.sync_times.append(duration)
-
-    def record_query_time(self, query_type: str, duration: float):
-        if query_type in self.query_times:
-            self.query_times[query_type].append(duration)
-
-    def record_error(self, error: str):
-        self.errors.append(error)
-
-    def generate_report(self) -> str:
-        """生成测试报告"""
-        if not self.checkpoints:
-            return "No data collected"
-
-        final = self.checkpoints[-1]
-        first = self.checkpoints[0] if self.checkpoints else final
-
-        lines = [
-            "=" * 70,
-            "📊 500章索引系统压力测试报告",
-            "=" * 70,
-            "",
-            "## index.db 增长",
-            f"- 初始大小: {first['db_size_kb']:.2f} KB",
-            f"- 最终大小: {final['db_size_kb']:.2f} KB",
-            f"- 增长倍数: {final['db_size_kb'] / max(first['db_size_kb'], 0.1):.1f}x",
-            "",
-            "## 表行数统计",
-        ]
-
-        for table, count in final.get('table_counts', {}).items():
-            lines.append(f"  - {table}: {count:,}")
-
-        lines.extend([
-            "",
-            "## 同步性能",
-            f"- 平均同步时间: {sum(self.sync_times) / max(len(self.sync_times), 1) * 1000:.2f} ms",
-            f"- 最大同步时间: {max(self.sync_times) * 1000:.2f} ms" if self.sync_times else "N/A",
-            f"- 最小同步时间: {min(self.sync_times) * 1000:.2f} ms" if self.sync_times else "N/A",
-            "",
-            "## 查询性能(平均)",
-        ])
-
-        for query_type, times in self.query_times.items():
-            if times:
-                avg = sum(times) / len(times) * 1000
-                lines.append(f"  - {query_type}: {avg:.2f} ms")
-
-        lines.extend([
-            "",
-            "## 错误统计",
-            f"- 错误数: {len(self.errors)}",
-        ])
-
-        if self.errors:
-            lines.append("- 错误详情:")
-            for err in self.errors[:10]:
-                lines.append(f"  - {err[:80]}")
-
-        # 增长曲线
-        lines.extend([
-            "",
-            "## 增长曲线(每100章)",
-            "| 章节 | DB大小(KB) | entities | aliases | foreshadow | 同步(ms) |",
-            "|------|-----------|----------|---------|------------|----------|",
-        ])
-
-        for cp in self.checkpoints:
-            if cp['chapter'] % 100 == 0 or cp['chapter'] == final['chapter']:
-                tc = cp.get('table_counts', {})
-                lines.append(
-                    f"| {cp['chapter']} | {cp['db_size_kb']:.1f} | "
-                    f"{tc.get('entities', 0)} | {tc.get('entity_aliases', 0)} | "
-                    f"{tc.get('foreshadowing_index', 0)} | {cp['avg_sync_time_ms']:.1f} |"
-                )
-
-        # 查询性能趋势
-        lines.extend([
-            "",
-            "## 查询性能趋势(每100章)",
-            "| 章节 | alias查询(ms) | 模糊搜索(ms) | 伏笔紧急度(ms) | 关系查询(ms) |",
-            "|------|--------------|-------------|---------------|-------------|",
-        ])
-
-        for cp in self.checkpoints:
-            if cp['chapter'] % 100 == 0 or cp['chapter'] == final['chapter']:
-                qp = cp.get('query_performance', {})
-                lines.append(
-                    f"| {cp['chapter']} | {qp.get('alias_lookup', 0):.2f} | "
-                    f"{qp.get('fuzzy_search', 0):.2f} | "
-                    f"{qp.get('foreshadow_urgency', 0):.2f} | "
-                    f"{qp.get('relationship_query', 0):.2f} |"
-                )
-
-        # 稳定性评估
-        lines.extend([
-            "",
-            "## 稳定性评估",
-        ])
-
-        if final['db_size_kb'] < 1024:
-            lines.append("✅ 数据库大小合理 (< 1MB)")
-        elif final['db_size_kb'] < 5120:
-            lines.append("⚠️ 数据库偏大 (1-5MB)")
-        else:
-            lines.append("❌ 数据库过大 (> 5MB)")
-
-        avg_sync = sum(self.sync_times) / max(len(self.sync_times), 1) * 1000
-        if avg_sync < 100:
-            lines.append("✅ 同步性能良好 (< 100ms)")
-        elif avg_sync < 500:
-            lines.append("⚠️ 同步性能一般 (100-500ms)")
-        else:
-            lines.append("❌ 同步性能差 (> 500ms)")
-
-        # 查询性能评估
-        for query_type, times in self.query_times.items():
-            if times:
-                avg = sum(times) / len(times) * 1000
-                if avg < 10:
-                    lines.append(f"✅ {query_type} 查询快速 (< 10ms)")
-                elif avg < 50:
-                    lines.append(f"⚠️ {query_type} 查询一般 (10-50ms)")
-                else:
-                    lines.append(f"❌ {query_type} 查询慢 (> 50ms)")
-
-        if not self.errors:
-            lines.append("✅ 无错误")
-        else:
-            lines.append(f"❌ 有 {len(self.errors)} 个错误")
-
-        lines.append("")
-        lines.append("=" * 70)
-
-        return "\n".join(lines)
-
-
-class IndexSimulator:
-    """索引系统模拟器"""
-
-    def __init__(self, project_root: Path):
-        self.project_root = project_root
-        self.state_file = project_root / ".webnovel" / "state.json"
-        self.db_path = project_root / ".webnovel" / "index.db"
-        self.metrics = IndexMetrics()
-        self.generated_names = set()
-        self.entity_id_counter = 0
-
-    def _generate_id(self, prefix: str) -> str:
-        self.entity_id_counter += 1
-        return f"{prefix}_{self.entity_id_counter:05d}"
-
-    def _generate_character_name(self) -> str:
-        for _ in range(100):
-            name = random.choice(SURNAME_POOL) + random.choice(NAME_POOL) + random.choice(NAME_POOL)
-            if name not in self.generated_names:
-                self.generated_names.add(name)
-                return name
-        return f"角色_{len(self.generated_names)}"
-
-    def _get_character_rate(self, chapter: int) -> float:
-        decay_periods = chapter // 50
-        rate = CONFIG["new_character_base_rate"] * (CONFIG["new_character_decay"] ** decay_periods)
-        return max(rate, 0.1)
-
-    def init_database(self):
-        """初始化数据库"""
-        conn = sqlite3.connect(str(self.db_path))
-        cursor = conn.cursor()
-
-        # Create tables (modeled on structured_index.py; some columns differ, e.g. rel_type here vs relation_type there)
-        cursor.executescript("""
-            -- 章节表
-            CREATE TABLE IF NOT EXISTS chapters (
-                chapter_num INTEGER PRIMARY KEY,
-                title TEXT,
-                word_count INTEGER,
-                summary TEXT,
-                main_location TEXT,
-                characters TEXT,
-                content_hash TEXT,
-                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            );
-
-            -- 实体主表
-            CREATE TABLE IF NOT EXISTS entities (
-                entity_id TEXT PRIMARY KEY,
-                entity_type TEXT NOT NULL,
-                canonical_name TEXT,
-                tier TEXT,
-                desc TEXT,
-                created_chapter INTEGER,
-                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            );
-
-            -- 别名表
-            CREATE TABLE IF NOT EXISTS entity_aliases (
-                alias TEXT,
-                entity_id TEXT,
-                entity_type TEXT,
-                first_seen_chapter INTEGER,
-                context TEXT,
-                PRIMARY KEY (alias, entity_id)
-            );
-            CREATE INDEX IF NOT EXISTS idx_alias ON entity_aliases(alias);
-
-            -- 实体属性 (KV)
-            CREATE TABLE IF NOT EXISTS entity_kv (
-                entity_id TEXT,
-                key TEXT,
-                value TEXT,
-                last_chapter INTEGER,
-                PRIMARY KEY (entity_id, key)
-            );
-
-            -- 实体历史
-            CREATE TABLE IF NOT EXISTS entity_history (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                entity_id TEXT,
-                chapter INTEGER,
-                changes_json TEXT,
-                reasons_json TEXT,
-                added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            );
-
-            -- 伏笔索引
-            CREATE TABLE IF NOT EXISTS foreshadowing_index (
-                foreshadow_id TEXT PRIMARY KEY,
-                content TEXT,
-                tier TEXT,
-                status TEXT,
-                planted_chapter INTEGER,
-                target_chapter INTEGER,
-                resolved_chapter INTEGER,
-                urgency_score REAL
-            );
-
-            -- 关系表
-            CREATE TABLE IF NOT EXISTS relationships (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                char1_id TEXT,
-                char2_id TEXT,
-                rel_type TEXT,
-                intensity INTEGER,
-                established_chapter INTEGER,
-                description TEXT
-            );
-            CREATE INDEX IF NOT EXISTS idx_rel_char1 ON relationships(char1_id);
-            CREATE INDEX IF NOT EXISTS idx_rel_char2 ON relationships(char2_id);
-        """)
-
-        conn.commit()
-        conn.close()
-
-    def init_project(self):
-        """初始化模拟项目"""
-        self.project_root.mkdir(parents=True, exist_ok=True)
-        (self.project_root / ".webnovel").mkdir(exist_ok=True)
-
-        # 初始 state.json
-        initial_state = {
-            "project_info": {"title": "索引测试小说", "genre": "玄幻"},
-            "progress": {"current_chapter": 0, "total_words": 0},
-            "protagonist_state": {"name": "林天", "realm": "练气", "layer": 1},
-            "entities_v3": {"角色": {}, "地点": {}, "物品": {}, "势力": {}, "招式": {}},
-            "alias_index": {},
-            "foreshadowing": [],
-            "relationships": [],
-        }
-
-        # 添加主角
-        protagonist_id = "protagonist_lintian"
-        initial_state["entities_v3"]["角色"][protagonist_id] = {
-            "canonical_name": "林天",
-            "desc": "主角",
-            "tier": "核心",
-            "aliases": ["林天", "天哥"],
-            "current": {"realm": "练气"},
-            "history": [],
-        }
-        initial_state["alias_index"]["林天"] = [{"type": "角色", "id": protagonist_id}]
-
-        atomic_write_json(self.state_file, initial_state, backup=False)
-        self.init_database()
-        return initial_state
-
-    def sync_to_index(self, state: Dict, chapter: int):
-        """同步 state.json 到 index.db"""
-        conn = sqlite3.connect(str(self.db_path))
-        cursor = conn.cursor()
-
-        try:
-            # 同步章节
-            cursor.execute("""
-                INSERT OR REPLACE INTO chapters
-                (chapter_num, title, word_count, summary)
-                VALUES (?, ?, ?, ?)
-            """, (chapter, f"第{chapter}章", CONFIG["words_per_chapter"], f"第{chapter}章摘要"))
-
-            # 同步实体
-            entities_v3 = state.get("entities_v3", {})
-            for entity_type, entities in entities_v3.items():
-                for entity_id, entity_data in entities.items():
-                    cursor.execute("""
-                        INSERT OR REPLACE INTO entities
-                        (entity_id, entity_type, canonical_name, tier, desc, created_chapter)
-                        VALUES (?, ?, ?, ?, ?, ?)
-                    """, (
-                        entity_id,
-                        entity_type,
-                        entity_data.get("canonical_name", ""),
-                        entity_data.get("tier", "装饰"),
-                        entity_data.get("desc", ""),
-                        chapter
-                    ))
-
-                    # 同步别名
-                    for alias in entity_data.get("aliases", []):
-                        cursor.execute("""
-                            INSERT OR IGNORE INTO entity_aliases
-                            (alias, entity_id, entity_type, first_seen_chapter)
-                            VALUES (?, ?, ?, ?)
-                        """, (alias, entity_id, entity_type, chapter))
-
-                    # 同步当前属性
-                    for key, value in entity_data.get("current", {}).items():
-                        cursor.execute("""
-                            INSERT OR REPLACE INTO entity_kv
-                            (entity_id, key, value, last_chapter)
-                            VALUES (?, ?, ?, ?)
-                        """, (entity_id, key, str(value), chapter))
-
-            # 同步伏笔
-            for fs in state.get("foreshadowing", []):
-                # 计算紧急度
-                if fs.get("status") == "未回收":
-                    target = fs.get("target_chapter", chapter + 100)
-                    urgency = max(0, 100 - (target - chapter))
-                else:
-                    urgency = 0
-
-                cursor.execute("""
-                    INSERT OR REPLACE INTO foreshadowing_index
-                    (foreshadow_id, content, tier, status, planted_chapter,
-                     target_chapter, resolved_chapter, urgency_score)
-                    VALUES (?, ?, ?, ?, ?, ?, ?, ?)
-                """, (
-                    fs.get("id", f"fs_{chapter}"),
-                    fs.get("content", ""),
-                    fs.get("tier", "装饰"),
-                    fs.get("status", "未回收"),
-                    fs.get("planted_chapter", chapter),
-                    fs.get("target_chapter"),
-                    fs.get("resolved_chapter"),
-                    urgency
-                ))
-
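-            # Sync relationships: delete every row established up to this chapter, then re-insert from state
-            # (simplified strategy; production code should sync incrementally)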
-            cursor.execute("DELETE FROM relationships WHERE established_chapter <= ?", (chapter,))
-            for rel in state.get("relationships", []):
-                cursor.execute("""
-                    INSERT INTO relationships
-                    (char1_id, char2_id, rel_type, intensity, established_chapter)
-                    VALUES (?, ?, ?, ?, ?)
-                """, (
-                    rel.get("char1_id", ""),
-                    rel.get("char2_id", ""),
-                    rel.get("type", "ally"),
-                    rel.get("intensity", 50),
-                    rel.get("established_chapter", chapter)
-                ))
-
-            conn.commit()
-
-        finally:
-            conn.close()
-
-    def run_queries(self, state: Dict, chapter: int):
-        """执行各类查询并计时"""
-        conn = sqlite3.connect(str(self.db_path))
-        cursor = conn.cursor()
-
-        try:
-            # 1. 别名查询
-            alias_list = list(state.get("alias_index", {}).keys())
-            if alias_list:
-                test_alias = random.choice(alias_list)
-                start = time.time()
-                cursor.execute("SELECT entity_id, entity_type FROM entity_aliases WHERE alias = ?", (test_alias,))
-                cursor.fetchall()
-                self.metrics.record_query_time("alias_lookup", time.time() - start)
-
-            # 2. 模糊搜索
-            if alias_list:
-                search_term = random.choice(alias_list)[:2]  # 取前两个字
-                start = time.time()
-                cursor.execute("""
-                    SELECT DISTINCT entity_id, entity_type, alias
-                    FROM entity_aliases
-                    WHERE alias LIKE ?
-                    LIMIT 20
-                """, (f"%{search_term}%",))
-                cursor.fetchall()
-                self.metrics.record_query_time("fuzzy_search", time.time() - start)
-
-            # 3. 伏笔紧急度查询
-            start = time.time()
-            cursor.execute("""
-                SELECT foreshadow_id, content, urgency_score
-                FROM foreshadowing_index
-                WHERE status = '未回收'
-                ORDER BY urgency_score DESC
-                LIMIT 10
-            """)
-            cursor.fetchall()
-            self.metrics.record_query_time("foreshadow_urgency", time.time() - start)
-
-            # 4. 关系查询
-            entities_v3 = state.get("entities_v3", {})
-            char_ids = list(entities_v3.get("角色", {}).keys())
-            if char_ids:
-                test_char = random.choice(char_ids)
-                start = time.time()
-                cursor.execute("""
-                    SELECT char2_id, rel_type, intensity
-                    FROM relationships
-                    WHERE char1_id = ?
-                    UNION
-                    SELECT char1_id, rel_type, intensity
-                    FROM relationships
-                    WHERE char2_id = ?
-                """, (test_char, test_char))
-                cursor.fetchall()
-                self.metrics.record_query_time("relationship_query", time.time() - start)
-
-            # 5. 按类型查询实体
-            start = time.time()
-            cursor.execute("""
-                SELECT entity_id, canonical_name, tier
-                FROM entities
-                WHERE entity_type = '角色' AND tier = '核心'
-            """)
-            cursor.fetchall()
-            self.metrics.record_query_time("entity_by_type", time.time() - start)
-
-        finally:
-            conn.close()
-
-    def simulate_chapter(self, chapter: int, state: Dict) -> Dict:
-        """模拟一章的数据变化(与主测试脚本类似)"""
-        state["progress"]["current_chapter"] = chapter
-        state["progress"]["total_words"] += CONFIG["words_per_chapter"]
-
-        entities_v3 = state["entities_v3"]
-        alias_index = state["alias_index"]
-
-        # 新增角色
-        if random.random() < self._get_character_rate(chapter):
-            char_name = self._generate_character_name()
-            char_id = self._generate_id("char")
-            tier = random.choices(["核心", "支线", "装饰"], weights=[0.1, 0.3, 0.6])[0]
-
-            entities_v3["角色"][char_id] = {
-                "canonical_name": char_name,
-                "desc": f"第{chapter}章出场",
-                "tier": tier,
-                "aliases": [char_name],
-                "current": {"first_appearance": chapter},
-                "history": [],
-            }
-            alias_index[char_name] = [{"type": "角色", "id": char_id}]
-
-            # 额外别名
-            if random.random() < 0.5:
-                alias = char_name[0] + "兄"
-                entities_v3["角色"][char_id]["aliases"].append(alias)
-                if alias not in alias_index:
-                    alias_index[alias] = []
-                alias_index[alias].append({"type": "角色", "id": char_id})
-
-        # 新增地点
-        if random.random() < CONFIG["new_location_rate"]:
-            loc_name = random.choice(["天", "云", "龙"]) + random.choice(["山", "谷", "城"])
-            loc_id = self._generate_id("loc")
-            entities_v3["地点"][loc_id] = {
-                "canonical_name": loc_name,
-                "desc": f"第{chapter}章",
-                "tier": "装饰",
-                "aliases": [loc_name],
-                "current": {},
-                "history": [],
-            }
-            if loc_name not in alias_index:
-                alias_index[loc_name] = []
-            alias_index[loc_name].append({"type": "地点", "id": loc_id})
-
-        # 伏笔
-        if random.random() < CONFIG["foreshadow_plant_rate"]:
-            state["foreshadowing"].append({
-                "id": f"fs_{chapter}_{random.randint(1000, 9999)}",
-                "content": f"第{chapter}章伏笔",
-                "tier": random.choice(["核心", "支线", "装饰"]),
-                "status": "未回收",
-                "planted_chapter": chapter,
-                "target_chapter": chapter + random.randint(10, 100),
-            })
-
-        # 回收伏笔
-        for fs in state["foreshadowing"]:
-            if (fs.get("status") == "未回收" and
-                fs.get("target_chapter", 999) <= chapter and
-                random.random() < CONFIG["foreshadow_resolve_rate"]):
-                fs["status"] = "已回收"
-                fs["resolved_chapter"] = chapter
-
-        # 关系
-        if chapter % CONFIG["relationship_update_interval"] == 0:
-            char_ids = list(entities_v3["角色"].keys())
-            if len(char_ids) >= 2:
-                char1, char2 = random.sample(char_ids, 2)
-                state["relationships"].append({
-                    "char1_id": char1,
-                    "char2_id": char2,
-                    "type": random.choice(["ally", "enemy", "romance", "rival"]),
-                    "intensity": random.randint(30, 100),
-                    "established_chapter": chapter,
-                })
-
-        return state
-
-    def run_simulation(self, checkpoint_interval: int = 10):
-        """运行完整模拟"""
-        print(f"🚀 Starting {CONFIG['total_chapters']}-chapter index stress test...")
-        print(f"📁 测试目录: {self.project_root}")
-        print()
-
-        state = self.init_project()
-        self.metrics.record_checkpoint(0, self.db_path, state)
-
-        start_time = time.time()
-
-        for chapter in range(1, CONFIG["total_chapters"] + 1):
-            try:
-                # 模拟章节数据
-                state = self.simulate_chapter(chapter, state)
-
-                # 保存 state.json
-                atomic_write_json(self.state_file, state, use_lock=True, backup=False)
-
-                # 同步到索引
-                sync_start = time.time()
-                self.sync_to_index(state, chapter)
-                sync_duration = time.time() - sync_start
-                self.metrics.record_sync_time(sync_duration)
-
-                # 执行查询测试
-                self.run_queries(state, chapter)
-
-                # 记录检查点
-                if chapter % checkpoint_interval == 0:
-                    self.metrics.record_checkpoint(chapter, self.db_path, state)
-                    elapsed = time.time() - start_time
-                    eta = elapsed / chapter * (CONFIG["total_chapters"] - chapter)
-                    db_size = self.db_path.stat().st_size / 1024 if self.db_path.exists() else 0
-                    print(f"  第 {chapter:3d} 章 | "
-                          f"DB {db_size:.1f}KB | "
-                          f"同步 {sync_duration*1000:.1f}ms | "
-                          f"ETA {eta:.0f}s")
-
-            except Exception as e:
-                self.metrics.record_error(f"Chapter {chapter}: {str(e)}")
-                print(f"  ❌ 第 {chapter} 章错误: {e}")
-
-        # 最终检查点
-        self.metrics.record_checkpoint(CONFIG["total_chapters"], self.db_path, state)
-
-        total_time = time.time() - start_time
-        print()
-        print(f"✅ 索引测试完成!总耗时: {total_time:.1f}s")
-        print()
-
-        return self.metrics.generate_report()
-
-
-def main():
-    """主函数"""
-    test_dir = Path(tempfile.mkdtemp(prefix="webnovel_index_test_"))
-
-    try:
-        simulator = IndexSimulator(test_dir)
-        report = simulator.run_simulation(checkpoint_interval=10)
-
-        print(report)
-
-        # 保存报告
-        report_file = test_dir / "index_stress_test_report.md"
-        report_file.write_text(report, encoding="utf-8")
-        print(f"\n📄 报告已保存: {report_file}")
-        print(f"\n测试数据目录: {test_dir}")
-
-    except KeyboardInterrupt:
-        print("\n⚠️ 测试被中断")
-    except Exception as e:
-        print(f"\n❌ 测试失败: {e}")
-        import traceback
-        traceback.print_exc()
-
-
-if __name__ == "__main__":
-    main()
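
The deleted stress test and `structured_index.py` both rely on a one-to-many alias table keyed on `(alias, entity_id)`, and `index_chapter` skips any alias that resolves to more than one entity. The sketch below reproduces that behaviour in isolation, assuming an in-memory SQLite database. The helper names `build_alias_table`, `add_alias`, `alias_candidates`, and `resolve_alias` are hypothetical and do not appear in the removed scripts; only the table definition is copied from the diff.

```python
import sqlite3
from typing import List, Optional


def build_alias_table(conn: sqlite3.Connection) -> None:
    """Create the alias table as defined in the removed v4.0 schema."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS entity_aliases (
            alias TEXT,
            entity_id TEXT,
            entity_type TEXT,
            first_seen_chapter INTEGER,
            context TEXT,
            PRIMARY KEY (alias, entity_id)
        );
        CREATE INDEX IF NOT EXISTS idx_alias ON entity_aliases(alias);
    """)


def add_alias(conn: sqlite3.Connection, alias: str, entity_id: str,
              entity_type: str, chapter: int) -> None:
    """Register an alias; a repeated (alias, entity_id) pair is ignored."""
    conn.execute(
        "INSERT OR IGNORE INTO entity_aliases "
        "(alias, entity_id, entity_type, first_seen_chapter) "
        "VALUES (?, ?, ?, ?)",
        (alias, entity_id, entity_type, chapter),
    )


def alias_candidates(conn: sqlite3.Connection, alias: str,
                     entity_type: str) -> List[str]:
    """Return every entity_id of the given type that the alias maps to."""
    rows = conn.execute(
        "SELECT entity_id FROM entity_aliases "
        "WHERE alias = ? AND entity_type = ? ORDER BY entity_id",
        (alias, entity_type),
    ).fetchall()
    return [row[0] for row in rows]


def resolve_alias(conn: sqlite3.Connection, alias: str,
                  entity_type: str) -> Optional[str]:
    """Return the entity_id only when the alias is unambiguous, else None."""
    candidates = alias_candidates(conn, alias, entity_type)
    return candidates[0] if len(candidates) == 1 else None
```

Because the primary key is the pair `(alias, entity_id)`, the same alias can point at several entities while `INSERT OR IGNORE` keeps the first `first_seen_chapter` for each pair. Returning `None` for an ambiguous alias mirrors the "skip and warn" branch in `index_chapter`; a caller that wants the v5.x confidence-based disambiguation would work from `alias_candidates` instead.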

+ 0 - 1261
.claude/scripts/structured_index.py

@@ -1,1261 +0,0 @@
-#!/usr/bin/env python3
-"""
-Structured Index System v4.0
-
-⚠️ DEPRECATED: This module has been replaced by the v5.1 index_manager.
-   - v5.1 uses a different schema (entities.id, aliases, current_json)
-   - This module is kept only for migrating legacy projects
-   - New projects should use data_modules.index_manager
-
-Goal: replace vector retrieval with precise, fast structured queries backed by SQLite
-
-v4.0 changes:
-- Added the entities/entity_aliases/entity_kv/entity_history tables
-- Primary key migrated from name to entity_id
-- The relationships table uses char1_id/char2_id
-- No longer writes back to state.json (removes the circular dependency)
-- Syncs data from entities_v3 + alias_index
-
-Core features:
-1. Entity index (entities, entity_aliases, entity_kv, entity_history)
-2. Chapter metadata index (location, characters, word_count)
-3. Foreshadowing tracking index (status, urgency calculation)
-4. File-hash self-healing (auto-rebuild on change)
-
-Performance targets:
-- Query latency: 2-5ms (vs. ~500ms for a file scan, roughly 100-250x faster)
-- Index build: 10ms per chapter (incremental update)
-- Storage overhead: 200 chapters ≈ 100 KB
-
-Usage:
-  # Update the index for a single chapter
-  python structured_index.py --update-chapter 7 --metadata-file /tmp/ch7.json
-
-  # Rebuild the index in bulk (historical chapters)
-  python structured_index.py --rebuild-index
-
-  # Query chapters related to a location
-  python structured_index.py --query-location "血煞秘境"
-
-  # Query urgent foreshadowing
-  python structured_index.py --query-urgent-foreshadowing
-
-  # Fuzzy-search characters
-  python structured_index.py --fuzzy-search "姓李" "女弟子"
-
-  # Show statistics
-  python structured_index.py --stats
-"""
-
-import json
-import os
-import sys
-import argparse
-import sqlite3
-import hashlib
-import re
-import tempfile
-from datetime import datetime
-from pathlib import Path
-from typing import Optional, List, Dict, Tuple
-
-# ============================================================================
-# 安全修复:导入安全工具函数(P1 MEDIUM)
-# ============================================================================
-from security_utils import create_secure_directory
-from project_locator import resolve_project_root
-from chapter_paths import find_chapter_file
-
-
-class StructuredIndex:
-    """结构化索引管理器(取代向量化检索)"""
-
-    def __init__(self, project_root=None):
-        if project_root is None:
-            try:
-                project_root = resolve_project_root()
-            except FileNotFoundError:
-                project_root = Path.cwd()
-        else:
-            project_root = Path(project_root)
-
-        self.project_root = project_root
-        self.state_file = project_root / ".webnovel" / "state.json"
-        self.chapters_dir = project_root / "正文"
-        self.index_db = project_root / ".webnovel" / "index.db"
-
-        # ============================================================================
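-        # Security fix: create the directory with the secure helper (P1 MEDIUM)
-        # Original code: self.index_db.parent.mkdir(parents=True, exist_ok=True)
-        # Vulnerability: no explicit permissions, so the OS default applies (possibly 755, readable by group and other users)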
-        # ============================================================================
-        create_secure_directory(str(self.index_db.parent))
-
-        # 连接数据库
-        self.conn = sqlite3.connect(str(self.index_db))
-        self.conn.row_factory = sqlite3.Row  # 返回字典式行
-
-        # 创建表结构
-        self._create_tables()
-
-    def _create_tables(self):
-        """创建索引表结构(v4.0 主键迁移到 entity_id)"""
-
-        # ============== 新增实体表(v4.0)==============
-
-        # 实体主表(取代旧 characters 表)
-        self.conn.execute("""
-            CREATE TABLE IF NOT EXISTS entities (
-                entity_id TEXT PRIMARY KEY,
-                entity_type TEXT NOT NULL,
-                canonical_name TEXT,
-                tier TEXT,
-                desc TEXT,
-                created_chapter INTEGER,
-                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            )
-        """)
-
-        # 实体类型索引
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_entity_type
-            ON entities(entity_type)
-        """)
-
-        # 别名表(支持一对多查询)
-        self.conn.execute("""
-            CREATE TABLE IF NOT EXISTS entity_aliases (
-                alias TEXT,
-                entity_id TEXT,
-                entity_type TEXT,
-                first_seen_chapter INTEGER,
-                context TEXT,
-                PRIMARY KEY (alias, entity_id)
-            )
-        """)
-
-        # 别名索引(加速反向查询)
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_alias
-            ON entity_aliases(alias)
-        """)
-
-        # 实体属性 KV 表
-        self.conn.execute("""
-            CREATE TABLE IF NOT EXISTS entity_kv (
-                entity_id TEXT,
-                key TEXT,
-                value TEXT,
-                last_chapter INTEGER,
-                PRIMARY KEY (entity_id, key)
-            )
-        """)
-
-        # 实体历史表
-        self.conn.execute("""
-            CREATE TABLE IF NOT EXISTS entity_history (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                entity_id TEXT,
-                chapter INTEGER,
-                changes_json TEXT,
-                reasons_json TEXT,
-                added_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            )
-        """)
-
-        # 历史索引
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_entity_history
-            ON entity_history(entity_id, chapter)
-        """)
-
-        # ============== 章节元数据表 ==============
-
-        # 1. 章节元数据表(v4.0: characters 改为存 entity_id 列表)
-        self.conn.execute("""
-            CREATE TABLE IF NOT EXISTS chapters (
-                chapter_num INTEGER PRIMARY KEY,
-                title TEXT,
-                location TEXT,
-                location_id TEXT,
-                characters TEXT,  -- JSON: ["entity_id_1", "entity_id_2"]
-                word_count INTEGER,
-                content_hash TEXT,
-                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            )
-        """)
-
-        # 地点索引(加速查询)
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_location
-            ON chapters(location)
-        """)
-
-        # 2. 伏笔追踪表
-        self.conn.execute("""
-            CREATE TABLE IF NOT EXISTS foreshadowing_index (
-                id INTEGER PRIMARY KEY,
-                content TEXT,
-                location TEXT,
-                characters TEXT,  -- JSON: ["李雪", "主角"]
-                introduced_chapter INTEGER,
-                resolved_chapter INTEGER,
-                status TEXT,  -- '未回收' / '已回收'
-                urgency INTEGER DEFAULT 0,  -- 0-100,自动计算
-                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            )
-        """)
-
-        # 状态索引
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_status
-            ON foreshadowing_index(status)
-        """)
-
-        # 紧急度索引
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_urgency
-            ON foreshadowing_index(urgency)
-        """)
-
-        # 3. 角色关系表(v4.0: 使用 entity_id)
-        self.conn.execute("""
-            CREATE TABLE IF NOT EXISTS relationships (
-                id INTEGER PRIMARY KEY AUTOINCREMENT,
-                char1_id TEXT,
-                char2_id TEXT,
-                char1_name TEXT,
-                char2_name TEXT,
-                relation_type TEXT,  -- 'ally', 'enemy', 'romance', 'mentor', 'debtor'
-                intensity INTEGER,    -- 关系强度 0-100
-                description TEXT,
-                last_update_chapter INTEGER,
-                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-                UNIQUE(char1_id, char2_id, relation_type)  -- 防止重复
-            )
-        """)
-
-        # 关系索引(v4.0: 使用 entity_id)
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_char1_char2
-            ON relationships(char1_id, char2_id)
-        """)
-
-        # 4. 角色索引表(v4.0 已废弃,保留兼容)
-        # 新代码应使用 entities 表
-        self.conn.execute("""
-            CREATE TABLE IF NOT EXISTS characters (
-                name TEXT PRIMARY KEY,
-                description TEXT,
-                personality TEXT,
-                importance TEXT,  -- 'major' / 'minor'
-                power_level TEXT,
-                first_appearance INTEGER,
-                last_appearance INTEGER,
-                status TEXT DEFAULT 'active',  -- 'active' / 'archived'
-                archived_at TEXT,  -- ISO timestamp
-                created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
-            )
-        """)
-
-        # 角色名索引(加速模糊搜索)
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_character_name
-            ON characters(name)
-        """)
-
-        # 状态索引
-        self.conn.execute("""
-            CREATE INDEX IF NOT EXISTS idx_character_status
-            ON characters(status)
-        """)
-
-        self.conn.commit()
-
-    # ================== 核心功能 1:章节元数据索引 ==================
-
-    def index_chapter(self, chapter_num: int, metadata: Dict):
-        """为新章节建立索引(在 webnovel-write Step 4.6 调用)
-
-        Args:
-            chapter_num: 章节编号
-            metadata: {
-                'title': '章节标题',
-                'location': '地点',
-                'characters': ['李雪', '主角'],
-                'word_count': 3500,
-                'hash': 'md5_hash'
-            }
-        """
-        def _normalize_str_list(v) -> List[str]:
-            if v is None:
-                return []
-            if isinstance(v, list):
-                return [str(x).strip() for x in v if str(x).strip()]
-            if isinstance(v, str):
-                return [s.strip() for s in re.split(r"[,,]", v) if s.strip()]
-            return [str(v).strip()] if str(v).strip() else []
-
-        def _exists_entity(entity_id: str, entity_type: str) -> bool:
-            row = self.conn.execute(
-                "SELECT 1 FROM entities WHERE entity_id = ? AND entity_type = ? LIMIT 1",
-                (entity_id, entity_type),
-            ).fetchone()
-            return bool(row)
-
-        def _resolve_alias_ids(alias: str, entity_type: str) -> List[str]:
-            rows = self.conn.execute(
-                "SELECT entity_id FROM entity_aliases WHERE alias = ? AND entity_type = ?",
-                (alias, entity_type),
-            ).fetchall()
-            return [r["entity_id"] for r in rows] if rows else []
-
-        # v4.0: chapters.characters 存 entity_id 列表(metadata 允许传入 name/alias,索引层负责解析)
-        resolved_character_ids: List[str] = []
-        seen_ids = set()
-        for ref in _normalize_str_list(metadata.get("characters", [])):
-            if _exists_entity(ref, "角色"):
-                if ref not in seen_ids:
-                    resolved_character_ids.append(ref)
-                    seen_ids.add(ref)
-                continue
-
-            candidates = _resolve_alias_ids(ref, "角色")
-            if len(candidates) == 1:
-                cid = candidates[0]
-                if cid not in seen_ids:
-                    resolved_character_ids.append(cid)
-                    seen_ids.add(cid)
-                continue
-
-            if len(candidates) > 1:
-                print(f"⚠️ 角色别名歧义,跳过: {ref!r} 命中 {len(candidates)} 个角色")
-            else:
-                print(f"⚠️ 未知角色,跳过: {ref!r}")
-
-        # v4.0: 可选 location_id(只解析为地点实体)
-        location = str(metadata.get("location", "")).strip()
-        location_id = ""
-        if location:
-            if _exists_entity(location, "地点"):
-                location_id = location
-            else:
-                loc_candidates = _resolve_alias_ids(location, "地点")
-                if len(loc_candidates) == 1:
-                    location_id = loc_candidates[0]
-                elif len(loc_candidates) > 1:
-                    print(f"⚠️ 地点别名歧义,location_id 留空: {location!r} 命中 {len(loc_candidates)} 个地点")
-
-        self.conn.execute("""
-            INSERT OR REPLACE INTO chapters
-            (chapter_num, title, location, location_id, characters, word_count, content_hash, updated_at)
-            VALUES (?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
-        """, (
-            chapter_num,
-            metadata['title'],
-            location,
-            location_id,
-            json.dumps(resolved_character_ids, ensure_ascii=False),
-            metadata['word_count'],
-            metadata['hash']
-        ))
-
-        self.conn.commit()
-        print(f"✅ 章节索引已更新:Ch{chapter_num} - {metadata['title']}")
-
-    # bump_character_last_appearance_in_state 已删除(v4.0)
-    # 原因:消除索引层写回 state.json 的循环依赖
-    # last_appearance_chapter 现在作为 index.db 的派生字段
-
-    def query_chapters_by_location(self, location: str, limit: int = 10) -> List[Tuple]:
-        """O(log n) 查询:返回该地点的最近 N 章
-
-        Args:
-            location: 地点名称
-            limit: 返回数量
-
-        Returns:
-            [(chapter_num, title, characters), ...]
-        """
-        cursor = self.conn.execute("""
-            SELECT chapter_num, title, characters
-            FROM chapters
-            WHERE location = ?
-            ORDER BY chapter_num DESC
-            LIMIT ?
-        """, (location, limit))
-
-        return cursor.fetchall()
-
-    def calculate_chapter_hash(self, chapter_file: Path) -> str:
-        """计算章节文件 MD5 Hash(用于自愈机制)"""
-        if not chapter_file.exists():
-            return ""
-
-        with open(chapter_file, 'rb') as f:
-            return hashlib.md5(f.read()).hexdigest()
-
-    def get_stored_hash(self, chapter_num: int) -> Optional[str]:
-        """从索引中读取存储的 Hash"""
-        cursor = self.conn.execute("""
-            SELECT content_hash FROM chapters WHERE chapter_num = ?
-        """, (chapter_num,))
-
-        row = cursor.fetchone()
-        return row['content_hash'] if row else None
-
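-    def validate_and_rebuild_if_needed(self, chapter_num: int):
-        """Verify the chapter hash and rebuild the index if it differs (self-healing index).
-
-        When it runs:
-        - Called by context_manager.py before it queries a chapter
-        - Added cost: ~5 ms (hash computation + comparison)
-        - The index is rebuilt only when a change is detected (incremental cost)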
-        """
-        chapter_file = find_chapter_file(self.project_root, chapter_num)
-        if chapter_file is None or not chapter_file.exists():
-            return  # 文件不存在,跳过
-
-        # 计算当前文件 Hash
-        current_hash = self.calculate_chapter_hash(chapter_file)
-
-        # 从索引中读取存储的 Hash
-        stored_hash = self.get_stored_hash(chapter_num)
-
-        if current_hash != stored_hash:
-            print(f"⚠️ 检测到 Ch{chapter_num} 已修改,自动重建索引...")
-            self._rebuild_chapter_index(chapter_num, chapter_file)
-            print(f"✅ Ch{chapter_num} 索引已更新")
-
-    def _rebuild_chapter_index(self, chapter_num: int, chapter_file: Path):
-        """重建单章索引(自动提取元数据)"""
-
-        # 读取章节内容
-        with open(chapter_file, 'r', encoding='utf-8') as f:
-            content = f.read()
-
-        # 提取元数据
-        metadata = self._extract_metadata_from_content(content, chapter_num)
-
-        # 重建索引
-        self.index_chapter(chapter_num, metadata)
-
-    def _extract_metadata_from_content(self, content: str, chapter_num: int) -> Dict:
-        """从章节内容中提取元数据"""
-
-        # 提取标题(第一行)
-        lines = content.split('\n')
-        title = lines[0].strip('# ').strip() if lines else f"第{chapter_num}章"
-
-        # 提取地点(在章节开头查找,通常格式为 **地点:XXX**)
-        location_match = re.search(r'\*\*地点[::]\s*(.+?)\*\*', content)
-        location = location_match.group(1).strip() if location_match else "未知"
-
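-        # Extract characters (names that appear in dialogue and description)
-        # Simplified: match known character names from the index's entities table by occurrence count
-        characters = self._extract_characters_from_content(content)
-
-        # Length in characters (len() counts characters, not words)
-        word_count = len(content)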
-
-        # 计算 Hash
-        content_hash = hashlib.md5(content.encode('utf-8')).hexdigest()
-
-        return {
-            'title': title,
-            'location': location,
-            'characters': characters[:5],  # 最多 5 个主要角色
-            'word_count': word_count,
-            'hash': content_hash
-        }
-
-    def _extract_characters_from_content(self, content: str) -> List[str]:
-        """从内容中提取角色(简化实现:读取索引中已知角色 canonical_name)"""
-
-        # 获取已知角色列表(限制规模,避免超大角色库拖慢)
-        rows = self.conn.execute(
-            "SELECT canonical_name FROM entities WHERE entity_type = ? AND canonical_name != '' LIMIT 800",
-            ("角色",),
-        ).fetchall()
-        known_characters = [r["canonical_name"] for r in rows] if rows else []
-        if not known_characters:
-            return []
-
-        # 统计每个角色在内容中的出现次数
-        char_counts = {}
-        for char_name in known_characters:
-            count = content.count(char_name)
-            if count > 0:
-                char_counts[char_name] = count
-
-        # 按出现次数排序,返回前 5 个
-        sorted_chars = sorted(char_counts.items(), key=lambda x: x[1], reverse=True)
-        return [char for char, _ in sorted_chars[:5]]
-
-    # ================== 核心功能 2:伏笔追踪索引 ==================
-
-    def sync_foreshadowing_from_state(self):
-        """Sync foreshadowing data from state.json into the index.
-
-        When it runs:
-        - Called after update_state.py updates foreshadowing
-        - Called during a bulk rebuild via --rebuild-index
-        """
-        if not self.state_file.exists():
-            print("❌ state.json 不存在,跳过伏笔同步")
-            return
-
-        # 读取 state.json
-        with open(self.state_file, 'r', encoding='utf-8') as f:
-            state = json.load(f)
-
-        current_chapter = state.get('progress', {}).get('current_chapter', 0)
-
-        plot_threads = state.get('plot_threads', {}) or {}
-
-        # 兼容新格式:plot_threads.foreshadowing = [{"content": "...", "status": "active", ...}, ...]
-        foreshadowing_items = plot_threads.get('foreshadowing', []) or []
-        active_count = 0
-        resolved_count = 0
-
-        for item in foreshadowing_items:
-            desc = item.get('description') or item.get('content') or ''
-            if not desc:
-                continue
-
-            raw_status = (item.get('status') or '').strip()
-            if raw_status in ['已回收', 'resolved']:
-                status = '已回收'
-                resolved_count += 1
-            else:
-                # 默认都视为未回收(兼容 active/未回收/pending/空)
-                status = '未回收'
-                active_count += 1
-
-            normalized = {
-                'description': desc,
-                'location': item.get('location', ''),
-                'characters': item.get('characters', []),
-                # 如果没有明确记录,至少给一个可用的默认值(避免紧急度恒为0)
-                'introduced_chapter': item.get('introduced_chapter') or item.get('planted_chapter') or 1,
-                'resolved_chapter': item.get('resolved_chapter', None),
-            }
-
-            self._index_foreshadowing(normalized, current_chapter, status=status)
-
-        self.conn.commit()
-        print(f"✅ 伏笔索引已同步:{active_count} 条活跃 + {resolved_count} 条已回收")
-
-    def _index_foreshadowing(self, plot: Dict, current_chapter: int, status: str):
-        """为单个伏笔建立索引"""
-
-        # 计算紧急度
-        urgency = self._calculate_urgency(plot, current_chapter)
-
-        # 提取地点和角色(如果有)
-        location = plot.get('location', '')
-        characters = plot.get('characters', [])
-
-        self.conn.execute("""
-            INSERT OR REPLACE INTO foreshadowing_index
-            (id, content, location, characters, introduced_chapter, resolved_chapter, status, urgency, updated_at)
-            VALUES ((SELECT id FROM foreshadowing_index WHERE content = ?), ?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
-        """, (
-            plot.get('description', ''),  # 用于查重
-            plot.get('description', ''),
-            location,
-            json.dumps(characters, ensure_ascii=False),
-            plot.get('introduced_chapter', 0),
-            plot.get('resolved_chapter', None),
-            status,
-            urgency
-        ))
-
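-    def _calculate_urgency(self, plot: Dict, current_chapter: int) -> int:
-        """Compute foreshadowing urgency (0-100).
-
-        Rules:
-        - Unresolved for more than 100 chapters → critical (100)
-        - Unresolved for more than 50 chapters → moderate (60)
-        - Otherwise → normal (20)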
-        """
-        introduced_ch = plot.get('introduced_chapter', 0)
-        chapters_pending = current_chapter - introduced_ch
-
-        if chapters_pending > 100:
-            return 100  # 极度紧急
-        elif chapters_pending > 50:
-            return 60   # 中等紧急
-        else:
-            return 20   # 正常
-
-    # ================== v4.0 实体同步(使用 entities_v3)==================
-
-    def sync_entities_from_state(self):
-        """Sync entities from state.json.entities_v3 into the entities/entity_aliases tables.
-
-        New in v4.0: replaces the old sync_characters_from_state
-        Data source: state.json.entities_v3 + alias_index
-        """
-        if not self.state_file.exists():
-            print("❌ state.json 不存在,跳过实体同步")
-            return
-
-        with open(self.state_file, 'r', encoding='utf-8') as f:
-            state = json.load(f)
-
-        entities_v3 = state.get('entities_v3', {})
-        alias_index = state.get('alias_index', {})
-
-        # v4.0:索引层为派生数据,可直接重建(避免重复插入导致膨胀)
-        self.conn.execute("DELETE FROM entity_kv")
-        self.conn.execute("DELETE FROM entity_aliases")
-        self.conn.execute("DELETE FROM entity_history")
-        self.conn.execute("DELETE FROM entities")
-
-        entity_count = 0
-        alias_count = 0
-
-        # 遍历所有实体类型
-        for entity_type, entities in entities_v3.items():
-            for entity_id, entity_data in entities.items():
-                # 写入 entities 主表
-                canonical_name = entity_data.get('canonical_name', '')
-                tier = entity_data.get('tier', '')
-                desc = entity_data.get('desc', '')
-                created_chapter = entity_data.get('created_chapter', 0)
-
-                self.conn.execute("""
-                    INSERT OR REPLACE INTO entities
-                    (entity_id, entity_type, canonical_name, tier, desc, created_chapter, updated_at)
-                    VALUES (?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
-                """, (entity_id, entity_type, canonical_name, tier, desc, created_chapter))
-                entity_count += 1
-
-                # 写入实体 KV 属性
-                current = entity_data.get('current', {})
-                last_chapter = current.get("last_chapter", created_chapter) if isinstance(current, dict) else created_chapter
-                try:
-                    last_chapter = int(last_chapter)
-                except (TypeError, ValueError):
-                    last_chapter = int(created_chapter or 0)
-                for key, value in current.items():
-                    value_str = json.dumps(value, ensure_ascii=False) if isinstance(value, (dict, list)) else str(value)
-                    self.conn.execute("""
-                        INSERT OR REPLACE INTO entity_kv
-                        (entity_id, key, value, last_chapter)
-                        VALUES (?, ?, ?, ?)
-                    """, (entity_id, key, value_str, last_chapter))
-
-                # 写入历史记录
-                history = entity_data.get('history', [])
-                for record in history:
-                    chapter = record.get('chapter', 0)
-                    changes = record.get('changes', {})
-                    reasons = record.get('reasons', {})
-                    self.conn.execute("""
-                        INSERT OR IGNORE INTO entity_history
-                        (entity_id, chapter, changes_json, reasons_json)
-                        VALUES (?, ?, ?, ?)
-                    """, (entity_id, chapter, json.dumps(changes, ensure_ascii=False), json.dumps(reasons, ensure_ascii=False)))
-
-        # 同步别名索引
-        for alias, entries in alias_index.items():
-            # v4.0: entries 必须是数组(一对多)
-            if not isinstance(entries, list):
-                raise ValueError(
-                    f"alias_index 数据格式错误:期望 alias_index[{alias!r}] 为 list[{{type,id,...}}],实际为 {type(entries).__name__}"
-                )
-            for entry in entries:
-                entry_type = entry.get('type', '')
-                entry_id = entry.get('id', '')
-                first_seen = entry.get('first_seen_chapter', 0)
-                context = entry.get('context', '')
-
-                self.conn.execute("""
-                    INSERT OR REPLACE INTO entity_aliases
-                    (alias, entity_id, entity_type, first_seen_chapter, context)
-                    VALUES (?, ?, ?, ?, ?)
-                """, (alias, entry_id, entry_type, first_seen, context))
-                alias_count += 1
-
-        self.conn.commit()
-        print(f"✅ 实体索引已同步:{entity_count} 个实体,{alias_count} 个别名")
-
-    def query_entity_by_id(self, entity_id: str) -> Optional[Dict]:
-        """通过 entity_id 查询实体详情"""
-        cursor = self.conn.execute("""
-            SELECT entity_id, entity_type, canonical_name, tier, desc, created_chapter
-            FROM entities WHERE entity_id = ?
-        """, (entity_id,))
-        row = cursor.fetchone()
-        if not row:
-            return None
-
-        result = dict(row)
-
-        # 获取 KV 属性
-        cursor = self.conn.execute("""
-            SELECT key, value FROM entity_kv WHERE entity_id = ?
-        """, (entity_id,))
-        result['current'] = {}
-        for kv_row in cursor.fetchall():
-            try:
-                result['current'][kv_row['key']] = json.loads(kv_row['value'])
-            except json.JSONDecodeError:
-                result['current'][kv_row['key']] = kv_row['value']
-
-        # 获取别名
-        cursor = self.conn.execute("""
-            SELECT alias FROM entity_aliases WHERE entity_id = ?
-        """, (entity_id,))
-        result['aliases'] = [row['alias'] for row in cursor.fetchall()]
-
-        return result
-
-    def query_entities_by_alias(self, alias: str) -> List[Dict]:
-        """通过别名查询实体(支持一对多)"""
-        cursor = self.conn.execute("""
-            SELECT ea.entity_id, ea.entity_type, e.canonical_name, e.tier
-            FROM entity_aliases ea
-            LEFT JOIN entities e ON ea.entity_id = e.entity_id
-            WHERE ea.alias = ?
-        """, (alias,))
-        return [dict(row) for row in cursor.fetchall()]
-
-    def query_entities_by_type(self, entity_type: str, limit: int = 50) -> List[Dict]:
-        """按类型查询实体"""
-        cursor = self.conn.execute("""
-            SELECT entity_id, canonical_name, tier, desc
-            FROM entities
-            WHERE entity_type = ?
-            ORDER BY created_chapter DESC
-            LIMIT ?
-        """, (entity_type, limit))
-        return [dict(row) for row in cursor.fetchall()]
-
-    def sync_characters_from_state(self):
-        """从 state.json 同步角色数据到索引(v4.0 已废弃)
-
-        保留兼容:调用新的 sync_entities_from_state
-        """
-        # v4.0: 委托给新函数
-        self.sync_entities_from_state()
-
-    def _index_character(self, char: Dict, status: str = 'active'):
-        """为单个角色建立索引"""
-        description = char.get('description') or char.get('desc') or ''
-        tier = str(char.get('tier', '') or '').strip()
-        importance = char.get('importance') or ('major' if tier == '核心' else 'minor')
-
-        first_appearance = char.get('first_appearance_chapter', 0) or 0
-        try:
-            first_appearance = int(first_appearance)
-        except (TypeError, ValueError):
-            first_appearance = 0
-
-        if first_appearance == 0:
-            src = char.get('first_appearance')
-            if isinstance(src, str):
-                m = re.search(r'第(\d+)章', src)
-                if m:
-                    try:
-                        first_appearance = int(m.group(1))
-                    except ValueError:
-                        first_appearance = 0
-
-        last_appearance = char.get('last_appearance_chapter', 0) or first_appearance
-        try:
-            last_appearance = int(last_appearance)
-        except (TypeError, ValueError):
-            last_appearance = first_appearance
-
-        self.conn.execute("""
-            INSERT OR REPLACE INTO characters
-            (name, description, personality, importance, power_level,
-             first_appearance, last_appearance, status, updated_at)
-            VALUES (?, ?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP)
-        """, (
-            char.get('name', ''),
-            description,
-            char.get('personality', ''),
-            importance,
-            char.get('power_level', ''),
-            first_appearance,
-            last_appearance,
-            status
-        ))
-
-    def mark_character_archived(self, name: str, archived_at: str = None):
-        """标记角色为已归档状态(Priority 2 修复)
-
-        Args:
-            name: 角色名
-            archived_at: 归档时间戳(ISO格式),默认当前时间
-        """
-        if archived_at is None:
-            from datetime import datetime
-            archived_at = datetime.now().isoformat()
-
-        self.conn.execute("""
-            UPDATE characters
-            SET status = 'archived', archived_at = ?, updated_at = CURRENT_TIMESTAMP
-            WHERE name = ?
-        """, (archived_at, name))
-        self.conn.commit()
-
-    def mark_character_active(self, name: str):
-        """恢复角色为活跃状态(与 mark_character_archived 对应)"""
-        self.conn.execute("""
-            UPDATE characters
-            SET status = 'active', archived_at = NULL, updated_at = CURRENT_TIMESTAMP
-            WHERE name = ?
-        """, (name,))
-        self.conn.commit()
-
-    def query_urgent_foreshadowing(self, threshold: int = 60) -> List[Dict]:
-        """Query urgent foreshadowing (urgency >= threshold).
-
-        Args:
-            threshold: urgency threshold (60 = moderate, 100 = critical; _calculate_urgency never emits values in between)
-
-        Returns:
-            [{'content': '...', 'introduced_chapter': 45, 'urgency': 60}, ...]
-        """
-        cursor = self.conn.execute("""
-            SELECT content, introduced_chapter, urgency
-            FROM foreshadowing_index
-            WHERE status = '未回收' AND urgency >= ?
-            ORDER BY urgency DESC
-        """, (threshold,))
-
-        return [dict(row) for row in cursor.fetchall()]
-
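-    def sync_relationships_from_state(self):
-        """Sync relationship data from state.json into the index (v4.0: uses entity_id).
-
-        When it runs:
-        - Called after extract_entities.py updates relationships
-        - Called during a bulk rebuild via --rebuild-index
-
-        Data source: the structured_relationships list in state.json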
-        """
-        if not self.state_file.exists():
-            print("❌ state.json 不存在,跳过关系同步")
-            return
-
-        # 读取 state.json
-        with open(self.state_file, 'r', encoding='utf-8') as f:
-            state = json.load(f)
-
-        # 获取结构化关系列表
-        relationships = state.get('structured_relationships', [])
-        if not relationships:
-            print("ℹ️ 无结构化关系数据")
-            return
-
-        count = 0
-        for rel in relationships:
-            # v4.0: 关系必须用 entity_id(chapter tags 是真相,避免 name 漂移)
-            char1_id = str(rel.get('char1_id', '') or '').strip()
-            char2_id = str(rel.get('char2_id', '') or '').strip()
-            char1_name = str(rel.get('char1_name', '') or '').strip()
-            char2_name = str(rel.get('char2_name', '') or '').strip()
-            rel_type = rel.get('type', 'ally')
-            intensity = rel.get('intensity', 50)
-            desc = rel.get('description', '')
-            last_chapter = rel.get('last_update_chapter', 0)
-
-            if not char1_id or not char2_id:
-                print("⚠️ 跳过无效关系(缺少 char1_id/char2_id)")
-                continue
-
-            # 补齐显示名(可选)
-            if not char1_name:
-                row = self.conn.execute("SELECT canonical_name FROM entities WHERE entity_id = ? LIMIT 1", (char1_id,)).fetchone()
-                char1_name = (row["canonical_name"] if row else "") or char1_id
-            if not char2_name:
-                row = self.conn.execute("SELECT canonical_name FROM entities WHERE entity_id = ? LIMIT 1", (char2_id,)).fetchone()
-                char2_name = (row["canonical_name"] if row else "") or char2_id
-
-            self.conn.execute("""
-                INSERT OR REPLACE INTO relationships
-                (id, char1_id, char2_id, char1_name, char2_name, relation_type, intensity, description, last_update_chapter, updated_at)
-                VALUES (
-                    (SELECT id FROM relationships WHERE char1_id = ? AND char2_id = ? AND relation_type = ?),
-                    ?, ?, ?, ?, ?, ?, ?, ?, CURRENT_TIMESTAMP
-                )
-            """, (
-                char1_id, char2_id, rel_type,  # for subquery
-                char1_id, char2_id, char1_name, char2_name, rel_type, intensity, desc, last_chapter
-            ))
-            count += 1
-
-        self.conn.commit()
-        print(f"✅ 关系索引已同步:{count} 条关系")
-
-    def query_relationships(self, char_id: str = None, rel_type: str = None) -> List[Dict]:
-        """Query character relationships (v4.0: uses entity_id).
-
-        Args:
-            char_id: character entity_id (optional; returns every relationship involving that character)
-            rel_type: relationship type (optional; filters to one type)
-
-        Returns:
-            [{'char1_id': '...', 'char2_id': '...', 'relation_type': 'romance', 'intensity': 80, ...}, ...]
-        """
-        conditions = []
-        params = []
-
-        if char_id:
-            conditions.append("(char1_id = ? OR char2_id = ?)")
-            params.extend([char_id, char_id])
-
-        if rel_type:
-            conditions.append("relation_type = ?")
-            params.append(rel_type)
-
-        where_clause = " AND ".join(conditions) if conditions else "1=1"
-
-        cursor = self.conn.execute(f"""
-            SELECT char1_id, char2_id, char1_name, char2_name, relation_type, intensity, description, last_update_chapter
-            FROM relationships
-            WHERE {where_clause}
-            ORDER BY intensity DESC
-        """, params)
-
-        return [dict(row) for row in cursor.fetchall()]
-
-    # ================== 核心功能 3:模糊查询(Fuzzy Search via SQL LIKE)==================
-
-    def fuzzy_search_entity(self, keywords: List[str], entity_type: str = None) -> List[Dict]:
-        """Fuzzy-search entities (new in v4.0; supports multiple keywords plus a type filter).
-
-        Args:
-            keywords: list of keywords, e.g. ["李", "女弟子"]
-            entity_type: optional entity-type filter (角色/地点/物品/势力/招式)
-
-        Returns:
-            [{'entity_id': '...', 'canonical_name': '...', 'desc': '...', 'tier': '...'}, ...]
-        """
-        # 构建 WHERE 子句
-        conditions = []
-        params = []
-
-        for kw in keywords:
-            # Each keyword may match any one of canonical_name, desc, or alias
-            conditions.append("(e.canonical_name LIKE ? OR e.desc LIKE ? OR ea.alias LIKE ?)")
-            params.extend([f'%{kw}%', f'%{kw}%', f'%{kw}%'])
-
-        if entity_type:
-            conditions.append("e.entity_type = ?")
-            params.append(entity_type)
-
-        where_clause = " AND ".join(conditions)
-
-        query = f"""
-            SELECT DISTINCT e.entity_id, e.entity_type, e.canonical_name, e.tier, e.desc, e.created_chapter
-            FROM entities e
-            LEFT JOIN entity_aliases ea ON e.entity_id = ea.entity_id
-            WHERE {where_clause}
-            ORDER BY e.tier DESC, e.created_chapter DESC
-            LIMIT 20
-        """
-
-        cursor = self.conn.execute(query, params)
-        return [dict(row) for row in cursor.fetchall()]
-
-    def fuzzy_search_character(self, keywords: List[str]) -> List[Dict]:
-        """模糊查询角色(v4.0: 委托给 fuzzy_search_entity)
-
-        Args:
-            keywords: 关键词列表,如 ["李", "女弟子"]
-
-        Returns:
-            [{'entity_id': '...', 'canonical_name': '...', 'desc': '...', ...}, ...]
-        """
-        return self.fuzzy_search_entity(keywords, entity_type="角色")
-
-    # ================== 批量操作 ==================
-
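-    def rebuild_all_indexes(self):
-        """Rebuild the index for every existing chapter in bulk.
-
-        Use cases:
-        - The index system is being rolled out for the first time
-        - The index database is corrupted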
-        """
-        if not self.chapters_dir.exists():
-            print("❌ 章节目录不存在")
-            return
-
-        # 获取所有章节文件
-        chapter_files = sorted(self.chapters_dir.rglob("第*.md"))
-
-        print(f"🔍 发现 {len(chapter_files)} 个章节文件,开始重建索引...")
-
-        seen = set()
-        for chapter_file in chapter_files:
-            # 提取章节编号
-            match = re.search(r'第(\d+)章', chapter_file.name)
-            if not match:
-                continue
-
-            chapter_num = int(match.group(1))
-            if chapter_num in seen:
-                continue
-            seen.add(chapter_num)
-
-            # 重建索引
-            self._rebuild_chapter_index(chapter_num, chapter_file)
-
-        # Sync the foreshadowing, entity, and relationship indexes
-        self.sync_foreshadowing_from_state()
-        self.sync_characters_from_state()
-        self.sync_relationships_from_state()
-
-        print(f"✅ 批量重建完成:{len(seen)} 章")
-
-    # ================== 查询与统计 ==================
-
-    def get_index_stats(self) -> Dict:
-        """获取索引统计信息(v4.0: 增加实体/别名统计)"""
-
-        # 章节统计
-        cursor = self.conn.execute("SELECT COUNT(*) as count FROM chapters")
-        chapter_count = cursor.fetchone()['count']
-
-        # 实体统计(v4.0 新增)
-        cursor = self.conn.execute("""
-            SELECT entity_type, COUNT(*) as count
-            FROM entities
-            GROUP BY entity_type
-        """)
-        entity_stats = {row['entity_type']: row['count'] for row in cursor.fetchall()}
-
-        # 别名统计(v4.0 新增)
-        cursor = self.conn.execute("SELECT COUNT(*) as count FROM entity_aliases")
-        alias_count = cursor.fetchone()['count']
-
-        # 伏笔统计
-        cursor = self.conn.execute("""
-            SELECT status, COUNT(*) as count
-            FROM foreshadowing_index
-            GROUP BY status
-        """)
-        foreshadowing_stats = {row['status']: row['count'] for row in cursor.fetchall()}
-
-        # 关系统计
-        cursor = self.conn.execute("SELECT COUNT(*) as count FROM relationships")
-        relationship_count = cursor.fetchone()['count']
-
-        # 数据库大小
-        db_size_kb = self.index_db.stat().st_size / 1024
-
-        return {
-            'chapter_count': chapter_count,
-            'entity_stats': entity_stats,
-            'alias_count': alias_count,
-            'foreshadowing_active': foreshadowing_stats.get('未回收', 0),
-            'foreshadowing_resolved': foreshadowing_stats.get('已回收', 0),
-            'relationship_count': relationship_count,
-            'db_size_kb': round(db_size_kb, 2)
-        }
-
-    def __del__(self):
-        """析构函数:关闭数据库连接"""
-        if hasattr(self, 'conn'):
-            self.conn.close()
-
-
-def main():
-    parser = argparse.ArgumentParser(description="结构化索引系统(取代向量化检索)")
-
-    # 更新操作
-    parser.add_argument("--update-chapter", type=int, metavar="NUM", help="更新单章索引")
-    parser.add_argument("--metadata", metavar="PATH", help="章节文件路径(配合 --update-chapter)")
-    parser.add_argument("--metadata-json", metavar="JSON", help="元数据 JSON 字符串(配合 --update-chapter,由 metadata-extractor agent 提供)")
-    parser.add_argument("--metadata-file", metavar="FILE", help="元数据 JSON 文件路径(配合 --update-chapter,Windows 推荐使用此参数)")
-
-    # 批量操作
-    parser.add_argument("--rebuild-index", action="store_true", help="批量重建所有索引")
-
-    # 查询操作
-    parser.add_argument("--query-location", metavar="LOCATION", help="查询地点相关章节")
-    parser.add_argument("--query-urgent-foreshadowing", action="store_true", help="查询紧急伏笔")
-    parser.add_argument("--fuzzy-search", nargs='+', metavar="KEYWORD", help="模糊查询角色(多个关键词)")
-
-    # 统计信息
-    parser.add_argument("--stats", action="store_true", help="显示索引统计信息")
-
-    # 项目路径
-    parser.add_argument("--project-root", metavar="PATH", help="项目根目录(默认为当前目录)")
-
-    args = parser.parse_args()
-
-    # 创建索引管理器
-    index = StructuredIndex(project_root=args.project_root)
-
-    # 执行操作
-    if args.update_chapter:
-        # 模式1:从 JSON 文件读取(Windows 推荐,避免 CLI 引号转义问题)
-        if args.metadata_file:
-            try:
-                metadata_file = Path(args.metadata_file)
-                if not metadata_file.exists():
-                    print(f"❌ 元数据文件不存在: {metadata_file}")
-                    return
-
-                with open(metadata_file, 'r', encoding='utf-8') as f:
-                    metadata = json.load(f)
-
-                # 验证必需字段
-                required_fields = ['title', 'location', 'characters', 'word_count', 'hash']
-                missing_fields = [f for f in required_fields if f not in metadata]
-
-                if missing_fields:
-                    print(f"❌ JSON 缺少必需字段: {', '.join(missing_fields)}")
-                    return
-
-                # 先同步实体(用于将 metadata.characters/name 解析为 entity_id)
-                index.sync_entities_from_state()
-
-                # 更新章节索引
-                index.index_chapter(args.update_chapter, metadata)
-
-                # 同步伏笔索引
-                index.sync_foreshadowing_from_state()
-                # bump_character_last_appearance_in_state 已删除(v4.0)
-                index.sync_relationships_from_state()
-
-            except json.JSONDecodeError as e:
-                print(f"❌ JSON 解析失败: {e}")
-                return
-
-        # 模式2:直接接收 JSON 字符串(Linux/macOS,或测试时使用)
-        elif args.metadata_json:
-            try:
-                metadata = json.loads(args.metadata_json)
-
-                # 验证必需字段
-                required_fields = ['title', 'location', 'characters', 'word_count', 'hash']
-                missing_fields = [f for f in required_fields if f not in metadata]
-
-                if missing_fields:
-                    print(f"❌ JSON 缺少必需字段: {', '.join(missing_fields)}")
-                    return
-
-                # 先同步实体(用于将 metadata.characters/name 解析为 entity_id)
-                index.sync_entities_from_state()
-
-                # 更新章节索引
-                index.index_chapter(args.update_chapter, metadata)
-
-                # 同步伏笔索引
-                index.sync_foreshadowing_from_state()
-                # bump_character_last_appearance_in_state 已删除(v4.0)
-                index.sync_relationships_from_state()
-
-            except json.JSONDecodeError as e:
-                print(f"❌ JSON 解析失败: {e}")
-                return
-
-        # 模式3:从章节文件提取元数据(旧模式,保持向后兼容)
-        elif args.metadata:
-            # 读取章节文件
-            chapter_file = Path(args.metadata)
-            if not chapter_file.exists():
-                print(f"❌ 章节文件不存在: {chapter_file}")
-                return
-
-            # 提取元数据
-            with open(chapter_file, 'r', encoding='utf-8') as f:
-                content = f.read()
-
-            metadata = index._extract_metadata_from_content(content, args.update_chapter)
-
-            # 先同步实体(用于将 metadata.characters/name 解析为 entity_id)
-            index.sync_entities_from_state()
-
-            # 更新章节索引
-            index.index_chapter(args.update_chapter, metadata)
-
-            # 同步伏笔索引
-            index.sync_foreshadowing_from_state()
-            # bump_character_last_appearance_in_state 已删除(v4.0)
-            index.sync_relationships_from_state()
-
-        else:
-            print("❌ 缺少参数:--metadata-file (推荐) / --metadata-json / --metadata")
-            return
-
-    elif args.rebuild_index:
-        index.rebuild_all_indexes()
-
-    elif args.query_location:
-        results = index.query_chapters_by_location(args.query_location)
-
-        if not results:
-            print(f"未找到地点相关章节: {args.query_location}")
-        else:
-            print(f"找到 {len(results)} 个相关章节:")
-            for chapter_num, title, characters in results:
-                print(f"  Ch{chapter_num}: {title} - 角色: {characters}")
-
-    elif args.query_urgent_foreshadowing:
-        results = index.query_urgent_foreshadowing(threshold=60)
-
-        if not results:
-            print("✅ 无紧急伏笔")
-        else:
-            print(f"⚠️ 检测到 {len(results)} 条紧急伏笔:")
-            for item in results:
-                print(f"  - {item['content'][:30]}...(第 {item['introduced_chapter']} 章埋设,紧急度 {item['urgency']}/100)")
-
-    elif args.fuzzy_search:
-        results = index.fuzzy_search_character(args.fuzzy_search)
-
-        if not results:
-            print(f"未找到匹配角色: {' + '.join(args.fuzzy_search)}")
-        else:
-            print(f"找到 {len(results)} 个匹配角色:")
-            for i, char in enumerate(results, 1):
-                # v4.0: 使用新字段名
-                name = char.get('canonical_name', char.get('name', ''))
-                desc = char.get('desc', char.get('description', ''))[:50]
-                tier = char.get('tier', '')
-                print(f"{i}. {name} [{tier}] - {desc}...")
-
-    elif args.stats:
-        stats = index.get_index_stats()
-
-        print("📊 索引统计信息:")
-        print(f"   章节索引: {stats['chapter_count']}")
-
-        # v4.0: 显示实体统计
-        entity_stats = stats.get('entity_stats', {})
-        if entity_stats:
-            entity_summary = ", ".join([f"{t}: {c}" for t, c in entity_stats.items()])
-            print(f"   实体索引: {entity_summary}")
-        print(f"   别名索引: {stats.get('alias_count', 0)}")
-
-        print(f"   伏笔索引: {stats['foreshadowing_active']} 条活跃 + {stats['foreshadowing_resolved']} 条已回收")
-        print(f"   关系索引: {stats['relationship_count']}")
-        print(f"   数据库大小: {stats['db_size_kb']} KB")
-
-    else:
-        parser.print_help()
-
-
-if __name__ == "__main__":
-    # Windows UTF-8 编码修复(仅在脚本直接运行时)
-    if sys.platform == 'win32':
-        import io
-        sys.stdout = io.TextIOWrapper(sys.stdout.buffer, encoding='utf-8')
-        sys.stderr = io.TextIOWrapper(sys.stderr.buffer, encoding='utf-8')
-
-    main()
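
For reference, the one-to-many alias lookup that the deleted `query_entities_by_alias` ran against `entity_aliases` can be sketched against the v5.1 `aliases` table. This is a minimal sketch, not code from this commit: the column names (`alias`, `entity_id`, `entity_type`), the helper names `resolve_alias` and `build_demo_db`, and the sample entity IDs are all assumptions, because the diff shown here does not include the `aliases` schema.

```python
import sqlite3
from typing import List, Optional


def resolve_alias(conn: sqlite3.Connection, alias: str,
                  entity_type: Optional[str] = None) -> List[str]:
    """Return every entity_id an alias maps to (hypothetical helper)."""
    sql = "SELECT entity_id FROM aliases WHERE alias = ?"
    params = [alias]
    if entity_type:
        # Optional type filter, used to narrow an ambiguous alias
        sql += " AND entity_type = ?"
        params.append(entity_type)
    sql += " ORDER BY entity_id"
    return [row[0] for row in conn.execute(sql, params)]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with an assumed aliases schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE aliases (
            alias TEXT NOT NULL,
            entity_id TEXT NOT NULL,
            entity_type TEXT NOT NULL,
            PRIMARY KEY (alias, entity_id, entity_type)
        )
    """)
    conn.executemany("INSERT INTO aliases VALUES (?, ?, ?)", [
        ("林天", "lintian", "角色"),
        ("宗主", "lintian", "角色"),
        ("宗主", "elder_wang", "角色"),   # same alias, second entity
        ("天云宗", "tianyun_sect", "势力"),
    ])
    return conn


demo = build_demo_db()
unique = resolve_alias(demo, "林天")      # exactly one candidate
ambiguous = resolve_alias(demo, "宗主")   # two candidates: caller must disambiguate
```

As in the deleted `index_chapter` code, a caller would accept the result only when exactly one ID comes back and leave the reference unresolved otherwise.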

+ 5 - 5
.claude/skills/webnovel-query/references/tag-specification.md

@@ -1,15 +1,15 @@
 ---
 name: tag-specification
-purpose: XML tag format reference (optional in v5.0)
-version: "5.0"
+purpose: XML tag format reference (optional in v5.1)
+version: "5.1"
 ---
 
 <context>
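 This file is a reference for the XML tag format.
 
-**Important changes in v5.0**:
+**Important changes in v5.1**:
 - Adding XML tags is **no longer required** when writing chapters
-- The Data Agent automatically extracts entities from the plain chapter text
+- The Data Agent automatically extracts entities from the plain chapter text and writes them to index.db
 - Tags are only for **manual annotation** (e.g. explicitly marking important entities, or filling in entities the extraction missed)
 - If you choose to use tags, follow the specification below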
 </context>
@@ -40,7 +40,7 @@ version: "5.0"
 
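 ### id / ref (entity reference)
 - **id (recommended)**: a stable unique identifier (makes later updates and alias additions easier)
-- **ref**: refers to an entity by a name or alias that has already appeared (scripts resolve it automatically through `alias_index`)
+- **ref**: refers to an entity by a name or alias that has already appeared (resolved automatically through the index.db aliases table)
 - **type (optional)**: disambiguates an ambiguous ref (e.g. two different people with the same name); if the ref is still ambiguous, you must use `id` instead
 
 ### `<entity-update>` sub-operations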