fix: align projection writers with real LLM commit schema

The field names in the commit payloads that DeepSeek v4pro actually emits do not
match the schema each projection writer expects. Projections reported done while
writes to the state, memory, and index layers were silently dropped or misapplied.

- state_projection_writer: accept field_path/new_value/old_value and expand
  dotted paths into nested dicts; add _collect_protagonist_ids, which detects
  the protagonist from three signals (is_protagonist, tier=主角, canonical_name)
  and mirrors its state_delta into state.protagonist_state
- memory/writer: state_changes and entities_new accept field_path/new_value/
  entity_type; open_loop_created gains _coerce_loop_content (content >
  unanswered_question > loop_type+description fallback); world_rule_revealed
  accepts rule_content/description
- vector_projection_writer: character_state_changed falls back to
  description/new_state; add text generation for open_loop_created
- index_manager.apply_entity_delta: tier=主角 sets is_protagonist automatically;
  accept entity_type/field_path/new_value and read canonical_name from payload.name
- index_entity_mixin.upsert_entity: UPDATE now includes the type column, so
  historically mislabeled entities can be corrected
- index_entity_mixin.get_entity: add a fallback that strips separators (lu_ming → luming)
- agents/data-agent.md: add §7.1 with mandatory field-naming conventions, documenting
  the required payload fields of every event_type so the LLM does not improvise
- tests/test_projection_schema_compat.py: 17 end-to-end tests that use the real
  DeepSeek v4pro output schema as fixtures

All 508 tests pass. The legacy schema (field/new) path is kept, so the change is backward compatible.
lingfengQAQ, 1 month ago
Parent
Current commit
f58d657f5d
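
The first bullet of the commit message (alias normalization plus dotted-path expansion) can be sketched in isolation as follows. This is a standalone restatement for illustration, assuming only the behavior described in the message; the function names `normalize_state_delta` and `set_path` are stand-ins and nothing here imports the project.

```python
from typing import Any


def normalize_state_delta(delta: dict) -> dict:
    """Copy alias keys onto canonical keys without overwriting existing ones."""
    result = dict(delta)
    for canonical, alias in (("field", "field_path"), ("new", "new_value"), ("old", "old_value")):
        if canonical not in result and alias in result:
            result[canonical] = result[alias]
    return result


def set_path(target: dict, path: str, value: Any) -> None:
    """Write value at a dotted path, creating intermediate dicts as needed."""
    parts = path.split(".")
    cursor = target
    for part in parts[:-1]:
        nxt = cursor.get(part)
        if not isinstance(nxt, dict):
            # A missing or non-dict intermediate node is replaced by a fresh dict.
            nxt = {}
            cursor[part] = nxt
        cursor = nxt
    cursor[parts[-1]] = value


entity_state: dict = {}
deltas = [
    # Shape the LLM actually emits.
    {"entity_id": "luming", "field_path": "power.realm", "new_value": "练气五层"},
    # Legacy shape from the prompt.
    {"entity_id": "luming", "field": "mood", "new": "calm"},
]
for raw in deltas:
    delta = normalize_state_delta(raw)
    set_path(entity_state.setdefault(delta["entity_id"], {}), delta["field"], delta.get("new"))

print(entity_state)
```

Both shapes end up in the same nested structure. Note that an intermediate node that already holds a non-dict value is overwritten by a dict when a deeper path is written through it.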

+ 14 - 1
webnovel-writer/agents/data-agent.md

@@ -88,7 +88,7 @@ hook_strength: "strong"
   "entities_appeared": [{"id": "xiaoyan", "type": "角色", "mentions": ["萧炎"], "confidence": 0.95}],
   "entities_new": [{"suggested_id": "hongyi_girl", "name": "红衣女子", "type": "角色", "tier": "装饰"}],
   "state_deltas": [{"entity_id": "xiaoyan", "field": "realm", "old": "斗者", "new": "斗师"}],
-  "entity_deltas": [{"entity_id": "hongyi_girl", "action": "upsert", "payload": {"name": "红衣女子"}}],
+  "entity_deltas": [{"entity_id": "hongyi_girl", "action": "upsert", "entity_type": "角色", "tier": "装饰", "payload": {"name": "红衣女子"}}],
   "accepted_events": [],
   "summary_text": "摘要",
   "scenes_chunked": 4,
@@ -97,6 +97,19 @@ hook_strength: "strong"
 }
 ```
 
+### 7.1 Mandatory field-naming conventions (the projectors do not read other synonyms; follow these strictly)
+
+- **state_deltas items**: must use `field` (not `field_path`), `new` (not `new_value`), and `old` (not `old_value`). Write simple field names directly (e.g. `realm`) and nested paths with dots (e.g. `power.realm`, `location.current`). The projector expands them into nested dicts automatically.
+- **entity_deltas items**: must use `entity_type` (not `type`), with a value such as `角色|组织|地点|物品|势力`; do not fill in `"角色"` by default. `is_protagonist: true` marks the protagonist, whose fields are synced to `state.protagonist_state`.
+- **accepted_events (all)**: `event_type` takes an enum value (`character_state_changed|power_breakthrough|relationship_changed|world_rule_revealed|world_rule_broken|open_loop_created|promise_created|promise_paid_off|artifact_obtained`). `subject` is the entity_id of the event's subject (not its Chinese name).
+- **character_state_changed.payload**: use `field` (or `field_path`) + `new` (or `new_state`/`new_value`) + `old` (or `previous_state`/`old_value`). Prefer plain `field` + `new` + `old`, consistent with state_deltas.
+- **open_loop_created.payload**: must include `content` (the text of the open loop); optional fields are `loop_type` (loop category), `unanswered_question` (the core question), `urgency`, `planted_chapter`, and `expected_payoff`/`loop_deadline`. The projector reads content > unanswered_question > description in that order; do not omit content.
+- **world_rule_revealed.payload**: must include `rule_content` (or `rule`, `description`); optional fields are `rule_category` / `domain` and `scope`.
+- **relationship_changed.payload**: must include `to_entity` and `relationship_type` (not `type`).
+- **artifact_obtained.payload**: must include `artifact_id`, `name`, and `owner` (or `holder`).
+
+Note: legacy field names (`field_path`, `new_value`, `type`, `description`, etc.) are still accepted as compatibility input and projected correctly, but the canonical names listed above are preferred.
+
 ## 8. Error handling
 
 If artifacts fail, rerun C/D. If commit fails, fix the JSON and resubmit. If indexing fails, rerun only E. If a run takes >30s, attach the reason.
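
The §7.1 naming rules for `state_deltas` and `entity_deltas` could be checked mechanically before a commit is submitted. The sketch below is a hypothetical helper, not part of this commit: `CANONICAL_KEYS` and `find_noncanonical_keys` are assumed names, and only the two delta kinds whose aliases §7.1 spells out are covered.

```python
from typing import Dict, List

# Alias -> canonical key, taken from the §7.1 list. Hypothetical table for illustration.
CANONICAL_KEYS: Dict[str, Dict[str, str]] = {
    "state_deltas": {"field_path": "field", "new_value": "new", "old_value": "old"},
    "entity_deltas": {"type": "entity_type"},
}


def find_noncanonical_keys(commit_payload: dict) -> List[str]:
    """Return one message per alias key found where a canonical key is preferred."""
    messages: List[str] = []
    for section, aliases in CANONICAL_KEYS.items():
        for index, item in enumerate(commit_payload.get(section) or []):
            if not isinstance(item, dict):
                continue
            for alias, canonical in aliases.items():
                if alias in item:
                    messages.append(f"{section}[{index}]: use '{canonical}' instead of '{alias}'")
    return messages


payload = {
    "state_deltas": [
        {"entity_id": "luming", "field_path": "power.realm", "new_value": "练气五层"}
    ],
    "entity_deltas": [{"entity_id": "qingyun_zong", "entity_type": "组织"}],
}
for message in find_noncanonical_keys(payload):
    print(message)
```

Because the writers accept the aliases anyway, a check like this would only be advisory: it reports drift from the documented names without rejecting the payload.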

+ 20 - 1
webnovel-writer/scripts/data_modules/index_entity_mixin.py

@@ -94,6 +94,7 @@ class IndexEntityMixin:
                     cursor.execute(
                         """
                         UPDATE entities SET
+                            type = ?,
                             canonical_name = ?,
                             tier = ?,
                             desc = ?,
@@ -105,6 +106,7 @@ class IndexEntityMixin:
                         WHERE id = ?
                     """,
                         (
+                            entity.type,
                             entity.canonical_name,
                             entity.tier,
                             entity.desc,
@@ -170,7 +172,24 @@ class IndexEntityMixin:
                 return self._row_to_dict(row, parse_json=["current_json"])
 
         alias_matches = self.get_entities_by_alias(entity_id)
-        return alias_matches[0] if alias_matches else None
+        if alias_matches:
+            return alias_matches[0]
+
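+        # ID naming-style fallback: when the caller passes 'lu_ming' but the entity is
+        # registered as 'luming', strip separators (underscores/hyphens) and retry the
+        # direct ID lookup, then the alias lookup.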
+        compact = str(entity_id or "").replace("_", "").replace("-", "").strip()
+        if compact and compact != entity_id:
+            with self._get_conn() as conn:
+                cursor = conn.cursor()
+                cursor.execute("SELECT * FROM entities WHERE id = ?", (compact,))
+                row = cursor.fetchone()
+                if row:
+                    return self._row_to_dict(row, parse_json=["current_json"])
+            alias_matches = self.get_entities_by_alias(compact)
+            if alias_matches:
+                return alias_matches[0]
+
+        return None
 
     def get_entities_by_type(
         self, entity_type: str, include_archived: bool = False
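
The separator-stripping fallback added to `get_entity` can be tried against an in-memory SQLite table. This is a reduced sketch under stated assumptions: the one-column `entities` table and the function name `find_entity_id` are illustrative, and the alias lookups performed by the real method are left out.

```python
import sqlite3
from typing import Optional


def find_entity_id(conn: sqlite3.Connection, entity_id: str) -> Optional[str]:
    """Look up an ID directly, then retry once with '_' and '-' stripped."""
    candidates = [entity_id]
    compact = entity_id.replace("_", "").replace("-", "").strip()
    if compact and compact != entity_id:
        candidates.append(compact)
    for candidate in candidates:
        row = conn.execute("SELECT id FROM entities WHERE id = ?", (candidate,)).fetchone()
        if row:
            return row[0]
    return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO entities (id) VALUES (?)", [("luming",), ("liu_dazhu",)])

print(find_entity_id(conn, "lu_ming"))    # separators stripped, matches 'luming'
print(find_entity_id(conn, "liudazhu"))   # no separators to strip, so no retry happens
```

The fallback only runs one way: a query containing separators can reach a compact ID, but a compact query does not reach an ID registered with separators. Two distinct entities whose IDs differ only by separators would also be conflated by the retry.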

+ 35 - 9
webnovel-writer/scripts/data_modules/index_manager.py

@@ -659,22 +659,48 @@ class IndexManager(IndexChapterMixin, IndexEntityMixin, IndexDebtMixin, IndexRea
             return False
 
         current = dict(delta.get("current") or {})
-        field = str(delta.get("field") or "").strip()
-        if field and "new" in delta and field not in current:
-            current[field] = delta.get("new")
-
-        canonical_name = str(delta.get("canonical_name") or delta.get("name") or entity_id).strip()
+        field = str(delta.get("field") or delta.get("field_path") or "").strip()
+        if field:
+            new_value = (
+                delta.get("new")
+                if "new" in delta
+                else delta.get("new_value")
+                if "new_value" in delta
+                else None
+            )
+            if new_value is not None and field not in current:
+                current[field] = new_value
+
+        payload = delta.get("payload") or {}
+        canonical_name = str(
+            delta.get("canonical_name")
+            or delta.get("name")
+            or payload.get("name")
+            or entity_id
+        ).strip()
+
+        tier = str(delta.get("tier") or "装饰").strip() or "装饰"
         is_protagonist = bool(delta.get("is_protagonist"))
-        if "is_protagonist" not in delta:
+        # tier='主角' is treated as is_protagonist=True (real LLM output usually marks the protagonist via tier)
+        if not is_protagonist and tier == "主角":
+            is_protagonist = True
+        if "is_protagonist" not in delta and tier != "主角":
             existing = self.get_entity(entity_id)
             if existing:
                 is_protagonist = bool(existing.get("is_protagonist"))
+
+        entity_type = str(
+            delta.get("type")
+            or delta.get("entity_type")
+            or "角色"
+        ).strip() or "角色"
+
         entity = EntityMeta(
             id=entity_id,
-            type=str(delta.get("type") or "角色").strip() or "角色",
+            type=entity_type,
             canonical_name=canonical_name,
-            tier=str(delta.get("tier") or "装饰").strip() or "装饰",
-            desc=str(delta.get("desc") or "").strip(),
+            tier=tier,
+            desc=str(delta.get("desc") or delta.get("description") or "").strip(),
             current=current,
             first_appearance=chapter,
             last_appearance=chapter,

+ 81 - 10
webnovel-writer/scripts/data_modules/memory/writer.py

@@ -29,6 +29,32 @@ class MemoryWriter:
         stats["items_updated"] += int(result.get("updated", 0))
         stats["items_outdated"] += int(result.get("outdated", 0))
 
+    @staticmethod
+    def _coerce_loop_content(payload: Dict[str, Any], event: Dict[str, Any]) -> str:
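+        """Pick meaningful open-loop text from several candidate fields of an open_loop event payload.
+
+        Priority: content (legacy schema) → unanswered_question (information mystery)
+        → loop_type + description (structured) → description → loop_type → subject as a last resort.
+        The subject (usually a character ID) is returned bare, because loop_type is empty by that point.
+        """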
+        for key in ("content", "unanswered_question"):
+            value = str(payload.get(key) or "").strip()
+            if value:
+                return value
+
+        description = str(payload.get("description") or "").strip()
+        loop_type = str(payload.get("loop_type") or "").strip()
+
+        if description and loop_type:
+            return f"{loop_type}:{description}"
+        if description:
+            return description
+        if loop_type:
+            return loop_type
+
+        subject = str(event.get("subject") or "").strip()
+        return subject
+
     def update_from_chapter_result(self, chapter: int, result: Dict[str, Any]) -> Dict[str, Any]:
         stats: Dict[str, Any] = {
             "chapter": int(chapter),
@@ -41,17 +67,31 @@ class MemoryWriter:
         # Stage 2: zero-cost structured mapping
         for change in result.get("state_changes", []) or []:
             entity_id = str(change.get("entity_id", "") or "").strip()
-            field = str(change.get("field", "") or "").strip()
+            field = str(
+                change.get("field", "")
+                or change.get("field_path", "")
+                or ""
+            ).strip()
             if not entity_id or not field:
                 continue
+            new_val = change.get("new")
+            if new_val is None:
+                new_val = change.get("new_value")
+            if new_val is None:
+                new_val = change.get("to")
+            old_val = change.get("old")
+            if old_val is None:
+                old_val = change.get("old_value")
+            if old_val is None:
+                old_val = change.get("from")
             item = MemoryItem(
                 id=self._item_id("character_state", entity_id, field, chapter),
                 layer="semantic",
                 category="character_state",
                 subject=entity_id,
                 field=field,
-                value=str(change.get("new", "") or ""),
-                payload={"old_value": change.get("old")},
+                value="" if new_val is None else str(new_val),
+                payload={"old_value": old_val},
                 source_chapter=int(chapter),
                 evidence=[f"state_change:{entity_id}:{field}:{chapter}"],
             )
@@ -69,7 +109,10 @@ class MemoryWriter:
                 subject=entity_id,
                 field="first_seen",
                 value=name,
-                payload={"tier": entity.get("tier"), "type": entity.get("type")},
+                payload={
+                    "tier": entity.get("tier"),
+                    "type": entity.get("type") or entity.get("entity_type"),
+                },
                 source_chapter=int(chapter),
                 evidence=[f"entity_new:{entity_id}:{chapter}"],
             )
@@ -242,28 +285,52 @@ class MemoryWriter:
             event_type = str(event.get("event_type") or "").strip()
             payload = event.get("payload") or {}
             if event_type in {"world_rule_revealed", "world_rule_broken"}:
-                rule_text = str(payload.get("proposed_value") or payload.get("rule") or payload.get("base_value") or "").strip()
+                rule_text = str(
+                    payload.get("rule_content")
+                    or payload.get("proposed_value")
+                    or payload.get("rule")
+                    or payload.get("base_value")
+                    or payload.get("description")
+                    or ""
+                ).strip()
                 if rule_text:
                     memory_facts["world_rules"].append(
                         {
                             "rule": rule_text,
                             "scope": payload.get("scope") or "global",
-                            "domain": payload.get("domain") or event.get("subject") or "global",
+                            "domain": (
+                                payload.get("domain")
+                                or payload.get("rule_category")
+                                or event.get("subject")
+                                or "global"
+                            ),
                             "field": payload.get("field") or event_type,
                         }
                     )
             elif event_type == "open_loop_created":
-                content = str(payload.get("content") or event.get("subject") or "").strip()
+                content = self._coerce_loop_content(payload, event)
                 if content:
                     memory_facts["open_loops"].append(
                         {
                             "content": content,
                             "status": payload.get("status") or "active",
                             "urgency": payload.get("urgency") or 0,
+                            "planted_chapter": (
+                                payload.get("planted_chapter") or event.get("chapter") or chapter
+                            ),
+                            "expected_payoff": (
+                                payload.get("expected_payoff")
+                                or payload.get("loop_deadline")
+                            ),
                         }
                     )
             elif event_type in {"promise_created", "promise_paid_off"}:
-                content = str(payload.get("content") or event.get("subject") or "").strip()
+                content = str(
+                    payload.get("content")
+                    or payload.get("description")
+                    or event.get("subject")
+                    or ""
+                ).strip()
                 if content:
                     memory_facts["reader_promises"].append(
                         {
@@ -277,8 +344,12 @@ class MemoryWriter:
             "entities_new": [
                 {
                     "suggested_id": row.get("entity_id") or row.get("id"),
-                    "name": row.get("canonical_name") or row.get("name") or row.get("entity_id") or row.get("id"),
-                    "type": row.get("type") or "角色",
+                    "name": row.get("canonical_name")
+                    or (row.get("payload") or {}).get("name")
+                    or row.get("name")
+                    or row.get("entity_id")
+                    or row.get("id"),
+                    "type": row.get("type") or row.get("entity_type") or "角色",
                     "tier": row.get("tier") or "装饰",
                 }
                 for row in entity_deltas
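
The fallback order of `_coerce_loop_content` in this file is easiest to see by running a few payload shapes through it. The block below restates the method as a free function (`coerce_loop_content`, an assumed name) so it runs on its own; the English sample strings are invented for illustration.

```python
from typing import Any, Dict

SEP = ":"  # full-width colon, the same separator the diff uses


def coerce_loop_content(payload: Dict[str, Any], event: Dict[str, Any]) -> str:
    """Return the first non-empty candidate for the open-loop text."""
    for key in ("content", "unanswered_question"):
        value = str(payload.get(key) or "").strip()
        if value:
            return value
    description = str(payload.get("description") or "").strip()
    loop_type = str(payload.get("loop_type") or "").strip()
    if description and loop_type:
        return f"{loop_type}{SEP}{description}"
    if description:
        return description
    if loop_type:
        return loop_type
    # Last resort: the event subject, returned as-is.
    return str(event.get("subject") or "").strip()


event = {"subject": "luming"}
samples = [
    {"content": "who sent the letter", "description": "ignored"},
    {"unanswered_question": "why did the seal break"},
    {"loop_type": "mystery", "description": "a sealed letter arrives"},
    {"loop_type": "mystery"},
    {},
]
for payload in samples:
    print(repr(coerce_loop_content(payload, event)))
```

The last sample shows the bare-ID case: with no usable payload field, the stored open-loop text is just the subject ID.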

+ 103 - 10
webnovel-writer/scripts/data_modules/state_projection_writer.py

@@ -5,6 +5,7 @@ from __future__ import annotations
 import re
 from datetime import datetime
 from pathlib import Path
+from typing import Any
 
 from .story_contracts import read_json_if_exists, write_json
 
@@ -41,13 +42,19 @@ class StateProjectionWriter:
         progress = state.setdefault("progress", {})
         chapter_status = progress.setdefault("chapter_status", {})
 
+        protagonist_ids = self._collect_protagonist_ids(commit_payload, state)
+
         applied_count = 0
         for delta in self._collect_state_deltas(commit_payload):
             entity_id = str(delta.get("entity_id") or "").strip()
             field = str(delta.get("field") or "").strip()
             if not entity_id or not field:
                 continue
-            entity_state.setdefault(entity_id, {})[field] = delta.get("new")
+            new_value = delta.get("new")
+            entity_bucket = entity_state.setdefault(entity_id, {})
+            self._set_path(entity_bucket, field, new_value)
+            if entity_id in protagonist_ids:
+                self._set_path(state.setdefault("protagonist_state", {}), field, new_value)
             applied_count += 1
 
         if chapter > 0:
@@ -79,12 +86,13 @@ class StateProjectionWriter:
         }
 
     def _collect_state_deltas(self, commit_payload: dict) -> list[dict]:
-        deltas = [dict(delta) for delta in (commit_payload.get("state_deltas") or []) if isinstance(delta, dict)]
+        deltas = [
+            self._normalize_state_delta(delta)
+            for delta in (commit_payload.get("state_deltas") or [])
+            if isinstance(delta, dict)
+        ]
         seen = {
-            (
-                str(delta.get("entity_id") or "").strip(),
-                str(delta.get("field") or "").strip(),
-            )
+            (str(delta.get("entity_id") or "").strip(), str(delta.get("field") or "").strip())
             for delta in deltas
         }
 
@@ -99,9 +107,11 @@ class StateProjectionWriter:
 
             field = ""
             if event_type == "power_breakthrough":
-                field = str(payload.get("field") or "realm").strip()
+                field = (
+                    str(payload.get("field") or payload.get("field_path") or "realm").strip()
+                )
             elif event_type == "character_state_changed":
-                field = str(payload.get("field") or "").strip()
+                field = str(payload.get("field") or payload.get("field_path") or "").strip()
             else:
                 continue
 
@@ -114,12 +124,95 @@ class StateProjectionWriter:
                 {
                     "entity_id": entity_id,
                     "field": field,
-                    "old": payload.get("old") if "old" in payload else payload.get("from"),
-                    "new": payload.get("new") if "new" in payload else payload.get("to"),
+                    "old": (
+                        payload.get("old")
+                        if "old" in payload
+                        else payload.get("from")
+                        if "from" in payload
+                        else payload.get("old_value")
+                        if "old_value" in payload
+                        else payload.get("previous_state")
+                    ),
+                    "new": (
+                        payload.get("new")
+                        if "new" in payload
+                        else payload.get("to")
+                        if "to" in payload
+                        else payload.get("new_value")
+                        if "new_value" in payload
+                        else payload.get("new_state")
+                    ),
                 }
             )
         return deltas
 
+    @staticmethod
+    def _normalize_state_delta(delta: dict) -> dict:
+        """Normalize state_delta field names: field_path → field, new_value → new, old_value → old."""
+        result = dict(delta)
+        if "field" not in result and "field_path" in result:
+            result["field"] = result["field_path"]
+        if "new" not in result and "new_value" in result:
+            result["new"] = result["new_value"]
+        if "old" not in result and "old_value" in result:
+            result["old"] = result["old_value"]
+        return result
+
+    @staticmethod
+    def _set_path(target: dict, path: str, value: Any) -> None:
+        """Write into a nested dict via a dotted path: 'power.realm' → target['power']['realm'] = value."""
+        if not isinstance(target, dict) or not path:
+            return
+        if "." not in path:
+            target[path] = value
+            return
+        parts = path.split(".")
+        cursor = target
+        for part in parts[:-1]:
+            nxt = cursor.get(part)
+            if not isinstance(nxt, dict):
+                nxt = {}
+                cursor[part] = nxt
+            cursor = nxt
+        cursor[parts[-1]] = value
+
+    def _collect_protagonist_ids(self, commit_payload: dict, state: dict) -> set[str]:
+        """Collect the protagonist entity IDs known from this commit plus state.json.
+
+        Signals (any single match marks the entity as the protagonist):
+        - an entity_deltas item has `is_protagonist: true`
+        - an entity_deltas item has `tier == "主角"`
+        - an entity_deltas canonical_name equals state.protagonist_state.name
+        - state.protagonist_state.entity_id was already set explicitly
+        """
+        ids: set[str] = set()
+
+        protagonist_state = state.get("protagonist_state") or {}
+        existing_eid = str(protagonist_state.get("entity_id") or "").strip()
+        if existing_eid:
+            ids.add(existing_eid)
+        protagonist_name = str(protagonist_state.get("name") or "").strip()
+
+        for delta in commit_payload.get("entity_deltas") or []:
+            if not isinstance(delta, dict):
+                continue
+            eid = str(delta.get("entity_id") or delta.get("id") or "").strip()
+            if not eid:
+                continue
+            tier = str(delta.get("tier") or "").strip()
+            canonical = str(
+                delta.get("canonical_name")
+                or (delta.get("payload") or {}).get("name")
+                or ""
+            ).strip()
+            if (
+                delta.get("is_protagonist")
+                or tier == "主角"
+                or (protagonist_name and canonical == protagonist_name)
+            ):
+                ids.add(eid)
+        return ids
+
     def _project_total_words(self, chapter_status: dict) -> int:
         total = 0
         for raw_chapter, raw_status in chapter_status.items():
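
The protagonist signals used by `_collect_protagonist_ids` can be exercised with a small payload. The function below is a standalone restatement for illustration (the name `collect_protagonist_ids` and the sample entity IDs are assumptions), not an import of the project code.

```python
from typing import Set


def collect_protagonist_ids(commit_payload: dict, state: dict) -> Set[str]:
    """Gather entity IDs that any of the signals marks as the protagonist."""
    ids: Set[str] = set()
    protagonist_state = state.get("protagonist_state") or {}
    existing = str(protagonist_state.get("entity_id") or "").strip()
    if existing:
        ids.add(existing)
    name = str(protagonist_state.get("name") or "").strip()

    for delta in commit_payload.get("entity_deltas") or []:
        if not isinstance(delta, dict):
            continue
        eid = str(delta.get("entity_id") or delta.get("id") or "").strip()
        if not eid:
            continue
        tier = str(delta.get("tier") or "").strip()
        canonical = str(
            delta.get("canonical_name") or (delta.get("payload") or {}).get("name") or ""
        ).strip()
        if delta.get("is_protagonist") or tier == "主角" or (name and canonical == name):
            ids.add(eid)
    return ids


state = {"protagonist_state": {"name": "陆鸣"}}
commit = {
    "entity_deltas": [
        {"entity_id": "by_flag", "is_protagonist": True},
        {"entity_id": "by_tier", "tier": "主角"},
        {"entity_id": "by_name", "payload": {"name": "陆鸣"}},
        {"entity_id": "bystander", "tier": "装饰", "canonical_name": "红衣女子"},
    ]
}
print(sorted(collect_protagonist_ids(commit, state)))
```

Since the result is a set, more than one ID can qualify in a single commit, and every qualifying entity's state_delta is mirrored into the same `protagonist_state`. The name signal in particular matches any entity that shares the protagonist's canonical name.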

+ 602 - 0
webnovel-writer/scripts/data_modules/tests/test_projection_schema_compat.py

@@ -0,0 +1,602 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+"""Schema compatibility tests: guard projector behavior under the schema the LLM (DeepSeek v4pro) actually emits.
+
+Background: the data-agent.md prompt specifies `{"field": "realm", "new": "..."}`,
+but the LLM actually emits `{"field_path": "physical.condition", "new_value": "..."}`,
+entity_deltas use `entity_type` instead of `type`, and open_loop_created event payloads have no `content`
+but do carry `description/loop_type/unanswered_question`.
+
+These tests use real DeepSeek v4pro output shapes as fixtures to ensure the projectors accept both the new and the legacy schema.
+"""
+
+import json
+
+from data_modules.config import DataModulesConfig
+from data_modules.memory.store import ScratchpadManager
+from data_modules.memory_projection_writer import MemoryProjectionWriter
+from data_modules.state_projection_writer import StateProjectionWriter
+from data_modules.vector_projection_writer import VectorProjectionWriter
+
+
+# ============================================================
+# state_projection_writer: accepts field_path / new_value
+# ============================================================
+
+
+def test_state_writer_accepts_field_path_alias(tmp_path):
+    (tmp_path / ".webnovel").mkdir(parents=True, exist_ok=True)
+    (tmp_path / ".webnovel" / "state.json").write_text("{}", encoding="utf-8")
+    writer = StateProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 2},
+            "state_deltas": [
+                {
+                    "entity_id": "luming",
+                    "field_path": "physical.condition",
+                    "old_value": "虚弱",
+                    "new_value": "虚弱(持续)",
+                    "change_type": "confirmed",
+                }
+            ],
+        }
+    )
+
+    payload = json.loads((tmp_path / ".webnovel" / "state.json").read_text(encoding="utf-8"))
+    luming = payload["entity_state"]["luming"]
+    # the nested path is expanded into a dict
+    assert luming["physical"]["condition"] == "虚弱(持续)"
+
+
+def test_state_writer_accepts_flat_field_legacy(tmp_path):
+    """The existing field/new schema must keep working (backward compatibility)."""
+    (tmp_path / ".webnovel").mkdir(parents=True, exist_ok=True)
+    (tmp_path / ".webnovel" / "state.json").write_text("{}", encoding="utf-8")
+    writer = StateProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 3},
+            "state_deltas": [{"entity_id": "x", "field": "realm", "new": "斗师"}],
+        }
+    )
+
+    payload = json.loads((tmp_path / ".webnovel" / "state.json").read_text(encoding="utf-8"))
+    assert payload["entity_state"]["x"]["realm"] == "斗师"
+
+
+def test_state_writer_handles_array_value_in_field_path(tmp_path):
+    (tmp_path / ".webnovel").mkdir(parents=True, exist_ok=True)
+    (tmp_path / ".webnovel" / "state.json").write_text("{}", encoding="utf-8")
+    writer = StateProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 2},
+            "state_deltas": [
+                {
+                    "entity_id": "luming",
+                    "field_path": "relationships.acquaintances",
+                    "old_value": [],
+                    "new_value": [
+                        {"entity_id": "liu_dazhu", "type": "同屋杂役"},
+                        {"entity_id": "sun_wang", "type": "同屋杂役"},
+                    ],
+                    "change_type": "initialize",
+                }
+            ],
+        }
+    )
+
+    payload = json.loads((tmp_path / ".webnovel" / "state.json").read_text(encoding="utf-8"))
+    acquaintances = payload["entity_state"]["luming"]["relationships"]["acquaintances"]
+    assert len(acquaintances) == 2
+    assert acquaintances[0]["entity_id"] == "liu_dazhu"
+
+
+def test_state_writer_mirrors_protagonist_state_when_entity_is_protagonist(tmp_path):
+    """A protagonist entity's state_delta should be mirrored to protagonist_state in state.json so legacy read paths keep working."""
+    (tmp_path / ".webnovel").mkdir(parents=True, exist_ok=True)
+    initial = {
+        "protagonist_state": {
+            "name": "陆鸣",
+            "power": {"realm": "", "layer": 1},
+            "location": {"current": "", "last_chapter": 0},
+            "golden_finger": {"name": "穿越者知识", "level": 1, "cooldown": 0, "skills": []},
+            "attributes": {},
+        }
+    }
+    (tmp_path / ".webnovel" / "state.json").write_text(json.dumps(initial), encoding="utf-8")
+    writer = StateProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 2},
+            "state_deltas": [
+                {"entity_id": "luming", "field_path": "power.realm", "new_value": "练气五层"},
+                {
+                    "entity_id": "luming",
+                    "field_path": "location.current",
+                    "new_value": "青云宗杂役院",
+                },
+            ],
+            "entity_deltas": [
+                {
+                    "entity_id": "luming",
+                    "canonical_name": "陆鸣",
+                    "entity_type": "角色",
+                    "tier": "核心",
+                    "is_protagonist": True,
+                }
+            ],
+        }
+    )
+
+    payload = json.loads((tmp_path / ".webnovel" / "state.json").read_text(encoding="utf-8"))
+    assert payload["protagonist_state"]["power"]["realm"] == "练气五层"
+    assert payload["protagonist_state"]["location"]["current"] == "青云宗杂役院"
+    # name must not be overwritten by the delta write-back
+    assert payload["protagonist_state"]["name"] == "陆鸣"
+
+
+def test_state_writer_recognizes_protagonist_via_tier_zhujue(tmp_path):
+    """The real LLM marks the protagonist with tier='主角' rather than is_protagonist=True."""
+    (tmp_path / ".webnovel").mkdir(parents=True, exist_ok=True)
+    initial = {
+        "protagonist_state": {
+            "name": "陆鸣",
+            "power": {"realm": "", "layer": 1},
+        }
+    }
+    (tmp_path / ".webnovel" / "state.json").write_text(json.dumps(initial), encoding="utf-8")
+    writer = StateProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 1},
+            "state_deltas": [
+                {"entity_id": "luming", "field_path": "power.realm", "new_value": "练气五层"},
+            ],
+            "entity_deltas": [
+                {
+                    "entity_id": "luming",
+                    "canonical_name": "陆鸣",
+                    "entity_type": "角色",
+                    "tier": "主角",
+                }
+            ],
+        }
+    )
+
+    payload = json.loads((tmp_path / ".webnovel" / "state.json").read_text(encoding="utf-8"))
+    assert payload["protagonist_state"]["power"]["realm"] == "练气五层"
+
+
+def test_state_writer_recognizes_protagonist_via_canonical_name_match(tmp_path):
+    """With neither tier nor is_protagonist, fall back to matching the name against state.protagonist_state.name."""
+    (tmp_path / ".webnovel").mkdir(parents=True, exist_ok=True)
+    initial = {"protagonist_state": {"name": "陆鸣", "power": {}}}
+    (tmp_path / ".webnovel" / "state.json").write_text(json.dumps(initial), encoding="utf-8")
+    writer = StateProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 1},
+            "state_deltas": [
+                {"entity_id": "luming", "field_path": "power.realm", "new_value": "练气五层"},
+            ],
+            "entity_deltas": [
+                {"entity_id": "luming", "canonical_name": "陆鸣", "entity_type": "角色"}
+            ],
+        }
+    )
+
+    payload = json.loads((tmp_path / ".webnovel" / "state.json").read_text(encoding="utf-8"))
+    assert payload["protagonist_state"]["power"]["realm"] == "练气五层"
+
+
+# ============================================================
+# index_manager.apply_entity_delta: tier=主角 / entity_type detection
+# ============================================================
+
+
+def test_index_manager_marks_protagonist_via_tier_zhujue(tmp_path):
+    """tier='主角' should set is_protagonist=True automatically so that get_protagonist can find the entity."""
+    from data_modules.index_manager import IndexManager
+
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    manager = IndexManager(cfg)
+
+    manager.apply_entity_delta(
+        {
+            "entity_id": "luming",
+            "action": "upsert",
+            "entity_type": "角色",
+            "tier": "主角",
+            "chapter": 1,
+            "payload": {"name": "陆鸣"},
+        }
+    )
+
+    protagonist = manager.get_protagonist()
+    assert protagonist is not None, "tier=主角 should be recognized as protagonist"
+    assert protagonist["id"] == "luming"
+
+
+def test_index_manager_preserves_entity_type_for_organization(tmp_path):
+    """When entity_deltas use entity_type='组织', the indexed type must also be '组织', not the default '角色'."""
+    from data_modules.index_manager import IndexManager
+
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    manager = IndexManager(cfg)
+
+    manager.apply_entity_delta(
+        {
+            "entity_id": "qingyun_zong",
+            "action": "upsert",
+            "entity_type": "组织",
+            "tier": "重要",
+            "chapter": 1,
+            "payload": {"name": "青云宗"},
+        }
+    )
+
+    entity = manager.get_entity("qingyun_zong")
+    assert entity is not None
+    assert entity["type"] == "组织", f"expected type=组织, got {entity['type']!r}"
+
+
+def test_index_manager_uses_payload_name_when_canonical_name_missing(tmp_path):
+    """Real LLM output often puts the name in payload.name rather than in a top-level canonical_name."""
+    from data_modules.index_manager import IndexManager
+
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    manager = IndexManager(cfg)
+
+    manager.apply_entity_delta(
+        {
+            "entity_id": "lu_ming",
+            "action": "upsert",
+            "entity_type": "角色",
+            "tier": "主角",
+            "chapter": 1,
+            "payload": {"name": "陆鸣"},
+        }
+    )
+
+    entity = manager.get_entity("lu_ming")
+    assert entity is not None
+    assert entity["canonical_name"] == "陆鸣", (
+        f"expected canonical_name=陆鸣, got {entity['canonical_name']!r}"
+    )
+
+
+def test_index_manager_corrects_entity_type_on_re_upsert(tmp_path):
+    """If an older version mislabeled an existing entity as '角色', replaying the new commit must be able to correct type to '组织' or similar."""
+    from data_modules.index_manager import IndexManager
+
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    manager = IndexManager(cfg)
+
+    # simulate the old bug: first persist with type='角色' (the default fallback)
+    manager.apply_entity_delta(
+        {
+            "entity_id": "qingyun_zong",
+            "tier": "重要",
+            "chapter": 1,
+            "payload": {"name": "青云宗"},
+        }
+    )
+    entity = manager.get_entity("qingyun_zong")
+    assert entity["type"] == "角色"
+
+    # Replay the commit after the fix, passing entity_type='组织' explicitly
+    manager.apply_entity_delta(
+        {
+            "entity_id": "qingyun_zong",
+            "entity_type": "组织",
+            "tier": "重要",
+            "chapter": 1,
+            "payload": {"name": "青云宗"},
+        }
+    )
+    entity = manager.get_entity("qingyun_zong")
+    assert entity["type"] == "组织", f"expected type=组织 after re-upsert, got {entity['type']!r}"
+
+
+def test_index_manager_resolves_underscored_id_to_compact_entity(tmp_path):
+    """When an entity is registered as 'luming', querying 'lu_ming' should also find it (fallback for inconsistent LLM naming styles)."""
+    from data_modules.index_manager import IndexManager
+
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    manager = IndexManager(cfg)
+
+    manager.apply_entity_delta(
+        {
+            "entity_id": "luming",
+            "entity_type": "角色",
+            "tier": "主角",
+            "chapter": 1,
+            "payload": {"name": "陆鸣"},
+        }
+    )
+
+    # Direct lookup: already works
+    assert manager.get_entity("luming")["id"] == "luming"
+    # Reverse fallback: underscored variant
+    found = manager.get_entity("lu_ming")
+    assert found is not None, "lu_ming should resolve to luming"
+    assert found["id"] == "luming"
+
+
+# ============================================================
+# memory_writer: accepts entity_type / field_path / loop description
+# ============================================================
+
+
+def test_memory_writer_preserves_entity_type_for_organization(tmp_path):
+    """When entity_deltas uses the entity_type field, an organization must not be mislabeled as '角色'."""
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    writer = MemoryProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 1},
+            "entity_deltas": [
+                {
+                    "entity_id": "qingyun_zong",
+                    "action": "upsert",
+                    "entity_type": "组织",
+                    "tier": "重要",
+                    "payload": {"name": "青云宗"},
+                }
+            ],
+            "state_deltas": [],
+            "accepted_events": [],
+        }
+    )
+
+    store = ScratchpadManager(cfg)
+    chars = store.query(category="character_state", status="active")
+    qingyun = [x for x in chars if x.subject == "qingyun_zong"]
+    assert qingyun, "qingyun_zong entity should be recorded"
+    assert qingyun[0].payload.get("type") == "组织", (
+        f"expected type=组织, got {qingyun[0].payload.get('type')!r}"
+    )
+
+
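+def test_memory_writer_accepts_field_path_in_state_delta(tmp_path):
+    """A state_delta that uses field_path / old_value / new_value should still be recorded as character_state."""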
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    writer = MemoryProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 2},
+            "state_deltas": [
+                {
+                    "entity_id": "luming",
+                    "field_path": "physical.condition",
+                    "old_value": "虚弱",
+                    "new_value": "虚弱(持续)",
+                }
+            ],
+            "entity_deltas": [],
+            "accepted_events": [],
+        }
+    )
+
+    store = ScratchpadManager(cfg)
+    chars = store.query(category="character_state", status="active")
+    assert any(
+        x.subject == "luming" and "physical" in x.field for x in chars
+    ), [(x.subject, x.field) for x in chars]
+
+
+def test_memory_writer_extracts_open_loop_from_description_when_no_content(tmp_path):
+    """When an open_loop_created payload has no content, fall back to description / unanswered_question."""
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    writer = MemoryProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 2},
+            "state_deltas": [],
+            "entity_deltas": [],
+            "accepted_events": [
+                {
+                    "event_id": "evt_002",
+                    "chapter": 2,
+                    "event_type": "open_loop_created",
+                    "subject": "luming",
+                    "payload": {
+                        "description": "陆鸣发现借据'一式三份'条款,保人身份成谜",
+                        "loop_type": "身份悬疑",
+                        "unanswered_question": "保人是谁?谁带原身去签的借据?",
+                        "narrative_weight": "major",
+                    },
+                }
+            ],
+        }
+    )
+
+    store = ScratchpadManager(cfg)
+    loops = store.query(category="open_loop", status="active")
+    assert loops, "open_loop should be recorded"
+    # subject should be the meaningful loop content, not 'luming' (the fallback taken from event.subject)
+    contents = [x.subject for x in loops]
+    assert any("保人" in c or "借据" in c or "身份" in c for c in contents), (
+        f"expected meaningful loop content, got {contents}"
+    )
+
+
+def test_memory_writer_extracts_world_rule_from_rule_content(tmp_path):
+    """A world_rule_revealed event that uses rule_content / description should still land in world_rule memory."""
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    writer = MemoryProjectionWriter(tmp_path)
+
+    writer.apply(
+        {
+            "meta": {"status": "accepted", "chapter": 2},
+            "state_deltas": [],
+            "entity_deltas": [],
+            "accepted_events": [
+                {
+                    "event_id": "evt_003",
+                    "chapter": 2,
+                    "event_type": "world_rule_revealed",
+                    "subject": "luming",
+                    "payload": {
+                        "description": "青云城借贷市场三大空白",
+                        "rule_category": "金融/经济",
+                        "rule_content": "利率垄断、无信用体系、暴力收债",
+                    },
+                }
+            ],
+        }
+    )
+
+    store = ScratchpadManager(cfg)
+    rules = store.query(category="world_rule", status="active")
+    assert rules
+    rule_texts = [x.value for x in rules]
+    assert any("利率" in r or "信用" in r or "金融" in r or "借贷" in r for r in rule_texts), (
+        f"expected meaningful rule text, got {rule_texts}"
+    )
+
+
+# ============================================================
+# vector_projection_writer: accepts description / new_state
+# ============================================================
+
+
+def test_vector_writer_handles_character_state_changed_with_description():
+    # Skip __init__; this test only calls _event_to_text.
+    writer = VectorProjectionWriter.__new__(VectorProjectionWriter)
+    event = {
+        "event_type": "character_state_changed",
+        "chapter": 2,
+        "subject": "luming",
+        "payload": {
+            "description": "陆鸣意识到自己是这个世界唯一懂金融的人",
+            "previous_state": "刚穿越的茫然",
+            "new_state": "认知激活",
+            "narrative_weight": "pivotal",
+        },
+    }
+    text = writer._event_to_text(event)
+    assert text, "should produce non-empty text"
+    assert "第2章" in text
+    assert "luming" in text or "陆鸣" in text or "认知" in text or "金融" in text
+
+
+# ============================================================
+# Integration: run a real DeepSeek commit payload through the full projection chain
+# ============================================================
+
+
+def test_integration_real_deepseek_commit_projects_full_state(tmp_path):
+    """End to end: run a real commit payload through the state + memory projection writers and confirm every piece of data lands where it belongs."""
+    cfg = DataModulesConfig.from_project_root(tmp_path)
+    cfg.ensure_dirs()
+    (tmp_path / ".webnovel" / "state.json").write_text(
+        json.dumps(
+            {
+                "protagonist_state": {
+                    "name": "陆鸣",
+                    "power": {"realm": "", "layer": 1},
+                    "location": {"current": ""},
+                }
+            }
+        ),
+        encoding="utf-8",
+    )
+
+    # Fragment of real DeepSeek v4pro output
+    real_payload = {
+        "meta": {"status": "accepted", "chapter": 2},
+        "state_deltas": [
+            {
+                "entity_id": "luming",
+                "field_path": "physical.condition",
+                "old_value": "虚弱",
+                "new_value": "虚弱(持续)",
+                "change_type": "confirmed",
+            },
+            {
+                "entity_id": "luming",
+                "field_path": "knowledge.lending_ecosystem",
+                "old_value": "仅知有阎王债",
+                "new_value": "完整市场图谱",
+                "change_type": "initialize",
+            },
+        ],
+        "entity_deltas": [
+            {
+                "entity_id": "luming",
+                "action": "upsert",
+                "entity_type": "角色",
+                "tier": "核心",
+                "is_protagonist": True,
+                "payload": {"name": "陆鸣"},
+            },
+            {
+                "entity_id": "qingyun_zong",
+                "action": "upsert",
+                "entity_type": "组织",
+                "tier": "重要",
+                "payload": {"name": "青云宗"},
+            },
+            {
+                "entity_id": "heishi_fangshi",
+                "action": "upsert",
+                "entity_type": "地点",
+                "tier": "重要",
+                "payload": {"name": "黑石坊市"},
+            },
+        ],
+        "accepted_events": [
+            {
+                "event_id": "evt_ch002_guarantor_mystery",
+                "chapter": 2,
+                "event_type": "open_loop_created",
+                "subject": "luming",
+                "payload": {
+                    "description": "保人身份不明",
+                    "loop_type": "身份悬疑",
+                    "unanswered_question": "保人是谁?",
+                },
+            }
+        ],
+    }
+
+    StateProjectionWriter(tmp_path).apply(real_payload)
+    MemoryProjectionWriter(tmp_path).apply(real_payload)
+
+    state = json.loads((tmp_path / ".webnovel" / "state.json").read_text(encoding="utf-8"))
+    # entity_state must be populated (it must no longer be {})
+    assert state["entity_state"], "entity_state should not be empty"
+    assert "luming" in state["entity_state"]
+    # Protagonist fields mirrored into protagonist_state must not be lost
+    assert state["protagonist_state"]["name"] == "陆鸣"
+
+    # memory_scratchpad: organizations and locations must not be mislabeled
+    store = ScratchpadManager(cfg)
+    chars = store.query(category="character_state", status="active")
+    by_id = {x.subject: x for x in chars}
+    assert by_id["qingyun_zong"].payload.get("type") == "组织"
+    assert by_id["heishi_fangshi"].payload.get("type") == "地点"
+
+    # open_loop must carry meaningful content (not 'luming')
+    loops = store.query(category="open_loop", status="active")
+    contents = [x.subject for x in loops]
+    assert any("保人" in c or "身份悬疑" in c for c in contents), contents

+ 39 - 5
webnovel-writer/scripts/data_modules/vector_projection_writer.py

@@ -72,20 +72,54 @@ class VectorProjectionWriter:
         payload = event.get("payload") or {}
 
         if event_type == "power_breakthrough":
-            new_val = str(payload.get("new") or payload.get("to") or "").strip()
+            new_val = str(
+                payload.get("new")
+                or payload.get("to")
+                or payload.get("new_value")
+                or payload.get("new_state")
+                or ""
+            ).strip()
             return f"第{chapter}章:{subject}突破至{new_val}" if new_val else ""
         elif event_type == "character_state_changed":
-            field = str(payload.get("field") or "").strip()
-            new_val = str(payload.get("new") or payload.get("to") or "").strip()
-            return f"第{chapter}章:{subject}的{field}变为{new_val}" if field and new_val else ""
+            field = str(
+                payload.get("field") or payload.get("field_path") or ""
+            ).strip()
+            new_val = str(
+                payload.get("new")
+                or payload.get("to")
+                or payload.get("new_value")
+                or payload.get("new_state")
+                or ""
+            ).strip()
+            description = str(payload.get("description") or "").strip()
+            if field and new_val:
+                return f"第{chapter}章:{subject}的{field}变为{new_val}"
+            if new_val:
+                return f"第{chapter}章:{subject}变化为{new_val}"
+            if description:
+                return f"第{chapter}章:{subject}:{description}"
+            return ""
         elif event_type == "relationship_changed":
             to_entity = str(payload.get("to_entity") or payload.get("to") or "").strip()
             rel_type = str(payload.get("relationship_type") or payload.get("type") or "").strip()
             return f"第{chapter}章:{subject}与{to_entity}关系变为{rel_type}" if to_entity else ""
         elif event_type in ("world_rule_revealed", "world_rule_broken"):
-            desc = str(payload.get("description") or payload.get("rule") or "").strip()
+            desc = str(
+                payload.get("description")
+                or payload.get("rule")
+                or payload.get("rule_content")
+                or ""
+            ).strip()
             action = "揭示" if "revealed" in event_type else "打破"
             return f"第{chapter}章:{action}世界规则——{desc}" if desc else ""
+        elif event_type == "open_loop_created":
+            description = str(
+                payload.get("description")
+                or payload.get("unanswered_question")
+                or payload.get("content")
+                or ""
+            ).strip()
+            return f"第{chapter}章:{subject}埋下悬念——{description}" if description else ""
         elif event_type == "artifact_obtained":
             name = str(payload.get("name") or subject or "").strip()
             owner = str(payload.get("owner") or payload.get("holder") or "").strip()