
fix(prompt-hook): bound the call/trace/affect/connect stems on the right so ordinary words can't fire the gate (#1138) (#1147)

The multilingual structural-question gate (#1134) matches stems as open
prefixes (left boundary only) so derived forms fire without enumeration.
Four English stems have common non-structural completions — callus,
calligraphy, Connecticut, connective, affectionate, Tracey — that
false-fired the HIGH (full-explore) tier. Those four now enumerate their
structural suffixes and re-assert the right boundary; callbacks/callable/
call sites are included so no structural form regresses. Also documents
the verified-unfixable Korean homograph class on the unsegmented table
(#1140): segmentation can't split 구조대 from 구조가, and a denylist would
break 구조대로.

Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
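
The difference between an open prefix and a bounded stem, as described in the commit message, can be sketched in isolation. This is a minimal illustration, not the project's code: the definitions of `NOT_WORD_BEFORE` and `NOT_WORD_AFTER` are not shown in this diff, so the Unicode-aware lookarounds below are assumptions, and `compile`, `firesOpen`, and `firesBounded` are hypothetical helper names. The four bounded patterns are copied from the `src/directory.ts` hunk.

```typescript
// ASSUMPTION: stand-ins for the real boundary fragments, which this diff
// references but does not define.
const NOT_WORD_BEFORE = '(?<![\\p{L}\\p{N}_])';
const NOT_WORD_AFTER = '(?![\\p{L}\\p{N}_])';

// Old behaviour: left boundary only, so any completion of the stem matches.
const OPEN_STEMS = ['call', 'trace', 'affect', 'connect'];

// New behaviour: enumerated structural suffixes plus a right boundary.
const BOUNDED_STEMS = [
  `call(?:s|ing|ed|ers?|backs?|able|sites?)?${NOT_WORD_AFTER}`,
  `trace(?:s|d|rs?)?${NOT_WORD_AFTER}`,
  `affect(?:s|ed|ing)?${NOT_WORD_AFTER}`,
  `connect(?:s|ed|ing|ions?|ors?|ivity)?${NOT_WORD_AFTER}`,
];

// Hypothetical helper: joins the stems behind one shared left boundary.
function compile(stems: string[]): RegExp {
  return new RegExp(`${NOT_WORD_BEFORE}(?:${stems.join('|')})`, 'iu');
}

const openRe = compile(OPEN_STEMS);
const boundedRe = compile(BOUNDED_STEMS);

function firesOpen(text: string): boolean {
  return openRe.test(text);
}

function firesBounded(text: string): boolean {
  return boundedRe.test(text);
}

const samples = [
  'he has a callus on his palm',
  'Connecticut is a state',
  'Tracey went home early',
  'list the callers of parseToken',
  'find every call site of dispose',
  'how do the connections get pooled',
];

for (const s of samples) {
  console.log(`${s} | open=${firesOpen(s)} bounded=${firesBounded(s)}`);
}
```

In this sketch the first three samples fire only under the open form, while the last three fire under both. "tracing" is deliberately absent: it has no `trace` prefix, which is why the commit keeps it in the exact-word list instead.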
Colby Mchenry, 1 day ago
parent
commit
713ab7af43
3 changed files, 51 additions and 3 deletions
  1. CHANGELOG.md (+1 -0)
  2. __tests__/frontload-hook.test.ts (+26 -0)
  3. src/directory.ts (+24 -3)

+ 1 - 0
CHANGELOG.md

@@ -20,6 +20,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - The automatic context hook for Claude Code now fires for structural questions asked in nearly thirty languages — French, Spanish, Portuguese, German, Italian, Dutch, Polish, Czech, Romanian, Hungarian, Greek, Swedish, Danish, Norwegian, Finnish, Russian, Ukrainian, Turkish, Indonesian, Vietnamese, Thai, Hindi, Arabic, Farsi, Hebrew, Japanese, Korean, and both simplified and traditional Chinese — instead of just English and simplified Chinese. Previously a natural question like "comment marche la state machine des commandes ?" injected nothing unless it happened to contain a code-shaped symbol name, making the hook look broken for non-English teams. English questions phrased with derived word forms ("explain the architecture…", "what are the dependencies…") now fire too, and prompts in any other language still fire when they name a symbol from the index. Thanks @anthonyle-roy-lgtm for the report. (#1126)
 - Lua and Luau method calls with capitalized names (`obj:Method()` — the standard Roblox convention) now link to the right method. Because Lua's method-call syntax looks identical to a Luau type annotation, a capitalized call like `lg:Log()` was misread as declaring the variable's type, so whenever two or more classes shared a method name (`Init`, `Update`, `Destroy`, …) the call was silently dropped from callers, impact/blast-radius, and flow traces. Lowercase method names were unaffected. Thanks @inth3shadows for the precise root-cause analysis and repro. (#1124)
 - Removed dead code left behind by the discontinued managed-reasoning feature. Its `codegraph login` flow was unplugged before ever shipping in a release, but the unused module still shipped inside the platform bundles, and a security review flagged its Windows browser-open step (it routed the login URL through `cmd`, which would have been unsafe had the flow ever been wired back up). The leftover module and its tests are now fully deleted. Thanks @inth3shadows for the report. (#1114)
+- The Claude Code context hook no longer treats ordinary English words that merely start with "call", "trace", "affect", or "connect" — callus, calligraphy, Connecticut, connective, affectionate, Tracey — as structural questions, which used to inject full CodeGraph context into prompts that had nothing to do with code structure. Genuinely structural forms (calls, callers, callbacks, call site, traced, tracing, affected, connections, connectivity, …) still fire exactly as before. Thanks @inth3shadows for the report. (#1138)
 
 ## [1.2.0] - 2026-07-02
 

+ 26 - 0
__tests__/frontload-hook.test.ts

@@ -232,6 +232,32 @@ describe('hasStructuralKeyword — Latin-script languages, Cyrillic, JA/KO (#112
     expect(hasStructuralKeyword('water the flower')).toBe(false);           // unchanged guarantee
   });
 
+  it('bounded stems reject ordinary-English completions (#1138)', () => {
+    expect(hasStructuralKeyword('he has a callus on his palm')).toBe(false);
+    expect(hasStructuralKeyword('a lovely calligraphy font')).toBe(false);
+    expect(hasStructuralKeyword('Connecticut is a state')).toBe(false);
+    expect(hasStructuralKeyword('connective tissue damage')).toBe(false);
+    expect(hasStructuralKeyword('she is very affectionate')).toBe(false);
+    expect(hasStructuralKeyword('Tracey went home early')).toBe(false);
+  });
+
+  it('bounded stems keep every structural derived form (#1138)', () => {
+    // call
+    expect(hasStructuralKeyword('list the callers of parseToken')).toBe(true);
+    expect(hasStructuralKeyword('what callbacks fire on save')).toBe(true);
+    expect(hasStructuralKeyword('is submitOrder callable from the worker')).toBe(true);
+    expect(hasStructuralKeyword('find every call site of dispose')).toBe(true);
+    expect(hasStructuralKeyword('who called setupRouter')).toBe(true);
+    // trace ("tracing" is covered by the exact-word list — the e drops)
+    expect(hasStructuralKeyword('trace the request')).toBe(true);
+    expect(hasStructuralKeyword('we traced it to the cache layer')).toBe(true);
+    expect(hasStructuralKeyword('add tracing to the pipeline')).toBe(true);
+    // affect / connect
+    expect(hasStructuralKeyword('which modules are affected by this change')).toBe(true);
+    expect(hasStructuralKeyword('how do the connections get pooled')).toBe(true);
+    expect(hasStructuralKeyword('the connector registers itself at boot')).toBe(true);
+  });
+
   it('non-structural prose stays a no-op in every covered language', () => {
     expect(hasStructuralKeyword('corrige cette faute de frappe')).toBe(false);   // FR "fix this typo"
     expect(hasStructuralKeyword('arregla este error tipográfico')).toBe(false);  // ES

+ 24 - 3
src/directory.ts

@@ -316,12 +316,23 @@ const STRUCTURAL_WORDS = [
  * "вызыва" on вызывает/вызывается. Mid-word occurrences stay excluded —
  * "restructure"/"independent" don't fire — so precision stays close to the
  * exact-word class. Add a stem only when every plausible completion is still a
- * structural word.
+ * structural word; a stem with ordinary-English completions must instead
+ * enumerate its structural suffixes and re-assert the right boundary (see the
+ * four bounded English entries below, #1138).
  */
 const STRUCTURAL_STEMS = [
   // English + the Latin-script languages that share the spelling (French
-  // architecture/structure/trace/impact, Spanish depende/implementa/impacto, …)
-  'architect', 'structur', 'depend', 'implement', 'connect', 'impact', 'affect', 'trace', 'call', 'explain',
+  // architecture/structure/trace/impact, Spanish depende/implementa/impacto, …).
+  // call/trace/affect/connect are NOT safe as open prefixes — callus,
+  // calligraphy, Connecticut, connective, affectionate, Tracey are ordinary
+  // words that would false-fire the full-explore tier (#1138) — so they carry
+  // an enumerated suffix set + right boundary. "tracing" lives in
+  // STRUCTURAL_WORDS (the e is dropped, so no trace-prefix form matches it).
+  'architect', 'structur', 'depend', 'implement', 'impact', 'explain',
+  `call(?:s|ing|ed|ers?|backs?|able|sites?)?${NOT_WORD_AFTER}`,
+  `trace(?:s|d|rs?)?${NOT_WORD_AFTER}`,
+  `affect(?:s|ed|ing)?${NOT_WORD_AFTER}`,
+  `connect(?:s|ed|ing|ions?|ors?|ivity)?${NOT_WORD_AFTER}`,
   // French (appel(le)=call, dépend=depends, implément(e)=implement,
   // connex(ion)=connection, expliqu(e)=explain, fonctionn(e/ement)=works)
   'appel', 'dépend', 'implément', 'connex', 'expliqu', 'fonctionn',
@@ -417,6 +428,16 @@ const STRUCTURAL_STEMS_RE = new RegExp(`${NOT_WORD_BEFORE}(?:${STRUCTURAL_STEMS.
  * يعمل/تعمل/ทำงาน=works) plus structural-overview words with no single clean
  * English equivalent (介绍/介紹/解析/分析/原理/机制/機制/仕組み/説明/설명/動作/동작/작동/
  * اشرح/شرح/توضیح/הסבר/อธิบาย=explain).
+ *
+ * KNOWN, ACCEPTED false-positive class (#1140): substring matching cannot see
+ * homograph compounds — Korean 구조 (structure) also fires inside 구조대
+ * (rescue squad). Verified unfixable at this layer: ICU word segmentation
+ * (Intl.Segmenter) returns 구조대 and the particle form 구조가 (which the gate
+ * MUST keep matching) as equally opaque single segments, and a 구조대 denylist
+ * would break 구조대로 ("according to the structure" — 구조 + the 대로
+ * particle), a legitimate structural prompt. The miss rate this design avoids
+ * (silently no-op'ing every prompt in these languages, #994) outweighs the
+ * occasional off-domain fire.
  */
 const STRUCTURAL_UNSEGMENTED = /如何|怎么|怎麼|在哪|哪里|哪裡|追踪|跟踪|追蹤|追跡|トレース|流程|流向|流れ|路径|路徑|経路|调用|調用|呼び出|依赖|依賴|依存|影响|影響|实现|實現|実装|架构|架構|アーキテクチャ|结构|結構|構造|介绍|介紹|解析|分析|原理|机制|機制|仕組み|説明|動作|どうやって|どのように|어떻게|어디|호출|흐름|경로|의존|영향|구현|구조|아키텍처|추적|동작|작동|설명|كيف|أين|اين|يستدعي|استدعاء|يعتمد|تعتمد|يؤثر|تأثير|معماري|بنية|هيكل|تدفق|مسار|تتبع|يعمل|تعمل|اشرح|شرح|چگونه|چطور|کجا|فراخوان|وابسته|تأثیر|معماری|ساختار|مسیر|توضیح|איך|איפה|קורא|תלוי|משפיע|ארכיטקטור|מבנה|זרימה|נתיב|הסבר|อย่างไร|ยังไง|ที่ไหน|เรียกใช้|ขึ้นอยู่กับ|ผลกระทบ|สถาปัตยกรรม|โครงสร้าง|เส้นทาง|ติดตาม|ทำงาน|อธิบาย/;
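
The denylist half of the #1140 note can be reproduced with plain string operations. The sketch below is a hypothetical illustration, not code from the repository: `substringGate` mimics the unsegmented table's substring match for the single entry 구조, and `denylistGate` is an imagined variant that strips 구조대 before matching. The Korean sample sentences are made up for the demonstration. The segmentation claim about `Intl.Segmenter` is taken from the source comment and is not exercised here.

```typescript
// One entry from the unsegmented table: 구조 (structure).
const STEM = '구조';

// Mimics the substring match used for unsegmented scripts.
function substringGate(text: string): boolean {
  return text.includes(STEM);
}

// HYPOTHETICAL alternative: remove denylisted compounds, then match.
const DENYLIST = ['구조대']; // rescue squad

function denylistGate(text: string): boolean {
  let stripped = text;
  for (const word of DENYLIST) {
    stripped = stripped.split(word).join(' ');
  }
  return stripped.includes(STEM);
}

const prompts = [
  '구조대가 도착했다',       // off-domain: "the rescue squad arrived"
  '이 모듈의 구조가 궁금해', // structural: 구조 + subject particle 가
  '기존 구조대로 구현해줘',  // structural: 구조 + the 대로 particle
];

for (const p of prompts) {
  console.log(`${p} | substring=${substringGate(p)} denylist=${denylistGate(p)}`);
}
```

Under these assumptions the denylist removes the off-domain fire on the first prompt but also silences the third, which is the regression the comment describes. The plain substring match fires on all three.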