mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-05-20 19:29:58 +08:00
Extend `isDangerousInvisibleCodePoint` with five additional code points / ranges that are routinely cited in invisible-character smuggling references but were missing from the previous denylist:

- **U+180E** MONGOLIAN VOWEL SEPARATOR. Classified as a space separator (Zs) until Unicode 6.3 reclassified it as Cf (format control). Renders as zero-width; widely abused for homograph attacks and prompt smuggling.
- **U+115F** HANGUL CHOSEONG FILLER and **U+1160** HANGUL JUNGSEONG FILLER. Zero-width fillers used in Korean text shaping. Both are cited as common LLM-injection vectors in Korean / multilingual threat models.
- **U+2061–U+2064** invisible math operators (FUNCTION APPLICATION, INVISIBLE TIMES, INVISIBLE SEPARATOR, INVISIBLE PLUS). Zero-width and only meaningful inside math typesetting. No legitimate Markdown or source code uses them.
- **U+3164** HANGUL FILLER. Reported in real-world Discord and Twitter smuggling incidents; not used in legitimate Korean text.

Reproduced before this commit: a file containing any one of these code points passed `check-unicode-safety.js` silently. After this commit, each one is reported as `dangerous-invisible U+<HEX>`, and `--write` mode strips it.

Verified by writing 8 single-character probe files (`probe-0x180E.md`, `probe-0x115F.md`, …) and confirming exit=1 with each violation listed. The ECC repo self-scan reports only the pre-existing `U+2605` BLACK STAR warnings (unchanged) and exits with the same status, so no new in-repo violations are introduced.

The 5 existing unicode-safety tests still pass; `yarn lint` is clean. Regression coverage for both the previous commit's Tag block fix and this commit's additions lands in the next commit.
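The additions above can be sketched as a small range check. This is an illustration only, not the code in `check-unicode-safety.js`: `ADDED_RANGES`, `isAddedInvisibleCodePoint`, `findViolations`, and `stripViolations` are hypothetical names, and the sketch covers just the code points named in this commit rather than the full denylist.

```javascript
// Sketch under stated assumptions: only the code points added in this commit,
// expressed as inclusive [low, high] ranges. All names here are hypothetical.
const ADDED_RANGES = [
  [0x115f, 0x1160], // HANGUL CHOSEONG FILLER, HANGUL JUNGSEONG FILLER
  [0x180e, 0x180e], // MONGOLIAN VOWEL SEPARATOR
  [0x2061, 0x2064], // invisible math operators
  [0x3164, 0x3164], // HANGUL FILLER
];

// True when the code point falls inside one of the added ranges.
function isAddedInvisibleCodePoint(codePoint) {
  return ADDED_RANGES.some(([low, high]) => codePoint >= low && codePoint <= high);
}

// Collect one "dangerous-invisible U+<HEX>" entry per offending character.
function findViolations(text) {
  const violations = [];
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (isAddedInvisibleCodePoint(cp)) {
      const hex = cp.toString(16).toUpperCase().padStart(4, '0');
      violations.push(`dangerous-invisible U+${hex}`);
    }
  }
  return violations;
}

// Approximates a --write pass: drop the offending characters, keep the rest.
function stripViolations(text) {
  return Array.from(text)
    .filter((ch) => !isAddedInvisibleCodePoint(ch.codePointAt(0)))
    .join('');
}

console.log(findViolations('a\u180Eb\u2063c'));
console.log(JSON.stringify(stripViolations('a\u180Eb\u2063c')));
```

Iterating with `for...of` and `codePointAt(0)` walks the string by code point rather than by UTF-16 unit, which matters once a denylist also includes supplementary-plane characters such as the Tag block mentioned above.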
7.3 KiB