Add Detection Guidance: false positives, human-writing signs, LLM idiolects

Most of this skill tells the editor what to remove. This adds the
inverse — what to leave alone, and how to decide.

Sourced from Wikipedia: Signs of AI writing (revision fetched
2026-05-01), specifically the "Ineffective indicators", "Signs of
human writing", and "Differences between LLMs" sections.

Three subsections, no new patterns:

- "What NOT to flag (false positives)" — the indicators that look
  AI-coded but are actually neutral (perfect grammar, em dashes
  alone, curly quotes alone, formal vocabulary, common transition
  words). The over-editing risk is real: if the skill is applied
  too aggressively, it strips legitimate prose. Closes with the
  "clusters matter, isolated signs don't" rule.

- "Signs of human writing (preserve these)" — positive markers
  that should be left untouched: specific detail, mixed feelings,
  era-bound references, sentence-length variation, parenthetical
  self-corrections, and the November 30, 2022 cutoff for ruling
  out AI involvement entirely.

- "LLM Idiolects" — quick triage notes per model family
  (ChatGPT/Grok verbose with artifacts; Gemini/Claude concise, no
  curly quotes by default). Tendencies, not rules.

No pattern-count change. No README changes (the README's pattern
table is unaffected since this section is meta-guidance, not new
patterns). No version bump.

Philipp Dubach, 1 month ago
parent
commit
564b7aa138
1 changed file with 45 additions and 0 deletions

SKILL.md (+45 -0)

@@ -461,6 +461,51 @@ Avoiding AI patterns is only half the job. Sterile, voiceless writing is just as
 >
 > When users hit a slow page, they leave.
 
+
+## DETECTION GUIDANCE
+
+### What NOT to flag (false positives)
+
+A human writer with clean prose can hit several of the patterns above without any AI involvement. Before rewriting, sanity-check that you are not gutting legitimate prose. The following are *not* reliable indicators on their own:
+
+- **Perfect grammar and consistent style.** Many writers are professionals or have been edited. Polish does not equal AI.
+- **Mixed casual and formal registers.** This often signals a person in a technical field, a young writer, or someone with neurodivergent prose habits — not a chatbot.
+- **"Bland" or "robotic" prose.** AI prose has *specific* tells. Generic dryness without those tells is just dry writing.
+- **Formal or academic vocabulary.** AI overuses *specific* fancy words (see §7), not all fancy words. Don't flatten "ostensibly" or "constituent" just because they sound brainy.
+- **Letter-style opening or closing on a comment.** Salutations and sign-offs predate ChatGPT by centuries.
+- **Common transition words in isolation.** *Additionally*, *moreover*, *consequently* are AI-coded only when piled up. One *however* is not a tell.
+- **Curly quotes alone.** macOS, Word, Google Docs, and most CMSes auto-curl by default. Curly quotes only count when stacked with other tells.
+- **Em dashes alone.** Many editors and journalists use them often. Em dashes are evidence only when paired with formulaic sales-y rhythm.
+- **Unsourced claims.** Most of the web is unsourced. Lack of citations doesn't prove anything.
+- **Correct, complex formatting.** Visual editors and templates produce clean output without any AI.
+
+When in doubt, look for **clusters** of tells, not isolated ones. A single em dash means nothing; em dashes plus rule-of-three plus *vibrant tapestry* plus a "Conclusion" section is a confession.
+
+
+### Signs of human writing (preserve these)
+
+When you see these, lean toward leaving the prose alone — they are evidence of a real person writing, and over-editing will destroy what makes the piece sound human:
+
+- **Specific, unusual, hard-to-fabricate detail.** A real address. A weird quote. The phrase "the lawyer who used to work upstairs from my dentist." LLMs round off specifics; humans hoard them.
+- **Mixed feelings and unresolved tension.** "I think this is mostly good, but it bothers me, and I can't fully explain why." LLMs default to clean takes.
+- **Dated, era-bound references.** Slang, memes, or in-jokes that map to a specific year and subculture. Models lag by a year or more.
+- **First-person editorial choices the writer can defend.** If the writer can explain *why* they made a particular cut or used a particular word, that's a strong human signal.
+- **Variety in sentence length.** Real writing alternates short and long. AI writing tends toward an even, mid-length cadence.
+- **Genuine asides, parentheticals, or self-corrections.** "(I keep wanting to say 'almost' here, but it really was certain.)" Models rarely interrupt themselves like this.
+- **Text written before November 30, 2022.** That is the date of ChatGPT's public launch. Anything older is, with very rare exceptions, not AI-written.
+
+
+### LLM Idiolects (which model wrote this?)
+
+Each model family writes a little differently. Useful when triaging a suspected passage:
+
+- **ChatGPT (GPT-4 / 4o / 5):** Most prevalent. Heavy on broader-context throat-clearing, "evolving landscape," media-coverage padding. Most likely to leave reference-markup artifacts. Most likely to use em dashes (suppressed in 5.1, but they still leak through).
+- **Grok:** Similar to ChatGPT in verbosity and broader-context framing. Leaves `<grok_card>` tags and `referrer=grok.com`.
+- **Gemini (1.5–3 Pro):** More concise than ChatGPT. Avoids curly quotes by default. Less prone to "broader trends" puffery.
+- **Claude (3.5–Opus 4.x):** Concise. Avoids curly quotes by default. Tends toward direct expository style; less likely to insert "It's important to note that..." but can fall into rule-of-three and inline-header lists when doing structured output.
+
+These are tendencies, not rules. All four families produce all the patterns in this guide given the right prompt.
+
 ---
 
 ## Process