Status: implemented — ingest Worker (telemetry-worker/), client (src/telemetry/),
codegraph telemetry CLI, MCP + installer wiring, TELEMETRY.md. Pending: Worker deploy
codegraph engine (CLI + MCP server + installer)CodeGraph is a local-first tool whose whole pitch is "your code never leaves your machine."
Telemetry has to be designed so that sentence stays true and provable: a short, auditable list
of anonymous counters, documented field-by-field, easy to turn off, and impossible to grow
quietly. This doc is the contract; TELEMETRY.md (repo root, user-facing) restates it and the
implementation must never collect anything not listed there.
Answer, in aggregate and anonymously:
clientInfo.codegraph-pro fork (see "codegraph-pro rule" below).TELEMETRY.md + the Worker allowlist together.fetch, Node ≥18),
zero bytes on stdout (stdio is the MCP protocol channel), zero retries, zero error noise.
Every failure mode is silence.telemetry.getcodegraph.com. The URL
baked into a published npm version POSTs there forever, so the domain must be ours; the
backend behind it can change without a client release.Common envelope on every batch (computed once per process):
| field | example | notes |
|---|---|---|
machine_id |
b3a8… (UUIDv4) |
random, minted at first run, stored in global config |
codegraph_version |
0.9.12 |
from package.json |
os / arch |
darwin / arm64 |
process.platform / process.arch |
node_major |
22 |
major only |
ci |
false |
CI env var present |
schema_version |
1 |
bump when the schema changes |
Event types:
install — one per installer run. Props: targets (e.g. ["claude","cursor"]),
scope (local/global), kind (fresh/upgrade/reinstall), sqlite_backend
(native/wasm).index — one per full index (init/index, not per sync). Props: languages
(names only, e.g. ["typescript","go"]), file_count_bucket (<100, 100-1k, 1k-10k,
10k+), duration_bucket (<10s, 10-60s, 1-5m, 5m+), sqlite_backend.usage_rollup — the workhorse. One event per (day, kind, name) per machine,
aggregated locally. Props: kind (mcp_tool/cli_command), name
(e.g. codegraph_explore, affected), count, error_count, and for MCP:
client_name/client_version from the initialize handshake (src/mcp/session.ts
case 'initialize' — plumbing to add; currently unread).uninstall — one per uninstall/uninit run (churn signal). Props: targets.Volume math: rollups mean monthly events ≈ active machines × active days × distinct tools used (single digits) — the PostHog free tier (1M events/mo) covers tens of thousands of MAU. There is no per-call event by design.
Events are sent as PostHog anonymous events ($process_person_profile: false):
cheaper, no person profiles, unique-machine counts still work on distinct_id =
machine_id. Revisit only if retention tooling demands profiles.
Resolution order (first match wins):
DO_NOT_TRACK=1 (community standard — always honored) → offCODEGRAPH_TELEMETRY=0|1 → forced off/on for that process~/.codegraph/telemetry.json → stored user choiceSurfaces:
consent_source: "installer". Re-runs/upgrades respect the
stored choice and don't re-ask.npx codegraph init, MCP server — no TTY, never prompt): right
before the first actual send (recording only buffers locally and stays silent — so
the installer's explicit toggle always precedes any notice), print one line to
stderr and record first_run_notice_shown:
codegraph collects anonymous usage stats (no code or paths) — "codegraph telemetry off" or CODEGRAPH_TELEMETRY=0 disables. Details: TELEMETRY.mdcodegraph telemetry status|on|off (status prints the machine ID, current
state, and what decided it). Deleting ~/.codegraph/telemetry.json resets everything,
including the machine ID.~/.codegraph/telemetry.json:
{
"enabled": true,
"machine_id": "uuid-v4",
"consent_source": "installer | default-notice | cli",
"first_run_notice_shown": true,
"updated_at": "2026-06-12T00:00:00Z"
}
(~/.codegraph/ is new — today nothing global exists. Coexists by filename if a user ever
indexes $HOME itself, since per-project data lives in <project>/.codegraph/ with fixed
other filenames.)
New module src/telemetry/ (single small module, no deps):
telemetry.count('mcp_tool', name, ok) and move on.~/.codegraph/telemetry-queue.jsonl.
Hard cap ~256 KB; on overflow drop oldest lines. Corrupt buffer → truncate, never throw.process.exit(), where beforeExit never fires
and async sends die, so the design is: a tiny synchronous append on process.on('exit')
persists in-memory deltas (survives process.exit), and actual network sends happen
opportunistically — at the start of long-running commands (init/index/sync/
uninit/upgrade), on an unref'd interval in the long-lived MCP server/daemon, and
awaited-with-cap at the end of install/init/index/uninit where a second is
invisible. Sends POST completed-day rollups + lifecycle events to
https://telemetry.getcodegraph.com/v1/events with AbortSignal.timeout(1500),
fire-and-forget: any response (or none) is final — no retry, no error surfaced. The
queue is claimed by atomic rename so concurrent processes can't double-send (a crashed
sender's claim merges back after an hour). CODEGRAPH_TELEMETRY_DEBUG=1 echoes
payloads to stderr for development.telemetry.getcodegraph.com → small Worker living at telemetry-worker/ in this repo —
public on purpose, so anyone can audit exactly what the endpoint stores. It ships nowhere
with the npm package (excluded by the files allowlist):
POST /v1/events: validate against the event/property allowlist (drop unknown events,
strip unknown props), enforce sane sizes, never forward or log the client IP
(drop CF-Connecting-IP), light per-machine_id rate limit so abuse can't burn the
ingest cap, forward to https://us.i.posthog.com/batch/ with the project key from a
Worker secret. Responds 204 on accept (including events dropped by the allowlist)
and honest 4xx for malformed/oversized/rate-limited requests — the client treats
every response as final and never retries.The private codegraph-pro fork ships inside customer containers whose guarantee is
"nothing leaves the box" — including telemetry. In the fork, telemetry must be default-off
and not enableable by the installer (compile-time constant or stripped module), and the
container sets CODEGRAPH_TELEMETRY=0 as belt-and-braces. This rule lives in the fork's
CLAUDE.md and must survive every upstream merge.
TELEMETRY.md (user-facing field-by-field list) + README section.codegraph telemetry subcommand + MCP clientInfo plumbing.[Unreleased] announcing
telemetry, the default, and every off-switch. Release.Tests (no DB mocking, per repo convention; fetch mocked at globalThis.fetch):
consent precedence (env > config > default), off ⇒ zero fetch calls, rollup aggregation
across days, buffer cap + corrupt-buffer recovery, no-stdout invariant under MCP transport,
flush abort honors timeout, installer toggle persists + re-run doesn't re-ask
(__tests__/installer-targets.test.ts per house rules).
uninstall event: keep or drop? (Honest churn signal vs. "pinging on the way out" optics.)ci: true) because engine-in-CI is a real usage mode — revisit
if it ever dominates volume.