fix(daemon): reap dead-peer clients + inactivity backstop so a daemon can't leak (#692) (#712)

Layer-2 defense-in-depth follow-up to the Windows PPID watchdog fix (#711).
That fix makes an orphaned proxy exit so its socket closes and the daemon
reaps via the refcount + idle timer. This adds two daemon-side safety nets for
the residual case where a socket close is never delivered (a Windows named-pipe
hazard) and a phantom client would otherwise pin the daemon forever:

  - Liveness sweep: a proxy now sends an optional client-hello carrying its pid
    (+ host pid) right after verifying the daemon hello; the daemon periodically
    drops any client whose peer process is dead, re-arming the idle timer.
    Fail-safe and version-pinned: a connection that never sends the hello just
    falls back to the socket-close lifecycle, and the daemon reads the first line
    before the transport does, so a non-hello first line is handed through untouched.
  - Inactivity backstop: the daemon exits after a generous no-traffic window
    (CODEGRAPH_DAEMON_MAX_IDLE_MS, default 30 min) even with clients attached, so
    a phantom client that sends nothing can't keep it alive.
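
As a reading aid, the inactivity-backstop rule described above can be restated as a pure decision function. This is a minimal sketch paraphrasing the timer logic in src/mcp/daemon.ts, not the committed code: `BackstopState`, `shouldExitForInactivity`, and `backstopTickMs` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical names: nothing declared in this block appears in the commit.
interface BackstopState {
  clientCount: number;    // clients the daemon still counts as attached
  lastActivityAt: number; // epoch ms of the last inbound byte from any client
  maxIdleMs: number;      // CODEGRAPH_DAEMON_MAX_IDLE_MS; 0 disables the backstop
  stopping: boolean;      // true once shutdown is already in progress
}

// Decide whether the backstop should take the daemon down at time `now`.
function shouldExitForInactivity(state: BackstopState, now: number): boolean {
  if (state.maxIdleMs <= 0 || state.stopping) return false;
  // With zero clients the ordinary idle timer owns shutdown, not the backstop.
  if (state.clientCount === 0) return false;
  return now - state.lastActivityAt >= state.maxIdleMs;
}

// Poll at most once a minute, or sooner when the window itself is shorter.
function backstopTickMs(maxIdleMs: number): number {
  return Math.min(maxIdleMs, 60_000);
}
```

Keeping the decision separate from the timer makes the boundary cases (exactly at the window, zero clients, backstop disabled) easy to check without waiting on real time.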

Pure helpers (parseClientHelloLine, peerIsDead) are unit-tested; the full
handshake + sweep and the backstop are covered end-to-end in mcp-daemon.test.ts.
Validated on a real Windows 11 VM: the sweep reaps a dead-pid client over a
named pipe and the backstop fires with a client still connected.
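
The sweep hands an `isProcessAlive` probe to `peerIsDead`, but the body of that probe is not part of this diff. One common way to write such a probe in Node.js is to send signal 0 with `process.kill`, which checks for existence without delivering a signal. The sketch below assumes that approach; `isProcessAliveSketch` is a hypothetical name and may differ from what the daemon actually does.

```typescript
// Hypothetical probe: an assumption about how a pid-liveness check could be
// written, not the helper used by the daemon in this commit.
function isProcessAliveSketch(pid: number): boolean {
  // Refuse values that are not plausible process ids. On POSIX, pid 0 and
  // negative pids address process groups, so they are never probed here.
  if (!Number.isInteger(pid) || pid <= 0) return false;
  try {
    process.kill(pid, 0); // signal 0: existence/permission check only
    return true;
  } catch (err) {
    // EPERM means the process exists but we may not signal it: still alive.
    // Any other error (typically ESRCH) is treated as "no such process".
    return (err as { code?: string }).code === 'EPERM';
  }
}
```

Treating EPERM as alive errs on the side of keeping a client, which matches the fail-safe stance of the sweep: a client is reaped only when its peer is positively known to be gone.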

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry 2 weeks ago
parent
commit
80358a84d9
5 changed files with 390 additions and 9 deletions
  1. CHANGELOG.md (+1 -0)
  2. __tests__/daemon-client-liveness.test.ts (+69 -0)
  3. __tests__/mcp-daemon.test.ts (+70 -0)
  4. src/mcp/daemon.ts (+229 -8)
  5. src/mcp/proxy.ts (+21 -1)

+ 1 - 0
CHANGELOG.md

@@ -22,6 +22,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ### Fixes
 
 - On Windows, CodeGraph's background processes no longer pile up without bound and saturate CPU over a long session. When the editor or agent that launched CodeGraph exited, its helper process couldn't tell its parent had gone — Windows reports process lineage differently than macOS and Linux — so the helper kept running, the shared background server never saw the client disconnect, and its idle timer never fired to shut it down. CodeGraph now detects parent-process exit directly on Windows, so helpers and the idle background server wind down promptly, the same as they already did on macOS and Linux. (#692, #576, #680)
+- The shared background server has two further safeguards against ever lingering: it now drops a client the moment it detects that client's process is gone (even if the disconnect arrived uncleanly — a force-quit or a dropped connection that never closed the socket), and it won't stay running indefinitely with clients attached but no activity. Together these guarantee it always winds down, on every platform. (#692)
 - React Native native→JS events now connect through the common `sendEvent(context, "X", body)` wrapper. Many libraries (react-native-device-info and others) wrap the event emitter behind a helper whose `.emit(eventName, …)` takes a *variable*, so the matcher — which looked for `.emit("literal", …)` — missed it; the literal event name actually lives in the wrapper call. Now a native method that fires `sendEvent(…, "batteryLevelChanged", …)` links to the JS `addListener('batteryLevelChanged', …)` handler, so editing the native emitter surfaces the JS subscriber. (React Native)
 - React Native / Expo cross-language bridges are more complete and more precise. An Expo Module method declared with a generic type — Android's `AsyncFunction<Float>("getBatteryLevelAsync")` — is now indexed (the `<Float>` used to defeat the matcher, so every Android Expo method was dropped and a JS call resolved only to the iOS Swift impl). The iOS and Android implementations of the same JS-visible method — both Expo Modules and classic NativeModules (`@ReactMethod` on Android, the matching method on iOS) — are now linked to each other, so a JS call that resolves to one platform still reaches the other and editing either platform's native code surfaces the JS caller. And a `Type.member` static read in native code (e.g. Android's `BatteryManager.EXTRA_LEVEL`) no longer falsely links to a coincidentally same-named class in another language (a web `BatteryManager`) — type references stay within a language family, while genuine cross-language bridges (config→code, JS↔native calls) are unaffected. (React Native, Expo)
 - A TypeScript/JavaScript reference or import no longer gets mis-linked to a same-named class in a native language. In a React Native / Expo repo that has both a TypeScript `TestRunner` type and a Kotlin `TestRunner` class, a TS reference to `TestRunner` — or an `import React` sitting next to a Swift `React` — used to resolve onto the native symbol (the component resolver matched any same-named class regardless of language, and import statements weren't language-checked at all). References and imports now stay within their language family, so they land on the right symbol while genuine cross-language bridges (JS↔native calls, config→code) are untouched. A C/C++ `#include "Foo.h"` likewise no longer resolves to a same-named header from another platform (an iOS Objective-C `Foo.h`). (React Native, Expo, TypeScript, C/C++)

+ 69 - 0
__tests__/daemon-client-liveness.test.ts

@@ -0,0 +1,69 @@
+/**
+ * Unit coverage for the daemon-side client-liveness primitives (#692, Layer 2).
+ *
+ * These back the daemon's defense against a phantom client — one whose process
+ * died without the socket ever signalling close (a Windows named-pipe hazard).
+ * The wire parsing and the liveness decision are pure, so they're tested here;
+ * the full handshake + sweep is exercised end-to-end in `mcp-daemon.test.ts`.
+ */
+import { describe, it, expect } from 'vitest';
+import { parseClientHelloLine, peerIsDead } from '../src/mcp/daemon';
+
+describe('parseClientHelloLine', () => {
+  it('parses a well-formed client-hello', () => {
+    expect(parseClientHelloLine('{"codegraph_client":1,"pid":1234,"hostPid":56}'))
+      .toEqual({ pid: 1234, hostPid: 56 });
+  });
+
+  it('accepts a null host pid and a missing host pid', () => {
+    expect(parseClientHelloLine('{"codegraph_client":1,"pid":1234,"hostPid":null}'))
+      .toEqual({ pid: 1234, hostPid: null });
+    expect(parseClientHelloLine('{"codegraph_client":1,"pid":1234}'))
+      .toEqual({ pid: 1234, hostPid: null });
+  });
+
+  it('returns null for a JSON-RPC message (no marker) so it is treated as data', () => {
+    expect(parseClientHelloLine('{"jsonrpc":"2.0","id":1,"method":"initialize"}')).toBeNull();
+  });
+
+  it('rejects a wrong-typed marker, an unsupported marker value, and a non-numeric pid', () => {
+    expect(parseClientHelloLine('{"codegraph_client":true,"pid":1}')).toBeNull();
+    expect(parseClientHelloLine('{"codegraph_client":2,"pid":1}')).toBeNull();
+    expect(parseClientHelloLine('{"codegraph_client":1,"pid":"1"}')).toBeNull();
+  });
+
+  it('returns null for invalid / empty / non-object JSON', () => {
+    expect(parseClientHelloLine('not json')).toBeNull();
+    expect(parseClientHelloLine('')).toBeNull();
+    expect(parseClientHelloLine('42')).toBeNull();
+    expect(parseClientHelloLine('null')).toBeNull();
+  });
+});
+
+describe('peerIsDead', () => {
+  const aliveAll = () => true;
+  const deadAll = () => false;
+  const deadOnly = (...pids: number[]) => (pid: number) => !pids.includes(pid);
+
+  it('never reaps a client with an unknown pid (no client-hello)', () => {
+    expect(peerIsDead({ pid: null, hostPid: null }, deadAll)).toBe(false);
+    expect(peerIsDead({ pid: null, hostPid: 99 }, deadAll)).toBe(false);
+  });
+
+  it('keeps a client whose proxy is alive', () => {
+    expect(peerIsDead({ pid: 100, hostPid: null }, aliveAll)).toBe(false);
+  });
+
+  it('reaps a client whose proxy process is gone', () => {
+    expect(peerIsDead({ pid: 100, hostPid: null }, deadOnly(100))).toBe(true);
+  });
+
+  it('reaps when the proxy is alive but its host is gone', () => {
+    // proxy 100 alive, host 42 dead
+    expect(peerIsDead({ pid: 100, hostPid: 42 }, deadOnly(42))).toBe(true);
+  });
+
+  it('keeps a client when both proxy and host are alive', () => {
+    expect(peerIsDead({ pid: 100, hostPid: 42 }, aliveAll)).toBe(false);
+  });
+});

+ 70 - 0
__tests__/mcp-daemon.test.ts

@@ -143,6 +143,16 @@ function readLockPid(root: string): number | null {
   } catch { return null; }
 }
 
+/** The socket path the daemon actually bound, as it recorded in its lockfile —
+ *  robust on Windows where a recomputed pipe path can differ from the daemon's. */
+function readLockSocketPath(root: string): string | null {
+  try {
+    const raw = fs.readFileSync(path.join(root, '.codegraph', 'daemon.pid'), 'utf8');
+    const info = JSON.parse(raw);
+    return typeof info.socketPath === 'string' ? info.socketPath : null;
+  } catch { return null; }
+}
+
 function readDaemonLog(root: string): string {
   try { return fs.readFileSync(path.join(root, '.codegraph', 'daemon.log'), 'utf8'); }
   catch { return ''; }
@@ -359,6 +369,66 @@ describe('Shared MCP daemon (issue #411)', () => {
     }
   }, 30000);
 
+  it('reaps a client whose process died without the socket closing (liveness sweep, #692)', async () => {
+    const net = await import('net');
+    // Bring a daemon up via a real proxy (a live client), sweep fast.
+    const env = { CODEGRAPH_DAEMON_IDLE_TIMEOUT_MS: '30000', CODEGRAPH_DAEMON_CLIENT_SWEEP_MS: '300' };
+    const server = spawnServer(tempDir, env);
+    servers.push(server);
+    sendInitialize(server.child, `file://${tempDir}`, 1);
+    await waitFor(() => findResponse(server.stdout, 1), 10000);
+    await waitFor(() => (readLockPid(realRoot) ?? 0) > 0, 8000);
+
+    // Connect a RAW client that announces a dead pid and then never closes its
+    // socket — the exact phantom-client shape the sweep exists to catch. Use the
+    // socket path the daemon recorded in its lockfile (robust on Windows, where
+    // a recomputed named-pipe path can differ from the one the daemon bound).
+    const sockPath = await waitFor(() => readLockSocketPath(realRoot), 8000);
+    const raw = net.createConnection(sockPath);
+    raw.on('error', () => { /* ignore — we destroy it ourselves */ });
+    try {
+      // Consume the daemon hello (one line), then send our client-hello.
+      // Generous timeouts: the unref'd sweep interval can stretch under a busy
+      // event loop (engine init / a loaded CI box), so don't race it tight.
+      await new Promise<void>((resolve, reject) => {
+        let buf = '';
+        const to = setTimeout(() => reject(new Error('no daemon hello within 15s')), 15000);
+        raw.on('data', (c: Buffer) => {
+          buf += c.toString('utf8');
+          if (buf.includes('\n')) { clearTimeout(to); resolve(); }
+        });
+      });
+      raw.write(JSON.stringify({ codegraph_client: 1, pid: 999_999, hostPid: null }) + '\n');
+
+      // The sweep should detect pid 999999 is dead and reap that client.
+      await waitFor(
+        () => readDaemonLog(realRoot).includes('Reaping client with dead peer (pid 999999'),
+        15000,
+      );
+    } finally {
+      raw.destroy();
+    }
+  }, 60000);
+
+  it('exits on the inactivity backstop even while a client stays connected (#692)', async () => {
+    // Backstop short, idle timeout long: with a client connected the idle timer
+    // never arms, so only the inactivity backstop can take the daemon down.
+    const env = { CODEGRAPH_DAEMON_MAX_IDLE_MS: '1500', CODEGRAPH_DAEMON_IDLE_TIMEOUT_MS: '60000' };
+    const server = spawnServer(tempDir, env);
+    servers.push(server);
+    sendInitialize(server.child, `file://${tempDir}`, 1);
+    await waitFor(() => findResponse(server.stdout, 1), 10000);
+    await waitFor(() => (readLockPid(realRoot) ?? 0) > 0, 8000);
+    const daemonPid = readLockPid(realRoot)!;
+    expect(isAlive(daemonPid)).toBe(true);
+
+    // Send nothing further — the client stays connected but idle. The backstop
+    // should fire and the daemon should exit and clean up its lockfile.
+    expect(await waitProcessExit(daemonPid, 12000)).toBe(true);
+    expect(readDaemonLog(realRoot)).toContain('inactivity backstop');
+    expect(fs.existsSync(path.join(realRoot, '.codegraph', 'daemon.pid'))).toBe(false);
+  }, 30000);
+
   it('daemon idle-times-out after the last client disconnects', async () => {
     const env = { CODEGRAPH_DAEMON_IDLE_TIMEOUT_MS: '800', CODEGRAPH_PPID_POLL_MS: '200' };
     const server = spawnServer(tempDir, env);

+ 229 - 8
src/mcp/daemon.ts

@@ -58,6 +58,22 @@ import { CodeGraphPackageVersion } from './version';
 /** Default idle linger after the last client disconnects. */
 const DEFAULT_IDLE_TIMEOUT_MS = 300_000;
 
+/**
+ * Hard ceiling on how long the daemon stays up with clients connected but no
+ * inbound traffic. A backstop (#692): if a client's socket-close is never
+ * delivered (a Windows named-pipe hazard) it stays counted forever and the
+ * normal idle timer — which only arms at zero clients — never fires. A phantom
+ * client sends no traffic, so bounding on inactivity reaps the daemon anyway.
+ * Set generously so a real but momentarily-idle session isn't reaped mid-use.
+ */
+const DEFAULT_MAX_IDLE_MS = 1_800_000; // 30 min
+
+/** How often the daemon sweeps connected clients for a dead peer process (#692). */
+const DEFAULT_CLIENT_SWEEP_MS = 30_000;
+
+/** How long the daemon waits for the optional client-hello before proceeding without it. */
+const CLIENT_HELLO_TIMEOUT_MS = 3_000;
+
 /** Bytes/parse-window for an oversized hello line — bounded against a malicious peer. */
 const MAX_HELLO_LINE_BYTES = 4096;
 
@@ -74,6 +90,21 @@ export interface DaemonHello {
   protocol: 1;       // bump if the hello shape changes
 }
 
+/**
+ * Optional reverse-handshake line a proxy sends right after it verifies the
+ * daemon hello, carrying its own pids so the daemon can reap the client if its
+ * process dies WITHOUT the socket ever signalling close (the Windows named-pipe
+ * hazard behind #692). Entirely optional and fail-safe: a connection that never
+ * sends it (a legacy/direct client) just falls back to the socket-close
+ * lifecycle. The `codegraph_client` marker is what tells it apart from the
+ * client's first JSON-RPC message.
+ */
+export interface DaemonClientHello {
+  codegraph_client: 1;
+  pid: number;             // the proxy process's own pid
+  hostPid: number | null;  // the MCP host pid (past any launcher shim), if known
+}
+
 export interface DaemonStartResult {
   /** Always-non-null for a successfully-started daemon. */
   socketPath: string;
@@ -95,8 +126,14 @@ export interface DaemonStartResult {
 export class Daemon {
   private server: net.Server | null = null;
   private clients = new Set<MCPSession>();
+  /** Per-client peer pids from the optional client-hello, for the liveness sweep. */
+  private clientPeers = new Map<MCPSession, { pid: number | null; hostPid: number | null }>();
   private idleTimer: NodeJS.Timeout | null = null;
   private idleTimeoutMs: number;
+  private maxIdleMs: number;
+  private lastActivityAt = Date.now();
+  private maxIdleTimer: NodeJS.Timeout | null = null;
+  private clientSweepTimer: NodeJS.Timeout | null = null;
   private engine: MCPEngine;
   private stopping = false;
   private socketPath: string;
@@ -104,11 +141,12 @@ export class Daemon {
 
   constructor(
     private projectRoot: string,
-    opts: { idleTimeoutMs?: number } = {},
+    opts: { idleTimeoutMs?: number; maxIdleMs?: number } = {},
   ) {
     this.socketPath = getDaemonSocketPath(projectRoot);
     this.pidPath = getDaemonPidPath(projectRoot);
     this.idleTimeoutMs = opts.idleTimeoutMs ?? resolveIdleTimeoutMs();
+    this.maxIdleMs = opts.maxIdleMs ?? resolveMaxIdleMs();
     this.engine = new MCPEngine();
     this.engine.setProjectPathHint(projectRoot);
   }
@@ -161,6 +199,7 @@ export class Daemon {
     // ever connects to (e.g. spawned then abandoned because the launcher died)
     // doesn't pin resources forever.
     this.armIdleTimer();
+    this.startLivenessTimers();
 
     process.on('SIGINT', () => this.stop('SIGINT'));
     process.on('SIGTERM', () => this.stop('SIGTERM'));
@@ -186,6 +225,14 @@ export class Daemon {
       clearTimeout(this.idleTimer);
       this.idleTimer = null;
     }
+    if (this.maxIdleTimer) {
+      clearInterval(this.maxIdleTimer);
+      this.maxIdleTimer = null;
+    }
+    if (this.clientSweepTimer) {
+      clearInterval(this.clientSweepTimer);
+      this.clientSweepTimer = null;
+    }
     process.stderr.write(`[CodeGraph daemon] Shutting down (${reason}; clients=${this.clients.size}).\n`);
     for (const session of [...this.clients]) {
       try { session.stop(); } catch { /* best-effort */ }
@@ -214,18 +261,30 @@ export class Daemon {
     };
     socket.write(JSON.stringify(hello) + '\n');
 
-    const transport = new SocketTransport(socket);
-    const session = new MCPSession(transport, this.engine, {
-      explicitProjectPath: this.projectRoot,
+    // Read the optional client-hello (proxy → daemon) to learn the client's
+    // peer pids, then hand the socket to the session. Fail-safe: any problem —
+    // timeout, a non-hello first line, an early close — yields null pids and we
+    // fall back to the socket-close lifecycle exactly as before (#692).
+    void readClientHello(socket).then((peers) => {
+      const transport = new SocketTransport(socket);
+      const session = new MCPSession(transport, this.engine, {
+        explicitProjectPath: this.projectRoot,
+      });
+      transport.onClose(() => this.dropClient(session));
+      this.clients.add(session);
+      this.clientPeers.set(session, peers);
+      this.disarmIdleTimer();
+      session.start();
+      // Observe inbound bytes purely to feed the inactivity backstop — a second
+      // 'data' listener that reads nothing, added AFTER the transport's so the
+      // unshifted client-hello tail reaches the transport intact.
+      socket.on('data', () => { this.lastActivityAt = Date.now(); });
     });
-    transport.onClose(() => this.dropClient(session));
-    this.clients.add(session);
-    this.disarmIdleTimer();
-    session.start();
   }
 
   private dropClient(session: MCPSession): void {
     if (!this.clients.delete(session)) return;
+    this.clientPeers.delete(session);
     if (this.clients.size === 0) this.armIdleTimer();
   }
 
@@ -255,6 +314,58 @@ export class Daemon {
     this.idleTimer = null;
   }
 
+  /**
+   * Defense-in-depth against a daemon that outlives its clients (#692), for the
+   * cases the refcount + idle timer miss because a socket close never arrives:
+   *   - **Inactivity backstop:** exit if no inbound traffic for `maxIdleMs` while
+   *     clients are still (nominally) connected. A phantom client sends nothing,
+   *     so it can't pin the daemon past this window.
+   *   - **Liveness sweep:** drop any client whose peer process has died (per the
+   *     client-hello pids), which re-arms the idle timer once the last real
+   *     client is gone. Catches a dead peer within one sweep instead of waiting
+   *     out the whole backstop.
+   * Both timers are unref'd — the listening server keeps the loop alive, and
+   * neither should hold it open on its own.
+   */
+  private startLivenessTimers(): void {
+    if (this.maxIdleMs > 0) {
+      const tick = Math.min(this.maxIdleMs, 60_000);
+      this.maxIdleTimer = setInterval(() => {
+        if (this.stopping || this.clients.size === 0) return; // idle timer owns the no-client case
+        if (Date.now() - this.lastActivityAt >= this.maxIdleMs) {
+          void this.stop('inactivity backstop');
+        }
+      }, tick);
+      this.maxIdleTimer.unref?.();
+    }
+    const sweepMs = resolveClientSweepMs();
+    if (sweepMs > 0) {
+      this.clientSweepTimer = setInterval(() => this.reapDeadClients(isProcessAlive), sweepMs);
+      this.clientSweepTimer.unref?.();
+    }
+  }
+
+  /**
+   * Drop every connected client whose peer process is gone. Returns the count
+   * reaped. `isAlive` is injected for testing. Clients with unknown pids (no
+   * client-hello) are skipped — they rely on the socket-close path.
+   */
+  reapDeadClients(isAlive: (pid: number) => boolean): number {
+    if (this.clients.size === 0) return 0;
+    let reaped = 0;
+    for (const session of [...this.clients]) {
+      const peers = this.clientPeers.get(session);
+      if (!peers || !peerIsDead(peers, isAlive)) continue;
+      process.stderr.write(
+        `[CodeGraph daemon] Reaping client with dead peer (pid ${peers.pid}); clients=${this.clients.size - 1}.\n`
+      );
+      try { session.stop(); } catch { /* best-effort */ }
+      this.dropClient(session);
+      reaped++;
+    }
+    return reaped;
+  }
+
   private cleanupLockfile(): void {
     try {
       if (fs.existsSync(this.pidPath)) {
@@ -393,5 +504,115 @@ function resolveIdleTimeoutMs(): number {
   return Math.floor(parsed);
 }
 
+function resolveMaxIdleMs(): number {
+  const raw = process.env.CODEGRAPH_DAEMON_MAX_IDLE_MS;
+  if (raw === undefined || raw === '') return DEFAULT_MAX_IDLE_MS;
+  const parsed = Number(raw);
+  if (!Number.isFinite(parsed) || parsed < 0) return DEFAULT_MAX_IDLE_MS;
+  return Math.floor(parsed); // 0 disables the backstop
+}
+
+function resolveClientSweepMs(): number {
+  const raw = process.env.CODEGRAPH_DAEMON_CLIENT_SWEEP_MS;
+  if (raw === undefined || raw === '') return DEFAULT_CLIENT_SWEEP_MS;
+  const parsed = Number(raw);
+  if (!Number.isFinite(parsed) || parsed < 0) return DEFAULT_CLIENT_SWEEP_MS;
+  return Math.floor(parsed); // 0 disables the sweep
+}
+
+/**
+ * Parse one client-hello line. Returns the peer pids if `line` is a well-formed
+ * client-hello (carries the `codegraph_client` marker), or null otherwise — in
+ * which case the caller treats the bytes as ordinary JSON-RPC.
+ */
+export function parseClientHelloLine(
+  line: string,
+): { pid: number; hostPid: number | null } | null {
+  let parsed: unknown;
+  try { parsed = JSON.parse(line); } catch { return null; }
+  if (!parsed || typeof parsed !== 'object') return null;
+  const o = parsed as Record<string, unknown>;
+  if (o.codegraph_client !== 1 || typeof o.pid !== 'number') return null;
+  return { pid: o.pid, hostPid: typeof o.hostPid === 'number' ? o.hostPid : null };
+}
+
+/**
+ * A client's peer is dead when its proxy process is gone, or when its known
+ * host process is gone. Unknown pid (no client-hello) is never "dead" on this
+ * basis — those clients rely on the socket-close path. Exported for testing.
+ */
+export function peerIsDead(
+  peers: { pid: number | null; hostPid: number | null },
+  isAlive: (pid: number) => boolean,
+): boolean {
+  if (peers.pid === null) return false;
+  if (!isAlive(peers.pid)) return true;
+  if (peers.hostPid !== null && !isAlive(peers.hostPid)) return true;
+  return false;
+}
+
+/**
+ * Read the optional client-hello line a proxy sends after the daemon hello.
+ * Always resolves (never rejects) — fail-safe by design, since every connection
+ * funnels through here. Resolves with the peer pids when the first line is a
+ * client-hello; otherwise resolves with null pids and unshifts the already-read
+ * bytes so the transport parses them as the client's first JSON-RPC message(s).
+ * Accumulates as Buffers and splits on the newline byte so a UTF-8 sequence
+ * straddling a chunk boundary in the unshifted tail is never corrupted.
+ */
+function readClientHello(
+  socket: net.Socket,
+): Promise<{ pid: number | null; hostPid: number | null }> {
+  return new Promise((resolve) => {
+    let chunks: Buffer[] = [];
+    let total = 0;
+    let settled = false;
+    const finish = (
+      peers: { pid: number | null; hostPid: number | null },
+      putBack?: Buffer,
+    ) => {
+      if (settled) return;
+      settled = true;
+      socket.removeListener('data', onData);
+      socket.removeListener('error', onEnd);
+      socket.removeListener('close', onEnd);
+      clearTimeout(timer);
+      if (putBack && putBack.length > 0 && !socket.destroyed) {
+        try { socket.unshift(putBack); } catch { /* stream already gone */ }
+      }
+      resolve(peers);
+    };
+    const onData = (chunk: Buffer | string) => {
+      const buf = typeof chunk === 'string' ? Buffer.from(chunk, 'utf8') : chunk;
+      chunks.push(buf);
+      total += buf.length;
+      const all = chunks.length === 1 ? buf : Buffer.concat(chunks, total);
+      const nl = all.indexOf(0x0a); // '\n'
+      if (nl === -1) {
+        // No newline yet. If it's already too long to be a hello, it isn't one —
+        // hand the bytes back as data; otherwise keep accumulating.
+        if (total > MAX_HELLO_LINE_BYTES) finish({ pid: null, hostPid: null }, all);
+        else chunks = [all];
+        return;
+      }
+      const peers = parseClientHelloLine(all.subarray(0, nl).toString('utf8'));
+      if (peers) {
+        const tail = all.subarray(nl + 1);
+        finish(peers, tail.length > 0 ? tail : undefined);
+      } else {
+        // First line is not a client-hello (legacy/direct client) — hand the
+        // whole buffer back so the transport sees the message verbatim.
+        finish({ pid: null, hostPid: null }, all);
+      }
+    };
+    const onEnd = () => finish({ pid: null, hostPid: null });
+    const timer = setTimeout(() => finish({ pid: null, hostPid: null }), CLIENT_HELLO_TIMEOUT_MS);
+    timer.unref?.();
+    socket.on('data', onData);
+    socket.on('error', onEnd);
+    socket.on('close', onEnd);
+  });
+}
+
 /** Exported for test stubs that need to bound the hello-line read. */
 export { MAX_HELLO_LINE_BYTES };

+ 21 - 1
src/mcp/proxy.ts

@@ -21,7 +21,7 @@
 import * as fs from 'fs';
 import * as net from 'net';
 import { HOST_PPID_ENV } from '../extraction/wasm-runtime-flags';
-import { DaemonHello, MAX_HELLO_LINE_BYTES } from './daemon';
+import { DaemonClientHello, DaemonHello, MAX_HELLO_LINE_BYTES } from './daemon';
 import { supervisionLostReason } from './ppid-watchdog';
 import { CodeGraphPackageVersion } from './version';
 import { SERVER_INFO, PROTOCOL_VERSION } from './session';
@@ -93,6 +93,7 @@ export async function runProxy(
     `[CodeGraph MCP] Attached to shared daemon on ${socketPath} (pid ${hello.pid}, v${hello.codegraph}).\n`
   );
 
+  sendClientHello(socket);
   startPpidWatchdog(socket);
   await pipeUntilClose(socket);
   // Host disconnected (or the daemon went away). The proxy's only job is the
@@ -132,9 +133,28 @@ export async function connectWithHello(
   process.stderr.write(
     `[CodeGraph MCP] Attached to shared daemon on ${socketPath} (pid ${hello.pid}, v${hello.codegraph}).\n`
   );
+  sendClientHello(socket);
   return socket;
 }
 
+/**
+ * Tell the daemon our pids right after we verify its hello, so its liveness
+ * sweep can reap this client if our process dies without the socket ever
+ * signalling close (the Windows named-pipe hazard behind #692). Best-effort:
+ * sent before any piped bytes so it's always the daemon's first line from us,
+ * and a write failure here is harmless (the daemon just falls back to the
+ * socket-close lifecycle). `hostPid` mirrors the PPID watchdog: the threaded
+ * host pid if set, else our own parent (the host, on a no-relaunch bundle).
+ */
+function sendClientHello(socket: net.Socket): void {
+  const clientHello: DaemonClientHello = {
+    codegraph_client: 1,
+    pid: process.pid,
+    hostPid: parseHostPpid(process.env[HOST_PPID_ENV]) ?? process.ppid,
+  };
+  try { socket.write(JSON.stringify(clientHello) + '\n'); } catch { /* best-effort */ }
+}
+
 type JsonRpc = Record<string, unknown>;
 
 /** Dependencies the local-handshake proxy needs, injected by MCPServer (which