fix(index): yield during resolution so the liveness watchdog can't kill a valid large index (#1091) (#1105)

The #850 liveness watchdog SIGKILLs a process whose main-thread event loop
stalls past its window (60s default). It was extended to `index`/`init` in
#999, but reference resolution and callback-edge synthesis run synchronously
on that same thread — so on a large repo a legitimate, in-progress index gets
killed, and users had to disable the watchdog entirely (CODEGRAPH_NO_WATCHDOG=1).

Make the long synchronous spans yield cooperatively so the heartbeat keeps
firing during real work, while a genuinely wedged span (which never reaches a
yield) still trips the watchdog:

- synthesizeCallbackEdges yields between its whole-graph passes, and the heavy
  scanners (closure-collection, event-emitter, JSX-child, object-registry,
  field-channel) yield within their loops;
- batched resolution splits each batch into sub-chunks and yields between them;
- the deferred chained-call and this-member post-passes yield per ref.

Behaviour-preserving — only timing changes; node/edge counts are identical.

Validated end-to-end with the real watchdog armed at the default 60s: the
released build is SIGKILLed partway through indexing the Swift compiler (27k
files, ~1.1M edges) and the TypeScript compiler, while the fixed build indexes
both to completion.
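
The distinction described above (slow-but-progressing work survives, a true wedge is still killed) can be illustrated with a small virtual-time model. This is only a sketch under stated assumptions: `simulateSpan`, `SpanOutcome`, and the unit costs are hypothetical names and numbers invented for illustration, no real timers or watchdog are involved, and the model assumes a heartbeat lands every time the span yields.

```typescript
// Virtual-time model of the watchdog/yield interaction. No real timers are
// used; every name here is hypothetical and exists only for illustration.
interface SpanOutcome {
  killed: boolean;       // true if the modelled watchdog fired
  finishedUnits: number; // units of work completed before the span ended
}

function simulateSpan(
  unitCostsMs: number[],        // synchronous cost of each unit of work
  yieldBudgetMs: number | null, // null models a span that never yields
  watchdogWindowMs = 60000,     // default window quoted in the commit message
): SpanOutcome {
  let now = 0;
  let lastBeat = 0;  // last time the event loop turned, i.e. a heartbeat fired
  let lastYield = 0;
  let finishedUnits = 0;

  for (const cost of unitCostsMs) {
    now += cost; // the unit runs to completion without returning to the loop
    if (now - lastBeat > watchdogWindowMs) {
      return { killed: true, finishedUnits };
    }
    finishedUnits++;
    // Checkpoint after the unit: yield only once the budget has elapsed.
    if (yieldBudgetMs !== null && now - lastYield >= yieldBudgetMs) {
      lastYield = now;
      lastBeat = now; // assumption: yielding lets the heartbeat timer run
    }
  }
  return { killed: false, finishedUnits };
}

// 300 units of 500ms each: 150s of legitimate, steadily progressing work.
const slowButFine = new Array<number>(300).fill(500);
const withoutYield = simulateSpan(slowButFine, null);
const withYield = simulateSpan(slowButFine, 250);
// A unit that never returns never reaches its checkpoint.
const wedged = simulateSpan([500, 500, Infinity, 500], 250);
console.log(withoutYield, withYield, wedged);
```

In this model the non-yielding span is killed once its accumulated stall exceeds the window, the yielding span completes all units, and the span containing a unit that never returns is still killed, because the yield sits after the unit rather than inside it.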

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry 11 hours ago
parent
commit
ed39233f1a

+ 1 - 0
CHANGELOG.md

@@ -11,6 +11,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ### Fixes
 
+- Indexing a large project no longer gets killed partway through with a "Main thread unresponsive — killing the wedged process" message. The safety watchdog that stops a genuinely stuck index was mistaking slow-but-normal work for a hang: on a big repo, linking up references and cross-file relationships can legitimately run for a while, and that work now regularly yields so the watchdog can tell real progress from a true stall. Projects that previously failed to finish `codegraph init` / `codegraph index` (and had to fall back to `CODEGRAPH_NO_WATCHDOG=1`) now complete normally, while a genuinely hung process is still caught. Thanks @zmcrazy, @YoungLiao, and @GeeLab-Mob for the reports. (#1091)
 - On Windows, a console window no longer briefly flashes when CodeGraph runs as a background MCP server. When the npm launcher started the bundled runtime — which happens every time an editor starts the server or reconnects after the daemon idles out — and during its self-heal step that extracts a missing platform bundle, Windows would pop up a black console (conhost) window for a moment. Both now launch hidden, matching how the daemon already behaved; the browser-open step of `codegraph login` was hardened the same way. Thanks @luoyerr for the report and root-cause analysis. (#1092)
 - C++ forward declarations no longer crowd out the real class definition. A `class Foo;` forward declaration — common in large C++ and Unreal Engine codebases, where a heavily used class is forward-declared across dozens of headers — was indexed as its own class node every time it appeared. So exploring that class returned mostly forward-declaration sites, and could even pick one of them as the representative for blast-radius, burying the actual definition and its members and callers. Bodiless forward declarations are now skipped for C and C++, exactly as forward-declared structs and enums already were, so only the real definition is indexed. Languages where a class with no body is a complete definition — such as Kotlin's `class Empty` and Scala — are unaffected. Thanks @luoyxy for the report and root-cause analysis. (#1093)
 - C++ methods that return a reference, and user-defined conversion operators, are now indexed under their correct names. An inline getter like `const FGameplayTagContainer& GetActiveTags() const` — everywhere in Unreal Engine headers — was indexed as `& GetActiveTags() const` instead of `GetActiveTags`, and a conversion operator like `operator EALSMovementState() const` kept its trailing `() const` instead of reading `operator EALSMovementState`. In both cases the garbled name meant you couldn't find the symbol by name and its callers weren't linked. Both now read cleanly, matching how pointer-returning and value-returning methods already worked. (#1096)

+ 82 - 0
__tests__/cooperative-yield.test.ts

@@ -0,0 +1,82 @@
+/**
+ * Cooperative-yield helper + the async contract of the main-thread resolution
+ * spans it protects (#1091).
+ *
+ * Background: reference resolution and callback-edge synthesis run on the
+ * indexer's MAIN thread. The #850 liveness watchdog SIGKILLs the process when
+ * that thread doesn't turn its event loop within the timeout window, because its
+ * heartbeat is a timer on that same thread. On a large repo those spans run for
+ * minutes, so they must yield periodically or a VALID index gets killed. These
+ * tests pin (a) the yielder's budget semantics and (b) that the long spans
+ * stayed `async` so they CAN yield — a revert to a synchronous version would
+ * reintroduce the wedge, and the AsyncFunction assertions fail loudly if so.
+ */
+import { describe, it, expect } from 'vitest';
+import { createYielder, DEFAULT_YIELD_BUDGET_MS } from '../src/resolution/cooperative-yield';
+import { synthesizeCallbackEdges } from '../src/resolution/callback-synthesizer';
+import { ReferenceResolver } from '../src/resolution/index';
+
+/**
+ * A `setImmediate` callback runs in the check phase — AFTER the microtask queue
+ * drains. So if `await maybeYield()` did NOT cross a macrotask boundary (it was
+ * under budget and returned a synchronously-resolved promise), a `setImmediate`
+ * scheduled just before it has NOT fired yet. If it DID yield (awaited its own
+ * `setImmediate`), the earlier `setImmediate` — queued first, FIFO — has fired.
+ * This makes "did it yield?" a deterministic, non-timing assertion.
+ */
+async function yieldedDuring(maybeYield: () => Promise<void>): Promise<boolean> {
+  let macrotaskRan = false;
+  setImmediate(() => { macrotaskRan = true; });
+  await maybeYield();
+  return macrotaskRan;
+}
+
+describe('createYielder', () => {
+  it('does not yield while under the time budget', async () => {
+    const maybeYield = createYielder(100_000); // effectively never elapses in-test
+    expect(await yieldedDuring(maybeYield)).toBe(false);
+    // Repeated calls stay coalesced — still no macrotask boundary crossed.
+    expect(await yieldedDuring(maybeYield)).toBe(false);
+  });
+
+  it('yields once the budget has elapsed, then resets', async () => {
+    const maybeYield = createYielder(0); // 0ms budget → every checkpoint yields
+    expect(await yieldedDuring(maybeYield)).toBe(true);
+    // Reset: the next checkpoint also yields (budget is measured from the last
+    // yield, and 0ms has "elapsed" again).
+    expect(await yieldedDuring(maybeYield)).toBe(true);
+  });
+
+  it('yields after real wall-clock exceeds the budget', async () => {
+    const maybeYield = createYielder(20);
+    expect(await yieldedDuring(maybeYield)).toBe(false); // fresh — under budget
+    const until = Date.now() + 35;
+    while (Date.now() < until) { /* busy-wait past the 20ms budget */ }
+    expect(await yieldedDuring(maybeYield)).toBe(true);
+  });
+
+  it('exposes a sane default budget under the watchdog heartbeat cadence', () => {
+    // The watchdog writes a heartbeat every ~1s at minimum; the yield budget
+    // must be well under that so a beat can always land between yields.
+    expect(DEFAULT_YIELD_BUDGET_MS).toBeGreaterThan(0);
+    expect(DEFAULT_YIELD_BUDGET_MS).toBeLessThan(1000);
+  });
+});
+
+describe('main-thread resolution spans stay async (so they can yield) — #1091', () => {
+  it('synthesizeCallbackEdges is an async function', () => {
+    expect(synthesizeCallbackEdges.constructor.name).toBe('AsyncFunction');
+  });
+
+  it('resolveChainedCallsViaConformance is an async function', () => {
+    expect(ReferenceResolver.prototype.resolveChainedCallsViaConformance.constructor.name).toBe('AsyncFunction');
+  });
+
+  it('resolveDeferredThisMemberRefs is an async function', () => {
+    expect(ReferenceResolver.prototype.resolveDeferredThisMemberRefs.constructor.name).toBe('AsyncFunction');
+  });
+
+  it('resolveAndPersistBatched is an async function', () => {
+    expect(ReferenceResolver.prototype.resolveAndPersistBatched.constructor.name).toBe('AsyncFunction');
+  });
+});

+ 10 - 0
src/bin/command-supervision.ts

@@ -17,6 +17,16 @@
  * available to a one-shot command. Best-effort and self-disabling: a missing
  * watchdog never blocks the command from running. Both honour the same env
  * switches as `serve` (`CODEGRAPH_NO_WATCHDOG`, `CODEGRAPH_PPID_POLL_MS=0`).
+ *
+ * Unlike the daemon — whose main thread only does fast, bounded work — the
+ * `index`/`init` path runs reference resolution and dynamic-edge synthesis
+ * SYNCHRONOUSLY on this thread, and on a large repo that is legitimately many
+ * seconds of work. So those spans yield cooperatively to the event loop
+ * (`src/resolution/cooperative-yield.ts`) to keep the heartbeat alive; without
+ * that the watchdog would SIGKILL a valid, in-progress index (#1091). The
+ * distinction it must preserve — kill a TRUE wedge, spare slow-but-progressing
+ * work — is exactly what cooperative yielding buys: a genuinely stuck span never
+ * reaches its next yield, so it still trips the timeout.
  */
 import { installMainThreadWatchdog } from '../mcp/liveness-watchdog';
 import { supervisionLostReason, parsePpidPollMs, parseHostPpid } from '../mcp/ppid-watchdog';

+ 4 - 4
src/index.ts

@@ -473,10 +473,10 @@ export class CodeGraph {
           // receiver conforms to (protocol-extension / inherited / default-
           // interface). Needs the implements/extends edges the main pass just
           // built, so it runs after resolution (#750).
-          this.resolver.resolveChainedCallsViaConformance();
+          await this.resolver.resolveChainedCallsViaConformance();
           // Same lifecycle for `this.<member>` callback registrations whose
           // member is inherited from a supertype (#808).
-          this.resolver.resolveDeferredThisMemberRefs();
+          await this.resolver.resolveDeferredThisMemberRefs();
         }
 
         // Refresh planner stats + checkpoint the WAL after bulk writes.
@@ -597,10 +597,10 @@ export class CodeGraph {
           // Second pass: chained calls whose method lives on a supertype the
           // receiver conforms to (protocol-extension / inherited). Needs the
           // implements/extends edges built above (#750).
-          this.resolver.resolveChainedCallsViaConformance();
+          await this.resolver.resolveChainedCallsViaConformance();
           // Same lifecycle for `this.<member>` callback registrations whose
           // member is inherited from a supertype (#808).
-          this.resolver.resolveDeferredThisMemberRefs();
+          await this.resolver.resolveDeferredThisMemberRefs();
         }
 
         // Refresh planner stats + checkpoint the WAL after bulk writes.

+ 63 - 38
src/resolution/callback-synthesizer.ts

@@ -28,6 +28,7 @@ import { isGeneratedFile } from '../extraction/generated-detection';
 import { stripCommentsForRegex } from './strip-comments';
 import { cFnPointerDispatchEdges } from './c-fnptr-synthesizer';
 import { goframeRouteEdges } from './goframe-synthesizer';
+import { createYielder, type MaybeYield } from './cooperative-yield';
 
 const REGISTRAR_NAME = /^(on[A-Z]\w*|subscribe|addListener|addEventListener|register|watch|listen|addCallback)$/;
 const DISPATCHER_NAME = /(emit|trigger|notify|dispatch|fire|publish|flush)/i;
@@ -139,11 +140,13 @@ function* methodAndFunctionNodes(queries: QueryBuilder): IterableIterator<Node>
 }
 
 /** Phase 1: field-backed observer channels (registrar/dispatcher share a store). */
-function fieldChannelEdges(queries: QueryBuilder, ctx: ResolutionContext): Edge[] {
+async function fieldChannelEdges(queries: QueryBuilder, ctx: ResolutionContext, onYield: MaybeYield): Promise<Edge[]> {
   const registrars: Array<{ node: Node; field: string }> = [];
   const dispatchers: Array<{ node: Node; field: string }> = [];
 
+  let scanned = 0;
   for (const m of methodAndFunctionNodes(queries)) {
+    if ((++scanned & 255) === 0) await onYield(); // #1091: yield mid-scan on huge graphs
     const isReg = REGISTRAR_NAME.test(m.name);
     const isDisp = DISPATCHER_NAME.test(m.name);
     if (!isReg && !isDisp) continue;
@@ -210,7 +213,7 @@ function fieldChannelEdges(queries: QueryBuilder, ctx: ResolutionContext): Edge[
  * subclass `DataRequest.validate`), bounded by a fan-out cap so a generic field
  * name shared across unrelated classes can't fan out into noise.
  */
-function closureCollectionEdges(queries: QueryBuilder, ctx: ResolutionContext): Edge[] {
+async function closureCollectionEdges(queries: QueryBuilder, ctx: ResolutionContext, onYield: MaybeYield): Promise<Edge[]> {
   const dispatchers = new Map<string, Array<{ node: Node; line: number }>>(); // field → dispatcher methods + forEach line
   const registrars = new Map<string, Array<{ node: Node; line: number }>>();   // field → registrar methods + append line
 
@@ -221,7 +224,12 @@ function closureCollectionEdges(queries: QueryBuilder, ctx: ResolutionContext):
     registrars.set(field, arr);
   };
 
+  // Slices EVERY method/function's source (no cheap name-gate), so on a repo
+  // with a huge file this is the heaviest synthesis pass — yield mid-scan so it
+  // can't wedge the #850 watchdog on its own (#1091).
+  let scanned = 0;
   for (const m of methodAndFunctionNodes(queries)) {
+    if ((++scanned & 127) === 0) await onYield();
     const content = ctx.readFile(m.filePath);
     const src = content && sliceLines(content, m.startLine, m.endLine);
     if (!src) continue;
@@ -271,11 +279,13 @@ function closureCollectionEdges(queries: QueryBuilder, ctx: ResolutionContext):
 }
 
 /** Phase 2: string-keyed EventEmitter channels (on('e', fn) ↔ emit('e')). */
-function eventEmitterEdges(ctx: ResolutionContext): Edge[] {
+async function eventEmitterEdges(ctx: ResolutionContext, onYield: MaybeYield): Promise<Edge[]> {
   const emitsByEvent = new Map<string, Set<string>>();          // event → dispatcher node ids
   const handlersByEvent = new Map<string, Map<string, string>>(); // event → handler id → registration site (file:line)
 
+  let scanned = 0;
   for (const file of ctx.getAllFiles()) {
+    if ((++scanned & 255) === 0) await onYield(); // #1091: yield mid-scan on huge graphs
     const content = ctx.readFile(file);
     if (!content) continue;
     const hasEmit = content.includes('.emit(') || content.includes('.fire(') || content.includes('.dispatchEvent(');
@@ -842,11 +852,13 @@ function goGrpcStubImplEdges(queries: QueryBuilder): Edge[] {
  * component/function/class node — TS generics like `Array<Foo>` resolve to a type
  * (or nothing) and are dropped.
  */
-function reactJsxChildEdges(ctx: ResolutionContext): Edge[] {
+async function reactJsxChildEdges(ctx: ResolutionContext, onYield: MaybeYield): Promise<Edge[]> {
   const edges: Edge[] = [];
   const seen = new Set<string>();
   const PARENT_KINDS = new Set(['method', 'function', 'component']);
+  let scanned = 0;
   for (const file of ctx.getAllFiles()) {
+    if ((++scanned & 255) === 0) await onYield(); // #1091: yield mid-scan on huge graphs
     const content = ctx.readFile(file);
     if (!content || (!content.includes('</') && !content.includes('/>'))) continue; // JSX-file gate
     const parents = ctx.getNodesInFile(file).filter((n) => PARENT_KINDS.has(n.kind));
@@ -1792,10 +1804,12 @@ function resolveRegistryHandler(ctx: ResolutionContext, name: string, chained: s
   return cands.find((n) => n.kind === 'method') ?? null;
 }
 
-function objectRegistryEdges(ctx: ResolutionContext): Edge[] {
+async function objectRegistryEdges(ctx: ResolutionContext, onYield: MaybeYield): Promise<Edge[]> {
   const edges: Edge[] = [];
   const seen = new Set<string>();
+  let scanned = 0;
   for (const file of ctx.getAllFiles()) {
+    if ((++scanned & 255) === 0) await onYield(); // #1091: yield mid-scan on huge graphs
     if (!REGISTRY_JS_EXT.test(file)) continue;
     const content = ctx.readFile(file);
     // Cheap pre-filter: a computed member access BY NAME (`ident[ident`) — the dispatch shape.
@@ -2658,7 +2672,16 @@ function laravelEventEdges(ctx: ResolutionContext): Edge[] {
  * Sidekiq Worker.perform_async → #perform + Laravel event(new X) → listener handle).
  * Returns the count added. Never throws into indexing — callers wrap in try/catch.
  */
-export function synthesizeCallbackEdges(queries: QueryBuilder, ctx: ResolutionContext): number {
+export async function synthesizeCallbackEdges(queries: QueryBuilder, ctx: ResolutionContext): Promise<number> {
+  // Each sub-pass below is a whole-graph scan, and there are ~30 of them, all
+  // running synchronously on the indexer's main thread. Their AGGREGATE can run
+  // for well over a minute on a large repo — long enough for the #850 liveness
+  // watchdog to SIGKILL the process mid-index (#1091), since its heartbeat lives
+  // on this same thread. Yield between passes so the heartbeat can fire; a pass
+  // that itself hangs (a real wedge) never reaches the next yield, so the
+  // watchdog still catches that. See ./cooperative-yield.
+  const yieldToLoop = createYielder();
+
   // Cross-file Go method→type `contains` edges must be synthesized AND persisted
   // FIRST: a method declared in a different file from its receiver type is
   // otherwise orphaned from the struct, and goImplementsEdges (next) derives a
@@ -2666,6 +2689,7 @@ export function synthesizeCallbackEdges(queries: QueryBuilder, ctx: ResolutionCo
   // under-count the interfaces a cross-file struct satisfies. (#583)
   const goMethodContains = goCrossFileMethodContainsEdges(queries);
   if (goMethodContains.length > 0) queries.insertEdges(goMethodContains);
+  await yieldToLoop();
 
   // Go implicit `implements` edges must be synthesized AND persisted next: the
   // interface-dispatch bridge below reads `implements` edges from the DB, and
@@ -2673,38 +2697,39 @@ export function synthesizeCallbackEdges(queries: QueryBuilder, ctx: ResolutionCo
   // edges from extraction, so they don't need this pre-pass.)
   const goImpl = goImplementsEdges(queries);
   if (goImpl.length > 0) queries.insertEdges(goImpl);
-
-  const fieldEdges = fieldChannelEdges(queries, ctx);
-  const closureCollEdges = closureCollectionEdges(queries, ctx);
-  const emitterEdges = eventEmitterEdges(ctx);
-  const renderEdges = reactRenderEdges(queries, ctx);
-  const jsxEdges = reactJsxChildEdges(ctx);
-  const vueEdges = vueTemplateEdges(ctx);
-  const svelteKitEdges = svelteKitLoadEdges(ctx);
-  const pascalEdges = pascalFormEdges(ctx);
-  const flutterEdges = flutterBuildEdges(queries, ctx);
-  const cppEdges = cppOverrideEdges(queries);
-  const ifaceEdges = interfaceOverrideEdges(queries);
-  const kotlinExpectActual = kotlinExpectActualEdges(queries);
-  const goGrpcEdges = goGrpcStubImplEdges(queries);
-  const rnEventEdgesList = rnEventEdges(ctx);
-  const fabricNativeEdges = fabricNativeImplEdges(ctx);
-  const expoXPlatEdges = expoCrossPlatformEdges(queries);
-  const rnXPlatEdges = rnCrossPlatformEdges(queries);
-  const mybatisEdges = mybatisJavaXmlEdges(queries);
-  const ginEdges = ginMiddlewareChainEdges(queries, ctx);
-  const thunkEdges = reduxThunkEdges(queries, ctx);
-  const registryEdges = objectRegistryEdges(ctx);
-  const rtkEdges = rtkQueryEdges(queries, ctx);
-  const piniaEdges = piniaStoreEdges(ctx);
-  const vuexEdges = vuexDispatchEdges(ctx);
-  const celeryEdges = celeryDispatchEdges(ctx);
-  const springEdges = springEventEdges(ctx);
-  const mediatrEdges = mediatrDispatchEdges(ctx);
-  const sidekiqEdges = sidekiqDispatchEdges(ctx);
-  const laravelEdges = laravelEventEdges(ctx);
-  const cFnPtrEdges = cFnPointerDispatchEdges(queries, ctx);
-  const goframeEdges = goframeRouteEdges(ctx);
+  await yieldToLoop();
+
+  const fieldEdges = await fieldChannelEdges(queries, ctx, yieldToLoop); await yieldToLoop();
+  const closureCollEdges = await closureCollectionEdges(queries, ctx, yieldToLoop); await yieldToLoop();
+  const emitterEdges = await eventEmitterEdges(ctx, yieldToLoop); await yieldToLoop();
+  const renderEdges = reactRenderEdges(queries, ctx); await yieldToLoop();
+  const jsxEdges = await reactJsxChildEdges(ctx, yieldToLoop); await yieldToLoop();
+  const vueEdges = vueTemplateEdges(ctx); await yieldToLoop();
+  const svelteKitEdges = svelteKitLoadEdges(ctx); await yieldToLoop();
+  const pascalEdges = pascalFormEdges(ctx); await yieldToLoop();
+  const flutterEdges = flutterBuildEdges(queries, ctx); await yieldToLoop();
+  const cppEdges = cppOverrideEdges(queries); await yieldToLoop();
+  const ifaceEdges = interfaceOverrideEdges(queries); await yieldToLoop();
+  const kotlinExpectActual = kotlinExpectActualEdges(queries); await yieldToLoop();
+  const goGrpcEdges = goGrpcStubImplEdges(queries); await yieldToLoop();
+  const rnEventEdgesList = rnEventEdges(ctx); await yieldToLoop();
+  const fabricNativeEdges = fabricNativeImplEdges(ctx); await yieldToLoop();
+  const expoXPlatEdges = expoCrossPlatformEdges(queries); await yieldToLoop();
+  const rnXPlatEdges = rnCrossPlatformEdges(queries); await yieldToLoop();
+  const mybatisEdges = mybatisJavaXmlEdges(queries); await yieldToLoop();
+  const ginEdges = ginMiddlewareChainEdges(queries, ctx); await yieldToLoop();
+  const thunkEdges = reduxThunkEdges(queries, ctx); await yieldToLoop();
+  const registryEdges = await objectRegistryEdges(ctx, yieldToLoop); await yieldToLoop();
+  const rtkEdges = rtkQueryEdges(queries, ctx); await yieldToLoop();
+  const piniaEdges = piniaStoreEdges(ctx); await yieldToLoop();
+  const vuexEdges = vuexDispatchEdges(ctx); await yieldToLoop();
+  const celeryEdges = celeryDispatchEdges(ctx); await yieldToLoop();
+  const springEdges = springEventEdges(ctx); await yieldToLoop();
+  const mediatrEdges = mediatrDispatchEdges(ctx); await yieldToLoop();
+  const sidekiqEdges = sidekiqDispatchEdges(ctx); await yieldToLoop();
+  const laravelEdges = laravelEventEdges(ctx); await yieldToLoop();
+  const cFnPtrEdges = cFnPointerDispatchEdges(queries, ctx); await yieldToLoop();
+  const goframeEdges = goframeRouteEdges(ctx); await yieldToLoop();
 
   const merged: Edge[] = [];
   const seen = new Set<string>();

+ 41 - 0
src/resolution/cooperative-yield.ts

@@ -0,0 +1,41 @@
+/**
+ * Cooperative yielding for long synchronous resolution spans.
+ *
+ * Reference resolution and callback-edge synthesis run on the indexer's MAIN
+ * thread — unlike parsing, which is off-thread in the parse worker. The #850
+ * liveness watchdog (armed on `index`/`init` since #999) SIGKILLs the process
+ * when that thread doesn't turn its event loop for the timeout window (default
+ * 60s), because its heartbeat is a `setInterval` on that same thread. On a large
+ * repo, resolving refs + synthesizing dynamic-dispatch edges legitimately runs
+ * for minutes, so a span that never yields starves the heartbeat and the
+ * watchdog kills a VALID, in-progress index — the exact symptom of #1091 (the
+ * progress bar freezes at wherever it last rendered — 88% / 100% — then the
+ * process is killed).
+ *
+ * `createYielder` returns a `maybeYield()` that yields (via `setImmediate`) only
+ * once at least `budgetMs` of wall-clock has elapsed since the last yield, so
+ * fast repos pay essentially nothing while slow ones give the heartbeat a
+ * regular window to fire. Call it at natural boundaries in a long loop (between
+ * batches, between synthesis passes).
+ *
+ * This does NOT weaken the watchdog. A genuinely wedged loop — an infinite or
+ * non-terminating span, the case the watchdog exists to catch — never reaches a
+ * yield point, so the heartbeat still stops and the SIGKILL still fires. We only
+ * stop killing work that is demonstrably making progress.
+ */
+
+/** Yield when at least `budgetMs` of wall-clock has passed since the last yield. */
+export type MaybeYield = () => Promise<void>;
+
+/** Default budget: well under the watchdog's minimum heartbeat cadence (~1s), so
+ * a heartbeat byte always has a chance to land between yields. */
+export const DEFAULT_YIELD_BUDGET_MS = 250;
+
+export function createYielder(budgetMs: number = DEFAULT_YIELD_BUDGET_MS): MaybeYield {
+  let last = Date.now();
+  return async function maybeYield(): Promise<void> {
+    if (Date.now() - last < budgetMs) return;
+    await new Promise<void>((resolve) => setImmediate(resolve));
+    last = Date.now();
+  };
+}
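
A rough usage sketch of the pattern the scanners in this commit follow: a cheap counter gate (`(++scanned & mask) === 0`) decides how often the loop even consults the yielder, and the yielder's time budget decides whether a checkpoint actually yields. Everything below is an illustrative assumption rather than the project's code: `createBudgetedYielder`, `checkpointCount`, and `scanAll` are hypothetical names, and `setTimeout(resolve, 0)` is swapped in for the `setImmediate` used in the file above so the sketch does not depend on Node typings.

```typescript
// Hypothetical stand-ins; only the `(++n & mask) === 0` gate and the
// budget-then-yield idea mirror the diff.
type MaybeYield = () => Promise<void>;

function createBudgetedYielder(budgetMs: number): MaybeYield {
  let last = Date.now();
  return async () => {
    if (Date.now() - last < budgetMs) return; // under budget: no macrotask hop
    // Stand-in for setImmediate: hand control back to the event loop once.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
    last = Date.now();
  };
}

/** Checkpoints produced over `total` items by a (2^k - 1) mask such as 127 or 255. */
function checkpointCount(total: number, mask: number): number {
  return Math.floor(total / (mask + 1));
}

/** Visit every item, consulting the yielder on every (mask + 1)-th iteration. */
async function scanAll<T>(
  items: readonly T[],
  visit: (item: T) => void,
  maybeYield: MaybeYield,
  mask = 255,
): Promise<number> {
  let scanned = 0;
  let checkpoints = 0;
  for (const item of items) {
    if ((++scanned & mask) === 0) {
      checkpoints++;
      await maybeYield();
    }
    visit(item);
  }
  return checkpoints;
}
```

The two gates are complementary. The mask keeps the per-iteration overhead to an increment and a bitwise AND, while the time budget keeps a fast scan from hopping through the event loop more often than it needs to.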

+ 58 - 4
src/resolution/index.ts

@@ -20,6 +20,7 @@ import { matchReference, matchFunctionRef, matchDottedCallChain, matchScopedCall
 import { resolveViaImport, resolveJvmImport, extractImportMappings, extractReExports, loadCppIncludeDirs, isPhpIncludePathRef } from './import-resolver';
 import { detectFrameworks } from './frameworks';
 import { synthesizeCallbackEdges } from './callback-synthesizer';
+import { createYielder, type MaybeYield } from './cooperative-yield';
 import { loadProjectAliases, type AliasMap } from './path-aliases';
 import { loadGoModule, type GoModule } from './go-module';
 import { loadWorkspacePackages, type WorkspacePackages } from './workspace-packages';
@@ -874,7 +875,7 @@ export class ReferenceResolver {
    * (re-resolving an already-resolved ref is a no-op since it's been deleted).
    * Returns the number of newly-created edges.
    */
-  resolveChainedCallsViaConformance(): number {
+  async resolveChainedCallsViaConformance(): Promise<number> {
     const deferred = this.deferredChainRefs;
     this.deferredChainRefs = [];
     if (deferred.length === 0) return 0;
@@ -883,6 +884,10 @@ export class ReferenceResolver {
     // these refs were deferred). matchDottedCallChain now resolves a method on a
     // supertype via context.getSupertypes -> resolveMethodOnType's conformance walk.
     this.clearCaches();
+    // This post-pass runs synchronously on the indexer's main thread; yield
+    // periodically so the #850 liveness watchdog heartbeat can fire on a repo
+    // with many deferred chained calls (#1091).
+    const maybeYield = createYielder();
     const resolved: ResolvedRef[] = [];
     for (const ref of deferred) {
       // `::`-receiver languages (Rust) split on `::` (matchScopedCallChain);
@@ -892,6 +897,7 @@ export class ReferenceResolver {
         : matchDottedCallChain(ref, this.context);
       const match = this.gateLanguage(chainMatch, ref);
       if (match) resolved.push(match);
+      await maybeYield();
     }
     if (resolved.length === 0) return 0;
 
@@ -903,6 +909,42 @@ export class ReferenceResolver {
     return edges.length;
   }
 
+  /**
+   * Resolve one batch in smaller sub-chunks, yielding to the event loop between
+   * them so the #850 liveness heartbeat can fire on a slow/dense batch (#1091).
+   * Behaviourally identical to a single `resolveAll(batch)`: `warmCaches()` is
+   * idempotent (guarded) and `resolveOne` is independent per ref, so splitting
+   * and re-merging changes only timing, never which edges get created. Falls
+   * through to a plain `resolveAll` when the batch is already small.
+   */
+  private async resolveBatchYielding(
+    batch: UnresolvedReference[],
+    maybeYield: MaybeYield,
+    subChunkSize: number = 500
+  ): Promise<ResolutionResult> {
+    if (batch.length <= subChunkSize) return this.resolveAll(batch);
+
+    const resolved: ResolvedRef[] = [];
+    const unresolved: UnresolvedRef[] = [];
+    const byMethod: Record<string, number> = {};
+    let total = 0;
+    let resolvedCount = 0;
+    let unresolvedCount = 0;
+    for (let i = 0; i < batch.length; i += subChunkSize) {
+      const chunk = this.resolveAll(batch.slice(i, i + subChunkSize));
+      for (const r of chunk.resolved) resolved.push(r);
+      for (const u of chunk.unresolved) unresolved.push(u);
+      total += chunk.stats.total;
+      resolvedCount += chunk.stats.resolved;
+      unresolvedCount += chunk.stats.unresolved;
+      for (const [m, c] of Object.entries(chunk.stats.byMethod)) {
+        byMethod[m] = (byMethod[m] || 0) + c;
+      }
+      await maybeYield();
+    }
+    return { resolved, unresolved, stats: { total, resolved: resolvedCount, unresolved: unresolvedCount, byMethod } };
+  }
+
   /**
    * Resolve and persist in batches to keep memory bounded.
    * Processes unresolved references in chunks, persisting edges and cleaning
@@ -914,6 +956,14 @@ export class ReferenceResolver {
   ): Promise<ResolutionResult> {
     this.warmCaches();
 
+    // Resolution runs on the indexer's MAIN thread, and the #850 liveness
+    // watchdog SIGKILLs a process whose event loop stalls past its window (60s
+    // by default). A single dense batch's resolveAll — or the synthesis pass
+    // below — can exceed that on a large repo, killing a VALID in-progress index
+    // (#1091). A shared yielder lets both give the watchdog heartbeat a regular
+    // window to fire; see ./cooperative-yield.
+    const maybeYield = createYielder();
+
     const total = this.queries.getUnresolvedReferencesCount();
     let processed = 0;
     const aggregateStats = {
@@ -930,7 +980,7 @@ export class ReferenceResolver {
       const batch = this.queries.getUnresolvedReferencesBatch(0, batchSize);
       if (batch.length === 0) break;
 
-      const result = this.resolveAll(batch);
+      const result = await this.resolveBatchYielding(batch, maybeYield);
 
       // Persist edges immediately
       const edges = this.createEdges(result.resolved);
@@ -998,7 +1048,7 @@ export class ReferenceResolver {
     // callbacks) that static parsing leaves out. Best-effort — never fail the
     // index on it. See docs/design/callback-edge-synthesis.md.
     try {
-      aggregateStats.byMethod['callback-synthesis'] = synthesizeCallbackEdges(this.queries, this.context);
+      aggregateStats.byMethod['callback-synthesis'] = await synthesizeCallbackEdges(this.queries, this.context);
     } catch {
       // synthesis is additive and optional; ignore failures
     }
@@ -1257,14 +1307,18 @@ export class ReferenceResolver {
    * Mirrors resolveChainedCallsViaConformance's lifecycle. Returns the number
    * of newly-created edges.
    */
-  resolveDeferredThisMemberRefs(): number {
+  async resolveDeferredThisMemberRefs(): Promise<number> {
     const deferred = this.deferredThisMemberRefs;
     this.deferredThisMemberRefs = [];
     if (deferred.length === 0) return 0;
 
     this.clearCaches();
+    // Synchronous main-thread post-pass with a per-ref supertype BFS — yield
+    // periodically so the #850 liveness watchdog heartbeat can fire (#1091).
+    const maybeYield = createYielder();
     const resolved: ResolvedRef[] = [];
     for (const ref of deferred) {
+      await maybeYield();
       const member = ref.referenceName.slice('this.'.length);
       const fromNode = this.queries.getNodeById(ref.fromNodeId);
       if (!fromNode || !member) continue;
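
The doc comment on `resolveBatchYielding` above argues that splitting a batch into sub-chunks changes only timing, provided each ref is resolved independently. A toy check of that argument, assuming simplified shapes: `ChunkResult`, `resolveAllSketch`, and `resolveInSubChunks` are hypothetical stand-ins, not the project's `ResolutionResult` or `resolveAll`, and the classification rules are invented for illustration.

```typescript
// Hypothetical, simplified shapes; the real result type has more fields.
interface ChunkResult {
  resolved: string[];
  unresolved: string[];
  byMethod: Record<string, number>;
}

// Stand-in resolver: each ref is decided independently of its neighbours,
// which is the property that makes splitting safe.
function resolveAllSketch(refs: readonly string[]): ChunkResult {
  const out: ChunkResult = { resolved: [], unresolved: [], byMethod: {} };
  for (const ref of refs) {
    if (ref.charAt(0) === "?") {
      out.unresolved.push(ref);
      continue;
    }
    const method = ref.indexOf(".") >= 0 ? "qualified" : "exact";
    out.resolved.push(ref);
    out.byMethod[method] = (out.byMethod[method] || 0) + 1;
  }
  return out;
}

// Resolve in sub-chunks and merge, preserving order and per-method counts.
function resolveInSubChunks(refs: readonly string[], subChunkSize: number): ChunkResult {
  if (refs.length <= subChunkSize) return resolveAllSketch(refs);
  const merged: ChunkResult = { resolved: [], unresolved: [], byMethod: {} };
  for (let i = 0; i < refs.length; i += subChunkSize) {
    const chunk = resolveAllSketch(refs.slice(i, i + subChunkSize));
    for (const r of chunk.resolved) merged.resolved.push(r);
    for (const u of chunk.unresolved) merged.unresolved.push(u);
    for (const m of Object.keys(chunk.byMethod)) {
      merged.byMethod[m] = (merged.byMethod[m] || 0) + chunk.byMethod[m];
    }
    // The method in the diff awaits maybeYield() at this point; it is left
    // out here so the sketch stays synchronous.
  }
  return merged;
}
```

If the resolver kept cross-ref state that depended on batch boundaries, the merged result could differ from the single-batch result, which is why the comment in the diff leans on `warmCaches()` being idempotent and `resolveOne` being independent per ref.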