
fix(cli): stop rendering raw FTS score as nonsensical percentages in `query` (#1045) (#1052)

`codegraph query` printed `(score * 100)%` next to each hit, but `score`
is an unbounded BM25/FTS relevance magnitude (relative-ranking only), so
it rendered as values like "12042%" that made the output look broken.
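
For illustration, a minimal sketch of why the old rendering went wrong. The helper name `oldScoreLabel` and the sample scores are hypothetical; only the `(score * 100).toFixed(0)` expression comes from the removed line in `src/bin/codegraph.ts`.

```typescript
// Hypothetical helper that mirrors the removed formatting expression.
function oldScoreLabel(score: number): string {
  return `(${(score * 100).toFixed(0)}%)`;
}

// A score that happens to fall in roughly 0-1 looks plausible...
console.log(oldScoreLabel(0.87)); // (87%)

// ...but an unbounded FTS magnitude (illustrative value) does not.
console.log(oldScoreLabel(120.42)); // (12042%)
```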

Results already arrive in rank order, so drop the score from the
human-readable output entirely — matching the MCP search tool, which
shows no score. The raw `score` stays in `--json` for programmatic
sorting/thresholding. Also corrects the SearchResult.score doc comment,
which wrongly claimed a 0-1 range. Adds an end-to-end regression test.
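
A rough sketch of how a script might use the raw `score` from `--json`, assuming scores are positive and that each entry nests a `node.name` field as `SearchResult` suggests. Only `score` being a number is confirmed by the regression test. The `Hit` type, the `keepRelative` helper, and the sample data are hypothetical. Because the score is only meaningful relative to other hits in the same result set, the cutoff is expressed as a fraction of the best score rather than as an absolute value.

```typescript
// Assumed minimal shape of one entry in `codegraph query --json` output.
interface Hit {
  score: number;
  node: { name: string };
}

// Keep hits scoring at least `ratio` of the best hit, best match first.
// Assumes positive scores where higher means more relevant.
function keepRelative(hits: Hit[], ratio: number): Hit[] {
  if (hits.length === 0) return [];
  const best = Math.max(...hits.map((h) => h.score));
  return hits
    .filter((h) => h.score >= best * ratio)
    .sort((a, b) => b.score - a.score);
}

// Invented sample data, not real CLI output.
const sample: Hit[] = JSON.parse(
  '[{"score":41.7,"node":{"name":"parseTokenExpiry"}},' +
    '{"score":120.42,"node":{"name":"parseToken"}},' +
    '{"score":3.1,"node":{"name":"tokenize"}}]',
);

const kept = keepRelative(sample, 0.3).map((h) => h.node.name);
console.log(kept); // [ 'parseToken', 'parseTokenExpiry' ]
```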

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Colby Mchenry 10 hours ago
parent
commit
4b58a6d2d0
4 changed files with 78 additions and 4 deletions
  1. CHANGELOG.md (+1 -0)
  2. __tests__/cli-query-command.test.ts (+65 -0)
  3. src/bin/codegraph.ts (+6 -3)
  4. src/types.ts (+6 -1)

+ 1 - 0
CHANGELOG.md

@@ -18,6 +18,7 @@ and adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 - C++ objects constructed on the stack — `Calculator calc(0)` or `Widget w{1, 2}` — now record that the enclosing function instantiates that class, the same as heap construction (`new Calculator(0)`) already did. Previously only the `new` form was tracked, so a function that built objects with the ordinary stack syntax looked like it didn't construct them and the dependency was missing from impact/callers. Thanks @Dshuishui for the report. (#1035)
 - The graph no longer stores duplicate copies of the same relationship. The same dependency between the same two symbols at the same spot could be recorded more than once, which inflated edge counts and let callers/impact results list a relationship twice. Each relationship is now stored exactly once, and existing projects are de-duplicated automatically the next time CodeGraph opens them. Thanks @inth3shadows for the detailed report. (#1034)
 - `codegraph node` can now read a file from the command line. File-read mode — pass `-f`/`--file` to get a file's source with line numbers plus the files that depend on it, the same output as the `codegraph_node` MCP tool — was rejected with "missing required argument 'name'", because the command always demanded a symbol name even though file mode has none, leaving the feature unreachable from the CLI. The symbol name is now optional: `codegraph node -f src/auth.ts` (or `codegraph node src/auth.ts`) reads the file, `codegraph node parseToken` looks up a symbol, and running it with neither prints a short usage hint instead of a cryptic error. Thanks @jcrabapple for the report. (#1044)
+- `codegraph query` no longer prints meaningless relevance percentages like "12042%" next to each result. The number was a raw full-text search score — useful only for ordering the results, not as a real 0–100% figure — so multiplying it by 100 produced wild values that made the output look broken. Results are already listed best-match first, so the CLI now just shows them in that order with no score, matching what the search tool reports to AI agents. If you script against `codegraph query --json`, the raw `score` is still included for sorting or thresholding. Thanks @jcrabapple for the report. (#1045)
 
 
 ## [1.1.2] - 2026-06-28

+ 65 - 0
__tests__/cli-query-command.test.ts

@@ -0,0 +1,65 @@
+/**
+ * `codegraph query` score rendering (#1045).
+ *
+ * The human-readable output used to print `(score * 100)%` next to each hit,
+ * but `score` is an unbounded BM25/FTS relevance magnitude (relative-ranking
+ * only), so it rendered as nonsensical percentages like "12042%". The CLI now
+ * shows no score — results are already in rank order, matching the MCP search
+ * tool — while `--json` still carries the raw `score` for programmatic use.
+ *
+ * Exercised end-to-end against the built binary.
+ */
+
+import { describe, it, expect, beforeEach, afterEach } from 'vitest';
+import { execFileSync } from 'child_process';
+import * as fs from 'fs';
+import * as os from 'os';
+import * as path from 'path';
+import { CodeGraph } from '../src';
+
+const BIN = path.resolve(__dirname, '../dist/bin/codegraph.js');
+
+function query(cwd: string, extraArgs: string[]): string {
+  return execFileSync(process.execPath, [BIN, 'query', 'parseToken', ...extraArgs, '-p', cwd], {
+    encoding: 'utf-8',
+    env: { ...process.env, CODEGRAPH_NO_DAEMON: '1', CODEGRAPH_WASM_RELAUNCHED: '1' },
+    stdio: ['ignore', 'pipe', 'ignore'], // drop stderr (SQLite experimental warning)
+  });
+}
+
+describe('codegraph query — score rendering (#1045)', () => {
+  let tempDir: string;
+
+  beforeEach(async () => {
+    tempDir = fs.mkdtempSync(path.join(os.tmpdir(), 'codegraph-query-cmd-'));
+    fs.mkdirSync(path.join(tempDir, 'src'));
+    fs.writeFileSync(
+      path.join(tempDir, 'src/auth.ts'),
+      'export function parseToken(t: string){ return t.trim(); }\n' +
+        'export function parseTokenExpiry(t: string){ return Date.parse(t); }\n',
+    );
+    const cg = CodeGraph.initSync(tempDir);
+    await cg.indexAll();
+    cg.close();
+  });
+
+  afterEach(() => {
+    fs.rmSync(tempDir, { recursive: true, force: true });
+  });
+
+  it('human output ranks results without rendering a raw score as a percentage', () => {
+    const out = query(tempDir, ['-l', '5']);
+    // Still finds and lists the symbol...
+    expect(out).toContain('parseToken');
+    // ...but never prints the bogus `(12042%)`-style score.
+    expect(out).not.toMatch(/\(\d+%\)/);
+    expect(out).not.toContain('%');
+  });
+
+  it('--json still carries the raw numeric score for programmatic use', () => {
+    const parsed = JSON.parse(query(tempDir, ['-l', '5', '--json']));
+    expect(Array.isArray(parsed)).toBe(true);
+    expect(parsed.length).toBeGreaterThan(0);
+    expect(typeof parsed[0].score).toBe('number');
+  });
+});

+ 6 - 3
src/bin/codegraph.ts

@@ -964,15 +964,18 @@ program
         } else {
           console.log(chalk.bold(`\nSearch Results for "${search}":\n`));
 
+          // Results arrive already ranked by relevance, so the order conveys
+          // it. We don't print the raw score: it's an unbounded BM25/FTS value
+          // (relative-ranking only), and the old `(score * 100)%` rendered it
+          // as nonsensical percentages like "12042%" (#1045). The MCP search
+          // tool likewise shows no score. Raw `score` stays in --json output.
           for (const result of results) {
             const node = result.node;
             const location = `${node.filePath}:${node.startLine}`;
-            const score = chalk.dim(`(${(result.score * 100).toFixed(0)}%)`);
 
             console.log(
               chalk.cyan(node.kind.padEnd(12)) +
-              chalk.white(node.name) +
-              ' ' + score
+              chalk.white(node.name)
             );
             console.log(chalk.dim(`  ${location}`));
             if (node.signature) {

+ 6 - 1
src/types.ts

@@ -398,7 +398,12 @@ export interface SearchResult {
   /** Matching node */
   node: Node;
 
-  /** Relevance score (0-1) */
+  /**
+   * Relevance score for relative ranking only — higher is more relevant.
+   * NOT normalized and NOT a 0-1 fraction: the FTS path returns an unbounded
+   * BM25 magnitude (often in the tens or hundreds), while the fuzzy/exact
+   * paths return ~0-1. Use it to order results, not as an absolute percentage.
+   */
   score: number;
 
   /** Matched text snippets for highlighting */