hai/everything-claude-code

Fork 0

mirror of https://github.com/affaan-m/everything-claude-code.git synced 2026-05-13 16:13:03 +08:00

Files

Affaan Mustafa b38992f60e docs: record ECC Tools PR review salvage evidence (#1809 )

2026-05-12 12:02:57 -04:00

20 KiB

Raw Blame History

ECC 2.0 GA Roadmap

This roadmap is the durable repo mirror for the Linear project:

https://linear.app/ecctools/project/ecc-20-ga-harness-os-security-platform-de2a0ecace6f

Linear issue creation is currently blocked by the workspace active issue limit, so the live execution truth is split across:

the Linear project description, status updates, and milestones;
this repo document;
merged PR evidence;
handoffs under ~/.cluster-swarm/handoffs/.

Current Evidence

As of 2026-05-12:

Public GitHub queues are clean across everything-claude-code, agentshield, JARVIS, ECC-Tools, and ECC-website.
npm run harness:audit -- --format json reports 70/70 on current main.
npm run observability:ready reports 14/14 readiness on current main.
docs/architecture/harness-adapter-compliance.md maps Claude Code, Codex, OpenCode, Cursor, Gemini, Zed-adjacent, dmux, Orca, Superset, Ghast, and terminal-only support to install paths, verification commands, and risk notes.
npm run harness:adapters -- --check validates that the public adapter matrix still matches the source data in scripts/lib/harness-adapter-compliance.js.
docs/releases/2.0.0-rc.1/publication-readiness.md gates GitHub release, npm dist-tag, Claude plugin, Codex plugin, OpenCode package, billing, and announcement publication on fresh evidence fields.
docs/legacy-artifact-inventory.md records that no _legacy-documents-* directories exist in the current checkout, inventories the two sibling workspace-level _legacy-documents-* repos as sanitized extraction sources, and classifies legacy-command-shims/ as an opt-in archive/no-action surface.
docs/stale-pr-salvage-ledger.md records stale PR salvage outcomes, skipped PRs, superseded work, and the remaining #1687 translator/manual review tail.
AgentShield PR #53 reduced two context-rule false positives and closed the remaining AgentShield issues.
AgentShield PR #55 added GitHub Action organization-policy enforcement with policy / fail-on-policy inputs, policy-status / policy-violations outputs, job-summary evidence, and policy violation annotations.
AgentShield PR #56 added SARIF/code-scanning output for organization-policy violations as agentshield-policy/* results.
AgentShield PR #57 added OSS, team, enterprise, regulated, high-risk-hooks/MCP, and CI-enforcement policy-pack presets plus agentshield policy init --pack.
AgentShield PR #58 added MCP package provenance fields and report-level counts for npm vs git, pinned vs unpinned, known-good, and registry-backed supply-chain evidence.
AgentShield PR #59 added self-contained HTML executive summaries with risk posture, critical/high priority findings, category exposure, README/API docs, built-CLI smoke validation, and 1,704-test coverage.
AgentShield PR #60 added category-level built-in corpus benchmark output, a readyForRegressionGate signal, terminal --corpus category coverage, README/API docs, built-CLI smoke validation, and 1,705-test coverage.
ECC PR #1778 recovered the useful stale #1413 network/homelab architect-agent concepts.
ECC-Tools PR #26 added cost/token-risk predictive follow-ups for AI routing, Claude/model calls, usage limits, quota, and analysis-budget changes that lack budget, quota, rate-limit, or cost validation evidence.
ECC-Tools PR #27 added the non-blocking ECC Tools / PR Risk Taxonomy check-run for Security Evidence, Harness Drift, Install Manifest Integrity, CI/CD Recommendation, Cost/Token Risk, and Agent Config Review buckets.
ECC-Tools PR #28 added billing readiness audit checks for plan limits, entitlements, Marketplace plan shape, subscription source, seats, and overage metering.
ECC-Tools PR #29 added deterministic Reference Set Validation signals for analyzer, skill, agent, command, and harness-guidance changes that lack eval, golden trace, benchmark, or reference-set evidence.
ECC-Tools PR #30 capped follow-up generation to three new GitHub issues and one draft PR per run, then emits the remaining deterministic findings as a project sync backlog for Linear/status tracking without flooding trackers.
ECC-Tools PR #31 added review follow-up signals to analysis completion comments for outstanding change requests, unresolved or outdated review threads, and review activity without an explicit approval.
ECC-Tools PR #32 added CI failure-mode predictive follow-ups for workflow and test-runner changes that lack failure fixtures, captured logs, troubleshooting notes, dry-run evidence, or regression coverage.
ECC-Tools PR #33 added harness-config quality predictive follow-ups for MCP, plugin, agent, hook, command, and harness config changes that lack harness audit, adapter matrix, cross-harness docs, or compatibility regression evidence.
ECC-Tools PR #34 added skill-quality predictive follow-ups and a Skill Quality PR-risk bucket for skill, agent, command, and rule guidance changes that lack examples, validation, eval, or reference evidence.
ECC-Tools PR #35 added RAG/evaluator predictive follow-ups and a RAG/Evaluator Evidence PR-risk bucket for retrieval, embedding, ranking, and evaluator changes that lack reference-set comparison, golden trace, benchmark, fixture, or eval-run evidence.
ECC-Tools PR #36 added deep-analyzer predictive follow-ups, a Deep Analyzer Evidence PR-risk bucket, and a Linear-ready project sync backlog table for deferred follow-up work.
ECC-Tools PR #37 added a maintained analyzer corpus fixture, corpus validation tests, and co-located analyzer reference-set evidence recognition for future predictive follow-ups and PR-risk taxonomy checks.
ECC-Tools PR #38 added PR review/stale-salvage predictive follow-ups, a PR Review/Salvage Evidence taxonomy bucket, and maintained corpus fixtures for stale-closure salvage, reviewer-thread, and reopen-flow evidence.
ECC PR #1803 landed the contributor Quarkus handling branch after maintainer cleanup, current-main alignment, full local validation, and preservation of the author's removal of incomplete ja-JP and zh-CN Quarkus translations.

Operating Rules

Keep public PRs and issues below 20, with zero as the preferred release-lane target.
Maintain 70/70 harness audit and 14/14 observability readiness after every GA-readiness batch.
Do not publish release or social announcements until the GitHub release, npm/package state, billing state, and plugin submission surfaces are verified with fresh evidence.
Do not treat closed stale PRs as discarded. Pair each cleanup batch with a salvage pass: inspect the closed diffs, port useful compatible work on maintainer-owned branches, and credit the source PR.
Do not create new Linear issues until the active issue limit is cleared.

Prompt-To-Artifact Execution Checklist

This table keeps the long operator prompt tied to concrete artifacts. A status is not complete unless the evidence column exists and has been freshly verified.

Prompt requirement	Required artifact or gate	Current evidence	Status
Keep public PRs below 20	Repo-family PR recheck	0 open PRs across the tracked public repos on 2026-05-12	Complete for this checkpoint
Keep public issues below 20	Repo-family issue recheck	0 open issues across the tracked public repos on 2026-05-12	Complete for this checkpoint
Manage PR discussions	PR review/comment closure plus merge/close state	#1803 was maintainer-edited and merged; no open PRs remain	Complete for this checkpoint
Salvage useful stale work	`docs/stale-pr-salvage-ledger.md`	Ledger records salvaged, superseded, skipped, and manual-review tails	Complete except #1687 manual review
ECC 2.0 preview pack ready	Release docs, quickstart, publication readiness, release notes	`docs/releases/2.0.0-rc.1/` and readiness docs are in-tree	Needs final release evidence
Hermes specialized skills included safely	Hermes setup/import docs and sanitized skill surface	Hermes setup and import playbook are public; secrets stay local	Needs final release review
Naming and rename readiness	Naming matrix across package/plugin/docs/social surfaces	Milestone 1 defines the needed matrix	Not complete
Claude and Codex plugin publication	Contact/submission path with required artifacts and status	Publication readiness gate exists	Not complete
Articles, tweets, and announcements	X thread, LinkedIn copy, GitHub release copy, push checklist	Draft launch collateral exists under rc.1 release docs	Needs URL-backed refresh
AgentShield enterprise iteration	Policy gates, SARIF, packs, provenance, corpus, HTML reports	PRs #53, #55-#60 landed with test evidence	Needs next value decision
ECC Tools next-level app	Billing audit, PR checks, deep analyzer, sync backlog	PRs #26-#38 landed with test evidence	Needs native Linear API sync / broader evaluator corpus
GitGuardian/Dependabot/CodeRabbit-style checks	Non-blocking taxonomy and deterministic follow-up checks	ECC-Tools risk taxonomy check plus follow-up signals landed, including Skill Quality, Deep Analyzer Evidence, Analyzer Corpus Evidence, RAG/Evaluator Evidence, and PR Review/Salvage Evidence	Partially complete
Harness-agnostic learning system	Audit, adapter matrix, observability, traces, promotion loop	Audit/adapters/observability gates exist	Needs evaluation/RAG prototype
Linear roadmap is detailed	Linear project status plus repo mirror	Repo mirror exists; issue creation is blocked by workspace limit	Needs recurring status updates
Flow separation and progress tracking	Flow lanes with owner artifacts and update cadence	This roadmap defines lanes below	Active
Realtime Linear sync	Project updates while issue limit is blocked; issues later	Follow-up flood-control and Linear-ready backlog tables exist in ECC Tools	Needs native API sync once capacity clears
Observability for self-use	Local readiness gate, traces, status snapshots, risk ledger	`npm run observability:ready` reports 14/14	Complete for local gate
Proper release and notifications	Release tag, npm publish state, plugin state, social posts	Publication readiness gate exists	Not complete

Execution Lanes And Tracking Contract

Until Linear issue capacity is cleared, this document is the durable execution ledger and Linear receives project status updates only. When capacity is available, each lane below should become a small set of Linear issues linked back to the repo evidence and merge commits.

Lane	Source of truth	Next tracked artifact	Update cadence
Queue hygiene and salvage	GitHub PR/issue state, salvage ledger	Append ledger entries for any future stale closures	Every cleanup batch
Release and publication	rc.1 release docs, publication readiness doc	Naming matrix and plugin submission/contact checklist	Before any tag
Harness OS core	Audit, adapter matrix, observability docs, `ecc2/`	HUD/session-control acceptance spec	Weekly until GA
Evaluation and RAG	Reference-set validation, harness audit, traces	Read-only evaluator/RAG prototype design	Before deep analyzer expansion
AgentShield enterprise	AgentShield PR evidence and roadmap notes	PDF-export decision or next enterprise signal	After value decision
ECC Tools app	ECC-Tools PR evidence, billing audit, risk taxonomy	Native Linear sync or broader evaluator/RAG corpus slice	Next implementation batch
Linear progress	Linear project status updates and this mirror	Status update with queue/evidence/missing gates	Every significant merge batch

The project status update should always include:

Current public PR and issue counts.
Merged evidence since the previous update.
Deferred or blocked items with the reason.
The next one or two implementation slices.
Any release or publication gate that is still not evidence-backed.

Reference Pressure

The GA roadmap is informed by these reference surfaces:

stablyai/orca and superset-sh/superset for worktree-native parallel agent UX, review loops, and workspace presets.
standardagents/dmux and aidenybai/ghast for terminal/worktree multiplexing, session grouping, and lifecycle hooks.
jarrodwatts/claude-hud for always-visible status, tool, agent, todo, and context telemetry.
stanford-iris-lab/meta-harness and greyhaven-ai/autocontext for evaluation-driven harness improvement, traces, playbooks, and promotion loops.
NousResearch/hermes-agent for operator shell, gateway, memory, skills, and multi-platform command patterns.
anthropics/claude-code, active sst/opencode / anomalyco/opencode, Zed, Codex, Cursor, Gemini, and terminal-only workflows for adapter expectations.

The output of this reference work should be concrete ECC deltas, not a second strategy memo.

Milestones

1. GA Release, Naming, And Plugin Publication Readiness

Target: 2026-05-24

Acceptance:

Naming matrix covers product name, npm package, Claude plugin, Codex plugin, OpenCode package, marketplace metadata, docs, and migration copy.
GitHub release, npm dist-tag, plugin publication, and announcement gates are mapped to fresh command evidence.
Release notes, migration guide, known issues, quickstart, X thread, LinkedIn post, and GitHub release copy are ready but not posted before release URLs exist.
Plugin publication/contact paths for Claude and Codex are documented with owner, required artifacts, and submission status.

2. Harness Adapter Compliance Matrix And Scorecard Onramp

Target: 2026-05-31

Acceptance:

Adapter matrix covers Claude Code, Codex, OpenCode, Cursor, Gemini, Zed-adjacent surfaces, dmux, Orca, Superset, Ghast, and terminal-only use.
Each adapter has supported assets, unsupported surfaces, install path, verification command, and risk notes.
Harness audit remains 70/70 and gains a public onramp that explains how teams use the scorecard.
Reference findings are converted into concrete adapter, observability, or operator-surface deltas.

3. Local Observability, HUD/Status, And Session Control Plane

Target: 2026-06-07

Acceptance:

Observability readiness remains 14/14 and is backed by JSONL traces, status snapshots, risk ledger, and exportable handoff contracts.
HUD/status model covers context, tool calls, active agents, todos, checks, cost, risk, and queue state.
Worktree/session controls cover create, resume, status, stop, diff, PR, merge queue, and conflict queue.
Linear/GitHub/handoff sync model is explicit enough for real-time progress tracking.

4. Self-Improving Harness Evaluation Loop

Target: 2026-06-10

Acceptance:

Scenario specs, verifier contracts, traces, playbooks, and regression gates are documented and at least one read-only prototype exists.
The loop separates observation, proposal, verification, and promotion.
Team and individual setups can be scored and improved without blindly mutating configs.
RAG/reference-set design covers vetted ECC patterns, team history, CI failures, diffs, review outcomes, and harness config quality.

5. AgentShield Enterprise Security Platform

Target: 2026-06-14

Acceptance:

Formal policy schema exists for org baselines, exceptions, owners, expiration, severity, and audit trails.
SARIF/code-scanning output is implemented and tested.
GitHub Action policy gates expose organization policy status and violation counts for branch-protection and CI evidence.
Policy packs are defined for OSS, team, enterprise, regulated, high-risk hooks/MCP, and CI enforcement.
Supply-chain intelligence covers MCP package provenance and has an extension path for npm/pip reputation, CVEs, typosquats, and dependency risk.
Prompt-injection corpus and regression benchmark are ready for continuous rule hardening with category-level coverage and regression-gate output.
Enterprise reports include JSON plus self-contained HTML executive output with risk posture, priority findings, and category exposure.

6. ECC Tools Billing, Deep Analysis, PR Checks, And Linear Sync

Target: 2026-06-21

Acceptance:

Native GitHub Marketplace billing announcement is backed by verified implementation and docs.
Internal billing readiness audit covers plan limits, seats, entitlement mapping, Marketplace plan shape, subscription state, overage hooks, and failure modes.
Deep analyzer covers diff patterns, CI/CD workflows, dependency/security surface, PR review behavior, failure history, harness config, skill quality, dedicated analyzer corpus evidence, co-located analyzer reference sets, PR review/stale-salvage evidence, RAG/evaluator comparison, and reference-set validation.
PR check suite taxonomy includes Security Evidence, Harness Drift, Install Manifest Integrity, CI/CD Recommendation, Cost/Token Risk, Reference Set Validation, Deep Analyzer Evidence, RAG/Evaluator Evidence, PR Review/Salvage Evidence, Skill Quality, and Agent Config Review.
Cost/token-risk predictive follow-ups flag AI routing, model-call, usage, quota, and budget changes when budget evidence is missing.
Reference-set validation follow-ups flag analyzer, skill, agent, command, and harness-guidance changes that lack eval, golden trace, benchmark, or maintained reference-set evidence.
Deep-analyzer follow-ups flag repository, commit, architecture, pattern, and analysis-pipeline changes that lack analyzer corpus, snapshot, fixture, or benchmark evidence.
Analyzer corpus evidence includes maintained fixtures and tests for current architecture and commit analyzer outputs, plus co-located src/analyzers/{fixtures,goldens,reference-sets,benchmarks,evals}/ evidence paths.
RAG/evaluator follow-ups flag retrieval, embedding, ranking, and evaluator changes that lack reference-set comparison, golden trace, benchmark, fixture, or eval-run evidence.
PR review/stale-salvage follow-ups flag review, triage, stale-closure, and pull-request automation changes that lack stale-salvage fixtures, reviewer-thread cases, or reopen-flow reference evidence.
PR analysis comments summarize review follow-up signals for requested changes, unresolved or outdated review threads, and missing approvals.
CI failure-mode predictive follow-ups flag workflow and test-runner changes that lack failure fixtures, captured logs, troubleshooting notes, dry-run evidence, or regression coverage.
Harness-config quality predictive follow-ups flag MCP, plugin, agent, hook, command, and harness config changes that lack audit, adapter matrix, cross-harness doc, or compatibility regression evidence.
Linear sync design maps findings to issues/status without flooding the workspace, and deferred follow-up comments include a Linear-ready table.
Follow-up generation caps automatic GitHub object creation and keeps overflow findings in a copy-ready project sync backlog.

7. Legacy Audit And Stale-Work Salvage Closure

Target: 2026-06-15

Acceptance:

Legacy directories and orphaned handoffs are inventoried.
Each useful artifact is marked landed, Linear/project-tracked, salvage branch, or archive/no-action.
Workspace-level legacy repos are mined only through sanitized maintainer branches; raw context, secrets, personal paths, local settings, and private drafts are never imported wholesale.
Stale PR salvage policy stays in force: close stale/conflicted PRs first, record a salvage ledger item, then port useful compatible content on maintainer branches with attribution.
#1687 localization leftovers are handled only by translator/manual review, not blind cherry-pick.

Next Engineering Slices

Decide whether AgentShield PDF export adds value beyond the merged HTML executive report and corpus benchmark output.
Add native Linear API sync for ECC Tools backlog items after workspace issue capacity clears.
Expand the evaluator/RAG corpus with real cleanup-batch cases as future maintainer-owned examples land.

20 KiB Raw Blame History

ECC 2.0 GA Roadmap

Current Evidence

Operating Rules

Prompt-To-Artifact Execution Checklist

Execution Lanes And Tracking Contract

Reference Pressure

Milestones

1. GA Release, Naming, And Plugin Publication Readiness

2. Harness Adapter Compliance Matrix And Scorecard Onramp

3. Local Observability, HUD/Status, And Session Control Plane

4. Self-Improving Harness Evaluation Loop

5. AgentShield Enterprise Security Platform

6. ECC Tools Billing, Deep Analysis, PR Checks, And Linear Sync

7. Legacy Audit And Stale-Work Salvage Closure

Next Engineering Slices

20 KiB

Raw Blame History