diff --git a/docs/guide/benchmarking/2026-06-11-competitor-strength-evidence-matrix.md b/docs/guide/benchmarking/2026-06-11-competitor-strength-evidence-matrix.md
new file mode 100644
index 00000000..1802eaf5
--- /dev/null
+++ b/docs/guide/benchmarking/2026-06-11-competitor-strength-evidence-matrix.md
@@ -0,0 +1,160 @@
+# Competitor-Strength Evidence Matrix - June 11, 2026
+
+Goal: Define a durable competitor-strength matrix so ELF benchmark claims are tied to
+measured evidence classes, typed blockers, and explicit next measurement gates.
+Read this when: You need to decide whether ELF can claim a win, tie, loss, gap, or
+non-claim against a tracked memory, RAG, or graph project.
+Inputs: `docs/guide/benchmarking/2026-06-10-production-adoption-refresh.md`,
+`docs/guide/benchmarking/2026-06-10-real-world-comparison-report.md`,
+`docs/guide/benchmarking/2026-06-10-live-real-world-sweep-report.md`,
+`docs/guide/research/external_memory_improvement_plan.md`,
+`docs/guide/research/research_projects_inventory.md`,
+`apps/elf-eval/fixtures/real_world_external_adapters/memory_projects_manifest.json`,
+and `Makefile.toml`.
+Depends on: `docs/spec/real_world_agent_memory_benchmark_v1.md`,
+`docs/guide/benchmarking/live_baseline_benchmark.md`, and the current external adapter
+manifest.
+Outputs: Human-readable matrix, claim boundaries, scenario next-measurement gates,
+and the machine-readable companion file
+`docs/research/2026-06-11-xy-897-competitor-strength-matrix.json`.
+
+## Decision Boundary
+
+Do not claim that ELF beats, ties, or loses to a competitor unless the named scenario
+is encoded and run at a comparable evidence class.
+
+Current boundary:
+
+- ELF and qmd have full-suite `live_real_world` sweeps, but neither has a full-suite
+  live pass. Each sweep produced 38 jobs with 18 pass, 5 wrong_result, 1 incomplete,
+  2 blocked, and 12 not_encoded.
+- ELF fixture evidence is strong: `cargo make real-world-memory` reports 38 jobs
+  across 11 suites with 36 pass and 2 blocked production-ops operator boundaries.
+  That proves the fixture contract, not live-service parity.
+- qmd is the strongest measured local retrieval-debug comparison: its same-corpus
+  baseline and targeted live retrieval suites pass, but its full-suite live sweep is
+  still a typed non-pass (`wrong_result`).
+- Most other projects are `live_baseline_only` or `research_gate`. They must not be
+  treated as beaten until a comparable scenario is encoded and run.
+- Private-corpus and credentialed production-ops checks remain operator-owned
+  `blocked` states.
+
+## Current Ledger Summary
+
+The current manifest has 21 adapter records across 17 projects. Evidence-class counts:
+1 `fixture_backed`, 6 `live_baseline_only`, 2 `live_real_world`, and 12
+`research_gate`. Overall adapter-status counts: 1 `pass`, 6 `wrong_result`, 1
+`lifecycle_fail`, 6 `blocked`, and 7 `not_encoded`.
+
+## State Taxonomy
+
+This report uses the benchmark's snake_case state names. Hyphenated prose names map
+directly to these states: fixture-backed -> `fixture_backed`,
+live-baseline -> `live_baseline_only`, live-real-world -> `live_real_world`,
+research-gate -> `research_gate`, wrong-result -> `wrong_result`,
+lifecycle-fail -> `lifecycle_fail`, and not-encoded -> `not_encoded`.
+
+| State | Meaning | Claim boundary |
+| --- | --- | --- |
+| `fixture_backed` | Checked-in real-world jobs or fixture responses are scored by the benchmark runner. | Useful for contract coverage, not live runtime proof. |
+| `live_baseline_only` | Docker same-corpus or lifecycle checks ran, but no real-world job suite was scored for that project. | Cannot imply real-world job parity. |
+| `live_real_world` | A runtime or CLI adapter materialized and scored real-world job records. | Can support scenario claims only for the encoded suite statuses. |
+| `research_gate` | Source, setup, resource, retry, or output-contract metadata exists. | Follow-up routing only; not pass evidence. |
+| `blocked` | Safe measurement needs unavailable credentials, private data, setup proof, or external dependency. | Keep typed until the missing input exists. |
+| `unsupported` | Capability is outside the project shape or requires a non-comparable path. | Do not turn into a loss. |
+| `wrong_result` | The system ran but missed expected memory, answer, or evidence terms. | Behavioral non-pass. |
+| `lifecycle_fail` | Retrieval may work, but update/delete/reload/persistence/cold-start behavior fails. | Lifecycle non-pass, not a retrieval win. |
+| `incomplete` | The run did not reach the behavioral check because setup or runtime failed. | Setup/runtime non-pass, not quality evidence. |
+| `not_encoded` | The scenario is not currently covered. | No pass/fail claim is allowed. |
+
+## Project Matrix
+
+| Project | Strongest user-facing scenario | Current evidence | Measured status and proof | Unsupported or blocked status | Required benchmark before ELF claim | Borrow if stronger |
+| --- | --- | --- | --- | --- | --- | --- |
+| ELF | Evidence-linked source-of-truth memory service with real-world fixtures and live retrieval sweeps. | `live_real_world`; supporting `fixture_backed`. | `wrong_result` full live sweep: `cargo make real-world-memory-live-adapters`, `tmp/real-world-memory/live-adapters/elf-report.md`. Fixture contract: `cargo make real-world-memory`, `tmp/real-world-memory/real-world-memory-report.json`. | `blocked`: private manifest and provider credentials; broader live suites remain `wrong_result`, `incomplete`, or `not_encoded`. | Full-suite live pass plus separate private-corpus and credentialed production-ops proof. | Keep borrowing qmd debug knobs, OpenViking staged trajectory, mem0 history, Letta core memory, and graph/RAG navigation. |
+| qmd | Local retrieval-debug workflow with transparent CLI indexing, querying, expansion, fusion, and rerank ergonomics. | `live_real_world`; supporting `live_baseline_only` and `research_gate`. | `wrong_result` full live sweep: `cargo make real-world-memory-live-adapters`, `tmp/real-world-memory/live-adapters/qmd-report.md`; targeted retrieval suites pass. | `not_encoded`: deep profile and non-retrieval live behavior are not encoded; memory_evolution is `wrong_result`. | qmd deep retrieval/debug profile plus full-suite live replay with trace-level diagnostics. | Weighted fusion, rerank explanation, local debug knobs, and command-line replay. |
+| agentmemory | Coding-agent continuity, MCP/REST packaging, viewer workflow, and durable cross-agent memory lifecycle. | `live_baseline_only`. | `lifecycle_fail`: `ELF_BASELINE_PROJECTS=agentmemory cargo make baseline-live-docker`, `tmp/live-baseline/live-baseline-report.json`. | `blocked`: durable cold-start and real-world adapter coverage are missing. | Durable local adapter with update, delete, cold-start reload, work_resume, capture/write-policy, and lifecycle-staleness jobs. | Cross-agent hooks, packaging, continuity scenarios, and viewer affordances. |
+| mem0/OpenMemory | Memory lifecycle, personalization, hosted/OpenMemory UI ergonomics, and optional graph memory. | `live_baseline_only`. | `wrong_result`: `ELF_BASELINE_PROJECTS=mem0 cargo make baseline-live-docker`, `tmp/live-baseline/live-baseline-report.json`. | `not_encoded`: OpenMemory UI, hosted claims, and real-world personalization coverage are not encoded. | Fix local same-corpus result, then encode memory_evolution, personalization, UI readback, and optional graph-context jobs. | Entity-scoped history, lifecycle surfaces, async update ergonomics, and OpenMemory inspection UX. |
+| memsearch | Markdown-first canonical store with rebuildable local index and practical hybrid retrieval. | `live_baseline_only`. | `wrong_result`: `ELF_BASELINE_PROJECTS=memsearch cargo make baseline-live-docker`, `tmp/live-baseline/live-baseline-report.json`. | `incomplete`: source-of-truth and real-world reindex behavior are not cleanly scored. | Fix Docker same-corpus retrieval and reindex/update/delete reload evidence, then score source-of-truth and retrieval-debug jobs. | Canonical markdown store, local reindex clarity, and user-inspectable source files. |
+| OpenViking | Filesystem-like context trajectory, hierarchical retrieval, and staged context loading. | `live_baseline_only`; supporting `research_gate`. | `wrong_result`: `ELF_BASELINE_PROJECTS=OpenViking cargo make baseline-live-docker`, `tmp/live-baseline/live-baseline-report.json`. | `not_encoded`: hierarchical context trajectory is not encoded; same-corpus output still misses expected evidence. | Make evidence-bearing same-corpus output pass, then score staged trajectory and hierarchy expansion. | `viking://`-style context model, trajectory readback, and staged retrieval planning. |
+| claude-mem | Progressive disclosure, automatic capture loop, repository-local lifecycle, and local viewer workflow. | `live_baseline_only`. | `wrong_result`: `ELF_BASELINE_PROJECTS=claude-mem cargo make baseline-live-docker`, `tmp/live-baseline/live-baseline-report.json`. | `not_encoded`: progressive-disclosure real-world jobs are not encoded. | Durable repository-backed work_resume, operator_debugging_ux, capture/write-policy, and progressive-disclosure jobs. | Progressive disclosure, automatic capture review loops, and local viewer/operator comfort. |
+| RAGFlow | Full RAG application workflow with document, chunk, and reference evidence handles. | `research_gate`. | `blocked`: `ELF_RAGFLOW_SMOKE_START=1 ELF_RAGFLOW_SMOKE_ACCEPT_RESOURCE_ENVELOPE=1 cargo make ragflow-docker-smoke`, `tmp/real-world-memory/ragflow-smoke/ragflow-smoke.json`. | `blocked`: Docker resource envelope and adapter output mapping still need proof. | XY-885 tiny Docker evidence-smoke adapter mapping `reference.chunks` to scored evidence. | Document/chunk references, resource-envelope reporting, and RAG app evidence handles. |
+| LightRAG | Lightweight graph/RAG context export with source file-path citation shape. | `research_gate`. | `blocked`: `ELF_LIGHTRAG_CONTEXT_START=1 cargo make lightrag-docker-context-smoke`, `tmp/real-world-memory/lightrag-context/summary.json`. | `blocked`: Docker service setup and context export are not proven. | XY-886 Docker context-export adapter with explicit provider config and source citation mapping. | Context-only query modes, graph-aware retrieval layout, and file-path citation readback. |
+| GraphRAG | GraphRAG indexing, graph summaries, and document/text-unit evidence tables. | `research_gate`. | `blocked`: `ELF_GRAPHRAG_SMOKE_RUN=1 cargo make graphrag-docker-smoke`, `tmp/real-world-memory/graphrag-smoke/summary.json`. | `blocked`: indexing resource envelope and source citation mapping are not proven. | XY-887 cost-bounded Docker adapter over a tiny corpus and scored output tables. | Graph summary artifacts, local/global search separation, and source table evidence mapping. |
+| Graphiti/Zep | Temporal graph memory with current, historical, and future fact validity windows. | `research_gate`. | `blocked`: `ELF_GRAPHITI_ZEP_SMOKE_START=1 ELF_GRAPHITI_ZEP_SMOKE_RUN=1 cargo make graphiti-zep-docker-temporal-smoke`, `tmp/real-world-memory/graphiti-zep-smoke/summary.json`. | `blocked`: Docker graph-store and temporal adapter are not proven. | XY-888 Docker-local temporal graph adapter scoring current/historical fact validity. | Temporal fact windows, invalidation/supersession semantics, and graph fact provenance. |
+| Letta | Core memory blocks versus archival memory with explicit operating-context surfaces. | `research_gate`. | `not_encoded`: `docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json`. | `blocked`: contained evidence export path is not selected. | Select contained export contract, then encode core-vs-archival, personalization, and project-decision jobs. | Core memory block ergonomics, archival separation, and shared operating context readback. |
+| LangGraph | Checkpoint/replay regression workflow and durable state replay for agent runs. | `research_gate`. | `not_encoded`: `docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json`. | `unsupported`: not a standalone memory backend adapter. | Non-goal for direct win/loss until a standalone memory output contract exists; use replay jobs as benchmark infrastructure reference. | Checkpoint replay, deterministic regression, and state-diff evaluation patterns. |
+| nanograph | Typed graph schema and query ergonomics for graph-lite developer experience. | `research_gate`. | `not_encoded`: `docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json`. | `unsupported`: not a memory backend comparison target. | Non-goal for direct win/loss unless a contained memory-backed target emerges; measure ELF graph-lite DX instead. | Typed relation schema, query ergonomics, and small graph developer experience. |
+| llm-wiki | LLM-maintained wiki or knowledge-page workflow with query-save and lint loops. | `research_gate`. | `not_encoded`: `docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json`. | `unsupported`: no live service runtime for adapter proof. | Select contained plugin or instruction harness, then score knowledge pages for citations, unsupported claims, rebuild, and stale-source lint. | Maintained wiki workflows, page lint, query-save loops, and topic-scoped navigation. |
+| gbrain | Operational knowledge brain with compiled_truth pages, timelines, enrichment, and maintenance loops. | `research_gate`. | `not_encoded`: `docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json`. | `blocked`: Docker-local brain repo and database path are missing. | Prove Docker-local repository/database setup, then encode compiled_truth/timeline and operator-continuity jobs. | Compiled truth pages, timeline maintenance, and human-operable knowledge-brain navigation. |
+| graphify | Graph-compressed navigation with `graph.json` and `GRAPH_REPORT` evidence outputs. | `research_gate`. | `blocked`: `cargo make graphify-docker-graph-report-smoke`, `tmp/real-world-memory/graphify-smoke/graphify-smoke.json`. | `blocked`: Docker CLI graph/report generation is not proven; host-global assistant hooks are out of scope. | XY-889 Docker-only graph/report adapter over `graph.json` and `GRAPH_REPORT.md`. | Graph compression, source-location graph reports, and navigation hints for large code or document spaces. |
+
+## Scenario Matrix
+
+| Scenario | Current ELF evidence | Strongest competitor/reference | Current competitor evidence | Next measurement before claim |
+| --- | --- | --- | --- | --- |
+| Retrieval/debug | Fixture retrieval passes; live retrieval passes. | qmd. | qmd live retrieval passes and live baseline passes, but full-suite live status is `wrong_result`. | Run qmd deep profile and ELF/qmd trace-level replay with expansion, fusion, rerank, and candidate-drop diagnostics. |
+| Work resume | Fixture and live work_resume pass. | agentmemory, claude-mem, OpenViking. | agentmemory `lifecycle_fail`, claude-mem `wrong_result`, OpenViking work_resume `not_encoded`. | Encode durable work_resume adapters or keep each blocked with lifecycle/setup evidence. |
+| Project decisions | Fixture and live project_decisions pass. | qmd, Letta. | qmd live project_decisions pass; Letta is `research_gate` `not_encoded`. | Add Letta core/archival decision jobs only after a contained export path exists. |
+| Source-of-truth | Fixture and live trust_source_of_truth pass. | memsearch. | memsearch canonical-store evidence exists, but source-of-truth is `incomplete` and retrieval is `wrong_result`. | Fix memsearch reindex/retrieval evidence and score source-of-truth rebuild/reload jobs. |
+| Temporal/current-vs-historical memory | Fixture memory_evolution passes; live memory_evolution is `wrong_result`. | Graphiti/Zep, mem0/OpenMemory. | Graphiti/Zep is `research_gate` `blocked`; mem0/OpenMemory is `wrong_result`. | Fix ELF/qmd live memory_evolution evidence links and run XY-888. |
+| Consolidation | Fixture consolidation passes; live consolidation is `not_encoded`. | agentmemory, managed-memory references, llm-wiki. | No manifest project has live consolidation scoring. | Run reviewable consolidation proposal generation with source refs, unsupported-claim flags, and audit transitions. |
+| Knowledge pages | Fixture knowledge_compilation passes; live knowledge_compilation is `not_encoded`. | llm-wiki, gbrain, GraphRAG, graphify. | llm-wiki and gbrain are `research_gate` `not_encoded` or `blocked`; GraphRAG and graphify are `blocked`. | Encode live derived-page rebuild/lint scoring and run contained knowledge/RAG adapters only after setup proof. |
+| Operator debugging | Fixture operator_debugging_ux passes; live operator_debugging_ux is `not_encoded`. | qmd, claude-mem, OpenMemory. | qmd has debug strengths but operator_debugging_ux is `not_encoded`; claude-mem and OpenMemory UX are `not_encoded`. | Score trace hydration, stage attribution, raw-SQL avoidance, and repair-action clarity through live artifacts. |
+| Capture/write policy | Fixture capture_integration passes; live capture_integration is `not_encoded`. | agentmemory, claude-mem. | agentmemory capture is `blocked`; claude-mem capture is `not_encoded`. | Run live capture/write-policy jobs proving redaction, exclusion, evidence binding, and no secret leakage. |
+| Production ops | Fixture production_ops has 4 pass and 2 blocked; live production_ops is `incomplete`; production adoption has provider/backfill/restore evidence. | ELF production gate, qmd, RAG/RAGFlow resource gates. | qmd live production_ops is `incomplete`; RAG/resource gates are `research_gate` `blocked`. | Rerun private-corpus and credentialed gates only when operator-owned manifest and credentials exist. |
+| Personalization | Fixture and live personalization pass. | mem0/OpenMemory, Letta. | mem0/OpenMemory and Letta personalization are `not_encoded`. | Encode scoped preference readback for mem0/OpenMemory and Letta before personalization superiority claims. |
+| Context trajectory | ELF has trace direction but no comparable staged trajectory scenario. | OpenViking. | OpenViking setup is pinned, same-corpus retrieval is `wrong_result`, and hierarchy trajectory is `not_encoded`. | Make OpenViking evidence-bearing retrieval pass, then score staged context trajectory outputs. |
+| Core-vs-archival memory | ELF core-block semantics exist in the service contract, but comparative benchmark coverage is not encoded here. | Letta. | Letta is `research_gate` `not_encoded` until contained export proof exists. | Add ELF core-block versus archival-search jobs; compare Letta only after contained export proof. |
+| Graph/RAG navigation | ELF relation context is not enough to claim graph/RAG navigation parity. | RAGFlow, LightRAG, GraphRAG, Graphiti/Zep, graphify. | All named RAG/graph projects are `research_gate` `blocked` or `not_encoded`. | Run XY-885 through XY-889 Docker-contained adapters with evidence-linked outputs. |
+
+## Parallelizable Benchmark Follow-Ups
+
+Workstreams marked `yes` can start in parallel once this matrix lands because the
+claim boundaries are now explicit. Rows marked `no` wait on the listed blocker:
+
+| Workstream | Issue or candidate | Parallelizable | Blocked by | Measurement |
+| --- | --- | --- | --- | --- |
+| qmd deep retrieval/debug profile | New benchmark issue | yes | None after this matrix lands. | Stress profile plus trace-level retrieval-debug artifacts for qmd and ELF. |
+| agentmemory durable lifecycle adapter | `[ELF benchmark P0] Make external adapters lifecycle-durable and fail-typed` | yes | Durable local adapter path selection. | Update, delete, cold-start reload, work_resume, and capture/write-policy jobs. |
+| mem0/OpenMemory local and UI coverage | New adapter repair issue | yes | Comparable local OSS path for UI/readback evidence. | Same-corpus fix plus memory_evolution, personalization, and OpenMemory inspection jobs. |
+| memsearch source-of-truth and reindex coverage | New adapter repair issue | yes | Docker same-corpus retrieval and reindex correctness. | Canonical markdown store, rebuild/reindex, retrieval, update/delete/reload jobs. |
+| OpenViking context trajectory | New benchmark issue after evidence output fix | yes | Evidence-bearing same-corpus retrieval output. | Hierarchical expansion, staged trajectory, and resume/retrieval evidence jobs. |
+| claude-mem progressive disclosure | New adapter issue | yes | Durable repository path and progressive-disclosure output contract. | Work resume, operator debugging, capture/write-policy, and progressive disclosure jobs. |
+| RAGFlow evidence smoke | XY-885 | yes | Resource envelope accepted for tiny Docker smoke. | `reference.chunks` to benchmark evidence mapping. |
+| LightRAG context export | XY-886 | yes | Docker service setup and explicit provider config. | Retrieved context export and source file-path citations. |
+| GraphRAG cost-bounded adapter | XY-887 | yes | Tiny corpus cost/resource envelope. | Document, text-unit, graph-summary, and citation output tables. |
+| Graphiti/Zep temporal graph adapter | XY-888 | yes | Docker-local graph store setup. | Current/historical/future fact validity and evidence ids. |
+| graphify graph report adapter | XY-889 | yes | Docker CLI graph/report generation proof. | `graph.json` and `GRAPH_REPORT` evidence for graph navigation and knowledge synthesis. |
+| Private corpus and credentialed production ops | Operator-owned benchmark gates | no | Sanitized private manifest and routed provider credentials. | Private-corpus retrieval quality and credentialed production-ops evidence. |
+| Letta, LangGraph, nanograph, llm-wiki direct adapters | Research-only until output contract | no | Contained evidence export or non-memory-backend comparability contract. | Run only after each has a comparable output contract; otherwise keep as product-reference evidence. |
+
+## Validation Contract
+
+Consistency checks for this report should verify:
+
+- The Markdown project matrix includes every project currently present in
+  `memory_projects_manifest.json`: ELF, qmd, agentmemory, mem0/OpenMemory, memsearch,
+  OpenViking, claude-mem, RAGFlow, LightRAG, GraphRAG, Graphiti/Zep, Letta, LangGraph,
+  nanograph, llm-wiki, gbrain, and graphify.
+- The machine-readable matrix has the same project set and includes every required
+  scenario id: `retrieval_debug`, `work_resume`, `project_decisions`,
+  `source_of_truth`, `temporal_current_historical`, `consolidation`,
+  `knowledge_pages`, `operator_debugging`, `capture_write_policy`, `production_ops`,
+  `personalization`, `context_trajectory`, `core_vs_archival_memory`, and
+  `graph_rag_navigation`.
+- Evidence states remain typed. Do not collapse `research_gate`, `blocked`,
+  `unsupported`, `wrong_result`, `lifecycle_fail`, `incomplete`, or `not_encoded`
+  into pass/fail aggregates.
+
+## Claim Rules
+
+- A project can be called stronger only for a named scenario with comparable measured
+  evidence.
+- `research_gate` plus setup metadata can justify a follow-up adapter issue, not a
+  product win.
+- A blocked measurement is not a hidden loss. Keep the typed reason and rerun only when
+  the missing operator or setup input exists.
+- If a project appears stronger on a user-facing workflow but lacks comparable measured
+  evidence, record what ELF should borrow and add a benchmark gate before changing any
+  README-level claim.
diff --git a/docs/guide/benchmarking/index.md b/docs/guide/benchmarking/index.md
index 18824179..37798553 100644
--- a/docs/guide/benchmarking/index.md
+++ b/docs/guide/benchmarking/index.md
@@ -47,6 +47,10 @@ cleanup, use `docs/guide/single_user_production.md`.
   adoption refresh that keeps the decision at adopt with bounded caveats and separates
   fixture, live adapter, private corpus, credentialed, blocked, and research-gate
   evidence.
+- `2026-06-11-competitor-strength-evidence-matrix.md`: XY-897 competitor-strength
+  matrix contract that maps every tracked memory/RAG/graph project to its strongest
+  scenario, current evidence class, typed blockers, next measurement gate, and ELF
+  borrow-if-stronger direction.
 - `real_world_agent_memory_benchmark.md`: operator overview for the v1 real-world
   agent memory benchmark contract, including suite taxonomy, typed report states,
   knowledge-compilation fixture tasks, and the production-ops fixture target.
diff --git a/docs/research/2026-06-11-xy-897-competitor-strength-matrix.json b/docs/research/2026-06-11-xy-897-competitor-strength-matrix.json
new file mode 100644
index 00000000..b847ecc7
--- /dev/null
+++ b/docs/research/2026-06-11-xy-897-competitor-strength-matrix.json
@@ -0,0 +1,648 @@
+{
+  "schema": "elf.competitor_strength_evidence_matrix/v1",
+  "matrix_id": "xy-897-competitor-strength-evidence-matrix-2026-06-11",
+  "date": "2026-06-11",
+  "authority": "XY-897",
+  "purpose": "Keep competitor-strength claims tied to measured evidence classes, typed blockers, and next benchmark gates.",
+  "source_inputs": [
+    "docs/guide/benchmarking/2026-06-10-production-adoption-refresh.md",
+    "docs/guide/benchmarking/2026-06-10-real-world-comparison-report.md",
+    "docs/guide/benchmarking/2026-06-10-live-real-world-sweep-report.md",
+    "docs/guide/research/external_memory_improvement_plan.md",
+    "docs/guide/research/research_projects_inventory.md",
+    "apps/elf-eval/fixtures/real_world_external_adapters/memory_projects_manifest.json",
+    "Makefile.toml"
+  ],
+  "claim_boundary": {
+    "summary": "Do not claim ELF beats, ties, or loses to a project unless the named scenario is encoded and run at a comparable evidence class.",
+    "current_live_real_world_boundary": "ELF and qmd have full-suite live_real_world sweeps, but both are typed non-pass sweeps, not full-suite live passes.",
+    "research_gate_boundary": "Research-gate records are routing evidence for future adapters and must not be counted as fixture-backed, live-baseline, or live-real-world pass evidence.",
+    "operator_boundary": "Private corpus and credentialed production-ops checks remain blocked until operator-owned inputs are supplied."
+  },
+  "manifest_summary": {
+    "adapter_records": 21,
+    "project_count": 17,
+    "evidence_class_counts": {
+      "fixture_backed": 1,
+      "live_baseline_only": 6,
+      "live_real_world": 2,
+      "research_gate": 12
+    },
+    "overall_status_counts": {
+      "pass": 1,
+      "wrong_result": 6,
+      "lifecycle_fail": 1,
+      "blocked": 6,
+      "not_encoded": 7
+    }
+  },
+  "state_taxonomy": [
+    {
+      "state": "fixture_backed",
+      "meaning": "A checked-in fixture or generated fixture response is scored by the real-world job runner. This is evidence for the benchmark contract, not live runtime behavior."
+    },
+    {
+      "state": "live_baseline_only",
+      "meaning": "A Docker live-baseline adapter ran same-corpus or lifecycle checks, but no real-world job suite was scored through that project."
+    },
+    {
+      "state": "live_real_world",
+      "meaning": "A project adapter materialized and scored real-world job records through a runtime or CLI path."
+    },
+    {
+      "state": "research_gate",
+      "meaning": "Source, setup, resource, retry, and output-contract metadata exists, but the project has not produced live adapter pass evidence."
+    },
+    {
+      "state": "blocked",
+      "meaning": "A safe measurement cannot run without operator-owned credentials, private data, setup proof, or a dependency outside the lane."
+    },
+    {
+      "state": "unsupported",
+      "meaning": "The capability is out of scope for the project shape or would require a non-comparable path such as host-global state."
+    },
+    {
+      "state": "wrong_result",
+      "meaning": "The system ran but missed expected memory, evidence, or answer terms."
+    },
+    {
+      "state": "lifecycle_fail",
+      "meaning": "Basic retrieval may work, but update, delete, reload, persistence, or cold-start behavior is wrong or incomplete."
+    },
+    {
+      "state": "incomplete",
+      "meaning": "The run did not reach the behavioral check because setup, install, dependency, or runtime execution failed."
+    },
+    {
+      "state": "not_encoded",
+      "meaning": "The scenario is not currently encoded for that project or evidence class, so no pass or fail claim is allowed."
+    }
+  ],
+  "project_matrix": [
+    {
+      "project": "ELF",
+      "strongest_user_facing_scenario": "Evidence-linked source-of-truth memory service with real-world fixtures and live service retrieval sweeps.",
+      "current_evidence_class": "live_real_world",
+      "supporting_evidence_classes": [
+        "fixture_backed",
+        "live_real_world"
+      ],
+      "measured_status": "wrong_result",
+      "proof": {
+        "command": "cargo make real-world-memory-live-adapters",
+        "artifact": "tmp/real-world-memory/live-adapters/elf-report.md"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "private_manifest_and_provider_credentials",
+        "details": "Fixture production-ops keeps private corpus and provider credential gates blocked; live sweep keeps broader non-retrieval suites typed non-pass."
+      },
+      "benchmark_before_claim": "A full-suite live_real_world pass plus separate private-corpus and credentialed production-ops evidence is required before broad live parity or production proof claims.",
+      "borrow_if_stronger": "Keep borrowing qmd debug knobs, OpenViking staged trajectory, mem0 history, Letta core memory, and graph/RAG navigation patterns where they remain stronger."
+    },
+    {
+      "project": "qmd",
+      "strongest_user_facing_scenario": "Local retrieval-debug workflow with transparent CLI indexing, querying, expansion, fusion, and rerank ergonomics.",
+      "current_evidence_class": "live_real_world",
+      "supporting_evidence_classes": [
+        "live_baseline_only",
+        "live_real_world",
+        "research_gate"
+      ],
+      "measured_status": "wrong_result",
+      "proof": {
+        "command": "cargo make real-world-memory-live-adapters",
+        "artifact": "tmp/real-world-memory/live-adapters/qmd-report.md"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "not_encoded",
+        "typed_reason": "deep_profile_and_non_retrieval_suites_not_encoded",
+        "details": "The full live sweep passes targeted retrieval suites but keeps memory_evolution wrong_result and several broader suites not_encoded or incomplete."
+      },
+      "benchmark_before_claim": "Run qmd deep retrieval/debug profile and full-suite live real-world replay with trace-level diagnostics before claiming ELF wins, ties, or loses on retrieval debugging.",
+      "borrow_if_stronger": "Borrow transparent local knobs for query rewriting, weighted fusion, rerank explanation, and command-line replay."
+    },
+    {
+      "project": "agentmemory",
+      "strongest_user_facing_scenario": "Coding-agent continuity, MCP/REST packaging, viewer workflow, and durable cross-agent memory lifecycle.",
+      "current_evidence_class": "live_baseline_only",
+      "supporting_evidence_classes": [
+        "live_baseline_only"
+      ],
+      "measured_status": "lifecycle_fail",
+      "proof": {
+        "command": "ELF_BASELINE_PROJECTS=agentmemory cargo make baseline-live-docker",
+        "artifact": "tmp/live-baseline/live-baseline-report.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "durable_lifecycle_adapter_missing",
+        "details": "Same-corpus retrieval can run, but durable cold-start and real-world job adapter coverage are blocked by the current adapter path."
+      },
+      "benchmark_before_claim": "Add a durable local adapter that covers update, delete, cold-start reload, work resume, capture/write policy, and lifecycle-staleness jobs.",
+      "borrow_if_stronger": "Borrow cross-agent hooks, packaging, continuity scenarios, and operator-visible viewer affordances."
+    },
+    {
+      "project": "mem0/OpenMemory",
+      "strongest_user_facing_scenario": "Memory lifecycle, personalization, hosted/OpenMemory UI ergonomics, and optional graph memory.",
+      "current_evidence_class": "live_baseline_only",
+      "supporting_evidence_classes": [
+        "live_baseline_only"
+      ],
+      "measured_status": "wrong_result",
+      "proof": {
+        "command": "ELF_BASELINE_PROJECTS=mem0 cargo make baseline-live-docker",
+        "artifact": "tmp/live-baseline/live-baseline-report.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "not_encoded",
+        "typed_reason": "openmemory_ui_and_hosted_claims_not_encoded",
+        "details": "Local OSS setup is represented, but hosted/OpenMemory UI parity and real-world personalization coverage are not encoded."
+      },
+      "benchmark_before_claim": "Fix the local adapter's same-corpus result, then encode memory_evolution, personalization, OpenMemory UI readback, and optional graph-context jobs.",
+      "borrow_if_stronger": "Borrow entity-scoped memory history, lifecycle surfaces, async update ergonomics, and OpenMemory-style inspection UX."
+    },
+    {
+      "project": "memsearch",
+      "strongest_user_facing_scenario": "Markdown-first canonical store with rebuildable local index and practical hybrid retrieval.",
+      "current_evidence_class": "live_baseline_only",
+      "supporting_evidence_classes": [
+        "live_baseline_only"
+      ],
+      "measured_status": "wrong_result",
+      "proof": {
+        "command": "ELF_BASELINE_PROJECTS=memsearch cargo make baseline-live-docker",
+        "artifact": "tmp/live-baseline/live-baseline-report.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "incomplete",
+        "typed_reason": "source_of_truth_and_reindex_real_world_jobs_incomplete",
+        "details": "Same-corpus retrieval is wrong_result and source-of-truth plus real-world reindex behavior is not yet cleanly scored."
+      },
+      "benchmark_before_claim": "Fix Docker same-corpus retrieval and reindex/update/delete reload evidence, then score source-of-truth and retrieval-debug real-world jobs.",
+      "borrow_if_stronger": "Borrow the canonical markdown-store ergonomics, local reindex clarity, and user-inspectable source files."
+    },
+    {
+      "project": "OpenViking",
+      "strongest_user_facing_scenario": "Filesystem-like context trajectory, hierarchical retrieval, and staged context loading.",
+      "current_evidence_class": "live_baseline_only",
+      "supporting_evidence_classes": [
+        "live_baseline_only",
+        "research_gate"
+      ],
+      "measured_status": "wrong_result",
+      "proof": {
+        "command": "ELF_BASELINE_PROJECTS=OpenViking cargo make baseline-live-docker",
+        "artifact": "tmp/live-baseline/live-baseline-report.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "not_encoded",
+        "typed_reason": "hierarchical_context_trajectory_not_encoded",
+        "details": "Pinned Docker local embedding setup reaches add_resource/find, but same-corpus output misses expected evidence and trajectory jobs are not encoded."
+      },
+      "benchmark_before_claim": "First make evidence-bearing same-corpus output pass, then run a context-trajectory suite that scores staged retrieval paths and hierarchy expansion.",
+      "borrow_if_stronger": "Borrow the viking-style filesystem context model, trajectory readback, and staged retrieval planning."
+    },
+    {
+      "project": "claude-mem",
+      "strongest_user_facing_scenario": "Progressive disclosure, automatic capture loop, repository-local lifecycle, and practical local viewer workflow.",
+      "current_evidence_class": "live_baseline_only",
+      "supporting_evidence_classes": [
+        "live_baseline_only"
+      ],
+      "measured_status": "wrong_result",
+      "proof": {
+        "command": "ELF_BASELINE_PROJECTS=claude-mem cargo make baseline-live-docker",
+        "artifact": "tmp/live-baseline/live-baseline-report.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "not_encoded",
+        "typed_reason": "progressive_disclosure_real_world_jobs_not_encoded",
+        "details": "Current Docker evidence is not a clean retrieval pass and progressive-disclosure jobs are not encoded."
+      },
+      "benchmark_before_claim": "Add durable repository-backed work_resume, operator_debugging_ux, capture/write-policy, and progressive-disclosure jobs.",
+      "borrow_if_stronger": "Borrow progressive disclosure, automatic capture review loops, and local viewer/operator comfort."
+    },
+    {
+      "project": "RAGFlow",
+      "strongest_user_facing_scenario": "Full RAG application workflow with document, chunk, and reference evidence handles.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "blocked",
+      "proof": {
+        "command": "ELF_RAGFLOW_SMOKE_START=1 ELF_RAGFLOW_SMOKE_ACCEPT_RESOURCE_ENVELOPE=1 cargo make ragflow-docker-smoke",
+        "artifact": "tmp/real-world-memory/ragflow-smoke/ragflow-smoke.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "docker_service_resource_envelope_and_adapter_output_mapping",
+        "details": "Research identifies RAGFlow as an adapter candidate, but the Docker runtime proof and the reference.chunks-to-benchmark-evidence mapping have not run yet."
+      },
+      "benchmark_before_claim": "Run XY-885 tiny Docker evidence-smoke adapter and map RAGFlow reference chunks to scored retrieval/debug evidence.",
+      "borrow_if_stronger": "Borrow document/chunk reference surfaces, resource-envelope reporting, and RAG app evidence handles."
+    },
+    {
+      "project": "LightRAG",
+      "strongest_user_facing_scenario": "Lightweight graph/RAG context export with source file-path citation shape.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "blocked",
+      "proof": {
+        "command": "ELF_LIGHTRAG_CONTEXT_START=1 cargo make lightrag-docker-context-smoke",
+        "artifact": "tmp/real-world-memory/lightrag-context/summary.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "docker_service_setup_and_context_export_not_proven",
+        "details": "The project is an adapter candidate, but retrieved-context export and real-world adapter scoring remain blocked."
+      },
+      "benchmark_before_claim": "Run XY-886 Docker context-export adapter with explicit LLM and embedding config plus source citation mapping.",
+      "borrow_if_stronger": "Borrow context-only query modes, graph-aware retrieval layout, and file-path citation readback."
+    },
+    {
+      "project": "GraphRAG",
+      "strongest_user_facing_scenario": "GraphRAG indexing, graph summaries, and document/text-unit evidence tables.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "blocked",
+      "proof": {
+        "command": "ELF_GRAPHRAG_SMOKE_RUN=1 cargo make graphrag-docker-smoke",
+        "artifact": "tmp/real-world-memory/graphrag-smoke/summary.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "indexing_resource_envelope_and_source_citation_mapping",
+        "details": "Cost-bounded Docker CLI/API and parquet outputs are identified, but indexing and evidence mapping have not passed."
+      },
+      "benchmark_before_claim": "Run XY-887 cost-bounded Docker adapter over a tiny corpus and score output tables against retrieval and knowledge-synthesis evidence.",
+      "borrow_if_stronger": "Borrow graph summary artifacts, local/global search separation, and source table evidence mapping."
+    },
+    {
+      "project": "Graphiti/Zep",
+      "strongest_user_facing_scenario": "Temporal graph memory with current, historical, and future fact validity windows.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "blocked",
+      "proof": {
+        "command": "ELF_GRAPHITI_ZEP_SMOKE_START=1 ELF_GRAPHITI_ZEP_SMOKE_RUN=1 cargo make graphiti-zep-docker-temporal-smoke",
+        "artifact": "tmp/real-world-memory/graphiti-zep-smoke/summary.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "docker_graph_store_and_temporal_adapter_not_proven",
+        "details": "Temporal graph memory is an adapter candidate, but Docker graph-store setup and real-world job scoring are blocked."
+      },
+      "benchmark_before_claim": "Run XY-888 Docker-local temporal graph adapter and score current versus historical fact validity with evidence ids.",
+      "borrow_if_stronger": "Borrow temporal fact windows, invalidation/supersession semantics, and graph fact provenance."
+    },
+    {
+      "project": "Letta",
+      "strongest_user_facing_scenario": "Core memory blocks versus archival memory with explicit operating-context surfaces.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "not_encoded",
+      "proof": {
+        "command": null,
+        "artifact": "docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "contained_evidence_export_path_not_selected",
+        "details": "Research-only until a supported contained server path can export core/archival evidence without relying on unsupported setup."
+      },
+      "benchmark_before_claim": "Select a contained evidence export contract, then encode core-vs-archival memory, personalization, and project-decision jobs.",
+      "borrow_if_stronger": "Borrow explicit core memory block ergonomics, archival separation, and shared operating context readback."
+    },
+    {
+      "project": "LangGraph",
+      "strongest_user_facing_scenario": "Checkpoint/replay regression workflow and durable state replay for agent runs.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "not_encoded",
+      "proof": {
+        "command": null,
+        "artifact": "docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "unsupported",
+        "typed_reason": "not_a_standalone_memory_backend_adapter",
+        "details": "Keep as a checkpoint/replay reference, not as a direct memory backend competitor until a comparable memory output contract exists."
+      },
+      "benchmark_before_claim": "Non-goal for direct win/loss until a standalone memory adapter contract exists; use replay regression jobs as a benchmark infrastructure reference.",
+      "borrow_if_stronger": "Borrow checkpoint replay, deterministic regression, and state-diff evaluation patterns."
+    },
+    {
+      "project": "nanograph",
+      "strongest_user_facing_scenario": "Typed graph schema and query ergonomics for graph-lite developer experience.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "not_encoded",
+      "proof": {
+        "command": null,
+        "artifact": "docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "unsupported",
+        "typed_reason": "not_a_memory_backend_comparison_target",
+        "details": "nanograph officially has no server and no Docker path; use it as a graph-lite DX reference rather than as adapter proof."
+      },
+      "benchmark_before_claim": "Non-goal for direct win/loss unless a contained memory-backed comparison target emerges; measure ELF graph-lite DX against typed schema/query acceptance instead.",
+      "borrow_if_stronger": "Borrow typed relation schema, query ergonomics, and small graph developer experience."
+    },
+    {
+      "project": "llm-wiki",
+      "strongest_user_facing_scenario": "LLM-maintained wiki or knowledge-page workflow with query-save and lint loops.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "not_encoded",
+      "proof": {
+        "command": null,
+        "artifact": "docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "unsupported",
+        "typed_reason": "live_service_runtime_not_available_for_adapter_proof",
+        "details": "Research-only until a contained plugin or instruction harness can emit scored knowledge-page evidence."
+      },
+      "benchmark_before_claim": "Select a contained plugin or instruction harness, then score knowledge pages for citation coverage, unsupported claims, rebuild, and stale-source lint.",
+      "borrow_if_stronger": "Borrow maintained wiki workflows, page lint, query-save loops, and topic-scoped knowledge navigation."
+    },
+    {
+      "project": "gbrain",
+      "strongest_user_facing_scenario": "Operational knowledge brain with compiled_truth pages, timelines, enrichment, and maintenance loops.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "not_encoded",
+      "proof": {
+        "command": null,
+        "artifact": "docs/research/2026-06-10-xy-882-rag-graph-adapter-feasibility.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "docker_local_brain_repo_and_database_path_missing",
+        "details": "Research remains blocked until a Docker-local brain repo and database path can be proven without operator-owned state."
+      },
+      "benchmark_before_claim": "First prove Docker-local repository and database setup, then encode compiled_truth/timeline page scoring and operator-continuity jobs.",
+      "borrow_if_stronger": "Borrow compiled truth pages, timeline maintenance, and human-operable knowledge-brain navigation."
+    },
+    {
+      "project": "graphify",
+      "strongest_user_facing_scenario": "Graph-compressed navigation with graph.json and GRAPH_REPORT evidence outputs.",
+      "current_evidence_class": "research_gate",
+      "supporting_evidence_classes": [
+        "research_gate"
+      ],
+      "measured_status": "blocked",
+      "proof": {
+        "command": "cargo make graphify-docker-graph-report-smoke",
+        "artifact": "tmp/real-world-memory/graphify-smoke/graphify-smoke.json"
+      },
+      "unsupported_or_blocked_status": {
+        "state": "blocked",
+        "typed_reason": "docker_cli_graph_report_generation_not_proven",
+        "details": "graphify is an adapter candidate, but graph report generation and real-world scoring are still blocked; host-global assistant hooks are out of scope."
+      },
+      "benchmark_before_claim": "Run XY-889 Docker-only graph/report adapter over graph.json and GRAPH_REPORT.md, then score graph navigation and knowledge-synthesis evidence.",
+      "borrow_if_stronger": "Borrow graph compression, source-location graph reports, and navigation hints for large code or document spaces."
+    }
+  ],
+  "scenario_matrix": [
+    {
+      "scenario_id": "retrieval_debug",
+      "scenario": "retrieval/debug",
+      "current_elf_evidence": "ELF fixture-backed retrieval passes and ELF live_real_world retrieval passes in the full sweep.",
+      "strongest_competitor_or_reference": "qmd",
+      "current_competitor_evidence": "qmd live_real_world retrieval passes and qmd live_baseline_only checks pass, but qmd full-suite live status is wrong_result.",
+      "current_state": "Measured tie on encoded retrieval answers; qmd remains stronger on local debug ergonomics not fully scored.",
+      "next_measurement": "Run qmd deep retrieval/debug profile and ELF/qmd trace-level wrong-result replay with expansion, fusion, rerank, and candidate-drop diagnostics."
+    },
+    {
+      "scenario_id": "work_resume",
+      "scenario": "work resume",
+      "current_elf_evidence": "ELF fixture-backed work_resume passes and ELF live_real_world work_resume passes.",
+      "strongest_competitor_or_reference": "agentmemory, claude-mem, OpenViking",
+      "current_competitor_evidence": "agentmemory is live_baseline_only with lifecycle_fail; claude-mem is wrong_result; OpenViking work_resume is not_encoded.",
+      "current_state": "ELF and qmd have current encoded live pass evidence, but continuity-oriented competitors remain undermeasured.",
+      "next_measurement": "Encode durable agentmemory, claude-mem, and OpenViking work_resume adapters or declare each blocked with lifecycle/setup evidence."
+    },
+    {
+      "scenario_id": "project_decisions",
+      "scenario": "project decisions",
+      "current_elf_evidence": "ELF fixture-backed and live_real_world project_decisions suites pass.",
+      "strongest_competitor_or_reference": "qmd, Letta",
+      "current_competitor_evidence": "qmd live_real_world project_decisions passes; Letta project_decisions is research_gate not_encoded.",
+      "current_state": "ELF and qmd are the only projects with measured live evidence for this scenario.",
+      "next_measurement": "Add core/archival decision-memory jobs for Letta only after a contained export path exists; otherwise keep Letta as design reference."
+    },
+    {
+      "scenario_id": "source_of_truth",
+      "scenario": "source-of-truth",
+      "current_elf_evidence": "ELF fixture-backed trust_source_of_truth passes and ELF live_real_world trust_source_of_truth passes.",
+      "strongest_competitor_or_reference": "memsearch",
+      "current_competitor_evidence": "memsearch has live_baseline_only canonical store evidence but trust_source_of_truth is incomplete and retrieval is wrong_result.",
+      "current_state": "ELF has stronger measured source-of-truth evidence; memsearch remains a local-store ergonomics reference.",
+      "next_measurement": "Fix memsearch same-corpus retrieval/reindex evidence, then run source-of-truth rebuild and reload jobs before any win/loss claim."
+    },
+    {
+      "scenario_id": "temporal_current_historical",
+      "scenario": "temporal/current-vs-historical memory",
+      "current_elf_evidence": "ELF fixture-backed memory_evolution passes, but ELF live_real_world memory_evolution is wrong_result.",
+      "strongest_competitor_or_reference": "Graphiti/Zep, mem0/OpenMemory",
+      "current_competitor_evidence": "Graphiti/Zep is research_gate blocked; mem0/OpenMemory is live_baseline_only wrong_result.",
+      "current_state": "No project has a comparable live pass for current-vs-historical evidence; ELF cannot claim live superiority yet.",
+      "next_measurement": "Fix ELF/qmd live memory_evolution evidence links and run XY-888 Graphiti/Zep temporal graph adapter."
+    },
+    {
+      "scenario_id": "consolidation",
+      "scenario": "consolidation",
+      "current_elf_evidence": "ELF fixture-backed consolidation passes, but live_real_world consolidation is not_encoded.",
+      "strongest_competitor_or_reference": "agentmemory, managed dreaming references, llm-wiki",
+      "current_competitor_evidence": "Manifest projects do not yet have live consolidation scoring; llm-wiki knowledge workflow is research_gate not_encoded.",
+      "current_state": "Fixture-only ELF evidence is useful, but no live proposal-generation parity claim is allowed.",
+      "next_measurement": "Run a reviewable consolidation-worker benchmark that emits proposals, source refs, unsupported-claim flags, and apply/discard/defer audit events."
+    },
+    {
+      "scenario_id": "knowledge_pages",
+      "scenario": "knowledge pages",
+      "current_elf_evidence": "ELF fixture-backed knowledge_compilation passes, but live_real_world knowledge_compilation is not_encoded.",
+      "strongest_competitor_or_reference": "llm-wiki, gbrain, GraphRAG, graphify",
+      "current_competitor_evidence": "llm-wiki and gbrain are research_gate not_encoded or blocked; GraphRAG and graphify are research_gate blocked.",
+      "current_state": "No live knowledge-page competitor result exists; ELF has only fixture-backed derived-page evidence.",
+      "next_measurement": "Encode live knowledge-page rebuild/lint scoring for ELF and run contained llm-wiki, gbrain, GraphRAG, or graphify adapters only after setup proof exists."
+    },
+    {
+      "scenario_id": "operator_debugging",
+      "scenario": "operator debugging",
+      "current_elf_evidence": "ELF fixture-backed operator_debugging_ux passes, but ELF live_real_world operator_debugging_ux is not_encoded.",
+      "strongest_competitor_or_reference": "qmd, claude-mem, OpenMemory",
+      "current_competitor_evidence": "qmd has local debug strengths but operator_debugging_ux is not_encoded in live sweeps; claude-mem and OpenMemory UX are not_encoded.",
+      "current_state": "Operator debugging remains mostly product/UX evidence, not comparable live benchmark evidence.",
+      "next_measurement": "Score trace hydration, candidate-stage attribution, raw-SQL avoidance, and repair-action clarity through live viewer or CLI artifacts."
+    },
+    {
+      "scenario_id": "capture_write_policy",
+      "scenario": "capture/write policy",
+      "current_elf_evidence": "ELF fixture-backed capture_integration passes, but ELF live_real_world capture_integration is not_encoded.",
+      "strongest_competitor_or_reference": "agentmemory, claude-mem",
+      "current_competitor_evidence": "agentmemory capture_integration is blocked and claude-mem capture_integration is not_encoded.",
+      "current_state": "ELF fixture evidence is strongest, but live capture and write-policy behavior still needs runtime scoring.",
+      "next_measurement": "Run capture/write-policy jobs that prove redaction, exclusion, evidence binding, and no secret leakage through live ingestion paths."
+    },
+    {
+      "scenario_id": "production_ops",
+      "scenario": "production ops",
+      "current_elf_evidence": "ELF production runbooks and fixture production_ops cover restore, Qdrant rebuild, backfill resume, resource envelope, and typed private/credential blockers; live_real_world production_ops is incomplete.",
+      "strongest_competitor_or_reference": "ELF production gate, qmd, RAGFlow/GraphRAG/LightRAG resource gates",
+      "current_competitor_evidence": "qmd live production_ops is incomplete; RAGFlow/GraphRAG/LightRAG resource gates are research_gate blocked.",
+      "current_state": "ELF has the strongest checked-in production evidence, but private corpus and credentialed gates remain blocked.",
+      "next_measurement": "Rerun private-corpus and credentialed production-ops gates only when operator-owned manifest and credentials are supplied."
+    },
+    {
+      "scenario_id": "personalization",
+      "scenario": "personalization",
+      "current_elf_evidence": "ELF fixture-backed personalization passes and ELF live_real_world personalization passes.",
+      "strongest_competitor_or_reference": "mem0/OpenMemory, Letta",
+      "current_competitor_evidence": "mem0/OpenMemory personalization is not_encoded and Letta personalization is research_gate not_encoded.",
+      "current_state": "ELF and qmd have live encoded evidence; personalization-specialized competitors are not yet comparable.",
+      "next_measurement": "Encode mem0/OpenMemory and Letta scoped-preference readback jobs before making personalization superiority claims."
+    },
+    {
+      "scenario_id": "context_trajectory",
+      "scenario": "context trajectory",
+      "current_elf_evidence": "ELF has trace and trajectory directions, but staged context trajectory is not yet a comparable live scenario.",
+      "strongest_competitor_or_reference": "OpenViking",
+      "current_competitor_evidence": "OpenViking Docker setup is pinned, same-corpus retrieval is wrong_result, and hierarchical trajectory is research_gate not_encoded.",
+      "current_state": "OpenViking remains the strongest design reference, but not a measured live winner.",
+      "next_measurement": "Make OpenViking same-corpus evidence-bearing retrieval pass, then score hierarchical expansion and staged context trajectory outputs."
+    },
+    {
+      "scenario_id": "core_vs_archival_memory",
+      "scenario": "core-vs-archival memory",
+      "current_elf_evidence": "ELF spec and admin surfaces define core blocks, but comparative benchmark coverage is not yet encoded here.",
+      "strongest_competitor_or_reference": "Letta",
+      "current_competitor_evidence": "Letta is research_gate not_encoded until a contained evidence export path is selected.",
+      "current_state": "Scenario is a product gap measurement target, not a current win/loss surface.",
+      "next_measurement": "Add core-block versus archival-search jobs for ELF and only compare Letta after contained export proof exists."
+    },
+    {
+      "scenario_id": "graph_rag_navigation",
+      "scenario": "graph/RAG navigation",
+      "current_elf_evidence": "ELF relation context and graph-lite work are not enough to claim graph/RAG navigation parity.",
+      "strongest_competitor_or_reference": "RAGFlow, LightRAG, GraphRAG, Graphiti/Zep, graphify",
+      "current_competitor_evidence": "All named RAG/graph projects are research_gate blocked or not_encoded, with adapter-candidate follow-ups for RAGFlow, LightRAG, GraphRAG, Graphiti/Zep, and graphify.",
+      "current_state": "No RAG/graph project has live_real_world pass evidence; research gates define follow-up adapter work only.",
+      "next_measurement": "Run XY-885 through XY-889 Docker-contained adapters and require evidence-linked outputs before any graph/RAG navigation claim."
+    }
+  ],
+  "parallelizable_followups": [
+    {
+      "workstream": "qmd deep retrieval/debug profile",
+      "issue_or_candidate": "new benchmark issue",
+      "parallelizable": true,
+      "blocked_by": "None after this matrix lands.",
+      "measurement": "Stress profile plus trace-level retrieval-debug artifacts for qmd and ELF."
+    },
+    {
+      "workstream": "agentmemory durable lifecycle adapter",
+      "issue_or_candidate": "[ELF benchmark P0] Make external adapters lifecycle-durable and fail-typed",
+      "parallelizable": true,
+      "blocked_by": "Durable local adapter path selection.",
+      "measurement": "Update, delete, cold-start reload, work_resume, and capture/write-policy jobs."
+    },
+    {
+      "workstream": "mem0/OpenMemory local and UI coverage",
+      "issue_or_candidate": "new adapter repair issue",
+      "parallelizable": true,
+      "blocked_by": "Comparable local OSS path for UI/readback evidence.",
+      "measurement": "Same-corpus fix plus memory_evolution, personalization, and OpenMemory inspection jobs."
+    },
+    {
+      "workstream": "memsearch source-of-truth and reindex coverage",
+      "issue_or_candidate": "new adapter repair issue",
+      "parallelizable": true,
+      "blocked_by": "Docker same-corpus retrieval and reindex correctness.",
+      "measurement": "Canonical markdown store, rebuild/reindex, retrieval, update/delete/reload jobs."
+    },
+    {
+      "workstream": "OpenViking context trajectory",
+      "issue_or_candidate": "new benchmark issue after evidence output fix",
+      "parallelizable": true,
+      "blocked_by": "Evidence-bearing same-corpus retrieval output.",
+      "measurement": "Hierarchical expansion, staged trajectory, and resume/retrieval evidence jobs."
+    },
+    {
+      "workstream": "claude-mem progressive disclosure",
+      "issue_or_candidate": "new adapter issue",
+      "parallelizable": true,
+      "blocked_by": "Durable repository path and progressive-disclosure output contract.",
+      "measurement": "Work resume, operator debugging, capture/write-policy, and progressive disclosure jobs."
+    },
+    {
+      "workstream": "RAGFlow evidence smoke",
+      "issue_or_candidate": "XY-885",
+      "parallelizable": true,
+      "blocked_by": "Acceptance of the resource envelope for the tiny Docker smoke.",
+      "measurement": "reference.chunks to benchmark evidence mapping."
+    },
+    {
+      "workstream": "LightRAG context export",
+      "issue_or_candidate": "XY-886",
+      "parallelizable": true,
+      "blocked_by": "Docker service setup and explicit provider config.",
+      "measurement": "Retrieved context export and source file-path citations."
+    },
+    {
+      "workstream": "GraphRAG cost-bounded adapter",
+      "issue_or_candidate": "XY-887",
+      "parallelizable": true,
+      "blocked_by": "Tiny corpus cost/resource envelope.",
+      "measurement": "Document, text-unit, graph-summary, and citation output tables."
+    },
+    {
+      "workstream": "Graphiti/Zep temporal graph adapter",
+      "issue_or_candidate": "XY-888",
+      "parallelizable": true,
+      "blocked_by": "Docker-local graph store setup.",
+      "measurement": "Current/historical/future fact validity and evidence ids."
+    },
+    {
+      "workstream": "graphify graph report adapter",
+      "issue_or_candidate": "XY-889",
+      "parallelizable": true,
+      "blocked_by": "Docker CLI graph/report generation proof.",
+      "measurement": "graph.json and GRAPH_REPORT evidence for graph navigation and knowledge synthesis."
+    },
+    {
+      "workstream": "Private corpus and credentialed production ops",
+      "issue_or_candidate": "operator-owned benchmark gates",
+      "parallelizable": false,
+      "blocked_by": "Sanitized private manifest and routed provider credentials.",
+      "measurement": "Private-corpus retrieval quality and credentialed production-ops pass/fail evidence."
+    },
+    {
+      "workstream": "Letta, LangGraph, nanograph, llm-wiki direct adapters",
+      "issue_or_candidate": "research-only until output contract",
+      "parallelizable": false,
+      "blocked_by": "Contained evidence export or non-memory-backend comparability contract.",
+      "measurement": "Only run after each has a comparable output contract; otherwise treat as product-reference evidence."
+    }
+  ]
+}