fix(app-server): reconcile loaded thread history on resume#30866

Draft
steipete-oai wants to merge 3 commits into main from steipete/reconcile-running-thread-history

steipete-oai commented Jul 1, 2026

Summary

  • Reconcile an already loaded, idle thread with its persisted rollout when thread/resume is called.
  • Serialize running-thread resume with rollback and history injection, while preserving the live overlay when a turn is active.
  • Keep reconstructed runtime state coherent across external appends, rollbacks, compaction, interrupted tails, token usage, metadata observation, and auto-compaction windows.

Motivation

Fixes #21743. Related focus-refresh report: #21177.

When two app-server clients share a thread, one can append work after another client has loaded it. The loaded app-server previously answered thread/resume from stale in-memory history, so Desktop could not discover the external turns until its app-server restarted. This also preserves the ordering requirement identified in #11756 (comment): history hydration and listener attachment remain serialized.

Approach

The app-server remains the source of truth rather than introducing a local file or SQLite watcher, which would not generalize to remote app-servers.

For a loaded idle thread, resume now:

  1. Takes a per-thread reconciliation lock and an optimistic snapshot of model history.
  2. Flushes local rollout writes and reloads persisted history.
  3. Validates a logical non-metadata cursor, then either imports the external suffix without re-truncating the existing prefix or, for compaction and rollback, performs canonical reconstruction.
  4. Atomically installs history and related runtime state only if the thread is still idle and the snapshot still matches.
  5. Builds the resume response from that same persisted read before the next request is allowed through.
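
The snapshot-then-install shape of steps 1 through 4 can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `LoadedThread`, `ThreadState`, `Item`, `Outcome`, and `reconcile_on_resume` are hypothetical names, and the persisted read is reduced to a closure that returns the history to install.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for one history entry; not the real codex-core type.
#[derive(Clone, Debug, PartialEq)]
pub struct Item(pub String);

#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Persisted history was installed.
    Installed,
    /// A turn is active, so the live overlay is left alone.
    Busy,
    /// In-memory history changed while storage was being read.
    SnapshotChanged,
}

pub struct ThreadState {
    pub history: Vec<Item>,
    pub turn_active: bool,
}

pub struct LoadedThread {
    /// Per-thread lock that serializes reconciliation (step 1).
    reconcile: Mutex<()>,
    state: Mutex<ThreadState>,
}

impl LoadedThread {
    pub fn new(history: Vec<Item>) -> Self {
        Self {
            reconcile: Mutex::new(()),
            state: Mutex::new(ThreadState { history, turn_active: false }),
        }
    }

    pub fn history(&self) -> Vec<Item> {
        self.state.lock().unwrap().history.clone()
    }

    pub fn set_turn_active(&self, active: bool) {
        self.state.lock().unwrap().turn_active = active;
    }

    pub fn push_live(&self, item: Item) {
        self.state.lock().unwrap().history.push(item);
    }

    /// `load_persisted` stands in for "flush, reload, validate" (steps 2-3).
    pub fn reconcile_on_resume(&self, load_persisted: impl FnOnce() -> Vec<Item>) -> Outcome {
        let _guard = self.reconcile.lock().unwrap();

        // Step 1: optimistic snapshot, taken without holding the state lock
        // across the storage read.
        let snapshot = {
            let state = self.state.lock().unwrap();
            if state.turn_active {
                return Outcome::Busy;
            }
            state.history.clone()
        };

        let persisted = load_persisted();

        // Step 4: install only if still idle and the snapshot still matches.
        let mut state = self.state.lock().unwrap();
        if state.turn_active {
            return Outcome::Busy;
        }
        if state.history != snapshot {
            return Outcome::SnapshotChanged;
        }
        state.history = persisted;
        Outcome::Installed
    }
}
```

In this sketch a `Busy` or `SnapshotChanged` outcome leaves in-memory history untouched, so the caller can fail closed or retry.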

The storage and fresh metadata reads run off the listener behind an explicit event-delivery cut. The Finish phase projects the final overlay for the response, drains pre-cut events to prior subscribers only, then resumes normal track-one/dispatch-one delivery. High-volume legacy exec deltas continue draining to existing subscribers while storage is read, and Finish consumes only a finite queue snapshot. This prevents joiner duplicates, listener stalls, and unbounded output buffering.
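
One way to picture the Finish phase is a drain bounded by the queue length observed at the cut. The sketch below is single-threaded and heavily simplified; `Listener`, `Subscriber`, `finish_cut`, and `dispatch_one` are hypothetical names chosen for illustration, and the real listener is asynchronous.

```rust
use std::collections::VecDeque;

/// Hypothetical subscriber record; `joined_after_cut` marks the resuming client.
pub struct Subscriber {
    pub id: u32,
    pub joined_after_cut: bool,
    pub received: Vec<String>,
}

/// Hypothetical listener holding queued events and current subscribers.
pub struct Listener {
    pub subscribers: Vec<Subscriber>,
    pub queue: VecDeque<String>,
}

impl Listener {
    /// Finish phase: deliver only the events queued before the cut, and only
    /// to subscribers that were attached before it. The joiner sees those
    /// events through its resume response instead, which avoids duplicates.
    /// Returns how many pre-cut events were drained.
    pub fn finish_cut(&mut self) -> usize {
        // Finite snapshot: events enqueued after this point wait for normal delivery.
        let pending = self.queue.len();
        for _ in 0..pending {
            if let Some(event) = self.queue.pop_front() {
                for sub in self.subscribers.iter_mut().filter(|s| !s.joined_after_cut) {
                    sub.received.push(event.clone());
                }
            }
        }
        // After the cut, the joiner is an ordinary subscriber.
        for sub in &mut self.subscribers {
            sub.joined_after_cut = false;
        }
        pending
    }

    /// Normal delivery: dispatch one queued event to every subscriber.
    pub fn dispatch_one(&mut self) -> bool {
        match self.queue.pop_front() {
            Some(event) => {
                for sub in &mut self.subscribers {
                    sub.received.push(event.clone());
                }
                true
            }
            None => false,
        }
    }
}
```

Bounding the drain by a length captured up front is what keeps Finish from chasing a queue that a busy producer keeps refilling.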

Busy, incomplete, and conflicting histories fail closed or retry without overwriting live state. The cursor ignores metadata-only appends. After an ambiguous append failure, recovery accepts only identical history or a safe strict extension of the complete in-memory snapshot; older and divergent reads conflict without mutation.
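
The recovery rule after an ambiguous append failure amounts to a three-way classification over the non-metadata entries. The following is a rough sketch with hypothetical types (`Entry`, `Recovery`, `classify`); it does not model what makes an extension "safe", which the PR decides with checks not described here.

```rust
/// Hypothetical rollout entry: either a logical history item or metadata.
#[derive(Clone, Debug, PartialEq)]
pub enum Entry {
    Item(String),
    Metadata(String),
}

#[derive(Debug, PartialEq)]
pub enum Recovery {
    /// Persisted history matches the in-memory snapshot.
    Identical,
    /// Persisted history is the snapshot plus this many new logical entries.
    StrictExtension(usize),
    /// Older or divergent read: do not mutate in-memory state.
    Conflict,
}

/// The logical view skips metadata, so metadata-only appends do not move the cursor.
fn logical(entries: &[Entry]) -> Vec<&str> {
    entries
        .iter()
        .filter_map(|entry| match entry {
            Entry::Item(text) => Some(text.as_str()),
            Entry::Metadata(_) => None,
        })
        .collect()
}

pub fn classify(in_memory: &[Entry], persisted: &[Entry]) -> Recovery {
    let mem = logical(in_memory);
    let disk = logical(persisted);
    if disk == mem {
        Recovery::Identical
    } else if disk.len() > mem.len() && disk[..mem.len()] == mem[..] {
        Recovery::StrictExtension(disk.len() - mem.len())
    } else {
        Recovery::Conflict
    }
}
```

Under these assumptions, a shorter persisted read and a read that disagrees anywhere in the shared prefix both land in `Conflict`, matching the "older and divergent reads conflict without mutation" rule.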

Rejected resumes leave client compatibility settings untouched. Append-only reconciliation also preserves the current server-measured auto-compaction prefill; canonical rewrites and window changes reset it. Imported trailing compaction queues its next-turn lifecycle source exactly once, including cursor-mismatch reconstruction. Point operations such as rollback lease the listener without implicitly subscribing, and lease acquisition is serialized with idle unload.

The one-shot token-budget reminder now carries a distinct contextual marker. Cold resume and live reconciliation derive its delivered latch from the surviving current-window history, so an external reminder is not emitted twice and rollback or compaction rearms it correctly. Older unmarked reminder text remains intentionally indistinguishable from arbitrary configured developer text.
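
Deriving the latch from history rather than storing it separately could look roughly like this. The sketch assumes that a compaction entry starts a new window and that rollback truncates history; `HistoryItem` and `reminder_delivered` are hypothetical names, not the PR's types.

```rust
/// Hypothetical history entries. `TokenBudgetReminder` models a reminder that
/// carries the distinct contextual marker; `DeveloperText` models arbitrary
/// configured developer text, including older unmarked reminders.
#[derive(Clone, Debug, PartialEq)]
pub enum HistoryItem {
    UserMessage(String),
    DeveloperText(String),
    TokenBudgetReminder,
    Compaction,
}

/// Returns true if a marked reminder survives in the current window, meaning
/// the one-shot reminder should not be emitted again.
pub fn reminder_delivered(history: &[HistoryItem]) -> bool {
    // Assumption: the current window starts just after the last compaction entry.
    let window_start = history
        .iter()
        .rposition(|item| *item == HistoryItem::Compaction)
        .map_or(0, |index| index + 1);

    history[window_start..]
        .iter()
        .any(|item| matches!(item, HistoryItem::TokenBudgetReminder))
}
```

Because the latch is recomputed from whatever history survives, truncating past the reminder (rollback) or starting a new window (compaction) rearms it without extra bookkeeping, while unmarked developer text never sets it.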

The diff is test-heavy: most added lines cover core reconstruction invariants and cross-process app-server ordering.
The commits retain a review seam between core/storage reconciliation and app-server delivery ordering; the draft keeps them together because the externally observable no-loss/no-duplicate guarantee spans both layers.

Verification

  • Consolidated the new regression coverage, cutting a net 600 lines while preserving all 34 core reconciliation, 22 lifecycle, and 42 V2 resume tests; independent scenario-parity audits were clean
  • Full app-server thread-resume module (42 passed)
  • Core history reconciliation unit suite (34 passed)
  • Event-cut lifecycle/replay suite (22 passed) and token replay suite (6 passed)
  • Clean-target full Rust workspace: 11,801 tests executed, 11,791 passed. Of the ten failures, one load-sensitive timeout passed when rerun alone; the remaining nine reproduce on untouched origin/main under this managed host's forced policy and skill roots
  • Live-vs-durable collaboration projection regression
  • Two independent app-server processes handing off one rollout
  • Resume/injection and rollback/resume ordering regressions
  • Busy-turn overlay, incomplete/crash-tail, compaction, rollback, metadata, and cursor regressions
  • Paired source Desktop + CLI E2E, independently for window refocus and route re-entry: A loaded in Desktop, external CLI appends B, automatic refresh shows B once with no manual fallback, then C receives A/B/C in order exactly once
  • Final just fix -p codex-core, just fix -p codex-app-server, and just fmt

Follow-up

Experimental raw-event opt-in remains thread-scoped. Per-connection raw-event capabilities are a separate architectural change; this PR applies response coverage and redaction only to buffered handoff events.

Development

Successfully merging this pull request may close these issues.

Codex Desktop open thread view does not refresh after another app-server client appends a turn
