Skip to content

[codex] Preserve command identity across repeated approvals#30969

Open
bookholt-oai wants to merge 11 commits into
bookholt/psec-4922-approval-integrityfrom
bookholt/psec-4922-repeated-approval-client-integrity
Open

[codex] Preserve command identity across repeated approvals#30969
bookholt-oai wants to merge 11 commits into
bookholt/psec-4922-approval-integrityfrom
bookholt/psec-4922-repeated-approval-client-integrity

Conversation

@bookholt-oai

@bookholt-oai bookholt-oai commented Jul 3, 2026

Copy link
Copy Markdown
Contributor

Why

A command item can receive more than one approval callback. Clients already accept an optional callback ID, but several app-server and TUI paths still treated the command item and callback as the same identity. A follow-up callback could therefore replace parent command metadata, complete the item twice, or offer persistence choices that a one-shot callback cannot honor.

The TUI also correlated pending app-server prompts with thread-local semantic IDs instead of the exact outer JSON-RPC request. Equal IDs across watched threads could collide, while a delayed response to a remotely resolved request could consume a newer same-key request.

This is a client-integrity prerequisite in the PSEC-4922 stack. It does not make core emit fresh retry callback IDs and does not close PSEC-4922 by itself.

What

  • Keep itemId as the stable command identity and treat approvalId, when present, as the per-callback UI/state and internal callback identity.
  • Restrict compatibility callbacks with an ID and no explicit decision list to one-shot Accept or Cancel choices.
  • Preserve the first command item's command, working directory, and actions across later callbacks.
  • Consume command metadata once so cancellation and a late process-end event cannot emit duplicate terminal updates.
  • Scope TUI pending requests by typed thread identity and carry the exact typed outer request ID through visible, deferred, replayed, and remotely dismissed prompts.
  • Reject duplicate scoped prompt keys without replacing the original request, and make stale-generation responses inert instead of falling back to an ordinary thread operation.
  • Keep callback replay and analytics behavior compatible while documenting when approvalId may be omitted.

How

The app-server stores parent command metadata independently from each approval callback, records whether a callback started a new item, and consumes the parent record on its single terminal path. Protocol, Session, and active and inactive TUI fallback logic use the callback ID to select conservative one-shot choices when the server did not provide an explicit decision list. Existing explicit decision lists remain authoritative.

The TUI validates request thread IDs before registration, namespaces semantic lookup keys by thread, and carries the outer RequestId in a dedicated exact-resolution event. Registry removal requires the same typed thread, semantic key or turn, and outer request ID. Replay/status state is authoritative by exact request generation, and UI dismissal uses the same identity across approvals, user input, and MCP elicitation. Any stale or mismatched response is a no-op.

This boundary is larger than a mechanical change because the identity rule must agree across protocol fallback, active and replayed TUI state, app-server lifecycle handling, public documentation, and the corresponding race matrix.

Testing

  • just test -p codex-protocol callback_without_explicit_decisions_is_one_shot
  • just test -p codex-core callback_without_explicit_decisions_is_one_shot
  • Focused app-server metadata and lifecycle tests, including public v2 zsh-fork cancellation and late completion
  • just test -p codex-app-server-protocol (251 passed)
  • just test -p codex-tui app_server (129 passed)
  • Focused TUI matrices: approval overlay (39), request user input (83), app link (24), pending replay (17), MCP elicitation (20), and deferred origin-thread routing (1) all passed
  • just test -p codex-tui: 2,971 passed, 2 failed, 4 skipped; both failures were unchanged Guardian feature-flag tests reproduced twice on the exact pre-correlation branch baseline
  • just fix -p codex-tui and just fmt
  • Local Bazel argument-comment lint could not start because the repository's pinned rustc is older than the resolved sqlx requirement; required cross-platform CI, including native Windows and argument-comment lint, is pending before this draft is marked ready

Related: PSEC-4922

@bookholt-oai bookholt-oai marked this pull request as ready for review July 3, 2026 19:08
@bookholt-oai bookholt-oai requested a review from a team as a code owner July 3, 2026 19:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant