Conversation
38d7926 to
141110a
Compare
bolinfest
added a commit
that referenced
this pull request
Jul 2, 2026
## Why App-server deployments can consume structured JSON logs for operational measurements without requiring an OTEL exporter. Existing tool-result telemetry reports the handler outcome, but it does not separate time spent waiting to dispatch from time spent executing the handler. A compact completion event for the outer, direct tool call lets consumers measure those phases and correlate them with a conversation and turn. Code-mode calls are intentionally excluded so nested runtime calls do not create overlapping events that are easy to double-count. ## What changed - Added a [`ToolCallTimingGuard`](https://github.com/openai/codex/blob/141110a73cff041c42fd51f47e363e83df1dde51/codex-rs/core/src/tools/parallel.rs#L32) around direct tool calls. Event-only strings and timing state are captured only when the `codex_core::tools::parallel` `INFO` target is enabled. - Added a [`codex.tool_call`](https://github.com/openai/codex/blob/141110a73cff041c42fd51f47e363e83df1dde51/codex-rs/core/src/tools/parallel.rs#L313) completion event with conversation, turn, tool, call, trace, dispatch, handler, and total timing fields. - Recorded the execution-start marker after the dispatch lock is acquired. Event emission snapshots that marker once so a concurrently starting dispatch cannot produce internally inconsistent fields. - Limited the event to `ToolCallSource::Direct`; [unit coverage](https://github.com/openai/codex/blob/141110a73cff041c42fd51f47e363e83df1dde51/codex-rs/core/src/tools/parallel.rs#L365) verifies code-mode calls are ignored. - Added [cancellation coverage](https://github.com/openai/codex/blob/141110a73cff041c42fd51f47e363e83df1dde51/codex-rs/core/src/tools/parallel.rs#L408) that holds the execution gate and verifies a call cancelled before admission emits exactly one dispatch-only timing event. - Added reusable [`JsonLogCapture`](https://github.com/openai/codex/blob/141110a73cff041c42fd51f47e363e83df1dde51/codex-rs/app-server/tests/common/json_logging.rs#L15) support, a [JSON-logging-specific `TestAppServer` constructor](https://github.com/openai/codex/blob/141110a73cff041c42fd51f47e363e83df1dde51/codex-rs/app-server/tests/common/test_app_server.rs#L172), and an [end-to-end app-server test](https://github.com/openai/codex/blob/141110a73cff041c42fd51f47e363e83df1dde51/codex-rs/app-server/tests/suite/logging.rs#L52) that drives a direct `exec_command` through the public v2 JSON-RPC API and validates the emitted JSON event. Exec-server-specific request and process timing remains in the stacked PR #30901. ## Suggested logging filter ```bash LOG_FORMAT=json \ RUST_LOG='warn,codex_core::tools::parallel=info' \ codex app-server ``` ## Event example Identifier and timing values are illustrative. ### `codex.tool_call` ```json { "timestamp": "2026-06-27T03:45:20.443Z", "level": "INFO", "fields": { "message": "tool call completed", "event.name": "codex.tool_call", "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "conversation.id": "67e55044-10b1-426f-9247-bb680e5fe0c8", "turn_id": "019f04f8-6ac2-78f1-8625-f04a6d35af18", "tool_name": "exec_command", "call_id": "call_7b8483", "tool_source": "direct", "execution_started": true, "dispatch_duration_ms": 12, "handler_duration_ms": 431, "total_duration_ms": 443 }, "target": "codex_core::tools::parallel" } ``` If execution never starts, `execution_started` is `false`, `handler_duration_ms` is `0`, and `dispatch_duration_ms` covers the full observed lifetime. If a duration cannot be represented as an unsigned 64-bit millisecond value, all three duration fields are omitted rather than populated with a sentinel that could corrupt downstream calculations. ## Test plan - `just test -p codex-core tool_call_timing_guard_ignores_code_mode_source` - `just test -p codex-core cancellation_before_dispatch_admission_logs_dispatch_only_timing` - `just test -p codex-app-server app_server_emits_structured_tool_call_timing_event` --- [//]: # (BEGIN SAPLING FOOTER) Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/30334). * __->__ #30334
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Why
An exec-server request and the process it starts have different lifetimes. A request-level measurement shows JSON-RPC handling latency, while a process-level measurement captures the actual execution lifetime and may continue after the initiating request has completed. Structured
INFOlogs let deployments consume both measurements through their existing logging pipeline.This PR is stacked on #30334, which adds the corresponding core tool and inference events.
What changed
codex.exec_server_requestevent when request handling completes.codex.exec_server_processevent when the underlying process reaches a terminal state.codex_exec_server::telemetryINFOtarget is enabled.Suggested logging filter
RUST_LOG='warn,codex_exec_server::telemetry=info' codex-exec-serverEvent examples
Identifier and timing values below are illustrative.
codex.exec_server_request{ "timestamp": "2026-06-27T03:45:20.447Z", "level": "INFO", "fields": { "message": "exec-server request completed", "event.name": "codex.exec_server_request", "request_id": "request-42", "method": "process/start", "result": "success", "duration_ms": 18, "traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01" }, "target": "codex_exec_server::telemetry" }codex.exec_server_process{ "timestamp": "2026-06-27T03:45:20.861Z", "level": "INFO", "fields": { "message": "exec-server process completed", "event.name": "codex.exec_server_process", "process_id": "process-018f27", "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736", "result": "success", "duration_ms": 414 }, "target": "codex_exec_server::telemetry" }Test plan
just test -p codex-exec-server exec_server_timing_events_are_structured_info_logsjust test -p codex-exec-server disabled_info_events_do_not_capture_process_log_valuesStack created with Sapling. Best reviewed with ReviewStack.