fix: merge duplicate-index entries in streaming tool_call deltas#3232
fix: merge duplicate-index entries in streaming tool_call deltas#3232sharziki wants to merge 1 commit into
Conversation
When the first streamed chunk contains multiple tool_calls entries with the same logical index (e.g. a tool name and its first argument fragment in one SSE event), the two early-return branches in accumulate_delta store the list as-is without merging by index. Later chunks merge into physical position 0 only, leaving the duplicate entry orphaned and producing invalid final JSON. Add _merge_indexed_list() that collapses entries sharing the same index value before the list is stored for the first time. Applied in both _deltas.py and _assistants.py which have identical accumulate_delta implementations. Fixes openai#3201
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c2bcb65bb3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if not is_dict(item): | ||
| continue | ||
| idx = item.get("index") # type: ignore[union-attr] | ||
| if not isinstance(idx, int): | ||
| continue |
There was a problem hiding this comment.
Do not silently drop malformed indexed list entries
When _merge_indexed_list() is triggered (first item is a dict with index), entries that are not dicts or whose index is missing/non-int are skipped via continue, which silently removes data from the stream snapshot. This differs from the existing list-merge path in accumulate_delta() that explicitly raises on invalid indexed entries, so malformed-but-present payloads now become silent corruption instead of a detectable error. In streams where providers emit mixed or partially malformed tool_calls arrays, this can produce incomplete final messages without surfacing any failure.
Useful? React with 👍 / 👎.
Summary
Fixes #3201
When the first streamed chunk contains multiple
tool_callsentries with the same logicalindex(e.g. a tool-call name and its first argument fragment in one SSE event), the two early-return branches inaccumulate_deltastore the list as-is without merging by index. Later chunks merge into physical position 0 only, leaving the duplicate entry orphaned and producing invalid final JSON.Root cause
Both early-return branches at the top of the per-key loop in
_deltas.py(and the identical copy in_assistants.py) assign the delta list wholesale and skip the index-based merge logic that handles duplicate indexes in steady state:Fix
Add
_merge_indexed_list()that collapses list entries sharing the sameindexvalue before the list is stored for the first time. Applied in both_deltas.pyand_assistants.py.indexfield in the first entry pass through unchangedindexvalues are merged using the existingaccumulate_deltarecursionTest plan
index: 0in first chunk,acc_value is None)key not in accvariant