fix: use low reasoning effort instead of omitting it for NoThinking
OpenAI reasoning models (o-series, gpt-5) always reason internally;
omitting the reasoning parameter does not disable reasoning -- it uses
the model's default effort. Those hidden reasoning tokens count against
max_output_tokens, so with a small budget (e.g. title generation) the
model can exhaust all tokens on reasoning and return empty visible text.
When NoThinking is set, explicitly send reasoning_effort: low so the
model spends as few output tokens as possible on reasoning. We use
"low" rather than "minimal" because older models (o3-mini, o1) only
accept low/medium/high.
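
A minimal sketch of the request-building rule described above, not the
project's actual code. The function name, the no_thinking flag, and the
dict layout (a Responses-style "reasoning" object next to
max_output_tokens) are assumptions for illustration, and the sketch
assumes the target is a reasoning model.

```python
from typing import Any, Dict


def build_request_params(
    model: str,
    prompt: str,
    max_output_tokens: int,
    no_thinking: bool,
) -> Dict[str, Any]:
    """Build a request payload (hypothetical helper, illustration only)."""
    params: Dict[str, Any] = {
        "model": model,
        "input": prompt,
        "max_output_tokens": max_output_tokens,
    }
    if no_thinking:
        # Omitting the field would fall back to the model's default effort,
        # so ask for the lowest effort that older models also accept.
        params["reasoning"] = {"effort": "low"}
    return params
```

With a small budget such as title generation, the explicit "low" keeps
as much of max_output_tokens as possible for visible text.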
Also normalize the reasoning field in the VCR request matcher so
cassettes recorded with or without reasoning config still match.
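
The matcher normalization can be sketched roughly as follows. This is an
illustration under assumptions, not the matcher in this repository: the
helper names are hypothetical, and the key names "reasoning" and
"reasoning_effort" are assumed to be where the reasoning config lives in
the JSON request body.

```python
import json
from typing import Any, Dict

# Assumed locations of the reasoning config in a request body.
REASONING_KEYS = ("reasoning", "reasoning_effort")


def normalize_body(raw: str) -> Dict[str, Any]:
    """Parse a JSON request body and drop top-level reasoning config."""
    body = json.loads(raw)
    for key in REASONING_KEYS:
        body.pop(key, None)
    return body


def bodies_match(recorded: str, incoming: str) -> bool:
    """Compare two request bodies while ignoring reasoning config."""
    return normalize_body(recorded) == normalize_body(incoming)
```

Comparing parsed bodies rather than raw strings also makes the match
insensitive to key order and whitespace.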
Fixes #2318