
fix: avoid inferred caps for explicit-only providers #402

Open
Fengzdadi wants to merge 1 commit into MoonshotAI:main from Fengzdadi:codex/explicit-only-completion-budget

Conversation

@Fengzdadi
Contributor

Related Issue

Resolve #349

Related to #318

Problem

OpenAI-compatible providers can reject requests when Kimi Code derives a completion-token cap from a catalog model's context window and sends that value as max_tokens / max_output_tokens. This surfaced in #349 with MiMo-V2.5-Pro: the catalog context window is much larger than the provider's accepted output-token cap, so the request returned 400 Param Incorrect.

#318 addresses the common configured-output-limit path by using max_output_size as the completion cap. This PR adds the companion guard for providers that should only receive configured hard caps and should not receive fallback caps inferred from context windows.

What changed

  • Added an optional completionBudgetStrategy field to ChatProvider.
  • Marked OpenAI Chat Completions and OpenAI Responses providers as explicit-only for completion budgeting.
  • Updated completion-budget application so explicit-only providers still receive configured hard caps, but skip fallback/context-window inferred caps.
  • Added focused tests for the explicit-only strategy and provider declarations.
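The guard described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `completionBudgetStrategy`, `hardCap`, and `withMaxCompletionTokens` are names that appear in this conversation, while `applyCompletionBudget`, `fallbackCap`, `maxCompletionTokens`, the simplified `ChatProvider` shape, and the numeric values are assumptions made for the example.

```typescript
// Hypothetical, simplified shapes; the real ChatProvider has more fields.
type CompletionBudgetStrategy = 'explicit-only';

interface ChatProvider {
  name: string;
  // Optional: providers that omit it keep accepting inferred caps.
  completionBudgetStrategy?: CompletionBudgetStrategy;
  // Assumed field holding the cap that would be sent as max_tokens / max_output_tokens.
  maxCompletionTokens?: number;
}

interface CompletionBudget {
  hardCap?: number; // explicitly configured cap
  fallbackCap?: number; // assumed name for a cap inferred from the context window
}

// Returns a copy of the provider with the cap applied.
function withMaxCompletionTokens(provider: ChatProvider, cap: number): ChatProvider {
  return { ...provider, maxCompletionTokens: cap };
}

function applyCompletionBudget(args: {
  provider: ChatProvider;
  budget: CompletionBudget;
}): ChatProvider {
  // Explicit-only providers never receive an inferred cap.
  if (
    args.budget.hardCap === undefined &&
    args.provider.completionBudgetStrategy === 'explicit-only'
  ) {
    return args.provider;
  }
  // A configured hard cap wins; otherwise fall back to the inferred cap, if any.
  const cap = args.budget.hardCap ?? args.budget.fallbackCap;
  return cap === undefined ? args.provider : withMaxCompletionTokens(args.provider, cap);
}

// Illustrative values only.
const explicitOnly: ChatProvider = {
  name: 'openai-compatible',
  completionBudgetStrategy: 'explicit-only',
};
const inferredOnly = applyCompletionBudget({
  provider: explicitOnly,
  budget: { fallbackCap: 1000000 },
});
const configured = applyCompletionBudget({
  provider: explicitOnly,
  budget: { hardCap: 8192, fallbackCap: 1000000 },
});
console.log(inferredOnly.maxCompletionTokens); // undefined: inferred cap skipped
console.log(configured.maxCompletionTokens); // 8192: configured cap still applied
```

In this sketch, a provider without a declared strategy keeps the previous behavior and still receives the fallback cap.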

Checklist

  • I have read the CONTRIBUTING document.
  • I have linked a related issue, or explained the problem above.
  • I have added tests that prove my feature works.
  • Ran gen-changesets skill, or this PR needs no changeset.
  • Ran gen-docs skill, or this PR needs no doc update.

Verification

  • node node_modules/vitest/vitest.mjs run packages/agent-core/test/utils/completion-budget.test.ts packages/agent-core/test/agent/kosong-llm.test.ts
  • node node_modules/vitest/vitest.mjs run packages/kosong/test/openai-legacy.test.ts packages/kosong/test/openai-responses.test.ts
  • node node_modules/typescript/bin/tsc -p packages/agent-core/tsconfig.json --noEmit
  • node node_modules/typescript/bin/tsc -p packages/kosong/tsconfig.json --noEmit
  • git diff --check
  • Live MiMo smoke: xiaomi-token-plan-sgp/mimo-v2.5-pro with thinking enabled returned successfully after avoiding the inferred context-sized cap.

@changeset-bot

changeset-bot Bot commented Jun 4, 2026

🦋 Changeset detected

Latest commit: 1731864

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages:

Name                      Type
@moonshot-ai/kimi-code    Patch
@moonshot-ai/agent-core   Patch
@moonshot-ai/kosong       Patch


@Fengzdadi marked this pull request as ready for review June 4, 2026 04:28
@pkg-pr-new

pkg-pr-new Bot commented Jun 4, 2026

pnpm dlx https://pkg.pr.new/@moonshot-ai/kimi-code@7423570
npx https://pkg.pr.new/@moonshot-ai/kimi-code@7423570

commit: 7423570


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7423570e4a


Comment on lines +81 to +85
    if (
      args.budget.hardCap === undefined &&
      args.provider.completionBudgetStrategy === 'explicit-only'
    ) {
      return args.provider;

[P2] Honor max_output_size for explicit-only providers

When an openai/openai_responses model alias sets max_output_size, resolveCompletionBudget still produces only a fallback unless an env var is set, and toKosongProviderConfig only forwards maxOutputSize to Anthropic (packages/agent-core/src/session/provider-manager.ts:220-260). Because this early return fires before withMaxCompletionTokens, those explicit-only aliases drop the configured per-alias cap and send no max_tokens/max_output_tokens, leaving users unable to cap OpenAI-compatible providers that reject oversized output budgets.
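The concern can be illustrated with a small sketch. It models the behavior as the review comment describes it, not the actual `resolveCompletionBudget` implementation; `ModelAlias`, `resolveAsReviewed`, `resolvePromotingAlias`, `capForExplicitOnly`, the precedence of the environment cap over the alias cap, and all numeric values are assumptions for illustration.

```typescript
// Hypothetical alias shape; field names are assumptions.
interface ModelAlias {
  maxOutputSize?: number; // per-alias max_output_size
  contextWindow: number;
}

interface CompletionBudget {
  hardCap?: number;
  fallbackCap?: number;
}

// Modeled on the review's description: only an environment-provided cap
// becomes a hard cap, so a per-alias max_output_size ends up as a fallback.
function resolveAsReviewed(alias: ModelAlias, envCap?: number): CompletionBudget {
  return { hardCap: envCap, fallbackCap: alias.maxOutputSize ?? alias.contextWindow };
}

// One possible remedy (an assumption, not the PR's code): promote the
// per-alias max_output_size to a hard cap when no environment cap is set.
function resolvePromotingAlias(alias: ModelAlias, envCap?: number): CompletionBudget {
  return { hardCap: envCap ?? alias.maxOutputSize, fallbackCap: alias.contextWindow };
}

// An explicit-only provider only ever sees the hard cap.
function capForExplicitOnly(budget: CompletionBudget): number | undefined {
  return budget.hardCap;
}

// Illustrative values only.
const alias: ModelAlias = { maxOutputSize: 8192, contextWindow: 1000000 };
console.log(capForExplicitOnly(resolveAsReviewed(alias))); // undefined: configured cap dropped
console.log(capForExplicitOnly(resolvePromotingAlias(alias))); // 8192: configured cap honored
```

Under these assumptions, the early return is safe only if every user-configured limit reaches the budget as a hard cap before the explicit-only check runs.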


@Fengzdadi force-pushed the codex/explicit-only-completion-budget branch from 7423570 to 1731864 June 4, 2026 05:49

Development

Successfully merging this pull request may close these issues.

Error when configuring a provider
