fix(apicompat): sum input_tokens with OpenAI semantics when converting Anthropic to Responses #1

Open
stabey wants to merge 1 commit into main from fix/anthropic-to-responses-cache-tokens

@stabey stabey commented May 27, 2026

Overview

  • Anthropic Messages input_tokens excludes cache_read_input_tokens and cache_creation_input_tokens, whereas OpenAI Responses input_tokens is the total input, including cache hits.
  • The original reverse converter (AnthropicToResponsesResponse and the streaming state machine in backend/internal/pkg/apicompat/anthropic_to_responses_response.go) passed Anthropic's InputTokens straight through to the OpenAI side. As a result, the prompt_tokens / input_tokens seen by clients undercounted the cache-hit portion, and cache_creation was dropped entirely.
  • This PR fixes the non-streaming path and the streaming makeResponsesCompletedEvent so that cache_read + cache_creation are added back when InputTokens is written. It also adds a CacheCreationInputTokens field to the streaming state and captures it correctly in message_start and message_delta.
  • This single fix benefits 6 downstream paths:
    • Anthropic upstream → Responses client (sync + streaming): gateway_forward_as_responses.go
    • Anthropic upstream → ChatCompletions client (sync + streaming): gateway_forward_as_chat_completions.go
    • Gemini upstream → ChatCompletions client (sync + streaming): gemini_chat_completions_compat_service.go (path: Gemini → Anthropic → Responses → CC)
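The arithmetic behind the non-streaming fix can be sketched as follows. This is a minimal illustration, not the code in this PR: the struct names AnthropicUsage and ResponsesUsage and the helper toResponsesUsage are hypothetical, and only the field names InputTokens and CacheCreationInputTokens come from the description above.

```go
package main

import "fmt"

// AnthropicUsage is a hypothetical stand-in for usage in Anthropic semantics:
// InputTokens excludes both cache counters.
type AnthropicUsage struct {
	InputTokens              int
	OutputTokens             int
	CacheReadInputTokens     int
	CacheCreationInputTokens int
}

// ResponsesUsage is a hypothetical stand-in for usage in OpenAI Responses
// semantics: InputTokens is the total input, cached tokens included.
type ResponsesUsage struct {
	InputTokens  int
	OutputTokens int
}

// toResponsesUsage adds cache_read and cache_creation back into input_tokens
// instead of passing Anthropic's InputTokens straight through.
func toResponsesUsage(u AnthropicUsage) ResponsesUsage {
	return ResponsesUsage{
		InputTokens:  u.InputTokens + u.CacheReadInputTokens + u.CacheCreationInputTokens,
		OutputTokens: u.OutputTokens,
	}
}

func main() {
	u := AnthropicUsage{
		InputTokens:              100,
		OutputTokens:             20,
		CacheReadInputTokens:     800,
		CacheCreationInputTokens: 50,
	}
	got := toResponsesUsage(u)
	fmt.Println(got.InputTokens, got.OutputTokens) // prints: 950 20
}
```

With plain passthrough, the client in this example would have seen input_tokens = 100 rather than 950.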

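The billing path is unaffected: billing reads ForwardResult.Usage (Anthropic semantics, built from the raw events) and does not depend on the mistranslated response body.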

Test plan

  • go test ./internal/pkg/apicompat/... (new tests cover the non-streaming and streaming paths, including both cases where the cache fields arrive in message_start and in message_delta)
  • go test -tags unit -run "TestHandleResponses|TestHandleCC|Gemini|Antigravity" ./internal/service/
  • go vet ./internal/pkg/apicompat/... ./internal/service/...

The new tests are in backend/internal/pkg/apicompat/anthropic_responses_test.go:

  • TestAnthropicToResponsesResponse_CacheTokensUseOpenAIInputSemantics
  • TestAnthropicToResponsesResponse_NoCacheTokens
  • TestAnthropicEventToResponses_CacheTokensRoundTripFromMessageStart
  • TestAnthropicEventToResponses_CacheTokensFromMessageDelta
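The two streaming cases these tests name (cache fields arriving in message_start or in message_delta) can be illustrated with a small state sketch. It assumes that a usage field present in a later event overwrites the stored value. The types usageDelta and streamState and the methods apply and completedInputTokens are hypothetical and are not the PR's actual state machine.

```go
package main

import "fmt"

// usageDelta is a hypothetical view of the usage object carried by
// message_start or message_delta. Nil means the field was absent.
type usageDelta struct {
	InputTokens              *int
	OutputTokens             *int
	CacheReadInputTokens     *int
	CacheCreationInputTokens *int
}

// streamState is a hypothetical accumulator kept across stream events,
// still in Anthropic semantics.
type streamState struct {
	InputTokens              int
	OutputTokens             int
	CacheReadInputTokens     int
	CacheCreationInputTokens int
}

// apply records every usage field present in the event (assumed overwrite
// semantics), so cache counters are captured whichever event carries them.
func (s *streamState) apply(u usageDelta) {
	if u.InputTokens != nil {
		s.InputTokens = *u.InputTokens
	}
	if u.OutputTokens != nil {
		s.OutputTokens = *u.OutputTokens
	}
	if u.CacheReadInputTokens != nil {
		s.CacheReadInputTokens = *u.CacheReadInputTokens
	}
	if u.CacheCreationInputTokens != nil {
		s.CacheCreationInputTokens = *u.CacheCreationInputTokens
	}
}

// completedInputTokens is the OpenAI-semantic total that a completed event
// would report.
func (s *streamState) completedInputTokens() int {
	return s.InputTokens + s.CacheReadInputTokens + s.CacheCreationInputTokens
}

func intPtr(v int) *int { return &v }

func main() {
	var s streamState
	// message_start carries only the uncached input count in this example.
	s.apply(usageDelta{InputTokens: intPtr(100)})
	// message_delta later carries the cache counters and the output count.
	s.apply(usageDelta{
		OutputTokens:             intPtr(20),
		CacheReadInputTokens:     intPtr(800),
		CacheCreationInputTokens: intPtr(50),
	})
	fmt.Println(s.completedInputTokens()) // prints: 950
}
```

Pointers are used here only to tell an absent field from an explicit zero; the real converter may represent that differently.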

🤖 Generated with Claude Code

…hropic to Responses

Anthropic Messages reports input_tokens excluding cache_read/cache_creation, but
OpenAI Responses input_tokens is the total including cached tokens. The reverse
converter passed Anthropic's input_tokens straight through, so client-facing
prompt_tokens/input_tokens were short by the cached count and cache_creation
was dropped entirely.

Fix the non-stream path and the streaming state machine to add cache_read +
cache_creation back into input_tokens, and track CacheCreationInputTokens on
the streaming state. Six downstream paths benefit (Anthropic->Responses,
Anthropic->ChatCompletions, Gemini->ChatCompletions, each sync + stream).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@stabey stabey changed the title from "fix(apicompat): emit OpenAI-semantic input_tokens when converting Anthropic to Responses" to "fix(apicompat): sum input_tokens with OpenAI semantics when converting Anthropic to Responses" May 27, 2026
@stabey stabey force-pushed the fix/anthropic-to-responses-cache-tokens branch from 5132367 to 89dffdd May 27, 2026 14:52