[codex] Add generic model context window hints #23
📝 Walkthrough
This PR adds a
Changes: Context Window Inference
Estimated code review effort: 2 (Simple) | ~10 minutes
🚥 Pre-merge checks: ✅ 5 passed
🧹 Nitpick comments (1)
src/harness/model/mod.rs (1)
47-68: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win
Document the ordering invariant of `MODEL_CONTEXT_PATTERNS`.
Correctness relies on more-specific substrings (e.g. `gpt-4.1`, `gpt-4o`, `gpt-4-turbo`) preceding shorter, more general ones (`gpt-4`), since `find_map` returns the first match. This is implicit and easy to break silently when new entries are added later.
♻️ Suggested doc comment
 const MODEL_CONTEXT_PATTERNS: &[(&str, ContextPatternMatch, u64)] = &[
+    // NOTE: order matters: `find_map` returns the first match, so more
+    // specific substrings (e.g. "gpt-4.1") must precede shorter, more
+    // general ones (e.g. "gpt-4") that would otherwise shadow them.
     ("claude-haiku-4.5", ContextPatternMatch::Substring, 200_000),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/harness/model/mod.rs` around lines 47 - 68, Document the ordering rule for MODEL_CONTEXT_PATTERNS in model/mod.rs: because ModelContext::context_pattern_for uses find_map, the first matching entry wins, so more specific substrings like gpt-4.1, gpt-4o, and gpt-4-turbo must stay before broader matches like gpt-4. Add a doc comment on MODEL_CONTEXT_PATTERNS that states this invariant and warns future additions must preserve specificity-first ordering to avoid silent misclassification.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/harness/model/mod.rs`:
- Around line 47-68: Document the ordering rule for MODEL_CONTEXT_PATTERNS in
model/mod.rs: because ModelContext::context_pattern_for uses find_map, the first
matching entry wins, so more specific substrings like gpt-4.1, gpt-4o, and
gpt-4-turbo must stay before broader matches like gpt-4. Add a doc comment on
MODEL_CONTEXT_PATTERNS that states this invariant and warns future additions
must preserve specificity-first ordering to avoid silent misclassification.
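A doc comment states the invariant, but a small check can also enforce it. The sketch below is an illustration only: `PATTERNS`, `lookup`, and `shadowed_pairs` are hypothetical names, and the table is a simplified stand-in that drops the `ContextPatternMatch` column and treats every entry as a substring match.

```rust
// Hypothetical, simplified stand-in for MODEL_CONTEXT_PATTERNS: the real
// table also carries a ContextPatternMatch column, omitted here.
const PATTERNS: &[(&str, u64)] = &[
    ("gpt-4.1", 1_047_576),
    ("gpt-4o", 128_000),
    ("gpt-4-turbo", 128_000),
    ("gpt-4", 128_000),
];

/// First-match substring lookup, mirroring the `find_map` behaviour the
/// review comment describes.
fn lookup(patterns: &[(&str, u64)], model_id: &str) -> Option<u64> {
    patterns
        .iter()
        .find_map(|(pat, window)| model_id.contains(*pat).then_some(*window))
}

/// Returns every (earlier, later) pair where the earlier pattern is a
/// substring of the later one. Any id matching the later pattern also
/// matches the earlier one, so the later entry is unreachable.
fn shadowed_pairs<'a>(patterns: &[(&'a str, u64)]) -> Vec<(&'a str, &'a str)> {
    let mut out = Vec::new();
    for (i, (earlier, _)) in patterns.iter().enumerate() {
        for (later, _) in &patterns[i + 1..] {
            if later.contains(*earlier) {
                out.push((*earlier, *later));
            }
        }
    }
    out
}
```

Asserting `shadowed_pairs(PATTERNS).is_empty()` in a unit test would turn a misordered insertion into a test failure instead of a silent misclassification.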
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 606b6a99-9637-4766-99b8-511f8c42f891
📒 Files selected for processing (3)
src/harness/mod.rs
src/harness/model/mod.rs
src/harness/model/test.rs
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 45702373ae
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
("gpt-4.1", ContextPatternMatch::Substring, 1_047_576),
("gpt-4o", ContextPatternMatch::Substring, 128_000),
("gpt-4-turbo", ContextPatternMatch::Substring, 128_000),
("gpt-4", ContextPatternMatch::Substring, 128_000),
OpenAI lists the plain gpt-4 model's context window as 8,192 tokens (https://developers.openai.com/api/docs/models/gpt-4), so context_window_for_model_id("gpt-4") now returns a 128k budget for a model that rejects prompts above 8k. Because this helper is intended for pre-dispatch budgeting, callers that use the plain gpt-4 id can skip summarization/compaction and hit provider context-limit errors; keep the 128k value scoped to gpt-4-turbo/gpt-4o variants.
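One way to scope the 128k value as suggested is to keep the specific variants first and give the plain `gpt-4` fallback the 8,192 figure cited above. This is a minimal sketch under assumptions: `GPT4_PATTERNS` and `gpt4_window` are hypothetical names, and the table omits the `ContextPatternMatch` column from the PR.

```rust
// Hypothetical simplified table (substring matching only).
const GPT4_PATTERNS: &[(&str, u64)] = &[
    ("gpt-4.1", 1_047_576),
    ("gpt-4o", 128_000),
    ("gpt-4-turbo", 128_000),
    // Plain gpt-4 fallback: 8,192 tokens, per the OpenAI figure cited in
    // the review comment. Must stay last so it does not shadow the
    // more specific entries above.
    ("gpt-4", 8_192),
];

/// First-match lookup over the simplified table.
fn gpt4_window(model_id: &str) -> Option<u64> {
    GPT4_PATTERNS
        .iter()
        .find_map(|(pat, window)| model_id.contains(*pat).then_some(*window))
}
```

With this ordering, any id that only matches the generic `gpt-4` entry gets the conservative budget, so the failure mode becomes earlier compaction rather than a provider context-limit error.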
("llama-3", ContextPatternMatch::Substring, 128_000),
("llama3", ContextPatternMatch::Substring, 128_000),
Don't assign Llama 3 the Llama 3.1 window
Meta's Llama 3 model card lists an 8k context length for the Llama 3 8B/70B models (https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct), while the 128k window belongs to later Llama 3.1 models. With these generic patterns, context_window_for_model_id("llama3:8b") returns 128k, so local/hosted Llama 3 callers can over-budget by 16x and fail at runtime; add separate llama-3.1/llama3.1 patterns or lower the plain Llama 3 fallback.
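The first option, separate Llama 3.1 patterns ahead of a lowered plain Llama 3 fallback, could look roughly like the sketch below. `LLAMA_PATTERNS` and `llama_window` are hypothetical names, the table omits the `ContextPatternMatch` column, and lowercasing the id before matching is an assumption made here, not something the PR is known to do.

```rust
// Hypothetical simplified table (substring matching only).
const LLAMA_PATTERNS: &[(&str, u64)] = &[
    // Llama 3.1 entries must precede the generic Llama 3 entries,
    // because "llama3" is a substring of "llama3.1".
    ("llama-3.1", 128_000),
    ("llama3.1", 128_000),
    // Plain Llama 3: 8k context per the model card cited in the review.
    ("llama-3", 8_192),
    ("llama3", 8_192),
];

/// First-match lookup. Lowercasing is an assumption of this sketch so that
/// Hugging Face style ids such as "Meta-Llama-3-8B-Instruct" also match.
fn llama_window(model_id: &str) -> Option<u64> {
    let id = model_id.to_ascii_lowercase();
    LLAMA_PATTERNS
        .iter()
        .find_map(|(pat, window)| id.contains(*pat).then_some(*window))
}
```

Later point releases would fall through to the 8k fallback in this sketch, so each would need its own entry if its window differs.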
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 8092871096
("o1", ContextPatternMatch::Segment, 200_000),
("o3", ContextPatternMatch::Segment, 200_000),
("deepseek", ContextPatternMatch::Substring, 128_000),
("gemma3", ContextPatternMatch::Substring, 8_192),
Return Gemma 3's input context window
For Gemma 3 ids such as gemma3:4b (the added test covers this exact Ollama-style id), this returns 8,192 even though Google's Gemma 3 model card lists 128K input context for the 4B/12B/27B sizes and 32K for 1B/270M: https://ai.google.dev/gemma/docs/core/model_card_3. Since this helper is meant to feed pre-dispatch and capability budgeting, using the 8K value will unnecessarily reject/compact long prompts that those Gemma 3 models can accept; split Gemma 3 from the older Gemma fallback or make the pattern size-aware.
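A size-aware variant might be sketched as follows. Everything here is hypothetical: `gemma3_window` is not part of the PR, the `128_000` and `32_000` values round the 128K and 32K figures quoted above the same way the existing table rounds 128K, and falling back to the smaller window when no size tag is present is a conservative assumption.

```rust
/// Hypothetical size-aware lookup for Gemma 3 ids such as "gemma3:4b"
/// (Ollama style) or "gemma-3-27b-it".
fn gemma3_window(model_id: &str) -> Option<u64> {
    let id = model_id.to_ascii_lowercase();
    if !id.contains("gemma3") && !id.contains("gemma-3") {
        return None;
    }
    // Split the id into segments and look for a size tag that the cited
    // model card associates with the larger input context.
    let mut segments = id.split(|c: char| c == ':' || c == '-' || c == '/');
    let large = segments.any(|s| matches!(s, "4b" | "12b" | "27b"));
    // Assumption: ids with a small size tag (1b, 270m) or no recognisable
    // size tag get the smaller window.
    Some(if large { 128_000 } else { 32_000 })
}
```

Defaulting unknown sizes to the smaller window keeps the helper on the safe side for pre-dispatch budgeting, at the cost of compacting earlier than strictly necessary for untagged ids.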