* feat: add TinkerNativeBackend for native training
Separate native Tinker training/inference from LocalBackend to keep the API
clear while enabling explicit loss/checkpoint behavior and config.
* fix: address pre-commit type and format issues
Align tinker native types with OpenAI tooling and update tests to avoid
invalid type expressions under pyright.
* feat: add safer state merge and policy tracking
Use merge_state for backend persistence to avoid clobbering model state, and
fail fast on trajectories without Choice objects to prevent no-op training.
Expose policy version fields on trajectories for off-policy tracking.
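A minimal sketch of the merge-instead-of-overwrite idea, assuming state is a plain nested dict. The name `merge_state_sketch` and the recursive merge are illustrative assumptions, not the actual `merge_state` implementation, which may merge shallowly or differ in other ways:

```python
from typing import Any


def merge_state_sketch(existing: dict[str, Any], updates: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: layer `updates` over `existing` without dropping other keys."""
    merged = dict(existing)  # copy so the caller's dict is not mutated
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Recurse so nested state is merged rather than replaced wholesale.
            merged[key] = merge_state_sketch(merged[key], value)
        else:
            merged[key] = value
    return merged


saved = {"step": 3, "optimizer": {"lr": 0.1}}
print(merge_state_sketch(saved, {"optimizer": {"beta": 0.9}}))
```

Keys absent from the update (here `step` and `optimizer.lr`) survive, which is the property that avoids clobbering previously persisted state.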
* feat(pipeline): add PipelineTrainer for async 3-stage training
Add a new PipelineTrainer module that implements an asynchronous
3-stage pipeline (rollout, training, eval) for efficient RL training:
- PipelineTrainer: Main trainer class with configurable workers, batch sizes, and off-policy limits
- StatusReporter: Live progress reporting with tqdm and periodic logging
- PipelineState: Shared state dataclass for stage coordination
- Type definitions for RolloutFn, SingleRolloutFn, EvalFn
Key features:
- Async rollout workers with policy version tracking
- Stale sample detection and automatic discard
- Zero-variance group handling with collapse detection
- Graceful signal handling (SIGINT/SIGTERM)
- State persistence for training resumption
- Eval scheduling with configurable intervals
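The stale-sample and zero-variance checks listed above can be sketched roughly as follows. Every name here (`Sample`, `is_stale`, `has_zero_variance`, `filter_groups`, `max_steps_off_policy`) is a hypothetical stand-in, and the real PipelineTrainer logic, including its collapse detection, may differ:

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """Hypothetical stand-in for a trajectory with a reward and policy version."""
    reward: float
    policy_version: int


def is_stale(sample: Sample, current_version: int, max_steps_off_policy: int) -> bool:
    # A sample is stale when the policy that produced it lags too far behind.
    return current_version - sample.policy_version > max_steps_off_policy


def has_zero_variance(group: list[Sample]) -> bool:
    # Identical rewards across a group give no relative learning signal.
    return len({s.reward for s in group}) <= 1


def filter_groups(
    groups: list[list[Sample]], current_version: int, max_steps_off_policy: int
) -> list[list[Sample]]:
    kept = []
    for group in groups:
        fresh = [s for s in group if not is_stale(s, current_version, max_steps_off_policy)]
        if not has_zero_variance(fresh):
            kept.append(fresh)
    return kept
```

In this sketch a group is also dropped when discarding stale samples leaves fewer than two distinct rewards, since such a group would contribute nothing to a group-relative update.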
Also includes:
- yes_no_maybe_pipeline.py: Simple example showing basic usage
- binary_prefix_tool_pipeline.py: Complex example with tool calls
Updates to tinker_native backend:
- Add debug logging via ART_TINKER_TRAIN_LOG/ART_TINKER_SAMPLE_LOG
- Add fallback for create_conversation_prefix_with_tools
- Fix tool_call id handling in OpenAI server responses
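As an illustration of environment-gated debug logging, a sketch under assumptions: only the two variable names come from this change. The accepted truthy values, the logger name `tinker_native_sketch`, and both helper functions are hypothetical, and the backend may interpret these variables differently (for example as file paths):

```python
import logging
import os


def env_flag(name: str) -> bool:
    """Hypothetical parser: treat common truthy strings as enabled."""
    return os.environ.get(name, "").strip().lower() in {"1", "true", "yes", "on"}


def configure_debug_logging() -> dict[str, bool]:
    flags = {
        name: env_flag(name)
        for name in ("ART_TINKER_TRAIN_LOG", "ART_TINKER_SAMPLE_LOG")
    }
    if any(flags.values()):
        # Assumed logger name, used only for this sketch.
        logging.getLogger("tinker_native_sketch").setLevel(logging.DEBUG)
    return flags
```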
* fix: resolve type errors after rebasing on main with ty
- Fix import path for get_free_port (moved from service to server)
- Add cast for merge_state return type
- Fix test to use async function for TrajectoryGroup creation
- Move tinker deps to separate dependency group
- Add tinker to allowed-unresolved-imports for ty
---------
Co-authored-by: Cursor Bot <bot@cursor.com>