Commit 5207db0
Remove batch-level padding from tokenize_sft_batch (#582)
* Remove batch-level padding from tokenize_sft_batch
tokenize_sft_batch was padding all trajectories to the longest sequence
in the batch, but every consumer (unsloth, megatron) processes them
individually. This wasted CPU memory and GPU compute on padding tokens.
Now each trajectory tensor keeps its natural length. The unsloth
training loop strips any residual padding before .to(device) for
robustness.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
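The stripping step described above can be sketched as follows. This is a minimal illustration, assuming right-padded tensors and a 0/1 attention mask; the function name and signature are hypothetical, not the repository's actual API:

```python
import torch

def strip_padding(input_ids: torch.Tensor, attention_mask: torch.Tensor):
    """Trim one trajectory's tensors to their natural length.

    With right-padding, the number of real tokens equals the sum of the
    attention mask, so slicing to that length drops every padding
    position before the tensors are moved with .to(device).
    (Illustrative helper, not the repo's actual code.)
    """
    length = int(attention_mask.sum().item())
    return input_ids[:length], attention_mask[:length]
```

Doing this on CPU, before the device transfer, is what saves the GPU compute: padded positions never reach the accelerator.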
* Use microbatch pattern in unsloth training loop
Match the serverless-training microbatch approach: process trajectories
in configurable microbatch groups with padding trimmed to the longest
in each group. Changing microbatch_size is a one-line change.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
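The microbatch pattern referenced above can be sketched like this: group variable-length trajectories into fixed-size microbatches and pad each group only to its own longest member, rather than to the longest sequence in the whole batch. All names here are illustrative assumptions, not the serverless-training code itself:

```python
import torch

def make_microbatches(trajectories, microbatch_size=4, pad_id=0):
    """Pad each microbatch group to the longest sequence in that group.

    `trajectories` is a list of 1-D LongTensors of varying lengths.
    Changing `microbatch_size` is the one-line knob the commit mentions.
    (Illustrative sketch, not the repo's actual code.)
    """
    batches = []
    for i in range(0, len(trajectories), microbatch_size):
        group = trajectories[i : i + microbatch_size]
        max_len = max(t.size(0) for t in group)
        padded = torch.full((len(group), max_len), pad_id, dtype=torch.long)
        for row, t in enumerate(group):
            padded[row, : t.size(0)] = t  # left-align real tokens, pad the tail
        batches.append(padded)
    return batches
```

Per-group padding is a middle ground: it keeps the batched-matmul efficiency of padding while wasting far fewer tokens than batch-wide padding.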
* Fix ruff formatting
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Simplify training loop back to plain per-trajectory iteration
Padding is now removed at the source in tokenize_sft_batch, so the
training loop doesn't need microbatch trimming logic.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
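The resulting loop reduces to plain iteration. A sketch under the assumption that tokenize_sft_batch now yields a list of unpadded per-trajectory dicts (all names hypothetical):

```python
import torch

def train_step_per_trajectory(batch, model, device="cpu"):
    """Plain per-trajectory iteration: no microbatch grouping or padding
    trimming, since each tensor already has its natural length.

    `batch` is a list of dicts with an 'input_ids' tensor; `model` is any
    callable mapping input_ids to a loss. (Illustrative, not repo code.)
    """
    losses = []
    for traj in batch:
        input_ids = traj["input_ids"].to(device)  # nothing to strip first
        losses.append(model(input_ids))
    return losses
```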
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

1 parent f79fa5e · commit 5207db0
1 file changed
Lines changed: 8 additions & 43 deletions