Commit d69345e
Remove beta KL divergence from training loss (#607)
Remove the Schulman KL estimator (beta * KL) that was added directly
to the training loss. The kl_penalty_coef mechanism (advantage
adjustment) remains as the preferred approach for KL regularization.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
1 parent ec1c174 commit d69345e
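
The two regularization styles named in the commit message can be contrasted with a minimal sketch. It is not the repository's code: the function names, the argument layout, and the exact form of the advantage adjustment (a per-token log-ratio penalty, as in common RLHF reward shaping) are assumptions for illustration. Only `beta`, `kl_penalty_coef`, and the use of Schulman's estimator come from the message itself.

```python
import math


def schulman_kl(new_logprob: float, ref_logprob: float) -> float:
    """Schulman's low-variance KL estimator (often called k3) for one token.

    With r = p_ref / p_new, the estimate is (r - 1) - log(r), which is
    always non-negative and is zero when the two policies agree.
    """
    log_ratio = ref_logprob - new_logprob
    return math.exp(log_ratio) - log_ratio - 1.0


def loss_with_beta_kl(policy_loss, new_logprobs, ref_logprobs, beta):
    """Removed style (hypothetical helper): add beta * mean KL to the loss."""
    kl = [schulman_kl(n, r) for n, r in zip(new_logprobs, ref_logprobs)]
    return policy_loss + beta * sum(kl) / len(kl)


def advantages_with_kl_penalty(advantages, new_logprobs, ref_logprobs,
                               kl_penalty_coef):
    """Retained style (assumed form): penalize advantages instead of the loss.

    Each token's advantage is reduced by kl_penalty_coef times its
    log-ratio log(p_new / p_ref), so drift from the reference policy
    lowers the learning signal rather than adding a separate loss term.
    """
    return [
        a - kl_penalty_coef * (n - r)
        for a, n, r in zip(advantages, new_logprobs, ref_logprobs)
    ]
```

In the first style the KL term contributes its own gradient through the loss; in the second the loss function is unchanged and only the advantages it consumes are shifted, which is why the loss-side term can be removed without giving up KL regularization.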
7 files changed
Lines changed: 6 additions & 25 deletions
File tree
- src/art
  - local
  - megatron
  - preprocessing
  - serverless
  - unsloth
| Original file lines | Diff lines | Change |
|---|---|---|
| 427-433 | 427-432 | line 430 deleted |
| 470-476 | 469-474 | line 473 deleted |
| 527-533 | 525-531 | line 530 replaced by new line 528 |
| Original file lines | Diff lines | Change |
|---|---|---|
| 14-20 | 14-19 | line 17 deleted |
| 124-139 | 123-130 | lines 127-132, 134, and 136 deleted |
| 144-150 | 135-140 | line 147 deleted |
| Original file lines | Diff lines | Change |
|---|---|---|
| 250-256 | 250-256 | line 253 replaced by new line 253 |
| Original file lines | Diff lines | Change |
|---|---|---|
| 41-49 | 41-47 | lines 44-46 replaced by new line 44 |
| Original file lines | Diff lines | Change |
|---|---|---|
| 149-155 | 149-154 | line 152 deleted |
| 179-185 | 178-183 | line 182 deleted |
| 212-218 | 210-216 | line 215 replaced by new line 213 |
| Original file lines | Diff lines | Change |
|---|---|---|
| 16-22 | 16-21 | line 19 deleted |
| Original file lines | Diff lines | Change |
|---|---|---|
| 138-144 | 138-144 | line 141 replaced by new line 141 |
| 173-183 | 173-181 | lines 176-177 deleted; line 180 replaced by new line 178 |