Commit f5d72b5
committed
Enhance Megatron model configuration and training index handling
- Added internal configuration for model registration in yes-no-maybe-megatron.py to optimize GPU memory utilization and tensor parallel size.
- Refactored index calculation in train.py to improve efficiency and handle cases where indices may be empty, ensuring robust data parallelism during training.1 parent 0ad9a89 commit f5d72b5
2 files changed
Lines changed: 15 additions & 12 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4 | 4 | | |
5 | 5 | | |
6 | 6 | | |
| 7 | + | |
7 | 8 | | |
8 | 9 | | |
9 | 10 | | |
| |||
43 | 44 | | |
44 | 45 | | |
45 | 46 | | |
| 47 | + | |
| 48 | + | |
| 49 | + | |
| 50 | + | |
| 51 | + | |
| 52 | + | |
46 | 53 | | |
47 | 54 | | |
48 | 55 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
8 | 8 | | |
9 | 9 | | |
10 | 10 | | |
| 11 | + | |
11 | 12 | | |
12 | 13 | | |
13 | 14 | | |
| |||
213 | 214 | | |
214 | 215 | | |
215 | 216 | | |
216 | | - | |
217 | | - | |
218 | | - | |
219 | | - | |
220 | | - | |
221 | | - | |
222 | | - | |
223 | | - | |
224 | | - | |
225 | | - | |
226 | | - | |
227 | | - | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
228 | 224 | | |
229 | 225 | | |
230 | 226 | | |
| |||
0 commit comments