Commit ab411da
feat(archon): add ZBVZeroBubble pipeline schedule support (#916)
Add V-style (zero bubble) pipeline scheduling to ArchonEngine.
ZBVZeroBubble splits backward into input-grad and weight-grad steps,
enabling near-zero pipeline bubbles with 2 stages per rank.
Key changes:
- V-style stage assignment in _get_stage_indices() (rank 0 gets first
and last stages)
- Schedule-aware _pp_last_stage_rank determination
- Auto-disable torch.compile and op-level selective AC for V-style
schedules (incompatible with split backward)
- Generalize V-style guards to also cover ScheduleDualPipeV for
forward compatibility
- Add ZBV forward/backward distributed tests

1 parent e03f32f commit ab411da
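The V-style placement described in the commit message can be illustrated with a minimal sketch. This is not the actual `_get_stage_indices()` implementation from ArchonEngine; the function names, parameters, and the `style` flag below are hypothetical, and the sketch assumes exactly 2 stages per rank for the V-style case, as stated above.

```python
def stage_indices(pp_rank: int, pp_size: int, style: str = "v") -> tuple[int, int]:
    """Return the two global stage indices owned by one pipeline rank.

    Hypothetical helper for illustration only; assumes 2 stages per rank.
    """
    if not 0 <= pp_rank < pp_size:
        raise ValueError(f"pp_rank {pp_rank} out of range for pp_size {pp_size}")
    num_stages = 2 * pp_size
    if style == "loop":
        # Looped placement: rank r owns stages r and r + pp_size.
        return (pp_rank, pp_rank + pp_size)
    if style == "v":
        # V-style placement: stages walk down the ranks and back up,
        # so rank 0 owns both the first and the last stage.
        return (pp_rank, num_stages - 1 - pp_rank)
    raise ValueError(f"unknown style: {style!r}")


def last_stage_rank(pp_size: int, style: str = "v") -> int:
    """Rank holding the final stage (where the loss would be computed)."""
    # With V-style placement the last stage lives on rank 0;
    # with looped placement it lives on the last rank.
    return 0 if style == "v" else pp_size - 1


if __name__ == "__main__":
    for rank in range(4):
        print(rank, stage_indices(rank, 4, "v"), stage_indices(rank, 4, "loop"))
```

Under this sketch, the rank that owns the last stage depends on the placement style, which is why the last-stage rank has to be determined in a schedule-aware way rather than assumed to be the final rank.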
File tree
7 files changed (+752, -399 lines)
- areal
- api
- experimental
- engine
- models/archon
- tests/experimental/archon
- torchrun
- docs