Commit 3756774
stac: retry transient child failures once; exit on root failure (#73)
Three related changes that sharpen the startup-resilience story:
1. Retry pass for transient child failures. After the main pool drains,
children that timed out get one more attempt in a fresh pool using a
longer STAC_CHILD_RETRY_TIMEOUT (default 8s, env-overridable). Rescues
the tail-latency case (~2 of 81 collections observed on 2026-04-18 dev
rollout) without needing to kill the pod. A retry that also fails
leaves the original error in STAC_LOAD_ERRORS.
2. Exit on root-catalog failure. When the root fetch itself fails at
startup, sys.exit(1) rather than starting uvicorn with an empty
catalog. Lets Kubernetes restart the pod so the next attempt can
catch better S3 conditions, instead of presenting a useless server.
Partial catalogs (some children failed) still serve — that's the
point of the resilience design; only total failure triggers exit.
3. Default STAC_FETCH_CONCURRENCY bumped from 8 to 16 so the main pool
plus retry pass both fit within the readiness-probe budget.
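For readers without the diff at hand, the control flow described above can be sketched roughly as follows. This is an illustrative sketch, not the committed code: only STAC_CHILD_RETRY_TIMEOUT, STAC_FETCH_CONCURRENCY and STAC_LOAD_ERRORS are names taken from this message. The helper names, the first-pass timeout value, the use of TimeoutError to mark a failure as transient, and the callable-based fetch interface are all assumptions.

```python
import os
import sys
from concurrent.futures import ThreadPoolExecutor

STAC_CHILD_TIMEOUT = 3.0  # hypothetical first-pass timeout; not stated in the commit
STAC_CHILD_RETRY_TIMEOUT = float(os.environ.get("STAC_CHILD_RETRY_TIMEOUT", "8"))
STAC_FETCH_CONCURRENCY = int(os.environ.get("STAC_FETCH_CONCURRENCY", "16"))
STAC_LOAD_ERRORS = {}  # child URL -> error from its first failed attempt


def _fetch_all(urls, fetch, timeout):
    """Fetch every URL in a fresh pool; return (loaded, errors) dicts."""
    def attempt(url):
        try:
            return url, fetch(url, timeout), None
        except Exception as exc:  # record per-child failures instead of aborting
            return url, None, exc

    loaded, errors = {}, {}
    if not urls:
        return loaded, errors
    workers = min(STAC_FETCH_CONCURRENCY, len(urls))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for url, value, exc in pool.map(attempt, urls):
            if exc is None:
                loaded[url] = value
            else:
                errors[url] = exc
    return loaded, errors


def load_children(urls, fetch):
    """Main pass, then one retry pass for children that timed out."""
    loaded, errors = _fetch_all(urls, fetch, STAC_CHILD_TIMEOUT)
    # Assumption: a timeout is what marks a child failure as transient.
    transient = [u for u, e in errors.items() if isinstance(e, TimeoutError)]
    rescued, _retry_errors = _fetch_all(transient, fetch, STAC_CHILD_RETRY_TIMEOUT)
    for url in rescued:
        del errors[url]  # rescued children no longer count as failures
    loaded.update(rescued)
    # Retry errors are discarded, so a failed retry keeps the original error.
    STAC_LOAD_ERRORS.clear()
    STAC_LOAD_ERRORS.update(errors)
    return loaded


def load_catalog_or_exit(fetch_root, fetch_child):
    """Exit with status 1 on root failure; serve a partial catalog otherwise."""
    try:
        child_urls = fetch_root()
    except Exception as exc:
        print(f"root catalog fetch failed: {exc}", file=sys.stderr)
        sys.exit(1)  # let the orchestrator restart the pod
    return load_children(child_urls, fetch_child)
```

In this sketch a non-timeout child error is never retried, and a child that fails both passes stays in STAC_LOAD_ERRORS while the rest of the catalog is still returned.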
Tests: 4 new (retry rescue, retry failure is final, retry uses configured
timeout, new concurrency default). Full suite: 95 passed.
Co-authored-by: Carl Boettiger <cboettig@berkeley.edu>
1 parent b8398c1 commit 3756774
3 files changed
Lines changed: 209 additions & 14 deletions