2 | 2 |
3 | 3 | ## Latest |
4 | 4 |
| 5 | +## [1.3.0] |
| 6 | + |
| 7 | +### Released |
| 8 | + |
| 9 | +- 2026-04-27 |
| 10 | + |
| 11 | +### Added |
| 12 | + |
| 13 | +- Image curation pipeline with semantic filtering |
| 14 | +- Image embedding stages (Cosmos-Embed1, InternVideo2-MM, OpenAI-compatible) and image annotate pipeline |
| 15 | +- OpenAI- and Gemini-compatible endpoints for image captioning, filtering, and classification |
| 16 | +- Artificial-text detection stage for the video filtering pipeline (PaddleOCR-based) |
| 17 | +- Sensor library (camera-only) with `SensorGroup`, mcap-based ingestion, and timestamp validation |
| 18 | +- SeedVR-based upscaling stage |
| 19 | +- Pipeline config files with NVCF-compatible JSON and YAML loading (`--config` for split/shard/dedup) |
| 20 | +- Centralized pipeline argument validation via `common_pipeline_settings` and `shard_pipeline_settings` |
| 21 | +- vLLM async captioning stage for higher captioning throughput (experimental — correctness |
| 22 | + issues are still being worked through; not recommended for production use) |
| 23 | +- OpenTelemetry instrumentation for vLLM captioning |
| 24 | +- Token-counting instrumentation to measure captioning throughput |
| 25 | +- Caption status fields normalized across caption backends, with status-gated metadata writing |
| 26 | +- Stage-replay validation that compares re-run output against the original recording |
| 27 | +- S3 support for `stage-save` and `stage-replay` |
| 28 | +- Ray Data hello-world pipeline and splitting pipeline MVP as an alternative engine alongside Xenna |
| 29 | +- `--*-cpus-per-worker` knobs documented for CPU-constrained hosts |
| 30 | +- Run local-launched container as the host user (including AD/SSSD/NIS UIDs) to avoid root-owned outputs |
| 31 | +- Slim Docker image built alongside the full image, with auto-warmup honoring `--envs` |
| 32 | +- Local Xenna build path in CI and per-pipeline Xenna overrides |
| 33 | +- Fixed-stride coverage in the NVCF split benchmark matrix |
| 34 | +- Real-inference smoke test for vLLM captioning health |
| 35 | +- Upgrade to CUDA 13.0 |
| 36 | +- Upgrade vLLM to 0.19.0 |
| 37 | +- Upgrade Ray to 2.55.0 (with the `serve` extra) |
| 38 | +- Upgrade cosmos-xenna to 0.2.3 |
| 39 | +- Bump `av` to `>=17,<18` and add the `mcap` dependency for the sensor library |
| 40 | + |
| 41 | +### Fixed |
| 42 | + |
| 43 | +- `SamplingGrid` no longer produces incorrect windows for irregular grids |
| 44 | +- `--execution-mode` CLI flag is now honored end-to-end |
| 45 | +- Cosmos-Embed1 writes per-variant embedding directories |
| 46 | +- Symlink the host pixi path so shebangs resolve inside the local-launched container |
| 47 | +- Sensor library uses read-only views to avoid accidental buffer mutation |
| 48 | +- Add Qwen3 preprocessing logic for filtering stages |
| 49 | +- Use pre-built images for benchmark runs to avoid redundant builds |
| 50 | +- Remove external storage dependency from `ImageSensor` |
| 51 | +- Semantic filter updates and dedup pipeline input path cleanup |
| 52 | +- Loosen Cosmos-Reason1 caption similarity threshold to reduce flakiness |
| 53 | + |
| 54 | +### Changed |
| 55 | + |
| 56 | +- Replace `CurationPhase` / `PipelineBuilder` with factory functions (`*_builders.py`); the |
| 57 | + `phase_interface` module and per-pipeline `phases.py` files are removed |
| 58 | +- Add `config: VllmConfig` parameter to `VllmPlugin.make_llm_input` for image vs video |
| 59 | + modality selection; subclasses must update their signature |
| 60 | +- Switch CI Slurm and k8s GPU jobs to the slim image with in-container `pixi install` and |
| 61 | + `pixi run --as-is` |
| 62 | +- Change CI NVCF backend |
| 63 | +- Normalize the `SamplingGrid` API and make sampling windows explicit (no sentinel boundaries) |
| 64 | +- Update semantic filter stages to use `VllmCaptioning` |
| 65 | +- Add a CPU-only Paddle option for the `unified` env |
| 66 | +- Pixi lockfile refreshed for CVE coverage |
| 67 | +- Add notice and disclaimer to README and Docker image |
| 68 | + |
| 69 | +### Documentation |
| 70 | + |
| 71 | +- Speed-of-light design doc for captioning throughput, with refined SOL baseline methodology |
| 72 | + using `vllm bench` as the reference |
| 73 | +- Refined Ray Data runner design with the first implementation slice |
| 74 | +- Document `--*-cpus-per-worker` tuning knobs |
| 75 | +- Add `--squash-before-merge` to MR guidelines |
| 76 | + |
5 | 77 | ## [1.2.2] |
6 | 78 |
7 | 79 | ### Released |