Releases: bghira/SimpleTuner

v4.2.3 - anima fixes

26 Apr 15:07
dbb2309

What's Changed

  • anima: support preview-2 and preview-3 flavours with diffusers style paths by @bghira in #2693
  • anima: do not require text encoder to be loaded by @bghira in #2694
  • anima: fix validation pipeline with cached text embeds by @bghira in #2695
  • anima: loss scaling fix by @bghira in #2696
  • anima: fix PEFT LoRA resume when switching from Anima key layout to Diffusers by @bghira in #2697

Full Changelog: v4.2.2...v4.2.3

v4.2.2

24 Apr 17:09
485cb40

Features

  • ERNIE training w/ TREAD, LayerSync, CREPA, and assistant LoRA support (among other features)
  • BFL Self Flow for most architectures (not ERNIE)

Bugfixes

  • filename caption_strategy is available in the webUI dataset configuration page
  • training configuration wizard step count is corrected
  • dataset creation wizard no longer prevents progressing beyond first step
  • fixed broken Flux2 text embed caching
  • external validation script crash resolved, improved benchmark handling

What's Changed

  • ERNIE Image model w/ assistant lora support by @bghira in #2683
  • (#2680) resolve error when using external validation script with benchmark by @bghira in #2684
  • Fix broken text embed cache on Flux2 by @Copilot in #2686
  • (#2627) disable validation_using_datasets img2img pipeline for flux kontext dev by @bghira in #2687
  • black forest labs self-flow (wip) by @bghira in #2639
  • add filename caption_strategy option to UI by @bghira in #2688
  • Update most tutorial images and multi-GPU configuration text by @bghira in #2689
  • training configuration wizard step count should be auto determined by @bghira in #2690
  • dataset wizard: fix next button disabled unnecessarily by @bghira in #2691

Full Changelog: v4.2.1...v4.2.2

v4.2.1

14 Apr 20:32
aa5ca85

Full Changelog: v4.2.0...v4.2.1

v4.2.0 - dataset viewer, ACEStep 1.5, GLIGEN, huber flow matching loss

10 Apr 14:09
3a53745

Features

  • Garaw (from Discord) introduced Qwen Image 2512 assistant LoRA support, targeting the adapter he made to protect this model's direct preference optimisation (DPO), or whichever reinforcement learning approach was used, which otherwise prevents long-term training of the model
    • If you've trained Qwen Image 2512 and seen it "frying" or struggling to learn likeness, use this assistant LoRA to improve the results.
  • ACE-Step v1.5 is now fully supported. Not much changed for the XL models either; they just use new paths.
  • GLIGEN is integrated for all models, including video, with bbox generator and editor (webUI)
    • The webUI bounding box editor for video files is available in the new dataset viewer as well
    • There is a validation prompt library editor improvement which brings this to the validation pipeline config, allowing GLIGEN motion tracking to be configured for video models
  • A new dataset viewer is available for image, editing, controlnet, and video models (audio files not yet)
    • The conditioning data that links to a given sample is shown alongside it when the image is opened in the viewer
    • The image's complete caption is available for checking
    • The resize and crop transforms that the image or video will go through are shown for the user side by side
    • The crop coordinates that the metadata scan created can be manually edited using the webUI to drag the crop box into a better position
  • User-accessible buttons to trigger dataset scan / caching operations without starting training yet
    • On-demand aspect bucketing can now be triggered from the dataset viewer, avoiding the requirement to spin up the entire trainer just to scan and validate your dataset configuration
    • VAE outputs can be cleared and aspect bucket / crop metadata can be force-rebuilt using convenient buttons in the dataset viewer card
    • Text embeds can be re-generated via this card, but not deleted (not yet anyway, the text embeds are shared among datasets for deduplication, so I didn't want to introduce odd behaviour there)

What's Changed

  • ACE-Step v1.5 training (full-rank, LoRA, LyCORIS) by @bghira in #2669
  • GLIGEN: open-set grounded text-to-image training by @bghira in #2625
  • add dataset viewer with ability to verify and modify crop coordinates visually by @bghira in #2670
  • Upgrade transformers to v5.x (CVE PVE-2026-85102) by @redevined in #2674
  • Add Huber loss support for flow matching by @mhirki in #2673
  • Allow negative --validation_adapter_strength by @mhirki in #2672
  • Enable assistant LoRA support for Qwen-Image-2512 by @mhirki in #2671
  • dataset viewer: caching vae and text latents on demand by @bghira in #2675
  • merge by @bghira in #2676
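
For readers unfamiliar with the Huber loss entry above, here is a minimal sketch in plain Python. It assumes the common rectified-flow convention where the regression target is the velocity `noise - latents`; the function names and the `delta` default are hypothetical, and SimpleTuner's actual implementation operates on tensors and may differ in detail.

```python
from typing import Sequence, List


def velocity_target(noise: Sequence[float], latents: Sequence[float]) -> List[float]:
    """Flow-matching target under the assumed convention: noise minus clean latents."""
    return [n - x for n, x in zip(noise, latents)]


def huber_flow_matching_loss(
    pred: Sequence[float], target: Sequence[float], delta: float = 1.0
) -> float:
    """Mean Huber loss: quadratic for small residuals, linear for large ones."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")
    total = 0.0
    for p, t in zip(pred, target):
        r = abs(p - t)
        if r <= delta:
            total += 0.5 * r * r  # behaves like MSE near zero
        else:
            total += delta * (r - 0.5 * delta)  # behaves like L1 for outliers
    return total / len(pred)
```

Compared with plain mean squared error, large residuals contribute linearly rather than quadratically, which is usually why a Huber-style loss is chosen when some samples are outliers.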

Full Changelog: v4.1.3...v4.2.0

v4.1.3 - LTX Video 2.3

02 Apr 19:57
fbddde0

What's Changed

  • (#2645) reduce discord spam by limiting structured messages by @bghira in #2652
  • qwen image: tested new fix for batched training by @bghira in #2655
  • LTX-Video 2.3 by @bghira in #2654
  • LTX Video 2.3 diffusers mistakes corrected, bump torchcodec (and mark sizes as optional) by @bghira in #2662
  • feat: log prodigy_d and prodigy_effective_lr in training metrics by @rafstahelin in #2658
  • add info about custom flux2 text encoder path parameter by @bghira in #2664
  • SDXL single file loader should pull text enc from ckpt by @bghira in #2665
  • (#2536) add more variable expansions for other modelspec comment and path fields by @bghira in #2666
  • sdxl single file loader follow-up improvements for memory use by @bghira in #2667

Full Changelog: v4.1.2...v4.1.3

v4.1.2 - qwen image batched training fix

25 Mar 15:07
3d16d3b

What's Changed

  • qwen image: fix collate padding for batch_size > 1 by @rafstahelin in #2647
  • qwen image: per-sample split attention for batched training by @rafstahelin in #2648
  • qwen_image backport from diffusers git for bs>1 by @bghira in #2650

Full Changelog: v4.1.1...v4.1.2

v4.1.1 - anima model, z-image/qwen comfyui compat fix

14 Mar 16:37
dfd1627

Full Changelog: v4.1.0...v4.1.1

v4.1.0

14 Feb 00:52
f43a6a6

Breaking change

The minor version is bumped to 4.1.0 for a breaking change.

  • LTX-2 audio configuration is now required to be set manually in order to have audio dataset auto-creation enabled.
    • The LTXVIDEO2 quickstart is updated with this information.
  • Qwen Image no longer uses an attention mask when the sequence is full, i.e. has no padding (a small speed-up for very long captions)
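
The Qwen Image change above amounts to dropping the mask whenever no position is padded. A minimal sketch of that check in plain Python, assuming a hypothetical helper name `maybe_drop_attention_mask` and a mask given as nested lists of 0/1 values; the real code works on tensors and may differ:

```python
from typing import List, Optional


def maybe_drop_attention_mask(mask: List[List[int]]) -> Optional[List[List[int]]]:
    """Return None when every position is valid, so attention can skip masking.

    `mask` holds one row per sample; 1 marks a real token, 0 marks padding.
    """
    if all(all(row) for row in mask):
        return None  # nothing is padded, so the mask would be a no-op
    return mask
```

Passing `None` instead of an all-ones mask lets the attention backend take its unmasked path, which is where the speed-up would come from.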

Features

  • CREPA can now be enabled and used with image model training, even on video models
  • U-REPA is implemented with documentation for SDXL, SD1x, and Kolors
  • T-LoRA via LyCORIS implementation (note: missing ortho init, to be included in follow-up)
  • Wan now uses the attention dispatcher, so it can use flash-attn, cuDNN, and other backends

Bugfixes

  • Musubi block swap validation speed fixed for Flux2
  • LoKr init norm works with torchao quant
  • Huggingface model card epoch + step count alignment
  • Flux2 + Ramtorch validation error resolved

What's Changed

  • U-REPA: SDXL, SD1x, Kolors by @bghira in #2563
  • update docs for installing cuda13 variant of torch by @bghira in #2596
  • wan: switch to attn dispatcher so backends can be changed; improve context-parallel performance by @bghira in #2599
  • flux2: fix ramtorch validation by checking device location correctly by @bghira in #2611
  • fix flux2 block swap validation performance by @bghira in #2614
  • (#2574) add eval dataset type for lookup in vaecache by @bghira in #2590
  • expand CREPA coverage to image models by @bghira in #2562
  • qwen image and qwen edit do not need attn masking when no padding by @bghira in #2598
  • LyCORIS T-LoRA by @bghira in #2609
  • (#2602) hooks for two-stage model pipeline capability by @bghira in #2608
  • test coverage for multi-stage hook by @bghira in #2616
  • add documentation for using CREPA for images by @bghira in #2617
  • (#2573) step number consistency in async upload by @bghira in #2618
  • (#2572) add another just in case setting for batch size by @bghira in #2619
  • (#2612) fix --init_lokr_norm with torchao quant by @bghira in #2620
  • s2v/ltx-2 audio auto split should be enabled more intelligently by @bghira in #2621

Full Changelog: v4.0.6...v4.1.0

v4.0.6 - torchao+diffusers compat, sliders, and other bugfixes

11 Feb 12:41
0609d8f

What's Changed

  • tests: fix test suite remaining stuck open at the end on CUDA devices by @bghira in #2584
  • Fix TypeError when simpletuner.file is None during training launch by @Copilot in #2585
  • flux2: add hint about Flux2 single stream block by @bghira in #2589
  • ramtorch: update docs to mention required suffix by @bghira in #2588
  • (#2578) ramtorch should match on wildcard like PEFT targets do by @bghira in #2587
  • (#2583) remove low_cpu_mem_usage from qwen init, it does not do anything anymore; use dtype instead of torch_dtype by @bghira in #2586
  • slider training by @kaibioinfo in #2591
  • Sliders by @bghira in #2592
  • limit torchao to less than 0.16.0 for diffusers bug by @bghira in #2594
  • merge by @bghira in #2595

New Contributors

  • @Copilot made their first contribution in #2585

Full Changelog: v4.0.5...v4.0.6

v4.0.5 - caption shuffling, torch 2.10, bugfixes

07 Feb 20:52
426dc96

What's Changed

  • LTX-2 audio fps should come from --framerate, not the dataset by @bghira in #2558
  • configurable caption_shuffle for cached text embeddings by @bghira in #2560
  • (#2567) fix htmx renderer for checkpoint selection by @bghira in #2568
  • update torch to 2.10, add psutil by @bghira in #2571
  • fix: DeepSpeed device_placement ValueError by @hjinnkim in #2570
  • log grad_absmax separately for regularisation data by @bghira in #2576
  • (#2575) fix dataset wizard for video type by @bghira in #2577

Full Changelog: v4.0.4...v4.0.5