Takenoko

An opinionated, perpetual WIP project aimed at hacking WanVideo 2.1(2)-T2V-(A)14B LoRA training.

It is intended as a playground for experimenting with new ideas and testing various training features, including some that might ultimately turn out to be useless. The configuration file structure may change at any time, and some non-functioning options may still be present. It only supports Wan2.1-T2V-14B and Wan2.2-T2V-A14B training.

☄️ Disclaimer

This project would not have been possible without musubi-tuner. Although extensively refactored and reworked (to the point where upstream merge is no longer possible), the original project provided the foundation on which Takenoko was built. By reusing an existing and proven codebase, I was able to focus more on experimentation and learning instead of reinventing the wheel. Thanks to kohya-ss for the awesome work.

☄️ Docs

Since this project is aimed mostly at personal use and is constantly changing (with no guarantee of backwards compatibility), it probably won't have comprehensive documentation in the near future (unless it somehow becomes popular, which I hope it does not). I've tried to provide detailed comments in the config template, but they can't cover everything. As a workaround, I recommend using repomix to pack the entire repository into a single AI-readable XML file (around 1M tokens), then feeding it into the free Grok 4 Fast, which has a 2M-token context window, and asking questions about various aspects of the project.

☄️ Quick Start (Windows)

1. Clone the repository.
2. Run install.bat.
3. Create a configuration file (you can copy a sample config from the configs/examples folder).
4. Place it in the configs directory.
5. Launch run_trainer.bat and follow the instructions.
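
The config-copy step above can also be scripted. The following is only a minimal sketch, not part of Takenoko itself: the helper name `copy_sample_config` and the file names in the usage comment are hypothetical, and the `.toml` extension is an assumption. The only thing taken from this README is that sample configs live in configs/examples and that the working config goes into configs.

```python
import shutil
from pathlib import Path


def copy_sample_config(repo_root, sample_name, dest_name=None):
    """Copy a sample config from configs/examples into configs/.

    Hypothetical helper for illustration; Takenoko does not ship it.
    """
    root = Path(repo_root)
    source = root / "configs" / "examples" / sample_name
    if not source.is_file():
        raise FileNotFoundError(f"sample config not found: {source}")

    # Keep the sample's name unless the caller asks for a different one.
    target = root / "configs" / (dest_name or sample_name)
    if target.exists():
        # Never silently clobber a config that may already have been edited.
        raise FileExistsError(f"refusing to overwrite: {target}")

    shutil.copy2(source, target)
    return target


# Example call (file names are placeholders, not real Takenoko files):
# copy_sample_config(".", "sample.toml", "my_lora.toml")
```

Refusing to overwrite an existing file is a deliberate choice here: the config structure may change at any time, so an edited config is worth protecting from an accidental re-copy.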

☄️ License

This project borrows code from various sources, which use different types of licenses, mostly Apache 2.0, MIT, and AGPLv3. Since AGPLv3 is a strong copyleft license, including any AGPLv3 code likely means the entire project must be released under AGPLv3. This understanding is based on publicly available licensing information.

☄️ Acknowledgments

Takenoko draws inspiration from and incorporates code, ideas, and techniques from various open-source projects and publications. I thank the authors and maintainers for their contributions. Below is a list of all sources and papers (in no particular order). I have tried to reference all sources, but if I have missed any (or if more specific credits are warranted), please let me know.

Keep in mind that work on some features is not yet complete due to time and hardware constraints. If a feature is not working or is not implemented exactly as in the original work, all responsibility lies with my implementation, not with the authors of the original code or paper.

Source	Type	What was borrowed	Author(s)	License	Comment
musubi-tuner	repo	- Original codebase	kohya-ss	Apache 2.0
blissful-tuner	repo	- Several optimization techniques	Sarania	Apache 2.0
diffusion-pipe	repo	- Pre-computed timestep distribution algorithm - AdamW8bitKahan optimizer - Automagic optimizer modifications	tdrussell	MIT
WanTraining	repo	- Control LoRA training - DWT loss	spacepxl	Apache 2.0
ai-toolkit	repo	- Differential output preservation - Adafactor optimizer - Prodigy 8-bit optimizer - Automagic optimizer - EMA implementation - Concept slider training - Stepped loss	ostris	MIT
musubi-tuner (pr)	repo	- Initial implementation of validation datasets	NSFW-API	Apache 2.0
Timestep-Attention-and-other-shenanigans	repo	- Clustered MSE Loss - EW loss	Anzhc	AGPL-3.0
Diffuse and Disperse: Image Generation with Representation Regularization	paper	- Dispersive loss	Runqian Wang, Kaiming He	CC BY 4.0
DispLoss	repo	- Dispersive loss PyTorch implementation	raywang4	MIT
sd-scripts	repo	- Regularization datasets - LoRA-GGPO - Validation loss	kohya-ss	Apache 2.0
wan2.1-dilated-controlnet	repo	- ControlNet training	TheDenk	Apache 2.0
T-LoRA	repo	- T-LoRA training	ControlGenAI	MIT	see also paper
sd-scripts (fork)	repo	- Fourier loss - HinaAdaptive optimizer	hinablue	Apache 2.0
Muon	repo	- Muon optimizer	KellerJordan	MIT
dion	repo	- DION2-inspired reduced orthonormal optimizer integration	microsoft	MIT
Sana	repo	- CAME 8-bit optimizer	NVlabs	Apache 2.0	see also paper
SimpleTuner	repo	- Routed TREAD - SOAP optimizer - Masked training (spatial-first loss, area interpolation, proper normalization, auto mask generation) - Advanced EMA features - CREPA/LayerSync improvements - Scheduled rollout probability ramping	bghira	AGPL-3.0
diffusion-pipe (pr)	repo	- Frame-based TREAD	Ada123-a	MIT
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think	paper	- Representational alignment loss, 3-layer MLP projection head, forward hook-based feature capture	Sihyun Yu, Sangkyung Kwak, Huiwon Jang, Jongheon Jeong, Jonathan Huang, Jinwoo Shin, Saining Xie	CC BY 4.0
REPA	repo	- Representation Alignment implementation	sihyun-yu	MIT
dino	repo	- VisionTransformer implementation	facebookresearch	MIT
Sophia	repo	- Sophia optimizer	Liuhong99	MIT	see also paper
Adaptive Non-Uniform Timestep Sampling for Diffusion Model Training	repo	- Adaptive timestep sampling	KU-DMLab	MIT	see also paper
Temporal Regularization Makes Your Video Generator Stronger	paper	- Temporal regularization via perturbation	Harold Haodong Chen, Haojian Huang, Xianfeng Wu, Yexin Liu, Yajing Bai, Wen-Jie Shu, Harry Yang, Ser-Nam Lim	arXiv 1.0
AR-Diffusion: Asynchronous Video Generation with Auto-Regressive Diffusion	paper	- Frame-oriented Probability Propagation (FoPP) scheduler	Mingzhen Sun, Weining Wang, Gen Li, Jiawei Liu, Jiahui Sun, Wanquan Feng, Shanshan Lao, SiYu Zhou, Qian He, Jing Liu	arXiv 1.0
Redefining Temporal Modeling in Video Diffusion: The Vectorized Timestep Approach	paper	- Vectorized timestep scheduling	Yaofang Liu, Yumeng Ren, Xiaodong Cun, Aitor Artola, Yang Liu, Tieyong Zeng, Raymond H. Chan, Jean-michel Morel	arXiv 1.0
Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion	paper	- Post-training autoregressive self-rollout method	Xun Huang, Zhengqi Li, Guande He, Mingyuan Zhou, Eli Shechtman	CC BY-NC-SA 4.0
Wan2.1-NABLA	repo	- Dynamic sparse attention	gen-ai-team	Apache 2.0	see also paper
VideoX-Fun	repo	- Reward LoRA training	aigc-apps	Apache 2.0
Fira	repo	- Fira optimizer	xichen-fy	Apache 2.0	see also paper
google-research	repo	- Frechet Video Distance (FVD) implementation	google-research	Apache 2.0
Mixture of Contexts for Long Video Generation	paper	- Mixture of Contexts (MoC) sparse attention routing	Shengqu Cai, Ceyuan Yang, Lvmin Zhang, Yuwei Guo, Junfei Xiao, Ziyan Yang, Yinghao Xu, Zhenheng Yang, Alan Yuille, Leonidas Guibas, Maneesh Agrawala, Lu Jiang, Gordon Wetzstein	CC BY-SA 4.0
SPHL-for-stable-diffusion	code	- Pseudo-Huber loss implementation	kabachuha		see also paper
Context as Memory: Scene-Consistent Interactive Long Video Generation with Memory Retrieval	paper	- Context-as-Memory integration	Jiwen Yu, Jianhong Bai, Yiran Qin, Quande Liu, Xintao Wang, Pengfei Wan, Di Zhang, Xihui Liu	CC BY 4.0
SingLoRA	repo	- SingLoRA implementation	kyegomez	MIT	see also paper
PEFT-SingLoRA	repo	- Enhanced non-square matrix handling	bghira	BSD 2-clause
sd-scripts (pr)	repo	- Latent quality analysis	araleza	Apache 2.0
Contrastive Flow Matching	paper	- Contrastive loss	George Stoica, Vivek Ramanujan, Xiang Fan, Ali Farhadi, Ranjay Krishna, Judy Hoffman	CC BY 4.0
DeltaFM	repo	- Contrastive Flow Matching implementation (class-conditioned sampling, unconditional handling)	gstoica27	MIT
OneTrainer	repo	- Masked training (prior preservation, unmasked weight, random mask removal) - OFTv2 orthogonal finetuning integration reference	Nerogar	AGPL-3.0
Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion	paper	- Frequency-domain temporal consistency	Jingyuan Chen, Fuchen Long, Jie An, Zhaofan Qiu, Ting Yao, Jiebo Luo, Tao Mei	CC BY-SA 4.0
mmgp	repo	- Memory-mapped safetensors loading	deepbeepmeep	GNU GPL
attention-map-diffusers	repo	- Cross-attention map visualization	wooyeolbaek	MIT
musubi-tuner (fork)	repo	- Full model fine-tuning - Row-based TREAD	betterftr	Apache 2.0
stochastic_round_cuda	repo	- Stochastic rounding CUDA implementation	ethansmith2000	MIT
simplevae	repo	- VAE training enhancements	AiArtLab
RamTorch	repo	- RamTorch CPU-bouncing linear layers	lodestone-rock	Apache 2.0
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference	repo	- SRPO preference optimization	Tencent-Hunyuan	SRPO Non-Commercial License	see also paper
SARA: Structural and Adversarial Representation Alignment for Training-efficient Diffusion Models	paper	- Autocorrelation matrix alignment - Adversarial distribution alignment - Multi-level hierarchical representation loss	Hesen Chen, Junyan Wang, Zhiyu Tan, Hao Li	CC BY 4.0
Scion	repo	- Scion optimizer	LIONS-EPFL	MIT	see also paper
EqM	repo	- Equilibrium matching adaptation	raywang4	MIT	see also paper
NorMuon	repo	- Neuron-wise Normalized Muon implementation	CoffeeVampir3	MIT
TiM	repo	- Transition training objective (paired timesteps, transports, weighting, EMA)	WZDTHU	Apache 2.0	see also paper
rcm	repo	- rCM distillation algorithm reference	NVlabs	Apache 2.0	see also paper
Aozora_SDXL_Training	repo	- Raven optimizer	Hysocs
Sprint: Sparse-Dense Residual Fusion for Efficient Diffusion Transformers	paper	- Sparse-dense residual fusion with token dropping - Path-drop learning with token regularization - Two-stage training scheduler	Dogyun Park, Moayed Haji-Ali, Yanyu Li, Willi Menapace, Sergey Tulyakov, Hyunwoo J. Kim, Aliaksandr Siarohin, Anil Kag	CC BY 4.0
AdaMuon	repo	- Adaptive Muon optimizer implementation	Chongjie-Si	Apache 2.0	see also paper
Cross-Frame Representation Alignment for Fine-Tuning Video Diffusion Models	paper	- Cross-frame representation alignment	Sungwon Hwang, Hyojin Jang, Kinam Kim, Minho Park, Jaegul Choo	CC BY 4.0
LayerSync: Self-aligning Intermediate Layers	repo	- Inter-layer alignment loss	vita-epfl	MIT	see also paper
HyperLoRA	repo	- HyperLoRA concept	bytedance	GPL-3.0	see also paper
Qwen-Image-i2L	repo	- Trainable single-pass LoRA weight prediction hypernetwork concept - Residual-conditioned branch with cached auxiliary embeddings - Optional multi-encoder auxiliary embedding fusion	DiffSynth-Studio	Apache 2.0	see also article
iREPA	repo	- Convolutional projector for spatial preservation - Spatial z-score normalization for sharper alignment	End2End-Diffusion	MIT	see also paper
SpeedrunDiT	repo	- Dim-aware timestep shift - Cross-batch CFM regularizer - Sprint uncond-only path drop for sampling	SwayStar123	MIT
Improved Variational Online Newton (IVON)	repo	- IVON implementation	team-approx-bayes	GPL-3.0	with code from PR by rockerBOO
MemFlow	repo	- Memory bank - Sparse memory activation guidance	KlingTeam	Apache 2.0	see also paper
HASTE	repo	- Holistic alignment loss - Semantic anchor feature projections - Attention alignment with teacher offset - Stage‑wise termination	NUS-HPC-AI-Lab	Apache 2.0	see also paper
sd-scripts (pr)	repo	- CDC-FM flow matching	rockerBOO	Apache 2.0	see also paper
GaLore	repo	- GaLore optimizer	jiaweizzhao	Apache 2.0	see also paper
REG	repo	- Class‑token entanglement - Class‑token denoising loss - Alignment loss to encoder features	Martinser	MIT	see also paper
Q-GaLore	repo	- Q-GaLore optimizer	VITA-Group	Apache 2.0	see also paper
SemanticGen: Video Generation in Semantic Space	paper	- Semantic token conditioning - Feature‑representation cross‑alignment loss	Jianhong Bai, Xiaoshi Wu, Xintao Wang, Xiao Fu, Yuanxing Zhang, Qinghe Wang, Xiaoyu Shi, Menghan Xia, Zuozhu Liu, Haoji Hu, Pengfei Wan, Kun Gai
transformers (pr)	repo	- Implementation of Q-GaLore optimizer	SunMarc	Apache 2.0
Glance	repo	- Fixed-timestep distillation mode	CSU-JPG	Apache 2.0	see also paper
Stable-Video-Infinity	repo	- Error‑recycling fine‑tuning - Timestep‑grid replay buffers - Buffer replacement strategies - Warmup distributed buffer fill - Probabilistic error injection and modulation - Anchor‑conditioned motion replay - Sequence‑aware batching for replay continuity	vita-epfl	Apache 2.0	see also paper
EquiVDM: Equivariant Video Diffusion Models with Temporally Consistent Noise	paper	- Temporally consistent noise with flow caching	Chao Liu, Arash Vahdat	CC BY 4.0
catlvdm	repo	- BCNI/SACN corruption for T5 conditioning - Structured corruption robustness boost - Mask‑aware embedding noise injection	chikap421	MIT	see also paper
TPDiff: Temporal Pyramid Video Diffusion Model	paper	- Temporal pyramid bounded sampling - Stage‑wise temporal resampling - Stage‑specific scheduler‑aware gamma/sigma	Lingmin Ran, Mike Zheng Shou	CC BY 4.0
relora	repo	- ReLoRA pipeline	Guitaricet	Apache 2.0	see also paper
DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models	paper	- DenseDPO training method	Ziyi Wu, Anil Kag, Ivan Skorokhodov, Willi Menapace, Ashkan Mirzaei, Igor Gilitschenski, Sergey Tulyakov, Aliaksandr Siarohin	CC BY 4.0
Blockwise-Flow-Matching	repo	- Blockwise timestep segment objective - SemFeat alignment conditioning - SemFeat time-embedding injection - FRN loss	mlvlab		see also paper
MuonClip	repo	- MuonClip	kyegomez	Apache 2.0	see also paper
mHC: Manifold-Constrained Hyper-Connections	paper	- Multi-path residual stream with learnable residual mixing matrix - Doubly-stochastic manifold constraint - Identity-mapping preservation across depth - Sinkhorn-Knopp normalization enforcing constraint - Norm-preserving cross-stream residual propagation	Zhenda Xie, Yixuan Wei, Huanqi Cao, Chenggang Zhao, Chengqi Deng, Jiashi Li, Damai Dai, Huazuo Gao, Jiang Chang, Liang Zhao, Shangyan Zhou, Zhean Xu, Zhengyan Zhang, Wangding Zeng, Shengding Hu, Yuqing Wang, Jingyang Yuan, Lean Wang, Wenfeng Liang	arXiv 1.0
manifolds	repo	- Manifold Muon integration	thinking-machines-lab	MIT	see also blogpost
LoRA meets Riemannion: Muon Optimizer for Parametrization-independent Low-Rank Adapters	paper	- Riemannion fixed‑rank optimizer - Manifold momentum/transport - Manifold‑aware LoRA tangent projection and retraction - One‑step gradient locally optimal initialization	Vladimir Bogachev, Vladimir Aletov, Alexander Molozhavenko, Denis Bobkov, Vera Soboleva, Aibek Alanov, Maxim Rakhuba	arXiv 1.0
pico-relora	repo	- Optimizer reset via random pruning - Jagged cosine scheduler	Yu-val-weiss	Apache 2.0	see also paper
Physics-Guided Motion Loss for Video Generation Model	paper	- Physics-guided motion loss	Bowen Xue, Giuseppe Claudio Guarnera, Shuang Zhao, Zahra Montazeri	arXiv 1.0
optimizers	repo	- Original implementation of Kron, Conda, VSGD, RangerVA and NvNovoGrad optimizers	NoteDance	Apache 2.0
clora	repo	- Cross-attention capture - Token-focused attention - Spatial attention masking - Contrastive attention separation	gemlab-vt	MIT	see also paper
splus	repo	- SPlus optimizer	kvfrans		see also paper
Internal-Guidance	repo	- Auxiliary supervision on intermediate layers - Internal dynamics guidance - Target shifting mechanism	CVL-UESTC	MIT	see also paper
Beyond External Guidance: Unleashing the Semantic Richness Inside Diffusion Transformers for Improved Training	paper	- Two‑stage self‑guidance - Feature‑space CFG semantic enrichment - Frozen internal teacher stabilization - Lightweight projection alignment	Lingchen Sun, Rongyuan Wu, Zhengqiang Zhang, Ruibin Li, Yujing Sun, Shuaizheng Liu, Lei Zhang	CC BY 4.0
FreeFuse	repo	- Subject-mask training with auxiliary consistency losses	yaoliliu	Apache 2.0	see also paper
Immiscible-Diffusion	repo	- KNN candidate noise selection implementation - Linear assignment noise matching reference	yhli123	MIT	see also paper
MixFlow	repo	- Slowed interpolation mixture objective - Beta-style timestep remapping	fudan-generative-vision		see also paper
Video Consistency Distance: Enhancing Temporal Consistency for Image-to-Video Generation via Reward-Based Fine-Tuning	paper	- VCD temporal-consistency objective - Frequency-domain amplitude/phase consistency distance	Takehiro Aoshima, Yusuke Shinohara, Byeongseon Park	arXiv 1.0
MoAlign: Motion-Centric Representation Alignment for Video Diffusion Models	paper	- Two-stage motion-centric alignment - Spatial/temporal relational alignment loss with temporal weighting	Aritra Bhowmik, Denis Korzhenkov, Cees G. M. Snoek, Amirhossein Habibian, Mohsen Ghafoorian	CC BY 4.0
End-to-End Training for Autoregressive Video Diffusion via Self-Resampling	paper	- Self-resampling history corruption - History token routing - Autoregressive rollout with KV-cache acceleration	Yuwei Guo, Ceyuan Yang, Hao He, Yang Zhao, Meng Wei, Zhenheng Yang, Weilin Huang, Dahua Lin	CC BY 4.0
VideoREPA	repo	- Video teacher integration patterns - TRD objective implementation	aHapBean	Apache 2.0	see also paper
Structure From Tracking: Distilling Structure-Preserving Motion for Video Generation	paper	- Bidirectional teacher-feature fusion for structure-preserving motion distillation - Local Gram Flow (LGF) alignment objective - SFT pipeline with optional SAM2 tracker-memory backend	Yang Fei, George Stoica, Jingyuan Liu, Qifeng Chen, Ranjay Krishna, Xiaojuan Wang, Benlin Liu	CC BY-NC-ND 4.0
CAMEO	repo	- Attention distillation techniques	cvlab-kaist		see also paper
VAE-REPA: Variational Autoencoder Representation Alignment for Efficient Diffusion Training	paper	- VAE-latent representation alignment objective - Configurable projector depth	Mengmeng Wang, Dengyang Jiang, Liuzhuozheng Li, Yucheng Lin, Guojiang Shen, Xiangjie Kong, Yong Liu, Guang Dai, Jingdong Wang	CC BY 4.0
DisMo	repo	- Conditional LoRA modulation reference - Stochastic delta-time sampling reference - Motion/appearance disentanglement diagnostics direction	CompVis	MIT	see also paper
ReflexFlow: Rethinking Learning Objective for Exposure Bias Alleviation in Flow Matching	paper	- Anti-Drift Rectification (ADR) objective - Frequency Compensation (FC) loss reweighting - Scheduled sampling strategy for biased-input training	Guanbo Huang, Jingjia Mao, Fanding Huang, Fengkai Liu, Xiangyang Luo, Yaoyuan Liang, Jiasheng Lu, Xiaoe Wang, Pei Liu, Ruiliu Fu, Shao-Lun Huang	arXiv 1.0
StableVelocity	repo	- VA-REPA timestep-aware weighting schedules - StableVM memory-bank target construction - Class-aware bank sampling	linYDTHU	MIT	see also paper
LTX-2	repo	- IC-LoRA trainer/pipeline structure references - Reference-target conditioning flow design - IC-LoRA network module conventions	Lightricks	LTX-2 Community License
In-Context LoRA for Diffusion Transformers	paper	- In-context concatenation objective for condition/target layouts	Lianghua Huang, Wei Wang, Zhi-Fan Wu, Yupeng Shi, Huanzhang Dou, Chen Liang, Yutong Feng, Yu Liu, Jingren Zhou	CC BY 4.0
In-Context Sync-LoRA for Portrait Video Editing	paper	- Sync-aware paired-video curation concept - Motion-preserving in-context edit objective	Sagi Polaczek, Or Patashnik, Ali Mahdavi-Amiri, Daniel Cohen-Or	arXiv 1.0
Generative Modeling via Drifting	paper	- Drifting auxiliary objective - Mean-shift drifting field with kernel normalization	Mingyang Deng, He Li, Tianhong Li, Yilun Du, Kaiming He	CC BY 4.0
DeT	repo	- Motion-transfer enhancement integration - Local temporal-kernel and dense-trajectory supervision objectives	Shi-qingyu		see also paper
Mano-Restriking-Manifold-Optimization-for-LLM-Training	repo	- Mano optimizer implementation - Tangent-space manifold update - Matrix/aux-Adam parameter split	xie-lab-ml	Apache 2.0	see also paper
Euphonium	repo	- SRPO process-reward guidance - Dual outcome/process reward modes - KL-auto scaling - Optional latent SPSA gradients	zerzerzerz	Apache 2.0	see also paper
ShortFT: Diffusion Model Alignment via Shortcut-based Fine-Tuning	paper	- Progressive shortcut backprop schedule for reward LoRA training - Segment/anchor-based denoising-chain backprop control	Xiefan Guo, Miaomiao Cui, Liefeng Bo, Di Huang	arXiv 1.0
FlexAM	repo	- FlexAM conditioning - Density-guided timestep conditioning concept	IGL-HKUST	Apache 2.0	see also paper
UFO	repo	- Static-clip training - Frame-correlated autoregressive noise sharing - Motion-sub frame-delta loss - Temporal-attention LoRA targeting	Delong-liu-bupt	MIT	see also paper
PiSSA	repo	- Principal/residual decomposition	GraphPKU	Apache 2.0	see also paper
sd-scripts (pr)	repo	- PiSSA initialization and integration patterns	rockerBOO	Apache 2.0
AdaLoRA	repo	- Adaptive rank-budget allocation workflow - Rank-importance scoring and masking flow	QingruZhang	MIT	see also paper
MoRA	repo	- High-rank square adapter update - Type-based projection/expansion mapping	kongds		see also paper
VeRA: Vector-based Random Matrix Adaptation	paper	- Shared frozen random projection matrices across adapted layers - Trainable per-layer scaling vectors with minimal parameter overhead	Dawid J. Kopiczko, Tijmen Blankevoort, Yuki M. Asano	CC BY 4.0
S2D: Selective Spectral Decay for Quantization-Friendly Conditioning of Neural Activations	paper	- Selective dominant-spectrum regularization - Amortized spectral updates with thresholded top-component targeting	Arnav Chavan, Nahush Lele, Udbhav Bamba, Sankalp Dayal, Aditi Raghunathan, Deepak Gupta	arXiv 1.0
LoRWeB	repo	- Dynamic LoRA basis with query-conditioned weight mixing - Query-mode/runtime wiring patterns for visual analogy triplets	NVlabs	NVIDIA License	see also paper
Growing with the Generator: Self-paced GRPO for Video Generation	paper	- Self-paced reward progression - Sparsity-aware reward mixing	Rui Li, Yuanzhi Liang, Ziqi Ni, Haibing Huang, Chi Zhang, Xuelong Li	arXiv 1.0
CDKA	repo	- Reference implementation for CDKA	rainstonee		see also paper
QLoRA	repo	- 4-bit NF4/FP4 quantized base-model loading - Double/nested quantization flow - Paged bitsandbytes optimizer integration - k-bit preparation patterns	artidoro	MIT	see also paper
Mode Seeking meets Mean Seeking for Fast Long Video Generation	paper	- Decoupled global/local dual-head auxiliary objective - Sliding-window local teacher-alignment approximation - Reverse-KL local behavior-matching term	Shengqu Cai, Weili Nie, Chao Liu, Julius Berner, Lvmin Zhang, Nanye Ma, Hansheng Chen, Maneesh Agrawala, Leonidas Guibas, Gordon Wetzstein, Arash Vahdat	CC BY-SA 4.0
VB-LoRA	repo	- Vector-bank LoRA composition - Top-k sparse logits composition and compact checkpoint strategy	leo-yangli		see also paper
A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA	paper	- Rank-stabilized adapter scaling	Damjan Kalajdzievski	arXiv 1.0
Self-Supervised Flow Matching for Scalable Multi-Modal Synthesis	paper	- Dual-timestep masked noising - EMA teacher feature alignment with cosine objective - Combined training objective	Hila Chefer, Patrick Esser, Dominik Lorenz, Dustin Podell, Vikash Raja, Vinh Tong, Antonio Torralba, Robin Rombach	arXiv 1.0
Video2LoRA	repo	- LightLoRA auxiliary-factor decomposition - Reference-video-conditioned hypernetwork for runtime LoRA prediction - Iterative latent-token decoder structure and paired-reference training flow - End-to-end diffusion training without pre-trained semantic LoRA supervision	BerserkerVV		see also paper
StelLA	repo	- Three-factor LoRA decomposition - Stiefel-manifold constrained adapter updates - Euclidean-to-Riemannian gradient conversion with retraction	SonyResearch	Apache 2.0	see also paper
Disentangling Task Conflicts in Multi-Task LoRA via Orthogonal Gradient Projection	paper	- Orthogonal gradient projection for shared multi-task LoRA - Separate conflict projection for LoRA low-rank factors	Ziyu Yang, Guibin Chen, Yuxin Yang, Aoxiong Zeng, Xiangquan Yang	CC BY 4.0
Helios: Real Real-Time Long Video Generation Model	paper	- Frame-aware historical-context corruption - First-frame history anchoring for anti-drift training	Shenghai Yuan, Yuanyang Yin, Zongjian Li, Xinwei Huang, Xiao Yang, Li Yuan	CC BY 4.0
OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer	paper	- Reference positional-bias direction for IC-LoRA - Task-type conditioning token for SemanticGen-style routing	Pengze Zhang, Yanze Wu, Mengtian Li, Xu Bai, Songtao Zhao, Fulong Ye, Chong Mou, Xinghui Li, Zhuowei Chen, Qian He, Mingyuan Gao	arXiv 1.0
Demystifing Video Reasoning	paper	- Early-step multi-view consensus - Mid-layer merge-and-continue training - High-noise timestep transfer	Ruisi Wang, Zhongang Cai, Fanyi Pu, Junxiang Xu, Wanqi Yin, Maijunxian Wang, Ran Ji, Chenyang Gu, Bo Li, Ziqi Huang, Hokin Deng, Dahua Lin, Ziwei Liu, Lei Yang	CC BY 4.0
ViBe: Ultra-High-Resolution Video Synthesis Born from Pure Images	paper	- High-frequency-aware training objective (HFATO) - Downsample-upsample latent degradation	Yunfeng Wu, Hongying Cheng, Zihao He, Songhua Liu	arXiv 1.0
FLeX: Fourier-based Low-rank EXpansion for multilingual transfer	paper	- Fourier-domain regularization - High-frequency-weighted spectral penalty with optional FFN-focused targeting	Gaurav Narasimhan	CC BY 4.0
Isokinetic Flow Matching for Pathwise Straightening of Generative Flows	paper	- Jacobian-free lookahead velocity-consistency regularizer for flow matching - Time-weighted, speed-normalized pathwise acceleration penalty	Tauhid Khan	arXiv 1.0	Train-time only Iso-FM auxiliary loss; inference unchanged
URSA	repo	- Split anchor-vs-continuation video loss reduction - Separate anchor and temporal loss telemetry - Spatiotemporal guidance weighting via anchor reconstruction and frame-delta consistency	baaivision	Apache 2.0	see also paper
DeCo	repo	- DCT-based low/high-frequency energy diagnostics - Band-balanced DCT reconstruction auxiliary loss with separate low/high-frequency weights	Zehong-Ma		see also paper
LoRA-drop: Efficient LoRA Parameter Pruning based on Output Evaluation	paper	- Output-magnitude per-layer EMA tracking - Automatic low-importance same-shape adapter sharing	Hongyun Zhou, Xiangyu Lu, Wang Xu, Conghui Zhu, Tiejun Zhao, Muyun Yang	arXiv 1.0
DyPE	repo	- Dynamic RoPE index scaling for oversized spatial and temporal token grids	guyyariv	MIT	see also paper
TwinFlow	repo	- Parallel distillation pipeline - Signed-timestep conditioning for negative-time self-adversarial passes - TwinFlow-controlled beta sigma sampling and enhancement-window gating - Recursive consistency target with optional target enhancement, adversarial, and rectification losses	inclusionAI	Apache 2.0	see also paper
rectified-flow-pytorch	repo	- Self-Flow RMSNorm + GELU projector design	lucidrains	MIT
HY-SOAR	repo	- HY-SOAR auxiliary trajectory-correction objective - Same-noise off-trajectory supervision with detached CFG rollout	Tencent-Hunyuan	Apache 2.0	see also paper
FlowC2S	repo	- Current-succeeding transport supervision - Chunk-pairing training layout - Target-inversion scoping reference	marghovo		see also paper
