```bash
# Navigate to the dynamics model directory before running the following commands
cd RISE/dynamics/dynamics_model
```
The framework expects data in the LeRobot format. For optimal training performance, we strongly recommend pre-resizing videos to 256x192 resolution.
All tasks should be organized in the dataset directory with the following structure:
```bash
# copy your dataset under the dataset directory
cp -r path/to/your/dataset dataset/
```

Each dataset is organized as follows:
```
task_A/
├── data/
│   └── chunk-000/
│       ├── episode_000000.parquet
│       ├── episode_000001.parquet
│       ├── episode_000002.parquet
│       └── ...
├── meta/
│   ├── info.json
│   ├── episodes.jsonl
│   ├── episodes_stats.jsonl
│   └── tasks.jsonl
└── videos/
    └── chunk-000/
        └── [video files]
```
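Before preprocessing, it can be useful to verify that a task directory matches the layout above. A minimal sketch (the `check_lerobot_layout` helper is hypothetical, not part of the repo; it only checks the directories and metadata files shown above):

```python
from pathlib import Path


def check_lerobot_layout(task_dir: str) -> list[str]:
    """Return the list of expected paths that are missing from task_dir.

    An empty result means the directory matches the LeRobot layout
    described above (data/chunk-000, the four meta files, videos/chunk-000).
    """
    root = Path(task_dir)
    expected = [
        root / "data" / "chunk-000",
        root / "meta" / "info.json",
        root / "meta" / "episodes.jsonl",
        root / "meta" / "episodes_stats.jsonl",
        root / "meta" / "tasks.jsonl",
        root / "videos" / "chunk-000",
    ]
    return [str(p) for p in expected if not p.exists()]
```

A non-empty return value pinpoints exactly which pieces of the layout are missing, which is easier to act on than a generic loader error later in training.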
The `preprocess.sh` script resizes all videos in the dataset to 256x192 resolution using ffmpeg, preserving aspect ratio with center padding. Processed videos are saved in `videos_small/` while maintaining the original directory structure.
Usage:

```bash
# Process specific datasets (the second argument is optional)
./preprocess.sh dataset1 [dataset2]
```

The output would be as follows, with `videos_small/` added:
```
task_A/
├── data/
├── meta/
├── videos/
└── videos_small/
    └── chunk-000/
        └── [video files]
```
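The aspect-preserving resize reduces to a scale-then-center-pad computation. A minimal sketch of that geometry (the helper name and exact rounding are assumptions; `preprocess.sh` itself drives ffmpeg):

```python
def resize_with_padding(src_w: int, src_h: int, dst_w: int = 256, dst_h: int = 192):
    """Compute the scaled frame size and centered padding offsets used when
    fitting a src_w x src_h frame onto a dst_w x dst_h canvas while
    preserving aspect ratio (letterbox/pillarbox)."""
    # Scale so the frame fits entirely inside the target canvas.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    # Split the leftover space evenly to center the frame.
    pad_x = (dst_w - new_w) // 2
    pad_y = (dst_h - new_h) // 2
    return new_w, new_h, pad_x, pad_y
```

For example, a 1280x720 video scales to 256x144 and receives 24 pixels of padding above and below; a 640x480 video scales exactly to 256x192 with no padding.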
Download the LTX-Video backbone components (Text Encoder, Tokenizer, and VAE) using the provided script:
```bash
./download.sh
```

This script automatically downloads all required components from the LTX-Video HuggingFace repository to the `checkpoints` directory.
Alternatively, you can manually download the following components:
- Text Encoder: `text_encoder`
- Tokenizer: `tokenizer`
- VAE: `vae`
- Pre-trained dynamics model: `dynamics_model`, pretrained jointly on Galaxea Open World and AgiBot World Alpha.
Place all downloaded weights in the same directory and update the `pretrained_model_name_or_path` field in your configuration file.
Pre-training is performed on large-scale robotic datasets to learn general dynamics priors. We utilize the following datasets:
- Galaxea Open World Dataset: Galaxea-Open-World-Dataset
- AgiBot World Alpha: AgiBotWorld-Alpha
- Prepare Data: Convert your datasets to the LeRobot format as described above.

- Configure Training: Edit `configs/ltx_model/pretrain.yaml` according to the comments:
  - Set `pretrained_model_name_or_path` to your LTX backbone checkpoint directory
  - Set `diffusion_model.model_path` to your pre-trained diffusion checkpoint
  - Configure `data.train.data_roots` and `data.val.data_roots` to point to your dataset directories

- Launch Training:

  ```bash
  bash train_task_centric.sh
  ```
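Under those settings, the relevant portion of `pretrain.yaml` might look roughly like the following. This is an illustrative sketch only: the paths are placeholders and the surrounding YAML structure is an assumption; follow the comments inside the file itself.

```yaml
pretrained_model_name_or_path: checkpoints        # LTX backbone directory
diffusion_model:
  model_path: checkpoints/dynamics_model          # pre-trained diffusion checkpoint
data:
  train:
    data_roots: dataset/task_A
  val:
    data_roots: dataset/task_A
```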
Fine-tuning adapts the pre-trained model to specific task domains using domain-specific datasets.
- Prepare Task-Specific Data: Organize your fine-tuning dataset in the LeRobot format.

- Compute Action Normalization Statistics: Use `norm.py` to compute and save normalization statistics:

  ```bash
  python norm.py --datasets <your_finetune_dataset> --save-config data/utils/action_norm.json
  ```

  This automatically computes min and max values for each dataset and saves them to a JSON configuration file.

- Configure Fine-tuning: Edit `configs/ltx_model/finetune.yaml`:
  - Set `pretrained_model_name_or_path` to your LTX backbone checkpoint directory
  - Set `diffusion_model.model_path` to your diffusion checkpoint
  - Configure `data.train.data_roots` and `data.val.data_roots` for your fine-tuning dataset
  - Add `norm_config_path: data/utils/action_norm.json` to both `data.train` and `data.val` sections; the data loader will automatically use the normalization values from the config file based on dataset names

- Launch Fine-tuning:

  ```bash
  bash task_finetune.sh
  ```
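Conceptually, the normalization statistics are just per-dataset, per-dimension min/max values. A minimal sketch of that computation (function name and JSON layout are assumptions; `norm.py` is the authoritative implementation):

```python
import json


def compute_action_norm(actions_by_dataset: dict, save_path: str) -> dict:
    """Compute per-dimension min/max action statistics for each dataset
    and save them as a JSON config.

    actions_by_dataset maps a dataset name to a list of action vectors
    (each a list of floats of equal length).
    """
    stats = {}
    for name, actions in actions_by_dataset.items():
        # zip(*actions) iterates over action dimensions (columns).
        stats[name] = {
            "min": [min(col) for col in zip(*actions)],
            "max": [max(col) for col in zip(*actions)],
        }
    with open(save_path, "w") as f:
        json.dump(stats, f, indent=2)
    return stats
```

Keying the statistics by dataset name is what lets the data loader look up the right min/max values automatically during fine-tuning.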
The inference pipeline generates future video sequences conditioned on initial observations and action sequences.
- Configure Inference: Edit `configs/ltx_model/infer.yaml`:
  - Set `pretrained_model_name_or_path` to your LTX backbone checkpoint directory
  - Set `diffusion_model.model_path` to your diffusion checkpoint

- Update Inference Script: Edit `infer.sh` with appropriate paths.

- Run Inference:

  ```bash
  bash infer.sh
  ```
- `--config_file`: Path to inference configuration file
- `--image_root`: Directory containing input observation images
- `--output_path`: Directory to save generated videos
- `--act_tokens_path`: Path to action token file (`.pt` format)
- `--norm_constant`: Normalization constant for action tokens (e.g., `FINETUNE_TASK`)
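The flags above correspond to a command-line interface that could be sketched with `argparse` as follows. This is a hypothetical reconstruction for illustration, not the script's actual parser:

```python
import argparse


def build_infer_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the inference flags listed above."""
    p = argparse.ArgumentParser(description="Dynamics model inference (sketch)")
    p.add_argument("--config_file", required=True,
                   help="Path to inference configuration file")
    p.add_argument("--image_root", required=True,
                   help="Directory containing input observation images")
    p.add_argument("--output_path", required=True,
                   help="Directory to save generated videos")
    p.add_argument("--act_tokens_path", required=True,
                   help="Path to action token file (.pt format)")
    p.add_argument("--norm_constant", required=True,
                   help="Normalization constant key for action tokens")
    return p
```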

