```shell
cargo make test
```

If you need GPU training support with the tch backend on Windows, you'll need to install libtorch manually:

- Download libtorch from https://pytorch.org/get-started/locally/
- Follow the setup instructions at https://github.com/LaurentMazare/tch-rs?tab=readme-ov-file#libtorch-manual-install
- Extract to a directory (e.g., `C:\libtorch`)
- Set the environment variable: `$Env:LIBTORCH = "C:\libtorch"`

The ML pipeline exposes toggles that change both the training inputs and labels without modifying source code. Loader features and the deterministic guess flag can be combined, while target encodings are mutually exclusive so that model layouts stay predictable.
The project supports multiple training precision levels. Choose exactly one:

| Feature | Description | Notes |
|---|---|---|
| `ml_train_precision_fp32` | Full 32-bit floating point training | Default; required for NdArray backend and HPT |
| `ml_train_precision_fp16` | Half precision (16-bit) training with dynamic loss scaling | Requires compatible backend (e.g., tch with CUDA) |
| `ml_train_precision_bf16` | BFloat16 training with dynamic loss scaling | Requires compatible backend (e.g., tch with CUDA) |

Note: NdArray-based training and hyperparameter tuning require `ml_train_precision_fp32`. Inference always runs on the NdArray backend and automatically converts stored values to f32.
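For example, a half-precision training build on the tch backend could be checked with a feature set like this (an illustrative combination; swap in any compatible loader and target features):

```shell
cargo check --no-default-features \
    --features "cli ml_train ml_tch ml_train_precision_fp16 ml_store_precision_half ml_loader_mel ml_target_folded"
```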
Choose exactly one:

| Feature | Description |
|---|---|
| `ml_store_precision_full` | Store models as full precision |
| `ml_store_precision_half` | Store models as half precision (smaller files) |
Choose exactly one:

| Feature | Description | Input width (before deterministic guess) |
|---|---|---|
| `ml_loader_note_binned_convolution` | Uses the existing note-binned harmonic convolution (128 bins) | 128 |
| `ml_loader_mel` | Applies mel filter banks to the full spectrum (512 bands) | 512 |
| `ml_loader_frequency` | Feeds the raw 8,192-bin frequency spectrum | 8192 |
| `ml_loader_frequency_pooled` | Averages the raw spectrum into 2,048 pooled bins (factor ×4) | 2048 |
Optional add-on:

| Feature | Description |
|---|---|
| `ml_loader_include_deterministic_guess` | Prepends the deterministic 128-note guess vector to whichever loader you selected above (doubling 128-bin inputs, adding 128 to the others) |
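The model's input width follows directly from these two choices. A minimal sketch of the arithmetic (the enum and function are hypothetical illustrations, not the crate's actual API):

```rust
/// Hypothetical mirror of the loader features above.
enum Loader {
    NoteBinnedConvolution, // 128 bins
    Mel,                   // 512 bands
    Frequency,             // 8192 bins
    FrequencyPooled,       // 2048 pooled bins
}

/// Input width per the table above, plus 128 when the
/// deterministic guess vector is prepended.
fn input_width(loader: Loader, include_guess: bool) -> usize {
    let base = match loader {
        Loader::NoteBinnedConvolution => 128,
        Loader::Mel => 512,
        Loader::Frequency => 8192,
        Loader::FrequencyPooled => 2048,
    };
    base + if include_guess { 128 } else { 0 }
}

fn main() {
    assert_eq!(input_width(Loader::NoteBinnedConvolution, true), 256);
    assert_eq!(input_width(Loader::Mel, true), 640);
    assert_eq!(input_width(Loader::FrequencyPooled, false), 2048);
    println!("ok");
}
```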
Choose exactly one:

| Feature | Description | Output width contribution |
|---|---|---|
| `ml_target_full` | Emits the full 128-note mask (per MIDI note across octaves) | +128 |
| `ml_target_folded` | Emits a folded 12-class pitch-class mask (one octave) | +12 |
| `ml_target_folded_bass` | Emits two 12-class masks: a categorical bass pitch class (trained with softmax / cross-entropy) and a multi-hot mask of every pitch class present across all octaves | +24 |
When using ml_target_folded_bass, the bass pitch uses softmax + cross-entropy loss while other notes use binary cross-entropy. Inference decodes bass via argmax to emit a single pitch class.
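A sketch of how that 24-wide folded+bass output could be decoded at inference time (the function name and threshold are illustrative assumptions, not the crate's actual implementation):

```rust
/// Decode a 24-element `ml_target_folded_bass` output: the first 12
/// values are bass pitch-class logits (argmax picks a single class),
/// the last 12 are per-pitch-class probabilities (thresholded multi-hot).
fn decode_folded_bass(output: &[f32; 24], threshold: f32) -> (usize, Vec<usize>) {
    let bass = output[..12]
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)
        .unwrap();
    let present = output[12..]
        .iter()
        .enumerate()
        .filter(|(_, &p)| p >= threshold)
        .map(|(i, _)| i)
        .collect();
    (bass, present)
}

fn main() {
    let mut out = [0.0f32; 24];
    out[0] = 3.0;      // bass logit for pitch class 0 (C)
    out[12] = 0.9;     // C present
    out[12 + 4] = 0.8; // E present
    out[12 + 7] = 0.7; // G present
    let (bass, notes) = decode_folded_bass(&out, 0.5);
    assert_eq!(bass, 0);
    assert_eq!(notes, vec![0, 4, 7]);
    println!("bass pitch class {bass}, notes {notes:?}");
}
```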
```shell
# Default (note-binned + deterministic guess, 128-note target)
cargo check

# Mel features with deterministic guess and folded targets
cargo check --no-default-features \
    --features "cli ml_infer ml_loader_mel ml_loader_include_deterministic_guess ml_target_folded"

# Raw frequency spectrum without deterministic guess, folded targets only
cargo check --no-default-features \
    --features "cli ml_infer ml_loader_frequency ml_target_folded"

# Pooled raw spectrum with deterministic guess, folded targets only
cargo check --no-default-features \
    --features "cli ml_infer ml_loader_frequency_pooled ml_loader_include_deterministic_guess ml_target_folded"

# Pooled spectrum with deterministic guess and folded+bass targets
cargo check --no-default-features \
    --features "cli ml_infer ml_loader_frequency_pooled ml_loader_include_deterministic_guess ml_target_folded_bass"
```

Make sure exactly one loader feature is enabled at a time, and exactly one target feature is enabled overall. The deterministic guess flag can be toggled independently to suit experiments.
- Uses a cosine-annealed learning rate schedule starting from `TrainConfig.adam_learning_rate`
- Reduced-precision training uses dynamic loss scaling with skip-on-overflow for gradient stability
- Scale growth/backoff happens automatically per training step
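The two mechanisms above can be sketched roughly as follows (the formula is the standard cosine annealing schedule; the scaler struct and its constants are illustrative assumptions, not the crate's actual code):

```rust
/// Cosine annealing from `initial_lr` down to `min_lr` over `total_steps`.
fn cosine_lr(initial_lr: f64, min_lr: f64, step: usize, total_steps: usize) -> f64 {
    let progress = step as f64 / total_steps as f64;
    min_lr + 0.5 * (initial_lr - min_lr) * (1.0 + (std::f64::consts::PI * progress).cos())
}

/// Dynamic loss scaler: grow the scale after a run of clean steps,
/// back off and skip the optimizer update when gradients overflow.
struct LossScaler {
    scale: f64,
    good_steps: usize,
    growth_interval: usize, // illustrative constant
}

impl LossScaler {
    /// Returns `true` if the optimizer step should be applied.
    fn on_step(&mut self, overflowed: bool) -> bool {
        if overflowed {
            self.scale /= 2.0; // back off and skip this update
            self.good_steps = 0;
            false
        } else {
            self.good_steps += 1;
            if self.good_steps >= self.growth_interval {
                self.scale *= 2.0;
                self.good_steps = 0;
            }
            true
        }
    }
}

fn main() {
    // LR starts at the configured value and anneals toward the minimum.
    assert!((cosine_lr(1e-3, 0.0, 0, 100) - 1e-3).abs() < 1e-12);
    assert!(cosine_lr(1e-3, 0.0, 100, 100).abs() < 1e-12);

    let mut scaler = LossScaler { scale: 1024.0, good_steps: 0, growth_interval: 2 };
    assert!(!scaler.on_step(true)); // overflow: skip, scale halves
    assert_eq!(scaler.scale, 512.0);
    assert!(scaler.on_step(false));
    assert!(scaler.on_step(false)); // second clean step: scale doubles
    assert_eq!(scaler.scale, 1024.0);
    println!("ok");
}
```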
Before running the release process, ensure you have:

- ✅ crates.io authentication: `cargo login` with your API token
- ✅ npm authentication: `npm login` or `npm adduser`
- ✅ GitHub Container Registry authentication: `docker login ghcr.io -u USERNAME`
- ✅ wasm32-wasip2 target: `rustup target add wasm32-wasip2`
- ✅ Required tools (prefer cargo-binstall for speed):

```shell
# Install cargo-binstall first for faster subsequent installations
curl -L --proto '=https' --tlsv1.2 -sSf https://raw.githubusercontent.com/cargo-bins/cargo-binstall/main/install-from-binstall-release.sh | bash

# Then install tools via binstall
cargo binstall --no-confirm cargo-release # For version bumping
cargo binstall --no-confirm cargo-make    # For task orchestration
cargo binstall --no-confirm wkg           # For OCI publishing
cargo binstall --no-confirm wasm-pack     # For npm WASM builds
```
Follow these steps to cut a new release:

```shell
# Bump versions and create git tags using cargo-release
cargo make release
```

This will:

- Update version numbers in all `Cargo.toml` files
- Create git commits for version bumps
- Create git tags (e.g., `v0.8.0`)
- NOT publish (controlled by the `--no-publish` flag)
```shell
# Publish to crates.io, npm, and GitHub Container Registry
cargo make publish-all
```

This orchestrates:

- ✅ Format check and tests (`check-all`)
- ✅ Build CLI binary (`build-cli`)
- ✅ Build WASM package for npm (`build-npm`)
- ✅ Build Leptos web app (`build-web`)
- ✅ Build WASM binary for OCI (`build-oci`)
- ✅ Publish `kord` crate to crates.io (`publish-crates`)
- ✅ Rename and publish to npm as `kordweb` (`publish-npm`)
- ✅ Publish to GitHub Container Registry at `ghcr.io/twitchax/kord:latest` (`publish-oci`)
```shell
# Push the version tags created by cargo-release
git push --follow-tags
```

🎯 Manual step: Go to GitHub Releases and:

- Click "Draft a new release"
- Select the tag you just pushed (e.g., `v0.8.0`)
- Generate release notes or write your own
- Attach platform binaries from CI artifacts if desired
- Publish the release
Note: CI automatically builds platform binaries (Linux, Windows, macOS) and the WASM binary on every push to main, but does not automatically publish them. All publishing is manual via the steps above.
If you've already bumped versions manually or want to republish:

```shell
cargo make publish-all
```

```shell
# Build components individually
cargo make build-cli
cargo make build-npm
cargo make build-web

# Publish individually
cargo make publish-crates
cargo make publish-npm
```

```shell
# Build the WASI binary for wasip2
cargo make build-oci

# Publish to GitHub Container Registry
cargo make publish-oci
```

Prerequisites:

- Install the `wasm32-wasip2` target: `rustup target add wasm32-wasip2`
- Install the `wkg` tool: `cargo install wkg`
- Authenticate with GitHub Container Registry: `docker login ghcr.io`

The package will be available at `ghcr.io/twitchax/kord:latest` and can be run with any WASI-compatible runtime like Wasmtime or wkg.
Build:

```shell
cargo make docker-build
```

Run:

```shell
cargo make docker-run
```

Deploy:

```shell
cargo make fly-deploy
```

```shell
cargo run --bin kord --no-default-features \
    --features "cli ml_train ml_tch ml_train_precision_fp32 ml_store_precision_full ml_loader_mel ml_target_folded" \
    --release -- -q ml train \
    --backend tch \
    --training-sources samples/captured \
    --training-sources samples/slakh \
    --training-sources sim \
    --noise-asset-root samples/noise \
    --destination model \
    --model-epochs 16
```

```shell
cargo check --bin kord --no-default-features \
    --features "cli ml_train ml_train_precision_fp32 ml_store_precision_full ml_tch ml_loader_mel ml_target_folded"
```

- Add APIs (and likely docs like `rtz`) that allow people to explore chords (may be useful for LLMs). + utoipa
- Add synthesizer to frontend for more pleasant audio feedback.
- Add synthesizer to the website on the "play" buttons.
- Add a button to allow for playing the scales in the describe page (so you can play the stacked chord, or any one of the suggested scales).
- "Cm7" yields a suggested first scale of Gbadd13! during inference since the note ordering is not known. Investigate how to improve this...