utensils/latentforge


LatentForge

Nix Flake Python 3.12+ License: MIT Claude Agent SDK

Interactive CLI for building high-quality image datasets for Flux LoRA fine-tuning, powered by the Claude Agent SDK.

Tell the agent what you want a dataset for — it searches, downloads, organizes, curates, deduplicates, resizes, and captions images through natural language conversation.

Quick Start

Run directly (no install)

# With Nix
nix run github:utensils/latentforge

# With uv
uvx latentforge

Install

# With uv
uv tool install latentforge

# Run
latentforge                              # interactive — no config
latentforge --config configs/ghibli.yaml  # with a dataset config

Development

nix develop     # enters devshell with latentforge, ruff, pyright, gallery-dl, uv
latentforge     # run the agent
nix fmt         # format nix + python files

How It Works

latentforge → launches an interactive Claude agent
  → 22 custom MCP tools for image operations
  → Built-in vision to examine images
  → You chat to guide: "build me a dataset for X", "curate the logos", etc.

Example Session

> I want a dataset for Studio Ghibli art styles

  [tool: create_config]
  Created config: configs/ghibli.yaml with 5 categories

> Search for movie poster art and download them

  [tool: search_bing]
  Found 18 image URLs for 'Studio Ghibli movie poster art'
  [tool: download_images]
  Download complete: 15 saved, 2 skipped (dup), 1 failed

> Find duplicates and show me quality stats

  [tool: find_duplicates]
  Found 2 duplicate pairs (threshold=8)
  [tool: analyze_quality]
  Total: 15 images, avg 1340x1020, 12 at 1024+

Tools

22 custom MCP tools across the full dataset workflow:

Category   Tools
Config     create_config, read_config, update_config, list_configs
Search     search_bing, search_wikimedia
Download   download_images (MD5 dedup), download_gallery (gallery-dl, 80+ sites)
Browse     list_images, get_image_info
Organize   move_images, organize_images
Quality    analyze_quality, find_duplicates, detect_screenshots
Cropping   crop_center, crop_smart, crop_faces
Faces      detect_faces
Training   resize_images, write_caption
Export     export_dataset (ai-toolkit format)

The agent also has built-in Read (with vision for viewing images), Write, and Bash tools.
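The write_caption tool's convention (described under Workflow: a .txt caption with the trigger word alongside each image) can be sketched as follows. The helper below is hypothetical, not the tool's actual implementation; it only illustrates the same-stem .txt pairing that LoRA trainers such as ai-toolkit expect.

```python
from pathlib import Path


def write_caption(image_path: str, description: str, trigger_word: str) -> Path:
    """Write a sibling .txt caption next to an image (hypothetical helper).

    LoRA trainers read captions from a .txt file sharing the image's stem,
    e.g. poster_a1b2.jpg -> poster_a1b2.txt.
    """
    caption_path = Path(image_path).with_suffix(".txt")
    caption_path.write_text(f"{trigger_word}, {description}\n", encoding="utf-8")
    return caption_path
```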

Slash Commands

Type these during a session:

Command         Description
/help           Show available commands
/config         Show active dataset config
/tools          List all agent tools
/cost           Show session cost
/status         Session status and context usage
/model <name>   Switch Claude model (forks session)
/export [path]  Export dataset to ai-toolkit format
/compact        Compact context (summarize + fresh session)
/quit           Exit

Dataset Config

Each dataset is a YAML file in configs/. The agent can create these for you, or you can write them by hand:

name: ghibli
subject: "Studio Ghibli"
trigger_word: "ghibli_style"
output_dir: ./datasets/ghibli
search_queries:
  posters:
    - "Studio Ghibli movie poster art"
    - "Spirited Away poster"
  backgrounds:
    - "Studio Ghibli background art landscape"
categories:
  posters: "Movie poster art"
  backgrounds: "Background paintings and landscapes"
curation:
  target_count: "50-150"
  min_resolution: 512
  training_resolution: 1024
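If you write configs by hand, a file like the one above can be read with PyYAML (a third-party package). The loader below is a hypothetical sketch, not LatentForge's own code; it parses the config and flattens search_queries into (category, query) pairs ready to run.

```python
import yaml  # third-party: pip install pyyaml


def load_dataset_config(path: str) -> dict:
    """Parse a dataset YAML config and flatten its search queries.

    Hypothetical helper: returns the parsed config with an added
    'flat_queries' list of (category, query) pairs.
    """
    with open(path, encoding="utf-8") as f:
        cfg = yaml.safe_load(f)
    cfg["flat_queries"] = [
        (category, query)
        for category, queries in cfg.get("search_queries", {}).items()
        for query in queries
    ]
    return cfg
```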

Dataset Structure

Datasets are stored under datasets/<name>/ with category subdirectories:

datasets/
└── ghibli/
    ├── posters/
    │   ├── studio_ghibli_movie_poster_a1b2c3d4e5f6.jpg
    │   ├── studio_ghibli_movie_poster_a1b2c3d4e5f6.txt
    │   └── ...
    └── backgrounds/
        ├── ghibli_background_art_7g8h9i0j1k2l.png
        └── ...

Images follow the naming pattern {query_prefix}_{md5_hash}.{ext} — the MD5 hash ensures deduplication across runs.
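A sketch of how such a content-addressed filename could be derived. This is not the tool's actual code; the 12-character truncation matches the sample filenames above, and the slugging of the query prefix is an assumption.

```python
import hashlib
import re


def dedup_filename(query: str, data: bytes, ext: str) -> str:
    """Build a {query_prefix}_{md5_hash}.{ext} name (illustrative sketch).

    The name embeds the MD5 of the file contents, so re-downloading the
    same image yields the same filename and the download can be skipped.
    """
    prefix = re.sub(r"[^a-z0-9]+", "_", query.lower()).strip("_")
    digest = hashlib.md5(data).hexdigest()[:12]  # 12 chars, as in the samples
    return f"{prefix}_{digest}.{ext}"
```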

Workflow

  1. Configure — Create a YAML config (or ask the agent to make one)
  2. Collect — Search Bing/Wikimedia and download with MD5 dedup
  3. Organize — Auto-sort by category using filename prefixes
  4. Curate — Agent views images and helps reject low-quality ones
  5. Deduplicate — Perceptual hash detection finds near-duplicates
  6. Resize — Batch resize to training resolution (default 1024x1024)
  7. Caption — Write .txt captions with trigger word alongside each image
  8. Export — Export to ai-toolkit format with auto-generated training config
  9. Train — Use ai-toolkit, kohya-ss/sd-scripts, or similar
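Step 5's near-duplicate detection can be illustrated with a from-scratch average hash. Real tools typically lean on a library such as imagehash, but the idea is the same: reduce each image to a 64-bit fingerprint, then flag pairs whose Hamming distance falls at or below a threshold (8 in the session above). For self-containment this sketch works on an 8x8 grayscale grid rather than an image file.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """64-bit average hash of an 8x8 grayscale grid (from-scratch sketch).

    Each bit records whether a pixel is brighter than the grid's mean,
    so small crops or re-encodes of the same image hash similarly.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(h1: int, h2: int) -> int:
    """Count differing bits; pairs within a threshold (e.g. 8) are near-duplicates."""
    return bin(h1 ^ h2).count("1")
```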

Authentication

Set one of:

  • ANTHROPIC_API_KEY — Anthropic API key
  • CLAUDE_CODE_OAUTH_TOKEN — OAuth token (used when no API key is present)
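The fallback order can be sketched as follows. This is a hypothetical helper illustrating the precedence described above, not the SDK's actual API.

```python
import os


def resolve_auth() -> tuple[str, str]:
    """Pick credentials: the API key wins; the OAuth token is the fallback."""
    api_key = os.environ.get("ANTHROPIC_API_KEY")
    if api_key:
        return ("api_key", api_key)
    token = os.environ.get("CLAUDE_CODE_OAUTH_TOKEN")
    if token:
        return ("oauth_token", token)
    raise RuntimeError("Set ANTHROPIC_API_KEY or CLAUDE_CODE_OAUTH_TOKEN")
```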

License

MIT
