voicepeak-cli

日本語 | English

A command-line interface wrapper for VOICEPEAK text-to-speech software with preset management and automatic audio playback.

What's Different from the Original VOICEPEAK Command?

This wrapper adds the following features on top of the original VOICEPEAK CLI:

🎵 Auto-play with mpv - Automatically plays generated audio when no output file is specified
📝 Voice presets - Save and reuse combinations of narrator, emotions, and pitch settings
📜 Long text support - Automatically splits text longer than 140 characters into chunks and merges the resulting audio
🔧 Advanced playback modes - Choose between batch (generate all → merge → play) or sequential (generate → play one by one)
🔄 Pipe input support - Accept text from stdin: echo "text" | vp
🚀 Background execution - Run with --bg to return shell control immediately while audio generates and plays
🔇 Clean output - Suppresses technical output by default (use --verbose to see debug info)
⚙️ Configuration file - Store your preferred settings in ~/.config/vp/config.toml

Key Benefits

Enhanced Workflow: No need to manually save and play audio files - just run and listen
Batch Processing: Handle long documents without worrying about character limits
Flexible Input: Works with direct text, files, or piped input from other commands
Personalization: Save your favorite voice configurations for consistent results
Professional Output: Clean interface with optional verbose mode for debugging

Requirements

macOS
VOICEPEAK installed at /Applications/voicepeak.app/
mpv for audio playback (install via Homebrew: brew install mpv)
ffmpeg for batch mode and multi-chunk file output (install via Homebrew: brew install ffmpeg)

Installation

From crates.io (Recommended)

cargo install voicepeak-cli

From source

Clone this repository
Build and install:
```
cargo install --path .
```

Usage

Basic Usage

# Simple text-to-speech (requires preset or --narrator)
vp "こんにちは、世界！"

# With explicit narrator
vp "こんにちは、世界！" --narrator "夏色花梨"

# Save to file instead of auto-play
vp "こんにちは、世界！" --narrator "夏色花梨" -o output.wav

# Read from file
vp -t input.txt --narrator "夏色花梨"

# Pipe input
echo "こんにちは、世界！" | vp --narrator "夏色花梨"
cat document.txt | vp -p karin-happy

Using Presets

# List available presets
vp --list-presets

# Use a preset
vp "こんにちは、世界！" -p karin-happy

# Override preset settings
vp "こんにちは、世界！" -p karin-normal --emotion "happy=50"

Voice Controls

# Control speech parameters
vp "こんにちは、世界！" --narrator "夏色花梨" --speed 120 --pitch 50

# List available narrators
vp --list-narrator

# List emotions for a specific narrator
vp --list-emotion "夏色花梨"

Text Length Handling

# Allow automatic text splitting (default)
vp "very long text..."

# Strict mode: reject texts longer than 140 characters
vp "text" --strict-length
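
The README does not say how split points are chosen, so the following Rust sketch shows only one plausible approach. It counts Unicode characters rather than bytes, prefers to break right after sentence-ending punctuation, and falls back to a hard cut at the limit. The function name `split_text` and the punctuation set are assumptions made for illustration, not the tool's actual implementation.

```rust
/// Split `text` into chunks of at most `max_chars` Unicode characters.
/// Hypothetical helper: breaks after the last sentence terminator that
/// fits in the window, otherwise cuts exactly at the limit.
fn split_text(text: &str, max_chars: usize) -> Vec<String> {
    assert!(max_chars > 0, "max_chars must be positive");
    // Assumed set of sentence terminators (Japanese and ASCII).
    const TERMINATORS: [char; 6] = ['。', '！', '？', '.', '!', '?'];

    let chars: Vec<char> = text.chars().collect();
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let hard_end = (start + max_chars).min(chars.len());
        let end = if hard_end == chars.len() {
            // The remainder fits in one chunk.
            hard_end
        } else {
            // Search backwards for the last terminator inside the window.
            chars[start..hard_end]
                .iter()
                .rposition(|c| TERMINATORS.contains(c))
                .map(|i| start + i + 1)
                .unwrap_or(hard_end)
        };
        let chunk: String = chars[start..end].iter().collect();
        chunks.push(chunk);
        start = end;
    }
    chunks
}
```

Calling `split_text(input, 140)` returns chunks that concatenate back to the original input, so no text is lost. With `--strict-length`, a wrapper would instead reject the input when it has more than 140 characters.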

Background Execution

# Run in background (returns shell control immediately)
vp --bg "こんにちは、世界！"

# Combine with other options
vp --bg -p karin-happy "こんにちは、世界！"
vp --bg -o output.wav "こんにちは、世界！"
echo "こんにちは" | vp --bg

Playback Modes

# Batch mode: generate all chunks first, merge, then play (default)
vp "long text" --playback-mode batch

# Sequential mode: generate and play chunks one by one
vp "long text" --playback-mode sequential

# Long text file output (uses ffmpeg to merge chunks)
vp "very long text" -o output.wav

# For sequential playback without ffmpeg
vp "long text" --playback-mode sequential
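
To make the difference between the two modes concrete, here is a small Rust sketch that lists the steps each mode would perform, in order. The names `PlaybackMode` and `plan_steps` are hypothetical, and the sketch assumes strictly ordered steps. The README does not say whether the real tool overlaps generation and playback in sequential mode.

```rust
/// The two playback modes named in the README.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PlaybackMode {
    Batch,
    Sequential,
}

/// Hypothetical helper: describe, in order, what each mode does for
/// `chunks` pieces of text.
fn plan_steps(mode: PlaybackMode, chunks: usize) -> Vec<String> {
    let mut steps = Vec::new();
    match mode {
        PlaybackMode::Batch => {
            // Generate everything first, then merge, then play once.
            for i in 1..=chunks {
                steps.push(format!("generate chunk {}", i));
            }
            steps.push("merge chunks with ffmpeg".to_string());
            steps.push("play merged audio".to_string());
        }
        PlaybackMode::Sequential => {
            // Play each chunk as soon as it has been generated.
            for i in 1..=chunks {
                steps.push(format!("generate chunk {}", i));
                steps.push(format!("play chunk {}", i));
            }
        }
    }
    steps
}
```

In this sketch, sequential mode has no merge step, which is consistent with it being usable without ffmpeg.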

Configuration

Configuration is stored in ~/.config/vp/config.toml. The file is automatically created on first run.

Example Configuration

default_preset = "karin-custom"

[[presets]]
name = "karin-custom"
narrator = "夏色花梨"
emotions = [
    { name = "hightension", value = 10 },
    { name = "sasayaki", value = 20 },
]
pitch = 30
speed = 120

[[presets]]
name = "karin-normal"
narrator = "夏色花梨"
emotions = []

[[presets]]
name = "karin-happy"
narrator = "夏色花梨"
emotions = [{ name = "hightension", value = 50 }]

Configuration Fields

default_preset: Optional. Preset to use when no -p option is specified
presets: Array of voice presets

Preset Fields

name: Unique preset identifier
narrator: Voice narrator name
emotions: Array of emotion parameters with name and value
pitch: Optional pitch adjustment (-300 to 300)
speed: Optional speed adjustment (50 to 200)
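
As an illustration of the documented ranges, the following Rust sketch checks the optional `pitch` and `speed` values of a preset. The function name `validate_preset` is hypothetical, and whether the tool itself validates presets when loading the configuration is an assumption.

```rust
/// Hypothetical validation of the optional preset fields, using the
/// ranges documented above: pitch -300..=300, speed 50..=200.
/// A missing (None) value is always accepted.
fn validate_preset(pitch: Option<i32>, speed: Option<i32>) -> Result<(), String> {
    if let Some(p) = pitch {
        if !(-300..=300).contains(&p) {
            return Err(format!("pitch {} is outside -300..=300", p));
        }
    }
    if let Some(s) = speed {
        if !(50..=200).contains(&s) {
            return Err(format!("speed {} is outside 50..=200", s));
        }
    }
    Ok(())
}
```

For the `karin-custom` preset in the example configuration, `validate_preset(Some(30), Some(120))` returns `Ok(())`.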

Command-Line Options

Usage: vp [OPTIONS] [TEXT]

Arguments:
  [TEXT]  Text to say (or pipe from stdin)

Options:
  -t, --text <FILE>              Text file to say
  -o, --out <FILE>               Path of output file (optional - will play with mpv if not specified)
  -n, --narrator <NAME>          Name of voice
  -e, --emotion <EXPR>           Emotion expression (e.g., happy=50,sad=50)
  -p, --preset <NAME>            Use voice preset
      --list-narrator            Print voice list
      --list-emotion <NARRATOR>  Print emotion list for given voice
      --list-presets             Print available presets
      --speed <VALUE>            Speed (50 - 200)
      --pitch <VALUE>            Pitch (-300 - 300)
      --strict-length            Reject input longer than 140 characters (default: false, allows splitting)
      --playback-mode <MODE>     Playback mode: sequential or batch (default: batch)
      --bg                       Run in background (return immediately)
  -v, --verbose                  Enable verbose output (show VOICEPEAK debug messages)
  -h, --help                     Print help
  -V, --version                  Print version
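
The `--emotion` option takes a comma-separated list of `name=value` pairs. The Rust sketch below shows one way such an expression could be parsed. `parse_emotions` is a hypothetical helper, and the tolerance for surrounding whitespace and empty segments is an assumption, not documented behavior.

```rust
/// Parse an emotion expression such as "happy=50,sad=50" into
/// (name, value) pairs. Hypothetical helper, not the tool's parser.
fn parse_emotions(expr: &str) -> Result<Vec<(String, i32)>, String> {
    let mut pairs = Vec::new();
    // Skip empty segments so that "" and trailing commas are tolerated.
    for part in expr.split(',').map(str::trim).filter(|p| !p.is_empty()) {
        let (name, value) = part
            .split_once('=')
            .ok_or_else(|| format!("expected name=value, got '{}'", part))?;
        let value: i32 = value
            .trim()
            .parse()
            .map_err(|_| format!("invalid value in '{}'", part))?;
        pairs.push((name.trim().to_string(), value));
    }
    Ok(pairs)
}
```

The resulting pairs correspond to the `{ name = ..., value = ... }` entries of a preset's `emotions` array.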

Parameter Priority

When multiple sources specify the same parameter, the priority order is:

Command-line options (highest priority)
Preset values
Default values / none (lowest priority)

For example:

vp "text" -p my-preset --pitch 100: uses pitch=100 (CLI override)
vp "text" -p my-preset: uses the preset's pitch value
vp "text" --narrator "voice": applies no pitch adjustment
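
The priority order above maps naturally onto optional values. As a minimal Rust sketch (the helper name `resolve` is hypothetical, and the tool's real code may be structured differently):

```rust
/// Resolve one optional parameter: the command-line value wins,
/// then the preset's value, otherwise nothing is applied.
fn resolve<T>(cli: Option<T>, preset: Option<T>) -> Option<T> {
    cli.or(preset)
}
```

With a preset pitch of 30, `resolve(Some(100), Some(30))` yields `Some(100)`, `resolve(None, Some(30))` yields `Some(30)`, and `resolve::<i32>(None, None)` yields `None`, matching the three examples above.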

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines on how to contribute to this project.
