petamorikei/voicepeak-cli

voicepeak-cli

日本語 | English

A command-line interface wrapper for VOICEPEAK text-to-speech software with preset management and automatic audio playback.

What's Different from the Original VOICEPEAK Command?

This wrapper adds the following features to the original VOICEPEAK CLI:

  • 🎵 Auto-play with mpv - Automatically plays generated audio when no output file is specified
  • 📝 Voice presets - Save and reuse combinations of narrator, emotions, and pitch settings
  • 📜 Long text support - Automatically splits texts longer than 140 characters and merges audio chunks
  • 🔧 Advanced playback modes - Choose between batch (generate all → merge → play) or sequential (generate → play one by one)
  • 🔄 Pipe input support - Accept text from stdin: echo "text" | vp
  • 🚀 Background execution - Run with --bg to return shell control immediately while audio generates and plays
  • 🔇 Clean output - Suppresses technical output by default (use --verbose to see debug info)
  • ⚙️ Configuration file - Store your preferred settings in ~/.config/vp/config.toml

Key Benefits

  • Enhanced Workflow: No need to manually save and play audio files - just run and listen
  • Batch Processing: Handle long documents without worrying about character limits
  • Flexible Input: Works with direct text, files, or piped input from other commands
  • Personalization: Save your favorite voice configurations for consistent results
  • Professional Output: Clean interface with optional verbose mode for debugging

Requirements

  • macOS
  • VOICEPEAK installed at /Applications/voicepeak.app/
  • mpv for audio playback (install via Homebrew: brew install mpv)
  • ffmpeg for batch mode and multi-chunk file output (install via Homebrew: brew install ffmpeg)

Installation

From crates.io (Recommended)

cargo install voicepeak-cli

From source

  1. Clone this repository
  2. Build and install:
    cargo install --path .

Usage

Basic Usage

# Simple text-to-speech (requires preset or --narrator)
vp "こんにちは、世界!"

# With explicit narrator
vp "こんにちは、世界!" --narrator "夏色花梨"

# Save to file instead of auto-play
vp "こんにちは、世界!" --narrator "夏色花梨" -o output.wav

# Read from file
vp -t input.txt --narrator "夏色花梨"

# Pipe input
echo "こんにちは、世界!" | vp --narrator "夏色花梨"
cat document.txt | vp -p karin-happy

Using Presets

# List available presets
vp --list-presets

# Use a preset
vp "こんにちは、世界!" -p karin-happy

# Override preset settings
vp "こんにちは、世界!" -p karin-normal --emotion "happy=50"

Voice Controls

# Control speech parameters
vp "こんにちは、世界!" --narrator "夏色花梨" --speed 120 --pitch 50

# List available narrators
vp --list-narrator

# List emotions for a specific narrator
vp --list-emotion "夏色花梨"

Text Length Handling

# Allow automatic text splitting (default)
vp "very long text..."

# Strict mode: reject texts longer than 140 characters
vp "text" --strict-length
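
The README does not say how split points are chosen, so the following is only a minimal sketch of one plausible approach: prefer the last sentence terminator inside each 140-character window and fall back to a hard cut. `split_text`, `check_strict`, `MAX_CHARS`, and the terminator set are hypothetical names and assumptions, not the tool's actual implementation, and "character" is assumed to mean one Unicode code point.

```python
MAX_CHARS = 140  # limit stated in this README

# Assumed sentence terminators (Japanese and ASCII); the real tool may differ.
SENTENCE_ENDS = "。!?!?.\n"


def split_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters."""
    chunks = []
    rest = text.strip()
    while len(rest) > limit:
        window = rest[:limit]
        # Index of the last sentence terminator in the window, or -1 if none.
        cut = max(window.rfind(ch) for ch in SENTENCE_ENDS)
        # Cut just after the terminator; otherwise hard-cut at the limit.
        end = cut + 1 if cut >= 0 else limit
        chunks.append(rest[:end].strip())
        rest = rest[end:].strip()
    if rest:
        chunks.append(rest)
    return chunks


def check_strict(text: str, limit: int = MAX_CHARS) -> None:
    """Mimic --strict-length: reject over-long input instead of splitting."""
    if len(text) > limit:
        raise ValueError(f"text is {len(text)} characters; limit is {limit}")
```

Cutting at sentence boundaries keeps each synthesized chunk sounding natural; the hard cut only applies when a window contains no terminator at all.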

Background Execution

# Run in background (returns shell control immediately)
vp --bg "こんにちは、世界!"

# Combine with other options
vp --bg -p karin-happy "こんにちは、世界!"
vp --bg -o output.wav "こんにちは、世界!"
echo "こんにちは" | vp --bg

Playback Modes

# Batch mode: generate all chunks first, merge, then play (default)
vp "long text" --playback-mode batch

# Sequential mode: generate and play chunks one by one
vp "long text" --playback-mode sequential

# Long text file output (uses ffmpeg to merge chunks)
vp "very long text" -o output.wav

# For sequential playback without ffmpeg
vp "long text" --playback-mode sequential
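
The difference between the two modes is the order of operations. The sketch below illustrates that ordering only; `generate`, `merge`, and `play` are hypothetical callables standing in for VOICEPEAK, ffmpeg, and mpv, and none of this is taken from the tool's source.

```python
from typing import Callable, Sequence


def play_batch(
    chunks: Sequence[str],
    generate: Callable[[str], str],
    merge: Callable[[list[str]], str],
    play: Callable[[str], None],
) -> None:
    """Batch: generate every chunk, merge the files, then play once."""
    files = [generate(chunk) for chunk in chunks]
    if not files:
        return
    # Merging (the ffmpeg step) is only needed for more than one chunk.
    merged = merge(files) if len(files) > 1 else files[0]
    play(merged)


def play_sequential(
    chunks: Sequence[str],
    generate: Callable[[str], str],
    play: Callable[[str], None],
) -> None:
    """Sequential: play each chunk as soon as it has been generated."""
    for chunk in chunks:
        play(generate(chunk))
```

In this sketch, batch mode gives gapless playback at the cost of waiting for all chunks, while sequential mode starts sooner and never calls `merge`, which matches the note above that sequential playback works without ffmpeg.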

Configuration

Configuration is stored in ~/.config/vp/config.toml. The file is automatically created on first run.

Example Configuration

default_preset = "karin-custom"

[[presets]]
name = "karin-custom"
narrator = "夏色花梨"
emotions = [
    { name = "hightension", value = 10 },
    { name = "sasayaki", value = 20 },
]
pitch = 30
speed = 120

[[presets]]
name = "karin-normal"
narrator = "夏色花梨"
emotions = []

[[presets]]
name = "karin-happy"
narrator = "夏色花梨"
emotions = [{ name = "hightension", value = 50 }]

Configuration Fields

  • default_preset: Optional. Preset to use when no -p option is specified
  • presets: Array of voice presets

Preset Fields

  • name: Unique preset identifier
  • narrator: Voice narrator name
  • emotions: Array of emotion parameters with name and value
  • pitch: Optional pitch adjustment (-300 to 300)
  • speed: Optional speed adjustment (50 to 200)

Command-Line Options

Usage: vp [OPTIONS] [TEXT]

Arguments:
  [TEXT]  Text to say (or pipe from stdin)

Options:
  -t, --text <FILE>              Text file to say
  -o, --out <FILE>               Path of output file (optional - will play with mpv if not specified)
  -n, --narrator <NAME>          Name of voice
  -e, --emotion <EXPR>           Emotion expression (e.g., happy=50,sad=50)
  -p, --preset <NAME>            Use voice preset
      --list-narrator            Print voice list
      --list-emotion <NARRATOR>  Print emotion list for given voice
      --list-presets             Print available presets
      --speed <VALUE>            Speed (50 - 200)
      --pitch <VALUE>            Pitch (-300 - 300)
      --strict-length            Reject input longer than 140 characters (default: false, allows splitting)
      --playback-mode <MODE>     Playback mode: sequential or batch (default: batch)
      --bg                       Run in background (return immediately)
  -v, --verbose                  Enable verbose output (show VOICEPEAK debug messages)
  -h, --help                     Print help
  -V, --version                  Print version
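
The `--emotion` option takes a comma-separated list of `name=value` pairs, as in the `happy=50,sad=50` example from the help text. A minimal sketch of parsing that expression follows; `parse_emotions` is a hypothetical name, and tolerating surrounding whitespace is an assumption, since the README does not state how strictly the tool parses it.

```python
def parse_emotions(expr: str) -> dict[str, int]:
    """Parse an expression like 'happy=50,sad=50' into {'happy': 50, 'sad': 50}."""
    emotions: dict[str, int] = {}
    for part in expr.split(","):
        part = part.strip()
        if not part:
            continue  # ignore empty segments such as a trailing comma
        name, sep, value = part.partition("=")
        name = name.strip()
        if not sep or not name:
            raise ValueError(f"expected name=value, got {part!r}")
        # int() raises ValueError for a missing or non-numeric value.
        emotions[name] = int(value)
    return emotions
```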

Parameter Priority

When multiple sources specify the same parameter, the priority order is:

  1. Command-line options (highest priority)
  2. Preset values
  3. Default values / none (lowest priority)

For example:

  • vp "text" -p my-preset --pitch 100 uses pitch=100 (CLI override)
  • vp "text" -p my-preset uses preset's pitch value
  • vp "text" --narrator "voice" uses no pitch adjustment
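
The priority order can be pictured as a simple fallback chain. The sketch below is illustrative only; `resolve` and `resolve_settings` are hypothetical names, and the settings are modelled as plain dictionaries in which a missing key or `None` means "not specified".

```python
from typing import Any, Optional

FIELDS = ("narrator", "pitch", "speed")


def resolve(cli_value: Optional[Any], preset_value: Optional[Any], default: Optional[Any] = None) -> Optional[Any]:
    """Return the CLI value if given, else the preset value, else the default."""
    # Compare against None so that an explicit 0 on the command line still wins.
    if cli_value is not None:
        return cli_value
    if preset_value is not None:
        return preset_value
    return default


def resolve_settings(cli: dict, preset: dict) -> dict:
    """Apply the priority order to each field independently."""
    return {field: resolve(cli.get(field), preset.get(field)) for field in FIELDS}
```

With a preset of `{"narrator": "voice", "pitch": 30}`, passing `{"pitch": 100}` from the command line yields pitch 100, passing nothing yields pitch 30, and `speed` stays `None` in both cases because neither source sets it.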

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for detailed guidelines on how to contribute to this project.
