Sovereign AI Cookbook

Forjar deployment configs for the complete PAIML sovereign AI stack.
17 Rust components -- 10 deployment stacks -- 14 recipes -- zero third-party runtime dependencies.


Status Dashboard

CI and nightly build status for every component in the sovereign stack.

| Component | Binary | CI | Nightly |
|---|---|---|---|
| realizar | realizar | CI | Nightly |
| aprender | apr | CI | Nightly |
| trueno | trueno-monitor | CI | Nightly |
| trueno-rag | trueno-rag | CI | Nightly |
| entrenar | entrenar | CI | Nightly |
| alimentar | alimentar | CI | Nightly |
| batuta | batuta | CI | Nightly |
| forjar | forjar | CI | Nightly |
| paiml-mcp-agent-toolkit | pmat | CI | Nightly |
| copia | copia | CI | Nightly |
| pzsh | pzsh | CI | Nightly |
| renacer | renacer | CI | Nightly |
| repartir | repartir | CI | Nightly |
| whisper.apr | whisper-apr | CI | Nightly |
| pepita | pepita | CI | Nightly |
| simular | simular | CI | Nightly |
| pacha | pacha | CI | Nightly |

Installation

# Clone the cookbook
git clone https://github.com/paiml/sovereign-ai-cookbook
cd sovereign-ai-cookbook

# Install forjar (the deployment engine)
cargo install forjar

# Or download a nightly binary
curl -L -o forjar https://github.com/paiml/forjar/releases/download/nightly/forjar-x86_64-unknown-linux-musl
chmod +x forjar && mv forjar ~/.cargo/bin/

Quick Start

git clone https://github.com/paiml/sovereign-ai-cookbook
cd sovereign-ai-cookbook

# Validate a stack config
forjar validate -f stacks/01-inference/forjar.yaml

# Plan (dry-run — shows resource DAG)
forjar plan -f stacks/01-inference/forjar.yaml

# Apply (deploys to ephemeral docker containers)
forjar apply -f stacks/01-inference/forjar.yaml

# Check for drift (BLAKE3 verification)
forjar drift -f stacks/01-inference/forjar.yaml
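
The four steps above can be chained so that a failing step stops the run. The following is a minimal sketch that uses only the subcommands shown in this section. The `run_stack` function and the `FORJAR` override variable are hypothetical helpers for illustration, not part of forjar.

```shell
#!/bin/sh
# Run validate, plan, apply, and drift in order for one stack config.
# Stops at the first step that exits non-zero.
# FORJAR is a hypothetical override (not a forjar feature) that lets the
# sequence be previewed, for example with FORJAR=echo.
run_stack() {
    config=$1
    for step in validate plan apply drift; do
        "${FORJAR:-forjar}" "$step" -f "$config" || {
            echo "step '$step' failed for $config" >&2
            return 1
        }
    done
}

# Preview the commands in a subshell without invoking forjar:
( FORJAR=echo; run_stack stacks/01-inference/forjar.yaml )
```

Dropping the `FORJAR=echo` override runs the real binary against the same config.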

Stacks

Each stack is a complete, deployable forjar.yaml targeting Docker containers. For production, change the transport value from container to ssh.

| Stack | What it deploys | Components | Resources |
|---|---|---|---|
| 01-inference | Single-machine model serving | realizar | 11 |
| 02-training | GPU training pipeline | entrenar | 9 |
| 03-rag | Retrieval-augmented generation | trueno-db, trueno-rag, realizar | 35 |
| 04-speech | Speech recognition | whisper-apr | 10 |
| 05-distributed-inference | Multi-node inference | repartir, realizar | 22 |
| 06-full-stack | Complete sovereign AI lab | all components | 86 |
| 07-data-pipeline | Ingest, train, serve | alimentar, entrenar, realizar | 29 |
| 08-observability | Monitoring and tracing | renacer, Jaeger, Grafana | 10 |
| 09-edge-inference | Jetson Orin Nano edge inference | realizar, trueno | 18 |
| 09-qwen-coder | Local coding assistant | aprender (apr-cli) | 16 |

Clean-Room Test Matrix

Stack Matrix — 0/1 pass, 1 fail (updated: 2026-04-23 13:26 UTC)

| Stack | Status | Apply | Idempotent | Resources | Duration |
|---|---|---|---|---|---|
| 01-inference | FAIL | FAIL | | 1 | 28s |

Sovereign Stack Components

Every component is a standalone Rust binary with zero third-party runtime dependencies. All are published to crates.io and built nightly from main.

| Component | Binary | Version | Layer | Description | crates.io |
|---|---|---|---|---|---|
| realizar | realizar | v0.8 | Application | Model serving (GGUF, SafeTensors, CUDA) | realizar |
| aprender | apr | v0.27 | ML Core | Model inspection, inference, training CLI | aprender |
| trueno | trueno-monitor | v0.16 | Compute | SIMD/GPU engine + TUI monitor | trueno |
| trueno-rag | trueno-rag | v0.2 | Application | RAG pipeline (embed, index, query) | trueno-rag |
| entrenar | entrenar | v0.7 | ML Core | Training engine (LoRA, QLoRA, classification) | entrenar |
| alimentar | alimentar | v0.2 | Data | Ingestion, preprocessing, dedup | alimentar |
| batuta | batuta | v0.6 | Infra | Orchestration, mutation testing, oracle | batuta |
| forjar | forjar | v0.5 | Infra | Infrastructure-as-Code provisioning | forjar |
| paiml-mcp-agent-toolkit | pmat | v0.4 | Infra | Code quality, work tracking, coverage | pmat |
| copia | copia | v0.1 | Infra | Sovereign file sync | copia |
| pzsh | pzsh | v0.1 | Infra | Sub-10ms shell framework | pzsh |
| renacer | renacer | v0.10 | Infra | Syscall tracing, Jaeger, Grafana | renacer |
| repartir | repartir | v2.0 | Compute | Distributed execution workers | repartir |
| whisper.apr | whisper-apr | v0.2 | Application | Speech recognition (Whisper) | whisper-apr |
| pepita | pepita | v0.1 | Infra | Kernel namespace isolation, seccomp | pepita |
| simular | simular | v0.3 | Infra | Simulation engine | simular |
| pacha | pacha | v0.2 | Data | Model/data registry, BLAKE3 checksums | pacha |

Nightly Binary Releases

Every component ships cross-platform nightly binaries built from main via GitHub Actions. Binaries are statically linked (musl on Linux) and require no runtime dependencies.

| Binary | Repo | Layer | Description | Platforms |
|---|---|---|---|---|
| realizar | realizar | Application | Model serving (GGUF, SafeTensors, CUDA) | Linux, macOS, Windows |
| apr | aprender | ML Core | Model inspection, inference, training CLI | Linux, macOS, Windows |
| trueno-monitor | trueno | Compute | SIMD/GPU engine + TUI monitor | Linux |
| trueno-rag | trueno-rag | Application | RAG pipeline (embed, index, query) | Linux, macOS, Windows |
| entrenar | entrenar | ML Core | Training engine (LoRA, QLoRA, classification) | Linux, macOS, Windows |
| alimentar | alimentar | Data | Ingestion, preprocessing, dedup | Linux, macOS, Windows |
| batuta | batuta | Infra | Orchestration, mutation testing, oracle | Linux, macOS, Windows |
| forjar | forjar | Infra | Infrastructure-as-Code provisioning | Linux, macOS, Windows |
| pmat | paiml-mcp-agent-toolkit | Infra | Code quality, work tracking, coverage | Linux, macOS, Windows |
| copia | copia | Infra | Sovereign file sync | Linux, macOS, Windows |
| pzsh | pzsh | Infra | Sub-10ms shell framework | Linux, macOS, Windows |
| renacer | renacer | Infra | Syscall tracing, Jaeger, Grafana | Linux, macOS, Windows |
| repartir | repartir | Compute | Distributed execution workers | Linux, macOS, Windows |
| whisper-apr | whisper.apr | Application | Speech recognition (Whisper) | Linux, macOS, Windows |
| pepita | pepita | Infra | Kernel namespace isolation, seccomp | Linux, macOS, Windows |
| simular | simular | Infra | Simulation engine | Linux, macOS, Windows |
| pacha | pacha | Data | Model/data registry, BLAKE3 checksums | Linux, macOS, Windows |

Install any binary:

# Download from nightly release (example: forjar on Linux x86_64)
curl -L -o forjar https://github.com/paiml/forjar/releases/download/nightly/forjar-x86_64-unknown-linux-musl
chmod +x forjar && mv forjar ~/.cargo/bin/

# Or provision automatically via forjar (type: github_release)
forjar apply -f stacks/06-full-stack/forjar.yaml
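
The download command above follows a URL pattern that may also apply to the other components. The sketch below builds such a URL from a repo name, a binary name, and a target triple. The `nightly_url` helper is hypothetical, and the assumption that every repo names its assets `<binary>-<target>` is taken from the single forjar example, so check each release page before relying on it.

```shell
#!/bin/sh
# Build a nightly download URL in the shape of the forjar example.
# Assumption (unverified for other repos): assets are published as
#   https://github.com/paiml/<repo>/releases/download/nightly/<binary>-<target>
nightly_url() {
    repo=$1      # GitHub repo name, e.g. forjar
    binary=$2    # binary name, e.g. forjar
    target=$3    # target triple, e.g. x86_64-unknown-linux-musl
    printf 'https://github.com/paiml/%s/releases/download/nightly/%s-%s\n' \
        "$repo" "$binary" "$target"
}

# Reproduces the forjar URL used in the curl command:
nightly_url forjar forjar x86_64-unknown-linux-musl
```

The repo and binary are separate arguments because several components ship a binary whose name differs from the repo (for example, the apr binary comes from the aprender repo).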

Architecture

                    ┌──────────────────┐
                    │   monitor-box    │
                    │  renacer tracing │
                    │  Grafana + Jaeger│
                    │  pacha registry  │
                    └────────┬─────────┘
                             │
            ┌────────────────┼────────────────┐
            │                │                │
     ┌──────▼──────┐  ┌─────▼──────┐  ┌─────▼──────┐
     │   gpu-box   │  │  rag-box   │  │ worker-box │
     │  realizar   │  │ trueno-db  │  │  repartir  │
     │  entrenar   │  │ trueno-rag │  │  worker    │
     │             │  │ whisper-apr│  │            │
     └─────────────┘  └────────────┘  └────────────┘

All stacks use transport: container with ephemeral Docker containers. Forjar creates the container, applies all resources (packages, files, services, firewall rules, cron jobs), verifies convergence with BLAKE3 hashing, and tears the container down after testing.
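
To illustrate the idea behind hash-based convergence checks, here is a concept sketch that compares a recorded content hash with a fresh one. It is not forjar's implementation. `hash_file` and `check_drift` are hypothetical helpers; `b3sum` is the reference BLAKE3 command-line tool, and `sha256sum` is substituted as a stand-in when `b3sum` is not installed.

```shell
#!/bin/sh
# Concept sketch only: detect drift by comparing a stored hash with a new one.
# Uses b3sum (BLAKE3) when available, otherwise sha256sum as a stand-in.
hash_file() {
    if command -v b3sum >/dev/null 2>&1; then
        b3sum "$1" | cut -d' ' -f1
    else
        sha256sum "$1" | cut -d' ' -f1
    fi
}

# Prints "converged" if the file still matches the recorded hash,
# otherwise "drifted".
check_drift() {
    file=$1
    recorded=$2
    if [ "$(hash_file "$file")" = "$recorded" ]; then
        echo converged
    else
        echo drifted
    fi
}

demo=$(mktemp)
echo 'port = 8080' > "$demo"
recorded=$(hash_file "$demo")     # hash stored at apply time
check_drift "$demo" "$recorded"   # prints: converged
echo 'port = 9090' > "$demo"      # simulate a manual change
check_drift "$demo" "$recorded"   # prints: drifted
rm -f "$demo"
```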

See docs/architecture.md for data flow diagrams and port assignments.

Recipes

Reusable building blocks in recipes/. Each recipe is machine-agnostic -- stacks bind them to specific machines.

| Recipe | Component | What it configures |
|---|---|---|
| realizar-serve | realizar | GPU model serving (GGUF, safetensors), systemd unit, firewall, health check |
| entrenar-train | entrenar | Training config (learning rate, epochs, LoRA rank), GPU setup, checkpoints |
| trueno-rag-pipeline | trueno-rag | Embedding + retrieval pipeline, backed by trueno-db |
| trueno-db-analytics | trueno-db | Analytics/vector database, WAL, compaction |
| alimentar-ingest | alimentar | Data ingestion, preprocessing, dedup, scheduled cron |
| whisper-apr-asr | whisper-apr | ASR service, model download, VAD, beam search |
| pacha-registry | pacha | Model/data registry, BLAKE3 checksums, GC |
| pepita-sandbox | pepita | Kernel namespace isolation, overlay filesystem, seccomp |
| repartir-worker | repartir | Distributed execution worker, TLS, systemd |
| renacer-observability | renacer | Syscall tracing, Jaeger, Grafana, OTLP |
| batuta-agent | batuta | Autonomous agent runtime, mutation testing daemon |
| jetson-edge-base | (platform) | Jetson Orin Nano base: JetPack CUDA, Rust toolchain, sovereign tools |
| sovereign-ai-stack | (meta) | Fleet coordination, health dashboard, inventory |
| apr-inference-server | aprender | GPU inference with model download, BLAKE3 verification |

Testing

All stacks deploy to ephemeral Docker containers: no SSH, no root, and no real hardware required.

# Validate all stacks
make validate

# Plan all stacks (shows resource DAGs)
make plan

# Validate a single stack
make validate-one STACK=03-rag

# Apply a single stack
make apply-one STACK=01-inference

# Check drift after manual changes
make drift-one STACK=01-inference
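
The single-stack targets above can be looped over every stack directory. This is a sketch under two assumptions: that the STACK value equals the directory name under stacks/ (as in stacks/01-inference), and that each stack directory holds a forjar.yaml. `list_stacks`, `apply_and_check_all`, and the `MAKE_CMD` override are hypothetical helpers, not Makefile features.

```shell
#!/bin/sh
# List stack names: each directory under the given root that has a forjar.yaml.
# Assumption: the STACK value is the directory name, as in stacks/01-inference.
list_stacks() {
    root=${1:-stacks}
    for dir in "$root"/*/; do
        if [ -f "${dir}forjar.yaml" ]; then
            basename "$dir"
        fi
    done
}

# Apply each stack and then drift-check it, stopping at the first failure.
# MAKE_CMD is a hypothetical override for previewing (e.g. MAKE_CMD=echo).
apply_and_check_all() {
    for stack in $(list_stacks "$1"); do
        "${MAKE_CMD:-make}" apply-one STACK="$stack" || return 1
        "${MAKE_CMD:-make}" drift-one STACK="$stack" || return 1
    done
}

# Preview from the repository root without running make:
# ( MAKE_CMD=echo; apply_and_check_all stacks )
```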

Production Deployment

Replace container transport with SSH for real machines:

machines:
  gpu-box:
    hostname: gpu-prod-01.internal
    addr: 10.0.1.10
    user: deploy
    arch: x86_64
    ssh_key: ~/.ssh/deploy_key
    roles: [gpu-compute, inference]

Use policy.parallel_machines: true for concurrent multi-machine deployment. Use policy.serial: 1 for rolling deploys.

How to Update

README.md is auto-generated. Never edit it directly.

| What to change | Where to edit | How it deploys |
|---|---|---|
| README content, layout, badges | scripts/generate-readme.sh | Push to main → workflow regenerates |
| Stack test matrix | (automatic) | Clean-room CI injects results between STACK_MATRIX markers |
| Component versions, layers, descriptions | COMPONENTS array in scripts/generate-readme.sh | Push to main → workflow regenerates |
| CI + nightly badges | (automatic) | Generated from COMPONENTS array |
| Component table, nightly links | (automatic) | Generated from COMPONENTS array |

# Preview locally
./scripts/generate-readme.sh

# CI check mode (fails if README.md is stale)
./scripts/generate-readme.sh --check
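
Injecting results between markers, as the stack test matrix row describes, can be sketched with awk. This is an illustration of the general technique, not the repository's script. The `inject_between_markers` helper and the exact marker strings are hypothetical; the real markers may be spelled differently.

```shell
#!/bin/sh
# Replace the lines between two marker lines with the contents of a payload
# file, keeping both markers. The result is written to stdout.
# Marker strings used in the demo are hypothetical.
inject_between_markers() {
    file=$1; start=$2; end=$3; payload_file=$4
    awk -v start="$start" -v end="$end" -v payload="$payload_file" '
        $0 == start {
            print                                  # keep the start marker
            while ((getline line < payload) > 0) print line
            close(payload)
            skipping = 1                           # drop the old content
            next
        }
        $0 == end { skipping = 0 }                 # resume at the end marker
        !skipping { print }
    ' "$file"
}

readme=$(mktemp)
results=$(mktemp)
printf '%s\n' 'intro' '<!-- STACK_MATRIX_START -->' 'old row' \
    '<!-- STACK_MATRIX_END -->' 'outro' > "$readme"
echo 'new row' > "$results"
inject_between_markers "$readme" '<!-- STACK_MATRIX_START -->' \
    '<!-- STACK_MATRIX_END -->' "$results"
rm -f "$readme" "$results"
```

Because the markers are preserved, the same command can be rerun on its own output, which suits a CI job that refreshes the table on every run.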

Contributing

Contributions are welcome. To get started:

  1. Fork the repository
  2. Create a stack or recipe in the appropriate directory
  3. Validate with make validate
  4. Submit a pull request

All stack configs must pass forjar validate before merge. See How to Update for README changes.

Related Repositories

| Project | Purpose | Link |
|---|---|---|
| forjar | Infrastructure as Code (deploys this cookbook) | Book |
| aprender | ML library (models, inference, training) | crates.io |
| trueno | SIMD/GPU compute engine (pure Rust PTX) | crates.io |
| realizar | Model serving (GGUF, SafeTensors, CUDA) | crates.io |
| entrenar | Training engine (LoRA, QLoRA) | crates.io |
| trueno-rag | RAG pipeline (embed, index, query) | crates.io |
| batuta | Orchestration, mutation testing, oracle | crates.io |
| paiml-mcp-agent-toolkit | Code quality, work tracking, coverage | crates.io |
| apr-cookbook | 202 Rust ML code examples | Examples |

License

MIT
