Merged
22 commits
ae5768d
extend: add fs and vfio attacher interfaces and specs
CMGS Apr 27, 2026
8fa8a2b
vm: add shared-memory config for CH virtio-fs readiness
CMGS Apr 27, 2026
a63968e
cloudhypervisor: add vm.add-fs / vm.add-device API helpers
CMGS Apr 27, 2026
1da43b9
cloudhypervisor: implement fs.Attacher and vfio.Attacher
CMGS Apr 27, 2026
a7aad99
cmd/vm: add fs and device attach/detach subcommands
CMGS Apr 27, 2026
6d9df66
cmd/vm: extend inspect with runtime attached devices
CMGS Apr 27, 2026
401c2be
docs: document runtime device attach and constraints
CMGS Apr 27, 2026
112a7b0
extend: senior-review cleanup pass
CMGS Apr 27, 2026
0b591ab
cloudhypervisor: fix runningVMClient and DeviceAttach check order
CMGS Apr 27, 2026
63e3531
cloudhypervisor: factor attachWith / detachWith / listWith helpers
CMGS Apr 28, 2026
8b0c726
cloudhypervisor: extract inspectRunning helper for the three with-hel…
CMGS Apr 28, 2026
2a0ce2c
extend, vm/debug: address two P2 review findings
CMGS Apr 28, 2026
98a5dfe
docs: tighten --pci spec wording, document virtiofsd single-shot life…
CMGS Apr 28, 2026
8a1095c
extend, vm: /code review fixes (vmAPIOnce, omitempty, debug args, TODO)
CMGS Apr 28, 2026
36a23ea
extend, vm: review-followup — flatten attachGroup, Normalize, pidfile…
CMGS Apr 28, 2026
4be6943
extend, vm: reorder methods/standalone funcs per SKILL.md
CMGS Apr 28, 2026
25d9454
extend, vm: simplify utils/http helpers and route snapshot/restore to…
CMGS Apr 28, 2026
17813b0
cloudhypervisor: route removeDeviceVM through vmAPIOnce
CMGS Apr 28, 2026
3830b73
cloudhypervisor, firecracker: route non-idempotent endpoints through …
CMGS Apr 28, 2026
22918e8
docs: trim verbose comments
CMGS Apr 28, 2026
ce1599e
refactor(hypervisor): reorder backend.go layout per SKILL.md
CMGS Apr 28, 2026
ee75245
refactor: final /code + /simplify pass — close audit loop
CMGS Apr 28, 2026
16 changes: 16 additions & 0 deletions KNOWN_ISSUES.md
@@ -213,3 +213,19 @@ On CNI plugins with strict per-veth MAC enforcement (Cilium eBPF, Calico eBPF),
**Upstream status**: FC's `NetworkOverride` struct only has `iface_id` and `host_dev_name` — no `guest_mac` field. Adding it would follow the existing `VsockOverride` pattern. No issue or PR exists yet.

**Workaround**: If MAC matching is required, run `ip link set dev ethX address <new-mac>` inside the guest after clone (the post-clone hints print the expected MAC values).

## Vhost-user-fs requires VM-level shared memory

`cocoon vm fs attach` only works on CH VMs that were created with `--shared-memory`. CH's `memory shared=on` is fixed at VM creation: backend processes (e.g. virtiofsd) need to mmap guest memory via the negotiated memfd, and the memory model cannot be flipped on a running VM. If `--shared-memory` was omitted at create time, the only path is to recreate the VM. Cocoon's preflight reads `vm.info` and surfaces a clear error rather than letting CH return a vague rejection.

## Snapshotting a VM with attached vhost-user-fs / VFIO is rejected by CH

Cloud Hypervisor refuses to snapshot a VM that holds a vhost-user-fs share or a VFIO PCI passthrough device. Cocoon does not block the call client-side (the rejection comes from CH itself); the surfaced error explains the cause. Run `cocoon vm fs detach` / `cocoon vm device detach` first to clear runtime devices, then snapshot.

## Runtime attached devices do not survive VM stop / clone / restore

Devices attached via `cocoon vm fs attach` and `cocoon vm device attach` are runtime-only: they live in the CH process state and are never written into the VM record, sidecar, or snapshot. After `vm stop`, `vm clone`, or `vm restore`, the user must re-run the attach commands. `vm inspect` reflects the live CH `vm.info` for running VMs and omits `attached_devices` for stopped VMs. This is by design: cocoon does not own the backend (virtiofsd / vfio-pci binding) and cannot guarantee the resource still exists across host events.

## virtiofsd is a single-shot daemon

Upstream virtiofsd serves exactly one vhost-user client and exits when that client disconnects. Consequence: after `cocoon vm fs detach`, the daemon is gone — a follow-up `cocoon vm fs attach` against the same socket path will hang or time out until a fresh `virtiofsd` instance is launched. The same applies after `cocoon vm stop` (CH closes the socket on shutdown). Scripts that cycle attach/detach should respawn virtiofsd between calls. This is a virtiofsd behavior, not a cocoon limitation.
65 changes: 65 additions & 0 deletions README.md
@@ -140,6 +140,12 @@ cocoon
│ ├── rm [flags] VM [VM...] Delete VM(s) (--force to stop first)
│ ├── restore [flags] VM SNAP Restore a running VM to a snapshot
│ ├── status [VM...] Watch VM status in real time
│ ├── fs
│ │ ├── attach [flags] VM Attach a vhost-user-fs share (CH only)
│ │ └── detach [flags] VM Detach a vhost-user-fs share by --tag
│ ├── device
│ │ ├── attach [flags] VM Attach a VFIO PCI device (CH only)
│ │ └── detach [flags] VM Detach a VFIO PCI device by --id
│ └── debug [flags] IMAGE Generate hypervisor launch command (dry run)
├── snapshot
│ ├── save [flags] VM Create a snapshot from a running VM
@@ -187,6 +193,7 @@ Applies to `cocoon vm create`, `cocoon vm run`, and `cocoon vm debug`:
| `--no-direct-io` | `false` | Disable O_DIRECT on writable disks (use page cache; CH only, useful for dev/test with few VMs) |
| `--data-disk` | empty (repeatable) | Attach an extra data disk: `size=20G[,name=...][,fstype=ext4|none][,mount=/mnt/x][,directio=on|off|auto]`. See [Data Disks](#data-disks) |
| `--windows` | `false` | Windows guest (UEFI boot, kvm_hyperv=on, no cidata) |
| `--shared-memory` | `false` | Enable CH `memory shared=on`; required for later `vm fs attach` (CH only, fixed for VM lifetime) |

### Clone Flags

@@ -374,6 +381,64 @@ Phase 1 inherits data disks 1:1: snapshot reflinks each `data-<name>.raw` into t

Restore preflight verifies sidecar integrity, file presence (vmstate, memory, COW, every `data-*.raw`), and per-index Role/Path/RO match between sidecar and CH config.json **before** killing the running VM, so a malformed or imported snapshot fails fast and leaves the live VM untouched.

## Runtime Device Attach (Cloud Hypervisor only)

Cocoon can hot-plug two classes of external resources onto a running VM:

- **Vhost-user-fs** — a file share served by an external `virtiofsd` (or compatible backend) over a Unix socket. Attach surfaces a virtio-fs device in the guest, accessible via `mount -t virtiofs <tag> /mnt/...`.
- **VFIO PCI passthrough** — a host PCI device bound to `vfio-pci` (GPU, NIC, NVMe). Attach hands the device to the guest with IOMMU isolation.

Both attaches are **runtime-only**: the device lives only for the current VM process and is gone after stop/restart. Cocoon does not own the backend lifecycle (the user runs `virtiofsd`, binds the PCI device, etc.). Attached devices are not part of the VM record and are not preserved by snapshot / clone / restore. Cloud Hypervisor itself rejects snapshotting a VM that has vhost-user or VFIO devices attached, so detach them before taking a snapshot.

### Vhost-user-fs

Prerequisite: the VM must have been created with `--shared-memory`. CH's `memory shared=on` cannot be flipped on a running VM, and vhost-user-fs requires it to share guest memory with the backend process.

```bash
# 1) Boot VM with shared-memory enabled.
cocoon vm run --shared-memory --name share-host ghcr.io/cocoonstack/cocoon/ubuntu:24.04

# 2) On host, run virtiofsd against a directory.
virtiofsd --socket-path=/tmp/virtiofsd.sock --shared-dir=/srv/data --cache=never &

# 3) Attach to the VM (tag is the guest mount tag and detach key).
cocoon vm fs attach share-host --socket /tmp/virtiofsd.sock --tag data

# 4) Inside the guest:
mkdir -p /mnt/data && mount -t virtiofs data /mnt/data

# 5) Detach later:
cocoon vm fs detach share-host --tag data
```

Flags:

| Flag | Default | Description |
|------|---------|-------------|
| `--socket` | required | Absolute path to the virtiofsd unix socket |
| `--tag` | required | Guest mount tag (also detach key) |
| `--num-queues` | `1` | Request queues |
| `--queue-size` | `1024` | Queue depth |
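
The flag table can be read as a small validation-plus-defaults step. The sketch below illustrates that reading only: `fsSpec` and `normalize` are hypothetical, and the exact rules cocoon applies are not shown in this document.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
)

// fsSpec is a hypothetical mirror of the four flags in the table above.
type fsSpec struct {
	Socket    string
	Tag       string
	NumQueues int
	QueueSize int
}

// normalize rejects specs that miss a required field and fills in the
// documented defaults for the two optional ones.
func (s fsSpec) normalize() (fsSpec, error) {
	if !filepath.IsAbs(s.Socket) {
		return s, fmt.Errorf("--socket %q must be an absolute path", s.Socket)
	}
	if s.Tag == "" {
		return s, errors.New("--tag is required")
	}
	if s.NumQueues < 0 || s.QueueSize < 0 {
		return s, errors.New("--num-queues and --queue-size must not be negative")
	}
	if s.NumQueues == 0 {
		s.NumQueues = 1 // documented default
	}
	if s.QueueSize == 0 {
		s.QueueSize = 1024 // documented default
	}
	return s, nil
}

func main() {
	spec, err := fsSpec{Socket: "/tmp/virtiofsd.sock", Tag: "data"}.normalize()
	fmt.Printf("%+v %v\n", spec, err)
}
```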

### VFIO PCI passthrough

Prerequisite: host has `intel_iommu=on` (or `amd_iommu=on`) on the kernel command line and the target PCI device is bound to `vfio-pci`.

```bash
# Bind the device on the host (one-time per device).
echo 0000:01:00.0 > /sys/bus/pci/drivers/vfio-pci/bind # see https://wiki.archlinux.org/title/PCI_passthrough_via_OVMF

# --pci accepts: short BDF (01:00.0), full BDF (0000:01:00.0), or
# sysfs path under /sys/bus/pci/devices/. Other absolute paths are
# rejected so cocoon does not forward a non-PCI directory to CH.
cocoon vm device attach my-vm --pci 01:00.0 --id mygpu

# Detach.
cocoon vm device detach my-vm --id mygpu
```
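
The three accepted `--pci` spellings can be folded into one canonical form. The following is a sketch of that idea under the assumption that the backend ultimately wants a path under `/sys/bus/pci/devices/`. `normalizePCI` and `bdfPattern` are hypothetical names, not cocoon's implementation.

```go
package main

import (
	"fmt"
	"path/filepath"
	"regexp"
	"strings"
)

const sysfsPCIDevices = "/sys/bus/pci/devices"

// bdfPattern matches a short (bb:dd.f) or full (dddd:bb:dd.f) PCI address.
// It is looser than the PCI spec (device numbers above 1f are not rejected).
var bdfPattern = regexp.MustCompile(`^(?:([0-9a-fA-F]{4}):)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$`)

// normalizePCI maps a short BDF, a full BDF, or a sysfs device path to
// one sysfs device path. Absolute paths outside sysfsPCIDevices are rejected.
func normalizePCI(spec string) (string, error) {
	if strings.HasPrefix(spec, "/") {
		clean := filepath.Clean(spec)
		if filepath.Dir(clean) != sysfsPCIDevices {
			return "", fmt.Errorf("pci path %q is not under %s", spec, sysfsPCIDevices)
		}
		spec = filepath.Base(clean)
	}
	m := bdfPattern.FindStringSubmatch(spec)
	if m == nil {
		return "", fmt.Errorf("invalid pci address %q", spec)
	}
	domain := m[1]
	if domain == "" {
		domain = "0000" // a short BDF implies PCI domain 0
	}
	bdf := strings.ToLower(fmt.Sprintf("%s:%s:%s.%s", domain, m[2], m[3], m[4]))
	return filepath.Join(sysfsPCIDevices, bdf), nil
}

func main() {
	for _, in := range []string{"01:00.0", "0000:01:00.0", "/sys/bus/pci/devices/0000:01:00.0", "/tmp/0000:01:00.0"} {
		out, err := normalizePCI(in)
		fmt.Println(in, out, err)
	}
}
```

Checking the parent directory after `filepath.Clean` is what keeps a path such as `/tmp/0000:01:00.0` from being forwarded, even though its last component looks like a valid address.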

`cocoon vm inspect VM` includes an `attached_devices` field for running VMs that surfaces every attached vhost-user-fs share and VFIO device, read live from CH `vm.info`. The field is omitted for stopped VMs.
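
One common way to get "field present for running VMs, omitted for stopped VMs" in Go is the `omitempty` JSON tag on a slice. The sketch below illustrates that mechanism only. `inspectView` and `attachedDevice` are hypothetical shapes, and the real `vm inspect` output may carry different fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// attachedDevice and inspectView are hypothetical shapes for illustration.
type attachedDevice struct {
	Kind string `json:"kind"` // e.g. "fs" or "vfio"
	ID   string `json:"id"`
}

type inspectView struct {
	Name            string           `json:"name"`
	State           string           `json:"state"`
	AttachedDevices []attachedDevice `json:"attached_devices,omitempty"`
}

// render marshals the view to compact JSON.
func render(v inspectView) string {
	out, err := json.Marshal(v)
	if err != nil {
		return err.Error()
	}
	return string(out)
}

func main() {
	// Stopped VM: the slice is nil, so the key is dropped entirely.
	fmt.Println(render(inspectView{Name: "my-vm", State: "stopped"}))
	// Running VM with one VFIO device attached.
	fmt.Println(render(inspectView{
		Name:            "my-vm",
		State:           "running",
		AttachedDevices: []attachedDevice{{Kind: "vfio", ID: "mygpu"}},
	}))
}
```

With this tag an empty but non-nil slice is dropped as well, so under this sketch a running VM with nothing attached would also show no `attached_devices` key.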

## Windows Support

Cocoon supports Windows guests via the `--windows` flag:
62 changes: 33 additions & 29 deletions cmd/core/helpers.go
@@ -270,27 +270,41 @@ func EnsureImage(ctx context.Context, backends []imagebackend.Images, vmCfg *typ
 }

 func ResolveImageOwner(ctx context.Context, backends []imagebackend.Images, ref string) (imagebackend.Images, error) {
-	var matches []imagebackend.Images
-	for _, b := range backends {
+	return resolveOwner(backends, ref, func(b imagebackend.Images) (bool, error) {
 		img, err := b.Inspect(ctx, ref)
+		return img != nil, err
+	},
+		fmt.Errorf("image %q: not found in any backend", ref),
+		fmt.Errorf("image %s: %w", ref, imagebackend.ErrAmbiguous),
+	)
+}
+
+// resolveOwner returns the unique backend among backends for which found
+// reports true. notFound is returned when zero match; ambiguous wraps the
+// caller-supplied error and lists the matched backend types.
+func resolveOwner[T interface{ Type() string }](backends []T, ref string, found func(T) (bool, error), notFound, ambiguous error) (T, error) {
+	var matches []T
+	var zero T
+	for _, b := range backends {
+		ok, err := found(b)
 		if err != nil {
-			return nil, fmt.Errorf("inspect %s in %s: %w", ref, b.Type(), err)
+			return zero, fmt.Errorf("inspect %s in %s: %w", ref, b.Type(), err)
 		}
-		if img != nil {
+		if ok {
 			matches = append(matches, b)
 		}
 	}
 	switch len(matches) {
 	case 0:
-		return nil, fmt.Errorf("image %q: not found in any backend", ref)
+		return zero, notFound
 	case 1:
 		return matches[0], nil
 	default:
 		names := make([]string, len(matches))
 		for i, b := range matches {
 			names[i] = b.Type()
 		}
-		return nil, fmt.Errorf("image %s: %w (backends: %s)", ref, imagebackend.ErrAmbiguous, strings.Join(names, ", "))
+		return zero, fmt.Errorf("%w (backends: %s)", ambiguous, strings.Join(names, ", "))
 	}
 }

@@ -306,6 +320,7 @@ func VMConfigFromFlags(cmd *cobra.Command, image string) (*types.VMConfig, error
password, _ := cmd.Flags().GetString("password")
noDirectIO, _ := cmd.Flags().GetBool("no-direct-io")
windows, _ := cmd.Flags().GetBool("windows")
sharedMemory, _ := cmd.Flags().GetBool("shared-memory")
dataDiskRaw, _ := cmd.Flags().GetStringArray("data-disk")

if vmName == "" {
@@ -338,6 +353,7 @@ func VMConfigFromFlags(cmd *cobra.Command, image string) (*types.VMConfig, error
Network: network,
NoDirectIO: noDirectIO,
Windows: windows,
SharedMemory: sharedMemory,
},
User: user,
Password: password,
@@ -384,6 +400,7 @@ func CloneVMConfigFromFlags(cmd *cobra.Command, snapCfg *types.SnapshotConfig) (
Network: network,
NoDirectIO: noDirectIO,
Windows: snapCfg.Windows,
SharedMemory: snapCfg.SharedMemory,
},
OnDemand: onDemand,
}
@@ -500,30 +517,17 @@ func findHypervisorFactory(typ config.HypervisorType) func(*config.Config) (hype
 }

 func resolveVMOwner(ctx context.Context, hypers []hypervisor.Hypervisor, ref string) (hypervisor.Hypervisor, error) {
-	var matches []hypervisor.Hypervisor
-	for _, h := range hypers {
-		_, resolveErr := h.Inspect(ctx, ref)
-		if resolveErr == nil {
-			matches = append(matches, h)
-			continue
+	return resolveOwner(hypers, ref, func(h hypervisor.Hypervisor) (bool, error) {
+		if _, err := h.Inspect(ctx, ref); err == nil {
+			return true, nil
+		} else if !errors.Is(err, hypervisor.ErrNotFound) {
+			return false, err
 		}
-		if errors.Is(resolveErr, hypervisor.ErrNotFound) {
-			continue
-		}
-		return nil, fmt.Errorf("inspect %s in %s: %w", ref, h.Type(), resolveErr)
-	}
-	switch len(matches) {
-	case 0:
-		return nil, fmt.Errorf("vm %s: %w", ref, hypervisor.ErrNotFound)
-	case 1:
-		return matches[0], nil
-	default:
-		names := make([]string, len(matches))
-		for i, h := range matches {
-			names[i] = h.Type()
-		}
-		return nil, fmt.Errorf("vm %s: %w (backends: %s)", ref, hypervisor.ErrAmbiguous, strings.Join(names, ", "))
-	}
+		return false, nil
+	},
+		fmt.Errorf("vm %s: %w", ref, hypervisor.ErrNotFound),
+		fmt.Errorf("vm %s: %w", ref, hypervisor.ErrAmbiguous),
+	)
 }

 // sanitizeVMName derives a safe VM name from an image reference.
114 changes: 114 additions & 0 deletions cmd/vm/attach.go
@@ -0,0 +1,114 @@
package vm

import (
	"context"
	"errors"
	"fmt"

	"github.com/projecteru2/core/log"
	"github.com/spf13/cobra"

	cmdcore "github.com/cocoonstack/cocoon/cmd/core"
	"github.com/cocoonstack/cocoon/extend/fs"
	"github.com/cocoonstack/cocoon/extend/vfio"
	"github.com/cocoonstack/cocoon/hypervisor"
)

func (h Handler) FsAttach(cmd *cobra.Command, args []string) error {
	ctx, a, err := resolveAttacher[fs.Attacher](h, cmd, args, "fs attach", fs.ErrUnsupportedBackend)
	if err != nil {
		return err
	}
	socket, _ := cmd.Flags().GetString("socket")
	tag, _ := cmd.Flags().GetString("tag")
	numQ, _ := cmd.Flags().GetInt("num-queues")
	qSize, _ := cmd.Flags().GetInt("queue-size")
	id, err := a.FsAttach(ctx, args[0], fs.Spec{Socket: socket, Tag: tag, NumQueues: numQ, QueueSize: qSize})
	if err != nil {
		return classifyAttachErr(err)
	}
	if done, jsonErr := cmdcore.MaybeOutputJSON(cmd, map[string]string{"vm": args[0], "tag": tag, "id": id}); done {
		return jsonErr
	}
	log.WithFunc("cmd.vm.fs.attach").Infof(ctx, "attached fs tag=%s id=%s vm=%s", tag, id, args[0])
	return nil
}

func (h Handler) FsDetach(cmd *cobra.Command, args []string) error {
	ctx, a, err := resolveAttacher[fs.Attacher](h, cmd, args, "fs detach", fs.ErrUnsupportedBackend)
	if err != nil {
		return err
	}
	tag, _ := cmd.Flags().GetString("tag")
	if err := a.FsDetach(ctx, args[0], tag); err != nil {
		return classifyAttachErr(err)
	}
	if done, jsonErr := cmdcore.MaybeOutputJSON(cmd, map[string]string{"vm": args[0], "tag": tag}); done {
		return jsonErr
	}
	log.WithFunc("cmd.vm.fs.detach").Infof(ctx, "detached fs tag=%s vm=%s", tag, args[0])
	return nil
}

func (h Handler) DeviceAttach(cmd *cobra.Command, args []string) error {
	ctx, a, err := resolveAttacher[vfio.Attacher](h, cmd, args, "device attach", vfio.ErrUnsupportedBackend)
	if err != nil {
		return err
	}
	pci, _ := cmd.Flags().GetString("pci")
	id, _ := cmd.Flags().GetString("id")
	deviceID, err := a.DeviceAttach(ctx, args[0], vfio.Spec{PCI: pci, ID: id})
	if err != nil {
		return classifyAttachErr(err)
	}
	if done, jsonErr := cmdcore.MaybeOutputJSON(cmd, map[string]string{"vm": args[0], "pci": pci, "id": deviceID}); done {
		return jsonErr
	}
	log.WithFunc("cmd.vm.device.attach").Infof(ctx, "attached device pci=%s id=%s vm=%s", pci, deviceID, args[0])
	return nil
}

func (h Handler) DeviceDetach(cmd *cobra.Command, args []string) error {
	ctx, a, err := resolveAttacher[vfio.Attacher](h, cmd, args, "device detach", vfio.ErrUnsupportedBackend)
	if err != nil {
		return err
	}
	id, _ := cmd.Flags().GetString("id")
	if err := a.DeviceDetach(ctx, args[0], id); err != nil {
		return classifyAttachErr(err)
	}
	if done, jsonErr := cmdcore.MaybeOutputJSON(cmd, map[string]string{"vm": args[0], "id": id}); done {
		return jsonErr
	}
	log.WithFunc("cmd.vm.device.detach").Infof(ctx, "detached device id=%s vm=%s", id, args[0])
	return nil
}

// resolveAttacher resolves args[0] to a hypervisor and asserts it implements
// A (fs.Attacher / vfio.Attacher). The op string ("fs attach", "device detach")
// prefixes both error wraps so the four handlers no longer repeat the
// Init→FindHypervisor→type-assert boilerplate.
func resolveAttacher[A any](h Handler, cmd *cobra.Command, args []string, op string, errUnsupported error) (context.Context, A, error) {
	var zero A
	ctx, conf, err := h.Init(cmd)
	if err != nil {
		return ctx, zero, err
	}
	hyper, err := cmdcore.FindHypervisor(ctx, conf, args[0])
	if err != nil {
		return ctx, zero, fmt.Errorf("%s: %w", op, err)
	}
	a, ok := hyper.(A)
	if !ok {
		return ctx, zero, fmt.Errorf("%s: backend %s: %w", op, hyper.Type(), errUnsupported)
	}
	return ctx, a, nil
}

// classifyAttachErr surfaces ErrNotRunning more clearly than the generic wrap.
func classifyAttachErr(err error) error {
	if errors.Is(err, hypervisor.ErrNotRunning) {
		return fmt.Errorf("vm is not running: %w", err)
	}
	return err
}