# VM Lifecycle

Two things shape how you integrate with Slicer: how VMs are named, and how long they live. Both follow from a single primitive, the **host group**, defined in the daemon's config file.

## Host groups

A host group is a named pool of VMs that share the same hardware profile (vCPU, RAM, storage), network, and image. The daemon reads its host groups from its YAML config (typically produced by `slicer new`) at startup. You can define multiple groups in a single config. The common pattern is one group for a long-lived control plane plus one for ephemeral tenant or workload VMs:

```yaml
config:
  host_groups:
    # One persistent, always-on VM created at daemon start.
    - name: ctrl
      storage: image
      storage_size: 25G
      count: 1
      vcpu: 4
      ram_gb: 8
      userdata: |
        apt-get update -qy
        apt-get install -qy nginx postgresql
      network:
        bridge: brctrl0
        tap_prefix: ctrl
        gateway: 192.168.137.1/24

    # No pre-allocated VMs, everything is API-launched.
    # Isolated networking, firewalled subnet with egress controlled by allow/drop.
    - name: sbox
      storage: image
      storage_size: 25G
      count: 0
      vcpu: 2
      ram_gb: 4
      network:
        mode: "isolated"
        drop: []
        allow: ["0.0.0.0/0"]

  image: "ghcr.io/openfaasltd/slicer-systemd-min:6.1.90-x86_64-latest"
  hypervisor: firecracker
  api:
    port: 8080
    bind_address: "127.0.0.1"
    auth:
      enabled: true
```

Run `slicer new --help` for the full set of flags, and `slicer new NAME > slicer.yaml` to generate a starter config.

Host groups must be defined in the YAML file before the daemon starts; they cannot be added dynamically at runtime. If you need dynamic groups, consider the [Slicer per tenant](/platform/instance-per-tenant/) model, where each tenant gets its own daemon and its own host group config.

### What `count:` does

- **`count: N`**: Slicer creates and **protects** N VMs in that group at startup. They're persistent by construction: the daemon restores them after its own restart, and they come back automatically if the host reboots. Use this for the control plane, a shared database, or anything that must be present whenever the daemon is running.
- **`count: 0`**: no pre-allocated VMs. Callers create and delete VMs on demand through the API (`POST /hostgroup/NAME/nodes`). This is the right shape for sandboxes, per-job workers, and tenant workloads.

Both shapes can coexist in the same daemon. The split above, one persistent control-plane host group plus one on-demand sandbox host group, is how most multi-tenant deployments are structured.

### VM size at launch

The host group's `vcpu` / `ram_gb` are the **default and maximum** for VMs in that group. When you launch a VM through the API you can:

- Omit `cpus` / `ram_bytes` entirely - the VM gets the host group's defaults.
- Request **the same or less** than the defaults - honoured as-is.
- Request **more** than the defaults - rejected with `400`.

So if `sbox` is defined at 2 vCPU / 4 GiB, a client can legitimately launch a 1 vCPU / 1 GiB worker inside it but cannot launch an 8 vCPU / 16 GiB worker. To offer bigger VMs, define a separate host group with a bigger profile.
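
The sizing rule above can be sketched as a small validation function. This is an illustrative model, not Slicer's actual implementation; the `resolve_size` name and the dict shape are assumptions for the example:

```python
GiB = 1024 ** 3

def resolve_size(group, cpus=None, ram_bytes=None):
    """Resolve a launch request against a host group profile.

    Omitted fields fall back to the group's defaults; requests at or
    below the defaults are honoured; anything larger raises, which the
    API would surface as HTTP 400.
    """
    max_ram = group["ram_gb"] * GiB
    cpus = group["vcpu"] if cpus is None else cpus
    ram_bytes = max_ram if ram_bytes is None else ram_bytes
    if cpus > group["vcpu"] or ram_bytes > max_ram:
        raise ValueError("requested size exceeds host group profile (400)")
    return cpus, ram_bytes

sbox = {"vcpu": 2, "ram_gb": 4}
print(resolve_size(sbox))                             # defaults: (2, 4294967296)
print(resolve_size(sbox, cpus=1, ram_bytes=1 * GiB))  # smaller request, honoured
```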

## Naming: pets vs. cattle

You don't pick the hostname. When Slicer creates a VM in a host group, it assigns the name: `<hostgroup>-1`, `<hostgroup>-2`, `<hostgroup>-3`, and so on. Numbers increment per host group.

This is deliberate. Host groups are pools of interchangeable VMs, not a hand-managed machine register. No name collisions, no "that name is already taken" errors. Slicer tracks the real hostname internally; your application should track meaning through **tags**.
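
The scheme can be modelled as a per-group counter. A minimal sketch, not Slicer's code; it only shows the `<hostgroup>-N` pattern and that counters are independent per group:

```python
from collections import defaultdict

class NameAllocator:
    """Assigns <hostgroup>-N names, with N incrementing per host group."""

    def __init__(self):
        self._next = defaultdict(lambda: 1)

    def allocate(self, hostgroup):
        n = self._next[hostgroup]
        self._next[hostgroup] += 1
        return f"{hostgroup}-{n}"

alloc = NameAllocator()
print(alloc.allocate("sbox"))  # sbox-1
print(alloc.allocate("sbox"))  # sbox-2
print(alloc.allocate("ctrl"))  # ctrl-1  (independent counter per group)
```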

## Tags for stable identity

Tags are a free-form array of strings attached to each VM. Pass them at creation time:

```bash
curl -X POST http://127.0.0.1:8080/hostgroup/sbox/nodes \
  -H "Content-Type: application/json" \
  -d '{
    "tags": ["user=alice", "job=build-4821", "display=Alice dev environment"],
    "cpus": 2,
    "ram_bytes": 4294967296
  }'
```

Any string is valid. The convention that works well in practice is `key=value`: easy to filter, easy to render in a UI. Use it to carry whatever your application needs to reason about later, for example a sandbox expiry deadline: `expires_at=2026-04-14 08:28:00`.

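Since tags are opaque strings to Slicer, any `key=value` parsing happens in your application. A sketch of that application-side convention:

```python
from datetime import datetime

def parse_tags(tags):
    """Split key=value tags into a dict; tags without '=' are ignored."""
    out = {}
    for t in tags:
        key, sep, value = t.partition("=")
        if sep:
            out[key] = value
    return out

meta = parse_tags(["user=alice", "display=Alice dev environment",
                   "expires_at=2026-04-14 08:28:00"])
deadline = datetime.strptime(meta["expires_at"], "%Y-%m-%d %H:%M:%S")
print(meta["user"], deadline.year)  # alice 2026
```
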
### Looking up a VM by tag

The list endpoint filters on either exact match or prefix:

```bash
# exact
GET /nodes?tag=user=alice

# prefix (matches any tag starting with "user=")
GET /nodes?tag_prefix=user=
```
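
The two filter modes behave like this sketch, assuming (as the endpoints suggest) that `tag=` requires an exact tag match and `tag_prefix=` matches any tag by prefix; the node dict shape is illustrative:

```python
def filter_nodes(nodes, tag=None, tag_prefix=None):
    """Keep nodes that have the exact tag, or any tag with the prefix."""
    def keep(node):
        if tag is not None and tag not in node["tags"]:
            return False
        if tag_prefix is not None and not any(
            t.startswith(tag_prefix) for t in node["tags"]
        ):
            return False
        return True
    return [n for n in nodes if keep(n)]

nodes = [
    {"hostname": "sbox-1", "tags": ["user=alice", "job=build-4821"]},
    {"hostname": "sbox-2", "tags": ["user=bob"]},
]
print(filter_nodes(nodes, tag="user=alice"))    # only sbox-1
print(filter_nodes(nodes, tag_prefix="user="))  # both nodes
```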

Also available on a specific host group:

```bash
GET /hostgroup/sbox/nodes?tag_prefix=user=
```

### How to represent Slicer VMs in your product

Whether your product surfaces VMs to humans (a dashboard, a CLI, a support tool) or to other systems (a scheduler, a billing pipeline, an API), the shape is the same:

1. **Create** with tags carrying the display name, owner, and any internal IDs from your product, such as tenant, namespace, billing ID, or environment.
2. **List / look up** VMs. On a shared daemon (Slicer per host), scope results with `tag_prefix=owner=` or similar. On a per-tenant daemon the unfiltered list already belongs to one tenant, so a plain `GET /nodes` is enough.
3. **Render** the tag value wherever an end user sees a VM; keep the auto-assigned hostname as the internal handle your product uses to address it.
4. **Manage** (start, stop, delete) via the real hostname, carried alongside the friendly tag in whatever record your product already stores.

This keeps Slicer's naming model out of your product's domain language while still giving you precise control over each VM.
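
The record shape from steps 1-4 might look like the sketch below. The class and field names are hypothetical; the point is that the product stores its own display name and owner alongside the Slicer-assigned hostname, and addresses the VM by hostname for management calls:

```python
from dataclasses import dataclass

@dataclass
class SandboxRecord:
    hostname: str   # Slicer's auto-assigned handle, e.g. "sbox-3"
    display: str    # what the end user sees
    owner: str      # product-side owner, also carried as a tag

    def manage_url(self, base="http://127.0.0.1:8080"):
        # Management always addresses the real hostname,
        # never the friendly display name.
        return f"{base}/vm/{self.hostname}"

rec = SandboxRecord(hostname="sbox-3",
                    display="Alice dev environment",
                    owner="alice")
print(rec.manage_url())  # http://127.0.0.1:8080/vm/sbox-3
```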

## Lifecycle

### Ephemeral is the default

VMs launched through the API (`POST /hostgroup/NAME/nodes`) are **ephemeral** by default. They run until one of three things happens, and in every case the disk is removed and there is no automatic restart:

- **DELETE via the API**: the VM stops and the disk is removed.
- **Guest exits on its own** (`sudo reboot`, kernel panic, and similar): the daemon's reaper notices and cleans up the record and the disk.
- **Daemon restart**: ephemeral VM records are not carried across, so the VMs are gone.

This is the right shape for code execution, CI jobs, batch processing, and anything where the VM is disposable.

### Persistent API-launched VMs

For VMs that should survive daemon restarts, such as long-running dev environments, tenant workspaces, and user-facing sandboxes, set `persistent: true` at creation:

```bash
curl -X POST http://127.0.0.1:8080/hostgroup/sbox/nodes \
  -H "Content-Type: application/json" \
  -d '{
    "persistent": true,
    "tags": ["user=alice", "purpose=dev"],
    "cpus": 2,
    "ram_bytes": 4294967296
  }'
```

Or with the CLI:

```bash
slicer vm launch sbox --persistent --tag user=alice
```

Persistent VMs:

- Survive daemon restarts. The daemon re-attaches to them on startup.
- Are **not** deleted when the VM stops. Their disk is retained.
- Can be stopped deliberately without losing state, via `POST /vm/HOSTNAME/shutdown` or `sudo reboot` inside the guest.
- Stay around until you explicitly `DELETE` them through the API or CLI. Delete removes the disk; there is no undelete.

Bring a stopped persistent VM back up with:

```bash
slicer vm relaunch HOSTNAME
```

or the equivalent REST call:

```bash
POST /vm/HOSTNAME/relaunch
```

Relaunch is the intended recovery path whenever a persistent VM has been shut down cleanly, whether by the API, the guest, or a host reboot. Config-declared VMs from `count: N` are a special case: the daemon re-launches them automatically whenever it starts, so no manual `relaunch` is needed for those.

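The recovery rule can be condensed into a predicate your orchestration code might run after a restart. A sketch under stated assumptions: the record fields (`persistent`, `from_config`, `state`) are hypothetical names, not Slicer's API response shape:

```python
def needs_manual_relaunch(vm):
    """True only for API-launched persistent VMs that are stopped.

    Config-declared VMs (count: N) restart automatically with the daemon,
    and ephemeral records do not survive a restart at all.
    """
    return (vm["persistent"]
            and not vm["from_config"]
            and vm["state"] == "stopped")

vms = [
    {"hostname": "ctrl-1", "persistent": True, "from_config": True,  "state": "stopped"},
    {"hostname": "sbox-2", "persistent": True, "from_config": False, "state": "stopped"},
    {"hostname": "sbox-3", "persistent": True, "from_config": False, "state": "running"},
]
print([v["hostname"] for v in vms if needs_manual_relaunch(v)])  # ['sbox-2']
```
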
## Where to go next

Most integrations land in one of the shapes below. Pick the row that's closest to what you're building and follow the deployment link.

| Use case | Lifecycle | Deployment | Networking |
| --- | --- | --- | --- |
| Code execution / agent sandbox | Ephemeral | [Slicer per tenant](/platform/instance-per-tenant/) | [Isolated](/reference/networking/) + allowlist |
| CI/CD job runners | Ephemeral | [Slicer per host](/platform/single-instance/) | [Isolated](/reference/networking/) |
| Batch processing | Ephemeral | [Slicer per host](/platform/single-instance/) | Bridge |
| Dev environments | Persistent | [Slicer per tenant](/platform/instance-per-tenant/) | Bridge |
| Tenant workspaces | Persistent | [Slicer per tenant](/platform/instance-per-tenant/) | [Isolated](/reference/networking/) |
| Named resources in your product | Persistent | [Slicer per tenant](/platform/instance-per-tenant/) | Bridge |
| Control plane / shared services | Persistent | [Slicer per host](/platform/single-instance/) | Bridge |

Networking choices are rules of thumb. If your tenants execute untrusted code, default to [isolated mode](/reference/networking/) with an explicit egress allowlist rather than bridge.
| 191 | +## See also |
| 192 | + |
| 193 | +* [Slicer per host](/platform/single-instance/): one daemon, tenants share it, ownership tracked via tags. |
| 194 | +* [Slicer per tenant](/platform/instance-per-tenant/): one daemon per tenant, isolated networking, Unix sockets. |
| 195 | +* [Go SDK](/platform/go-sdk/) |
| 196 | +* [TypeScript SDK](/platform/typescript-sdk/) |
| 197 | +* [REST API reference](/reference/api/): exact request/response shapes. |