Architecture
vMetal manages bare metal servers through standard Kubernetes resources. Physical servers are represented as BareMetalHost objects on a control plane cluster. When a tenant cluster requests a node, vMetal selects an available server, provisions it via PXE boot, and attaches it as a Kubernetes node.
vMetal builds on Metal3 for bare metal lifecycle management and Ironic for out-of-band server provisioning. No hypervisor or virtualization layer is involved.
What vMetal adds
Metal3 and Ironic handle BMC registration, hardware inspection, PXE boot, OS installation, and server cleaning. vMetal builds on that foundation to expose server capacity to tenant clusters.
- NodeProvider deploys and manages the Metal3/Ironic/DHCP stack on a control plane cluster. It groups servers into named pools called node types.
- NodeTypes select BareMetalHost resources by label. They define provisioning defaults such as OS image, SSH keys, and network configuration.
- Machines are requests for one server of a given node type. When the platform creates a Machine, it selects an available BareMetalHost and merges properties from the provider down to the machine level. It then allocates an IP, generates user-data and network-data Secrets, and triggers provisioning.
- IPAM tracks IP allocations per Machine. When a Machine is deleted, the platform releases the IP automatically.
- vCluster join automation runs inside the generated cloud-init configuration. When a Machine targets a tenant cluster as a private node, the cloud-init joins the server to that cluster automatically.
- UI workflows cover server registration, provisioning, deprovisioning, power on/off, and reboot through the Platform UI.
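Label-based selection, as described for NodeTypes and Machines above, can be illustrated with a small sketch. It assumes the selector is a plain set of key/value pairs that must all match (the source does not specify the selector syntax), and the host records, label keys, and function names are hypothetical, not part of the vMetal API.

```python
from typing import Optional


def matches(selector: dict, labels: dict) -> bool:
    """Return True when every selector key/value pair appears in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_available_host(selector: dict, hosts: list) -> Optional[dict]:
    """Return the first host that is available and matches the selector."""
    for host in hosts:
        if host["state"] == "available" and matches(selector, host["labels"]):
            return host
    return None


# Hypothetical inventory: names, states, and labels are illustrative only.
hosts = [
    {"name": "server-01", "state": "provisioned", "labels": {"pool": "gpu"}},
    {"name": "server-02", "state": "available", "labels": {"pool": "cpu"}},
    {"name": "server-03", "state": "available", "labels": {"pool": "gpu"}},
]

chosen = select_available_host({"pool": "gpu"}, hosts)
print(chosen["name"])  # server-03
```

In this sketch, server-01 is skipped because it is already provisioned, and server-02 is skipped because its label does not match.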
Provisioning flow
When a Machine is created, vMetal provisions the server through these steps:
- The platform matches the Machine's node type selector against available BareMetalHost resources and claims one. Bare metal Machines map to existing physical servers. The platform does not create new instances.
- The platform merges provisioning properties from the NodeProvider, node type, environment, and Machine. It allocates an IP from the configured range and generates user-data and network-data Secrets on the control plane cluster.
- The platform sets the BareMetalHost's image, userData, and networkData references to trigger provisioning.
- Ironic powers on the server via BMC (Redfish or IPMI) and initiates a PXE boot. An in-memory installer writes the OS to disk. Ironic then sets the boot device to disk and reboots the server.
- The server boots into the provisioned OS and runs the cloud-init scripts.
- When a Machine targets a tenant cluster as a private node, the cloud-init joins the server to that cluster as a worker node.
When the Machine is deleted, vMetal deprovisions the BareMetalHost, releases the allocated IP, and returns the server to the available pool.
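The property merge and IP bookkeeping in this flow can be sketched as follows. This is a minimal illustration under stated assumptions: later layers are assumed to override earlier ones, the pool is assumed to hand out the lowest free address, and every name here (`merge_properties`, `IPPool`, the property keys, the address range) is hypothetical rather than taken from vMetal.

```python
import ipaddress


def merge_properties(*layers: dict) -> dict:
    """Merge property layers in order; later layers win (assumed precedence)."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


class IPPool:
    """Tracks one IP per Machine within an inclusive address range."""

    def __init__(self, first: str, last: str):
        start = int(ipaddress.IPv4Address(first))
        end = int(ipaddress.IPv4Address(last))
        self._addresses = [ipaddress.IPv4Address(n) for n in range(start, end + 1)]
        self._by_machine = {}

    def allocate(self, machine: str) -> ipaddress.IPv4Address:
        """Return the Machine's IP, assigning the lowest free one if needed."""
        if machine in self._by_machine:
            return self._by_machine[machine]
        used = set(self._by_machine.values())
        for address in self._addresses:
            if address not in used:
                self._by_machine[machine] = address
                return address
        raise RuntimeError("IP range exhausted")

    def release(self, machine: str) -> None:
        """Free the Machine's IP; a no-op if it holds none."""
        self._by_machine.pop(machine, None)


# Layers in the order named in the flow: NodeProvider, node type,
# environment, Machine. Keys and values are illustrative only.
props = merge_properties(
    {"image": "ubuntu-22.04", "sshUser": "admin"},
    {"image": "ubuntu-24.04"},
    {},
    {"hostname": "worker-1"},
)

pool = IPPool("10.0.0.10", "10.0.0.12")
ip_a = pool.allocate("machine-a")
ip_b = pool.allocate("machine-b")
pool.release("machine-a")          # Machine deleted: its IP returns to the pool
ip_c = pool.allocate("machine-c")  # reuses the address machine-a held
```

Keying allocations by Machine name makes `allocate` idempotent, so a retried provisioning step would not consume a second address.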
Server lifecycle
BareMetalHost resources go through the following states:
| State | Description |
|---|---|
| registering | The server is being registered in the Ironic database and BMC credentials are verified. |
| inspecting | Hardware inventory is actively being collected: CPU, RAM, NICs, disks, firmware, and PCIe devices (such as GPUs). |
| available | Server is ready to be provisioned. |
| provisioning | OS image is being written and cloud-init is being configured. |
| provisioned | Server is running with the configured OS. |
| deprovisioning | Server is being cleaned and returned to available state. |
| error | An error occurred. Check the BareMetalHost status for details. |
When a Machine claims a server, it moves from available through provisioning to provisioned. When the claim is removed, the server is deprovisioned and returned to available.
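The lifecycle can be read as a small state machine. The sketch below encodes the transitions named in the text, and it assumes that registering leads to inspecting and then to available (following the table order) and that any state may move to error. Recovery from error is not modeled. `ALLOWED` and `advance` are hypothetical names used only for illustration.

```python
# Allowed transitions between BareMetalHost states (illustrative model).
ALLOWED = {
    "registering": {"inspecting", "error"},
    "inspecting": {"available", "error"},
    "available": {"provisioning", "error"},
    "provisioning": {"provisioned", "error"},
    "provisioned": {"deprovisioning", "error"},
    "deprovisioning": {"available", "error"},
    "error": set(),  # recovery paths are intentionally left out of this sketch
}


def advance(current: str, target: str) -> str:
    """Return target if the transition is allowed, otherwise raise ValueError."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target


# A Machine claims a server and later releases it.
state = "available"
for next_state in ("provisioning", "provisioned", "deprovisioning", "available"):
    state = advance(state, next_state)
print(state)  # available
```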
Components
vMetal consists of the following components, all deployed on the control plane cluster:
Metal3 Bare Metal Operator
The Bare Metal Operator manages BareMetalHost custom resources. It drives server registration, hardware inspection, and state transitions by communicating with Ironic to execute provisioning operations.
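For orientation, the sketch below builds a BareMetalHost object as a Python dictionary and then sets the image, userData, and networkData references that trigger provisioning. The field names follow the Metal3 `metal3.io/v1alpha1` API as commonly documented. All values (names, addresses, URLs, Secret names) are placeholders, and the helper functions are hypothetical. It is not a description of how vMetal itself writes these objects.

```python
import copy
import json


def bare_metal_host(name, namespace, bmc_address, credentials_secret, boot_mac, labels):
    """Build a registered but not yet provisioned BareMetalHost object."""
    return {
        "apiVersion": "metal3.io/v1alpha1",
        "kind": "BareMetalHost",
        "metadata": {"name": name, "namespace": namespace, "labels": labels},
        "spec": {
            "online": True,
            "bootMACAddress": boot_mac,
            "bmc": {"address": bmc_address, "credentialsName": credentials_secret},
        },
    }


def with_provisioning(host, image_url, checksum, user_data, network_data):
    """Return a copy of host with image and cloud-init Secret references set."""
    updated = copy.deepcopy(host)
    namespace = updated["metadata"]["namespace"]
    updated["spec"]["image"] = {"url": image_url, "checksum": checksum}
    updated["spec"]["userData"] = {"name": user_data, "namespace": namespace}
    updated["spec"]["networkData"] = {"name": network_data, "namespace": namespace}
    return updated


# Placeholder values throughout; none come from a real environment.
host = bare_metal_host(
    name="server-03",
    namespace="vmetal-example",
    bmc_address="redfish://192.0.2.10/redfish/v1/Systems/1",
    credentials_secret="server-03-bmc",
    boot_mac="00:00:5e:00:53:01",
    labels={"pool": "gpu"},
)
provisioned = with_provisioning(
    host,
    image_url="https://images.example.com/os.qcow2",
    checksum="https://images.example.com/os.qcow2.sha256sum",
    user_data="machine-a-user-data",
    network_data="machine-a-network-data",
)
print(json.dumps(provisioned["spec"], indent=2))
```

Returning a modified copy keeps the registered host definition unchanged, which makes the difference between the two states easy to inspect.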
Ironic
Ironic handles the low-level provisioning: BMC communication (power on/off, boot device selection), PXE boot orchestration, and OS image installation. It supports multiple BMC protocols including Redfish and IPMI, with broad hardware vendor compatibility out of the box.
DHCP server
A proxy DHCP server handles PXE boot by forwarding requests between bare metal servers and Ironic. When the bare metal servers and Ironic are on different networks, the DHCP server bridges the communication. It is automatically configured based on BareMetalHost resources.
Multus CNI
Multus is a CNI plugin that enables attaching the DHCP server to a separate provisioning network. It allows the DHCP server pod to have a network interface on the bare metal provisioning network in addition to the cluster network.
Stack integration
vMetal operates as part of the vCluster Platform stack:
- vMetal provisions and manages physical servers as Kubernetes nodes.
- vCluster Platform orchestrates the control plane, node providers, and tenant management.
- vCluster provides tenant clusters that isolate tenants on shared infrastructure.
- vNode adds secure runtime isolation, allowing privileged workloads (Docker-in-Docker, hostPID) to run safely on shared hardware.