Architecture

vMetal manages bare metal servers through standard Kubernetes resources. Physical servers are represented as BareMetalHost objects on a control plane cluster. When a tenant cluster requests a node, vMetal selects an available server, provisions it via PXE boot, and attaches it as a Kubernetes node.

vMetal builds on Metal3 for bare metal lifecycle management and Ironic for out-of-band server provisioning. No hypervisor or virtualization layer is involved.

What vMetal adds

Metal3 and Ironic handle BMC registration, hardware inspection, PXE boot, OS installation, and server cleaning. vMetal builds on that foundation to expose server capacity to tenant clusters.

  • NodeProvider deploys and manages the Metal3/Ironic/DHCP stack on a control plane cluster. It groups servers into named pools called node types.
  • NodeTypes select BareMetalHost resources by label. They define provisioning defaults such as OS image, SSH keys, and network configuration.
  • Machines are requests for one server of a given node type. When the platform creates a Machine, it selects an available BareMetalHost and merges properties from the provider down to the machine level. It then allocates an IP, generates user-data and network-data Secrets, and triggers provisioning.
  • IPAM tracks IP allocations per Machine. When a Machine is deleted, the platform releases the IP automatically.
  • vCluster join automation runs inside the generated cloud-init configuration. When a Machine targets a tenant cluster as a private node, the cloud-init scripts join the server to that cluster automatically.
  • UI workflows cover server registration, provisioning, deprovisioning, power on/off, and reboot through the Platform UI.
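As a rough illustration of how a node type could select a server by label, the sketch below filters a list of hosts with an equality-based selector and picks the first one that is available. The `Host` class, the `pool` label, and the host names are hypothetical stand-ins, not vMetal or Metal3 types, and the real selection logic may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Host:
    """Simplified, hypothetical stand-in for a BareMetalHost."""
    name: str
    labels: dict[str, str] = field(default_factory=dict)
    state: str = "available"


def matches(selector: dict[str, str], labels: dict[str, str]) -> bool:
    """Equality-based match: every selector key must carry the same value."""
    return all(labels.get(key) == value for key, value in selector.items())


def claim_host(selector: dict[str, str], hosts: list[Host]) -> Optional[Host]:
    """Return the first available host that satisfies the selector, or None."""
    for host in hosts:
        if host.state == "available" and matches(selector, host.labels):
            return host
    return None


# Hypothetical inventory: one GPU server is already in use.
hosts = [
    Host("rack1-gpu-01", {"pool": "gpu"}, state="provisioned"),
    Host("rack1-gpu-02", {"pool": "gpu"}),
    Host("rack2-cpu-01", {"pool": "cpu"}),
]
chosen = claim_host({"pool": "gpu"}, hosts)
```

In this sketch a request with no matching available host returns `None`. How the platform reports that case is not covered here.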

Provisioning flow

When a Machine is created, vMetal provisions the server through these steps:

  1. The platform matches the Machine's node type selector against available BareMetalHost resources and claims one. Bare metal Machines map to existing physical servers. The platform does not create new instances.
  2. The platform merges provisioning properties from the NodeProvider, node type, environment, and Machine. It allocates an IP from the configured range and generates user-data and network-data Secrets on the control plane cluster.
  3. The platform sets the BareMetalHost's image, userData, and networkData references to trigger provisioning.
  4. Ironic powers on the server via BMC (Redfish or IPMI) and initiates a PXE boot. An in-memory installer writes the OS to disk. Ironic then sets the boot device to disk and reboots the server.
  5. The server boots into the provisioned OS and runs the cloud-init scripts.
  6. When a Machine targets a tenant cluster as a private node, the cloud-init scripts join the server to that cluster as a worker node.
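Steps 2 and 3 can be sketched as a layered merge followed by a patch to the BareMetalHost. The example assumes that more specific levels override more general ones, which the text implies but does not state outright. The property keys, Secret names, and namespace are hypothetical. The `image`, `userData`, and `networkData` fields follow the Metal3 BareMetalHost spec, though a real host would likely need further fields such as an image checksum.

```python
def merge_properties(*layers: dict) -> dict:
    """Merge layers left to right; later (more specific) layers win."""
    merged: dict = {}
    for layer in layers:
        merged.update(layer)
    return merged


def provisioning_patch(image_url: str, user_data_secret: str,
                       network_data_secret: str, namespace: str) -> dict:
    """Build a spec patch that points the host at the image and Secrets."""
    return {
        "spec": {
            "image": {"url": image_url},
            "userData": {"name": user_data_secret, "namespace": namespace},
            "networkData": {"name": network_data_secret, "namespace": namespace},
        }
    }


# Hypothetical property layers, ordered from general to specific.
provider = {"image": "https://images.example/ubuntu.img", "sshUser": "admin"}
node_type = {"image": "https://images.example/ubuntu-gpu.img"}
environment = {"dns": "10.0.0.2"}
machine = {"sshUser": "ops"}

props = merge_properties(provider, node_type, environment, machine)
patch = provisioning_patch(props["image"], "m1-user-data",
                           "m1-network-data", "metal")
```

A flat `dict.update` is the simplest possible merge. Nested properties would need a recursive merge instead.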

When the Machine is deleted, vMetal deprovisions the BareMetalHost, releases the allocated IP, and returns the server to the available pool.
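The allocate-on-create and release-on-delete behavior can be pictured with a small in-memory allocator. This is only a sketch: the `IPPool` class, the address range, and the Machine names are hypothetical, and the platform's actual IPAM implementation is not described in this document.

```python
import ipaddress


class IPPool:
    """In-memory IP allocator keyed by Machine name (illustrative only)."""

    def __init__(self, first: str, last: str) -> None:
        self._first = int(ipaddress.IPv4Address(first))
        self._last = int(ipaddress.IPv4Address(last))
        self._by_machine: dict = {}

    def allocate(self, machine: str) -> ipaddress.IPv4Address:
        """Return the Machine's IP, assigning the lowest free one if needed."""
        if machine in self._by_machine:
            return self._by_machine[machine]
        in_use = set(self._by_machine.values())
        for value in range(self._first, self._last + 1):
            candidate = ipaddress.IPv4Address(value)
            if candidate not in in_use:
                self._by_machine[machine] = candidate
                return candidate
        raise RuntimeError("IP range exhausted")

    def release(self, machine: str) -> None:
        """Free the Machine's IP; releasing an unknown Machine is a no-op."""
        self._by_machine.pop(machine, None)


pool = IPPool("10.0.0.10", "10.0.0.11")
ip_a = pool.allocate("machine-a")
ip_b = pool.allocate("machine-b")
pool.release("machine-a")          # Machine deleted: its IP becomes free
ip_c = pool.allocate("machine-c")  # reuses the released address
```

Keying allocations by Machine name makes `allocate` idempotent, so a retried reconcile would not hand out a second address.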

Server lifecycle

BareMetalHost resources go through the following states:

State           Description
registering     The server is being registered in the Ironic database and BMC credentials are verified.
inspecting      Hardware inventory is actively being collected: CPU, RAM, NICs, disks, firmware, and PCIe devices (such as GPUs).
available       Server is ready to be provisioned.
provisioning    OS image is being written and cloud-init is being configured.
provisioned     Server is running with the configured OS.
deprovisioning  Server is being cleaned and returned to available state.
error           An error occurred. Check the BareMetalHost status for details.

When a Machine claims a server, it moves from available through provisioning to provisioned. When the claim is removed, the server is deprovisioned and returned to available.
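The claim and release cycle described above can be modeled as a small transition table. The sketch below is an assumption-laden simplification: the `registering` to `inspecting` to `available` order is inferred from the state list, and treating `error` as reachable from any state is a guess, not documented behavior.

```python
# Allowed forward transitions between BareMetalHost states (simplified).
TRANSITIONS = {
    "registering": {"inspecting"},      # assumed from the order of the state list
    "inspecting": {"available"},        # assumed from the order of the state list
    "available": {"provisioning"},
    "provisioning": {"provisioned"},
    "provisioned": {"deprovisioning"},
    "deprovisioning": {"available"},
}


def can_transition(current: str, target: str) -> bool:
    """Check whether moving from current to target is allowed in this model."""
    if target == "error":
        return True  # assumption: any state may fail
    return target in TRANSITIONS.get(current, set())


# One full claim/release cycle for a server.
cycle = ["available", "provisioning", "provisioned", "deprovisioning", "available"]
cycle_is_valid = all(can_transition(a, b) for a, b in zip(cycle, cycle[1:]))
```

The model deliberately leaves out recovery from `error`, since the document only says to check the BareMetalHost status for details.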

Components

vMetal consists of the following components, all deployed on the control plane cluster:

Metal3 Bare Metal Operator

The Bare Metal Operator manages BareMetalHost custom resources. It drives server registration, hardware inspection, and state transitions by communicating with Ironic to execute provisioning operations.

Ironic

Ironic handles the low-level provisioning: BMC communication (power on/off, boot device selection), PXE boot orchestration, and OS image installation. It supports multiple BMC protocols, including Redfish and IPMI, and works with hardware from a broad range of vendors.

DHCP server

A proxy DHCP server handles PXE boot by forwarding requests between bare metal servers and Ironic. When the bare metal servers and Ironic are on different networks, the DHCP server bridges the communication. It is automatically configured based on BareMetalHost resources.

Multus CNI

Multus is a CNI plugin that enables attaching the DHCP server to a separate provisioning network. It allows the DHCP server pod to have a network interface on the bare metal provisioning network in addition to the cluster network.

Stack integration

vMetal operates as part of the vCluster Platform stack:

  • vMetal provisions and manages physical servers as Kubernetes nodes.
  • vCluster Platform orchestrates the control plane, node providers, and tenant management.
  • vCluster provides tenant clusters that isolate tenants from one another on shared infrastructure.
  • vNode adds secure runtime isolation, allowing privileged workloads (Docker-in-Docker, hostPID) to run safely on shared hardware.