vMetal

vMetal adds bare metal provisioning to the vCluster Platform. It uses Metal3 and Ironic to manage physical servers through their baseboard management controllers (BMCs), then provisions them as nodes for tenant clusters via PXE boot and cloud-init.

Info: vMetal does not require a hypervisor. Physical servers are enrolled as BareMetalHost resources. Metal3 inspects each server and provisions it with an OS image through Ironic.
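
As a rough illustration of what enrolling a server could look like, the sketch below builds a minimal Metal3 BareMetalHost manifest in Python and prints it as JSON, which kubectl accepts alongside YAML. The host name, namespace, MAC address, BMC address, and secret name are placeholder assumptions, and vMetal may expect additional labels or fields that are not shown here.

```python
import json


def bare_metal_host(name, namespace, boot_mac, bmc_address, credentials_secret):
    """Build a minimal Metal3 BareMetalHost manifest as a plain dict."""
    return {
        "apiVersion": "metal3.io/v1alpha1",
        "kind": "BareMetalHost",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Ask Metal3 to power the server on once it is registered.
            "online": True,
            # MAC address of the NIC the server uses for PXE boot.
            "bootMACAddress": boot_mac,
            "bmc": {
                # BMC endpoint (Redfish here; IPMI uses an ipmi:// address).
                "address": bmc_address,
                # Name of a Secret holding the BMC username and password.
                "credentialsName": credentials_secret,
            },
        },
    }


# All values below are placeholders chosen for illustration only.
manifest = bare_metal_host(
    name="gpu-server-01",
    namespace="default",
    boot_mac="00:11:22:33:44:55",
    bmc_address="redfish://192.0.2.10/redfish/v1/Systems/1",
    credentials_secret="gpu-server-01-bmc",
)

print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and applying it with kubectl would register the host, assuming the referenced Secret already exists in the same namespace.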

Figure: The Complete AI Infrastructure Stack. A unified stack from physical metal to production AI platform, with no hypervisor and direct hardware access for all workloads. Tiers shown: User Experience, Platform, Infrastructure. Layers shown:

  • Physical Hardware: DGX Nodes / HGX Systems / GPU Servers / NVLink / InfiniBand
  • vMetal (Bare Metal Provisioning & Lifecycle Management): PXE Boot | Metal3 | Ironic | BMC (Redfish/IPMI) | Hardware Discovery
  • vCluster (Tenant & Cluster Orchestration): Tenant Clusters | Tenant Isolation | Self-Service | Isolation
  • vNode (Runtime Isolation for Shared Hardware): Security Boundaries | Privileged Workloads | Node-Level Isolation
  • Certified Stacks (Pre-Validated AI Platform Blueprints): Run:ai | Slinky | SkyPilot | Ray | Terraform Blueprints

What vMetal provides

  • Pooled bare metal capacity — NodeTypes group servers by hardware profile (GPU model, CPU count, rack location). The platform claims and returns servers automatically as tenant cluster demand changes.
  • GPU and accelerator workloads — Direct hardware access with no virtualization. Workloads see GPUs, FPGAs, and other accelerators natively.
  • Automated provisioning pipeline — Register a server as a BareMetalHost. Metal3 handles BMC verification, hardware inspection, PXE boot, OS installation, and cloud-init automatically.
  • Automatic node registration — Provisioned servers join tenant clusters on their own. No manual kubeadm join or agent installation required.
  • Server lifecycle management — When a server is no longer needed, vMetal cleans it, releases its IP, and returns it to the available pool for reuse.
  • Tenant isolation — Combine with vCluster for isolation between tenants and with vNode for runtime isolation on shared hardware.
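
The pooling and lifecycle behavior described in the list above can be pictured with a small in-memory model. This is only a conceptual sketch, not vMetal's implementation: the `Server` and `Pool` classes, the node type labels, and the cluster names are all hypothetical, and real cleanup involves wiping the server rather than resetting two fields.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Server:
    name: str
    node_type: str                    # hypothetical hardware-profile label
    ip: Optional[str] = None          # assigned while the server is claimed
    claimed_by: Optional[str] = None  # tenant cluster currently using it


@dataclass
class Pool:
    servers: List[Server] = field(default_factory=list)

    def available(self, node_type: str) -> List[Server]:
        """Return unclaimed servers that match the requested node type."""
        return [
            s for s in self.servers
            if s.node_type == node_type and s.claimed_by is None
        ]

    def claim(self, node_type: str, cluster: str, ip: str) -> Server:
        """Hand the first free server of this type to a tenant cluster."""
        free = self.available(node_type)
        if not free:
            raise LookupError(f"no free servers of type {node_type!r}")
        server = free[0]
        server.claimed_by = cluster
        server.ip = ip
        return server

    def release(self, server: Server) -> None:
        """Stand-in for cleaning: drop the IP and return the server to the pool."""
        server.ip = None
        server.claimed_by = None


pool = Pool([
    Server("node-a", "h100-8gpu"),
    Server("node-b", "h100-8gpu"),
    Server("node-c", "cpu-only"),
])

node = pool.claim("h100-8gpu", cluster="tenant-1", ip="192.0.2.21")
print(len(pool.available("h100-8gpu")))  # 1

pool.release(node)
print(len(pool.available("h100-8gpu")))  # 2
```

Grouping by a single `node_type` string keeps the sketch short. In practice a hardware profile would combine several attributes such as GPU model, CPU count, and rack location.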

Where to start

If you want to... | Go to
Set up your first bare metal server | Install
Understand the provisioning architecture | Architecture
Understand GPU passthrough, vGPU, and MIG boundaries | GPU presentation modes
Provision GPU servers for AI workloads | GPU Quickstart
Configure a NodeProvider and node types | Configuration