Networking Guide
A vMetal deployment uses three networks, each carrying distinct traffic. For a conceptual overview of the provisioning network and PXE boot flow, see Networking.
Network zones
BMC network — out-of-band management traffic. Ironic connects to BMC endpoints over Redfish or IPMI to power servers on and off, set boot order, and monitor hardware status. This network often uses a dedicated management VLAN or subnet.
Provisioning network — PXE boot traffic. Bare metal servers boot the Ironic Python Agent installer from this network. The installer runs in memory and fetches the OS image, which may come from Ironic or an external HTTP source. The DHCP proxy server listens here for PXE requests.
Tenant network — the network provisioned servers use after PXE boot. For servers in a tenant cluster, Kubernetes node and workload traffic runs here.
Network separation
The BMC network is a dedicated out-of-band management network, separate from both the provisioning and tenant networks. It may be routable from the provisioning network or completely isolated, depending on security requirements. Sharing the BMC network with other segments is technically possible but not recommended.
The provisioning network and tenant network can share a segment. When they do, apply appropriate security measures such as DHCP snooping to guard against unauthorized DHCP responses during PXE boot.
Connectivity requirements
| Traffic | Source | Destination | Protocol |
|---|---|---|---|
| BMC power / boot control | Ironic | BMC endpoint | Redfish: TCP 443 or 80 |
| BMC power / boot control | Ironic | BMC endpoint | IPMI: UDP 623 |
| PXE DHCP (broadcast) | Bare metal server | DHCP proxy | UDP 67 |
| TFTP boot | Bare metal server | DHCP proxy | UDP 69 |
| HTTP (IPA image, iPXE, API proxy) | Bare metal server | DHCP proxy | TCP 8080 |
TFTP is not required for servers that use HTTP Boot instead of TFTP-based iPXE.
DHCP proxy
Deploy the DHCP server component in all configurations, including flat networks where servers and Ironic share a segment.
The component handles two separate functions. The DHCP server side responds to PXE DHCP requests from bare metal servers. The HTTP proxy side bridges API traffic between the Ironic Python Agent running on the server and Ironic, in both directions.
When the NodeProvider manages Metal3, the component picks up Ironic's service endpoints automatically. If you manage Metal3 independently, provide the Ironic endpoint URLs in the DHCP Helm values.
Multus CNI attaches a second interface to the DHCP server pod. This interface connects to the provisioning network and receives PXE DHCP requests. Configure the interface type using the networkAttachmentDefinition.config Helm value.
Bridge
Use a Linux bridge when the control plane cluster nodes have a bridge interface on the provisioning network. The bridge must exist on the node before the DHCP pod starts.
deploy:
  dhcp:
    enabled: true
    helmValues: |
      networkAttachmentDefinition:
        vip: 192.168.100.2/24
        config: |
          {
            "cniVersion": "0.3.1",
            "type": "bridge",
            "bridge": "br0",
            "isDefaultGateway": false
          }
Macvlan
Use macvlan when the control plane cluster nodes connect to the provisioning network on an existing physical interface. Multus creates a virtual sub-interface on that physical interface without requiring a pre-configured bridge.
deploy:
  dhcp:
    enabled: true
    helmValues: |
      networkAttachmentDefinition:
        vip: 10.0.0.2/24
        config: |
          {
            "cniVersion": "0.3.1",
            "type": "macvlan",
            "master": "eth0",
            "mode": "bridge"
          }
Host network
As an alternative to Multus CNI, set hostNetwork: true in the DHCP Helm values. The DHCP server pod then shares the host node's network namespace and can reach the provisioning network directly without a NetworkAttachmentDefinition. This removes the Multus CNI requirement, but the pod's listeners bind directly on the node's interfaces and can conflict with other services on the node that use the same ports.
VIP
networkAttachmentDefinition.vip is the IP address the DHCP proxy listens on for PXE requests. Write it in CIDR notation. Multus uses the prefix to configure the interface.
vip: 192.168.100.2/24
Choose a static IP within the provisioning subnet that no other device uses. The platform IPAM does not manage this IP. Configure it independently from the network-cidr or network-ip-range allocation range.
IPAM
The platform assigns one static IP to each provisioned server. Configure the address pool on the node type using one of two properties.
CIDR allocation
metal3.vcluster.com/network-cidr allocates from a subnet. The format is <gateway>/<prefix>.
192.168.100.1/24
The platform derives the network address from the prefix length and skips the network address, gateway, and broadcast address. It allocates IPs sequentially from the remaining pool.
nodeTypes:
  - name: compute-node
    properties:
      metal3.vcluster.com/network-cidr: "192.168.100.1/24"
      metal3.vcluster.com/dns-servers: "8.8.8.8,8.8.4.4"
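As a rough illustration of the pool that CIDR allocation draws from, the following Python sketch derives it with the standard `ipaddress` module. The function name `cidr_pool` is hypothetical and this is not the platform's implementation; it only mirrors the rules stated above (skip the network, gateway, and broadcast addresses) and assumes that "sequentially" means ascending order.

```python
import ipaddress


def cidr_pool(network_cidr: str) -> list[str]:
    """Return the allocatable IPs for a <gateway>/<prefix> value, lowest first.

    Hypothetical helper for illustration only.
    """
    iface = ipaddress.ip_interface(network_cidr)
    gateway = iface.ip          # the address written before the slash
    network = iface.network     # derived from the prefix length
    # hosts() already leaves out the network and broadcast addresses,
    # so only the gateway has to be filtered out here.
    return [str(ip) for ip in network.hosts() if ip != gateway]


pool = cidr_pool("192.168.100.1/24")
print(pool[0], pool[-1], len(pool))  # 192.168.100.2 192.168.100.254 253
```

A /24 therefore yields 253 allocatable addresses under these rules, which is a useful upper bound when sizing a node type.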
IP range allocation
metal3.vcluster.com/network-ip-range allocates from one or more explicit IP ranges. Use this to restrict allocation to a subset of a subnet or to exclude specific addresses.
The format uses comma-separated ranges in <start>-<end> form.
10.0.0.20-10.0.0.30,10.0.0.40-10.0.0.50
Range mode allocates every address in the listed ranges as written. Unlike CIDR allocation, the platform does not skip network or broadcast addresses.
nodeTypes:
  - name: compute-node
    properties:
      metal3.vcluster.com/network-ip-range: "10.0.0.20-10.0.0.30,10.0.0.40-10.0.0.50"
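To check how many addresses a range expression provides, a small Python sketch can expand it. The helper `range_pool` is hypothetical, and it assumes both the start and end of each range are allocatable; it is not the platform's parser.

```python
import ipaddress


def range_pool(ip_ranges: str) -> list[str]:
    """Expand "<start>-<end>,<start>-<end>" into a flat list of IPs.

    Hypothetical helper; assumes both endpoints of each range are included.
    """
    pool: list[str] = []
    for part in ip_ranges.split(","):
        start_text, end_text = part.strip().split("-")
        start = ipaddress.ip_address(start_text)
        end = ipaddress.ip_address(end_text)
        if start > end:
            raise ValueError(f"range start is after range end: {part.strip()}")
        # No address is skipped: network and broadcast addresses would be
        # included if the range covered them.
        for value in range(int(start), int(end) + 1):
            pool.append(str(ipaddress.ip_address(value)))
    return pool


pool = range_pool("10.0.0.20-10.0.0.30,10.0.0.40-10.0.0.50")
print(len(pool), pool[0], pool[-1])  # 22 10.0.0.20 10.0.0.50
```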
IP tracking and release
The platform stores each allocation in a Kubernetes ConfigMap on the control plane cluster. When you delete a Machine, the platform releases its IP immediately. The next allocation can reuse it.
Do not configure both network-cidr and network-ip-range on the same node type. Set only one.
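The release-and-reuse behavior described in this section can be pictured with a small in-memory model. The class `IPTracker` is hypothetical: the platform keeps its record in a ConfigMap, and the choice of the lowest free address on reuse is an assumption made for the sketch.

```python
class IPTracker:
    """In-memory stand-in for the allocation record (illustration only)."""

    def __init__(self, pool):
        self.pool = list(pool)
        self.by_machine = {}  # machine name -> allocated IP

    def allocate(self, machine):
        if machine in self.by_machine:
            return self.by_machine[machine]
        in_use = set(self.by_machine.values())
        for ip in self.pool:
            if ip not in in_use:
                self.by_machine[machine] = ip
                return ip
        raise RuntimeError("no free IP left in the pool")

    def release(self, machine):
        # Models Machine deletion: the IP is free for the next allocation.
        self.by_machine.pop(machine, None)


tracker = IPTracker(["10.0.0.20", "10.0.0.21"])
first = tracker.allocate("machine-a")
second = tracker.allocate("machine-b")
tracker.release("machine-a")
reused = tracker.allocate("machine-c")
print(first, second, reused)  # 10.0.0.20 10.0.0.21 10.0.0.20
```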
Conflict avoidance
The platform uses Kubernetes optimistic locking to serialize concurrent allocations. Two simultaneous provisioning requests do not receive the same IP.
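Kubernetes optimistic concurrency rejects an update that carries a stale resourceVersion, and the writer is expected to re-read and retry. The following sketch imitates that pattern with an integer version counter. `VersionedStore`, `Conflict`, and `allocate` are hypothetical names, and the retry policy is an assumption rather than the platform's actual logic.

```python
class Conflict(Exception):
    """Raised when a write is based on an outdated version."""


class VersionedStore:
    """In-memory stand-in for a ConfigMap and its resourceVersion."""

    def __init__(self):
        self.version = 0
        self.allocated = {}  # IP -> machine name

    def read(self):
        return self.version, dict(self.allocated)

    def write(self, expected_version, allocated):
        if expected_version != self.version:
            raise Conflict("record changed since it was read")
        self.allocated = dict(allocated)
        self.version += 1


def allocate(store, pool, machine, max_retries=5):
    for _ in range(max_retries):
        version, allocated = store.read()
        free = next((ip for ip in pool if ip not in allocated), None)
        if free is None:
            raise RuntimeError("pool exhausted")
        allocated[free] = machine
        try:
            store.write(version, allocated)
            return free
        except Conflict:
            continue  # someone else wrote first: re-read and try again
    raise RuntimeError("too many conflicting writes")


store = VersionedStore()
pool = ["10.0.0.20", "10.0.0.21"]
stale_version, stale_view = store.read()      # a second writer reads here
print(allocate(store, pool, "machine-a"))     # 10.0.0.20
stale_view["10.0.0.20"] = "machine-b"         # second writer picks the same IP
try:
    store.write(stale_version, stale_view)
except Conflict:
    print("stale write rejected")
print(allocate(store, pool, "machine-b"))     # 10.0.0.21
```

The second writer's attempt to claim the same address fails because its view is out of date, and the retry lands on the next free address.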
The platform does not validate the configured range against addresses already in use on the network. Ensure the range does not overlap with static addresses, the DHCP proxy VIP, or other DHCP servers on the provisioning network.
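Because the platform does not perform this check, it can help to run one before applying a node type. The sketch below compares an allocation pool against addresses you know are reserved. `find_conflicts` is a hypothetical helper and the addresses are made-up examples.

```python
import ipaddress


def find_conflicts(pool, reserved):
    """Return the reserved addresses that also appear in the pool.

    Reserved entries may be plain IPs or CIDR-style values such as a VIP.
    """
    pool_set = {ipaddress.ip_address(ip) for ip in pool}
    reserved_set = {ipaddress.ip_interface(item).ip for item in reserved}
    return [str(ip) for ip in sorted(pool_set & reserved_set)]


pool = [f"10.0.0.{n}" for n in range(20, 31)]   # 10.0.0.20-10.0.0.30
reserved = ["10.0.0.2/24", "10.0.0.25"]         # example VIP and a static host
print(find_conflicts(pool, reserved))           # ['10.0.0.25']
```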
The platform detects exhaustion at provisioning time. Provisioning fails with an allocation error when all IPs are in use. Size your range to accommodate the maximum number of servers you expect to run simultaneously.
Troubleshooting
Server stuck in provisioning
A server stuck in provisioning usually means PXE boot is not completing.
DHCP proxy unreachable. Verify that the DHCP proxy VIP is reachable from the server's PXE NIC. Check that UDP port 67 is open between the provisioning network and the DHCP server pod.
TFTP or HTTP blocked. After the DHCP handshake, the server downloads boot assets over TFTP (UDP port 69) and the OS image over HTTP (TCP port 6180) from Ironic. Verify both ports are open from the provisioning network to the control plane cluster.
Wrong MAC address. The bootMACAddress in the BareMetalHost must match the NIC the server uses for PXE boot. A mismatch causes the server to boot from a different device and never contact Ironic.
Check the DHCP server pod logs for diagnostic information. The logs show whether DHCP requests are received from the server's MAC address, and whether subsequent TFTP and HTTP requests arrive.
Ironic cannot reach the BMC
No route to BMC. Ironic needs a route to each BMC IP. If the BMC uses a dedicated management network, the control plane cluster nodes must have a route to that subnet.
Firewall blocking management ports. Redfish uses TCP port 443 or 80. IPMI uses UDP port 623. Verify these are open between the control plane cluster and every BMC.
Wrong credentials. The Secret referenced by credentialsName must be in the same namespace as the BareMetalHost. Verify the username and password are correct by logging into the BMC directly.
BMC connectivity and credential errors are visible in the BareMetalHost resource status conditions.
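To rule out routing or firewall problems on the Redfish ports listed above, a plain TCP connection test is often enough. The sketch below uses only the Python standard library, and `tcp_port_open` and `redfish_ports` are hypothetical helper names. Run it from a location that shares Ironic's network path, such as a control plane cluster node. It does not cover IPMI, because UDP is connectionless and an open port cannot be confirmed with a simple connect.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable routes, and timeouts.
        return False


def redfish_ports(host: str) -> dict[int, bool]:
    """Check the two TCP ports Redfish may use on a BMC."""
    return {port: tcp_port_open(host, port) for port in (443, 80)}
```

Calling `redfish_ports` with a BMC address returns a mapping of port to reachability. A result of `False` for both ports points at a route or firewall issue rather than at credentials.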
IP not allocated
Range exhausted. All IPs in the configured range are in use. Deprovision unused servers or expand the range on the node type.
Format error. The network-cidr format is <gateway>/<prefix>, for example 192.168.100.1/24. The network-ip-range format is one or more comma-separated <start>-<end> ranges, for example 10.0.0.20-10.0.0.30. Check the Machine status conditions for error details.
Gateway outside subnet. The gateway in network-cidr must be a valid host address within the specified subnet. It cannot be the network or broadcast address.
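The format and gateway rules above can be checked ahead of time with a short Python sketch built on the standard `ipaddress` module. `validate_network_cidr` is a hypothetical helper and its error messages are illustrative; they are not the messages the platform reports.

```python
import ipaddress


def validate_network_cidr(value: str) -> str:
    """Return the gateway IP as text, or raise ValueError with a reason."""
    if "/" not in value:
        raise ValueError("expected <gateway>/<prefix>, for example 192.168.100.1/24")
    # ip_interface raises ValueError for malformed addresses or prefixes.
    iface = ipaddress.ip_interface(value)
    network = iface.network
    if iface.ip == network.network_address:
        raise ValueError(f"{iface.ip} is the network address of {network}")
    if iface.ip == network.broadcast_address:
        raise ValueError(f"{iface.ip} is the broadcast address of {network}")
    return str(iface.ip)


print(validate_network_cidr("192.168.100.1/24"))  # 192.168.100.1
```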