Infrastructure

Bare Metal

Definition

Bare metal refers to GPU servers delivered without a virtualisation layer, so there is no hypervisor overhead: the customer receives direct hardware access via SSH or IPMI/BMC. In GPU cloud contexts, bare-metal instances provide full control over the operating system, drivers, CUDA runtime, and network configuration. This removes the performance overhead of virtualisation (typically cited as 5-15% for GPU workloads, though the figure varies with the workload and with how devices are passed through) and gives more predictable performance, which matters for large-scale training because the slowest node paces every synchronous step. Most neoclouds default to bare-metal delivery, while hyperscalers primarily offer virtualised instances.
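As a rough illustration of what an overhead in the 5-15% range means commercially, the arithmetic can be sketched as below. The function names and the 2.00 list price are hypothetical, chosen only for the example, and the model assumes the overhead shows up purely as lost throughput.

```python
def effective_price_per_gpu_hour(list_price: float, overhead: float) -> float:
    """Price per hour of useful GPU work when a fraction `overhead`
    of throughput is lost to virtualisation."""
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    return list_price / (1.0 - overhead)


def break_even_premium(overhead: float) -> float:
    """Largest fractional price premium a bare-metal instance can carry over a
    virtualised one while still costing no more per unit of work."""
    if not 0.0 <= overhead < 1.0:
        raise ValueError("overhead must be in [0, 1)")
    return overhead / (1.0 - overhead)


if __name__ == "__main__":
    virtualised_list_price = 2.00  # hypothetical price per GPU-hour
    for overhead in (0.05, 0.10, 0.15):
        eff = effective_price_per_gpu_hour(virtualised_list_price, overhead)
        prem = break_even_premium(overhead)
        print(f"overhead={overhead:.0%}  effective={eff:.3f}  break-even premium={prem:.1%}")
```

Under these assumptions the break-even premium is always slightly larger than the overhead itself, because the lost throughput is measured against the virtualised baseline.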

Technical Context

Bare-metal GPU servers expose the full hardware topology to the user: NVLink mesh, InfiniBand adapters, local NVMe storage, and NUMA domains. This enables workload-specific optimisations that are difficult or impossible in virtualised environments, such as custom NCCL configurations, GPU affinity settings, and direct RDMA access. The trade-off is reduced multi-tenancy efficiency: a bare-metal server serves one customer at a time, and fractional GPU access relies on GPU-level sharing (MIG hardware partitioning or MPS process-level sharing), usually exposed through containers, rather than on hypervisor-level partitioning.
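The kind of tuning this topology access allows can be sketched as follows: pin each training process to the CPU cores on the same NUMA node as its GPU, and restrict NCCL to the compute-fabric InfiniBand adapters. The topology tables, the adapter names (mlx5_N), the interface name eth0, and the helper names are all assumptions for illustration; on a real node the layout would be read from `nvidia-smi topo -m` or sysfs. The environment variables themselves (NCCL_IB_HCA, NCCL_SOCKET_IFNAME, NCCL_DEBUG, CUDA_VISIBLE_DEVICES) are standard.

```python
import os
from typing import Dict, List, Set

# Hypothetical layout of one 8-GPU, 2-socket node (assumption for illustration).
GPU_TO_NUMA: Dict[int, int] = {gpu: 0 if gpu < 4 else 1 for gpu in range(8)}
NUMA_TO_CPUS: Dict[int, range] = {0: range(0, 32), 1: range(32, 64)}
GPU_TO_HCA: Dict[int, str] = {gpu: f"mlx5_{gpu}" for gpu in range(8)}


def cpus_for_gpu(gpu: int) -> Set[int]:
    """CPU cores on the same NUMA node as the given GPU."""
    return set(NUMA_TO_CPUS[GPU_TO_NUMA[gpu]])


def nccl_env(hcas: List[str], socket_ifname: str = "eth0") -> Dict[str, str]:
    """NCCL settings limiting RDMA traffic to the listed adapters."""
    return {
        "NCCL_IB_HCA": ",".join(hcas),        # adapters NCCL may use for InfiniBand
        "NCCL_SOCKET_IFNAME": socket_ifname,  # interface for bootstrap traffic
        "NCCL_DEBUG": "INFO",                 # log the transport NCCL selects
    }


def pin_rank(local_rank: int) -> None:
    """Apply CPU affinity and environment for one process per GPU.
    Call before the framework initialises CUDA or NCCL."""
    cpus = cpus_for_gpu(local_rank)
    if hasattr(os, "sched_setaffinity"):  # Linux only
        usable = cpus & os.sched_getaffinity(0)
        if usable:  # skip pinning if the assumed cores are absent on this host
            os.sched_setaffinity(0, usable)
    os.environ["CUDA_VISIBLE_DEVICES"] = str(local_rank)
    os.environ.update(nccl_env(sorted(set(GPU_TO_HCA.values()))))
```

A launcher would call pin_rank(local_rank) once per process, before importing the training framework. NCCL detects topology on its own in most cases, so explicit settings like these are usually reserved for nodes where the defaults pick the wrong adapter or interface.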

Advisory Relevance

The bare-metal vs virtualised distinction affects pricing, utilisation, and competitive positioning. We evaluate whether operators are positioned correctly for their target market: enterprise buyers often need managed Kubernetes, while AI labs want bare metal with RDMA.

This glossary is maintained by Disintermediate as a reference for GPU infrastructure professionals, investors, and operators. Each entry reflects terminology as used in active advisory engagements and market intelligence work.
