Operations

Cluster Utilisation

Definition

Cluster utilisation measures the percentage of available GPU capacity that is actively generating revenue at any given time. It is the single most important variable in GPU infrastructure economics: the difference between 75% and 90% utilisation on a 1,000-GPU cluster can represent millions of dollars in annual revenue. Even well-capitalised neoclouds with multi-billion-dollar backlogs typically operate at 85-95% utilisation, not 100%. Factors that reduce utilisation include maintenance windows, customer workload variability, cluster reconfiguration between customers, hardware failures, and cooling capacity constraints.
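To make the 75% versus 90% comparison concrete, the arithmetic can be sketched as below. The price of $2.00 per GPU-hour is a hypothetical figure chosen for illustration, not a market quote, and the function name `annual_revenue` is likewise an assumption of this sketch. It also assumes every billed hour is sold at the same price.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_revenue(gpu_count, price_per_gpu_hour, utilisation):
    """Annual revenue if `utilisation` (0.0 to 1.0) of all GPU-hours are billed.

    Simplification: a single flat price applies to every billed hour.
    """
    return gpu_count * HOURS_PER_YEAR * price_per_gpu_hour * utilisation


# Hypothetical price per GPU-hour in USD (an assumption, not from the text above).
ASSUMED_PRICE = 2.00

revenue_at_75 = annual_revenue(1_000, ASSUMED_PRICE, 0.75)
revenue_at_90 = annual_revenue(1_000, ASSUMED_PRICE, 0.90)
revenue_gap = revenue_at_90 - revenue_at_75

print(f"Revenue at 75%: ${revenue_at_75:,.0f}")
print(f"Revenue at 90%: ${revenue_at_90:,.0f}")
print(f"Annual gap:     ${revenue_gap:,.0f}")
```

Under these assumptions the gap is roughly $2.6 million a year, and it scales linearly with both cluster size and price.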

Technical Context

Utilisation can be measured at multiple levels: GPU core utilisation (percentage of compute cycles in use), memory utilisation (percentage of VRAM allocated), time-based utilisation (percentage of hours billed), and revenue utilisation (actual revenue as a percentage of theoretical maximum at list price). These metrics can diverge significantly: a GPU running at 30% compute utilisation but billed for 100% of hours looks badly underused on the compute metric and fully utilised on the time-based one. Revenue utilisation, which accounts for discounts and void (unbilled) periods, is the metric that matters for financial modelling.
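The divergence between these metrics can be illustrated with a minimal sketch for one GPU over one 720-hour month. The `GpuMonth` class, its field names, the $2.00 list price, and the 20% discount are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class GpuMonth:
    """One GPU over one billing month (all fields are illustrative)."""
    hours_available: float   # hours the GPU could have been sold
    hours_billed: float      # hours actually invoiced
    avg_compute_pct: float   # mean GPU core utilisation while in use, 0-100
    list_price: float        # list price per GPU-hour
    revenue: float           # revenue actually invoiced, after discounts

    def time_utilisation(self) -> float:
        """Share of available hours that were billed."""
        return self.hours_billed / self.hours_available

    def revenue_utilisation(self) -> float:
        """Actual revenue as a share of the maximum at list price."""
        return self.revenue / (self.hours_available * self.list_price)


# The case described above: billed for every hour, but only 30% compute load.
# The 20% discount off a $2.00 list price is an assumption for illustration.
gpu = GpuMonth(
    hours_available=720,
    hours_billed=720,
    avg_compute_pct=30,
    list_price=2.00,
    revenue=720 * 1.60,
)

print(f"Compute utilisation: {gpu.avg_compute_pct:.0f}%")
print(f"Time utilisation:    {gpu.time_utilisation():.0%}")
print(f"Revenue utilisation: {gpu.revenue_utilisation():.0%}")
```

The same GPU reads as 30%, 100%, or 80% utilised depending on the metric, which is why a financial model should state which definition it uses.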

Advisory Relevance

Utilisation assumptions are the most common source of overstatement in GPU infrastructure business plans. Our due diligence work consistently finds management teams projecting 95-100% utilisation when operational reality is 75-90%. We benchmark these assumptions against observed performance across the market.
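The size of that overstatement can be estimated with a short sketch. It assumes revenue scales linearly with utilisation and that pricing is unaffected; the function name `revenue_overstatement` is hypothetical.

```python
def revenue_overstatement(projected: float, observed: float) -> float:
    """Fraction by which plan revenue exceeds revenue at observed utilisation.

    Both arguments are utilisation rates between 0 (exclusive) and 1.
    Simplification: revenue is proportional to utilisation.
    """
    for rate in (projected, observed):
        if not 0 < rate <= 1:
            raise ValueError("utilisation must be in the range (0, 1]")
    return projected / observed - 1


# Bounds taken from the ranges quoted above.
mildest = revenue_overstatement(projected=0.95, observed=0.90)
most_severe = revenue_overstatement(projected=1.00, observed=0.75)

print(f"Mildest case:     plan revenue overstated by {mildest:.1%}")
print(f"Most severe case: plan revenue overstated by {most_severe:.1%}")
```

On these figures, plan revenue is overstated by somewhere between about 6% and about 33%.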

This glossary is maintained by Disintermediate as a reference for GPU infrastructure professionals, investors, and operators. Each entry reflects terminology as used in active advisory engagements and market intelligence work.
