Multi-cloud is one of the most frequently misunderstood concepts in cloud infrastructure. The sales pitch is compelling: spread your workloads across AWS, Azure, and GCP and you reduce dependency on any single provider, arbitrage costs, and pick the best service from each platform. The practice is considerably messier.
The operational overhead of running multi-cloud grows with the number of platforms involved, but not linearly: the complexity of cross-cloud networking, identity federation, and observability tooling grows faster than the count of platforms would suggest. Organisations that adopt multi-cloud without an explicit plan for managing that overhead typically end up with fragmented visibility and higher operational costs than a single-cloud deployment would produce.
The resilience argument is real but narrower than the pitch suggests. Multi-cloud provides resilience against failures that affect a single provider's availability, such as regional outages and platform-specific incidents. It does not provide resilience against operational failures or misconfigurations on your own side, or against supply-chain issues that affect the cloud platforms' underlying dependencies.
The cost-arbitrage argument is the weakest. Cloud pricing is complex, and the cost difference between comparable services on different platforms is often smaller than the cost of the additional engineering overhead required to run on both. The exceptions are workloads with very specific unit-economics profiles — large-scale data egress, for instance, where egress costs differ materially between providers.
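The comparison above can be made concrete with a back-of-the-envelope calculation. The sketch below is illustrative only: the function names and the three inputs (monthly spend, the fractional price gap between providers, and the monthly engineering overhead of running on both) are our own assumptions, not a standard costing model, and any numbers fed into it are hypothetical.

```python
def arbitrage_net_benefit(monthly_spend, price_gap_fraction, monthly_overhead):
    """Net monthly benefit of running a workload on the cheaper provider.

    monthly_spend      -- current monthly cost of the workload (hypothetical input)
    price_gap_fraction -- how much cheaper the other provider is, e.g. 0.10 for 10%
    monthly_overhead   -- extra engineering cost of operating on both platforms
    """
    gross_savings = monthly_spend * price_gap_fraction
    return gross_savings - monthly_overhead


def break_even_spend(price_gap_fraction, monthly_overhead):
    """Monthly spend at which the price gap exactly pays for the overhead."""
    if price_gap_fraction <= 0:
        raise ValueError("price_gap_fraction must be positive")
    return monthly_overhead / price_gap_fraction


if __name__ == "__main__":
    # Hypothetical figures purely for illustration.
    net = arbitrage_net_benefit(
        monthly_spend=40_000, price_gap_fraction=0.10, monthly_overhead=15_000
    )
    threshold = break_even_spend(price_gap_fraction=0.10, monthly_overhead=15_000)
    print(f"net monthly benefit: {net:,.0f}")
    print(f"break-even monthly spend: {threshold:,.0f}")
```

A negative net benefit means the overhead outweighs the price difference. The break-even figure shows how large a workload has to be before the gap pays for itself, which is why egress-heavy workloads are the usual exception.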
Where multi-cloud genuinely helps: data sovereignty requirements that map different data categories to different provider footprints; acquisition integration where the acquired entity runs a different platform; and licensing arrangements that make a specific platform materially cheaper for specific workloads. These are real-world cases with clear-enough economics to justify the overhead.
Our advice: default to a primary-cloud model with explicit exceptions for specific use cases, not the reverse. Define the use-case-driven exceptions clearly before adding a second provider. Build the operational model to handle both before the second provider carries production traffic.
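One way to keep the exceptions explicit is to write them down as data rather than leave them as tribal knowledge. The sketch below is a minimal illustration under our own assumptions: the provider labels, use-case names, and the `provider_for` helper are all hypothetical and not tied to any real tool or platform.

```python
from dataclasses import dataclass

# Hypothetical label for the default provider; substitute your own.
PRIMARY_PROVIDER = "primary-cloud"


@dataclass(frozen=True)
class PlacementException:
    """A documented reason for placing a use case outside the primary cloud."""
    use_case: str
    provider: str
    justification: str


# Hypothetical exception register: each entry must carry a justification.
EXCEPTIONS = {
    e.use_case: e
    for e in [
        PlacementException(
            "regulated-customer-data", "secondary-cloud", "data sovereignty"
        ),
        PlacementException(
            "acquired-billing-system", "secondary-cloud", "acquisition integration"
        ),
    ]
}


def provider_for(use_case, exceptions=EXCEPTIONS, primary=PRIMARY_PROVIDER):
    """Return the primary provider unless a documented exception exists."""
    exception = exceptions.get(use_case)
    return exception.provider if exception else primary


if __name__ == "__main__":
    for name in ("web-frontend", "regulated-customer-data"):
        print(name, "->", provider_for(name))
```

The point of the design is the default: anything not listed lands on the primary cloud, so adding a second-provider workload requires adding a justified entry first.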
Multi-cloud done well is a mature engineering practice that requires investment proportional to its complexity. Multi-cloud done carelessly is a significantly more expensive version of single-cloud with worse visibility.