NVIDIA's customer concentration — five buyers, half the revenue, and what happens if one of them in-houses

NVIDIA's 10-K reports customer concentration via a generic line item: "one customer represented approximately 13% of total revenue." That single sentence is the most consequential disclosure in the document. Much of the remaining 87% is concentrated too: look through to the end-buyers and a handful of hyperscalers each sit in a similar range. When you triangulate the disclosures against what each hyperscaler reports as AI capex, the picture clarifies: Microsoft, Meta, Alphabet, Amazon, and Oracle together represent ~45-55% of $NVDA's data-center revenue, depending on the quarter.

That concentration is the bear case the CUDA moat cannot defend against. CUDA defends against substrate switching by new buyers. It does very little to defend against an existing buyer that has the engineering capacity to absorb the migration cost in pursuit of capex savings at scale. The hyperscalers have that capacity, and three of them are actively executing.

This article covers what each concentration risk actually looks like, why Google is the canary, and how to read the disclosure language as it shifts.

The TL;DR. Customer concentration is the second-largest structural risk on NVDA (after HBM supply). Google's TPU is the proof that a hyperscaler can fully in-house: Google now runs most of its internal AI workloads on TPU, not NVDA. AWS Trainium and Meta MTIA are 18-36 months behind that path. If a second hyperscaler reaches "majority internal workload on custom silicon" in the 2027-2028 window, NVDA's data-center revenue base contracts visibly. Microsoft is the furthest behind and Oracle has no custom-silicon program, so both stay structurally long NVDA for now.

What the disclosures actually say

NVIDIA's quarterly filings disclose customer concentration in two ways:

1. Direct customers (the "Customer A, B, C" line). These are the entities NVIDIA invoices directly. The 10-K typically discloses two or three customers crossing the 10% threshold. The entities behind those labels are usually OEMs (Dell, Hewlett Packard Enterprise, Supermicro) or distributors who assemble systems for the hyperscalers, not the hyperscalers themselves.

2. Indirect end-customers (the "one indirect customer represented approximately 19% of total revenue" line). This is the one that matters. The indirect-customer disclosure captures the hyperscaler that ultimately owns the silicon: the entity that ordered HGX servers from Supermicro, so Supermicro shows up as the direct customer while Microsoft (or whoever) is the actual end-purchaser. NVIDIA makes this disclosure as part of its concentration reporting under ASC 280; the entity is identified by its share of revenue, not by name.

Cross-reference NVIDIA's indirect-customer disclosures against the named hyperscalers' AI capex line items in their own filings and the picture sharpens:

Microsoft — discloses ~$80-90B annual capex through FY2026 of which roughly half is AI-data-center. Direct NVIDIA buyer at large scale; ~13-15% of NVDA revenue is the rough triangulation.
Meta — disclosed AI capex of $60-65B for 2025, rising. Direct NVIDIA buyer at large scale; ~10-13% of NVDA revenue.
Alphabet (Google) — large buyer of NVDA for Google Cloud customer workloads, but Google's internal workloads run on TPU. The split inside Google is ~70/30 TPU/NVDA for internal compute; the Google Cloud NVDA spend is for external Cloud customers who specifically want NVIDIA silicon.
Amazon (AWS) — large NVDA buyer for AWS EC2 GPU instances + internal AI workloads, but ramping Trainium for internal training and Inferentia for internal inference. Direct NVDA share roughly 10-12%.
Oracle — emerged in 2024-2025 as a major NVDA buyer for OCI's AI capacity and the Stargate project. Smaller than the top four but growing fastest.

The top five together represent ~45-55% of NVDA's data-center revenue depending on the quarter. That's tight concentration even by tech-sector standards.
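The triangulation above is simple range arithmetic, and it can be sketched in a few lines. The sketch below is illustrative only: it uses the rough share ranges quoted in the list, treats them as if they were all measured against the same revenue base as the ~45-55% total (an assumption, since the per-company figures are quoted against NVDA revenue and the total against data-center revenue), and backs out what is left for Alphabet and Oracle, for which no range is quoted. The names `quoted_shares`, `sum_ranges`, and `implied_residual` are hypothetical helpers, not anything from a filing.

```python
# Rough share ranges quoted in the list above, as (low, high) percent.
# Alphabet and Oracle have no quoted range, so they are left out of the sum.
quoted_shares = {
    "Microsoft": (13, 15),
    "Meta": (10, 13),
    "Amazon": (10, 12),
}

# The article's combined top-five estimate, in percent.
# Assumption: treated as the same base as the per-company ranges.
top_five_range = (45, 55)


def sum_ranges(ranges):
    """Add (low, high) ranges endpoint by endpoint."""
    lows, highs = zip(*ranges)
    return sum(lows), sum(highs)


def implied_residual(total, known):
    """Widest range the unquoted buyers could occupy given the total."""
    low = max(total[0] - known[1], 0)
    high = max(total[1] - known[0], 0)
    return low, high


known = sum_ranges(quoted_shares.values())
residual = implied_residual(top_five_range, known)

print(f"Quoted three combined: {known[0]}-{known[1]}%")
print(f"Implied Alphabet + Oracle: {residual[0]}-{residual[1]}%")
```

The width of the residual range is the point: with only rough inputs, the Alphabet and Oracle slice is weakly pinned down, which is why the filing language matters more than any single estimate.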

Why concentration is the right framing, not customer count

NVIDIA's defenders point out that the named "Customer A" at 13% in the 10-K is an OEM, not a hyperscaler, and that NVIDIA serves "thousands of enterprise customers." Both statements are true and irrelevant.

The relevant question is not how many entities sign NVDA invoices. It's how many independent decision-makers control the demand. The hyperscaler architectures are centrally planned at the top-executive level: Satya Nadella, Sundar Pichai, Mark Zuckerberg, Andy Jassy, and Larry Ellison are the actual buyers. If Meta decides to allocate $20B to MTIA over Blackwell for the 2027 cycle, that's one decision compressing $20B of NVDA TAM at one company. The "thousands of enterprise customers" sum to a small minority of the data-center revenue base.

Concentration risk on NVDA is behavior-correlated, not just count-concentrated. Five hyperscalers facing the same gross-margin pressure on AI inference (cloud customers expect price/perf parity with their on-prem alternatives at 25-30% lower TCO) all have the same incentive to develop custom silicon. Four of the five have acted on it:

Google: TPU v1 (2015), TPU v5e (2023), Ironwood TPU v7 (2025-2026). The most mature custom-silicon program.
Amazon: Trainium (training, 2020), Inferentia (inference, 2019), Trainium2 (2024). The second-most mature.
Meta: MTIA v1 (2023, inference), MTIA v2 (2024-2025). Catching up.
Microsoft: Maia 100 (announced 2023, shipping 2024-2025). The latest entrant.
Oracle: no public custom-silicon program. Structurally long NVDA.

Why Google's TPU is the canary

Google announced TPU v1 in 2016 retrospectively (it had been running internally since 2015). The market response was muted — "Google has special needs, this won't generalize." Ten years later, TPU has done the following:

1. Captured the majority of Google's internal AI workload. Search ranking, ads click prediction, YouTube recommendation, Gmail spam classification, Google Photos object detection, the LaMDA/Bard/Gemini training runs — these all run on TPU, not NVIDIA. Google has not disclosed the split precisely but third-party teardowns and engineer-blog disclosures triangulate to ~70-80% of internal AI compute on TPU.

2. Reached external commercial scale. TPU is available on Google Cloud at competitive pricing vs NVDA H100/H200 instances on the same workloads. Anthropic announced in 2024 it would train Claude models on Google's TPU as well as on NVIDIA — the first time a frontier-model lab publicly committed to a non-NVIDIA training substrate at scale.

3. Established that the migration cost is payable. This is the part the bear case hangs on. Google has spent ~10 years and billions of dollars building the TPU software stack (XLA compiler, Pathways orchestration, JAX integration) to the point that internal teams choose TPU over NVDA on Google's own infra. If Meta, AWS, and Microsoft each spend 5-7 years and tens of billions on their custom-silicon programs, the migration cost can be paid down — and once it has been, the recurring NVDA spend at that hyperscaler steps down.

The trade-relevant timeline:

2025-2026: AWS Trainium2 ramping; meaningful share of AWS internal inference workloads.
2026-2027: Meta MTIA v3 expected, along with the first publicly disclosed MTIA training of a frontier model.
2027-2028: Microsoft Maia v2 or v3 expected to reach meaningful internal share.
2028+: If two or more hyperscalers cross the "majority internal AI on custom silicon" threshold, NVDA's concentrated revenue base contracts visibly.

NVIDIA's defense — see the CUDA moat — is that the migration cost is in the mid-nine-figures per hyperscaler and the chip premium has to compress meaningfully before the math flips. That defense holds for new buyers. It does not hold for existing buyers who have already absorbed years of CapEx into their internal silicon programs and now want to amortize it.

What NVIDIA does to fight back

NVIDIA is not standing still on concentration. Three counter-moves are visible:

1. The platform sell. NVIDIA has progressively repositioned from "we sell GPUs" to "we sell the integrated platform" — DGX systems, MGX reference designs, AI Enterprise software stack, NIM microservices, the full-stack approach. The pitch to a hyperscaler is "you can build your own ASIC, but you can't build the platform we ship — buy ours and reallocate your engineering cycles to your actual product." This works for some workloads (rapid frontier-model iteration, customer-cloud NVDA-demanded instances) and not for others (mature inference workloads at known scale, which is exactly what custom-ASIC programs target first).

2. The customer-cloud play. NVIDIA is investing in independent neoclouds (CoreWeave, Lambda, Crusoe, Together) and even building its own DGX Cloud offering. The strategy is to bypass the hyperscalers entirely — sell capacity directly to end-developers, taking the cloud layer in-house. This is structurally hostile to Microsoft/Google/AWS but it diversifies the demand base off the top-5.

3. The supply allocation lever. When HBM is constrained (which it is — see the HBM bottleneck) NVIDIA allocates the scarce supply to the customers who will keep buying long-term. Reports through 2024-2025 indicated NVIDIA prioritized neoclouds and Oracle (high-growth, no custom-silicon program) ahead of Google (large but in-housing) on initial Blackwell allocations. This is rational allocation policy from NVIDIA's perspective and a signal of which hyperscalers it trusts as forward customers.

How to read the disclosure language

Three things to watch in each NVDA 10-K and quarterly filing:

1. Indirect-customer concentration line. The "one indirect customer represented X% of total revenue" disclosure. If the X drops 2-3 percentage points QoQ that's a leading indicator that the largest hyperscaler is reducing share — either through in-housing or through allocation away from NVIDIA.

2. Hyperscaler 10-K language on AI compute substrate. Microsoft, Meta, Google, and Amazon describe their AI compute architectures in 10-K risk factors and capex discussions. Phrases such as "diversifying our AI compute substrate" or "investing in our own custom silicon for AI workloads" started appearing in 2023-2024 filings. When that language intensifies (naming specific programs, disclosing specific capex allocations to custom silicon, or guiding to substrate-mix percentages), the in-housing path is firming. Most informative are Meta and AWS, both of which have started disclosing more specifically over the past two annual cycles.

3. NVIDIA earnings-call commentary on cloud customer mix. NVIDIA categorizes data-center revenue into "compute" (training-focused) and "networking" (Mellanox/InfiniBand/Spectrum-X) and references customer composition without naming names. The phrase "sovereign AI" started appearing heavily in 2024 — that's NVIDIA's pitch to non-hyperscaler buyers (national governments, large enterprises) and a tell that they want to diversify the customer base. If sovereign-AI commentary grows while hyperscaler commentary stagnates, the concentration is structurally trending the wrong way.
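The first watch item, a 2-3 percentage point quarter-over-quarter drop in the indirect-customer line, is mechanical enough to check with a few lines of code. The following is a minimal sketch, assuming you have already transcribed the disclosed percentages by hand into a list in chronological order. The function name `flag_share_drops`, the 2.0-point default threshold, and the example series are all hypothetical; the numbers are made up for illustration and are not filing data.

```python
def flag_share_drops(shares_pct, threshold_pp=2.0):
    """Return (index, change) for each quarter whose disclosed share fell
    by at least threshold_pp percentage points versus the prior quarter."""
    flags = []
    for i in range(1, len(shares_pct)):
        change = shares_pct[i] - shares_pct[i - 1]
        if change <= -threshold_pp:
            flags.append((i, change))
    return flags


# Hypothetical series for illustration only, not actual filing data.
example = [19.0, 18.5, 16.0, 16.5, 13.0]

for index, change in flag_share_drops(example):
    print(f"quarter {index}: {change:+.1f} pp")
```

One caution on interpretation: a share is a ratio, so it can also fall because NVIDIA's total revenue grew faster than that customer's purchases. A flagged quarter is a prompt to read the filing, not a conclusion.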

The actionable read. If you're long $NVDA, you're long concentrated demand. The largest tail risk isn't AMD (the CUDA moat handles that) and isn't HBM supply caps (those affect ramp but not long-term TAM). It's a hyperscaler reaching critical mass on their custom-silicon program and stepping their NVDA spend down 30-50% within 18 months. Google has shown this is possible. AWS is the next most likely. Meta and Microsoft are behind. Oracle is structurally safe. Allocate accordingly — and consider the supply-side trade (HBM oligopoly) as a cleaner expression of the AI buildout without the customer-concentration risk.
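To put a rough size on that tail risk, multiply a buyer's share of revenue by the size of the step-down. The sketch below does that with the article's own ranges (a ~10-12% AWS share and a 30-50% cut) and assumes everything else is held constant, which is a simplification: it ignores growth elsewhere in the customer base and any reallocation of freed-up supply. `revenue_hit_range` is a hypothetical helper name.

```python
def revenue_hit_range(share_pct, cut_pct):
    """Percent of total revenue lost if a buyer with the given share range
    cuts its spend by the given range, all else held constant."""
    low = share_pct[0] * cut_pct[0] / 100
    high = share_pct[1] * cut_pct[1] / 100
    return low, high


aws_share = (10, 12)  # the article's rough AWS share range, in percent
step_down = (30, 50)  # the article's step-down scenario, in percent

low, high = revenue_hit_range(aws_share, step_down)
print(f"Implied hit: {low:.1f}% to {high:.1f}% of revenue")
```

Under these assumptions, one hyperscaler stepping down is a mid-single-digit percentage hit to revenue. The scenario in the timeline, where two or more cross the threshold, roughly stacks those hits.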

Three signals that would make the concentration risk concrete

1. AWS Trainium becomes the default for Bedrock workloads. Currently Bedrock (AWS's managed LLM inference service) runs a mix of NVIDIA and Trainium. If AWS announces Trainium-default with NVIDIA as the opt-in tier, that's a step function down in AWS's NVDA spend.

2. Meta publishes an MTIA training run paper. Meta has trained Llama models on NVIDIA so far. An MTIA-trained frontier-class model — even a small one — would establish that MTIA's software stack is mature enough to displace NVIDIA on Meta's largest workloads. None has been published as of mid-2026.

3. Microsoft Maia revenue contribution disclosure. Microsoft's silicon spend gets disclosed at quarterly capex granularity but doesn't break out Maia vs NVIDIA. If Microsoft starts referencing "Maia capacity" in earnings calls as a meaningful share of AI compute, the substrate is firming. Nadella has hinted at this trajectory; the numbers haven't shown it yet.

Bottom line

NVIDIA's data-center business is concentrated in five hyperscalers, three of which are actively executing multi-year programs to reduce their NVIDIA dependence. The CUDA moat defends against new-buyer migration; it does not defend against existing-buyer in-housing. Google has demonstrated the playbook works. AWS and Meta are 18-36 months behind. Microsoft is the slowest and Oracle has no custom-silicon program.

The bull case on NVDA must price in how long it takes before two hyperscalers cross the "majority internal on custom silicon" threshold. The bear case on NVDA (the right bear case, not the wrong "AMD will catch up" one) prices that crossing sooner. Either way, customer concentration is the variable that matters, and the HBM supply ceiling constrains both cases.

Related bubbles

NVDA dashboard on QuantAbundancia — thesis panel with current marks.

The CUDA moat — why the software defends against AMD but not against hyperscaler in-housing.

NVIDIA's HBM bottleneck — the supply-side ceiling that gates the revenue ramp regardless of customer composition.

The 12 AI bubbles ranked — why compute, memory, and custom-silicon belong in separate blocks even though the narrative groups them.
