[@DwarkeshPatel] Jensen Huang – TPU competition, why we should sell chips to China, & Nvidia’s supply chain moat

April 15, 2026 · 10 min read

Video Bot

Duration: 103 min

Short Summary

Nvidia CEO Jensen Huang discusses the company's five-layer AI ecosystem strategy, its $100B+ supply chain commitments, and the roughly 70% gross margins protected by its CUDA moat. He argues that custom chips (TPUs, ASICs) offer little cost savings because ASIC vendors' margins are close to Nvidia's own, that export controls may have accelerated China's chip industry, and that DeepSeek releasing first on Huawei hardware would be "horrible" for US tech leadership. Huang also covers Nvidia's $30B investment in OpenAI and $10B investment in Anthropic, and explains his "do as much as needed, as little as possible" philosophy of ecosystem-building.

Key Quotes

"We've seen the valuations of a bunch of software companies crash because people are expecting AI to commoditize software." (00:00:00)
"The input is electrons, the output is tokens. In the middle is Nvidia. Our job is to do as much as necessary and as little as possible to enable that transformation to be done at incredible capabilities." (00:00:01)
"AI is a five-layer cake, if you will. We have ecosystems across the entire five layers. We try to do as little as possible, but the part that we have to do, as it turns out, is insanely hard. I don't think that gets commoditized." (00:00:26)
"Some of the doomers were telling people, 'Whatever you do, don't be a radiologist.' You might hear some of those videos still on the web saying radiology is going to be the first career to go and the world is not going to need any more radiologists. Guess what we're short of? Radiologists." (00:00:13)
"Nvidia is fundamentally making software that other people are manufacturing, and if software gets commoditized, does Nvidia get commoditized? In the end, something has to transform electrons to tokens." (00:00:02)

Detailed Summary

Nvidia's Five-Layer AI Ecosystem

Nvidia operates as a vertically integrated AI company spanning five distinct layers—chips, CUDA software, foundation models, applications, and services—transforming raw electrons into valuable tokens through an ecosystem that competitors struggle to replicate. The company's GPU business maintains approximately 70% gross margins, supported by decades of investment in proprietary technology that creates a defensible moat around its core offerings.

Jensen Huang describes the ecosystem as transforming "electrons into tokens" through chips, CUDA, models, applications, and services
Nvidia's CUDA moat is supported by NVLink, CUDA-X libraries, and cuLitho for computational lithography
Purchase commitments with foundries and suppliers approach $100 billion, with SemiAnalysis reporting potential $250 billion in commitments
Nvidia is TSMC's largest customer on N3 and N2 nodes, with AI representing 60% of N3 capacity this year and 86% next year

AI Cloud Investment Philosophy

Nvidia has deployed significant capital into AI infrastructure companies, treating investments as ecosystem-building rather than financial speculation. The company's core philosophy is "do as much as needed, as little as possible"—focusing investments on creating infrastructure that wouldn't exist without Nvidia's involvement.

Nvidia invested $30 billion in OpenAI, $10 billion in Anthropic, and backstopped CoreWeave up to $6.3 billion
Direct investment in CoreWeave totals $2 billion as a neocloud provider
Huang explicitly rejected picking winners among foundation model companies, stating "picking winners would be arrogant"
Huang credits Nvidia's capital and commitment with enabling neoclouds such as CoreWeave, Nscale, and Nebius, which he says would not otherwise exist

CoWoS Packaging Resolution and Supply Chain Strategy

Advanced chip packaging became a critical bottleneck for AI compute availability, with CoWoS capacity limiting GPU shipments for approximately two years. Nvidia addressed the constraint through aggressive capital deployment, fundamentally changing the scaling dynamics of advanced packaging.

CoWoS packaging was a two-year bottleneck that Nvidia resolved by "swarming it with investment"
TSMC now scales packaging at the same rate as logic, having doubled multiple times to meet demand
Huang argues that no bottleneck (chip capacity, CoWoS, EUV machines) lasts longer than 2-3 years once a clear demand signal is established
The partnership with TSMC spans approximately 30 years without a formal legal contract, based entirely on mutual trust
ASML can scale EUV production relatively quickly once demand signals are clear

Custom Silicon Economics and Competition

Custom AI chips have emerged as a potential alternative to Nvidia's GPUs, but Huang argues the economics don't justify the engineering investment required. The margin differential between custom ASICs and Nvidia's offerings is narrower than commonly perceived, while the complexity of building competitive silicon continues to increase.

ASIC margins (~65%) are only marginally lower than Nvidia's (~70%), making custom chips economically marginal
Anthropic is cited as the "sole driver" of TPU and Trainium growth—"without them there would be zero growth"
Many custom ASIC projects have been canceled, validating Nvidia's position that "building an ASIC better than Nvidia is not easy and not sensible"
60% of Nvidia's revenue comes from hyperscalers: Google, Amazon, Microsoft (Azure), and Oracle (OCI)
Google runs TPUs as the majority of their compute while OpenAI uses custom Triton kernels instead of standard CUDA libraries
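
The margin argument above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes both suppliers have the same underlying unit cost for equivalent performance (an assumption not made in the episode) and uses the approximate 70% and 65% gross margins cited above. The function names are hypothetical.

```python
def price_for_margin(unit_cost: float, gross_margin: float) -> float:
    """Selling price that yields `gross_margin` on a given unit cost."""
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross_margin must be in [0, 1)")
    # Gross margin = (price - cost) / price, so price = cost / (1 - margin).
    return unit_cost / (1.0 - gross_margin)


def buyer_savings(margin_high: float, margin_low: float) -> float:
    """Fractional price cut from buying at margin_low instead of margin_high.

    Assumes identical unit cost for both suppliers (a simplification).
    """
    high_price = price_for_margin(1.0, margin_high)
    low_price = price_for_margin(1.0, margin_low)
    return 1.0 - low_price / high_price


savings = buyer_savings(0.70, 0.65)
print(f"Buyer saves about {savings:.1%} on price")
```

Under these assumptions a five-point margin gap works out to a price reduction of roughly 14%, before counting any of the engineering cost of a custom chip program.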

Blackwell Architecture Efficiency Gains

The transition from Hopper to Blackwell represents a generational leap in AI compute efficiency that challenges conventional assumptions about hardware scaling. Despite only modest transistor improvements, architectural innovation delivered transformative performance gains that benefit both training and inference workloads.

Hopper to Blackwell delivers a roughly 30x-50x improvement in energy efficiency (Huang says he initially announced 35x and later revised the figure to 50x)
The two generations arrived three years apart, with only a 75% improvement in transistors
Blackwell achieved its 50x overall performance gain mainly through architecture improvements, which Huang presents as evidence that architecture matters more than lithography scaling
Nvidia's CUDA ecosystem supports every framework including Triton, vLLM, and SGLang
Nvidia serves as the primary backend contributor to Triton's open-source kernel library
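
One rough way to read the Hopper-to-Blackwell figures above is to split the headline gain into a transistor factor and everything else. This is a back-of-the-envelope sketch, not Nvidia's methodology: it assumes the gains multiply and that a 75% transistor improvement translates directly into a 1.75x performance factor, both of which are simplifications.

```python
def residual_gain(total_gain: float, known_gain: float) -> float:
    """Gain left unexplained after dividing out a known multiplicative factor."""
    if total_gain <= 0 or known_gain <= 0:
        raise ValueError("gains must be positive")
    return total_gain / known_gain


total_gain = 50.0        # headline Hopper -> Blackwell figure cited above
transistor_gain = 1.75   # "75% transistor improvement", assumed to mean 1.75x

architecture_gain = residual_gain(total_gain, transistor_gain)
print(f"Gain not explained by transistors: about {architecture_gain:.1f}x")
```

On these assumptions, roughly 28.6x of the 50x would have to come from architecture, software, and system design rather than from transistor scaling.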

Algorithmic Progress and Efficiency Multipliers

Hardware advances alone cannot explain AI's rapid capabilities improvement—algorithmic innovations contribute substantially to overall performance gains. Jensen Huang emphasizes that "great computer science" through model architectures, attention mechanisms, and training methodologies can deliver 10x improvements that complement hardware scaling.

Moore's Law advances approximately 25% per year, but algorithmic improvements can yield 10x performance gains
Jensen argues most advances in AI came from algorithm advances, not just raw hardware
Mixture of Experts (MoEs), attention mechanisms, and other innovations reduce compute requirements dramatically
Post-training and reinforcement learning frameworks like verl and NeMo RL are described as "exploding" as important areas for AI development
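
To put the 25%-per-year and 10x figures above on the same scale, the sketch below computes how long steady compounding takes to reach a given multiple. It assumes the 25% rate compounds annually and stays constant, which is an idealization; the function name is hypothetical.

```python
import math


def years_to_reach(multiple: float, annual_rate: float) -> float:
    """Years of compounding at `annual_rate` needed to reach `multiple`."""
    if multiple <= 0 or annual_rate <= 0:
        raise ValueError("multiple and annual_rate must be positive")
    # Solve (1 + rate) ** years == multiple for years.
    return math.log(multiple) / math.log(1.0 + annual_rate)


years = years_to_reach(10.0, 0.25)
print(f"25% per year takes about {years:.1f} years to reach 10x")
```

At that rate, hardware scaling alone would need about a decade to match a single 10x algorithmic improvement.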

Export Controls and China's Chip Industry

US export restrictions on advanced semiconductor technology have fundamentally altered the trajectory of China's AI chip development, potentially accelerating domestic capability building rather than slowing it. Huang argues that restricting chip sales may be counterproductive, noting that China has responded by developing internal ecosystems while facing severe compute limitations.

China manufactures 60% of the world's mainstream chips
China is home to about half of the world's AI researchers and represents approximately 40% of the global technology industry
Huang claims export controls "enabled and accelerated China's chip industry" by forcing their ecosystem to focus on internal architectures
China has "one tenth the amount of flops the US has" and is limited to 7nm-class manufacturing without EUV machines because of export controls on chip-making equipment
China has "enormous energy and datacenters sitting completely empty" that cannot be utilized due to compute restrictions
Jensen estimates the threshold China needs for advanced AI capabilities has already been reached, making export controls insufficient without dialogue and research engagement

DeepSeek, Huawei, and Non-American AI Stacks

Chinese AI technology has progressed significantly, with Huawei reporting its "largest single year in the history of their company" through millions of chip shipments. Jensen expresses concern that DeepSeek models running first on Huawei hardware would represent a "horrible outcome" for US technological leadership, as models optimized for non-American architecture would disadvantage the US tech stack.

Huawei just had the "largest single year in the history of their company," shipping millions of chips with logic and HBM2 memory
SMIC has "plenty of logic capacity and plenty of HBM2" to meet China's AI needs
The H200 outperforms Huawei 910C by roughly 2-3x, with Huawei compensating by using twice as many chips
DeepSeek "first releasing on Huawei hardware would be a horrible outcome" for the US
China's limited compute has forced researchers to develop "extremely smart algorithms"—DeepSeek represents "not an inconsequential advance"
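
The H200 versus 910C comparison above reduces to simple ratio arithmetic. The sketch below is a simplification under stated assumptions: it treats aggregate throughput as per-chip performance times chip count and ignores interconnect, memory, power, and software efficiency. The function name is hypothetical.

```python
def aggregate_ratio(per_chip_advantage: float, chip_multiplier: float) -> float:
    """Challenger system throughput relative to the incumbent system.

    per_chip_advantage: how many times faster one incumbent chip is.
    chip_multiplier: how many times more chips the challenger deploys.
    """
    if per_chip_advantage <= 0 or chip_multiplier <= 0:
        raise ValueError("inputs must be positive")
    return chip_multiplier / per_chip_advantage


# Range cited above: H200 is roughly 2-3x a 910C, offset by using 2x the chips.
for advantage in (2.0, 3.0):
    ratio = aggregate_ratio(advantage, 2.0)
    print(f"{advantage:.0f}x per-chip gap, 2x chips -> {ratio:.2f}x of H200 system")
```

Under this model, doubling the chip count closes the gap entirely at a 2x per-chip difference and reaches about two thirds of parity at 3x.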

US Competitive Advantages and Limitations

The United States maintains substantial advantages in AI development through superior chip technology, but Huang emphasizes that compute alone doesn't determine outcomes. Energy availability and the application layer represent critical dependencies for sustained US leadership that go beyond pure hardware supremacy.

Jensen argues the US has a "100x compute advantage more than anywhere else in the world"
Nvidia ensures US labs get first access to advanced technologies through allocation prioritization
The US is "scarce on energy," requiring continued architecture advances to maximize throughput per watt with fewer chips
AI is described as a "five-layer cake" in which an abundance of energy can compensate for a shortage of chips, and vice versa
Every layer of the AI stack must succeed for US leadership, including the application layer, where "AI diffuses into society" and society benefits from the resulting industrial revolution
Huang explicitly rejected comparing AI chips to enriched uranium: "We're not enriched uranium. It's a chip, and it's a chip that they can make themselves."

Radiology AI Predictions and Industry Outlook

Early predictions about AI replacing radiologists have proven unfounded, with the field now facing shortages despite a decade of algorithmic improvement. The episode illustrates a broader pattern where AI excels at discrete tasks but faces challenges integrating into complex professional workflows.

Radiology AI doomer predictions from approximately ten years ago warned careers would disappear
Radiologists are now "in short supply" despite those predictions
Jensen argues the predictions confused "tasks" (reading scans) with "jobs" (patient care), causing unnecessary fear
CUDA enables flexibility for creating MoE, diffusion, and disaggregated systems, making AI dependent on the stack above as much as architecture below

Future Roadmap and Computational Scope

Nvidia commits to continuing its annual GPU release cadence with Vera Rubin, Vera Rubin Ultra, and Feynman architectures in the pipeline. Beyond AI, Huang emphasizes that "every important computation is not AI-related"—traditional HPC applications in molecular dynamics, seismic processing, and scientific computing remain critical workloads where CUDA acceleration delivers substantial value.

Annual GPU releases committed: Vera Rubin, Vera Rubin Ultra, then Feynman
Token costs are decreasing by 10x each year through architecture improvements
Nvidia can fulfill orders from "single rack or graphics card to $100 billion AI factory"—claiming to be the only company that can say that today
Not every important computation is AI-related: Huang cites molecular dynamics, seismic processing for energy discovery, and image processing
General purpose computing remains too inefficient for these workloads, requiring CUDA acceleration
