Google’s Gemini 3 landed last week to impressive reviews: frontier-class performance that beats OpenAI and Anthropic everywhere except agentic coding. Conventional wisdom said Google was lagging behind OpenAI, and on adoption that remains true. But on capability, Google has a real chance to catch up. The tech press is focused on benchmarks and capabilities.

They’re missing the real headline: Google trained these models entirely on TPUs. Zero NVIDIA dependency.

While competitors debate model quality, Google just decoupled its cost structure from the entire market. OpenAI and Microsoft pay massive margins to Jensen Huang. Google is paying cost-plus to its own hardware division.

It’s the margin war

In a commodity market, the low-cost provider eventually wins. AI inference is racing toward commoditization. Google’s vertical integration is the ultimate cheat code.

Consider the math. NVIDIA-dependent companies must price their APIs to cover energy, cloud operations, and NVIDIA’s roughly 75% gross margins. Google prices its APIs to cover energy and silicon fabrication costs.
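
A back-of-the-envelope sketch of that gap. All numbers are illustrative: the 75% figure is NVIDIA’s reported gross margin, and the internal markup is an assumption, not a disclosed figure.

```python
# Back-of-the-envelope: what a 75% gross margin on silicon does to
# the cost floor of an inference API. Illustrative numbers only.

fab_cost = 1.0          # normalized cost to fabricate an accelerator
nvidia_margin = 0.75    # NVIDIA's reported ~75% gross margin

# A renter pays NVIDIA's price: price = cost / (1 - margin)
nvidia_price = fab_cost / (1 - nvidia_margin)    # 4.0x the fab cost

# Google pays roughly cost-plus to its own hardware division
internal_markup = 0.10  # assumed modest internal transfer markup
google_price = fab_cost * (1 + internal_markup)  # 1.1x the fab cost

print(f"NVIDIA-dependent silicon cost: {nvidia_price:.1f}x")
print(f"Vertically integrated cost:    {google_price:.1f}x")
print(f"Hardware cost advantage:       {nvidia_price / google_price:.1f}x")
```

Under these assumptions, the vertically integrated player pays roughly a quarter of what the renter pays for the same silicon before a single token is served.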

This allows Google to trigger a race to the bottom on pricing that no NVIDIA-dependent competitor can survive. Google can effectively subsidize the model indefinitely to protect Search.

Competitors, meanwhile, burn capital paying premium rents for compute. Google has what amounts to “compute sovereignty”: the ability to scale capacity without a third party’s permission.

Amazon proves it’s not just Google

Google isn’t alone. Amazon is running the same playbook with AWS Trainium chips.

Anthropic is training Claude 4.x on 500,000 AWS Trainium2 chips, with plans to scale to one million by year-end. AWS claims 30-40% better price-performance than NVIDIA equivalents. That’s a structural cost advantage, not an incremental improvement.
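
To make the claimed range concrete, here is what 30-40% better price-performance does to a fixed training budget. This is a rough sketch that takes AWS’s claim at face value; none of it is measured data.

```python
# What "30-40% better price-performance" means for a training budget.
# Illustrative only: based on AWS's claimed range, not measurements.

budget = 100.0  # normalized spend on NVIDIA-equivalent capacity

for gain in (0.30, 0.40):
    # Better price-performance means more work per dollar, so the
    # same workload costs budget / (1 + gain)
    trainium_cost = budget / (1 + gain)
    print(f"{gain:.0%} better price-performance -> "
          f"{budget - trainium_cost:.0f}% cheaper for the same run")
```

At the scale of hundreds of thousands of chips, a 23-29% discount on every training run compounds into the kind of savings that changes which experiments a lab can afford to run.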

Amazon’s approach mirrors Google’s vertical integration but with a different business model. Where Google optimizes for its own models, AWS sells compute sovereignty as a service.

The software moat is deeper than the hardware

Cheap chips don’t matter if nobody knows how to program them. This is the “Island Problem”: great infrastructure that developers won’t adopt. It remains the single biggest risk to Google’s strategy.

NVIDIA is not just a hardware company. It is a software platform disguised as a chip manufacturer. The CUDA ecosystem is the deepest moat in tech.

Every researcher coming out of Stanford or MIT learns PyTorch on CUDA. Every major open-source library is optimized for CUDA first. Moving a production workload from NVIDIA to TPU isn’t just a “recompile.” It often requires rewriting code in JAX or dealing with XLA compiler friction.
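
For a sense of what that rewrite involves, here is a minimal JAX training step. The model and numbers are hypothetical, but the shape of the code (pure functions, no in-place mutation, whole-step XLA compilation) is exactly what a typical eager PyTorch/CUDA codebase doesn’t have.

```python
# A taste of the porting friction: TPUs want XLA-compiled, functional
# code. Minimal JAX training step; model and data are hypothetical.

import jax
import jax.numpy as jnp

def loss_fn(params, x, y):
    # Hypothetical linear model standing in for a real network
    pred = x @ params["w"] + params["b"]
    return jnp.mean((pred - y) ** 2)

@jax.jit  # XLA traces and compiles the whole step; every op in the
          # step must be expressible to the compiler
def train_step(params, x, y, lr=0.01):
    grads = jax.grad(loss_fn)(params, x, y)
    # No in-place mutation: parameters are rebuilt functionally
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

params = {"w": jnp.zeros((8, 1)), "b": jnp.zeros(1)}
x, y = jnp.ones((32, 8)), jnp.ones((32, 1))
params = train_step(params, x, y)
```

None of this is exotic, but retrofitting a large production codebase built on mutable state and CUDA-specific kernels into this style is real engineering work, not a flag change.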

If Google wins on efficiency but loses on developer mindshare, they end up with the best internal infrastructure that nobody else wants to use.

The model is the only leverage left

On the flip side, if Google cannot break the CUDA stranglehold, its TPUs remain a private island. In this scenario, their only path to victory is abstracting the hardware away entirely.

They must force the market to consume Inference APIs, not raw compute.

If Gemini 3 is sufficiently powerful (and sufficiently cheap), developers won’t care what silicon it runs on. They will never touch the metal. They will just hit the API endpoint. Google’s strategy relies on turning the TPU from a developer hurdle into a hidden margin engine.
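
Here is a rough sketch of that abstraction in practice, using the google-generativeai Python SDK (the model name below is a placeholder). The point is what’s absent: nothing in the calling code knows or cares what silicon serves the request.

```python
# The abstraction in practice: no CUDA, no JAX, no XLA. The hardware
# is invisible behind the endpoint. Model name is a placeholder.

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-model-name")  # placeholder
response = model.generate_content("Summarize TPU economics in a tweet.")
print(response.text)
```

At this layer the TPU stops being a developer hurdle. Pricing, not programming, becomes the interface.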

The industry is splitting into two distinct camps.

OpenAI, Meta (which trained Llama 4 on H100s), and most startups run on NVIDIA capacity. They optimize for speed-to-market and developer familiarity but remain locked into NVIDIA’s margin structure and the CUDA ecosystem.

Google with TPUs and Amazon with Trainium optimize for unit economics. They’re betting that price-performance eventually trumps developer familiarity, and that vertical integration becomes the only sustainable path in a commoditizing market.

The question isn’t which group is right. It’s which advantage compounds faster: ecosystem lock-in or cost structure.