GPU Or ASIC For LLM Scale-Up?

LLMs are just getting started.


The CEOs of OpenAI, Anthropic, and xAI share a strikingly similar vision — AI’s progress is exponential, it will change humanity, and its impact will be greater than most people expect.

This is more than just speculation. The market for AI, and its value, are real today:

  • A human developer using GitHub Copilot codes 55% faster.
  • GPT-4 scores 88th percentile on the LSAT vs. 50th percentile for an average human.
  • And I personally use ChatGPT for conversational Spanish practice and grammar drills.

LLM revenues in 2025 will be ~$10B at OpenAI, and $2B to $4B at Anthropic.

GPT-2 provided pre-schooler intelligence. Four years later, GPT-4 is like a smart high-schooler.

By ~2028, LLMs will offer smart PhD-level intelligence. In the 2030s LLM IQ will be superhuman.

The economics of AI are improving as well. The cost of running a model of a given capability is dropping 4X/year (Anthropic’s estimate) to 10X/year (OpenAI’s estimate), split roughly equally between compute improvements and algorithm improvements. At those rates, by 2030 today’s models will cost 1/1,000th to 1/100,000th as much to operate.
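As a quick sanity check on that arithmetic, here is a minimal sketch in Python, assuming the improvement rates compound over the five years from 2025 to 2030:

```python
# Compounding the cost-decline rates cited above over 2025-2030.
years = 5
for rate, source in [(4, "Anthropic"), (10, "OpenAI")]:
    factor = rate ** years
    print(f"{source}: {rate}X/year for {years} years -> 1/{factor:,} of today's cost")

# Anthropic: 4X/year for 5 years -> 1/1,024 of today's cost
# OpenAI: 10X/year for 5 years -> 1/100,000 of today's cost
```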

AI will be everywhere, and human productivity can soar with it. For a more detailed view, see https://situational-awareness.ai/from-gpt-4-to-agi/

There are 5+ companies with the capability and capital to do this, including giants like Amazon, Google, and Microsoft. And startups like OpenAI and Anthropic, whose valuations are in the $100B range now, will be worth $1 trillion if they can deliver. The LLM winner may be the first $10 trillion company.

Their success will place huge strains on growth and capacity for semiconductors, packaging, data centers, cooling, and power. Semiconductor revenues will be mostly AI/HPC by 2030.

GPU vs ASIC? YES: Hyperscalers want options

Today, data center AI accelerators are >90% NVIDIA GPUs, with some AMD GPUs, and the rest custom ASIC (primarily Amazon).

NVIDIA is the only player with a total solution: GPU, NVLink networking, racks, systems, software. It will be hard to match or beat NVIDIA at its own game. The company’s revenue is $160B/year.

NVIDIA has 3 or 4 customers buying >10% of its output, or almost $20B/year each.

But AMD’s GPU roadmap is catching up to NVIDIA’s. Its MI350 is expected to match Blackwell in 2H 2025, and its MI400 to match NVIDIA’s expected Rubin (Blackwell’s successor). AMD is catching up on software and interconnect/systems too, hoping to reach $10B/year in revenue in 2026.

Even if AMD is not as good as NVIDIA, expect the major hyperscalers to give it business. They want a strong alternative to NVIDIA, and having one gives the hyperscalers some pricing leverage, along with the ability to ramp their data centers faster if NVIDIA is supply-constrained.

What about ASICs for AI Accelerators? Just a few years ago ASIC was a bad word among investors — low margin, low growth. Now it’s hot because the hyperscalers want options.

Amazon, Google, Meta, and OpenAI are all developing their own AI accelerators, and there are others as well. Broadcom’s AI revenues, for example, have soared about 10X in 3 years to about half of total sales. Likewise, Marvell’s AI revenues have soared over the same time frame, and AI is now by far its biggest business unit.

At the Morgan Stanley Technology Conference in early March, OpenAI CEO Sam Altman said that an ASIC for a given model can be really efficient if you give up some of the GPU’s flexibility. Remember, networks used to run on x86 processors; now it’s all switch chips, because the kinds of packets to be processed change only slowly.

The market is moving from mostly training workloads to mostly inference. An inference-only ASIC can be much simpler; it’s all about cost and power. An inference-only ASIC constrained to, say, just transformer models can be simpler and cheaper still. Alchip’s CEO says an ASIC delivers 40% better price/performance than a GPU, and it can be optimized for the customer’s specific software.

An AI accelerator today has a 3nm or 2nm compute engine, perhaps with separate SRAM and PHY chiplets on older, cheaper nodes such as 5nm. Alchip’s CEO said an AI accelerator costs $50M in NRE. Broadcom and Marvell are probably doing more complex accelerators, with more chiplets and 3D packaging and $100M+ development costs. And the hyperscalers will have 100-plus-person architecture teams, another 100+ people doing network connectivity, and many more doing software. That puts the total cost in the one-third to one-half billion dollar range. Can they afford this?
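A back-of-the-envelope tally suggests how those pieces add up. In the sketch below, the NRE and headcounts come from the paragraph above; the fully loaded cost per engineer and the program length are illustrative assumptions, not figures from the article:

```python
# Rough per-generation ASIC program cost, using the figures above.
nre = 100e6                # $100M+ NRE for a complex multi-chiplet design
architecture = 100         # 100-plus-person architecture team
networking = 100           # 100+ people on network connectivity
software = 150             # "many more doing software" (assumed count)
cost_per_engineer = 500e3  # $/year, fully loaded (assumed)
program_years = 2          # per-generation program length (assumed)

people_cost = (architecture + networking + software) * cost_per_engineer * program_years
total = nre + people_cost
print(f"Estimated program cost: ${total / 1e9:.2f}B")
# Estimated program cost: $0.45B -- in the one-third to one-half billion range
```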

If a hyperscaler buying $20B/year can get a 10% discount from NVIDIA because it has alternatives, then it can afford to do its own ASICs. And if a hyperscaler succeeds in building an ASIC that is half the cost of an NVIDIA GPU, using less power, it has hit a home run. Alchip’s CEO said an ASIC could be about 40% cheaper than a GPU.
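The payback arithmetic is easy to check (a minimal sketch; the $0.5B program cost is the upper end of the estimate above):

```python
# Does pricing leverage alone cover the ASIC program cost?
annual_gpu_spend = 20e9    # hyperscaler buying $20B/year from NVIDIA
discount = 0.10            # 10% discount from having alternatives
program_cost = 0.5e9       # upper end of the development-cost estimate

annual_savings = annual_gpu_spend * discount
print(f"Savings from leverage alone: ${annual_savings / 1e9:.0f}B/year")
print(f"Program pays back in {program_cost / annual_savings:.2f} years")
# Savings from leverage alone: $2B/year
# Program pays back in 0.25 years
```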

Hyperscalers likely will deploy GPUs from NVIDIA and AMD for the most complex and fastest-changing workloads, and for external customers, but they will use their own ASICs for internal, slower-changing workloads. The ultimate mix of GPU and ASIC will depend on relative performance, power, and availability. It could be 90% GPU and 10% ASIC. Or, as McKinsey predicts, it could be 10% GPU and 90% ASIC. Smaller customers who “just” buy $1B/year will have to use GPUs.


