Today’s most important event is NVIDIA GTC Conference, which is basically an AI version of A Brief History of Humankind.

Even before Jensen Huang takes the stage, the advance leaks are already enough to fill a whole book.

Wanwan has pulled together four big takeaways. Come on, friends, follow along.

1) AI compute costs get cut straight down to a tenth

The previous Blackwell already packs a punch, right?
The new-generation Vera Rubin chips will be announced for mass production soon.

What makes Vera Rubin so strong? Put simply: it’s cheap.

Running the same AI model, you need only a quarter as many chips, and inference compute costs drop by 90%.
A 90% drop, friends.
AWS, Microsoft, and Google—the three major cloud providers—are the first batch to get on board.
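To make those two numbers concrete, here is a toy calculation. The cluster size and per-query cost below are invented purely for illustration; they are not figures from the keynote.

```python
# Toy arithmetic for the claims above: same model, one quarter the chips,
# 90% lower inference cost. The baseline numbers are invented.

old_chips = 100            # hypothetical previous-generation cluster size
old_cost_per_query = 1.00  # hypothetical cost per query, in dollars

new_chips = old_chips / 4                     # "chips reduced to one quarter"
new_cost_per_query = old_cost_per_query / 10  # "costs drop by 90%"

print(new_chips, new_cost_per_query)  # → 25.0 0.1
```

In other words, each new chip also does the work of several old ones at a fraction of the cost, which is why the cloud providers line up first.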

2) Groq, bought for $20 billion last year, turns in its homework today

Earlier, Jensen Huang said on an earnings call that Groq would plug into NVIDIA's ecosystem as an extension architecture, much like NVIDIA's earlier acquisition of Mellanox rounded out its networking capabilities.

Groq's LPU and NVIDIA's GPU sit in the same data center: the GPU understands the question, and the LPU rapidly spits out the answer.

With the two chip types splitting the work and coordinating, latency in agent scenarios drops sharply.

With AI agents doing work for people, a single task can go back and forth, calling the model dozens of times. Every round burns inference compute while the user sits waiting; if it's even a bit slow, the experience falls apart.

Inference happens in two steps: first, understand your question; then, output the answer one word at a time.

GPUs are strong at the first step; at the second step, the speed and stability of writing the words out, Groq's LPU does better.
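The two-step split above can be sketched in code. Everything here is a toy: the function names are hypothetical (not NVIDIA's or Groq's API), and the "decode" rule just echoes the prompt back. In a real system, prefill runs the whole prompt through the model in one parallel, compute-bound pass, while decode emits one token at a time and is latency-bound, which is where a chip like the LPU helps.

```python
# Illustrative sketch of two-phase inference: a prefill step that digests
# the prompt in one pass, and a decode loop that emits one token at a time.

def prefill(prompt_tokens):
    # Step 1: "understand the question" in one parallel pass (GPU-friendly).
    # Returns the state the decode loop will read from.
    return {"context": list(prompt_tokens)}

def decode_step(state):
    # Step 2: emit the next token (latency-bound; LPU-friendly).
    # Toy rule: echo the prompt back, then stop.
    ctx = state["context"]
    return ctx.pop(0) if ctx else None

def run_inference(prompt_tokens):
    state = prefill(prompt_tokens)
    out = []
    while (tok := decode_step(state)) is not None:  # word by word
        out.append(tok)
    return out

print(run_inference(["hello", "world"]))  # → ['hello', 'world']
```

The point of the split is that the two phases have different bottlenecks, so assigning each to the chip that suits it cuts end-to-end latency.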

Is $20 billion expensive?

Just think: soon every company runs hundreds of agents, and each agent calls the model thousands of times a day.

3) NVIDIA's OpenClaw launches, now called NemoClaw

It's an open-source platform package: once enterprises install it, they can deploy AI employees that run processes, handle data, and manage projects in place of real people.
Word is it's already in talks with Salesforce and Adobe.

What's interesting is that NemoClaw doesn't require you to use NVIDIA chips.
Look closely at the logic here.
Selling chips only earns money at the hardware layer; setting the rules earns money across the whole chain. Jensen Huang's math here is crystal clear.

4) Jensen Huang says he'll show "a chip the world has never seen before"

Most likely the next-next-generation architecture, Feynman, will make its first appearance, with mass production in 2028 on TSMC's most advanced 1.6nm process.

And there’s another niche tidbit I think is pretty interesting.

NVIDIA is also rolling out two laptop processors, aimed mainly at gaming.
So the graphics-card seller is coming to take a bite out of the CPU business, huh.

Wanwan thinks Jensen Huang is going to go down as one of the great figures of this era.
