NextFin

Jensen Huang Maps the Era of AI Factories at GTC 2026

Summarized by NextFin AI
  • Jensen Huang's keynote at NVIDIA's GTC 2026 highlighted the emergence of AI factories, which convert data into tokens and economic value, emphasizing the need for new computing platforms.
  • NVIDIA's CUDA ecosystem supports millions of GPUs, driving down computing costs and extending their useful life, which attracts developers to create new algorithms.
  • The introduction of "neural rendering" and DLSS 5 represents a new era in graphics, merging structured data with generative AI for enhanced visual content creation.
  • NVIDIA's partnerships with major cloud providers aim to enhance confidential computing, ensuring data security while optimizing AI workloads across various platforms.

NextFin News -

Jensen Huang delivered the opening keynote at NVIDIA’s GTC developer conference on March 16, 2026, at the SAP Center in San Jose, California. The address — headlining a multi-day GTC program of technical sessions and partner announcements — set out NVIDIA’s view of the next platform shift in computing: the rise of AI factories that convert data into tokens and tokens into economic value. NVIDIA’s GTC keynote page lists Huang’s presentation for March 16, 2026 at 11:00 a.m. PT, along with the full event agenda.

Opening framing: platforms, ecosystems and the token economy

Huang opened by reframing NVIDIA as a platform company with three interlocking platforms: CUDA-X libraries, systems, and a new category he named "AI factories." He described the industry’s five-layer architecture and urged attendees to focus on applications. As he put it, tokens have opened a new frontier, turning data into knowledge, and are already at work across science, industry and healthcare. He repeated the central metaphor: data centers are becoming production factories where tokens are the manufactured commodity.

CUDA, the installed base, and why longevity matters

Reflecting on CUDA’s 20th anniversary, Huang argued that the long-lived installed base of NVIDIA GPUs is the foundation of accelerated computing’s flywheel. He said CUDA’s ecosystem of compilers, tools and libraries makes optimization and continuous software updates possible for millions of deployed GPUs. In his words, "the install base is what attracts developers who then create new algorithms," and that continuous software work drives down computing cost while extending useful life.

Fusing graphics and generative AI: neural rendering and DLSS 5

Huang described a new graphics era, "neural rendering," that fuses structured 3D data and generative AI to produce controllable, photoreal visuals. He demonstrated DLSS 5 during the keynote and said the fusion of structured virtual-world data with probabilistic generative models creates content that is both beautiful and controllable. "We combine 3D graphics, structured data with generative AI, probabilistic computing," he said, noting that this pattern will reappear industry after industry.

Structured and unstructured data: QDF and QVS as foundational libraries

Huang distinguished two essential data platforms for the AI era: structured data frames and unstructured semantic/vector stores. He announced NVIDIA libraries targeting both domains — QDF for data frames and QVS for vector stores — arguing that AI must be accelerated where the data lives. He emphasized the business imperative: AI needs rapid access to massive datasets and today's CPU-based data processing systems cannot keep up.
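The distinction Huang draws can be illustrated with a toy example. The sketch below uses plain NumPy, not NVIDIA's QDF/QVS APIs (which were not detailed in the keynote): a structured query filters typed rows by predicate, while a semantic query ranks embedding vectors by similarity.

```python
import numpy as np

# Structured query (data-frame style): exact predicate filtering on typed columns.
rows = np.array([(2023, 410.5), (2024, 512.0), (2025, 640.3)],
                dtype=[("year", "i4"), ("revenue", "f8")])
recent = rows[rows["year"] >= 2024]          # rows matching the predicate

# Semantic query (vector-store style): nearest neighbours by cosine similarity.
docs = np.random.default_rng(0).normal(size=(1000, 64))   # toy document embeddings
docs /= np.linalg.norm(docs, axis=1, keepdims=True)
query = docs[42] + 0.01                                   # a query close to doc 42
query /= np.linalg.norm(query)
scores = docs @ query                                     # cosine similarity scores
top3 = np.argsort(scores)[-3:][::-1]                      # best matches first
```

The structured path answers "which rows satisfy this condition"; the vector path answers "which items mean roughly this" — accelerating both, where the data lives, is the stated goal of the two libraries.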

Cloud, partners and confidential computing

Huang reviewed deep integrations with cloud providers and system OEMs, naming Google Cloud (Vertex AI, BigQuery), AWS (SageMaker, Bedrock) and Microsoft Azure, among others. He framed NVIDIA’s role as enabling customers to land workloads on clouds with accelerated libraries, and he underlined confidential computing as a critical capability. "Even the operator cannot see your data. Even the operator cannot touch or see your models," he said, positioning NVIDIA GPUs as supporting protected model deployment across clouds and regions.

AI factories, tokens per watt, and the economics of throughput

A central section of Huang’s keynote introduced the token-factory model for data centers. He explained that AI factories are power‑constrained and that maximizing tokens-per-watt determines economic throughput. He used charts to map token throughput (intelligence / context length) against interactivity (token rate) and argued that new architectures push every tier upward. He repeatedly used the phrase "this is your token factory" and urged CEOs and operators to measure their data centers by throughput at fixed power.
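Huang's fixed-power framing reduces to simple arithmetic: at a fixed facility power budget, aggregate throughput is tokens-per-watt times watts, so efficiency gains translate directly into revenue ceiling. A back-of-the-envelope sketch (all numbers hypothetical, not from the keynote):

```python
# At a fixed power budget, throughput = efficiency (tokens/s per watt) * watts.
POWER_BUDGET_W = 100e6          # hypothetical 100 MW AI factory

def factory_throughput(tokens_per_sec_per_watt: float, watts: float) -> float:
    """Aggregate token throughput achievable inside a fixed power envelope."""
    return tokens_per_sec_per_watt * watts

gen_a = factory_throughput(0.5, POWER_BUDGET_W)   # older architecture
gen_b = factory_throughput(2.0, POWER_BUDGET_W)   # 4x better tokens-per-watt

# Same building, same power bill: 4x the tokens, hence ~4x the revenue ceiling.
```

This is why the keynote measures architectures in tokens-per-watt rather than raw FLOPS: when the wall is power, efficiency is the only lever that moves total output.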

Systems and accelerators: Grace Blackwell, Vera Rubin, Grock and Dynamo

Huang summarized the system roadmap that he said addresses pretraining, post‑training and inference at scale. He described several generation steps: Grace Blackwell (NVLink72, NVFP4) as a leap in tokens-per-watt and inference efficiency; Vera Rubin as a vertically integrated, liquid‑cooled rack system for agentic AI with large NVLink all‑to‑all bandwidth; and Grock (LP30/LPX) as a deterministic, dataflow inference chip with massive on‑chip SRAM designed for ultra‑low‑latency token generation.

To combine very different processor strengths, Huang introduced Dynamo — a software orchestration approach that disaggregates inference so Vera Rubin handles the heavy math while Grock handles low‑latency token decode and generation. He said the combined stack and code optimizations produce throughput gains of many dozens of times and substantially lower token cost.
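The disaggregation idea — one processor class for compute-heavy prompt prefill, another for latency-sensitive token decode — can be sketched as a toy latency model (illustrative rates only, not the actual Dynamo API or real hardware figures):

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int
    output_tokens: int

# Hypothetical per-token service rates for the two processor classes.
PREFILL_TOKENS_PER_SEC = 50_000   # throughput-optimized tier (heavy matrix math)
DECODE_TOKENS_PER_SEC = 2_000     # latency-optimized tier (fast token generation)

def disaggregated_latency(req: Request) -> float:
    """Prefill runs on the throughput tier, decode on the low-latency tier."""
    prefill = req.prompt_tokens / PREFILL_TOKENS_PER_SEC
    decode = req.output_tokens / DECODE_TOKENS_PER_SEC
    return prefill + decode

def monolithic_latency(req: Request, tokens_per_sec: float = 5_000) -> float:
    """Both phases run on one processor class sized as a compromise."""
    return (req.prompt_tokens + req.output_tokens) / tokens_per_sec

req = Request(prompt_tokens=100_000, output_tokens=500)
```

Under these assumed rates, the long-context request finishes in 2.25 s when each phase runs on the tier built for it, versus roughly 20 s on the compromise processor — the kind of gap that motivates splitting the pipeline in the first place.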

Extreme code design and backward compatibility

Huang emphasized NVIDIA’s software investment and extreme kernel optimization as the reason behind many performance gains. He reiterated that new systems remain architecturally compatible and backward compatible with prior generations: NVIDIA will continue to support existing CUDA assets while enabling new tiers of performance and new price points.

Omniverse, DSX and the digital twin for AI factory design & operations

To address complexity in building gigawatt AI factories, Huang introduced NVIDIA DSX — an Omniverse‑based digital twin and operational platform. DSX includes SIM for physical/electrical/thermal simulation, DSX Exchange for operational data and Max Q for dynamic power orchestration. Huang portrayed DSX as the blueprint for designing and operating AI factories to maximize token throughput, resilience and energy efficiency.

OpenClaw, Nemo / enterprise-safe agents, and the open-agent moment

Huang highlighted OpenClaw — an open-source agent operating system — as a turning point for agentic systems. He described how OpenClaw can spawn agents, schedule tasks, access tools and handle IO across modalities. Because agents can access sensitive data and execute code, Huang said NVIDIA worked with the OpenClaw project to build enterprise‑ready, policy‑driven references (Nemo / NemoClaw) with guardrails, privacy routers and policy engines so companies can safely deploy agents inside corporate networks. He framed this moment as comparable to HTML, Linux and Kubernetes in enabling a new category of applications.
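The guardrail pattern Huang describes — a policy layer that sits between an agent and the tools it may invoke — looks roughly like this in miniature (a generic sketch; NeMo/NemoClaw internals were not shown, and the tool names below are hypothetical):

```python
# A minimal policy gate between an agent and its tools: every tool call is
# checked against a declarative allow-list before it executes.
POLICY = {
    "search_docs": {"allowed": True},
    "run_shell":   {"allowed": False},            # code execution blocked
    "read_file":   {"allowed": True, "paths": ("/srv/public",)},
}

class PolicyViolation(Exception):
    pass

def gated_call(tool: str, **kwargs):
    """Execute a tool only if the policy permits it (and its arguments)."""
    rule = POLICY.get(tool)
    if rule is None or not rule["allowed"]:
        raise PolicyViolation(f"tool {tool!r} is not permitted")
    paths = rule.get("paths")
    if paths and not str(kwargs.get("path", "")).startswith(paths):
        raise PolicyViolation(f"path {kwargs.get('path')!r} outside allowed roots")
    return TOOLS[tool](**kwargs)

TOOLS = {
    "search_docs": lambda query: f"results for {query!r}",
    "read_file":   lambda path: f"contents of {path}",
}
```

The point of the pattern is that the agent never holds the tools directly: every capability passes through a policy engine that the enterprise, not the model, controls.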

Open models and the Nemotron coalition

Huang announced a family of open frontier models and an initiative to support domain‑specific foundation models: Nemotron (language/reasoning), Cosmos (physical/world models), Alpamayo (autonomous vehicle reasoning), Groot (robotics), BioNeMo (biology/chemistry) and Earth-2 (weather/climate). He described an open‑model coalition that includes research and tooling partners, and argued these open families will let regions and industries fine‑tune sovereign, domain‑specialized models.

Physical AI and robotics: simulation, world models and partners

Concluding the technical tour, Huang focused on physical AI. He reiterated NVIDIA’s three robotics computers — training, synthetic-data/simulation and on‑robot inference — and emphasized simulation (Isaac Lab, Newton, Cosmos) and foundation models (Groot) for closing the data gap in robotics. He announced automotive and robotics partnerships and named OEMs and mobility partners joining NVIDIA’s "robo taxi ready" platform. He closed with demonstrations showing simulated and physically embodied agents and characters trained in Omniverse.

Selected direct quotations from the keynote

This is how intelligence is made.

Tokens are harnessing a new wave of clean energy and unlocking the secrets of the stars.

The install base is what attracts developers who then create new algorithms.

Even the operator cannot see your data. Even the operator cannot touch or see your models.

We combine 3D graphics, structured data with generative AI, probabilistic computing.

References and further viewing

For readers who wish to review the full remarks, NVIDIA and major news outlets have published video and transcript resources on their event pages.

Explore more exclusive insights at nextfin.ai.

Insights

What is the vision behind AI factories in the computing landscape?

How does the token economy influence data utilization across industries?

What role does CUDA play in NVIDIA's ecosystem and its longevity?

What are the key technologies driving growth in the AI market as discussed in GTC 2026?

What recent collaborations has NVIDIA established with cloud providers?

How does NVIDIA's architecture support backward compatibility?

What challenges do AI factories face concerning power consumption?

How does the new graphics technology "neural rendering" differ from traditional methods?

What are the implications of using OpenClaw for enterprise applications?

What historical context led to the development of NVIDIA's AI factory concept?

How do NVIDIA's new libraries QDF and QVS address data processing challenges?

What industry trends are shaping the future of AI manufacturing?

What are the potential long-term impacts of token economics in data centers?

How does NVIDIA's approach to confidential computing enhance data security?

What comparisons can be made between NVIDIA's AI factory and traditional data centers?

What specific advancements does the Grace Blackwell system bring to AI efficiency?

How does NVIDIA envision the integration of robotics in future AI applications?

What are the key features of NVIDIA DSX for AI factory design and operations?

What are the core difficulties facing developers in the AI factory ecosystem?

How does NVIDIA's Nemotron coalition aim to support domain-specific AI models?
