NVIDIA has announced the release of Nemotron 3 Super, an open model built on a hybrid Mamba-Attention MoE architecture that delivers up to 5x higher throughput for agentic AI. The model is designed to power complex agentic AI systems at scale.

[NVDA](/stocks/NVDA) released Nemotron 3 Super on March 11, a 120-billion-parameter open model with 12 billion active parameters designed to run complex agentic AI systems at scale. The model uses a hybrid Mamba-Attention mixture-of-experts (MoE) architecture that delivers up to 5x higher throughput and up to 2x higher accuracy than its predecessor, with a 1-million-token context window enabling agents to retain full workflow state in memory.
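As a rough illustration of what the sparse MoE design implies, the share of parameters that is active can be worked out from the two figures above. This is a minimal sketch: the names are illustrative, the counts are treated as exact round numbers, and "active" is read in the usual MoE sense of parameters used per token, which the announcement does not spell out.

```python
# Figures from the announcement, treated as exact round numbers (an assumption).
TOTAL_PARAMS = 120e9   # total parameters
ACTIVE_PARAMS = 12e9   # active parameters (assumed to mean "used per token")


def active_fraction(active: float, total: float) -> float:
    """Return the share of a sparse MoE model's parameters that is active."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share: {fraction:.0%}")  # prints "Active share: 10%"
```

On these figures, roughly one tenth of the weights take part in any given forward step, which is the usual reason an MoE model can offer higher throughput than a dense model of the same total size.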

Nemotron 3 Super achieves up to 2.2x and 7.5x higher inference throughput than GPT-OSS-120B and Qwen3.5-122B, respectively, and currently holds the No. 1 position on the DeepResearch Bench, a benchmark measuring multi-step research across large document sets. The open release underscores NVIDIA's strategy of building developer ecosystems around its hardware and software stack.

The model is available across multiple platforms including build.nvidia.com, Perplexity, OpenRouter, and Hugging Face, with enterprise access via Google Cloud's Vertex AI and Oracle Cloud Infrastructure. Availability on Amazon Bedrock and Microsoft Azure is expected soon, broadening access for enterprise AI deployments.

NVIDIA Releases Nemotron 3 Super AI Model for Advanced Agentic AI