NVIDIA Unveils Nemotron 3 Open-Source AI Models to Revolutionize Agent Development


    NVIDIA has unveiled the Nemotron 3 series of open-source AI models, along with supporting data and tools, aimed at simplifying the creation of reliable and efficient AI agents for a range of industries. Available in Nano, Super, and Ultra variants, these models feature a novel hybrid latent mixture-of-experts design that boosts performance for multi-agent setups, where multiple AI components work together on intricate tasks.

    As companies move beyond basic chatbots toward interconnected AI networks, developers grapple with issues like excessive data exchange between agents, loss of contextual focus over time, and steep computational expenses. The Nemotron 3 lineup tackles these head-on by prioritizing openness, speed, and adaptability, allowing teams to craft tailored AI solutions they can fully understand and trust.

    Open-source approaches drive the evolution of artificial intelligence, according to Jensen Huang, NVIDIA’s founder and CEO. He emphasized that Nemotron turns cutting-edge AI into an accessible foundation, equipping developers with the clarity and optimization required for large-scale agent-based systems.

    This release aligns with NVIDIA’s push for sovereign AI, where nations customize models to fit local data standards and cultural priorities. Partners in regions including Europe and South Korea are already using these transparent models to develop compliant AI infrastructure.

    Early adopters, including Accenture, Cadence, CrowdStrike, Deloitte, and Zoom, plan to embed Nemotron models into workflows spanning manufacturing, security, coding, and beyond. ServiceNow’s chairman and CEO, Bill McDermott, highlighted the partnership’s role in advancing AI automation, noting that combining the ServiceNow platform with Nemotron 3 sets new benchmarks in productivity and precision.

    In practice, these open models complement closed systems by handling routine tasks affordably while reserving premium models for demanding reasoning. This hybrid strategy sharpens overall efficiency, particularly in managing token usage, the core metric for AI processing costs.

    Perplexity’s CEO, Aravind Srinivas, praised the flexibility, explaining how their routing system assigns jobs to Nemotron 3 Ultra for optimized open-model performance or to proprietary alternatives for specialized needs, resulting in faster and more scalable AI tools.
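
    The routing idea described above can be sketched in a few lines. This is only an illustration of the general pattern, not Perplexity's or NVIDIA's implementation: the `Task` fields, the `needs_deep_reasoning` flag, and the per-million-token prices are all hypothetical placeholders chosen to make the arithmetic visible.

```python
from dataclasses import dataclass

# Hypothetical prices in dollars per million tokens, for illustration only.
PRICE_PER_MILLION_TOKENS = {"open": 0.50, "proprietary": 5.00}


@dataclass(frozen=True)
class Task:
    prompt: str
    needs_deep_reasoning: bool  # hypothetical flag; a real router would infer this
    expected_tokens: int


def route(task: Task) -> str:
    """Send routine work to the open model and demanding work to the proprietary one."""
    return "proprietary" if task.needs_deep_reasoning else "open"


def estimated_cost(tasks) -> float:
    """Total token cost in dollars under the placeholder prices above."""
    return sum(
        task.expected_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS[route(task)]
        for task in tasks
    )


if __name__ == "__main__":
    workload = [
        Task("Summarize this support ticket", False, 2_000_000),
        Task("Plan a multi-step data migration", True, 1_000_000),
    ]
    for task in workload:
        print(f"{task.prompt!r} -> {route(task)}")
    print(f"Estimated cost: ${estimated_cost(workload):.2f}")
```

    Under these made-up prices, moving the routine task to the open model is what keeps the total low; the same workload sent entirely to the proprietary tier would cost several times more.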

    For emerging companies, Nemotron 3 accelerates prototyping and scaling of AI collaborators that enhance human efforts. Venture firms like General Catalyst and Mayfield see it as a boon for their startups, with Mayfield’s managing partner Navin Chaddha pointing out how NVIDIA’s resources provide affordable paths to innovation and broad deployment.

    The Nemotron 3 models, built on mixture-of-experts technology, offer tiered options: the compact Nano version packs 30 billion parameters but engages only up to 3 billion per operation for quick, focused jobs; Super scales to around 100 billion total with 10 billion active, ideal for coordinated agent teams; and Ultra reaches approximately 500 billion parameters with 50 billion active, suited for in-depth analysis and planning.
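
    To make the tiering concrete, the short sketch below computes what share of each model's parameters is active at a time, using the figures quoted above. The Super and Ultra totals are approximate and the active counts are upper bounds, so the results are rough ratios rather than exact specifications; the `MoEModel` class is a hypothetical helper for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoEModel:
    name: str
    total_params_b: float   # total parameters, in billions (as quoted in the article)
    active_params_b: float  # parameters active at a time, in billions (upper bound)

    @property
    def active_fraction(self) -> float:
        return self.active_params_b / self.total_params_b


NEMOTRON_3 = [
    MoEModel("Nano", 30, 3),
    MoEModel("Super", 100, 10),
    MoEModel("Ultra", 500, 50),
]


def summarize(models):
    """Map each model name to its active-parameter fraction, rounded to 3 places."""
    return {m.name: round(m.active_fraction, 3) for m in models}


if __name__ == "__main__":
    for name, fraction in summarize(NEMOTRON_3).items():
        print(f"{name}: about {fraction:.0%} of parameters active")
```

    On these numbers, each tier activates roughly a tenth of its parameters at a time, which is why a mixture-of-experts model can carry a large total parameter count while keeping per-token compute closer to that of a much smaller dense model.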

    Right now, the Nano model stands out for its low-cost operation in areas like code troubleshooting, text condensing, and search tasks. Its architecture yields four times the processing speed of its predecessor and cuts reasoning time by up to 60 percent, while supporting a 1-million-token context window for sustained accuracy in long, multistep tasks.

    Independent evaluators at Artificial Analysis have ranked Nemotron 3 Nano among the most efficient and accurate open models of its size. The larger Super and Ultra variants are trained with NVIDIA’s 4-bit NVFP4 format on the Blackwell architecture, which cuts memory requirements and speeds up training, allowing larger models to be trained on existing infrastructure without sacrificing quality.
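
    A back-of-the-envelope calculation shows why 4-bit precision matters at this scale. The sketch below estimates storage for model weights alone at different bit widths. It is a simplification: real 4-bit formats also store scaling factors, and training requires additional memory for activations, gradients, and optimizer state, none of which is counted here. The function name is a hypothetical helper.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate storage for weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    bytes_total = params_billions * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    # Approximate total parameter counts quoted in the article, in billions.
    for name, params_b in [("Super", 100), ("Ultra", 500)]:
        for bits in (16, 4):
            gb = weight_memory_gb(params_b, bits)
            print(f"{name} at {bits}-bit: about {gb:.0f} GB of weights")
```

    Going from 16-bit to 4-bit weights cuts this figure by a factor of four, for example from roughly 1,000 GB to roughly 250 GB for a model of about 500 billion parameters.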

    Complementing the models, NVIDIA has released three trillion tokens’ worth of datasets for pretraining, post-training, and reinforcement learning, including a safety-focused collection for agent evaluation. New libraries like NeMo Gym and NeMo RL offer ready environments and validation tools, all freely accessible on GitHub and Hugging Face to speed up custom AI agent building.

    Compatibility extends to tools such as LM Studio and vLLM, with integrations from Prime Intellect and Unsloth streamlining reinforcement learning training. Nemotron 3 Nano is available now on Hugging Face and through inference providers such as Baseten, Fireworks, and Together AI. It also integrates with enterprise platforms from Couchbase to UiPath, and is coming soon to cloud services including Amazon Bedrock on AWS, Google Cloud, and others. It is also offered as an NVIDIA NIM microservice for secure deployment on NVIDIA hardware. The Super and Ultra models are expected in early 2026.

