Nemotron Nano 9B V2

Name: Nemotron Nano 9B V2
Brand: NVIDIA
Price: 0.04 USD

by NVIDIA

NVIDIA-Nemotron-Nano-9B-v2 is a large language model (LLM) trained from scratch by NVIDIA, and designed as a unified model for both reasoning and non-reasoning tasks. It responds to user queries and tasks by first generating a reasoning trace and then concluding with a final response. The model's reasoning capabilities can be controlled via a system prompt. If the user prefers the model to provide its final answer without intermediate reasoning traces, it can be configured to do so.

Chat with Nemotron Nano 9B V2

Input Price$0.04/1M tokens

Output Price$0.16/1M tokens

Intelligence14.8

Coding8.3

Specifications

Technical details and pricing.

ProviderNVIDIA

Context Window131,072 tokens

Release DateAug 18, 2025

ModalitiesText

Benchmarks

10 benchmark scores from Artificial Analysis.

GPQA57.0%

MMLU Pro74.2%

HLE4.6%

LiveCodeBench72.4%

AIME 202569.7%

SciCode22.0%

LCR21.0%

IFBench27.6%

Tau221.9%

TerminalBench Hard1.5%

Composite Indices

Intelligence, Coding, Math

Standard Benchmarks

Academic and industry benchmarks

Frequently Asked Questions

What is Nemotron Nano 9B V2 good for?

Use Nemotron Nano 9B V2 for everyday tasks like writing, summarizing, brainstorming, and getting clear explanations.

How much does Nemotron Nano 9B V2 cost?

Pricing is based on usage. Current rates are $0.04/1M tokens for input and $0.16/1M tokens for output.

Can I try Nemotron Nano 9B V2 for free?

Yes. You can start a chat instantly and test the model before deciding on a plan.

Does Nemotron Nano 9B V2 support images or audio?

Nemotron Nano 9B V2 focuses on text-based tasks.

Similar Models

Other models you might want to explore.

Nemotron Nano 12B 2 VL (free)

NVIDIA

NVIDIA Nemotron Nano 2 VL is a 12-billion-parameter open multimodal reasoning model designed for video understanding and document intelligence.

Details →

Nemotron 3 Nano 30B A3B (free)

NVIDIA

NVIDIA Nemotron 3 Nano 30B A3B is a small language MoE model with highest compute efficiency and accuracy for developers to build specialized agentic AI systems.

Details →

Llama 3.1 Nemotron 70B Instruct

NVIDIA

NVIDIA's Llama 3.1 Nemotron 70B is a language model designed for generating precise and useful responses.

Details →

Benchmarks and pricing are sourced from Artificial Analysis where available. OpenRouter specs are used as a fallback.