Mistral Models
Mistral logo

Devstral Small 1.1

by Mistral

Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents, developed by Mistral AI in collaboration with All Hands AI. Finetuned from Mistral Small 3.1 and released under the Apache 2.0 license, it features a 128k token context window and supports both Mistral-style function calling and XML output formats. Designed for agentic coding workflows, Devstral Small 1.1 is optimized for tasks such as codebase exploration, multi-file edits, and integration into autonomous development agents like OpenHands and Cline. It achieves 53.6% on SWE-Bench Verified, surpassing all other open models on this benchmark, while remaining lightweight enough to run on a single 4090 GPU or Apple silicon machine. The model uses a Tekken tokenizer with a 131k vocabulary and is deployable via vLLM, Transformers, Ollama, LM Studio, and other OpenAI-compatible runtimes.

Chat with Devstral Small 1.1
Input Price$0.10/1M tokens
Output Price$0.30/1M tokens
Intelligence15.2
Coding12.1

Specifications

Technical details and pricing.

ProviderMistral
Context Window131,072 tokens
Release DateJul 10, 2025
ModalitiesText

Benchmarks

12 benchmark scores from Artificial Analysis.

GPQA41.4%
MMLU Pro62.2%
HLE3.7%
LiveCodeBench25.4%
MATH 50063.5%
AIME 202529.3%
AIME0.3%
SciCode24.3%
LCR17.0%
IFBench34.6%
Tau228.4%
TerminalBench Hard6.1%

Composite Indices

Intelligence, Coding, Math

Standard Benchmarks

Academic and industry benchmarks

Frequently Asked Questions

What is Devstral Small 1.1 good for?

Use Devstral Small 1.1 for everyday tasks like writing, summarizing, brainstorming, and getting clear explanations.

How much does Devstral Small 1.1 cost?

Pricing is based on usage. Current rates are $0.10/1M tokens for input and $0.30/1M tokens for output.

Can I try Devstral Small 1.1 for free?

Yes. You can start a chat instantly and test the model before deciding on a plan.

Does Devstral Small 1.1 support images or audio?

Devstral Small 1.1 focuses on text-based tasks.

Benchmarks and pricing are sourced from Artificial Analysis where available. OpenRouter specs are used as a fallback.