Mistral Devstral 2: SWE-bench 72.2%, 7× Cheaper Than Sonnet

Mistral AI unveils two open-source code agent models: Devstral 2 (123B) and Devstral Small 2 (24B), achieving 72.2% on SWE-bench Verified with 7× better cost efficiency than Claude Sonnet. Plus Mistral Vibe CLI for terminal-native coding.

1. Europe’s AI Makes Its Move

On June 16, French AI lab Mistral released Devstral 2—two new code agent models—alongside Mistral Vibe, an open-source CLI coding tool. This marks Mistral’s most significant product launch since pivoting to its “specialization beats scaling” strategy.

Devstral 2 achieves 72.2% on SWE-bench Verified (the industry gold standard for coding agent benchmarks) while delivering 7× better cost efficiency than Claude Sonnet. Crucially, both variants are released under the Apache 2.0 license—free for commercial use, modification, and self-deployment.

2. Model Lineup & Pricing

Category	Devstral 2 (123B)	Devstral Small 2 (24B)
Parameters	123B	24B
Context Window	256K	256K
Input Price	$0.40 / 1M tokens	$0.10 / 1M tokens
Output Price	$2.00 / 1M tokens	$0.30 / 1M tokens
SWE-bench Verified	72.2%	TBA
Deployment	Enterprise GPU	Consumer GPU ready
License	Apache 2.0	Apache 2.0
API Access	Mistral La Plateforme	Mistral La Plateforme

3. What SWE-bench 72.2% Really Means

SWE-bench Verified measures a coding agent’s ability to autonomously locate bugs from real GitHub issues, generate fixes, and pass the project’s test suite—a far cry from simple code completion benchmarks.

Competitive Landscape

Model	SWE-bench Verified	Input Price/M tokens	Architecture
Devstral 2 (123B)	72.2%	$0.40	Mixture-of-Experts
Claude Sonnet 4.7	~75%	$3.00	Closed-source
GPT-5.5-mini	~70%	$2.00	Closed-source
DeepSeek Coder V4	~68%	$0.50	MoE
Gemini 3.5 Flash	~72%	$1.50	Closed-source

Key insight: Devstral 2 trails Claude Sonnet 4.7 by only ~3 percentage points on SWE-bench, while costing 1/7 to 1/8 as much. For teams running coding agents at scale, the economics are overwhelmingly compelling.

4. Mistral Vibe CLI: Terminal-Native Coding Agent

Mistral Vibe is the open-source CLI companion released alongside Devstral 2. It runs entirely in the terminal and supports:

Multi-file code generation: describe what you need, get a complete project structure
Bug fixing: automatically submit fix PRs for GitHub issues
Refactoring & translation: batch code refactoring, language translation
Local/cloud dual mode: use local models (Devstral Small 2) or the cloud API

Quick Start

# Install
pip install mistral-vibe

# Initialize
cd my-project && vibe init

# Start coding
vibe "Add unit tests for the src/auth module, covering all edge cases"

Under the hood, Mistral Vibe uses an iterative explore-fix-verify agent loop, automatically running your project’s test suite after each modification to validate correctness—echoing the design philosophy of tools like Devin and OpenAI Codex CLI.

5. The NixAPI Perspective: API Aggregation

Devstral 2’s pricing is aggressive, but multi-model routing remains the optimal strategy. Through NixAPI’s model aggregation:

Smart routing: complex tasks → Devstral 2, simple tasks → Devstral Small 2, with automatic fallback
Cost optimization: automatically select the most economical model per task difficulty
Multi-vendor resilience: Mistral API + OpenRouter + self-hosted nodes, triple redundancy

6. Conclusion: A New Era for Open-Source Code Agents

The second weekend of June 2026 brought the developer community two gifts:

GLM-5.2 (June 13)—an open-source reasoning model surpassing Fable 5 on BridgeBench, rendering export controls irrelevant within 48 hours
Devstral 2 (June 16)—an Apache 2.0 open-source code agent hitting 72.2% on SWE-bench at 1/7th the cost of proprietary alternatives

This isn’t a coincidence. Open-source AI has achieved real competitiveness in the coding agent domain. For teams prioritizing cost control and technical sovereignty, the time to switch ecosystems is now.

Sources: Mistral AI Blog, SWE-bench Verified

Mistral Devstral 2: Europe's Strongest Open-Source Code Agent Hits SWE-bench 72.2% at 1/7th Sonnet's Cost