Mistral Devstral 2: Europe's Strongest Open-Source Code Agent Hits SWE-bench 72.2% at 1/7th Sonnet's Cost
Mistral AI unveils two open-source code agent models: Devstral 2 (123B) and Devstral Small 2 (24B), achieving 72.2% on SWE-bench Verified with 7× better cost efficiency than Claude Sonnet. Plus Mistral Vibe CLI for terminal-native coding.
1. Europe’s AI Makes Its Move
On June 16, 2026, French AI lab Mistral released the Devstral 2 family, two new code agent models, alongside Mistral Vibe, an open-source CLI coding tool. It is Mistral’s most significant product launch since the company pivoted to its “specialization beats scaling” strategy.
The 123B Devstral 2 scores 72.2% on SWE-bench Verified (the industry gold standard for coding agent benchmarks) while delivering 7× better cost efficiency than Claude Sonnet. Crucially, both variants are released under the Apache 2.0 license, which allows commercial use, modification, and self-deployment at no charge.
2. Model Lineup & Pricing
| Category | Devstral 2 (123B) | Devstral Small 2 (24B) |
|---|---|---|
| Parameters | 123B | 24B |
| Context Window | 256K | 256K |
| Input Price | $0.40 / 1M tokens | $0.10 / 1M tokens |
| Output Price | $2.00 / 1M tokens | $0.30 / 1M tokens |
| SWE-bench Verified | 72.2% | TBA |
| Deployment | Enterprise GPU | Consumer GPU ready |
| License | Apache 2.0 | Apache 2.0 |
| API Access | Mistral La Plateforme | Mistral La Plateforme |
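To make the price gap between the two variants concrete, here is a minimal sketch that turns the per-token prices in the table into a bill for a given workload. The dictionary keys are illustrative labels, not official API model identifiers, and the token volumes are a made-up example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# Prices copied from the table above. The keys are illustrative labels only.
PRICING = {
    "devstral-2": Pricing(input_per_million=0.40, output_per_million=2.00),
    "devstral-small-2": Pricing(input_per_million=0.10, output_per_million=0.30),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token volumes."""
    price = PRICING[model]
    total = (
        input_tokens * price.input_per_million
        + output_tokens * price.output_per_million
    )
    return total / 1_000_000


# Hypothetical monthly workload: 50M input tokens and 5M output tokens.
for name in PRICING:
    print(f"{name}: ${estimate_cost(name, 50_000_000, 5_000_000):.2f}")
```

Agent workloads tend to read far more tokens than they write, so the input price usually dominates the bill; adjust the two volumes to match your own traffic.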
3. What SWE-bench 72.2% Really Means
SWE-bench Verified measures a coding agent’s ability to autonomously locate bugs from real GitHub issues, generate fixes, and pass the project’s test suite—a far cry from simple code completion benchmarks.
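The headline score is simply the share of benchmark tasks whose fix passes the tests. As a rough illustration, and assuming the commonly cited size of 500 tasks for SWE-bench Verified (a figure not stated in this article), the arithmetic looks like this:

```python
def resolve_rate(resolved: int, total: int) -> float:
    """Percentage of tasks whose generated fix passes the project's tests."""
    return round(resolved / total * 100, 1)


def resolved_count(rate_percent: float, total: int) -> int:
    """Approximate number of resolved tasks implied by a reported percentage."""
    return round(rate_percent / 100 * total)


# Assumption: SWE-bench Verified contains 500 tasks.
TOTAL_TASKS = 500

print(resolved_count(72.2, TOTAL_TASKS))  # tasks implied by a 72.2% score
print(resolve_rate(361, TOTAL_TASKS))     # and back to a percentage
```

Under that assumption, a gap of a few percentage points between two models corresponds to only a dozen or so tasks, which is worth keeping in mind when reading the comparison table.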
Competitive Landscape
| Model | SWE-bench Verified | Input Price (per 1M tokens) | Architecture / Availability |
|---|---|---|---|
| Devstral 2 (123B) | 72.2% | $0.40 | Mixture-of-Experts |
| Claude Sonnet 4.7 | ~75% | $3.00 | Closed-source |
| GPT-5.5-mini | ~70% | $2.00 | Closed-source |
| DeepSeek Coder V4 | ~68% | $0.50 | MoE |
| Gemini 3.5 Flash | ~72% | $1.50 | Closed-source |
Key insight: Devstral 2 trails Claude Sonnet 4.7 by only ~3 percentage points on SWE-bench Verified, while its input tokens cost between 1/7 and 1/8 as much ($0.40 versus $3.00 per million). For teams running coding agents at scale, the economics are compelling.
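The ratio is easy to check against the comparison table. The sketch below uses only the input prices listed there (output prices for the other models are not given in this article, so they are left out) and reports how many times more expensive each model is than Devstral 2.

```python
# Input prices in USD per 1M tokens, copied from the comparison table.
INPUT_PRICE = {
    "Devstral 2 (123B)": 0.40,
    "Claude Sonnet 4.7": 3.00,
    "GPT-5.5-mini": 2.00,
    "DeepSeek Coder V4": 0.50,
    "Gemini 3.5 Flash": 1.50,
}


def price_ratio(model: str, baseline: str = "Devstral 2 (123B)") -> float:
    """How many times more expensive `model` is than `baseline` on input tokens."""
    return INPUT_PRICE[model] / INPUT_PRICE[baseline]


for model in INPUT_PRICE:
    print(f"{model}: {price_ratio(model):.2f}x the Devstral 2 input price")
```

This compares list prices for input tokens only. Real bills also depend on output volume, caching discounts, and how many attempts each model needs per task.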
4. Mistral Vibe CLI: Terminal-Native Coding Agent
Mistral Vibe is the open-source CLI companion released alongside Devstral 2. It runs entirely in the terminal and supports:
- Multi-file code generation: describe what you need, get a complete project structure
- Bug fixing: automatically submit fix PRs for GitHub issues
- Refactoring & translation: batch code refactoring, language translation
- Local/cloud dual mode: use local models (Devstral Small 2) or the cloud API
Quick Start
```shell
# Install
pip install mistral-vibe

# Initialize
cd my-project && vibe init

# Start coding
vibe "Add unit tests for the src/auth module, covering all edge cases"
```
Under the hood, Mistral Vibe uses an iterative explore-fix-verify agent loop, automatically running your project’s test suite after each modification to validate correctness—echoing the design philosophy of tools like Devin and OpenAI Codex CLI.
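For readers who have not built one, an explore-fix-verify loop can be sketched in a few lines. This is a generic illustration of the pattern, not Mistral Vibe's actual implementation: the three callables (`explore`, `propose_fix`, `run_tests`) are hypothetical stand-ins for the model calls and the test runner.

```python
from typing import Callable, Optional, Tuple


def explore_fix_verify(
    task: str,
    explore: Callable[[str], str],
    propose_fix: Callable[[str, str, str], str],
    run_tests: Callable[[str], Tuple[bool, str]],
    max_iterations: int = 5,
) -> Optional[str]:
    """Return the first patch whose test run passes, or None if none does.

    explore(task)                         -> context gathered from the codebase
    propose_fix(task, context, feedback)  -> a candidate patch
    run_tests(patch)                      -> (passed, test log)
    """
    context = explore(task)  # explore: collect relevant files once up front
    feedback = ""            # no test output exists before the first attempt
    for _ in range(max_iterations):
        patch = propose_fix(task, context, feedback)  # fix: draft a change
        passed, log = run_tests(patch)                # verify: run the suite
        if passed:
            return patch
        feedback = log  # feed the failure log into the next attempt
    return None
```

The iteration cap matters in practice: without it, an agent that keeps producing failing patches would burn tokens indefinitely.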
5. The NixAPI Perspective: API Aggregation
Devstral 2’s pricing is aggressive, but routing across multiple models is still the more economical and resilient setup. NixAPI’s model aggregation provides:
- Smart routing: complex tasks → Devstral 2, simple tasks → Devstral Small 2, with automatic fallback
- Cost optimization: automatically select the most economical model per task difficulty
- Multi-vendor resilience: Mistral API + OpenRouter + self-hosted nodes, triple redundancy
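The routing and fallback behavior described above can be sketched as a small function. This is an illustration of the idea only, not NixAPI's API: the route table, the provider names, and the `call_backend` callable are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical route table: task difficulty -> models in order of preference.
ROUTES: Dict[str, List[str]] = {
    "complex": ["devstral-2", "devstral-small-2"],
    "simple": ["devstral-small-2", "devstral-2"],
}

# Hypothetical providers, tried in order for each model.
PROVIDERS: List[str] = ["mistral-api", "openrouter", "self-hosted"]


class AllBackendsFailed(RuntimeError):
    """Raised when every provider/model combination has failed."""


def route_request(
    prompt: str,
    difficulty: str,
    call_backend: Callable[[str, str, str], str],
    routes: Dict[str, List[str]] = ROUTES,
    providers: Sequence[str] = PROVIDERS,
) -> str:
    """Try the preferred model on each provider, then fall back to the next model."""
    errors = []
    for model in routes[difficulty]:
        for provider in providers:
            try:
                return call_backend(provider, model, prompt)
            except Exception as exc:  # any backend failure triggers fallback
                errors.append(f"{provider}/{model}: {exc}")
    raise AllBackendsFailed("; ".join(errors))
```

The loop order is a design choice: it exhausts every provider for the preferred model before downgrading to the next model, which favors answer quality over latency. Swapping the two loops would favor staying on the primary provider instead.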
6. Conclusion: A New Era for Open-Source Code Agents
Mid-June 2026 brought the developer community two gifts:
- GLM-5.2 (June 13)—an open-source reasoning model surpassing Fable 5 on BridgeBench, rendering export controls irrelevant within 48 hours
- Devstral 2 (June 16)—an Apache 2.0 open-source code agent hitting 72.2% on SWE-bench at 1/7th the cost of proprietary alternatives
This isn’t a coincidence. Open-source AI has achieved real competitiveness in the coding agent domain. For teams prioritizing cost control and technical sovereignty, the time to switch ecosystems is now.
Sources: Mistral AI Blog, SWE-bench Verified