Mistral Devstral 2: Europe's Strongest Open-Source Code Agent Hits SWE-bench 72.2% at 1/7th Sonnet's Cost

Mistral AI unveils two open-source code agent models, Devstral 2 (123B) and Devstral Small 2 (24B). Devstral 2 scores 72.2% on SWE-bench Verified at roughly one-seventh of Claude Sonnet's input-token price. Also launching: Mistral Vibe CLI for terminal-native coding.

NixAPI Team, June 16, 2026

1. Europe’s AI Makes Its Move

On June 16, French AI lab Mistral released Devstral 2—two new code agent models—alongside Mistral Vibe, an open-source CLI coding tool. This marks Mistral’s most significant product launch since pivoting to its “specialization beats scaling” strategy.

Devstral 2 scores 72.2% on SWE-bench Verified, a widely used benchmark for coding agents, at roughly one-seventh of Claude Sonnet’s input-token price ($0.40 versus $3.00 per million tokens, per the comparison in Section 3). Both variants are released under the Apache 2.0 license: free for commercial use, modification, and self-deployment.

2. Model Lineup & Pricing

| Category           | Devstral 2 (123B)     | Devstral Small 2 (24B) |
|--------------------|-----------------------|------------------------|
| Parameters         | 123B                  | 24B                    |
| Context Window     | 256K                  | 256K                   |
| Input Price        | $0.40 / 1M tokens     | $0.10 / 1M tokens      |
| Output Price       | $2.00 / 1M tokens     | $0.30 / 1M tokens      |
| SWE-bench Verified | 72.2%                 | TBA                    |
| Deployment         | Enterprise GPU        | Consumer GPU ready     |
| License            | Apache 2.0            | Apache 2.0             |
| API Access         | Mistral La Plateforme | Mistral La Plateforme  |
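To make the per-token prices concrete, here is a minimal sketch that estimates the cost of a workload from the input and output prices listed above. The model keys and the example token counts are illustrative assumptions, not official API identifiers or measured usage.

```python
# USD per 1M tokens, copied from the pricing table above.
# The dictionary keys are illustrative labels, not official API model IDs.
PRICES = {
    "devstral-2": {"input": 0.40, "output": 2.00},
    "devstral-small-2": {"input": 0.10, "output": 0.30},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical agent session: 2M tokens read, 200K tokens generated.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 2_000_000, 200_000):.2f}")
```

Agent workloads tend to read far more tokens than they write, so the input price usually dominates the bill; the split used here is only an example.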

3. What SWE-bench 72.2% Really Means

SWE-bench Verified measures a coding agent’s ability to autonomously locate bugs from real GitHub issues, generate fixes, and pass the project’s test suite—a far cry from simple code completion benchmarks.
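As a rough illustration of how such a score is tallied: SWE-bench records, for each task, a set of tests that must go from failing to passing and a set that must keep passing, and a task counts as resolved only if all of them pass after the agent's patch. The sketch below is a simplified stand-in for the official evaluation harness; the function names are hypothetical, and the 500-task total assumes the standard Verified split.

```python
def is_resolved(
    test_results: dict[str, bool],
    fail_to_pass: list[str],
    pass_to_pass: list[str],
) -> bool:
    """A task is resolved only if every required test passes after the patch."""
    required = fail_to_pass + pass_to_pass
    # A test missing from the results is treated as a failure.
    return all(test_results.get(name, False) for name in required)


def resolved_rate(outcomes: list[bool]) -> float:
    """Percentage of tasks resolved."""
    return 100.0 * sum(outcomes) / len(outcomes)


# 72.2% of 500 tasks corresponds to 361 resolved tasks.
print(resolved_rate([True] * 361 + [False] * 139))
```

A patch that fixes the reported bug but breaks a previously passing test does not count, which is part of why the benchmark is harder than code completion.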

Competitive Landscape

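| Model             | SWE-bench Verified | Input Price (per 1M tokens) | Architecture / Availability |
|-------------------|--------------------|-----------------------------|-----------------------------|
| Devstral 2 (123B) | 72.2%              | $0.40                       | Mixture-of-Experts          |
| Claude Sonnet 4.7 | ~75%               | $3.00                       | Closed-source               |
| GPT-5.5-mini      | ~70%               | $2.00                       | Closed-source               |
| DeepSeek Coder V4 | ~68%               | $0.50                       | MoE                         |
| Gemini 3.5 Flash  | ~72%               | $1.50                       | Closed-source               |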

Key insight: Devstral 2 trails Claude Sonnet 4.7 by only ~3 percentage points on SWE-bench Verified, while its input tokens cost roughly 1/7 to 1/8 as much ($0.40 versus $3.00 per million). For teams running coding agents at scale, that price gap compounds quickly.
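The cost multiples can be checked directly from the input prices in the comparison table. This is a small sketch covering input prices only, since the table does not list output prices for the other models.

```python
# Input prices in USD per 1M tokens, copied from the comparison table.
INPUT_PRICE = {
    "Devstral 2 (123B)": 0.40,
    "Claude Sonnet 4.7": 3.00,
    "GPT-5.5-mini": 2.00,
    "DeepSeek Coder V4": 0.50,
    "Gemini 3.5 Flash": 1.50,
}


def price_ratio(model: str, baseline: str = "Devstral 2 (123B)") -> float:
    """How many times more expensive `model` is than `baseline` on input tokens."""
    return INPUT_PRICE[model] / INPUT_PRICE[baseline]


for model in INPUT_PRICE:
    print(f"{model}: {price_ratio(model):.2f}x the Devstral 2 input price")
```

On these figures Claude Sonnet 4.7 comes out at 7.5 times the Devstral 2 input price, which is where the "1/7 to 1/8" range comes from.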

4. Mistral Vibe CLI: Terminal-Native Coding Agent

Mistral Vibe is the open-source CLI companion released alongside Devstral 2. It runs entirely in the terminal and supports:

  • Multi-file code generation: describe what you need, get a complete project structure
  • Bug fixing: automatically submit fix PRs for GitHub issues
  • Refactoring & translation: batch code refactoring, language translation
  • Local/cloud dual mode: use local models (Devstral Small 2) or the cloud API

Quick Start

# Install
pip install mistral-vibe

# Initialize
cd my-project && vibe init

# Start coding
vibe "Add unit tests for the src/auth module, covering all edge cases"

Under the hood, Mistral Vibe uses an iterative explore-fix-verify agent loop, automatically running your project’s test suite after each modification to validate correctness—echoing the design philosophy of tools like Devin and OpenAI Codex CLI.
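The general shape of an explore-fix-verify loop can be sketched as follows. This is not Mistral Vibe's actual code: the function names, the callback signatures, and the iteration budget are all assumptions chosen for illustration.

```python
from typing import Callable, Optional


def agent_loop(
    explore: Callable[[str], str],
    propose_fix: Callable[[str, str], str],
    apply_and_test: Callable[[str], tuple[bool, str]],
    task: str,
    max_iterations: int = 5,
) -> Optional[str]:
    """Return the first patch whose test run passes, or None if the budget runs out."""
    context = explore(task)  # explore: gather the relevant code and context once
    feedback = ""
    for _ in range(max_iterations):
        patch = propose_fix(context, feedback)  # fix: draft a candidate patch
        passed, log = apply_and_test(patch)     # verify: run the project's tests
        if passed:
            return patch
        feedback = log  # feed the failure log into the next attempt
    return None
```

In a real tool, `apply_and_test` would typically apply the patch to a working copy and invoke the project's test runner, while `explore` and `propose_fix` would call the model. Capping the number of iterations keeps a stuck agent from looping indefinitely.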

5. The NixAPI Perspective: API Aggregation

Devstral 2’s pricing is aggressive, but multi-model routing remains the optimal strategy. Through NixAPI’s model aggregation:

  • Smart routing: complex tasks → Devstral 2, simple tasks → Devstral Small 2, with automatic fallback
  • Cost optimization: automatically select the most economical model per task difficulty
  • Multi-vendor resilience: Mistral API + OpenRouter + self-hosted nodes, triple redundancy
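The routing and fallback idea in the list above can be sketched in a few lines. This is an illustrative outline, not NixAPI's implementation: the model labels, the difficulty score, the 0.5 threshold, and the `send` callback are all hypothetical.

```python
from typing import Callable, Sequence


def pick_route(difficulty: float, threshold: float = 0.5) -> list[str]:
    """Order models by preference: hard tasks try the larger model first."""
    if difficulty >= threshold:
        return ["devstral-2", "devstral-small-2"]
    return ["devstral-small-2", "devstral-2"]


def call_with_fallback(
    models: Sequence[str],
    send: Callable[[str, str], str],
    prompt: str,
) -> tuple[str, str]:
    """Try each model in order; return (model, reply) from the first that succeeds."""
    errors = []
    for model in models:
        try:
            return model, send(model, prompt)
        except Exception as exc:  # broad catch is acceptable for a sketch
            errors.append(f"{model}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))
```

The same pattern extends to multiple vendors by making each entry in the route a (provider, model) pair. How the difficulty score is produced is left open here; it could be a heuristic on prompt length or a small classifier.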

6. Conclusion: A New Era for Open-Source Code Agents

Mid-June 2026 brought the developer community two gifts in four days:

  1. GLM-5.2 (June 13)—an open-source reasoning model surpassing Fable 5 on BridgeBench, rendering export controls irrelevant within 48 hours
  2. Devstral 2 (June 16)—an Apache 2.0 open-source code agent hitting 72.2% on SWE-bench at 1/7th the cost of proprietary alternatives

This isn’t a coincidence. Open-source AI has achieved real competitiveness in the coding agent domain. For teams prioritizing cost control and technical sovereignty, the time to switch ecosystems is now.


Sources: Mistral AI Blog, SWE-bench Verified
