Cursor Composer 2 Revealed: The Truth Behind the 'Self-Developed' Code Model from Kimi 2.5
Cursor released Composer 2 claiming it was self-developed, but the community questioned whether it is a fine-tune of Kimi 2.5. Cursor's VP admits an open-source base; this article details the RL training, benchmarks, pricing strategy, and developer implications.
March 22, 2026 Update: AI coding company Cursor launched a new model this week called Composer 2, promoted as offering “frontier-level coding intelligence.” However, X user Fynn claimed Composer 2 was “just Kimi 2.5 with additional reinforcement learning.” Cursor’s VP of Developer Education, Lee Robinson, acknowledged, “Yep, Composer 2 started from an open-source base!” while emphasizing that ~3/4 of the compute behind the final model came from Cursor’s own training. Drawing on reports from TechCrunch, 36Kr, and other outlets, this article lays out what happened and what it means for developers.
📢 Event Timeline: From “Self-Developed” to “Open-Source Base”
Timeline
| Date | Event |
|---|---|
| March 20 | Cursor releases Composer 2, claims “self-developed code model” |
| March 20 PM | X user Fynn questions: Composer 2 is “just Kimi 2.5” |
| March 21 | Cursor VP Lee Robinson admits using open-source base |
| March 22 | TechCrunch and others report, controversy spreads |
Core Controversy Points
Cursor’s Official Claims:
- Composer 2 offers “frontier-level coding intelligence”
- Outperforms Claude Opus 4.6
- Drastic price cut (less than half)
Community Questions:
- X user Fynn: Composer 2 is “just Kimi 2.5 plus extra RL”
- Kimi 2.5 is an open-source model from Moonshot AI
- Questions whether Cursor’s model is truly “self-developed”
Cursor’s Response:
“Yep, Composer 2 started from an open-source base! But only ~1/4 of the compute spent on the final model came from the base, the rest is from our training.”
— Lee Robinson, VP of Developer Education at Cursor
🔍 Technical Analysis: How Was Composer 2 Actually Built?
Base Model: Kimi 2.5
Kimi 2.5 is an open-source code model released by Moonshot AI in early 2026.
| Feature | Description |
|---|---|
| Open Source License | Apache 2.0 (allows commercial use and fine-tuning) |
| Parameters | Not disclosed (estimated 30-50B) |
| Training Data | Code + Math + Reasoning mixed data |
| Context Window | 128K tokens |
| Investors | Alibaba, HongShan (formerly Sequoia China) |
Cursor’s Training Method
According to Lee Robinson’s disclosure and 36Kr reporting:
Composer 2 Training Pipeline:

```
1. Base Model (Kimi 2.5)       (~25% of total compute)
        ↓
2. Cursor RL Training          (~75% of total compute)
   - New RL method (details not disclosed)
   - Context summarization ability internalized into the model
   - Long-task memory optimization
        ↓
3. Final Model (Composer 2)
```
Key Technology: RL + Context Summarization
Problem: Traditional methods lose key information in long tasks.
Cursor’s Solution:
- Summarization Importance: Regularly summarize key information in long tasks
- Internalize Summarization: Train summarization ability into the model itself, not relying on external prompts
Results:
- Traditional prompt-based summarization spends thousands of tokens on summary prompts, and the compressed results still average over 5,000 tokens
- By internalizing summarization into the model itself, Composer 2 avoids this prompt overhead and reduces total token consumption
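Cursor has not disclosed its actual mechanism, but the prompt-based approach it reportedly replaces can be sketched in a few lines of JavaScript. The `summarize` stand-in and the 4-characters-per-token estimate below are illustrative assumptions, not Cursor's implementation:

```javascript
// Sketch of traditional prompt-based rolling summarization: once the
// history exceeds a token budget, older messages are replaced by a
// summary produced via an extra model call. Composer 2 reportedly
// internalizes this step into the model instead.

// Rough token estimate: ~4 characters per token (illustrative only).
const estimateTokens = (text) => Math.ceil(text.length / 4);

function compressHistory(messages, summarize, budget) {
  const total = messages.reduce((n, m) => n + estimateTokens(m.content), 0);
  if (total <= budget) return messages;

  // Keep the two most recent messages; summarize everything older.
  const recent = messages.slice(-2);
  const older = messages.slice(0, -2);
  const summary = summarize(older.map((m) => m.content).join('\n'));
  return [{ role: 'system', content: `Summary of earlier steps: ${summary}` }, ...recent];
}

// Usage with a trivial stand-in summarizer:
const fakeSummarize = (text) => text.slice(0, 40) + '...';
const history = [
  { role: 'user', content: 'x'.repeat(400) },
  { role: 'assistant', content: 'y'.repeat(400) },
  { role: 'user', content: 'z'.repeat(400) },
  { role: 'assistant', content: 'latest answer' },
];
const compressed = compressHistory(history, fakeSummarize, 100);
console.log(compressed.length); // 3 (summary + last two messages)
```

In the prompt-based version, every `summarize` call is itself a model round-trip with its own prompt tokens; internalizing the step is what removes that extra cost.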
📊 Performance Benchmarks: Composer 2 vs Competitors
Official Benchmarks
According to Cursor official data:
| Model | SWE-bench | HumanEval | Price (per 1M tokens) |
|---|---|---|---|
| Composer 2 | 68.2% | 91.5% | $0.75 (input) / $3.00 (output) |
| Claude Opus 4.6 | 65.8% | 90.1% | $15.00 (input) / $75.00 (output) |
| GPT-5.4 | 66.5% | 92.3% | $2.50 (input) / $10.00 (output) |
| Kimi 2.5 | 58.3% | 85.2% | Open Source Free |
💡 Key Findings:
- Composer 2 surpasses Claude Opus 4.6 on SWE-bench (+2.4 percentage points)
- Price is roughly 1/20 of Claude Opus 4.6’s
- Compared to the Kimi 2.5 base model, SWE-bench improved by 9.9 percentage points
Real-World Testing: High-Difficulty Software Engineering Tasks
36Kr reported test results on a set of high-difficulty software engineering tasks:
| Task Type | Composer 2 | Claude Opus 4.6 | GPT-5.4 |
|---|---|---|---|
| Code Refactoring | 92% | 88% | 90% |
| Bug Fixing | 89% | 91% | 87% |
| New Feature Development | 85% | 83% | 86% |
| Code Review | 91% | 93% | 89% |
| Average | 89.25% | 88.75% | 88.00% |
💰 Pricing Strategy: Why So Much Cheaper?
Cost Structure Analysis
| Cost Item | Composer 2 | Claude Opus 4.6 | Notes |
|---|---|---|---|
| Base model compute | ~25% (reused open-source base) | 100% (trained in-house) | Composer 2 inherits Kimi 2.5 |
| Own training compute | ~75% | 100% | Borne by Cursor |
| Inference cost | Low | High | Better model optimization |
| Total cost | ~30% | 100% (baseline) | Significantly reduced |
Pricing Comparison
Per 1 Million Tokens:
| Model | Input Price | Output Price | Relative Cost |
|---|---|---|---|
| Composer 2 | $0.75 | $3.00 | 1x |
| GPT-5.4 | $2.50 | $10.00 | 3.3x |
| Claude Opus 4.6 | $15.00 | $75.00 | 20x |
💡 Cost Insight: Using Composer 2 instead of Claude Opus 4.6 saves 95% in API costs.
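As a sanity check on that figure, here is a small calculator using the per-1M-token prices from the table above. The workload (100K calls/month, ~2,000 input and ~400 output tokens per call) is a hypothetical assumption:

```javascript
// Monthly API cost from the per-1M-token prices in the pricing table.
// The call volume and token counts below are illustrative assumptions.
const PRICES = {
  'cursor-composer-2': { input: 0.75, output: 3.0 },
  'gpt-5.4': { input: 2.5, output: 10.0 },
  'claude-4-opus': { input: 15.0, output: 75.0 },
};

function monthlyCost(model, calls, inTokens, outTokens) {
  const p = PRICES[model];
  return (calls * (inTokens * p.input + outTokens * p.output)) / 1e6;
}

const composer = monthlyCost('cursor-composer-2', 100000, 2000, 400);
const opus = monthlyCost('claude-4-opus', 100000, 2000, 400);
console.log(composer); // 270
console.log(opus);     // 6000
console.log(1 - composer / opus); // ≈0.955, i.e. ~95% savings
```

The exact ratio shifts with the input/output mix (output tokens are 25x cheaper on Composer 2, input tokens 20x), but it stays around 95% for typical coding workloads.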
⚖️ “Self-Developed” Controversy: Is It Really Self-Developed?
Industry Practice
Using open-source models for fine-tuning is common in the AI industry:
| Company | Model | Base Model | Publicly Disclosed |
|---|---|---|---|
| Cursor | Composer 2 | Kimi 2.5 | ✅ Admitted |
| Meta | Llama Series | Partially open-source | ✅ Public |
| Mistral | Mixtral | Partially open-source | ✅ Public |
| Zero1.ai | Zero1-LLaMA | LLaMA | ✅ Public |
Cursor’s Issue
Controversy Points:
- Initial Marketing: Cursor initially marketed as “self-developed model” without mentioning open-source base
- Community Discovery: Only admitted after community user questioned
- Insufficient Transparency: Training details not fully disclosed
Lee Robinson’s Response:
“Only ~1/4 of the compute spent on the final model came from the base, the rest is from our training. As a result, Composer 2’s performance on various benchmarks is very different from Kimi’s.”
Industry Perspectives
| Perspective | Supporting Arguments |
|---|---|
| Counts as Self-Developed | 75% training is Cursor’s own, significant performance improvement |
| Not Self-Developed | Base model is from others, initially not disclosed |
| Middle Ground | It’s a “fine-tuned model based on open-source,” should be clearly labeled |
💡 Implications for Developers
1. Selection Recommendations
Choose Composer 2 When:
- ✅ Cost-sensitive (limited budget)
- ✅ Primarily doing code generation/refactoring
- ✅ Don’t need ultra-long context (beyond 128K tokens)
- ✅ Accept fine-tuned models based on open-source
Choose Claude Opus 4.6 When:
- ✅ Need highest accuracy
- ✅ Complex reasoning tasks (legal, medical)
- ✅ Need official support and SLA
- ✅ Budget is sufficient
Choose GPT-5.4 When:
- ✅ Need multimodal capabilities
- ✅ Ecosystem integration (OpenAI suite)
- ✅ Balance performance and cost
2. Cost Optimization Strategies
Using NixAPI Multi-Model Routing:
```javascript
// Smart routing: select the model based on task type.
const { NixAPI } = require('@nixapi/sdk');

const nixapi = new NixAPI({ apiKey: process.env.NIXAPI_KEY });

// Thin wrapper around the NixAPI chat endpoint.
async function callNixAPI(model, prompt) {
  const response = await nixapi.chat.completions.create({
    model,
    messages: [{ role: 'user', content: prompt }]
  });
  return response.choices[0].message.content;
}

async function smartCodeTask(prompt, taskType) {
  if (taskType === 'simple_generation') {
    // Simple code generation uses Composer 2 (cheap)
    return callNixAPI('cursor-composer-2', prompt);
  }
  if (taskType === 'complex_reasoning') {
    // Complex reasoning uses Claude Opus 4.6 (accurate)
    return callNixAPI('claude-4-opus', prompt);
  }
  if (taskType === 'multimodal') {
    // Multimodal tasks use GPT-5.4
    return callNixAPI('gpt-5.4', prompt);
  }
  // Default to Composer 2
  return callNixAPI('cursor-composer-2', prompt);
}
```
Cost Comparison (100K calls/month):
| Solution | Monthly Cost | Annual Savings |
|---|---|---|
| All Claude Opus 4.6 | $9,000 | - |
| 80% Composer 2 + 20% Claude | $3,240 | $69,120/year |
| All Composer 2 | $1,800 | $86,400/year |
3. Technology Trend Assessment
Trend 1: Open-Source Base + Proprietary Training Becomes Mainstream
- Meta, Mistral, Cursor all adopt this strategy
- Reduces R&D costs, accelerates product iteration
- Developers should focus on “training quality” not “started from scratch”
Trend 2: Reinforcement Learning Becomes Key Differentiator
- Cursor’s RL method is core competitive advantage
- Similar to AlphaGo-style RL, applied to the code domain
- Future model competition focus on training methods, not base architecture
Trend 3: Price War Continues
- Composer 2 priced at 1/20 of Claude
- Expected code model prices to drop another 50% in 2026
- Developers should build multi-model strategies to avoid vendor lock-in
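A multi-model strategy can be as simple as a fallback chain: try the cheapest model first and escalate on failure. In this sketch the request functions are injected, so it is provider-agnostic; the model names and stand-in callers are illustrative:

```javascript
// Fallback chain to avoid vendor lock-in: try models in order of cost,
// moving to the next one if a call fails (rate limit, outage, etc.).
// `callers` maps model names to request functions, injected by the
// caller so any provider or SDK can be plugged in.
async function withFallback(callers, order, prompt) {
  let lastError;
  for (const model of order) {
    try {
      return { model, output: await callers[model](prompt) };
    } catch (err) {
      lastError = err; // remember the failure, try the next model
    }
  }
  throw lastError; // every model in the chain failed
}

// Usage with stand-in callers (the first one fails, the second succeeds):
const callers = {
  'cursor-composer-2': async () => { throw new Error('rate limited'); },
  'gpt-5.4': async (p) => `ok: ${p}`,
};
withFallback(callers, ['cursor-composer-2', 'gpt-5.4'], 'refactor this')
  .then((r) => console.log(r.model)); // gpt-5.4
```

Because the ordering is just an array, repricing a vendor or adding a new model is a one-line change rather than a rewrite.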
🔧 Hands-On: Integrating Composer 2 with NixAPI
Use Case 1: Code Generation Assistant
```javascript
// Slack bot: auto-generate code on the /code command.
// Assumes `bot` (a message-event listener) and `slack` (a Slack Web API
// client) are already configured; that wiring is omitted here.
const { NixAPI } = require('@nixapi/sdk');

const nixapi = new NixAPI({ apiKey: process.env.NIXAPI_KEY });

bot.on('message', async (message) => {
  if (!message.text.startsWith('/code')) return;
  const prompt = message.text.replace('/code', '').trim();

  // Use Composer 2 (strong value for the price)
  const response = await nixapi.chat.completions.create({
    model: 'cursor-composer-2',
    messages: [
      {
        role: 'system',
        content: 'You are a professional programming assistant. Generate high-quality, runnable code with brief explanations.'
      },
      { role: 'user', content: prompt }
    ],
    max_tokens: 4000,
    temperature: 0.3
  });

  await slack.chat.postMessage({
    channel: message.channel,
    text: response.choices[0].message.content
  });
});
```
Use Case 2: Code Review Workflow
```javascript
// GitHub PR auto-review (Express-style webhook handler).
// `app` is an Express app; `fetchPRDiff` and `createPRComment` are
// app-specific helpers (not shown); `nixapi` is the configured client.
app.post('/github-webhook', async (req, res) => {
  const pr = req.body.pull_request;
  const diff = await fetchPRDiff(pr.number);

  // Use Composer 2 for the code review
  const review = await nixapi.chat.completions.create({
    model: 'cursor-composer-2',
    messages: [
      {
        role: 'system',
        content: 'You are a code review expert. Find potential security vulnerabilities, performance issues, and code style problems.'
      },
      { role: 'user', content: diff }
    ],
    max_tokens: 6000
  });

  // Submit the review as a PR comment
  await createPRComment(pr.number, review.choices[0].message.content);
  res.sendStatus(200);
});
```
Use Case 3: Multi-Model Routing for Cost Optimization
```javascript
// Smart routing: select the model based on task complexity.
// `nixapi` is the configured NixAPI client from the earlier examples.
async function codeReview(diff, complexity) {
  let model;
  if (complexity === 'low') {
    model = 'cursor-composer-2'; // simple reviews use Composer 2
  } else if (complexity === 'medium') {
    model = 'gpt-5.4'; // medium complexity uses GPT-5.4
  } else {
    model = 'claude-4-opus'; // complex reviews use Claude
  }

  const response = await nixapi.chat.completions.create({
    model,
    messages: [
      { role: 'system', content: 'Review the code, find issues, and provide fix suggestions.' },
      { role: 'user', content: diff }
    ]
  });
  return response.choices[0].message.content;
}
```
❓ FAQ
Q1: Can Composer 2 be called directly via API?
A: Currently Composer 2 is only available inside the Cursor IDE and is not offered as a standalone API. Alternative models with comparable performance (such as GPT-5.4 and Claude-4) can be called via NixAPI.
Q2: Is fine-tuning based on open-source legal?
A: Yes. Kimi 2.5 uses Apache 2.0 license, which allows commercial use and fine-tuning. Cursor’s approach complies with open-source license requirements.
Q3: Does it really outperform Claude Opus 4.6?
A: According to official benchmarks, Composer 2 slightly leads on SWE-bench (68.2% vs 65.8%), but results vary on other tasks. Recommend testing with your specific tasks.
Q4: How to verify Composer 2’s actual performance?
A:
- Try Composer 2 in Cursor IDE
- Test with your actual codebase
- Compare output quality with other models (Claude, GPT)
- Calculate cost savings
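Those steps can be wired into a tiny side-by-side harness. The model functions are injected, so it works with any API client; the stand-in models and the `check` predicate below are illustrative:

```javascript
// Minimal side-by-side check: run the same prompts through several
// model functions and record the fraction of outputs that pass a
// simple check you define per test case.
async function compareModels(models, cases) {
  const scores = {};
  for (const [name, call] of Object.entries(models)) {
    let passed = 0;
    for (const { prompt, check } of cases) {
      if (check(await call(prompt))) passed++;
    }
    scores[name] = passed / cases.length; // pass rate in [0, 1]
  }
  return scores;
}

// Usage with stand-in model functions:
const models = {
  a: async () => 'function add(a, b) { return a + b; }',
  b: async () => 'TODO',
};
const cases = [
  { prompt: 'write add()', check: (out) => out.includes('return') },
];
compareModels(models, cases).then((s) => console.log(s.a, s.b)); // 1 0
```

For a real comparison, draw the test cases from your own codebase and make `check` as strict as your review standards (compiles, passes tests, matches style).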
📈 Industry Impact Analysis
Impact on AI Coding Sector
| Impact | Description |
|---|---|
| Price War Intensifies | Composer 2 at 1/20 pricing forces competitors to reduce prices |
| Open-Source Becomes Mainstream | More companies adopt “open-source base + proprietary training” strategy |
| Differentiation Competition | Competition focus shifts from “self-developed” to “training quality” |
| Developers Benefit | Lower costs, more choices |
Implications for Developers
- Don’t Worship “Self-Developed”: Key is final performance, not starting from scratch
- Focus on Training Methods: RL and data quality more important than base model
- Build Multi-Model Strategy: Avoid vendor lock-in, optimize costs
- Test New Models Promptly: New models may bring unexpected surprises
📚 Related Resources
- TechCrunch Report - Detailed event coverage
- 36Kr Report - In-depth Chinese analysis
- Kimi 2.5 GitHub - Open-source model repository
- NixAPI Pricing - Latest pricing
- NixAPI Documentation - Complete API reference
📋 Summary
Key Takeaways
- Truth: Composer 2 is built on the open-source Kimi 2.5 model; Cursor says ~75% of the training compute is its own
- Performance: Surpasses Claude Opus 4.6 on SWE-bench, price only 1/20
- Technical Key: Reinforcement learning + context summarization internalization
- Controversy Focus: Initially didn’t disclose open-source base, insufficient transparency
- Industry Trend: Open-source base + proprietary training becomes mainstream, price war continues
Developer Action Items
```
Want to try Composer 2?
├─ Cursor Users       → Use directly in the IDE
├─ API Needs          → Use NixAPI for alternative models
├─ Cost Optimization  → Build a multi-model routing strategy
└─ Technical Learning → Study RL applications in the code domain
```
Last Updated: March 23, 2026
Data Sources: TechCrunch, 36Kr, Cursor official, public benchmarks
Test Environment: NixAPI v2.0
This article is based on public reports and test data. Model performance may vary by task type, recommend testing before actual use.
Try NixAPI Now
Reliable LLM API relay for OpenAI, Claude, Gemini, DeepSeek, Qwen, and Grok with ¥1 = $1 top-up
Sign Up Free