MiniMax M2.7: The Self-Evolving Agent Model (SWE-Pro 56.22%, Terminal-Bench 57.0%)
MiniMax M2.7 was announced on March 18, 2026, and open-sourced on April 12, 2026. At 230B total / 10B active parameters with a 204,800-token context window, it is billed as the "first model deeply participating in its own evolution." M2.7 introduces Agent Teams for multi-agent collaboration, reports 97% skill adherence across 40+ complex skills, and can autonomously improve its own performance by 30% over 100+ rounds of self-optimization.
- M2.7 scores 56.22% on SWE-Pro, matching GPT-5.3-Codex and approaching Claude Opus 4.6.
- Self-optimization achieves 30% performance improvement over 100+ autonomous rounds.
- Agent Teams enable multi-agent collaboration with defined role boundaries.
Architecture and the self-evolution concept
MiniMax M2.7 uses the same MoE architecture family as M2.5, with 230B total parameters and 10B active per token across a 204,800-token context window. What distinguishes M2.7 is its self-evolution capability: the model is described as the "first model deeply participating in its own evolution," meaning it can autonomously improve its performance through iterative self-optimization loops.
Over 100+ rounds of autonomous self-optimization, M2.7 achieves a 30% performance improvement. It also handles 30-50% of RL research workflows autonomously, reducing the human effort required for model improvement cycles.
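MiniMax does not describe the optimization loop itself, but the reported numbers imply small gains compounding across rounds. A toy sketch of that idea, where each round proposes a change and keeps it only if evaluation improves; the function name, the flat per-round gain, and the accept/reject model are all hypothetical illustrations, not MiniMax's actual algorithm:

```python
def self_optimize(base_score: float, rounds: int, gain_per_round: float = 0.0027) -> float:
    """Toy model of iterative self-optimization.

    Each round, the agent proposes a candidate change, evaluates it, and
    keeps it only if the score does not regress. A flat ~0.27% per-round
    gain compounds to roughly +30% over 100 rounds, matching the headline
    figure in spirit (the gain value here is back-derived, not reported).
    """
    score = base_score
    for _ in range(rounds):
        candidate = score * (1 + gain_per_round)  # hypothetical accepted change
        if candidate > score:                     # reject regressions
            score = candidate
    return score

final = self_optimize(100.0, rounds=100)
print(f"cumulative improvement: {final / 100.0 - 1:.1%}")
```

The point of the sketch is only that "30% over 100+ rounds" requires each round to contribute well under 1% on average, so per-round evaluation noise matters more than per-round gain.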

MiniMax M2.7 frames self-evolution as a product and research story on one page
The M2.7 release page is the strongest public source for the self-evolution narrative, benchmark highlights, and the way MiniMax wants buyers to think about Agent Teams.
- Pairs well with the Hugging Face route when you need both hosted and open-weight context.
Source: Official MiniMax M2.7 release.
| Capability | Detail | Significance |
|---|---|---|
| Self-optimization rounds | 100+ autonomous rounds | 30% cumulative performance improvement |
| Skill adherence | 97% across 40+ complex skills | Reliable execution of diverse tasks |
| RL research automation | 30-50% of workflows handled autonomously | Reduces human effort in model improvement |
| Agent Teams | Multi-agent collaboration with role boundaries | Enables complex task decomposition |
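MiniMax has not published an Agent Teams API, but the "defined role boundaries" idea in the table above can be sketched: each agent declares which task types it may handle and refuses everything else, while a coordinator routes work to the first agent whose role covers it. Every class, function, and role name below is a hypothetical illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    roles: set            # task types this agent is allowed to handle
    log: list = field(default_factory=list)

    def handle(self, task_type: str, payload: str) -> bool:
        # Role boundary: refuse anything outside the declared role set.
        if task_type not in self.roles:
            return False
        self.log.append((task_type, payload))
        return True

def dispatch(team: list, task_type: str, payload: str) -> str:
    """Coordinator: route a task to the first agent whose roles cover it."""
    for agent in team:
        if agent.handle(task_type, payload):
            return agent.name
    raise ValueError(f"no agent in team covers role {task_type!r}")

team = [
    Agent("planner", {"plan"}),
    Agent("coder", {"write_code", "fix_bug"}),
    Agent("tester", {"run_tests"}),
]
print(dispatch(team, "fix_bug", "null pointer in parser"))  # prints: coder
```

Hard refusal at the role boundary is what makes task decomposition predictable: a planner can never silently absorb a coding task, so failures surface at the coordinator instead of inside the wrong agent.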
Benchmark performance
M2.7 posts strong results on the most demanding agent and coding benchmarks. On SWE-Pro it scores 56.22%, matching GPT-5.3-Codex and approaching Claude Opus 4.6 at ~58%. On Terminal-Bench 2.0 it reaches 57.0%, placing it in the top tier alongside Qwen 3.6-Plus (61.6%) and Claude Opus 4.6 (65.4%).
Figure: SWE-Pro benchmark scores for the current generation of coding-agent models. Source: official MiniMax M2.7 release.
Figure: Multi-step terminal and agent workflow benchmark (Terminal-Bench 2.0) for leading coding models. Source: official MiniMax M2.7 release.
MLE-Bench Lite and GDPval-AA
On MLE-Bench Lite, which evaluates machine learning engineering capability, M2.7 scores 66.6%, behind Claude Opus 4.6 (75.7%) and GPT-5.4 (71.2%) and tied with Gemini-3.1 (66.6%).
On GDPval-AA ELO, M2.7 achieves a rating of 1495, which is the highest score among open-source models evaluated.
| Model | MLE-Bench Lite score | Notes |
|---|---|---|
| Claude Opus 4.6 | 75.7% | Highest score on MLE-Bench Lite |
| GPT-5.4 | 71.2% | Strong ML engineering performance |
| MiniMax M2.7 | 66.6% | Competitive with Gemini-3.1 |
| Gemini-3.1 | 66.6% | Tied with M2.7 |
Open-source and ecosystem
MiniMax M2.7 was open-sourced on April 12, 2026, with weights available on Hugging Face. The model is positioned as both a research platform for self-evolving AI and a practical coding-agent model for production use.
- Weights available on Hugging Face at huggingface.co/MiniMaxAI/MiniMax-M2.7.
- Compatible with the same Token Plan and tool integrations as M2.5.
- Agent Teams feature enables multi-agent workflows with defined role boundaries.
- 97% skill adherence across 40+ complex skills ensures reliable task execution.
- Handles 30-50% of RL research workflows autonomously.
| Model | GDPval-AA ELO | Status |
|---|---|---|
| MiniMax M2.7 | 1495 | Highest open-source |
| Claude Opus 4.6 | Higher | Proprietary |
| GPT-5.3-Codex | Higher | Proprietary |
Evaluate M2.7 through the same Token Plan as M2.5
If you are already using MiniMax M2.5, switching to M2.7 uses the same Token Plan and tool integrations. Start with agent-heavy tasks to see where the self-evolution capabilities add the most value.
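Since M2.7 shares M2.5's Token Plan and tool integrations, a migration can be as small as swapping the model name in an OpenAI-compatible request payload. The model identifiers and payload shape below are illustrative assumptions, not documented MiniMax values:

```python
def build_chat_request(model: str, prompt: str, max_tokens: int = 1024) -> dict:
    """Assemble an OpenAI-compatible chat payload.

    Only `model` changes when moving from M2.5 to M2.7; the identifiers
    used here are hypothetical placeholders.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

m25 = build_chat_request("MiniMax-M2.5", "Refactor this function.")
m27 = build_chat_request("MiniMax-M2.7", "Refactor this function.")

# Everything except the model name is identical across the two requests.
assert {k: v for k, v in m25.items() if k != "model"} == \
       {k: v for k, v in m27.items() if k != "model"}
```

If that invariant holds in your integration, A/B-ing the two models on the same agent-heavy workloads is a one-line change.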
Frequently asked questions
What does "self-evolving" mean for M2.7?
M2.7 can autonomously participate in its own improvement cycle. Over 100+ rounds of self-optimization, the model achieves a 30% performance improvement without human intervention. It also handles 30-50% of RL research workflows autonomously.
Is MiniMax M2.7 open source?
Yes. M2.7 was open-sourced on April 12, 2026. The weights are available on Hugging Face at huggingface.co/MiniMaxAI/MiniMax-M2.7.
How does M2.7 differ from M2.5?
M2.5 focuses on SWE-bench Verified performance (80.2%) and cost efficiency. M2.7 focuses on self-evolution, agent capabilities (Agent Teams), and autonomous optimization. They share the same MoE architecture family but are optimized for different use cases. M2.7 is not a simple replacement for M2.5.