Model Guide · 9 min read · Reviewed Apr 20, 2026

MiniMax M2.7: The Self-Evolving Agent Model (SWE-Pro 56.22%, Terminal-Bench 57.0%)

MiniMax M2.7 was announced on March 18, 2026 and open-sourced on April 12, 2026. At 230B total / 10B active parameters with a 204,800-token context window, it is described as the "first model deeply participating in its own evolution." M2.7 introduces Agent Teams for multi-agent collaboration, achieves 97% skill adherence across 40+ complex skills, and can autonomously improve its own performance by 30% over 100+ rounds of self-optimization.

Published Apr 19, 2026 · Updated Apr 20, 2026
  • M2.7 scores 56.22% on SWE-Pro, matching GPT-5.3-Codex and approaching Claude Opus 4.6.
  • Self-optimization achieves 30% performance improvement over 100+ autonomous rounds.
  • Agent Teams enable multi-agent collaboration with defined role boundaries.
Quick note: This guide is based on public docs and release pages, but you should still verify current pricing, limits, supported tools, and region-specific billing on the official source before you pay, subscribe, or integrate.

Architecture and the self-evolution concept

MiniMax M2.7 uses the same MoE architecture family as M2.5, with 230B total parameters and 10B active per token across a 204,800-token context window. What distinguishes M2.7 is its self-evolution capability: the model is described as the "first model deeply participating in its own evolution," meaning it can autonomously improve its performance through iterative self-optimization loops.

Over 100+ rounds of autonomous self-optimization, M2.7 achieves a 30% performance improvement. It also handles 30-50% of RL research workflows autonomously, reducing the human effort required for model improvement cycles.
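MiniMax has not published the optimization loop itself, but the behavior described — repeated autonomous rounds, each kept only if it improves a score — matches a simple greedy hill-climbing pattern. The sketch below is illustrative only: the `evaluate` objective and the mutation step are stand-ins, not MiniMax's actual pipeline.

```python
import random

def evaluate(params):
    # Toy objective standing in for a benchmark score (higher is better).
    return -sum((p - 1.0) ** 2 for p in params)

def self_optimize(params, rounds=100, step=0.05, seed=0):
    """Greedy self-optimization: propose a small change each round and
    keep it only if the score improves (the '100+ rounds' pattern)."""
    rng = random.Random(seed)
    best = list(params)
    best_score = evaluate(best)
    for _ in range(rounds):
        candidate = [p + rng.uniform(-step, step) for p in best]
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score

start = [0.0, 0.0, 0.0]
tuned, score = self_optimize(start, rounds=100)
```

Because only improving candidates are accepted, the score is monotonically non-decreasing across rounds, which is the property that makes "30% cumulative improvement over 100+ rounds" a meaningful claim.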

Infographic: MiniMax M2.7 official release snapshot.
The self-evolution story is much easier to follow when you separate the release-page claims, the open-weight route, and the Token Plan access path. Source: Official MiniMax M2.7 release.

MiniMax M2.7 frames self-evolution as a product and research story on one page

The M2.7 release page is the strongest public source for the self-evolution narrative, benchmark highlights, and the way MiniMax wants buyers to think about Agent Teams.

  • Good supporting image for the self-optimization story.
  • Pairs well with the Hugging Face route when you need both hosted and open-weight context.

Source: Official MiniMax M2.7 release.

Self-evolution capabilities
Capability | Detail | Significance
Self-optimization rounds | 100+ autonomous rounds | 30% cumulative performance improvement
Skill adherence | 97% across 40+ complex skills | Reliable execution of diverse tasks
RL research automation | 30-50% of workflows handled autonomously | Reduces human effort in model improvement
Agent Teams | Multi-agent collaboration with role boundaries | Enables complex task decomposition
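MiniMax has not published an Agent Teams API, but the core idea — agents with declared roles, and dispatch that refuses tasks outside an agent's boundary — can be sketched in a few lines. The `Agent`/`AgentTeam` classes, role names, and task names below are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Agent:
    name: str
    role: str
    skills: set = field(default_factory=set)  # tasks this agent may handle

    def can_handle(self, task):
        return task in self.skills

class AgentTeam:
    """Routes each task to an agent whose declared role covers it;
    role boundaries are enforced by rejecting out-of-scope tasks."""
    def __init__(self, agents):
        self.agents = agents

    def dispatch(self, task):
        for agent in self.agents:
            if agent.can_handle(task):
                return f"{agent.name} ({agent.role}) -> {task}"
        raise ValueError(f"no agent's role covers task: {task}")

team = AgentTeam([
    Agent("planner", "planning", {"decompose", "schedule"}),
    Agent("coder", "implementation", {"write_patch", "run_tests"}),
])
```

The point of the explicit boundary is that a mis-routed task fails loudly instead of being silently handled by the wrong specialist — which is what "defined role boundaries" buys in a multi-agent decomposition.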

Benchmark performance

M2.7 posts strong results on the most demanding agent and coding benchmarks. On SWE-Pro it scores 56.22%, matching GPT-5.3-Codex and approaching Claude Opus 4.6 at ~58%. On Terminal-Bench 2.0 it reaches 57.0%, placing it in the top tier alongside Qwen 3.6-Plus (61.6%) and Claude Opus 4.6 (65.4%).

SWE-Pro

SWE-Pro benchmark scores for the current generation of coding-agent models.

Model | Score (%) | Source
Claude Opus 4.6 | ~58 | Official Anthropic evaluation
MiniMax M2.7 | 56.22 | Official MiniMax M2.7 release
GPT-5.3-Codex | 56.22 | Official OpenAI evaluation

Source: Official MiniMax M2.7 release.

Terminal-Bench 2.0

Multi-step terminal and agent workflow benchmark for leading coding models.

Model | Score (%) | Source
Claude Opus 4.6 | 65.4 | Official Anthropic evaluation
Qwen 3.6-Plus | 61.6 | Official Qwen 3.6 release
MiniMax M2.7 | 57.0 | Official MiniMax M2.7 release
GLM-5 | 56.2 | Official Z.AI evaluation

Source: Official MiniMax M2.7 release.

MLE-Bench Lite and GDPval-AA

On MLE-Bench Lite, which evaluates machine learning engineering capability, M2.7 scores 66.6%, behind Claude Opus 4.6 (75.7%) and GPT-5.4 (71.2%) and tied with Gemini-3.1 at 66.6%.

On GDPval-AA ELO, M2.7 achieves a rating of 1495, which is the highest score among open-source models evaluated.

MLE-Bench Lite results
Model | Score | Notes
Claude Opus 4.6 | 75.7% | Highest score on MLE-Bench Lite
GPT-5.4 | 71.2% | Strong ML engineering performance
MiniMax M2.7 | 66.6% | Tied with Gemini-3.1
Gemini-3.1 | 66.6% | Tied with M2.7

Open-source and ecosystem

MiniMax M2.7 was open-sourced on April 12, 2026, with weights available on Hugging Face. The model is positioned as both a research platform for self-evolving AI and a practical coding-agent model for production use.

  • Weights available on Hugging Face at huggingface.co/MiniMaxAI/MiniMax-M2.7.
  • Compatible with the same Token Plan and tool integrations as M2.5.
  • Agent Teams feature enables multi-agent workflows with defined role boundaries.
  • 97% skill adherence across 40+ complex skills ensures reliable task execution.
  • Handles 30-50% of RL research workflows autonomously.

GDPval-AA ELO rankings

Model | ELO | Status
MiniMax M2.7 | 1495 | Highest open-source
Claude Opus 4.6 | Higher | Proprietary
GPT-5.3-Codex | Higher | Proprietary
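GDPval-AA's exact methodology isn't restated here, but classic ELO ratings map to expected head-to-head scores via the standard logistic formula, which gives the 1495 figure some intuition. The 1595 opponent rating below is a hypothetical stand-in for a higher-rated proprietary model.

```python
def elo_expected_score(r_a, r_b):
    """Expected score (win probability, with draws counted as half)
    of a player rated r_a against a player rated r_b."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

# e.g., a 1495-rated model vs a hypothetical 1595-rated opponent:
p = elo_expected_score(1495, 1595)  # roughly 0.36
```

A 100-point ELO gap corresponds to winning about 36% of pairwise comparisons, so "highest open-source" at 1495 is a meaningful but catchable position relative to the proprietary leaders.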

Evaluate M2.7 through the same Token Plan as M2.5

If you are already using MiniMax M2.5, switching to M2.7 uses the same Token Plan and tool integrations. Start with agent-heavy tasks to see where the self-evolution capabilities add the most value.
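In practice, switching models on a shared Token Plan usually amounts to changing the model identifier in the request payload. The sketch below assumes an OpenAI-style chat-completion payload; the payload shape and the `MiniMax-M2.5` / `MiniMax-M2.7` model strings are illustrative assumptions, not official identifiers — check the MiniMax API docs for the real ones.

```python
def build_chat_request(prompt, model="MiniMax-M2.7", max_tokens=1024):
    """Assemble a chat-completion payload; moving from M2.5 to M2.7
    is just a different `model` string against the same endpoint.
    (Payload shape and model IDs are illustrative, not official.)"""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

old = build_chat_request("Refactor this module.", model="MiniMax-M2.5")
new = build_chat_request("Refactor this module.", model="MiniMax-M2.7")
```

Keeping the rest of the payload identical makes A/B comparison on your own agent-heavy tasks straightforward, which is the cheapest way to see where M2.7's self-evolution capabilities actually pay off.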


Frequently asked questions

What does "self-evolving" mean for M2.7?

M2.7 can autonomously participate in its own improvement cycle. Over 100+ rounds of self-optimization, the model achieves a 30% performance improvement without human intervention. It also handles 30-50% of RL research workflows autonomously.

Is MiniMax M2.7 open source?

Yes. M2.7 was open-sourced on April 12, 2026. The weights are available on Hugging Face at huggingface.co/MiniMaxAI/MiniMax-M2.7.

How does M2.7 differ from M2.5?

M2.5 focuses on SWE-bench Verified performance (80.2%) and cost efficiency. M2.7 focuses on self-evolution, agent capabilities (Agent Teams), and autonomous optimization. They share the same MoE architecture family but are optimized for different use cases. M2.7 is not a simple replacement for M2.5.