Qwen3.6-Plus: Alibaba's Agentic Coding Model (SWE-Bench 78.8%, Terminal-Bench 61.6%)
Qwen3.6-Plus is Alibaba Cloud's latest agentic coding model, released in April 2026 as part of the Qwen3 family. It scores 78.8% on SWE-Bench Verified and a class-leading 61.6% on Terminal-Bench 2.0, making it one of the strongest publicly benchmarked models for real-world coding and terminal-based agent workflows. With 1M-token context and full multimodal support (text, image, and video), it is positioned as a general-purpose frontier model that also excels at software engineering tasks.
- Qwen3.6-Plus scores 78.8% SWE-Bench Verified and a class-leading 61.6% on Terminal-Bench 2.0.
- 1M-token context window with full multimodal support (text, image, and video).
- API available via OpenAI-compatible endpoints in Beijing, Singapore, and US regions.
Qwen3 family timeline
The Qwen3 family has expanded rapidly since the initial Qwen3 release on April 29, 2025. Understanding the timeline helps distinguish the dense-flagship models from the MoE (Mixture of Experts) efficiency variants and the coding-specialized releases.

Screenshot: the Alibaba Cloud Model Studio pricing page, which lists separate mainland and international rows for Qwen3.6-Plus and is the most direct public source for route-aware pricing. Pairs well with benchmark claims from the Qwen 3.6 release page. Source: Alibaba Cloud Model Studio pricing.
| Release | Date | Key details |
|---|---|---|
| Qwen3 (initial) | Apr 29, 2025 | First Qwen3 release; MoE and dense variants for open-weight models |
| Qwen3-Max | Sep 5, 2025 | >1T parameter dense LLM; flagship general-purpose model |
| Qwen3-Coder-Plus | Sep 2025 | 256K context; 92 programming languages; coding-specialized variant |
| Qwen3.5 series | Feb 16, 2026 | Flash, 27B, 35B-A3B, 122B-A10B Plus; efficiency-focused MoE releases |
| Qwen3.6-Plus | Apr 2026 | 1M context; multimodal (text+image+video); agentic coding focus |
Qwen3.6-Plus coding and agent benchmarks
Qwen3.6-Plus targets agentic coding workflows directly. Its Terminal-Bench 2.0 score of 61.6% is the highest in its class, and its SWE-Bench Verified result of 78.8% places it competitively against models like Claude Opus 4.5 and MiniMax M2.5. The MCPMark score of 48.2% also leads the field for tool-calling and MCP integration.
| Benchmark | Score | Notes |
|---|---|---|
| SWE-Bench Verified | 78.8% | Top-tier; competitive with Claude Opus 4.5 |
| SWE-Bench Multilingual | 73.8% | Strong multilingual coding capability |
| SWE-Bench Pro | 56.6% | Harder professional-level tasks |
| Terminal-Bench 2.0 | 61.6% | Best in class for terminal agent workflows |
| Claw-Eval Avg | 74.8 | Strong agentic evaluation score |
| MCPMark | 48.2% | Best score for MCP tool-calling integration |
| LiveCodeBench v6 | 87.1% | Competitive live coding performance |
Chart: Qwen3.6-Plus vs leading coding models on SWE-Bench Verified, the most widely cited software engineering benchmark. Sources: official Qwen 3.6 release (including Qwen's official comparison table), official Anthropic benchmark figures, official MiniMax M2.5 release, and the official Kimi K2.5 technical blog.
Chart: Terminal-Bench 2.0, which measures multi-step terminal and agent workflow performance; Qwen3.6-Plus leads this benchmark class. Sources: official Qwen 3.6 release (including Qwen's official comparison table), official Anthropic benchmark figures, and the official MiniMax M2.7 release.
General knowledge and reasoning benchmarks
Beyond coding, Qwen3.6-Plus scores 88.5% on MMLU-Pro and 90.4% on GPQA, confirming strong general knowledge and graduate-level reasoning. Its WMT24++ score of 84.3% also reflects top-tier multilingual translation capability.
| Benchmark | Score |
|---|---|
| MMLU-Pro | 88.5% |
| GPQA | 90.4% |
| WMT24++ | 84.3% |
Vision and multimodal benchmarks
Qwen3.6-Plus is fully multimodal, supporting text, image, and video inputs. Its vision benchmarks are competitive with or ahead of models like GPT-5.2, Claude Opus 4.5, and Gemini 3 Pro across document understanding, mathematical visual reasoning, real-world QA, and video comprehension.
- MMMU 86.0 and MathVision 88.0 reflect strong visual-reasoning and math-from-image performance.
- OmniDocBench1.5 at 91.2 and CC-OCR at 83.4 show excellent document and OCR understanding.
- VideoMME (with subtitles) at 87.8 confirms robust video comprehension for multimodal workflows.
| Benchmark | Qwen3.6-Plus |
|---|---|
| MMMU | 86.0 |
| MathVision | 88.0 |
| RealWorldQA | 85.4 |
| OmniDocBench1.5 | 91.2 |
| CC-OCR | 83.4 |
| VideoMME (w/ sub) | 87.8 |

(Comparison scores for GPT-5.2, Claude Opus 4.5, and Gemini 3 Pro were not published alongside these figures.)
Published pricing and endpoints
Qwen3.6-Plus is available through Alibaba Cloud's DashScope / Model Studio route, and the public pricing page lists separate mainland and international rows rather than a single universal price. Regional pricing is therefore part of the model story itself, not just a billing appendix.
The API transport story is similarly clear. Alibaba publishes OpenAI-compatible endpoints for Beijing, Singapore, and the US, plus an Anthropic-compatible route on the international side.
- OpenAI-compatible endpoints: `dashscope.aliyuncs.com`, `dashscope-intl.aliyuncs.com`, and `dashscope-us.aliyuncs.com`.
- Anthropic-compatible route: `https://dashscope-intl.aliyuncs.com/apps/anthropic`.
- Qwen3-Coder-Plus is still a separate coding-specific route with its own release history and positioning.
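The region-specific, OpenAI-compatible routing above can be sketched with nothing but the standard library. This is a minimal, illustrative sketch: the `/compatible-mode/v1` path and the `qwen3.6-plus` model id are assumptions not stated in the endpoint list above, and the request is built but not sent.

```python
import json
import os
import urllib.request

# OpenAI-compatible base URLs per region, from the routes listed above.
# Assumptions: the /compatible-mode/v1 path and the "qwen3.6-plus" model id.
BASE_URLS = {
    "beijing":   "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "singapore": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    "us":        "https://dashscope-us.aliyuncs.com/compatible-mode/v1",
}

def build_chat_request(region: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request for the given region."""
    body = {
        "model": "qwen3.6-plus",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URLS[region]}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('DASHSCOPE_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("singapore", "Write a one-line shell command to count files.")
# Actually sending it requires a valid DASHSCOPE_API_KEY:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

The same request body works against any of the three regions; only the base URL changes, which is why the regional split is a routing decision rather than a code change.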
| Route | Context band | Input | Output | Notes |
|---|---|---|---|---|
| Mainland China | 0-256K | 2 CNY / 1M input tokens | 12 CNY / 1M output tokens | The public page also notes a 90-day validity period for new accounts. |
| Mainland China | 256K-1M | 8 CNY / 1M input tokens | 48 CNY / 1M output tokens | Use this row for the 1M-context route. |
| International (Singapore) | 0-256K | 3.7471 CNY / 1M input tokens | 22.4826 CNY / 1M output tokens | Published in the international pricing section. |
| International (Singapore) | 256K-1M | 14.9884 CNY / 1M input tokens | 44.965 CNY / 1M output tokens | Published separately from the mainland route. |
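The tiered, route-aware rates in the table can be turned into a simple cost estimate. This is a sketch using the published CNY figures; one assumption not stated on the pricing page is that a request is billed entirely at the tier matching its context length.

```python
# (input CNY per 1M tokens, output CNY per 1M tokens), keyed by route and tier.
# Rates are copied from the pricing table above.
RATES = {
    "mainland":      {"0-256K": (2.0, 12.0),       "256K-1M": (8.0, 48.0)},
    "international": {"0-256K": (3.7471, 22.4826), "256K-1M": (14.9884, 44.965)},
}

def estimate_cost_cny(route: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the CNY cost of one request under the tiered pricing.

    Assumption: the whole request is billed at the tier selected by its
    input-context length (0-256K vs 256K-1M).
    """
    tier = "0-256K" if input_tokens <= 256_000 else "256K-1M"
    in_rate, out_rate = RATES[route][tier]
    return input_tokens / 1e6 * in_rate + output_tokens / 1e6 * out_rate
```

For example, a mainland request with 100K input tokens and 10K output tokens lands in the 0-256K tier and costs 0.2 + 0.12 = 0.32 CNY, while a 500K-input request is billed at the long-context rate.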
Ready to try Qwen3.6-Plus?
Start with the DashScope API endpoint closest to your region, using the OpenAI SDK. Check the official release page for the latest pricing and model availability.
Frequently asked questions
How does Qwen3.6-Plus differ from Qwen3-Coder-Plus?
Qwen3.6-Plus is a general-purpose multimodal model with 1M context that also excels at coding. Qwen3-Coder-Plus is a separate coding-specialized model released in September 2025 with 256K context and support for 92 programming languages. They serve different use cases: choose Qwen3.6-Plus for multimodal and long-context tasks, and Qwen3-Coder-Plus for focused coding workflows.
Is Qwen3.6-Plus available outside China?
Yes. Alibaba Cloud provides API endpoints in Singapore (`dashscope-intl`) and the US (`dashscope-us`), both OpenAI-compatible. International users should use these regional endpoints for lower latency.
What does Terminal-Bench 2.0 actually measure?
Terminal-Bench 2.0 evaluates multi-step terminal and agent workflows. It tests whether a model can complete complex sequences of shell commands, tool calls, and file manipulations autonomously. Qwen3.6-Plus leads this benchmark at 61.6%, ahead of MiniMax M2.7 (57.0%) and GLM-5 (56.2%).