Model Guide10 min readReviewed Apr 20, 2026

Qwen3.6-Plus: Alibaba's Agentic Coding Model (SWE-Bench 78.8%, Terminal-Bench 61.6%)

Qwen3.6-Plus is Alibaba Cloud's latest agentic coding model, released in April 2026 as part of the Qwen3 family. It scores 78.8% on SWE-Bench Verified and a class-leading 61.6% on Terminal-Bench 2.0, making it one of the strongest publicly benchmarked models for real-world coding and terminal-based agent workflows. With 1M-token context and full multimodal support (text, image, and video), it is positioned as a general-purpose frontier model that also excels at software engineering tasks.

Published Apr 19, 2026Updated Apr 20, 2026

Qwen3.6-Plus scores 78.8% SWE-Bench Verified and a class-leading 61.6% on Terminal-Bench 2.0.
1M-token context window with full multimodal support (text, image, and video).
API available via OpenAI-compatible endpoints in Beijing, Singapore, and US regions.

Quick note: This guide is based on public docs and release pages, but you should still verify current pricing, limits, supported tools, and region-specific billing on the official source before you pay, subscribe, or integrate.

Qwen3 family timeline

The Qwen3 family has expanded rapidly since the initial Qwen3 release on April 29, 2025. Understanding the timeline helps distinguish the dense-flagship models from the MoE (Mixture of Experts) efficiency variants and the coding-specialized releases.

Qwen3.6-Plus official snapshot infographic — A route-aware summary of the official Qwen 3.6 release, Model Studio pricing rows, and the published DashScope endpoint choices. Source: Official Qwen 3.6 release.

Official Alibaba Model Studio pricing page screenshot for Qwen3.6-Plus

Official screenshot

Qwen3.6-Plus already has route-specific public pricing on the Model Studio side

The Alibaba pricing page is the safest public surface for route-aware Qwen3.6-Plus billing because it distinguishes mainland and international rows directly on the official page.

Useful when articles need one source-backed image for regional pricing differences.
Pairs well with benchmark claims from the Qwen 3.6 release page.

Source: Alibaba Cloud Model Studio pricing.

Qwen3 family release timeline
Release	Date	Key details
Qwen3 (initial)	Apr 29, 2025	First Qwen3 release; MoE and dense variants for open-weight models
Qwen3-Max	Sep 5, 2025	>1T parameter dense LLM; flagship general-purpose model
Qwen3-Coder-Plus	Sep 2025	256K context; 92 programming languages; coding-specialized variant
Qwen3.5 series	Feb 16, 2026	Flash, 27B, 35B-A3B, 122B-A10B Plus; efficiency-focused MoE releases
Qwen3.6-Plus	Apr 2026	1M context; multimodal (text+image+video); agentic coding focus

Qwen3.6-Plus coding and agent benchmarks

Qwen3.6-Plus targets agentic coding workflows directly. Its Terminal-Bench 2.0 score of 61.6% is the highest in its class, and its SWE-Bench Verified result of 78.8% places it competitively against models like Claude Opus 4.5 and MiniMax M2.5. The MCPMark score of 48.2% also leads the field for tool-calling and MCP integration.

Qwen3.6-Plus coding and agent benchmarks
Benchmark	Score	Notes
SWE-Bench Verified	78.8%	Top-tier; competitive with Claude Opus 4.5
SWE-Bench Multilingual	73.8%	Strong multilingual coding capability
SWE-Bench Pro	56.6%	Harder professional-level tasks
Terminal-Bench 2.0	61.6%	Best in class for terminal agent workflows
Claw-Eval Avg	74.8	Strong agentic evaluation score
MCPMark	48.2%	Best score for MCP tool-calling integration
LiveCodeBench v6	87.1%	Competitive live coding performance

SWE-Bench Verified comparison

Qwen3.6-Plus vs leading coding models on the most widely cited software engineering benchmark.

Claude Opus 4.580.9

Official Anthropic benchmark.

MiniMax M2.580.2

Official MiniMax M2.5 release.

Qwen3.6-Plus78.8

Official Qwen 3.6 release.

GLM-577.8

Shown in Qwen's official comparison table.

Kimi K2.576.8

Official Kimi K2.5 technical blog.

Source: Official Qwen 3.6 release.

Terminal-Bench 2.0 comparison

Terminal-Bench measures multi-step terminal and agent workflow performance. Qwen3.6-Plus leads this benchmark class.

Claude Opus 4.665.4

Official Anthropic benchmark.

Qwen3.6-Plus61.6

Official Qwen 3.6 release.

MiniMax M2.757.0

Official MiniMax M2.7 release.

GLM-556.2

Shown in Qwen's official comparison table.

Source: Official Qwen 3.6 release.

General knowledge and reasoning benchmarks

Beyond coding, Qwen3.6-Plus scores 88.5% on MMLU-Pro and 90.4% on GPQA, confirming strong general knowledge and graduate-level reasoning. Its WMT24++ score of 84.3% also reflects top-tier multilingual translation capability.

Qwen3.6-Plus reasoning and language benchmarks
Benchmark	Score
MMLU-Pro	88.5%
GPQA	90.4%
WMT24++	84.3%

Vision and multimodal benchmarks

Qwen3.6-Plus is fully multimodal, supporting text, image, and video inputs. Its vision benchmarks are competitive with or ahead of models like GPT-5.2, Claude Opus 4.5, and Gemini 3 Pro across document understanding, mathematical visual reasoning, real-world QA, and video comprehension.

MMMU 86.0 and MathVision 88.0 reflect strong visual-reasoning and math-from-image performance.
OmniDocBench1.5 at 91.2 and CC-OCR at 83.4 show excellent document and OCR understanding.
VideoMME (with subtitles) at 87.8 confirms robust video comprehension for multimodal workflows.

Vision benchmarks: Qwen3.6-Plus vs leading multimodal models
Benchmark	Qwen3.6-Plus	GPT-5.2	Claude Opus 4.5	Gemini 3 Pro
MMMU	86.0	—	—	—
MathVision	88.0	—	—	—
RealWorldQA	85.4	—	—	—
OmniDocBench1.5	91.2	—	—	—
CC-OCR	83.4	—	—	—
VideoMME (w/ sub)	87.8	—	—	—

Published pricing and endpoints

Qwen3.6-Plus is available through Alibaba Cloud's DashScope / Model Studio route, and the public pricing page already splits mainland and international rows rather than pretending there is one universal price. That route-aware pricing story is now part of the model guide, not just a billing appendix.

The API transport story is similarly clear. Alibaba publishes OpenAI-compatible endpoints for Beijing, Singapore, and the US, plus an Anthropic-compatible route on the international side.

OpenAI-compatible endpoints: `dashscope.aliyuncs.com`, `dashscope-intl.aliyuncs.com`, and `dashscope-us.aliyuncs.com`.
Anthropic-compatible route: `https://dashscope-intl.aliyuncs.com/apps/anthropic`.
Qwen3-Coder-Plus is still a separate coding-specific route with its own release history and positioning.

Official public pricing rows for `qwen3.6-plus`
Route	Context band	Input	Output	Notes
Mainland China	0-256K	2 CNY / 1M input tokens	12 CNY / 1M output tokens	The public page also shows a 90-day new-account validity note.
Mainland China	256K-1M	8 CNY / 1M input tokens	48 CNY / 1M output tokens	Use this row for the 1M-context route.
International (Singapore)	0-256K	3.7471 CNY / 1M input tokens	22.4826 CNY / 1M output tokens	Published in the international pricing section.
International (Singapore)	256K-1M	14.9884 CNY / 1M input tokens	44.965 CNY / 1M output tokens	Published separately from the mainland route.

BuyGLM shows package prices in USD. When a source page is published in CNY, the displayed value uses a fixed 1 USD = 8 CNY conversion and should still be checked against the live vendor page before payment.

Ready to try Qwen3.6-Plus?

Start with the DashScope API endpoint closest to your region, using the OpenAI SDK. Check the official release page for the latest pricing and model availability.

Open the Qwen 3.6 release Submit request

Sources and official links

Frequently asked questions

How does Qwen3.6-Plus differ from Qwen3-Coder-Plus?

Qwen3.6-Plus is a general-purpose multimodal model with 1M context that also excels at coding. Qwen3-Coder-Plus is a separate coding-specialized model released in September 2025 with 256K context and support for 92 programming languages. They serve different use cases: choose Qwen3.6-Plus for multimodal and long-context tasks, and Qwen3-Coder-Plus for focused coding workflows.

Is Qwen3.6-Plus available outside China?

Yes. Alibaba Cloud provides API endpoints in Singapore (dashscope-intl) and the US (dashscope-us), both OpenAI-compatible. International users should use these regional endpoints for lower latency.

What does Terminal-Bench 2.0 actually measure?

Terminal-Bench 2.0 evaluates multi-step terminal and agent workflows. It tests whether a model can complete complex sequences of shell commands, tool calls, and file manipulations autonomously. Qwen3.6-Plus leads this benchmark at 61.6%, ahead of MiniMax M2.7 (57.0%) and GLM-5 (56.2%).

Qwen3 family timeline

Qwen3.6-Plus already has route-specific public pricing on the Model Studio side

Qwen3.6-Plus coding and agent benchmarks

General knowledge and reasoning benchmarks

Vision and multimodal benchmarks

Published pricing and endpoints

Ready to try Qwen3.6-Plus?

Sources and official links

Frequently asked questions

Related guides

AI Coding Benchmarks 2026: Which Public Numbers You Can Actually Trust After Qwen3.6-Max and Kimi K2.6

GLM-5.1 vs Qwen 3.6-Plus: Which Is Better for Agentic Coding?

Doubao Seed2.0: ByteDance's Official Pro, Lite, Mini, and Code Model Guide