Kimi K2.6: Open-Source Coding, 300-Agent Swarms, and 80.2 SWE-Bench Verified
Kimi K2.6 is Moonshot AI’s April 20, 2026 coding-and-agent release. The public materials now give buyers a much cleaner evidence chain than earlier previews: an official Chinese launch post, an English technical blog with a full benchmark table, a dedicated Moonshot Open Platform pricing page, a quickstart doc, and an open-source Hugging Face release. Together they make K2.6 one of the easiest new Chinese coding models to cover without guesswork.
- Official Apr 20, 2026 rollout across kimi.com, the app, API, and Kimi Code, with an open-source release on Hugging Face.
- Public benchmark table includes SWE-Bench Verified 80.2, SWE-Bench Pro 58.6, Terminal-Bench 2.0 66.7, and LiveCodeBench v6 89.6.
- Agent-swarm workflows scale to 300 sub-agents and 4,000 coordinated steps, with BrowseComp rising to 86.3 in the swarm setting.
- Official API pricing for `kimi-k2.6` is ¥1.10 cached input, ¥6.50 input, and ¥27.00 output per 1M tokens with 262,144 context.
What shipped on April 20, 2026
Moonshot AI’s Chinese launch post says Kimi K2.6 is available across kimi.com, the app, API, and Kimi Code. The same release also positions K2.6 as an open-source model, which Moonshot backs up with a public Hugging Face entry. That combination matters because it gives readers both a hosted route and a self-hosted route from day one.
The launch story is explicitly workflow-first. Moonshot highlights a roughly 20% improvement on Kimi Code Bench, a 13-hour continuous coding case that modified more than 4,000 lines, and a swarm-style agent system that can coordinate 300 sub-agents across 4,000 steps. This is not framed as a casual chat refresh. It is framed as a long-horizon coding and agent release.
Official image
Moonshot presents K2.6 as a coding-and-agent release from the first screen
The official K2.6 technical blog already combines the hero visual, product rollout, and the “open-source coding” positioning in one page. It is the strongest public image asset for the release.
- Useful for connecting the hosted API route, Kimi Code, and the open-source release in one card.
- Matches the benchmark table and long-horizon coding cases published lower on the same official page.
Source: Official Kimi K2.6 tech blog.
| Signal | What is public today | Why it matters |
|---|---|---|
| Availability | kimi.com, app, API, and Kimi Code | This is a broad product rollout, not just a lab post. |
| Open-source route | Public Hugging Face release | Teams can inspect or self-host the released model. |
| Coding uplift | ~20% improvement on Kimi Code Bench | Moonshot positions K2.6 as a real coding upgrade over the prior generation. |
| Agent scale | 300 sub-agents and 4,000 steps | K2.6 is explicitly marketed for long-horizon orchestration. |
| Endurance examples | 13-hour coding case and 5-day proactive agent run | These examples are stronger than a simple benchmark screenshot. |
| API model name | `kimi-k2.6` | The route is already documented on the Moonshot Open Platform side. |
Benchmark rows you can quote directly
The English Kimi K2.6 technical blog is the strongest source in the whole release bundle because it publishes a broad benchmark table instead of a handful of cropped hero claims. That makes it possible to quote K2.6 across coding, agent, search, and reasoning tasks without inventing unsupported comparisons.
For coding-focused readers, the cleanest rows are SWE-Bench Verified 80.2, SWE-Bench Pro 58.6, Terminal-Bench 2.0 66.7, SciCode 52.2, OJBench (Python) 60.6, and LiveCodeBench v6 89.6. For web and agent tasks, BrowseComp 83.2 and BrowseComp (agent swarm) 86.3 are especially easy to explain.
Official image
Moonshot publishes the Kimi Code Bench uplift directly in the technical blog
The official K2.6 blog does not stop at prose. It embeds the Kimi Code Bench graphic, which makes the long-horizon coding improvement over K2.5 easier to document from a first-party source.
- Useful for articles that want a source-backed coding-specific visual instead of only quoting the benchmark table.
- Works well beside the main benchmark section because it comes from the same official blog page.
Source: Official Kimi K2.6 tech blog.
Official image
Kimi K2.6 also extends into coding-driven design and website generation
The Kimi Design Bench graphic on the official blog helps explain that K2.6 is not only a repo-coding model. Moonshot also positions it around front-end generation, full-stack workflows, and design-heavy output.
- Useful when the guide needs a source-backed image for UI generation and coding-driven design claims.
- Helps separate plain code completion from broader product-building workflows.
Source: Official Kimi K2.6 tech blog.
| Benchmark | Kimi K2.6 | Why this row matters |
|---|---|---|
| SWE-Bench Verified | 80.2 | One of the clearest public coding-quality rows in the release. |
| SWE-Bench Pro | 58.6 | Useful when comparing higher-difficulty software engineering tasks. |
| Terminal-Bench 2.0 | 66.7 | Strong signal for multi-step terminal and agent workflows. |
| LiveCodeBench v6 | 89.6 | Clean short-form coding benchmark row for developer audiences. |
| BrowseComp | 83.2 | Shows K2.6 is not only a repo model; it is also strong at browse-heavy tasks. |
| BrowseComp (agent swarm) | 86.3 | Easy proof that the swarm setting adds measurable value. |
| DeepSearchQA (F1 / accuracy) | 92.5 / 83.0 | Useful for search, retrieval, and verification workflows. |
| SciCode | 52.2 | Relevant for technical and scientific coding tasks. |
These are the most reusable rows from the official K2.6 technical blog when writing about coding and agent workflows.
Source: Official Kimi K2.6 tech blog.
Long-horizon coding and swarm execution
Moonshot backs the benchmark table with unusually concrete long-run cases. In one local optimization example, K2.6 spent more than 12 hours on a Qwen3.5-0.8B optimization task, made over 4,000 tool calls across 14 iterations, and improved throughput from about 15 tokens per second to about 193 tokens per second. That is the kind of evidence that helps readers understand whether the model can stay useful after hour one.
The second standout example is an exchange-core case: 13 hours of execution, more than 1,000 tool calls, more than 4,000 lines modified, and reported throughput gains of 185% median and 133% peak. On the Chinese launch side, Moonshot also says K2.6 can keep proactive agents running for five days and coordinate 300 sub-agents over 4,000 steps.
- BrowseComp improves from 83.2 to 86.3 when the swarm setting is enabled in the official table.
- The release repeatedly emphasizes skills from documents and files as part of the multi-agent workflow.
- These long-run examples are one reason K2.6 is easier to position as an agent model than older Kimi family releases.
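Moonshot has not published the orchestration interface behind the swarm setting, so the mechanics are not public. As a minimal sketch of the fan-out pattern the numbers imply, the following toy coordinator dispatches tasks to a bounded pool of sub-agents under a shared step budget; `run_subagent` and the budget semantics are hypothetical stand-ins, not Moonshot's API.

```python
from concurrent.futures import ThreadPoolExecutor

def run_subagent(agent_id: int, task: str) -> dict:
    """Hypothetical stand-in: a real system would call the model API here."""
    return {"agent": agent_id, "task": task, "status": "done"}

def run_swarm(tasks: list[str], max_agents: int = 300, step_budget: int = 4000) -> list[dict]:
    """Fan tasks out to at most `max_agents` workers, capped by a coordinated-step budget."""
    results = []
    steps_used = 0
    with ThreadPoolExecutor(max_workers=min(max_agents, len(tasks))) as pool:
        for result in pool.map(lambda t: run_subagent(hash(t) % max_agents, t), tasks):
            steps_used += 1
            if steps_used > step_budget:
                break  # stop collecting once the step budget is exhausted
            results.append(result)
    return results
```

The point of the sketch is only the shape of the claim: 300 sub-agents and 4,000 steps describe a bounded fan-out with a global budget, not an unbounded agent loop.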
Official image
K2.6 shows a full Mac inference optimization case, not only benchmark rows
Moonshot uses this official case-study image to support the long-horizon execution story: more than 12 hours, 4,000+ tool calls, 14 iterations, and throughput rising from about 15 to about 193 tokens per second.
- Useful when readers want proof that K2.6 can stay productive across hours of engineering work.
- Adds concrete runtime evidence to the benchmark section without relying on a third-party recap.
Source: Official Kimi K2.6 tech blog.
Official image
The exchange-core case is a concrete production-style refactor example
This official graphic backs the exchange-core story on the K2.6 blog: 13 hours of execution, 1,000+ tool calls, more than 4,000 lines modified, and major throughput gains on a mature open-source system.
- Useful for readers who care more about sustained engineering output than a one-shot coding demo.
- Pairs naturally with the long-horizon section because it is the strongest public visual for that claim set.
Source: Official Kimi K2.6 tech blog.
Official image
Moonshot also publishes a Claw-oriented visual for agent workflow evaluation
This image comes from the official K2.6 blog and supports the swarm and agent-loop narrative with a first-party benchmark visual instead of an unofficial screenshot or repost.
- Useful for articles that position K2.6 around autonomous tool use and multi-agent workflows.
- Best paired with the swarm-execution section rather than treated as a generic coding benchmark.
Source: Official Kimi K2.6 tech blog.
| Case | Public result | What it proves |
|---|---|---|
| Local Qwen3.5-0.8B optimization | 12+ hours, 4,000+ tool calls, 14 iterations, ~15 to ~193 tokens/s | Long-horizon repo optimization is a first-class use case in the official blog. |
| exchange-core case | 13 hours, 1,000+ tool calls, 4,000+ lines modified | K2.6 is presented as capable of sustained production-style refactoring. |
| exchange-core outcome | 185% median throughput gain, 133% peak throughput gain | The case study ties long runtime to measurable performance output. |
| Agent swarm scale | 300 sub-agents, 4,000 coordinated steps | Swarm orchestration is part of the public product narrative, not a private demo. |
| Proactive agent endurance | Up to 5 days | The launch post frames K2.6 as a durable agent, not only a single-request coder. |
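The Mac inference case gives both endpoints (~15 to ~193 tokens/s), so the implied gain can be recomputed directly; the exchange-core figures (185% median, 133% peak) cannot, because the baselines are not published. A quick sanity-check helper:

```python
def percent_gain(before: float, after: float) -> float:
    """Relative throughput gain, as a percentage of the baseline."""
    return round((after - before) / before * 100, 1)

# Mac inference case from the official blog: ~15 -> ~193 tokens/s
mac_gain = percent_gain(15, 193)  # -> 1186.7, i.e. roughly a 13x speedup
```

Note the scale difference: the Mac case is a four-digit percentage gain, which is why Moonshot leads with it, while the exchange-core numbers are more modest production-style improvements.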
Pricing and route clarity: API is not Kimi Code membership
The Moonshot Open Platform pricing page publishes a dedicated `kimi-k2.6` row with enough detail to write a clean route-aware pricing section. The row uses a 1M token billing unit, lists cached input, input, and output separately, and confirms a 262,144-token context window.
Just as important, this API price list is not Kimi Code membership pricing; a guide that mixes the two stops being useful. Kimi Code is the product route for Moonshot’s first-party coding client, while the Moonshot Open Platform table covers programmatic API billing.
- The pricing page notes text, image, and video input support.
- The same public route notes thinking and non-thinking modes, auto context cache, ToolCalls, JSON Mode, Partial Mode, and web search.
- Use Moonshot Open Platform docs for API pricing and the Kimi Code site for membership-style product messaging.
| Billing item | Public price | Notes |
|---|---|---|
| Cached input | ¥1.10 / 1M tokens | Published on the dedicated `chat-k26` pricing page. |
| Input | ¥6.50 / 1M tokens | Public Open Platform API route. |
| Output | ¥27.00 / 1M tokens | Public Open Platform API route. |
| Context window | 262,144 tokens | Listed on the same pricing row. |
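The published per-token prices make per-request costs easy to estimate. The sketch below encodes the three published rates and the 262,144-token context window; the assumption that input and output tokens share that window is ours, not something the pricing page states.

```python
# Published kimi-k2.6 prices (CNY per 1M tokens) from the Moonshot Open Platform page
PRICE_PER_M = {"cached_input": 1.10, "input": 6.50, "output": 27.00}
CONTEXT_WINDOW = 262_144

def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate one request's cost in CNY; cached tokens bill at the cached-input rate."""
    if input_tokens + output_tokens > CONTEXT_WINDOW:
        raise ValueError("request exceeds the 262,144-token context window")
    fresh_input = input_tokens - cached_tokens
    cost = (cached_tokens * PRICE_PER_M["cached_input"]
            + fresh_input * PRICE_PER_M["input"]
            + output_tokens * PRICE_PER_M["output"]) / 1_000_000
    return round(cost, 4)

# 100k input + 20k output with no cache: 0.65 + 0.54 = 1.19 CNY
cost = estimate_cost(100_000, 20_000)  # -> 1.19
```

The cached-input rate matters most for agent loops, where the same long context is resent on every step; at these rates a fully cached prefix is roughly six times cheaper than fresh input.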
Open-source deployment and ecosystem routes
Kimi K2.6 is one of the few just-launched Chinese coding models where the hosted route and the open-source route are both already easy to reference. Moonshot’s quickstart page documents the API path for `kimi-k2.6`, while Hugging Face gives a concrete self-hosting entry point for readers who want to evaluate or deploy it outside Moonshot’s managed platform.
For a guide, the cleanest structure is to separate these routes explicitly: first-party Kimi Code for product UX, Moonshot Open Platform for API billing, and Hugging Face for self-hosted evaluation. That makes the article materially more useful than a one-column “price + benchmark” summary.
| Route | What is public | Best fit |
|---|---|---|
| Kimi Code | First-party coding product route referenced in the launch post | Teams that want Moonshot’s client experience instead of token billing. |
| Moonshot Open Platform | Pricing page plus quickstart docs for `kimi-k2.6` | Developers who need programmable API access. |
| Hugging Face release | Public Kimi K2.6 model page | Readers who want self-hosting, evaluation, or ecosystem packaging. |
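For the Open Platform route, a request against the documented `kimi-k2.6` model name can be sketched as below. Moonshot's API has historically followed an OpenAI-compatible chat-completions shape, but the endpoint URL and field names here are assumptions to verify against the official quickstart, not quoted from it.

```python
import json

# Hypothetical endpoint; confirm against the kimi-k2.6 quickstart before use.
BASE_URL = "https://api.moonshot.cn/v1/chat/completions"

def build_request(prompt: str) -> dict:
    """Assemble a chat-completions payload for the documented `kimi-k2.6` model name."""
    return {
        "model": "kimi-k2.6",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.6,  # illustrative default, not an official recommendation
    }

payload = build_request("Refactor this function to remove the N+1 query.")
body = json.dumps(payload)  # POST this to BASE_URL with an Authorization header
```

The same payload shape should work unchanged against a self-hosted deployment of the Hugging Face release served behind an OpenAI-compatible inference server, which is what makes the two routes easy to compare.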
Use Kimi K2.6 when you need both benchmark strength and route clarity
The official source set is unusually complete: launch post, technical blog, pricing row, quickstart docs, and an open-source release. That makes K2.6 one of the easiest new coding models to evaluate without mixing product routes or inventing missing details.
Frequently asked questions
Is Kimi K2.6 the same thing as Kimi Code membership?
No. Kimi Code is Moonshot’s first-party coding product route. The `kimi-k2.6` pricing page belongs to the Moonshot Open Platform API route. They are related, but they are not the same billing surface and should not be treated as one price table.
Which Kimi K2.6 benchmark numbers are the safest to quote?
The safest rows are the ones published directly in the official English technical blog: SWE-Bench Verified 80.2, SWE-Bench Pro 58.6, Terminal-Bench 2.0 66.7, LiveCodeBench v6 89.6, BrowseComp 83.2, BrowseComp (agent swarm) 86.3, and DeepSearchQA 92.5 F1 / 83.0 accuracy.
How much context and API pricing is publicly documented for Kimi K2.6?
The public Moonshot Open Platform pricing row lists a 262,144-token context window and prices of ¥1.10 cached input, ¥6.50 input, and ¥27.00 output per 1M tokens.
Is Kimi K2.6 open source?
Yes. Moonshot AI frames K2.6 as an open-source release and publishes a public Hugging Face page for the model, giving readers a traceable self-hosting route in addition to the hosted API route.
What makes K2.6 different from K2.5 in practice?
The public material for K2.6 is much more explicit about long-horizon coding, swarm orchestration, and concrete agent runtime. It also publishes a broader benchmark table and a dedicated `kimi-k2.6` pricing page, which makes it easier to evaluate as a production coding model rather than only as a general-purpose family update.