Model Guide · 12 min read · Reviewed Apr 21, 2026

Kimi K2.6: Open-Source Coding, 300-Agent Swarms, and 80.2 SWE-Bench Verified

Kimi K2.6 is Moonshot AI’s April 20, 2026 coding-and-agent release. The public materials now give buyers a much cleaner evidence chain than earlier previews: an official Chinese launch post, an English technical blog with a full benchmark table, a dedicated Moonshot Open Platform pricing page, a quickstart doc, and an open-source Hugging Face release. Together they make K2.6 one of the easiest new Chinese coding models to cover without guesswork.

Published Apr 19, 2026 · Updated Apr 21, 2026
  • Official Apr 20, 2026 rollout across kimi.com, the app, API, and Kimi Code, with an open-source release on Hugging Face.
  • Public benchmark table includes SWE-Bench Verified 80.2, SWE-Bench Pro 58.6, Terminal-Bench 2.0 66.7, and LiveCodeBench v6 89.6.
  • Agent-swarm workflows scale to 300 sub-agents and 4,000 coordinated steps, with BrowseComp rising to 86.3 in the swarm setting.
  • Official API pricing for `kimi-k2.6` is ¥1.10 cached input, ¥6.50 input, and ¥27.00 output per 1M tokens, with a 262,144-token context window.
Quick note: This guide is based on public docs and release pages, but you should still verify current pricing, limits, supported tools, and region-specific billing on the official source before you pay, subscribe, or integrate.

What shipped on April 20, 2026

Moonshot AI’s Chinese launch post says Kimi K2.6 is available across kimi.com, the app, API, and Kimi Code. The same release also positions K2.6 as an open-source model, which Moonshot backs up with a public Hugging Face entry. That combination matters because it gives readers both a hosted route and a self-hosted route from day one.

The launch story is explicitly workflow-first. Moonshot highlights roughly 20% improvement on Kimi Code Bench, a 13-hour continuous coding case that modified more than 4,000 lines, and a swarm-style agent system that can coordinate 300 sub-agents across 4,000 steps. This is not framed as a casual chat refresh. It is framed as a long-horizon coding and agent release.

Kimi K2.6 coding and agent snapshot infographic
The official Kimi K2.6 tech blog and pricing page give enough detail to cover coding benchmarks, long-horizon execution, agent swarms, and current API pricing without guesswork. Source: Official Kimi K2.6 tech blog.
Official Kimi K2.6 hero image from the technical blog

Official image

Moonshot presents K2.6 as a coding-and-agent release from the first screen

The official K2.6 technical blog already combines the hero visual, product rollout, and the “open-source coding” positioning in one page. It is the strongest public image asset for the release.

  • Useful for connecting the hosted API route, Kimi Code, and the open-source release in one card.
  • Matches the benchmark table and long-horizon coding cases published lower on the same official page.

Source: Official Kimi K2.6 tech blog.

Kimi K2.6 public launch snapshot
| Signal | What is public today | Why it matters |
| --- | --- | --- |
| Availability | kimi.com, app, API, and Kimi Code | This is a broad product rollout, not just a lab post. |
| Open-source route | Public Hugging Face release | Teams can inspect or self-host the released model. |
| Coding uplift | ~20% improvement on Kimi Code Bench | Moonshot positions K2.6 as a real coding upgrade over the prior generation. |
| Agent scale | 300 sub-agents and 4,000 steps | K2.6 is explicitly marketed for long-horizon orchestration. |
| Endurance examples | 13-hour coding case and 5-day proactive agent run | These examples are stronger than a simple benchmark screenshot. |
| API model name | `kimi-k2.6` | The route is already documented on the Moonshot Open Platform side. |

Benchmark rows you can quote directly

The English Kimi K2.6 technical blog is the strongest source in the whole release bundle because it publishes a broad benchmark table instead of a handful of cropped hero claims. That makes it possible to quote K2.6 across coding, agent, search, and reasoning tasks without inventing unsupported comparisons.

For coding-focused readers, the cleanest rows are SWE-Bench Verified 80.2, SWE-Bench Pro 58.6, Terminal-Bench 2.0 66.7, SciCode 52.2, OJBench (Python) 60.6, and LiveCodeBench v6 89.6. For web and agent tasks, BrowseComp 83.2 and BrowseComp (agent swarm) 86.3 are especially easy to explain.

Official Kimi Code Bench image from the Kimi K2.6 technical blog

Official image

Moonshot publishes the Kimi Code Bench uplift directly in the technical blog

The official K2.6 blog does not stop at prose. It embeds the Kimi Code Bench graphic, which makes the long-horizon coding improvement over K2.5 easier to document from a first-party source.

  • Useful for articles that want a source-backed coding-specific visual instead of only quoting the benchmark table.
  • Works well beside the main benchmark section because it comes from the same official blog page.

Source: Official Kimi K2.6 tech blog.

Official Kimi Design Bench image from the Kimi K2.6 technical blog

Official image

Kimi K2.6 also extends into coding-driven design and website generation

The Kimi Design Bench graphic on the official blog helps explain that K2.6 is not only a repo-coding model. Moonshot also positions it around front-end generation, full-stack workflows, and design-heavy output.

  • Useful when the guide needs a source-backed image for UI generation and coding-driven design claims.
  • Helps separate plain code completion from broader product-building workflows.

Source: Official Kimi K2.6 tech blog.

Key Kimi K2.6 benchmark rows from the public technical blog
| Benchmark | Kimi K2.6 | Why this row matters |
| --- | --- | --- |
| SWE-Bench Verified | 80.2 | One of the clearest public coding-quality rows in the release. |
| SWE-Bench Pro | 58.6 | Useful when comparing higher-difficulty software engineering tasks. |
| Terminal-Bench 2.0 | 66.7 | Strong signal for multi-step terminal and agent workflows. |
| LiveCodeBench v6 | 89.6 | Clean short-form coding benchmark row for developer audiences. |
| BrowseComp | 83.2 | Shows K2.6 is not only a repo model; it is also strong at browse-heavy tasks. |
| BrowseComp (agent swarm) | 86.3 | Easy proof that the swarm setting adds measurable value. |
| DeepSearchQA (f1 / accuracy) | 92.5 / 83.0 | Useful for search, retrieval, and verification workflows. |
| SciCode | 52.2 | Relevant for technical and scientific coding tasks. |
Kimi K2.6 public benchmark highlights

These are the most reusable rows from the official K2.6 technical blog when writing about coding and agent workflows.

  • LiveCodeBench v6: 89.6
  • BrowseComp (agent swarm): 86.3
  • SWE-Bench Verified: 80.2
  • DeepSearchQA (f1): 92.5
  • Terminal-Bench 2.0: 66.7
  • SWE-Bench Pro: 58.6

Source: Official Kimi K2.6 tech blog.

Long-horizon coding and swarm execution

Moonshot backs the benchmark table with unusually concrete long-run cases. In one local optimization example, K2.6 spent more than 12 hours on a Qwen3.5-0.8B optimization task, made over 4,000 tool calls across 14 iterations, and improved throughput from about 15 tokens per second to about 193 tokens per second. That is the kind of evidence that helps readers understand whether the model can stay useful after hour one.
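The headline numbers in that case are easy to sanity-check: going from about 15 to about 193 tokens per second is roughly a 12.9x throughput improvement.

```python
# Sanity-check the reported local-optimization gain (~15 -> ~193 tokens/s),
# using the approximate figures quoted in the official case study.
before, after = 15.0, 193.0
speedup = after / before
print(f"{speedup:.1f}x")  # roughly a 12.9x throughput improvement
```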

The second standout example is an exchange-core case: 13 hours of execution, more than 1,000 tool calls, more than 4,000 lines modified, and reported throughput gains of 185% median and 133% peak. On the Chinese launch side, Moonshot also says K2.6 can keep proactive agents running for five days and coordinate 300 sub-agents over 4,000 steps.

  • BrowseComp improves from 83.2 to 86.3 when the swarm setting is enabled in the official table.
  • The release repeatedly emphasizes skills from documents and files as part of the multi-agent workflow.
  • These long-run examples are one reason K2.6 is easier to position as an agent model than older Kimi family releases.
Official K2.6 Qwen3.5-0.8B Mac inference optimization case image

Official image

K2.6 shows a full Mac inference optimization case, not only benchmark rows

Moonshot uses this official case-study image to support the long-horizon execution story: more than 12 hours, 4,000+ tool calls, 14 iterations, and throughput rising from about 15 to about 193 tokens per second.

  • Useful when readers want proof that K2.6 can stay productive across hours of engineering work.
  • Adds concrete runtime evidence to the benchmark section without relying on a third-party recap.

Source: Official Kimi K2.6 tech blog.

Official K2.6 exchange-core coding showcase image

Official image

The exchange-core case is a concrete production-style refactor example

This official graphic backs the exchange-core story on the K2.6 blog: 13 hours of execution, 1,000+ tool calls, more than 4,000 lines modified, and major throughput gains on a mature open-source system.

  • Useful for readers who care more about sustained engineering output than a one-shot coding demo.
  • Pairs naturally with the long-horizon section because it is the strongest public visual for that claim set.

Source: Official Kimi K2.6 tech blog.

Official Kimi Claw Bench image from the Kimi K2.6 technical blog

Official image

Moonshot also publishes a Claw-oriented visual for agent workflow evaluation

This image comes from the official K2.6 blog and supports the swarm and agent-loop narrative with a first-party benchmark visual instead of an unofficial screenshot or repost.

  • Useful for articles that position K2.6 around autonomous tool use and multi-agent workflows.
  • Best paired with the swarm-execution section rather than treated as a generic coding benchmark.

Source: Official Kimi K2.6 tech blog.

Public long-run examples and swarm signals for K2.6
| Case | Public result | What it proves |
| --- | --- | --- |
| Local Qwen3.5-0.8B optimization | 12+ hours, 4,000+ tool calls, 14 iterations, ~15 to ~193 tokens/s | Long-horizon repo optimization is a first-class use case in the official blog. |
| exchange-core case | 13 hours, 1,000+ tool calls, 4,000+ lines modified | K2.6 is presented as capable of sustained production-style refactoring. |
| exchange-core outcome | 185% median throughput gain, 133% peak throughput gain | The case study ties long runtime to measurable performance output. |
| Agent swarm scale | 300 sub-agents, 4,000 coordinated steps | Swarm orchestration is part of the public product narrative, not a private demo. |
| Proactive agent endurance | Up to 5 days | The launch post frames K2.6 as a durable agent, not only a single-request coder. |

Pricing and route clarity: API is not Kimi Code membership

The Moonshot Open Platform pricing page publishes a dedicated `kimi-k2.6` row with enough detail to write a clean route-aware pricing section. The row uses a 1M token billing unit, lists cached input, input, and output separately, and confirms a 262,144-token context window.

Just as important, this API pricing page is not the same product as Kimi Code membership pricing. If a guide mixes the two, it stops being useful. Kimi Code is the product route for Moonshot’s first-party coding client. The Moonshot Open Platform table is the route for programmatic API billing.

  • The pricing page notes text, image, and video input support.
  • The same public route notes thinking and non-thinking modes, auto context cache, ToolCalls, JSON Mode, Partial Mode, and web search.
  • Use Moonshot Open Platform docs for API pricing and the Kimi Code site for membership-style product messaging.
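As a sketch of what the API route looks like in practice, the snippet below builds a chat-completion request body for `kimi-k2.6` with JSON Mode enabled. The field names follow the common OpenAI-style schema, which is an assumption here rather than a confirmed detail of this release; check them against the official quickstart before integrating.

```python
import json

# Hypothetical request body for the Moonshot Open Platform chat route.
# Field names follow the common OpenAI-style schema (an assumption) and
# should be verified against the official kimi-k2.6 quickstart before use.
payload = {
    "model": "kimi-k2.6",
    "messages": [
        {"role": "system", "content": "You are a careful coding assistant."},
        {"role": "user", "content": "Refactor this function and explain the change."},
    ],
    # JSON Mode, one of the features listed on the public pricing page.
    "response_format": {"type": "json_object"},
    "temperature": 0.3,
}

body = json.dumps(payload, ensure_ascii=False)
print(body[:60])
```

Serializing the payload up front keeps the sketch independent of any HTTP client; the same dict can be posted with whatever library a team already uses.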
Official Moonshot Open Platform pricing for `kimi-k2.6`
| Billing item | Public price | Notes |
| --- | --- | --- |
| Cached input | ¥1.10 / 1M tokens | Published on the dedicated `chat-k26` pricing page. |
| Input | ¥6.50 / 1M tokens | Public Open Platform API route. |
| Output | ¥27.00 / 1M tokens | Public Open Platform API route. |
| Context window | 262,144 tokens | Listed on the same pricing row. |
BuyGLM shows package prices in USD. When a source page is published in CNY, the displayed value uses a fixed 1 USD = 8 CNY conversion and should still be checked against the live vendor page before payment.
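To make the rates concrete, here is a minimal cost estimate using the published per-1M-token prices. The token counts are a hypothetical workload, and the 1 USD = 8 CNY conversion mirrors the fixed rate noted above, not a live exchange rate.

```python
# Published kimi-k2.6 rates (CNY per 1M tokens) from the Moonshot Open Platform page.
RATES_CNY_PER_M = {"cached_input": 1.10, "input": 6.50, "output": 27.00}
CNY_PER_USD = 8.0  # fixed conversion used in this guide; check the live rate

def estimate_cost_cny(cached_input=0, fresh_input=0, output=0):
    """Estimate one request's cost in CNY from raw token counts."""
    tokens = {"cached_input": cached_input, "input": fresh_input, "output": output}
    return sum(RATES_CNY_PER_M[k] * n / 1_000_000 for k, n in tokens.items())

# Hypothetical long-context coding session: 200k cached input, 30k fresh input, 8k output.
cost_cny = estimate_cost_cny(cached_input=200_000, fresh_input=30_000, output=8_000)
print(f"≈ ¥{cost_cny:.3f} (≈ ${cost_cny / CNY_PER_USD:.3f})")
```

The cached-input rate dominates long-context sessions, which is why the three line items are worth estimating separately rather than quoting a single blended price.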

Open-source deployment and ecosystem routes

Kimi K2.6 is one of the few just-launched Chinese coding models where the hosted route and the open-source route are both already easy to reference. Moonshot’s quickstart page documents the API path for `kimi-k2.6`, while Hugging Face gives a concrete self-hosting entry point for readers who want to evaluate or deploy it outside Moonshot’s managed platform.

For a guide, the cleanest structure is to separate these routes explicitly: first-party Kimi Code for product UX, Moonshot Open Platform for API billing, and Hugging Face for self-hosted evaluation. That makes the article materially more useful than a one-column “price + benchmark” summary.

Kimi K2.6 route map for buyers and builders
| Route | What is public | Best fit |
| --- | --- | --- |
| Kimi Code | First-party coding product route referenced in the launch post | Teams that want Moonshot’s client experience instead of token billing. |
| Moonshot Open Platform | Pricing page plus quickstart docs for `kimi-k2.6` | Developers who need programmable API access. |
| Hugging Face release | Public Kimi K2.6 model page | Readers who want self-hosting, evaluation, or ecosystem packaging. |

Use Kimi K2.6 when you need both benchmark strength and route clarity

The official source set is unusually complete: launch post, technical blog, pricing row, quickstart docs, and an open-source release. That makes K2.6 one of the easiest new coding models to evaluate without mixing product routes or inventing missing details.

Sources and official links

Frequently asked questions

Is Kimi K2.6 the same thing as Kimi Code membership?

No. Kimi Code is Moonshot’s first-party coding product route. The `kimi-k2.6` pricing page belongs to the Moonshot Open Platform API route. They are related, but they are not the same billing surface and should not be treated as one price table.

Which Kimi K2.6 benchmark numbers are the safest to quote?

The safest rows are the ones published directly in the official English technical blog: SWE-Bench Verified 80.2, SWE-Bench Pro 58.6, Terminal-Bench 2.0 66.7, LiveCodeBench v6 89.6, BrowseComp 83.2, BrowseComp (agent swarm) 86.3, and DeepSearchQA 92.5 f1 / 83.0 accuracy.

How much context and API pricing is publicly documented for Kimi K2.6?

The public Moonshot Open Platform pricing row lists a 262,144-token context window and prices of ¥1.10 cached input, ¥6.50 input, and ¥27.00 output per 1M tokens.

Is Kimi K2.6 open source?

Yes. Moonshot AI frames K2.6 as an open-source release and publishes a public Hugging Face page for the model, giving readers a traceable self-hosting route in addition to the hosted API route.

What makes K2.6 different from K2.5 in practice?

The public material for K2.6 is much more explicit about long-horizon coding, swarm orchestration, and concrete agent runtime. It also publishes a broader benchmark table and a dedicated `kimi-k2.6` pricing page, which makes it easier to evaluate as a production coding model rather than only as a general-purpose family update.