Moonshot AI released Kimi K2.7-Code on June 12, 2026, an open-weights coding model that scores 81.1 percent on MCPMark Verified, ahead of Claude Opus 4.8's 76.4 percent on the same tool-use benchmark. It ships under a Modified MIT license with full weights on Hugging Face, and API pricing far below the frontier closed models.
Try It: Swap It Into Your Coding Agent
If you run an agentic coding setup, K2.7-Code is a drop-in test. Pull the weights from Hugging Face to run locally, or point your existing CLI at the Moonshot API at $0.95 per million input tokens and $4.00 per million output tokens. Cache hits drop to $0.19 per million. Run your usual multi-step refactor or test-writing task against it and compare completion rate to your current model before you commit to a monthly bill.
Why It Matters for Creators
Builders who lean on AI agents for real work have been paying frontier prices for long-horizon coding. A 1-trillion-parameter open model that beats Opus 4.8 on tool use, at a fraction of the cost and with weights you can self-host, changes the calculus for anyone shipping side projects or running cost-sensitive pipelines. Open weights also mean no rate limits and no vendor lock-in for teams that want full control.
Key Details
Architecture: Mixture-of-experts with 1 trillion total parameters and 32 billion active per token, 256K context.
Benchmarks: Gains over Kimi K2.6 of 21.8 percent on Kimi Code Bench v2, 11.0 percent on Program Bench, and 31.5 percent on MLS Bench Lite.
Efficiency: Roughly 30 percent lower reasoning-token usage than K2.6, which trims both latency and spend.
License: Modified MIT, weights downloadable from the Kimi K2 repository.
What to Do Next
Download the weights if you have the hardware, or test the hosted API on a real task from your backlog this week. If you already use a Claude or GPT coding agent, run the same prompt through K2.7-Code and measure the difference in success rate and cost. For most independent builders, an open model that matches frontier tool use is worth an afternoon of benchmarking.