Microsoft used its Build 2026 keynote on June 2 to ship MAI-Code-1-Flash, an in-house coding model that is live today inside the GitHub Copilot model picker for Free, Pro, Pro+, and Max tiers. Microsoft says the model was trained inside Copilot's production harness rather than benchmarked externally, and it claims a +16-point lead over Claude Haiku 4.5 on SWE-Bench Pro (51.2 percent versus 35.2 percent).

Try It Today

Open Visual Studio Code, sign in to GitHub Copilot, and pick MAI-Code-1-Flash from the model picker (or let auto-selection route to it on quick edits and inline chat). No new install, no waitlist, no separate billing tier. Run it against a representative slice of your real codebase for a day: log acceptance rate, suggestion latency, and how often you reach for a heavier model. Microsoft's pitch is that Flash answers brief asks in fewer tokens and reserves longer reasoning for harder problems, so a like-for-like comparison against your current default (GPT-5.5, Claude Sonnet 4.6, or Haiku 4.5) tells you whether the swap holds up on your code.

Why It Matters

Until today, the small-fast tier of coding assistants was largely an Anthropic story. Claude Haiku 4.5 set the bar for low-latency pair programming at $1 input / $5 output per million tokens, and the Cursor and Windsurf integrations leaned on it. MAI-Code-1-Flash is the first model from a non-Anthropic top-tier lab that ships specifically against that envelope, and it lands inside the developer tool with the largest install base. For solo creators and small teams who pay for Copilot Pro, the Flash tier is a free upgrade with no migration cost.

Key Details

  • Performance: 85.8 percent adjusted accuracy on Microsoft's internal 186-question benchmark across 34 categories, per Microsoft's announcement.
  • SWE-Bench Pro: 51.2 percent, putting Flash in the same tier as GPT-5.3 on the SWE-Bench coding evaluation suite, per third-party Build coverage.
  • Token efficiency: up to 60 percent fewer tokens on hard SWE-Bench Verified tasks vs comparable models.
  • Training: real Copilot tool interactions including multi-step file editing, terminal calls, context retrieval, and inline chat flows.
  • Strategic fit: lands alongside Project Polaris, the in-house MoE flagship that takes over as Copilot's default in August (see our Polaris coverage). Flash is the fast tier, Polaris is the deep tier.

What to Do Next

If you bill Copilot to a client or business, pin MAI-Code-1-Flash to one project and your current default to another for a week, then compare acceptance rate and time-to-merge. The model is free to use across all paid tiers, so the test costs nothing beyond your existing subscription. Note that Copilot moved to usage-based billing on June 1, so heavier model usage now shows up directly on the bill, which is exactly the price-pressure Microsoft is hoping Flash relieves.