Microsoft MAI-Thinking-1 Tops Claude Sonnet in User Evals

Microsoft announced MAI-Thinking-1 on June 2, a sparse Mixture of Experts reasoning model that beats Claude Sonnet 4.6 in blind user evaluations across 1,276 tasks and matches Claude Opus 4.6 on SWE-Bench Pro. The model entered private preview on Azure AI Foundry today, with a public preview planned for the MAI Playground.

What Happened

MAI-Thinking-1 uses a sparse Mixture of Experts architecture with 35 billion active parameters and approximately 1 trillion total parameters. Microsoft trained the model from scratch on clean, commercially licensed data with no distillation from third-party models, positioning it for enterprise deployments where training data licensing is audited in procurement reviews. The announcement is part of Microsoft's Build 2026 push, which also included MAI-Code-1-Flash (a 5B coding model now live in GitHub Copilot) and the MAI v2 media suite for image, voice, and transcription.

Why It Matters

The benchmark numbers are meaningful for LLM workflow builders. A 97.0 percent score on AIME 2025 and 94.5 percent on AIME 2026 place MAI-Thinking-1 among the strongest reasoning models at any parameter count. On SWE-Bench Pro, the model matches Claude Opus 4.6, a notable result for a 35B active-parameter model at a fraction of the inference cost of larger frontier models. The 256k token context window (approximately 600 pages) and Chat Completions API compatibility mean it can drop into existing Claude or OpenAI pipelines without a rewrite.

Key Details

Architecture: Sparse Mixture of Experts, 35B active / ~1T total parameters
Context window: 256k tokens (~600 pages)
AIME 2025: 97.0 percent
AIME 2026: 94.5 percent
SWE-Bench Pro: Matches Claude Opus 4.6
User preference: Preferred over Claude Sonnet 4.6 across 1,276 blind side-by-side tasks
Training: No third-party distillation; clean, commercially licensed data only
API: Chat Completions API compatible
Pricing: Not yet disclosed
Access: Private preview now on Azure AI Foundry; public preview on MAI Playground coming soon

What to Do Next

Request private preview access through Azure AI Foundry to be positioned for the public rollout on the MAI Playground. If you build LLM-powered creative tools, add MAI-Thinking-1 to your evaluation list now: the Chat Completions API compatibility means you can swap an endpoint in an existing pipeline and run your standard eval suite without additional integration work. The 256k context window is particularly useful for document-heavy workflows such as parsing long-form content, comparing multiple sources simultaneously, or maintaining context across long creative sessions. Microsoft's benchmark methodology on the announcement page is worth reviewing before designing your own evals.

Microsoft MAI-Thinking-1 Tops Claude Sonnet in User Evals

What Happened

Why It Matters

Key Details

What to Do Next

Keep reading

ComfyUI v0.29.0 Adds HeyGen, GPT-5.6, and Gemma4 Nodes

Sessiongrep: Searchable Memory for AI Coding Agents

How to Make YouTube Thumbnails With AI (2026 Guide)

What Happened

Why It Matters

Key Details

What to Do Next

Stay ahead of AI

Keep reading

ComfyUI v0.29.0 Adds HeyGen, GPT-5.6, and Gemma4 Nodes

Sessiongrep: Searchable Memory for AI Coding Agents

How to Make YouTube Thumbnails With AI (2026 Guide)

Stay ahead of Creative AI