Z.AI (Zhipu) released GLM-5.1 on April 7, a 754-billion-parameter open-weight model that claims the top spot on SWE-Bench Pro with a score of 58.4, surpassing Claude Opus 4.6, GPT-5.4, and Gemini 3.1 Pro on the autonomous coding benchmark.
What Happened
GLM-5.1 uses a mixture-of-experts architecture that activates roughly 40 billion parameters per token. It supports a 200,000-token context window with up to 128,000 output tokens, and Z.AI says the model can sustain autonomous execution on a single coding task for up to eight hours without human intervention.
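Those two limits interact: whatever portion of the 200K window the prompt consumes is unavailable for output. A minimal sketch of that budgeting, using the figures above (the helper function and token counts are illustrative assumptions, not part of Z.AI's API):

```python
# Stated GLM-5.1 limits: 200K-token context window, up to 128K output tokens.
CONTEXT_WINDOW = 200_000
MAX_OUTPUT_TOKENS = 128_000

def fits_in_context(prompt_tokens: int, requested_output: int) -> bool:
    """Check whether a prompt plus requested output fits the model's window."""
    if requested_output > MAX_OUTPUT_TOKENS:
        return False  # exceeds the per-response output cap
    return prompt_tokens + requested_output <= CONTEXT_WINDOW

# A 150K-token repository dump leaves at most 50K tokens for the response.
print(fits_in_context(150_000, 50_000))
print(fits_in_context(150_000, 60_000))
```

In practice this means a long agentic session must either summarize or truncate earlier turns once the combined history approaches the window.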
The model is released as open weights, and it is the first non-US model to top SWE-Bench Pro. Z.AI published API documentation alongside the release.
Why It Matters
An open-weight model matching or exceeding closed frontier models on agentic coding benchmarks shifts the competitive landscape for AI-assisted development. Creative professionals using AI coding tools for workflow automation, plugin development, and pipeline scripting now have a new option that can run locally or via API without vendor lock-in.
GLM-5.1 also posts strong numbers on reasoning benchmarks: 95.3 on AIME 2026, 82.6 on HMMT Feb. 2026, and 86.2 on GPQA-Diamond. On agentic benchmarks, it scores 68.7 on CyberGym, 68.0 on BrowseComp, and 71.8 on MCP-Atlas, all substantial improvements over its predecessor GLM-5.
Key Details
- Parameters: 754B total, ~40B active per token (mixture-of-experts)
- Context: 200K-token window, up to 128K output tokens
- SWE-Bench Pro: 58.4 (first place, beating Opus 4.6 and GPT-5.4)
- Autonomous runtime: Up to 8 hours on a single task
- License: Open weights
What to Do Next
Developers can access GLM-5.1 through the Z.AI API. The open weights are available for local deployment. For a comparison of how GLM-5.1 stacks up against other coding tools, see our roundup of AI coding tools in 2026. For context on the open-source AI race, read our analysis of Gemma 4 and the open-source playbook.
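For developers evaluating the API route, here is a minimal sketch of a chat-completion request, assuming Z.AI exposes an OpenAI-compatible endpoint. The URL and model identifier are assumptions for illustration; consult Z.AI's published API documentation for the actual values.

```python
import json

# Assumed endpoint and model name -- verify against Z.AI's API docs.
API_URL = "https://api.z.ai/v1/chat/completions"

def build_request(prompt: str, max_tokens: int = 4096) -> dict:
    """Build the JSON payload for a single coding request."""
    return {
        "model": "glm-5.1",  # assumed model identifier
        "messages": [
            {"role": "system", "content": "You are a coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }

payload = build_request("Write a pytest fixture that creates a temp SQLite DB.")
print(json.dumps(payload, indent=2))
```

The same payload shape works for most OpenAI-compatible local servers, which is one reason open-weight releases are easy to slot into existing tooling.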