Z.AI (Zhipu) released GLM-5.1 on April 7, a 754-billion-parameter open-weight model that claims the top spot on SWE-Bench Pro with a score of 58.4, surpassing Claude Opus 4.6, GPT-5.4, and Gemini 3.1 Pro on the autonomous coding benchmark.

What Happened

GLM-5.1 is a mixture-of-experts architecture with roughly 40 billion active parameters per token. It supports a 200,000-token context window with up to 128,000 output tokens, and Z.AI says the model can sustain autonomous execution on a single coding task for up to eight hours without human intervention.

The model is released as open weights, and it is the first non-US model to top SWE-Bench Pro. Z.AI published API documentation alongside the release.

Why It Matters

An open-weight model matching or exceeding closed frontier models on agentic coding benchmarks shifts the competitive landscape for AI-assisted development. Creative professionals using AI coding tools for workflow automation, plugin development, and pipeline scripting now have a new option that can run locally or via API without vendor lock-in.

GLM-5.1 also posts strong numbers on reasoning benchmarks: 95.3 on AIME 2026, 82.6 on HMMT Feb. 2026, and 86.2 on GPQA-Diamond. On agentic benchmarks, it scores 68.7 on CyberGym, 68.0 on BrowseComp, and 71.8 on MCP-Atlas, all substantial improvements over its predecessor GLM-5.

Key Details

  • Parameters: 754B total, ~40B active per token (mixture-of-experts)
  • Context: 200K input, 128K output tokens
  • SWE-Bench Pro: 58.4 (first place, beating Opus 4.6 and GPT-5.4)
  • Autonomous runtime: Up to 8 hours on a single task
  • License: Open weights

What to Do Next

Developers can access GLM-5.1 through the Z.AI API. The open weights are available for local deployment. For a comparison of how GLM-5.1 stacks up against other coding tools, see our roundup of AI coding tools in 2026. For context on the open-source AI race, read our analysis of Gemma 4 and the open-source playbook.
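As a minimal sketch of what an API call might look like, the snippet below builds a chat-completion request using Python's standard library. The endpoint URL, the model identifier `glm-5.1`, and the request schema are assumptions (modeled on the common OpenAI-compatible format), not confirmed details; check Z.AI's published API documentation for the actual values.

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint and model id -- both are
# placeholders; consult Z.AI's API docs for the real values.
API_URL = "https://api.z.ai/api/paas/v4/chat/completions"
API_KEY = "your-api-key"  # placeholder credential


def build_request(prompt: str) -> urllib.request.Request:
    """Construct a chat-completion request for GLM-5.1 (hypothetical schema)."""
    payload = {
        "model": "glm-5.1",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


# Sending the request would be:
# with urllib.request.urlopen(build_request("Rename these files by date")) as r:
#     print(json.load(r))
```

The same payload shape typically works with local deployments served through an OpenAI-compatible inference server, with only the URL changed to point at the local host.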