Alibaba officially launched Qwen3.7-Max at its Cloud Summit on May 19-20, 2026 in Hangzhou, positioning the proprietary model as an agent-first reasoning system with a 1M-token context window and a claim that no open-weights peer matches: state-of-the-art non-hallucination rate on the AA-omniscience benchmark. The launch ships alongside Alibaba's new Zhenwu M890 AI chip, signaling that the company is now competing on both the model layer and the silicon layer.

Try it: Run an Agent Workflow Against Qwen3.7-Max Today

Qwen3.7-Max is live in Qwen Chat and through the Alibaba Cloud Model Studio API right now. The fastest creator workflow to test the agent claims: spin up a research or content-pipeline agent that needs to chain 50+ tool calls (web search, file edits, code execution) without losing context, then run the same task against Claude Opus 4.7 and Qwen3.7-Max side by side. Alibaba says the model can sustain continuous agent operation for up to 35 hours and manage over 1,000 tool calls without performance degradation, which is the kind of bound that matters for overnight pipelines, batch refactoring, and long-running research agents that previously needed manual checkpointing.

Why It Matters

The headline number is the AA-omniscience non-hallucination rate. Artificial Analysis placed Qwen3.7-Max at index score 56.58, ranking it #4 across 218 models, and community testing on Hacker News reported the non-hallucination rate beats Claude Opus 4.7, Gemini 3.1 Pro, and GPT-5.5. For creators building agents, retrieval systems, or fact-grounded content pipelines, hallucination is the single biggest reliability cost, and a closed model from the Chinese ecosystem now sets the floor.

Key Details

Qwen3.7-Max is proprietary (API-only, no open weights), unlike the Plus variant which Alibaba has historically open-sourced after a short delay. The model targets agentic coding, complex reasoning, and long-horizon task execution. According to domain-b's launch coverage, the 35-hour autonomous operation claim and 1,000-tool-call capacity come directly from Alibaba's Cloud Summit demos.

The model debuted on the Arena leaderboard as a preview on May 14, 2026, hitting #13 overall and #7 in math before the official launch (background covered in our preview post). The May 19-20 announcement is the formal product release with API pricing and the Agent Frontier positioning.

On the silicon side, the Zhenwu M890 chip claims 3x the performance of its predecessor with 144GB HBM3 memory and 800 GB/s interchip bandwidth, on Alibaba's in-house PPU architecture. Alibaba says it has already shipped 560,000 Zhenwu units to over 400 customers across 20 industries.

What to Do Next

If you run agent pipelines or fact-grounded retrieval systems, benchmark Qwen3.7-Max against your current default on your own task before May ends. The combination of 1M context, 35-hour run length, and SOTA non-hallucination rate is the specific reliability bundle that separates a model good for chat from a model good for unattended creator workflows. Pricing through Alibaba Cloud Model Studio will determine whether this matters in production or stays a benchmark talking point.