Arcee AI has released Trinity-Large-Thinking, an open-weight reasoning model under Apache 2.0 that ranks second on PinchBench behind only Claude Opus 4.6. The model costs $0.90 per million output tokens, roughly 96% less than comparable closed alternatives.
What Happened
Trinity-Large-Thinking adds extended reasoning before generating responses, improving multi-turn tool calling, context coherence, and instruction following across long agent sessions. The model has served 3.37 trillion tokens on OpenRouter since its Preview phase began in January 2026, establishing real-world usage data ahead of this official release.
The model is available through Arcee's API and on Hugging Face under Apache 2.0. The Preview version remains free on OpenRouter with reduced hardware allocation for testing.
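For a quick first test, the sketch below calls the model through OpenRouter's OpenAI-compatible endpoint using the official openai Python SDK. The model slug is an assumption based on the article's naming; verify the actual identifier on OpenRouter's model list before use.

```python
# Minimal sketch: querying the model via OpenRouter's
# OpenAI-compatible chat completions endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible API
    api_key="YOUR_OPENROUTER_API_KEY",
)

response = client.chat.completions.create(
    model="arcee-ai/trinity-large-thinking",  # hypothetical slug; confirm before use
    messages=[
        {"role": "user", "content": "Plan the steps to refactor a flaky test suite."}
    ],
)
print(response.choices[0].message.content)
```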
Why It Matters
Open-weight reasoning models have been scarce at the top of benchmark leaderboards; the high performers remain closed-source with premium pricing. Trinity-Large-Thinking taking second place on PinchBench under an open license gives developers a viable alternative for building open-source AI agents without per-token costs spiraling during complex workflows.
The 96% cost reduction matters most for agent workloads where models make dozens of tool calls per task. At $0.90 per million output tokens, teams running continuous agent loops can sustain operations that would be prohibitively expensive with closed models.
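A back-of-envelope sketch makes the difference concrete. The per-task token counts below are illustrative assumptions, and the closed-model rate is simply what the article's 96% figure implies ($0.90 is 4% of $22.50).

```python
# Back-of-envelope cost comparison for an agent workload.
# Token counts per task are assumed, not measured.
OPEN_RATE = 0.90 / 1_000_000    # $ per output token (Arcee API)
CLOSED_RATE = 22.50 / 1_000_000  # $ per output token (implied by the 96% figure)

tool_calls_per_task = 30         # assumed: "dozens of tool calls per task"
output_tokens_per_call = 2_000   # assumed average

tokens = tool_calls_per_task * output_tokens_per_call
print(f"Output tokens per task: {tokens:,}")
print(f"Open-weight cost:  ${tokens * OPEN_RATE:.3f} per task")
print(f"Closed-model cost: ${tokens * CLOSED_RATE:.3f} per task")
# 60,000 output tokens -> $0.054 vs $1.350 per task
```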
Key Details
- Apache 2.0 license for full commercial use
- Ranked #2 on PinchBench, behind Claude Opus 4.6
- $0.90 per million output tokens via Arcee API
- 3.37 trillion tokens served during Preview phase
- Optimized for multi-turn tool calling and long-running agent loops
- Available on Hugging Face and OpenRouter
What to Do Next
Developers building agent workflows can test Trinity-Large-Thinking for free via OpenRouter's Preview tier. For production deployments, the Arcee API offers the full model at $0.90 per million output tokens. The Apache 2.0 license also allows self-hosting for teams that want to run inference on their own hardware.
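For self-hosting, a minimal sketch with Hugging Face transformers follows. The repository id is a guess based on the model name; confirm the actual repo on Hugging Face and plan for substantial GPU memory, since a model of this class will not fit on consumer hardware.

```python
# Minimal self-hosting sketch with Hugging Face transformers.
# Requires the accelerate package for device_map="auto".
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "arcee-ai/Trinity-Large-Thinking"  # hypothetical repo id; verify on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto", torch_dtype="auto")

# Build a chat prompt and generate a response.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Summarize the tradeoffs of self-hosting."}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(inputs, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```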