Anthropic released Claude Opus 4.7 on April 16, 2026, pushing visual benchmark accuracy from 54.5% to 98.5%, lifting coding performance 13% over Opus 4.6, and tripling the supported image resolution to 2,576 pixels on the long edge. Pricing held steady at $5 per million input tokens and $25 per million output tokens. The release lands three days after Cursor 3.1's tiled multi-agent layout and a week after Anthropic's own Claude Managed Agents, putting a new capability tier underneath the year's most active corner of the AI tool market.

Background

Opus 4.6 shipped in February 2026 with strong reasoning and a vision system that worked on slide decks and screenshots but stumbled on dense imagery. Anthropic's published benchmark for fine-detail visual extraction sat at 54.5%, well below what creators needed for accurate work on storyboards, color charts, mood boards, multi-panel reference sheets, and production documents. Coding performance was competitive but not leading, particularly on multi-file reasoning where Cursor and Codex variants had pulled ahead.

Opus 4.7 is the first Anthropic model designed with the xhigh effort level baked in as a default for tool-use workflows. It is also the first Opus release that ships with explicit Task Budgets in beta, an acknowledgment that the long-running agent sessions Anthropic's tools now produce need a way to cap token spend before costs spiral.

Deep Analysis

The Vision Leap: From 54.5% to 98.5%

[Figure] Anthropic's fine-detail visual benchmark: Opus 4.7 closes a 44-point gap, from 54.5% to 98.5%.

The visual jump is the headline change. Going from 54.5% to 98.5% is not an incremental step on a metric Anthropic was already winning. It is a capability unlock. Opus 4.6 could describe an image, identify objects, and answer general questions about a frame. Opus 4.7 can read the labels on a Figma component library, transcribe text from a 4K screenshot of a Premiere timeline, count items in a packed reference sheet, and pull values from a dense data dashboard.

For creators, the practical surface this opens is large. Storyboard panels can be analyzed for shot continuity. Color palette references can be sampled and described in actionable terms. Multi-page production documents (call sheets, pitch decks, technical specs) can be ingested without the model losing track of detail. The 2,576-pixel long edge means a 1440p screenshot survives ingestion without forced downscaling, which has been the silent failure mode of every prior Claude vision call.

Anthropic's documentation lists the new resolution at 3.75 megapixels of effective input. That number matters because it puts Claude past the threshold where a typical tablet screenshot, an iPhone screenshot, or a 1440p design comp can be processed at native fidelity. The model is no longer the bottleneck for ingesting visual reference.
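
A quick way to see what the cap means in practice: a sketch, using only the 2,576-pixel long-edge figure from the article (the exact downscaling behavior is an assumption; providers typically scale proportionally to the long edge).

```python
def fits_native(width: int, height: int, long_edge_cap: int = 2576) -> bool:
    """True if an image avoids forced downscaling under the long-edge cap."""
    return max(width, height) <= long_edge_cap

def scaled_dims(width: int, height: int, long_edge_cap: int = 2576) -> tuple[int, int]:
    """Dimensions after a proportional downscale to the cap (no-op if it fits)."""
    long_edge = max(width, height)
    if long_edge <= long_edge_cap:
        return width, height
    scale = long_edge_cap / long_edge
    return round(width * scale), round(height * scale)

# A 1440p design comp fits at native fidelity; a 4K frame is downscaled.
print(fits_native(2560, 1440))   # True
print(scaled_dims(3840, 2160))   # (2576, 1449)
```

Under these assumptions, 1440p and typical phone screenshots pass through untouched, while 4K frames lose roughly a third of their linear resolution on ingest.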

Coding Velocity: 13% Lift in the Multi-Agent Race

[Figure] Opus 4.7 resolves 70% of CursorBench tasks (up from 58%) and triples production-level SWE coverage on Rakuten-SWE-Bench.

The coding numbers are the second story. Opus 4.7 resolves 70% of tasks on CursorBench (up from 58% on Opus 4.6) and handles 3x more production-level tasks on the Rakuten-SWE-Bench evaluation. Anthropic's internal 93-task benchmark shows a 13% lift over Opus 4.6, including tasks that no previous Opus or Sonnet model had ever resolved.

That coding lift is not happening in isolation. Cursor 3.1 shipped tiled multi-agent layouts on April 13. GitHub Copilot's /fleet command runs multiple agents in parallel from the CLI. Claude Code itself, Anthropic's first-party CLI, defaults to xhigh effort on Opus 4.7 across all plans. Each one of those tools is now spending more compute per request, and they all benefit from a base model that resolves more tasks per attempt.

The compounding effect matters more than any single benchmark. In a multi-agent run with five sub-tasks, per-step reliability compounds multiplicatively. If Opus 4.6 finished 58% of attempts cleanly and Opus 4.7 finishes 70%, a five-step chain that previously succeeded end-to-end about 6.6% of the time now succeeds about 16.8%, a 2.5x improvement. That is the kind of move that turns agent demos into agent products.
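
The arithmetic behind that claim is simple to reproduce, assuming each step must succeed and steps fail independently (a simplification; real agent chains can retry):

```python
def chain_success_rate(per_step: float, steps: int = 5) -> float:
    """End-to-end success of a chain whose steps must all succeed independently."""
    return per_step ** steps

# Per-step rates taken from the article's CursorBench numbers.
print(round(chain_success_rate(0.58), 3))  # 0.066 -> ~6.6%
print(round(chain_success_rate(0.70), 3))  # 0.168 -> ~16.8%
```

The gap widens with chain length: at ten steps the ratio between the two models' end-to-end rates roughly squares.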

The xhigh Effort Level and Task Budgets

[Figure] Claude effort levels (low, medium, high, xhigh, max): the new xhigh tier sits between high and max, with Task Budgets capping spend.

The new xhigh effort level is Anthropic's response to a year of feedback from agent builders. The high setting spent too little compute for production work; max spent too much for routine tool calls. xhigh sits between them and becomes the default for Claude Code, which signals where Anthropic expects most professional usage to land.

Task Budgets, shipped as a beta, let developers cap token spend across a long-running agent task. This addresses the most common failure mode of multi-agent runs: an agent loops, retries, or branches into a side investigation that drains thousands of tokens before producing a result. With Task Budgets, the orchestrator can halt the run before the bill compounds.
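
The control-flow pattern Task Budgets formalizes can be sketched generically. This is not Anthropic's API; it is a minimal orchestrator-side budget cap, where `run_step` is a hypothetical callable standing in for one model or tool invocation that reports its token usage:

```python
from dataclasses import dataclass

@dataclass
class BudgetedRun:
    """Minimal budget-capped agent loop (illustrative, not the Anthropic SDK)."""
    token_budget: int
    spent: int = 0

    def execute(self, run_step, max_steps: int = 100) -> str:
        for _ in range(max_steps):
            tokens, done = run_step()  # run_step returns (tokens_used, finished)
            self.spent += tokens
            if done:
                return "completed"
            if self.spent >= self.token_budget:
                # Halt before the bill compounds on loops or side investigations.
                return "halted: budget exhausted"
        return "halted: step limit"

# A simulated agent that loops without finishing is stopped by the cap.
runner = BudgetedRun(token_budget=10_000)
print(runner.execute(lambda: (3_000, False)))  # halted: budget exhausted
```

The design choice worth noting: the cap is checked after each step, so a run can overshoot by at most one step's worth of tokens, which is the tradeoff for never interrupting an in-flight call.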

For studios and freelancers building automation around Claude, these two changes together are what actually make Opus 4.7 production-grade for agent workflows. The benchmark numbers move attention. The effort level and budget controls move deployments.

Same Price, More Frontier: What That Signals

[Figure] Pricing comparison, Opus 4.6 vs Opus 4.7: same $5 / $25 per-million-token rates despite material capability gains. Anthropic absorbs the upgrade.

Pricing is the quiet detail. Opus 4.7 ships at $5 per million input tokens and $25 per million output tokens, identical to Opus 4.6. That is unusual for an Anthropic flagship release. Earlier Opus versions raised the floor for advanced tiers, often justified by the new capability surface.
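
For concreteness, the per-run cost at those rates is a one-liner. The token counts below are illustrative, not from the article:

```python
def run_cost_usd(input_tokens: int, output_tokens: int,
                 in_rate: float = 5.0, out_rate: float = 25.0) -> float:
    """Cost of one run at the article's $5 / $25 per-million-token rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# A hypothetical agent run: 200k input tokens, 50k output tokens.
print(run_cost_usd(200_000, 50_000))  # 2.25
```

At identical rates, the only cost lever that moved between 4.6 and 4.7 is efficiency: fewer retries per resolved task means fewer billed tokens per outcome.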

Holding price steady puts pressure on the rest of the frontier market. OpenAI's GPT-5.x line has been priced for premium agent workloads. Google's Gemini 3 Ultra sits at a similar tier. By delivering a 13% coding lift and a 44-point vision accuracy gain without a price change, Anthropic forces every comparable provider to defend its pricing or match the move. For creators choosing a model for production workflows, the calculus shifts.

Impact on Creators

Three creator workflows benefit immediately. First, anyone using Claude for visual research (storyboarding, mood boards, asset audits, design reviews) gets a model that finally reads dense imagery without losing detail. Second, developers and technical artists building tools around Claude Code or the Anthropic API get a coding tier that resolves more tasks per attempt, with controllable spend via Task Budgets. Third, studios using Claude for Word and similar integrations get an underlying model upgrade automatically, with no migration work.

The agentic coding race deserves its own callout. Cursor 3.1, Copilot Workspace, and Claude Code are converging on the same pattern: many parallel agents, each running long, each consuming a frontier model. Opus 4.7 is the first Anthropic model that holds up under that load with both capability and budget controls in place.

Key Takeaways

  • Vision benchmark moved from 54.5% to 98.5%, a capability unlock for dense imagery (storyboards, design comps, dashboards, multi-page production docs).
  • Image resolution tripled to 2,576px on the long edge (3.75 megapixels), removing forced downscaling for 1440p and similarly high-fidelity references.
  • Coding performance up 13% over Opus 4.6, with 70% CursorBench (vs 58%) and 3x Rakuten-SWE-Bench tasks resolved.
  • New xhigh effort level becomes the default for Claude Code on all plans; Task Budgets beta caps token spend on long-running agent runs.
  • Pricing held flat at $5 input / $25 output per million tokens despite material capability gains.
  • API model name: claude-opus-4-7. Available on Claude.ai, Anthropic API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

What to Watch

The next signal will come from the multi-agent coding tools. If Cursor 3.1, Copilot Workspace, and Claude Code report measurable improvement in completion rates on real production tasks under Opus 4.7, the case for parallel-agent IDEs as a default coding setup gets a lot stronger. Watch the public dashboards from Cursor and Anthropic over the next two to three weeks.

The second signal is competitive pricing. If OpenAI or Google answers Opus 4.7's flat-price upgrade with a price cut on their own flagship tier, the frontier model market enters a new phase where capability gains are absorbed rather than monetized. That changes the economics for every studio running production agent loads. Full model documentation lives at Anthropic's model overview.