Google has built computer use directly into Gemini 3.5 Flash, announced June 24. The capability lets the model see a screen, reason about it, and take actions across browser, mobile, and desktop, so developers can build agents that operate real software instead of just talking about it.

What Happened

Computer use was previously a standalone Gemini 2.5 model. Google has now folded it in as a built-in tool inside the main Gemini 3.5 Flash model. That means the same fast, low-cost model you already call for text and vision can now click, type, scroll, and navigate interfaces. Google frames it for "long-horizon and enterprise automation tasks like continuous software testing and knowledge work across professional applications."

Why It Matters

For builders and creators, this lowers the bar to shipping agents that actually do work in the tools you use every day. Pairing screen control with Flash, Google's speed-and-cost tier, makes it cheaper to run agents that test a web app overnight, fill repetitive forms, or move data between apps that have no API. It is Google's direct answer to other agentic computer-use efforts, and putting it in Flash rather than a premium model is the signal: this is meant to run at scale.

Key Details

  • Where it works: The model can see, reason, and act across browser, mobile, and desktop environments.
  • Access today: Available now through the Gemini API computer-use docs, plus the Gemini Enterprise Agent Platform via the Google Cloud console.
  • Try the demo: Google hosts a live sandbox built with Browserbase at gemini.browserbase.com so you can watch the agent drive a browser.
  • Safety: Targeted adversarial training mitigates prompt-injection risk, and two optional enterprise safeguards add explicit user confirmation and automatic task stoppage when injection is detected.

What to Do Next

If you build agents, open the Gemini API docs and wire computer use into a small, well-scoped task first, like running a repeatable UI test, before handing it anything destructive. Start in the hosted demo to see how the model reasons over a screen, and turn on the confirmation safeguard for any flow that touches real accounts.