Gemini Omni: Google's New AI Video Model, First Look

On May 11, 2026, a Reddit user discovered Google's unreleased Gemini Omni video generation model inside the Gemini app, producing the first public demos of a system Google has not yet officially announced. The pop-up described Gemini Omni as a new video model that lets creators "remix your videos, edit directly in chat, try templates, and more." Researcher Max Weinbach confirmed that metadata ties Omni directly to Google's existing Veo video technology. Two video demos (a math professor writing proofs on a chalkboard and a restaurant dining scene) showed both what the model does well and where current limitations remain. Google I/O 2026 starts May 19, seven days from now.

What Happened

A Reddit user posted screenshots of a new card inside the Gemini app on May 11 showing the prompt "Start with an idea or try a template. Powered by Omni." A pop-up description read: "Create with Gemini Omni: meet our new video model, remix your videos, edit directly in chat, try templates, and more."

9to5Google broke the story on May 11, publishing the first two demo videos generated through early access. Android Authority followed with a detailed analysis of the model's performance on May 12 at 7:36 AM UTC.

Google has not made any official statement about Gemini Omni. The model appears to have surfaced through limited early access or an incomplete rollout rather than a deliberate release. Given the timing (about a week before Google I/O 2026 opens on May 19), the official announcement is almost certainly planned for the keynote. TestingCatalog, which tracks Google UI leaks, confirmed the discovery and noted that Omni is distinct from the Veo 3.1 pathway currently powering video generation in Gemini.

What Gemini Omni Can Do

Based on the UI description and early demos, Gemini Omni supports four distinct modes out of the box:

Text-to-video generation: standard prompt-to-video at unspecified resolution
Remix: take an existing video and alter it based on a text prompt
In-chat editing: edit video directly inside the Gemini conversation window without switching to a separate editor
Templates: pre-built starting points for common generation tasks

The in-chat editing capability is the most distinctive feature. Current video generation tools (Runway, Kling, Pika) require you to move the clip into a separate timeline or re-generate it with a new prompt to make edits. Gemini Omni's UI suggests editing will happen inline, treating video generation as a conversation rather than a one-shot render job. This fits the direction in which Google has been moving Gemini: toward an agentic, multi-turn workspace.

The Demos: What Worked and What Did Not

The early tester ran two prompts. Both reveal strengths and weaknesses worth understanding before the I/O announcement.

Mathematical proof demo: The prompt asked for "a professor writes out a mathematical proof for trigonometric identities on a traditional chalkboard, explaining the step he is currently on in the equation." The output was described by 9to5Google as "very lifelike and reasonably accurate." The model handled both the text rendering on the chalkboard and the professor's physical hand movements with above-average coherence. For context, text rendering in AI video has been a persistent failure point: most models scramble letters or produce meaningless glyphs on surfaces like whiteboards. Gemini Omni's accuracy here is notable.

Restaurant scene demo: The prompt requested "two men at a table seaside at an upscale restaurant" with specific clothing, food, and interaction details. The output had visible artifacts: spaghetti appeared on empty plates rather than being served, and chewing animations did not match the number of bites taken. Android Authority noted these are common AI video generation flaws rather than Omni-specific failures: the same issues appear in Kling 3.0 and Runway Gen-3 Alpha on complex multi-object scenes.

One test the model blocked: the "Will Smith eating spaghetti" benchmark, a reference video used to evaluate AI video realism since 2023. The block suggests Google has content guardrails around recognizable individuals, consistent with how Gemini's image generation handles celebrity likenesses.

Technical Architecture: Built on Veo

Researcher Max Weinbach extracted metadata from the Gemini app UI and confirmed that Omni is "an extension of Veo" rather than an entirely new model architecture. This means Gemini Omni builds on the same Veo technology that currently powers video generation in Google's products rather than being a model built from scratch.

Three interpretations are plausible based on the metadata and UI evidence:

Renamed Veo pathway: Omni is a new public name for the existing Veo 3.1 integration inside Gemini, updated with the editing and remix UI layer
Omni-modal extension: Omni is a Veo model with multimodal input support added, able to accept video, image, and text prompts in a single generation pass
Separate model, Veo lineage: Omni is a new model trained on Veo techniques but optimized for the in-chat editing use case

The I/O keynote on May 19 should clarify which of these is accurate. What the metadata does indicate is that Google's video generation stack is consolidating under the Gemini umbrella rather than being maintained as a separate Veo product line.

Usage Limits: What the AI Pro Plan Covers

The early tester reported that two video generation requests consumed 86 percent of their daily quota on the Google AI Pro plan, which costs $19.99 per month. This is a real constraint for production workflows. If two prompts consume that much of a daily allocation, a creator planning to iterate on multiple video concepts in a single session will hit the limit quickly.

This usage pattern is consistent with how other high-quality video generation APIs are metered. Runway Gen-3 Alpha charges per second of generated video. Kling 3.0 uses a credit system. Google's approach of a daily quota on a subscription plan is more predictable for low-volume use but will frustrate power users who need to generate and compare multiple takes per session.

These usage limits reflect early access conditions and may change before the public launch. Google has not confirmed whether Gemini Omni will be gated behind AI Pro, available on lower tiers, or priced separately.
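
The quota arithmetic behind the early tester's report can be sketched in a few lines of Python. This is a rough estimate rather than Google's metering logic: it assumes each request costs an equal share of the daily quota, which the report does not confirm, and the function name `quota_estimate` and its parameters are hypothetical.

```python
import math


def quota_estimate(requests_made: int, quota_used_pct: float) -> dict:
    """Estimate per-request quota cost from one observed data point.

    Assumption (not confirmed by Google): every request consumes the same
    share of the daily quota.
    """
    if requests_made <= 0:
        raise ValueError("requests_made must be positive")
    if not 0 < quota_used_pct <= 100:
        raise ValueError("quota_used_pct must be in (0, 100]")

    # Average share of the daily quota consumed by a single request.
    per_request_pct = quota_used_pct / requests_made
    # Whole requests that fit in a full day's quota at that rate.
    max_requests_per_day = math.floor(100 / per_request_pct)
    # Quota left over after the observed requests.
    remaining_pct = 100 - quota_used_pct

    return {
        "per_request_pct": per_request_pct,
        "max_requests_per_day": max_requests_per_day,
        "remaining_pct": remaining_pct,
    }


# Figures reported by the early tester: 2 requests used 86 percent.
estimate = quota_estimate(requests_made=2, quota_used_pct=86)
print(estimate)
```

Under that even-cost assumption, each request uses about 43 percent of the daily quota, so the remaining 14 percent is not enough for a third generation on the same day.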

How Gemini Omni Compares to Other Video Tools

The comparison that matters most for creators already using AI video tools is Gemini Omni versus the current generation of dedicated video platforms.

Tool	Text rendering accuracy	In-app editing	Pricing model	Access
Gemini Omni	Strong (demo only)	In-chat editing	AI Pro (TBC)	Early access only
Runway Gen-3 Alpha	Moderate	Timeline editor	Per-second credits	Available now
Kling 3.0	Moderate	Re-generate	Credit packs	Available now
Veo 3.1 (current)	Good	None	Google AI Pro	Available now

The key differentiator Gemini Omni introduces is the in-chat editing loop. Runway's Gen-3 Alpha is the current benchmark for professional AI video quality, but its editing workflow requires exporting clips to a separate timeline. Gemini Omni's in-chat approach is designed to keep the iteration cycle inside a single conversation, which reduces the time between idea and finalized clip for creators who generate and review quickly.

For creators already working within Google's ecosystem (Workspace, Gemini for script writing, Imagen for stills), an integrated video tool that does not require switching apps has obvious workflow advantages. The comparison is less about raw generation quality than about where in your workflow the tool fits.

What to Do Next

Watch Google I/O 2026 on May 19: The keynote starts at 10 AM PT at io.google/2026. Gemini Omni is near-certain to be officially revealed with pricing, access details, and live demos. This is the event where you will get confirmed specs rather than inference from UI leaks.
Check your Gemini app: If you are on the Google AI Pro plan, open Gemini and look for a video generation prompt with "Powered by Omni." Early access appears to be rolling out selectively, so availability is not guaranteed, but it is worth checking before May 19.
Hold off on long-term video tool commitments: If you are currently evaluating AI video platforms, wait until after I/O. Google's official announcement will clarify whether Gemini Omni replaces, supplements, or integrates with existing Veo access, which directly affects cost comparisons with Runway and Kling.

Frequently Asked Questions

What is Gemini Omni?

Gemini Omni is Google's new AI video generation model, discovered in the Gemini app UI on May 11, 2026. It supports text-to-video generation, video remixing, in-chat video editing, and template-based creation. Google has not officially announced it; the model surfaced through limited early access ahead of Google I/O 2026 on May 19.

How does Gemini Omni relate to Veo?

Metadata from the Gemini app confirms that Gemini Omni is an extension of Google's existing Veo video generation architecture, per researcher Max Weinbach. It does not appear to be a separate model built from scratch; instead it extends Veo with new capabilities including in-chat editing and the Omni interface layer.

Is Gemini Omni available now?

As of May 12, 2026, Gemini Omni is available to a limited number of users through early access or an incomplete rollout. It is not generally available. The official launch is expected at Google I/O 2026 on May 19.

What are the usage limits on Gemini Omni?

Early testing shows that two video generation requests consumed 86 percent of the daily quota on Google's AI Pro plan ($19.99/month). These figures are from early access conditions and may change at the official launch. Google has not confirmed final usage allocations.

How does Gemini Omni text rendering compare to other video models?

The mathematical proof demo showed above-average text rendering: the model wrote trigonometric identities on a chalkboard, with realistic hand movements, in output that 9to5Google described as "very lifelike and reasonably accurate." Text rendering has been a weak point for most AI video models, so Gemini Omni's performance in this single demo is one of its most notable early differentiators.

Can I use Gemini Omni for commercial video production?

No official terms of service have been published for Gemini Omni. Google's current Veo-based video generation tools carry specific restrictions around commercial applications and third-party rights. Until Google publishes terms at the I/O launch, the commercial licensing status is unknown. Do not use early access outputs in production work until terms are confirmed.