Google released Gemma 3 on March 12, 2025, a family of open-weight multimodal models in four sizes (1B, 4B, 12B, 27B) built on Gemini 2.0 technology. The models support a 128k-token context window (32k on the 1B model), 140+ languages, and image-plus-text input, with open weights available on Hugging Face and Kaggle.
What Happened
Google DeepMind released Gemma 3 as its newest open-weight model family, with sizes ranging from 1B (text-only) to 27B (full multimodal). All models above 1B accept image and text input and produce text output. The official announcement describes Gemma 3 as built from the same research and infrastructure that powers Gemini 2.0.
In preliminary human preference evaluations on LMArena, Gemma 3 27B outscores Llama 3-405B, DeepSeek-V3, and o3-mini, at a fraction of their parameter counts. The Hugging Face integration post highlights strong performance on coding, reasoning, and instruction-following tasks.
Why It Matters
The 128k context window on the 4B, 12B, and 27B models is the headline for creative AI users. It means you can pass entire scripts, long design briefs, or large code files into a locally run model and get coherent responses across the full input. Previous open-weight models in this size range typically topped out at 8k-32k of context.
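Before feeding a whole script or codebase to a local model, it helps to sanity-check that it actually fits in the window. The sketch below is not part of Gemma's tooling; the ~4-characters-per-token ratio is a crude heuristic for English prose, and an accurate count requires the model's own tokenizer:

```python
# Rough sketch: estimate whether a document fits in the 128k-token window.
# CHARS_PER_TOKEN = 4 is a crude heuristic for English text, not a Gemma
# constant; a real count needs the model's tokenizer.

CONTEXT_WINDOW = 128_000
CHARS_PER_TOKEN = 4

def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN

def fits_in_context(text: str, reserve_for_output: int = 4_000) -> bool:
    """Check that the estimated prompt leaves room for the model's reply."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_WINDOW

script = "INT. OFFICE - DAY\n" * 10_000  # a 180,000-character screenplay
print(estimate_tokens(script), fits_in_context(script))  # 45000 True
```

A 180,000-character screenplay estimates to roughly 45k tokens, comfortably inside the window; a codebase several times that size would need chunking.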
The multimodal capability at the 4B size is particularly useful for creative workflows: at 4B parameters, Gemma 3 runs on consumer laptops with 8-16 GB of RAM while still accepting image input. This opens up local visual analysis, image-to-text workflows, and design critique pipelines without cloud dependency.
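As a sketch of what a local image-plus-text call might look like, the snippet below assembles an OpenAI-compatible chat payload with an inline base64 image, the request shape that local servers such as Ollama and llama.cpp expose. The `gemma3:4b` model tag and the message schema are assumptions for illustration, not an official Gemma API:

```python
import base64

def build_image_prompt(image_bytes: bytes, question: str,
                       model: str = "gemma3:4b") -> dict:
    """Assemble an OpenAI-style chat payload with an inline base64 image.
    The model tag and message schema are assumptions for illustration."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

fake_png = b"\x89PNG\r\n\x1a\n"  # PNG signature stands in for real image bytes
payload = build_image_prompt(fake_png, "Critique this layout for visual hierarchy.")
print(payload["messages"][0]["content"][0]["text"])
# Critique this layout for visual hierarchy.
```

Posting a payload like this to a local server keeps the design-critique loop entirely offline.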
The license permits commercial use without royalties, but it is Google's own Gemma Terms of Use rather than an Apache-style open-source license, and it includes a prohibited-use policy; review the terms before shipping a product integration.
Key Details
- Sizes: 1B (text-only), 4B, 12B, 27B (all multimodal except 1B)
- Context window: 128k tokens (32k on the 1B model)
- Languages: 140+ supported
- Input: text and images (4B, 12B, 27B); text only (1B)
- Output: text
- Based on: Gemini 2.0 research and architecture
- Available on: Hugging Face and Kaggle
- License: Gemma Terms of Use (permits commercial use)
What to Do Next
The 4B instruction-tuned variant is the best starting point for most creative workflows. It runs locally on most modern laptops and handles image analysis, long-document summarization, and multilingual tasks. Download the weights from Hugging Face or pull the model through Ollama. The DeepMind model page has detailed technical specifications and a live demo.
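If you script the setup, a quick check that a local Ollama server is actually running before pulling saves a confusing failure. The default port (11434) and the `gemma3:4b` tag below are assumptions about Ollama's conventions, not verified against its documentation:

```python
import subprocess
import urllib.error
import urllib.request

def ollama_running(host: str = "http://localhost:11434",
                   timeout: float = 2.0) -> bool:
    """Probe Ollama's default HTTP port; True only on a successful response."""
    try:
        with urllib.request.urlopen(host, timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False

def pull_model(tag: str = "gemma3:4b") -> None:
    """Shell out to `ollama pull`; the tag is an assumed naming scheme."""
    if not ollama_running():
        raise RuntimeError("Ollama not reachable; start it with `ollama serve`")
    subprocess.run(["ollama", "pull", tag], check=True)
```

Once the pull finishes, the same server exposes the model to any local tool that speaks its API, with no cloud round-trip.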