Apple WWDC 2026 AI: AFM 3, Image Playground, Siri AI

Apple used the WWDC 2026 keynote on June 8 to unveil the third generation of Apple Foundation Models, a five-model family co-developed with Google that powers a rebuilt Siri AI, a new Image Playground, and a Spatial Reframing tool inside Photos. The family ranges from a 3-billion-parameter dense on-device model to a 20-billion-parameter sparse multimodal model that activates only 1 to 4 billion weights per request. For creators working on Mac, iPhone, iPad, or Vision Pro, this is the first time Apple Intelligence ships with a native image generation pipeline and an on-device multimodal model that third-party apps can call through the Foundation Models framework.

Coverage from TechCrunch and MacRumors confirms the Gemini collaboration is structured as distillation and joint training rather than API routing. User prompts stay on device or inside Apple's Private Cloud Compute. Google's models were used to help train Apple's, not to answer real-time requests.

What Apple shipped, by model

The five-model lineup is the core of the announcement. Each model handles a different slice of the creator workload, and the framework exposes them to apps in distinct ways.

Model	Location	Params	What creators get
AFM 3 Core	On-device	3B dense	Writing Tools, Smart Replies, summarization, classification
AFM 3 Core Advanced	On-device	20B sparse MoE (1-4B active)	Multimodal: image understanding, OCR, visual question answering, on-device dictation upgrade
AFM 3 Cloud	Private Cloud Compute	Undisclosed	Server workhorse for longer documents and complex prompts
AFM 3 Cloud Image	Private Cloud Compute	Undisclosed	Image Playground, Spatial Reframing, advanced photo editing
AFM 3 Cloud Pro	Private Cloud Compute	Undisclosed	Agentic tool use, complex reasoning, App Intents orchestration

The 20-billion-parameter sparse model on-device is the headline number. Until today, the largest model creators could call locally on a base-tier Apple Silicon Mac was the 3-billion-parameter AFM 2 that shipped with iOS 26 and macOS 26. Moving from 3 billion dense parameters to 20 billion sparse ones, with only 1 to 4 billion active at a time, opens workflows that never leave the device: batch alt-text generation on a photo library, multilingual subtitle drafts inside Final Cut, and on-device retouch suggestions in Photos that previously required a Private Cloud Compute round-trip.
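The gap between total and active parameters is easy to lose behind the headline number. The following is a rough back-of-the-envelope sketch in Python that uses only the figures quoted above (20B total, 1 to 4B active, 3B for the dense model). The function and variable names are illustrative and do not come from Apple, and treating active parameters as a proxy for per-request compute is a simplifying assumption.

```python
# Figures quoted in the article; everything else here is illustrative.
TOTAL_PARAMS_B = 20.0        # AFM 3 Core Advanced, total parameters (billions)
ACTIVE_RANGE_B = (1.0, 4.0)  # parameters active per request (billions)
DENSE_PARAMS_B = 3.0         # the 3B dense on-device model (billions)


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's weights that take part in one forward pass."""
    return active_b / total_b


def summarize() -> dict:
    """Compare the sparse model with the dense 3B model on simple ratios."""
    low, high = ACTIVE_RANGE_B
    return {
        "active_fraction_low": active_fraction(low, TOTAL_PARAMS_B),
        "active_fraction_high": active_fraction(high, TOTAL_PARAMS_B),
        "total_vs_dense": TOTAL_PARAMS_B / DENSE_PARAMS_B,
        # Assumption: active parameters approximate per-request compute.
        "active_vs_dense_low": low / DENSE_PARAMS_B,
        "active_vs_dense_high": high / DENSE_PARAMS_B,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.2f}")
```

On these figures, between 5% and 20% of the sparse model's weights are active at a time. Its total parameter count is roughly 6.7 times that of the dense 3B model, while the active slice ranges from about a third of the dense model's size to about 1.3 times it.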

Figure: Apple's three Foundation Model tiers (on-device, server, and cloud)

What this means for creator apps right now

The Foundation Models framework is the channel that matters for indie developers and creator-tool studios. The 2025 version of the framework, which shipped in September 2025, gave Swift developers direct access to the 3B on-device model with guided generation (the @Generable macro for typed Swift output), streaming, and tool calling. The 2026 update extends those same primitives to the new 20B multimodal model. Apps that already ship with Writing Tools wired up should see a quality lift with zero code changes. Apps that want image understanding, OCR, or visual reasoning now get a local-first path that was previously locked behind a cloud API.

Three concrete examples of what the framework now enables on-device:

Photo cataloging apps can pass an image to AFM 3 Core Advanced and receive a structured Swift object with tags, mood, OCR text, and a recommended caption, all in one call.
Audio note apps can chain the upgraded dictation pipeline with summarization to produce meeting minutes with action items, no cloud step, no per-minute fee.
Design tools can hand a layout screenshot to the multimodal model and get back a critique against accessibility heuristics or brand-style rules defined as Swift structs.
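The common thread in these examples is typed, schema-constrained output: the app declares the shape it wants and gets back an object it can use without parsing free text. The sketch below illustrates that idea in Python for the photo cataloging case. It is not the Foundation Models API (in Swift, the @Generable macro plays this role), and the `PhotoCatalogEntry` schema, the `parse_entry` helper, and the sample payload are all hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PhotoCatalogEntry:
    """Hypothetical schema mirroring the fields named in the example above."""
    tags: list
    mood: str
    ocr_text: str
    caption: str


def parse_entry(raw_json: str) -> PhotoCatalogEntry:
    """Validate a model response against the schema and reject anything off-shape."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    expected = {"tags", "mood", "ocr_text", "caption"}
    if set(data) != expected:
        raise ValueError(f"unexpected fields: {sorted(set(data) ^ expected)}")
    tags = data["tags"]
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        raise ValueError("tags must be a list of strings")
    for key in ("mood", "ocr_text", "caption"):
        if not isinstance(data[key], str):
            raise ValueError(f"{key} must be a string")
    return PhotoCatalogEntry(**data)


# Made-up payload standing in for a model response.
sample = (
    '{"tags": ["beach", "sunset"], "mood": "calm", '
    '"ocr_text": "", "caption": "Sunset over the bay"}'
)
entry = parse_entry(sample)
print(entry.tags, entry.caption)
```

With guided generation the framework is meant to enforce the schema during decoding, so the manual validation shown here stands in for work the app would not have to do itself.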

For image generation specifically, the pipeline moves to the cloud. Coverage of the larger Apple-Google relationship makes clear that the image stack is not API-shared with Gemini, so creator-tool studios will need to call Apple's Image Playground APIs rather than Imagen or Nano Banana to get the native iOS look and the privacy guarantees of Private Cloud Compute.

Image Playground and Spatial Reframing, the consumer-facing layer

The two new creator-facing features in the OS itself are Image Playground and Spatial Reframing.

Image Playground was previously a sticker-and-cartoon generator added in iOS 18. The 2026 version, powered by AFM 3 Cloud Image, adds touch-based image modifications, photoreal output, and per-user personalization. Apple positions Image Playground as a Canva-style consumer surface rather than a competitor to Midjourney or Flux, but the underlying model is the same one third-party apps can call through the framework. Indie design apps that want to ship a Comfy-style node graph against Apple's model now have a private, first-party alternative to a Gemini or Stable Diffusion API, with requests handled inside Private Cloud Compute.

Spatial Reframing lives inside the Photos app. It uses Apple's 3D modeling pipeline plus the new multimodal model to let users adjust the angle or composition of an existing photo after the shot. Think of it as Photoshop's Generative Expand combined with NeRF-style novel view synthesis, but running in Photos with no external service. For social-video creators who shoot once and reframe for portrait, landscape, and square, this is the closest thing Apple has shipped to a native post-production tool since Final Cut's Smart Conform.

Figure: Apple Image Playground generates framed images on device

How AFM 3 stacks up against the assistants creators already use

Capability	Apple AFM 3	OpenAI ChatGPT 5	Google Gemini 3	Anthropic Claude Mythos
On-device option	Yes (3B + 20B sparse)	No	Gemini Nano on Pixel only	No
Image generation	Image Playground (cloud, US first)	GPT-Image 2 (cloud, global)	Nano Banana 2 (cloud, global)	None native
Photo editing	Spatial Reframing in Photos	None native	Pics, Photos editor	None native
Agentic tools	App Intents via AFM 3 Cloud Pro	Operator / GPT-5 agent	Project Astra agent	Computer Use
Developer SDK	Foundation Models framework (Swift, free)	API (per-token)	API (per-token)	API (per-token)
Privacy model	On-device or Private Cloud Compute	OpenAI servers	Google servers	Anthropic servers

The strategic read: creators on Apple devices now have a free, private, on-device path to multimodal AI through the Foundation Models framework, plus a cloud path for image generation that does not require a third-party API key. Apple is not trying to win on raw model quality. It is trying to be the default surface for the everyday creator workflow that does not need GPT-5 or Claude.

Figure: Apple AFM 3 compared against rival AI assistants

Availability and rollout

The new Siri AI, Image Playground, Spatial Reframing, and the third-generation models ship with iOS 27, iPadOS 27, macOS 27, and visionOS 27. Developer betas are available from June 8, per TechCrunch. Public betas land in July, with general availability in the fall alongside the new iPhone and Mac hardware cycle. The Foundation Models framework adds the multimodal AFM 3 Core Advanced as a new model option, gated on devices with the required Neural Engine throughput. The image generation API ships in beta as part of Image Playground extensions for third-party apps later in the developer preview window.
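Because the multimodal model is gated by hardware, apps will likely need to plan for devices that cannot run it. The sketch below shows one way to think about that in Python. The task-to-model mapping is taken from the lineup table earlier in this article, while the `Task` enum, the `pick_model` helper, and the `device_supports_advanced` flag are hypothetical and do not reflect how the framework itself selects or exposes models.

```python
from enum import Enum
from typing import Optional


class Task(Enum):
    TEXT = "text"                          # Writing Tools, summarization
    MULTIMODAL = "multimodal"              # image understanding, OCR, visual Q&A
    LONG_DOCUMENT = "long_document"        # longer documents, complex prompts
    IMAGE_GENERATION = "image_generation"  # Image Playground, Spatial Reframing
    AGENTIC = "agentic"                    # tool use, App Intents orchestration


# Mapping follows the five-model lineup described in the article.
MODEL_FOR_TASK = {
    Task.TEXT: "AFM 3 Core",
    Task.MULTIMODAL: "AFM 3 Core Advanced",
    Task.LONG_DOCUMENT: "AFM 3 Cloud",
    Task.IMAGE_GENERATION: "AFM 3 Cloud Image",
    Task.AGENTIC: "AFM 3 Cloud Pro",
}


def pick_model(task: Task, device_supports_advanced: bool) -> Optional[str]:
    """Return the model for a task, or None when the device is not eligible.

    `device_supports_advanced` is a hypothetical capability flag standing in
    for the Neural Engine gating mentioned above.
    """
    model = MODEL_FOR_TASK[task]
    if model == "AFM 3 Core Advanced" and not device_supports_advanced:
        return None
    return model


print(pick_model(Task.MULTIMODAL, device_supports_advanced=False))
print(pick_model(Task.TEXT, device_supports_advanced=False))
```

Returning None rather than silently substituting another model leaves the fallback decision (hide the feature, degrade to text-only, or defer) to the app.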

Frequently asked questions

Does Apple Intelligence now route prompts through Google's servers?

No. Per MacRumors and Apple's own ML research post, Google's models were used during training and distillation. Inference happens on-device or inside Apple's Private Cloud Compute. Google does not see user prompts.

Can I use the new image generation API in a third-party Mac or iOS app?

Yes, through Image Playground extensions. The full Foundation Models framework documentation at developer.apple.com lists the new image generation entry points alongside the existing text and guided-generation APIs. Image generation is gated to AFM 3 Cloud Image, so requests go through Private Cloud Compute.

How does the on-device 20B sparse model compare to the on-device 3B dense model from 2025?

Apple reports a step-change in quality on writing, summarization, and multimodal benchmarks. The sparse architecture activates 1 to 4 billion weights per token, so memory pressure stays close to a dense 3B model while quality approaches the cloud workhorse from the prior generation. Real-world impact for creators is faster local generation with fewer fallbacks to Private Cloud Compute.

Is the Foundation Models framework still free?

Yes. On-device inference costs nothing per call. Private Cloud Compute calls are bundled into the operating system at no charge, the same arrangement Apple introduced in 2025.

Does this make Final Cut Pro or Logic Pro pick up new AI features automatically?

Not on day one. Apple typically ships pro-app AI features on a separate cadence after the OS release. Expect Spatial Reframing-style tools to land in Final Cut and Photos workflow extensions in the months after iOS 27 general availability.

What does the Gemini collaboration mean for app developers who already integrate Gemini?

Nothing changes on the Google side. Apps that call the Google Gemini API directly continue to work as before. Apple's deal is about training data and architecture co-development, not API resale.
