Apple used the WWDC 2026 keynote on June 8 to unveil the third generation of Apple Foundation Models, a five-model family co-developed with Google that powers a rebuilt Siri AI, a new Image Playground, and a Spatial Reframing tool inside Photos. The family ranges from a 3-billion-parameter dense on-device model to a 20-billion-parameter sparse multimodal model that activates only 1 to 4 billion weights per request. For creators working on Mac, iPhone, iPad, or Vision Pro, this is the first time Apple Intelligence ships with a native image generation pipeline and an on-device multimodal model that third-party apps can call through the Foundation Models framework.
Coverage from TechCrunch and MacRumors confirms the Gemini collaboration is structured as distillation and joint training rather than API routing. User prompts stay on device or inside Apple's Private Cloud Compute. Google's models were used to help train Apple's, not to answer real-time requests.
What Apple shipped, by model
The five-model lineup is the core of the announcement. Each model handles a different slice of the creator workload, and the framework exposes them to apps in distinct ways.
| Model | Location | Params | What creators get |
|---|---|---|---|
| AFM 3 Core | On-device | 3B dense | Writing Tools, Smart Replies, summarization, classification |
| AFM 3 Core Advanced | On-device | 20B sparse MoE (1-4B active) | Multimodal: image understanding, OCR, visual question answering, on-device dictation upgrade |
| AFM 3 Cloud | Private Cloud Compute | Undisclosed | Server workhorse for longer documents and complex prompts |
| AFM 3 Cloud Image | Private Cloud Compute | Undisclosed | Image Playground, Spatial Reframing, advanced photo editing |
| AFM 3 Cloud Pro | Private Cloud Compute | Undisclosed | Agentic tool use, complex reasoning, App Intents orchestration |
The on-device 20-billion-parameter sparse model is the headline number. Until today, the largest model creators could call locally on a base-tier Apple Silicon Mac was the 3-billion-parameter AFM 2 that shipped with iOS 26 and macOS 26. Moving to 20 billion total parameters, with 1 to 4 billion active per request, raises capacity without leaving the device. That opens workflows like batch alt-text generation on a photo library, multilingual subtitle drafts inside Final Cut, and on-device retouch suggestions in Photos that previously required a Private Cloud Compute round-trip.
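To put the sparse model's numbers in proportion, here is a minimal arithmetic sketch using only the figures quoted above. The variable and function names are illustrative, and the calculation says nothing about how Apple actually routes weights.

```swift
// Figures quoted in this article, in billions of parameters.
// Names are illustrative and not part of any Apple API.
let totalSparseParams = 20.0   // AFM 3 Core Advanced, total weights
let activeLow = 1.0            // lower bound of active weights
let activeHigh = 4.0           // upper bound of active weights
let denseParams = 3.0          // AFM 3 Core, dense

/// Share of the sparse model's weights that are active at once.
func activeShare(active: Double, total: Double) -> Double {
    return active / total
}

let lowShare = activeShare(active: activeLow, total: totalSparseParams)
let highShare = activeShare(active: activeHigh, total: totalSparseParams)
let totalRatio = totalSparseParams / denseParams

let lowPercent = Int((lowShare * 100).rounded())
let highPercent = Int((highShare * 100).rounded())
let ratioOneDecimal = (totalRatio * 10).rounded() / 10

print("Active share: \(lowPercent)% to \(highPercent)% of total weights")
print("Total parameters vs. the dense 3B model: \(ratioOneDecimal)x")
```

On these figures, between 5% and 20% of the sparse model's weights are active at a time, while the total parameter count is roughly 6.7 times that of the dense 3B model.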

What this means for creator apps right now
The Foundation Models framework is the channel that matters for indie developers and creator-tool studios. The 2025 version of the framework, introduced last September, gave Swift developers direct access to the 3B on-device model with guided generation (the @Generable macro for typed Swift output), streaming, and tool calling. The 2026 update extends those same primitives to the new 20B multimodal model. Apps that already ship with Writing Tools wired up should see a quality lift with zero code changes. Apps that want image understanding, OCR, or visual reasoning now get a local-first path that was previously locked behind a cloud API.
Three concrete examples of what the framework now enables on-device:
- Photo cataloging apps can pass an image to AFM 3 Core Advanced and receive a structured Swift object with tags, mood, OCR text, and a recommended caption, all in one call.
- Audio note apps can chain the upgraded dictation pipeline with summarization to produce meeting minutes with action items, no cloud step, no per-minute fee.
- Design tools can hand a layout screenshot to the multimodal model and get back a critique against accessibility heuristics or brand-style rules defined as Swift structs.
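The photo cataloging case in the list above, where an image goes in and a structured Swift object comes back, might look roughly like this on the receiving side. This is a sketch under stated assumptions: `PhotoCatalogEntry`, `decodeEntry`, and the sample payload are hypothetical, and `Codable` stands in for the framework's `@Generable` macro so the example runs without the framework or a supported device.

```swift
import Foundation

// Hypothetical result type for a photo cataloging app. In an app built on
// the Foundation Models framework this struct would carry @Generable;
// Codable is used here only so the sketch runs on its own.
struct PhotoCatalogEntry: Codable, Equatable {
    let tags: [String]
    let mood: String
    let ocrText: String
    let caption: String
}

// Made-up payload standing in for model output. Not real AFM 3 output.
let sampleOutput = """
{
  "tags": ["street", "night", "neon"],
  "mood": "moody",
  "ocrText": "OPEN 24 HOURS",
  "caption": "Neon sign over a rainy street at night"
}
"""

/// Decodes the payload into the typed entry; throws if a field is missing
/// or has the wrong type.
func decodeEntry(from json: String) throws -> PhotoCatalogEntry {
    let data = Data(json.utf8)
    return try JSONDecoder().decode(PhotoCatalogEntry.self, from: data)
}

let entry = try decodeEntry(from: sampleOutput)
print(entry.tags.joined(separator: ", "))
print(entry.caption)
```

The point of the typed struct is that the app never parses free-form text: a response that lacks a field fails at the decoding boundary instead of surfacing later as a blank caption in the UI.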
For image generation specifically, the pipeline runs in the cloud. Coverage of the larger Apple-Google relationship makes clear that the image stack is not API-shared with Gemini, so creator-tool studios will need to call Apple's Image Playground APIs rather than Imagen or Nano Banana to get the native iOS look and the Private Cloud Compute privacy guarantees.
Image Playground and Spatial Reframing, the consumer-facing layer
The two new creator-facing features in the OS itself are Image Playground and Spatial Reframing.
Image Playground was previously a sticker-and-cartoon generator added in iOS 18. The 2026 version, powered by AFM 3 Cloud Image, adds touch-based image modifications, photoreal output, and per-user personalization. Apple positions Image Playground as a Canva-style consumer surface rather than a competitor to Midjourney or Flux, but the underlying model is the same one third-party apps can call through the framework. Indie design apps that want to ship a Comfy-style node graph against Apple's model now have a private alternative to a Gemini or Stable Diffusion API, with requests handled inside Private Cloud Compute.
Spatial Reframing lives inside the Photos app. It uses Apple's 3D modeling pipeline plus the new multimodal model to let users adjust the angle or composition of an existing photo after the shot. Think of it as Photoshop's Generative Expand combined with NeRF-style novel view synthesis, but running in Photos with no external service. For social-video creators who shoot once and reframe for portrait, landscape, and square, this is the closest thing Apple has shipped to a native post-production tool since Final Cut's smart conform.

How AFM 3 stacks up against the assistants creators already use
| Capability | Apple AFM 3 | OpenAI ChatGPT 5 | Google Gemini 3 | Anthropic Claude Mythos |
|---|---|---|---|---|
| On-device option | Yes (3B + 20B sparse) | No | Gemini Nano on Pixel only | No |
| Image generation | Image Playground (cloud, US first) | GPT-Image 2 (cloud, global) | Nano Banana 2 (cloud, global) | None native |
| Photo editing | Spatial Reframing in Photos | None native | Pics, Photos editor | None native |
| Agentic tools | App Intents via AFM 3 Cloud Pro | Operator / GPT-5 agent | Project Astra agent | Computer Use |
| Developer SDK | Foundation Models framework (Swift, free) | API (per-token) | API (per-token) | API (per-token) |
| Privacy model | On-device or Private Cloud Compute | OpenAI servers | Google servers | Anthropic servers |
The strategic read: creators on Apple devices now have a free, private, on-device path to multimodal AI through the Foundation Models framework, plus a cloud path for image generation that does not require a third-party API key. Apple is not trying to win on raw model quality. It is trying to be the default surface for the everyday creator workflow that does not need GPT-5 or Claude.

Availability and rollout
The new Siri AI, Image Playground, Spatial Reframing, and the third-generation models ship with iOS 27, iPadOS 27, macOS 27, and visionOS 27. Developer betas are available from June 8, per TechCrunch. Public betas land in July, with general availability in the fall alongside the new iPhone and Mac hardware cycle. The Foundation Models framework adds the multimodal AFM 3 Core Advanced as a new model option, gated on devices with the required Neural Engine throughput. The image generation API ships in beta as part of Image Playground extensions for third-party apps later in the developer preview window.
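Because the multimodal model is gated by device capability, apps will likely need a fallback path. The sketch below shows one way to express that decision, using the tiers from the model table earlier in this article. Every name in it (`CreatorTask`, `ModelTier`, `pickTier`, `deviceSupportsAdvanced`) is hypothetical and not an identifier from the Foundation Models framework, and returning `nil` for an unsupported device is an assumption about app design, not documented Apple behavior.

```swift
// Hypothetical names throughout; they mirror this article's model table
// and are not taken from the Foundation Models framework.
enum CreatorTask {
    case text
    case imageUnderstanding
    case imageGeneration
}

enum ModelTier: String {
    case core = "AFM 3 Core"
    case coreAdvanced = "AFM 3 Core Advanced"
    case cloudImage = "AFM 3 Cloud Image"
}

/// Picks a model tier for a task. Returns nil when the task needs the
/// gated on-device multimodal model and the device does not support it,
/// leaving the fallback decision to the app.
func pickTier(for task: CreatorTask, deviceSupportsAdvanced: Bool) -> ModelTier? {
    switch task {
    case .text:
        // Writing, summarization, and classification sit with the 3B model.
        return .core
    case .imageUnderstanding:
        // Multimodal input requires the gated 20B sparse model.
        return deviceSupportsAdvanced ? .coreAdvanced : nil
    case .imageGeneration:
        // Image generation runs in Private Cloud Compute on any device.
        return .cloudImage
    }
}

let choice = pickTier(for: .imageUnderstanding, deviceSupportsAdvanced: true)
print(choice?.rawValue ?? "No on-device multimodal model available")
```

Keeping the decision in one function means the app can disable or hide image-understanding features on unsupported hardware instead of failing at call time.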
Frequently asked questions
Does Apple Intelligence now route prompts through Google's servers?
No. Per MacRumors and Apple's own ML research post, Google's models were used during training and distillation. Inference happens on-device or inside Apple's Private Cloud Compute. Google does not see user prompts.
Can I use the new image generation API in a third-party Mac or iOS app?
Yes, through Image Playground extensions. The full Foundation Models framework documentation at developer.apple.com lists the new image generation entry points alongside the existing text and guided-generation APIs. Image generation is gated to AFM 3 Cloud Image, so requests go through Private Cloud Compute.
How does the on-device 20B sparse model compare to the on-device 3B dense model from 2025?
Apple reports a step-change in quality on writing, summarization, and multimodal benchmarks. The sparse architecture activates 1 to 4 billion weights per request, so memory pressure stays close to a dense 3B model while quality approaches the cloud workhorse from the prior generation. Real-world impact for creators is faster local generation with fewer fallbacks to Private Cloud Compute.
Is the Foundation Models framework still free?
Yes. On-device inference costs nothing per call. Private Cloud Compute calls are bundled into the operating system at no charge, the same arrangement Apple introduced in 2025.
Will Final Cut Pro or Logic Pro pick up new AI features automatically?
Not on day one. Apple typically ships pro-app AI features on a separate cadence after the OS release. Expect Spatial Reframing-style tools to land in Final Cut and Photos workflow extensions in the months after iOS 27 general availability.
What does the Gemini collaboration mean for app developers who already integrate Gemini?
Nothing changes on the Google side. Apps that call the Google Gemini API directly continue to work as before. Apple's deal is about training data and architecture co-development, not API resale.