On June 4, Google DeepMind shipped Magenta RealTime 2 (MRT2), an open-weights live music model that runs locally on Apple Silicon Macs and responds to text, MIDI, and audio prompts with roughly 200ms of latency. The first Magenta RealTime needed a TPU; this one plays from a MacBook.
Try it: Jam with MRT2 in 30 minutes
Download the standalone app from the Magenta MRT2 page: the base weights are 2.5GB and are meant for an M3 Pro or M2 Max, while the 450MB small model runs on any Apple Silicon chip. Set your audio interface to 48kHz stereo, plug in a MIDI controller, and seed the model with a text prompt such as "warm Rhodes chords, slow swing". Hold a chord on the MIDI keyboard and MRT2 starts improvising accompaniment within roughly two frames (~80ms). Drop a four-bar audio loop into the canvas to clone its timbre, then interpolate between two prompts to morph genres live.
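For readers curious what "interpolate between two prompts" can look like numerically, here is a minimal Python sketch. It assumes each prompt is represented as a fixed-length embedding vector and that morphing is a plain linear blend; the article does not say how MRT2 implements this, and the `blend` helper and the 4-dimensional vectors are hypothetical stand-ins rather than real model embeddings.

```python
def blend(a, b, t):
    """Linearly interpolate between two equal-length vectors.

    t = 0.0 returns a, t = 1.0 returns b, values in between mix the two.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be in [0, 1]")
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


# Hypothetical toy embeddings standing in for two text prompts.
jazz = [1.0, 0.0, 0.5, 0.0]
techno = [0.0, 1.0, 0.5, 2.0]

# Sweep from the first prompt to the second in five steps.
for step in range(5):
    t = step / 4
    print(t, blend(jazz, techno, t))
```

In a live set, `t` would typically be driven by a knob or fader so the morph follows the performer instead of a fixed sweep.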
Why It Matters
Music creators have been stuck choosing between high-latency cloud models (Suno, Riffusion) and local audio tools that handle samples rather than continuous generation. MRT2 closes that gap: it is the first open-weights model that streams continuous audio with the latency budget of a real instrument. The Hugging Face model card reports a 40ms frame size (down from 2 seconds in v1) and 15x lower control latency, which makes the model usable for live performance, not just rendered tracks. Combined with this week's wave of local creator tools (see our Stable Audio 3 producer workflow), the on-device audio stack now covers loops, stems, and live generation without leaving your laptop.
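The frame numbers are easy to sanity-check. The sketch below works through the arithmetic using the 40ms frame size from the model card; the 48kHz sample rate is borrowed from the setup step and is an assumption about the output stream, not a documented model parameter, and the helper names are made up for illustration.

```python
FRAME_MS = 40             # frame size reported on the model card
SAMPLE_RATE_HZ = 48_000   # assumption: the interface rate from the setup step


def samples_per_frame(frame_ms, sample_rate_hz):
    """Number of audio samples (per channel) covered by one frame."""
    return sample_rate_hz * frame_ms // 1000


def latency_ms(frames, frame_ms=FRAME_MS):
    """Time spanned by a given number of frames, in milliseconds."""
    return frames * frame_ms


def frames_per_second(frame_ms=FRAME_MS):
    """How many frames the model must produce each second to keep up."""
    return 1000 // frame_ms


print(samples_per_frame(FRAME_MS, SAMPLE_RATE_HZ))  # 1920
print(latency_ms(2))                                # 80
print(frames_per_second())                          # 25
```

Two frames come to 80ms, which matches the response time quoted for held MIDI chords, and the model has to emit 25 frames every second to stay real-time.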
Key Details
MRT2 ships in two sizes (2.4B base, 230M small) under a dual license: the codebase is Apache 2.0 and the model weights are CC-BY 4.0, which allows commercial use with attribution. To hit the frame budget, the architecture moved from a bidirectional encoder-decoder to a decoder-only design with sliding window attention. The magenta-realtime GitHub repo hosts the inference code, AU plugin scaffolding for DAW integration, and reference apps for Max/MSP, Pure Data, and SuperCollider. Control inputs include 128-dim multihot MIDI vectors, text prompts via MusicCoCa embeddings, and 16kHz mono context audio. Compared with the original Magenta RealTime from 2025, MRT2 drops the TPU requirement and adds note-by-note pianoroll conditioning plus a drums on/off switch.
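Two of these details are easier to picture in code. The Python sketch below builds a 128-dim multihot vector from held MIDI notes and a causal sliding-window attention mask. Both are generic illustrations: the assumption that index i corresponds to MIDI note number i, the window size, and the function names are all hypothetical and not taken from the magenta-realtime repo.

```python
def multihot(held_notes, size=128):
    """Return a 0/1 vector with a 1 at each held MIDI note number.

    Assumes index i maps to MIDI note i (0-127), which is the usual
    convention but is not confirmed for MRT2.
    """
    vec = [0] * size
    for note in held_notes:
        if not 0 <= note < size:
            raise ValueError(f"MIDI note out of range: {note}")
        vec[note] = 1
    return vec


def sliding_window_mask(seq_len, window):
    """Causal sliding-window mask.

    mask[i][j] is True when position i may attend to position j,
    i.e. j is not in the future and lies within the last `window`
    positions up to and including i.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


# C major triad: C4, E4, G4.
chord = multihot([60, 64, 67])
print(sum(chord))  # 3

# Each position sees itself and one previous position.
for row in sliding_window_mask(seq_len=4, window=2):
    print(row)
```

The mask shows why a sliding window helps with a fixed frame budget: each new frame attends to a bounded number of past positions, so per-frame cost stays constant however long the session runs.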
What to Do Next
If you already produce with open audio models like MOSS Audio, treat MRT2 as the live-instrument layer rather than a Suno replacement: route MIDI from your DAW, send chord changes from a session view, and bounce the audio output as a rendered stem. Musicians on Logic or Ableton should wait for the AU plugin to stabilize before committing to a session workflow; researchers and Max/MSP users can pull the GitHub code today. Watch for fine-tuning recipes once Google ships the announced musician-tuning toolkit.