ai-audio - Creative AI News

ai-audio Jun 11, 2026

Descript Adds Tone Tags and Claude/ChatGPT Editing

Descript shipped inline tone tags for ElevenLabs V3 speakers and a connector for Claude and ChatGPT on June 11, 2026, pushing further into directable AI editing.

Deep Dive Jun 11, 2026

ElevenLabs Music v2 vs Suno: Best AI Music Tool 2026

ElevenLabs Music v2 and Suno both make broadcast-ready tracks. The real 2026 choice is licensing risk versus creative control. Here is how they compare.

AI Tools Jun 10, 2026

Scribix Transcribes Audio and Video in Your Browser

Scribix is a browser-based AI transcription tool that turns audio and video into editable text with speaker labels, word-level timestamps, and SRT or VTT export.

NVIDIA Jun 4, 2026

NVIDIA Nemotron 3.5 ASR: 40 Languages at 80ms Latency

NVIDIA released Nemotron 3.5 ASR on June 4: an open 600M streaming speech model covering 40 language-locales with sub-100ms latency for voice agents.

AI Tools Jun 4, 2026

Magenta RealTime 2: Google's Live Music Model Runs on Mac

Google Magenta RealTime 2 runs live music generation locally on Mac and Windows, producing instrument tracks in real time from text prompts.

AI Tools Jun 4, 2026

Higgs Audio v3 TTS 4B: 100-Language Voice AI for Chat

Boson AI has released Higgs Audio v3 TTS 4B, a 4-billion-parameter text-to-speech model built for voice chat. It supports 100 languages with zero-shot voice cloning and inline emotion control tokens.

Deep Dive Jun 4, 2026

Stable Audio 3 Workflow: From Prompt to Mastered Track

Step-by-step workflow for taking a Stable Audio 3 text prompt all the way to a mastered track, covering prompt design, stems, DAW arrangement, and mastering. Cost: zero. Time: 45 minutes.

News May 27, 2026

Sesame iOS Preview: 4 Voice Agents, Real-Time Search

Voice AI lab Sesame opened the iOS preview of its mobile app on May 27, 2026, shipping four distinct voice agents with real-time web search and visual information cards.

ai-audio May 27, 2026

The Muser: Local AI Music Generation That Rivals Suno

Musicians and creators who want AI music generation without monthly fees now have a compelling open-source option. The Muser launched on GitHub on May 27, 2026, as a self-hostable platform that generates complete music tracks locally.

Deep Dive May 26, 2026

AI Music Similarity: Search by Melody, Rhythm, or Timbre

MERIT is a new AI framework that learns to separate the melody, rhythm, and timbre components of music similarity. Unlike existing tools that treat a song as a single fingerprint, it lets you ask: do these two tracks share the same groove?

ai-audio May 19, 2026

Audio.Observer: AI Turns Today's News Into Music

Audio.Observer is a new streaming platform that turns daily news headlines into AI-generated songs. It launched on May 19, 2026, and publishes new tracks every day across tech, markets, world events, and culture.

News May 19, 2026

Parakeet TDT v3 Silently Translates French to English

Parakeet TDT v3, the fast NVIDIA transcription model, silently translates French speech to English with up to 31% English intrusion on spontaneous recordings.

ai-audio May 13, 2026

ElevenLabs May Update: GPT-5.4 and RAG for Voice Agents

ElevenLabs shipped a substantial May 13 update: GPT-5.4-mini and nano options in Voice Agents, document-level RAG for knowledge bases, and workspace-level API analytics.

Deep Dive May 5, 2026

Inworld TTS-2: Sub-200ms Voice Beats ElevenLabs and Cartesia

Inworld released Realtime TTS-2 on May 5 with sub-200ms latency, plain-English direction, and one voice across 100+ languages, ranking above ElevenLabs and Google on Speech Arena.

Deep Dive Apr 30, 2026

xAI Grok 4.3 + Custom Voices Bundle Creator Stack

xAI pushed Grok 4.3 to the public API on April 30 and bundled Custom Voices, a one-minute voice cloning workflow, at no per-voice surcharge. The release consolidates text, voice, and creative agents on a single xAI bill.

Deep Dive Apr 2, 2026

Open Source Audio AI Closes the Quality Gap

The first week of April 2026 delivered three significant open-source audio AI models that directly challenge paid alternatives in voice synthesis, sound effects, and multilingual speech.

Deep Dive Mar 27, 2026

The AI Music War: How One Week Fractured an Industry

In one week, Suno v5.5 added voice cloning, Udio pivoted to a walled garden, Google launched Lyria 3 Pro for developers, and Spotify built artist protection tools, fracturing AI music into competing visions.
