OpenAI Acquires Weights.gg, Folds Voice-Cloning Team
OpenAI quietly bought voice-cloning startup Weights.gg, dispersed its six-person team across existing groups, and folded the tech into Voice Engine work.
Supertonic 3 is an open-weights, CPU-only TTS engine from Supertone with 31 languages, expression tags, and zero-shot voice cloning.
Break-the-Beat! is a new AI model that renders drum MIDI patterns as realistic audio using a reference recording for timbre. Researchers released a demo with dozens of examples spanning Speed Metal, Funk Rock, and electronic drum kits.
ScenemaAI released Scenema Audio on Hugging Face and GitHub, an open-weights expressive TTS and zero-shot voice cloning model built on the audio half of Lightricks LTX-2. MIT inference code, 13 languages, real-time on a 24 GB GPU.
Google partners with Believe and TuneCore to route Flow Music, its Lyria 3 Pro song studio, to indie artists. Ambassador program meets Google's product team weekly.
ElevenLabs crossed $500 million in ARR on May 5, 2026, announcing BlackRock, Nvidia, Jamie Foxx, and Eva Longoria as new investors. The $11B valuation makes it the most-funded independent voice AI company.
Ableton Live 12.4 shipped May 5, 2026 as a free update for Live 12. The standout feature is Link Audio, which streams audio wirelessly between two devices on your local network without extra hardware.
Jamie Pine's Voicebox brings seven TTS engines and voice cloning to your local machine. Free, open source, and entirely offline.
On April 17, 2026, at NAB 2026, RODE announced RODECaster Studio, a new Mac and Windows desktop app that brings AI-powered editing to podcast post-production.
Google launched a native Gemini app for Mac on April 16, 2026, bringing image, video, and music generation directly to the desktop for the first time.
Researchers published Darwin-TTS on April 15, 2026, a text-to-speech model that adds emotional expression to AI voice without any training, fine-tuning, or new data.
Google released Gemini 3.1 Flash TTS on April 15, 2026, a text-to-speech model that outperforms ElevenLabs v3 in quality benchmarks while offering a generous free tier.
Splice launched Variations and Craft on April 15, 2026, letting producers remix any sample from its 3-million-sound library while automatically compensating the original creator.
ComfyUI now supports Sonilo via Partner Nodes, letting creators generate full-length soundtracks that sync to video footage frame by frame.
llama.cpp release b8769 adds audio multimodal support for Qwen3-Omni and Qwen3-ASR models, bringing local speech recognition and audio understanding to consumer hardware.
OpenBMB released VoxCPM2, a 2-billion-parameter text-to-speech model that runs on 8 GB of VRAM, supports 30 languages at 48 kHz, and can design voices from natural language descriptions.
ElevenLabs has announced on-premise and on-device deployment options for its voice AI platform, letting organizations run text-to-speech inference entirely within their own infrastructure.
Suno licensing negotiations with Universal Music Group and Sony Music have stalled over a core disagreement: whether users can download and share AI-generated songs outside the platform.
Google quietly launched Eloquent, a free Gemma-powered dictation app for iOS with offline transcription, filler word removal, and text transformation tools.
Sync, the Y Combinator-backed lipsync startup, launched Sync-3 on April 6. The new model processes entire shots at once, producing 4K native output with built-in obstruction detection.
Sony AI has released Woosh, an open-source sound effects foundation model supporting text-to-audio and video-to-audio generation.
OmniVoice is a zero-shot text-to-speech model supporting 600+ languages with voice cloning, an Apache 2.0 license, and inference 40x faster than real-time.
Deezer has licensed its AI-generated music detection technology to Hungary's EJI performers' rights body. The platform flags over 60,000 AI tracks daily, with 39% of all uploads now involving AI.
ElevenLabs and IBM announced a partnership on March 25 to integrate 10,000+ voices across 70 languages into IBM watsonx Orchestrate.