DeepL Acquires Mixhalo for Live Voice Translation
DeepL has acquired Mixhalo, the real-time audio platform built for concerts and conferences, pushing AI voice translation into live events.
DeepL has acquired Mixhalo, the real-time audio platform built for concerts and conferences, pushing AI voice translation into live events.
Cartesia launched Sonic-3.5 and Ink-2 on June 15, 2026, claiming the top streaming voice AI on both speaking and listening, built for real-time voice agents.
TTS Audio Suite shipped v5.0.0 for ComfyUI, adding Higgs Audio v3 voice cloning, Transformers 5, and Runtime Isolation for legacy engines.
The best AI music generators in 2026, from Suno and Udio to ElevenLabs Music, Google Lyria, and Stable Audio. We compare standout features, pricing, and commercial-use rights so you can pick the right tool for songs, scores, or royalty-free background music.
A new open-source Audio-Reactive LoRA for LTX 2.3 turns a still image and an audio track into a music-driven clip, with motion synced to the beat.
Suno's new Advanced Split mode lets Premier subscribers extract any of nearly 100 instruments from a track, from a full drum kit to vocals.
A revamped open-source TTS benchmark now compares 46 text-to-speech models using objective scores and blind human voting, so creators can see which voices actually hold up.
A new research benchmark published June 1, 2026 reveals that state-of-the-art AI music detectors systematically fail on hybrid productions, the kind created by most real-world music producers using tools like Suno or Udio.
OpenMOSS published the MOSS-Audio technical report on June 1, 2026, documenting four open-source audio-language models that achieve benchmark scores rivaling systems three to four times their size.
A developer used Stability AI Stable Audio 3 Medium to generate 15,834 free audio samples: 10,359 drum one-shots and 5,475 pitched instrument recordings, available for immediate download.
Baidu open-sourced NAVA, a 6.3B parameter joint audio-video model that generates 720p video with synced dual-channel audio in a single pass.
parakeet.cpp ports NVIDIA Parakeet automatic speech recognition models to ggml, eliminating the Python runtime entirely — with byte-identical output to NeMo at up to 1.86x faster throughput.
A team of 10 researchers published Dasheng AudioGen on May 27, 2026, a unified model that generates complete audio scenes from text descriptions.
ElevenLabs released Music v2 on May 26, 2026 with dynamic genre transitions and up to 50% lower prices. Live now on ElevenMusic and ElevenCreative.
Orchestria launches May 25 with a multi-agent AI music engine that exposes drums, bass, and melody as editable stems. Free, Pro $8, Maestro $25.
Developer Shukant Pal built Pretzel at the Google I/O hackathon on May 24, 2026: a live web-based music sequencer where an AI agent controls what everyone hears. No signup required.
Sonar entered public beta offering natural-language search across podcasts, news, earnings calls, and radio archives for AI agents. Free tier: 500 queries/month.
Spotify's new Studio desktop app uses an AI agent to generate private, personalized podcasts from your calendar, email, and web browsing. In research preview across 20+ markets.
Spotify announced a new ElevenLabs-powered audiobook creation tool at its May 21 Investor Day, with new audiobook subscription plans expected later in 2026.
Wrap an existing chat agent (Claude, GPT, Gemini) with ElevenLabs Speech Engine voice in roughly 30 minutes. Step-by-step tutorial covers Node and Python SDKs, WebRTC token flow, turn detection, interruption, and the upgrade path to ElevenAgents.
Researchers at Tampere University published a preprint introducing ACAD, an AI system that adapts what it treats as noise based on the acoustic scene it detects.
Stability AI released Stable Audio 3 on May 20, 2026, a family of open-weights latent diffusion models that generate up to 6 minutes and 20 seconds of audio from a text prompt, including on a MacBook Pro M4.
StemDeck v0.5.0 splits any track into vocals, drums, bass, guitar, piano, and other elements. Runs locally, free, no account required.
ComfyUI-DramaBox added LoRA weight injection on May 16, letting creators load custom voice personalities into workflows without model reloads.