Fish Audio S2: Open-Source TTS Beats GPT-4o-mini-tts
Fish Audio released S2, an open-source text-to-speech model trained on over 10 million hours of audio data spanning 50 languages. The model beats GPT-4o-mini-tts on the EmergentTTS-Eval benchmark with an 81.88% win rate, and ships with full weights, fine-tuning code, and a streaming inference engine.
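
For readers unfamiliar with the metric, a win rate of this kind is usually the share of head-to-head comparisons in which a judge prefers one system's output over a baseline's. The sketch below shows one common way to compute it, assuming each comparison is labeled "win", "loss", or "tie" and that a tie counts as half a win. The function name `win_rate`, the `tie_weight` parameter, and the sample judgments are hypothetical and not taken from the benchmark, whose exact scoring rules may differ.

```python
from collections import Counter


def win_rate(outcomes, tie_weight=0.5):
    """Return the win rate (0.0 to 1.0) over pairwise outcomes.

    Assumption: each outcome is "win", "loss", or "tie" from the point of
    view of the model under test, and a tie earns tie_weight of a win.
    """
    if not outcomes:
        raise ValueError("at least one outcome is required")
    counts = Counter(outcomes)
    unknown = set(counts) - {"win", "loss", "tie"}
    if unknown:
        raise ValueError(f"unexpected outcome labels: {sorted(unknown)}")
    return (counts["win"] + tie_weight * counts["tie"]) / len(outcomes)


# Hypothetical judgments, not real benchmark data.
judgments = ["win"] * 7 + ["tie"] * 2 + ["loss"] * 1
print(f"{win_rate(judgments):.2%}")  # prints 80.00%
```

Under this definition, a score of 50% means the two systems are preferred equally often, so a figure above 80% indicates the judge favored the model in the large majority of comparisons.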