Jamie Pine has released Voicebox, a free, open-source desktop application that brings professional voice synthesis to your local machine. With seven text-to-speech engines, voice cloning from a few seconds of audio, and support for 23 languages, it is a privacy-first alternative to cloud-based services like ElevenLabs.
For the broader landscape, see our complete producer guide to AI music and audio in 2026.
What Happened
Pine shared Voicebox on Hacker News on April 20, 2026. The application supports seven TTS engines (Qwen3-TTS, LuxTTS, Kokoro, and four others), each with different trade-offs in voice quality, speed, and language coverage. All processing runs locally, with no server connection required. Prebuilt binaries are available for macOS and Windows, with Docker support documented for Linux environments.
Why It Matters
Cloud-based voice services charge per character or per audio minute and require an internet connection for every generation. Voicebox eliminates both constraints. For podcasters, game developers, and accessibility tool builders who generate large volumes of audio, the cost difference is significant. The privacy angle matters too: audio content and scripts never leave the device.
Voice AI is moving in two directions at once: toward metered cloud APIs and toward free local tools. This week, xAI launched its Grok Voice API at $0.10 per hour for speech-to-text and $4.20 per million tokens for text-to-speech. Voicebox sits on the other side: a free, local option for creators who want text-to-speech without recurring costs or data leaving their machine.
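To make the pricing concrete, here is a rough sketch of the arithmetic using the two Grok Voice API rates quoted above. The monthly workload figure is a made-up example, not a measurement, and the function names are illustrative only. A local tool has no per-generation fee, though hardware and electricity are not counted here.

```python
def tts_cost_usd(tokens: int, usd_per_million_tokens: float = 4.20) -> float:
    """Cloud text-to-speech cost at a flat per-million-token rate."""
    return tokens / 1_000_000 * usd_per_million_tokens


def stt_cost_usd(audio_hours: float, usd_per_hour: float = 0.10) -> float:
    """Cloud speech-to-text cost at a flat hourly rate."""
    return audio_hours * usd_per_hour


# Hypothetical workload: 5 million tokens of narration per month.
monthly_tokens = 5_000_000
print(f"Cloud TTS: ${tts_cost_usd(monthly_tokens):.2f} per month")
```

The cost scales linearly with volume, so the gap against a free local tool widens as output grows.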
Key Details
- TTS engines: Seven supported, including Qwen3-TTS, LuxTTS, and Kokoro
- Voice cloning: Clone any voice from a few seconds of reference audio
- Preset voices: 50+ ready-to-use profiles
- Languages: 23 supported
- Audio effects: Eight post-processing options including pitch shift, reverb, and compression
- Stories Editor: Multi-track timeline editor for podcast-style multi-voice compositions
- REST API: Local API at localhost:17493 for custom workflow integrations
- GPU support: Apple Silicon via MLX, Nvidia via CUDA, AMD via ROCm, Intel Arc via IPEX
- Platforms: macOS and Windows (Docker for Linux)
- Cost: Free, open source on GitHub
What to Do Next
Download Voicebox from the GitHub repository. Prebuilt installers are available for macOS and Windows. After launching, the local REST API documentation is available at localhost:17493/docs for anyone building custom integrations.
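As a starting point for an integration, the sketch below builds a JSON POST request against the local API using only the Python standard library. Only the host and port come from this article. The `/generate` path and the `text` and `voice` fields are hypothetical placeholders, so check localhost:17493/docs for the real endpoint names and request schema before relying on them.

```python
import json
import urllib.request

BASE_URL = "http://localhost:17493"  # local API address given in the article


def build_tts_request(text: str, voice: str, path: str = "/generate"):
    """Build (but do not send) a JSON POST request.

    ASSUMPTION: the path and the "text"/"voice" field names are
    hypothetical; the real schema is listed at localhost:17493/docs.
    """
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 60.0) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


if __name__ == "__main__":
    req = build_tts_request("Hello from Voicebox.", "narrator")
    print(req.get_method(), req.full_url)  # inspect before sending
```

Keeping request construction separate from sending makes it easy to inspect or log the payload first, and to swap in the correct path once you have read the API docs.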
Podcasters should explore the Stories Editor for scripted, multi-character audio. Game developers can use the API to pipe voice generation directly into build or dialogue systems. For a look at hardware-based AI audio production, see the RODECaster Studio launch from NAB 2026, which targets live recording and post-production workflows.
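For a dialogue pipeline, one possible first step is turning a script into per-line generation jobs that a build step can loop over. The sketch below is independent of Voicebox's actual API: the `SPEAKER: line` script format, the `LineJob` structure, and the `.wav` file naming are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LineJob:
    index: int      # 1-based position in the script
    speaker: str    # character name, to be mapped to a voice profile
    text: str       # the line to synthesize
    filename: str   # suggested output file name (hypothetical convention)


def script_to_jobs(script: str) -> list[LineJob]:
    """Parse 'SPEAKER: line' text into jobs, skipping malformed lines."""
    jobs: list[LineJob] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line or ":" not in line:
            continue
        # Split on the first colon only, so colons in dialogue survive.
        speaker, text = (part.strip() for part in line.split(":", 1))
        if not speaker or not text:
            continue
        index = len(jobs) + 1
        slug = speaker.lower().replace(" ", "_")
        jobs.append(LineJob(index, speaker, text, f"{index:03d}_{slug}.wav"))
    return jobs


if __name__ == "__main__":
    demo = "GUARD: Halt!\nMERCHANT: Fresh bread: two coins."
    for job in script_to_jobs(demo):
        print(job.filename, "<-", job.text)
```

Each job can then be sent to the local API, with the speaker name mapped to whichever preset or cloned voice fits the character.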