Voicebox: Free Local Voice Cloning Studio

Jamie Pine has released Voicebox, a free, open-source desktop application that brings professional voice synthesis to your local machine. With seven text-to-speech engines, voice cloning from a few seconds of audio, and support for 23 languages, it is a privacy-first alternative to cloud-based services like ElevenLabs.

For the broader landscape, see our complete producer guide to AI music and audio in 2026.

What Happened

Pine shared Voicebox on Hacker News on April 20, 2026. The application supports seven TTS engines, including Qwen3-TTS, LuxTTS, and Kokoro, each with different trade-offs in voice quality, speed, and language coverage. All processing runs locally on your machine with no server connection required. Prebuilt binaries are available for macOS and Windows, with Docker support documented for Linux environments.

Why It Matters

Cloud-based voice services charge per character or per audio minute and require an internet connection for every generation. Voicebox eliminates both constraints. For podcasters, game developers, and accessibility tool builders who generate large volumes of audio, the cost difference is significant. The privacy angle matters too: audio content and scripts never leave the device.

Voice AI is moving in two directions at once: metered cloud APIs on one side and free local tools on the other. This week, xAI launched its Grok Voice API at $0.10 per hour for speech-to-text and $4.20 per million tokens for text-to-speech. Voicebox offers a free, local alternative for creators who want similar capability without recurring costs or data leaving their machine.
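To put the metered pricing in concrete terms, here is a rough sketch of the arithmetic. The only figure taken from above is the $4.20 per million tokens rate; the workload size is a hypothetical assumption chosen for illustration, and the function name is ours.

```python
# Text-to-speech rate quoted above for the Grok Voice API.
TTS_USD_PER_MILLION_TOKENS = 4.20


def cloud_tts_cost(tokens: int, usd_per_million: float = TTS_USD_PER_MILLION_TOKENS) -> float:
    """Return the metered cost in USD for synthesizing `tokens` tokens."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return tokens / 1_000_000 * usd_per_million


# Hypothetical workload (an assumption, not a figure from the article):
# 50 million tokens of script per month, sustained for a year.
monthly_tokens = 50_000_000
yearly_cost = cloud_tts_cost(monthly_tokens * 12)
print(f"Estimated yearly cloud TTS cost: ${yearly_cost:,.2f}")
```

Under that assumed workload, the metered bill comes to about $2,520 per year, and it grows linearly with volume. A local tool has no per-token charge, so the comparison shifts to hardware and generation time.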

Key Details

TTS engines: Seven supported, including Qwen3-TTS, LuxTTS, and Kokoro
Voice cloning: Clone any voice from a few seconds of reference audio
Preset voices: 50+ ready-to-use profiles
Languages: 23 supported
Audio effects: Eight post-processing options including pitch shift, reverb, and compression
Stories Editor: Multi-track timeline editor for podcast-style multi-voice compositions
REST API: Local API at localhost:17493 for custom workflow integrations
GPU support: Apple Silicon via MLX, Nvidia via CUDA, AMD via ROCm, Intel Arc via IPEX
Platforms: macOS and Windows (Docker for Linux)
Cost: Free, open source on GitHub

What to Do Next

Download Voicebox from the GitHub repository. Prebuilt installers are available for macOS and Windows. After launching, the local REST API documentation is available at localhost:17493/docs for anyone building custom integrations.
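Before building against the API, it helps to confirm that the local server is actually up. The following is a minimal sketch using only the Python standard library. The host, port, and /docs path come from the details above; the helper names are illustrative, plain HTTP is assumed, and no other Voicebox endpoints are assumed here, so consult the /docs page for the real routes.

```python
import urllib.error
import urllib.request

VOICEBOX_HOST = "localhost"
VOICEBOX_PORT = 17493  # port stated in the article


def docs_url(host: str = VOICEBOX_HOST, port: int = VOICEBOX_PORT) -> str:
    """Build the URL of the local API documentation page (plain HTTP assumed)."""
    return f"http://{host}:{port}/docs"


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if the URL answers with a 2xx status, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False


url = docs_url()
if is_reachable(url):
    print(f"Voicebox API docs are available at {url}")
else:
    print(f"Nothing answered at {url}; is Voicebox running?")
```

A check like this is useful at the start of a build script or pipeline step, so that a missing local server fails fast with a clear message instead of surfacing as a confusing error later.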

Podcasters should explore the Stories Editor for scripted, multi-character audio. Game developers can use the API to pipe voice generation directly into build or dialogue systems. For a look at hardware-based AI audio production, see the RODECaster Studio launch from NAB 2026, which targets live recording and post-production workflows.
