Guide · updated July 2026 · licenses checked

6 Best Self-Hosted ElevenLabs Alternatives

ElevenLabs is excellent and priced like it knows it. Here are the open-source text-to-speech and voice-cloning models you can run on your own hardware instead — including the license traps that make some of them illegal to use commercially.

TL;DR — our picks

Best all-rounder: Kokoro. Small, fast, permissive license, shockingly good quality for its size — runs on a CPU.
Voice cloning: Chatterbox. MIT-licensed cloning from a short audio sample, with emotion control.
Smart-home / low power: Piper. Real-time on a Raspberry Pi.

Model        | Voice cloning | Commercial use           | Hardware           | Best for
Kokoro       | No            | ✔ Permissive             | CPU is fine        | Narration, apps, general TTS
Chatterbox   | Yes           | ✔ MIT                    | GPU recommended    | Voice cloning with emotion control
OpenVoice v2 | Yes           | ✔ MIT                    | GPU recommended    | Cloning + tone/style control
Piper        | No            | ✔ MIT                    | CPU / Raspberry Pi | Home Assistant, embedded, speed
XTTS-v2      | Yes           | ✘ Non-commercial         | GPU, ~4GB+ VRAM    | Multilingual cloning (personal use)
F5-TTS       | Yes           | ✘ Weights non-commercial | GPU recommended    | Research-grade cloning quality

The license trap: with TTS, the GitHub code and the model weights often carry different licenses. A repo can be MIT while the downloadable voice model forbids commercial use. If you're making money from the audio (YouTube counts), check the weights license — we've marked the status in the table.
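
For a real project, the table boils down to two questions: do you need cloning, and will the audio earn money? Here is a minimal sketch of that filter, with the rows transcribed from the table above. The `TTSModel` class and `shortlist` function are names made up for this illustration and are not part of any of these projects. Treat the flags as a starting point and still read the weights license yourself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTSModel:
    name: str
    cloning: bool
    commercial_ok: bool  # status of the weights license, as marked in the table
    hardware: str


# Rows transcribed from the comparison table in this guide.
MODELS = [
    TTSModel("Kokoro", False, True, "CPU is fine"),
    TTSModel("Chatterbox", True, True, "GPU recommended"),
    TTSModel("OpenVoice v2", True, True, "GPU recommended"),
    TTSModel("Piper", False, True, "CPU / Raspberry Pi"),
    TTSModel("XTTS-v2", True, False, "GPU, ~4GB+ VRAM"),
    TTSModel("F5-TTS", True, False, "GPU recommended"),
]


def shortlist(need_cloning: bool, commercial: bool) -> list[str]:
    """Return the names of models that meet the cloning and licensing needs."""
    return [
        m.name
        for m in MODELS
        if (m.cloning or not need_cloning) and (m.commercial_ok or not commercial)
    ]


print(shortlist(need_cloning=True, commercial=True))
# prints ['Chatterbox', 'OpenVoice v2']
```

Drop the commercial requirement and XTTS-v2 and F5-TTS come back into play, which is exactly the personal-use carve-out described in their entries.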

The six, in detail

01 · Kokoro · Small model, big voice · ALL-ROUNDER

Kokoro became the community favorite for a reason: it's a tiny model (82M parameters, roughly 85 times smaller than a 7B-parameter chat model) that produces clean, natural narration, runs in real time on ordinary CPUs, and ships under a permissive license (Apache 2.0). If you want "read this text aloud, nicely, on my own machine," start here and you may never need anything else.

Trade-off: you pick from its built-in voices — it doesn't clone yours.
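
"Real time" is something you can measure: divide the time it takes to synthesize a clip by the length of the audio that comes out. Below 1.0 means the model speaks faster than the listener hears. Here is a rough sketch of that measurement. The `synthesize` callable is a hypothetical stand-in for whichever engine you wire up, and the 24 kHz sample rate and the fake timings are assumptions for the demo, not benchmarks of any model in this guide.

```python
import time
from typing import Callable


def real_time_factor(
    synthesize: Callable[[str], int], text: str, sample_rate: int
) -> float:
    """Return synthesis time divided by audio duration (below 1.0 beats real time).

    `synthesize` is a hypothetical callable that renders `text` and returns
    the number of audio samples it produced.
    """
    start = time.perf_counter()
    num_samples = synthesize(text)
    elapsed = time.perf_counter() - start
    if num_samples <= 0:
        raise ValueError("synthesize() produced no audio")
    audio_seconds = num_samples / sample_rate
    return elapsed / audio_seconds


def fake_synthesize(text: str) -> int:
    # Stand-in engine: sleeps briefly, then pretends each character
    # produced 0.05 s of audio at an assumed 24 kHz sample rate.
    time.sleep(0.01)
    return int(len(text) * 0.05 * 24_000)


rtf = real_time_factor(fake_synthesize, "Read this text aloud, nicely.", 24_000)
print(f"RTF = {rtf:.3f}")
```

Swap `fake_synthesize` for a wrapper around your actual model and run it on the machine you plan to deploy on. Numbers from someone else's hardware tell you very little.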

Kokoro on GitHub →

02 · Chatterbox · Open voice cloning, done right · CLONING

Released by Resemble AI under MIT, Chatterbox clones a voice from a short reference clip and — its party trick — lets you dial emotion and intensity up or down. It's the closest open-source answer to what people actually buy ElevenLabs for.

Trade-off: it wants a GPU for comfortable generation speeds, and like all cloning models, output quality tracks the quality of your reference audio.

Chatterbox on GitHub →

03 · OpenVoice v2 · Clone the voice, control the tone · CLONING

OpenVoice (from MyShell) separates whose voice from how it's spoken — clone a speaker, then independently adjust style. Version 2 moved to an MIT license, which made it safe for commercial projects.

Trade-off: setup is more researcher-flavored than Chatterbox; expect to read the README properly.

OpenVoice on GitHub →

04 · Piper · Real-time TTS on a Pi · LOW POWER

Piper is the workhorse of the Home Assistant world: MIT-licensed, absurdly efficient, real-time speech on hardware as small as a Raspberry Pi, with a big catalog of ready-made voices across many languages. For smart-home announcements, accessibility, or anything embedded, it's the default.

Trade-off: voices are pleasant but noticeably more "TTS" than Kokoro or the cloning models.

Piper on GitHub →

05 · XTTS-v2 · Great cloning, restrictive license · CAVEAT

Coqui's XTTS-v2 does multilingual voice cloning from a few seconds of audio and still holds up. But two caveats: the company behind it shut down (the community keeps forks alive), and the model license is non-commercial. Fine for personal projects; not for anything that earns.

Coqui TTS on GitHub →

06 · F5-TTS · Research-grade, license-limited · CAVEAT

F5-TTS produces some of the most natural cloned speech in open source. The catch mirrors XTTS: the released model weights carry non-commercial terms (they were trained on a non-commercial dataset). Beautiful for experiments and personal use; check licensing carefully before anything commercial.

F5-TTS on GitHub →

Hardware reality check

Good news: TTS is far lighter than chat models.
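
To put a number on "lighter": the raw weights take roughly parameter count times bytes per parameter. The back-of-envelope sketch below uses Kokoro's 82M parameters from this guide. The 7B chat model is an assumed comparison point, and the figures ignore activations, runtime overhead, and quantization, so read them as orders of magnitude only.

```python
def weights_size_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


KOKORO_PARAMS = 82e6     # from the Kokoro entry in this guide
CHAT_MODEL_PARAMS = 7e9  # assumption: a typical small self-hosted chat model

print(f"Kokoro, 32-bit floats: {weights_size_gb(KOKORO_PARAMS, 4):.3f} GB")
# prints Kokoro, 32-bit floats: 0.328 GB
print(f"7B chat model, 16-bit floats: {weights_size_gb(CHAT_MODEL_PARAMS, 2):.1f} GB")
# prints 7B chat model, 16-bit floats: 14.0 GB
```

That gap is why the table can say "CPU is fine" for the small models, while the chat models people self-host usually want a dedicated GPU.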

Honest exit ramp: if you need many languages, consistent studio quality, and zero maintenance — the self-hosted stack will feel like a hobby, because it is one. ElevenLabs' paid tiers exist for a reason. See ElevenLabs pricing →

FAQ

Is there a truly free ElevenLabs alternative?

Yes — everything above is free to run. Your costs are hardware and time. Kokoro on a CPU is the lowest-friction starting point.

Can I legally clone my own voice?

Cloning your own voice is fine. Cloning someone else's without consent ranges from ethically bad to legally actionable, depending on where you live and what you do with it. Don't.

Which one should a beginner start with?

Kokoro if you just need speech; Chatterbox if you specifically want cloning. Both have active communities when you get stuck.