So yes, espeak exists and still sounds terrible, even worse than picoTTS (last update 4 years ago?). So what else is there? I looked at mimic3, and its repo says the project is dead and that one should go for piper instead: https://github.com/MycroftAI/mimic3 Following the link to piper, I get: https://github.com/rhasspy/piper "This repository was archived by the owner on Oct 6, 2025. It is now read-only."
OK, so Coqui? https://github.com/coqui-ai/TTS No update in over 12 months… how bad can it be? https://coqui.ai/ …great, it is a gambling page now.
So, what are you using? gTTS is not offline.


Kokoro is your best bet right now. It works wonderfully even in a docker container with no GPU. There are others but I don’t have the list right now. Will throw another update on here when I do.
The rhasspy guy was very invested in Coqui. He built a lot of his own stuff, for his home automation and such. But Coqui was superior, so he started spending time on that.
Unfortunately, the Coqui team (which came out of Mozilla) was very distracted and didn’t ship a lot of stuff on time, or at all. It doesn’t even have basic stuff like SSML support right now, if I recall correctly. So the rhasspy guy also lost steam.
Of course, with the OpenAI model of audio generation, you’re expected to not use SSML at all and just use the black box API to get “good enough” results. That really sucks.
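For anyone who hasn't used SSML: it is just XML markup around the text that tells the engine where to pause, slow down, and so on. Here is a minimal sketch of what you lose without it, using only Python's standard library. The helper name `build_ssml` and the sample sentences are made up for illustration. `speak`, `break`, and `prosody` are standard SSML elements, but whether a given engine actually honors them is up to that engine, and a strict document would usually also declare the SSML namespace and language on `speak`.

```python
import xml.etree.ElementTree as ET


def build_ssml(first: str, second: str, pause_ms: int = 500, rate: str = "slow") -> str:
    """Build a small SSML string: first sentence, a pause, then the second
    sentence spoken at a different rate. (Hypothetical helper, for illustration.)"""
    speak = ET.Element("speak")
    speak.text = first
    # <break time="..."/> asks the engine for an explicit pause
    ET.SubElement(speak, "break", {"time": f"{pause_ms}ms"})
    # <prosody rate="..."> asks the engine to change the speaking rate
    prosody = ET.SubElement(speak, "prosody", {"rate": rate})
    prosody.text = second
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("The lights are on.", "Turning them off now."))
```

With a black-box API you can't express any of this, you just send plain text and hope the pacing comes out right.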
Oh, I just remembered which other one I wanted to mention - someone has built an open source version of NotebookLM, complete with multi-voice support. But it requires a GPU, I believe. Do what you will with that. I’ll add a link if I find it.
I prefer Kokoro because it’s really solid and works really well on CPU.