Title question mostly. I’ve played with XTTS-v2 and it worked pretty well, but I’m wondering if folks are using anything else special. I’d like to train my own voice fine-tune, which is what I did with XTTS-v2, and then use it with Home Assistant’s voice feature. All opinions welcome!
If you need English, right now it’s Kokoro-FastAPI (https://github.com/remsky/Kokoro-FastAPI). Set this container up and use it as an OpenAI TTS endpoint with this HACS integration: https://github.com/sfortis/openai_tts
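Before wiring it into Home Assistant, it can help to confirm the container answers OpenAI-style speech requests. Here is a rough sketch using only the Python standard library. The host and port (`localhost:8880`), the model name `kokoro`, and the voice `af_bella` are assumptions on my part, so check the Kokoro-FastAPI README for the values your container actually uses. The request shape (a JSON POST to `/v1/audio/speech` with `model`, `input`, and `voice`) follows the OpenAI speech API.

```python
import json
import urllib.request

# Assumptions (not from the thread): the container listens on localhost:8880
# and accepts the model name "kokoro" and the voice "af_bella".
BASE_URL = "http://localhost:8880/v1"


def build_speech_request(text, voice="af_bella", model="kokoro", base_url=BASE_URL):
    """Build an OpenAI-style POST /audio/speech request without sending it."""
    payload = {
        "model": model,
        "input": text,
        "voice": voice,
        "response_format": "mp3",
    }
    return urllib.request.Request(
        f"{base_url}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="out.mp3", **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With the container running, `synthesize("The kitchen light is on.")` should leave an `out.mp3` you can play back. If that works, the same base URL is what you would point the integration at.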
Very nice! I’ll check this out!
Piper works pretty well. I’m only using it because it was easier to find a custom GLaDOS voice.
Kokoro has good default voices. I also started trying out Speaches recently. It provides an OpenAI API wrapper around several options.
Any tips on getting Speaches to work with Home Assistant? Got Speaches working but haven’t gone the next step yet.
Don’t know much about the training side of things, but I have Piper set up with Home Assistant using the Wyoming protocol and it just goes. Some of the out-of-the-box voices are pretty decent too.
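If Home Assistant can’t find the Piper server, a quick way to rule out networking is to talk Wyoming to it directly. As I understand the protocol, each event is a single line of JSON with a `type` field, followed by a newline and an optional binary payload, and a `describe` event should get an `info` event back. The sketch below assumes that framing and assumes Piper is listening on TCP port 10200, so adjust both if your setup differs. `encode_event` and `describe` are names I made up for illustration, not part of any library.

```python
import json
import socket


def encode_event(event_type, data=None, payload=b""):
    """Frame one Wyoming-style event: JSON header line, then optional payload."""
    header = {"type": event_type}
    if data is not None:
        header["data"] = data
    if payload:
        header["payload_length"] = len(payload)
    return json.dumps(header).encode("utf-8") + b"\n" + payload


def describe(host="localhost", port=10200, timeout=5.0):
    """Send a describe event and return the server's reply header as a dict.

    Assumption: the server answers with one JSON header line, optionally
    followed by `data_length` bytes of extra JSON, which is returned raw.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_event("describe"))
        with sock.makefile("rb") as reader:
            header = json.loads(reader.readline())
            extra_length = header.get("data_length") or 0
            extra = reader.read(extra_length) if extra_length else b""
    return header, extra
```

Calling `describe()` on the machine that runs Piper should return a header whose `type` is `info`. A connection error or timeout points at the container or firewall, not at Home Assistant.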
Pretty much just personal preference at this point. XTTS is certainly not the most efficient though.
Any personal preferences you’d recommend?
Pico, Piper, Mary, and Google all run locally and on CPU only.
I think all the rest require cloud accounts or acceleration hardware to work quickly.
I’m personally fine with Mary or Piper, but I know some people like the fancier ones.
Google? Have you verified that?
Yes. Have a look at the docs: https://www.home-assistant.io/integrations/google_translate/