XTTS Deep50D's fixed Glz remake (Functional Text-to-Speech)

XTTS is a voice generation model that lets you clone voices into different languages using just a quick 3-second audio clip. XTTS builds on previous research, like Tortoise, with additional architectural innovations and training to make cross-language voice cloning and multilingual speech generation possible. This is the same model that powers our creator application Coqui Studio as well as the Coqui API. In production we apply modifications to make low-latency streaming possible. Leave a star on the TTS repository on GitHub, where our open-source inference and training code lives.
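For readers who want to try cross-language cloning outside this demo, here is a minimal sketch using the open-source Coqui TTS Python package. It is an illustration under stated assumptions, not the code this space runs: the checkpoint name is the public XTTS v2 entry from the Coqui model zoo (the text above does not say which checkpoint is loaded here), and `build_request`, `synthesize`, and the file names are hypothetical helpers introduced for the example.

```python
from pathlib import Path

# Assumption: the public XTTS v2 checkpoint name from the Coqui TTS model zoo.
# The demo description does not state which checkpoint it actually loads.
MODEL_NAME = "tts_models/multilingual/multi-dataset/xtts_v2"


def build_request(text, speaker_wav, language, out_path="output.wav"):
    """Validate inputs and return keyword arguments for TTS.tts_to_file."""
    if not text.strip():
        raise ValueError("text must not be empty")
    if not language:
        raise ValueError("language code is required, e.g. 'en'")
    return {
        "text": text.strip(),
        "speaker_wav": str(Path(speaker_wav)),  # short reference clip of the voice
        "language": language,                   # target language of the output
        "file_path": str(Path(out_path)),       # where the generated audio goes
    }


def synthesize(text, speaker_wav, language, out_path="output.wav"):
    """Clone the voice in speaker_wav and speak `text` in `language`."""
    # Imported lazily so build_request works without the TTS package installed.
    from TTS.api import TTS

    tts = TTS(MODEL_NAME)  # downloads the checkpoint on first use
    kwargs = build_request(text, speaker_wav, language, out_path)
    tts.tts_to_file(**kwargs)
    return kwargs["file_path"]
```

A call such as `synthesize("Bonjour tout le monde", "reference.wav", "fr")` would speak the French sentence in the voice from `reference.wav`, assuming the `TTS` package is installed and that file exists.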

For faster inference without waiting in the queue, you should duplicate this space.

Sending illegal content of any kind, in any language, is of course FORBIDDEN. This model is for strictly ETHICAL AND MORAL use only, and the authors of this space cannot be held responsible for those who violate that condition.

Fixed a few things to make this demo work again and corrected some others. There is possibly still a bunch to fix or reinstate. xDeep50D

By using this demo you agree to the terms of the Coqui Public Model License at https://coqui.ai/cpml