Recent Releases of tts
tts - v0.22.0
What's Changed
- fix: Few typos in Tortoise docs. by @VladCuciureanu in https://github.com/coqui-ai/TTS/pull/3352
- fix pause problem of Chinese speech by @aaron-lii in https://github.com/coqui-ai/TTS/pull/3351
- Fix typos by @omahs in https://github.com/coqui-ai/TTS/pull/3368
- Print message for either commercial license or CPML by @JRMeyer in https://github.com/coqui-ai/TTS/pull/3381
- Add inference parameters by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3373
- Training fastspeech2 with External Speaker Embeddings by @freds0 in https://github.com/coqui-ai/TTS/pull/3404
- fixes a typo by @joelhoward0 in https://github.com/coqui-ai/TTS/pull/3392
- support multiple GPU training for XTTS by @aaron-lii in https://github.com/coqui-ai/TTS/pull/3391
- Add studio speakers to open source XTTS! by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3405
New Contributors
- @VladCuciureanu made their first contribution in https://github.com/coqui-ai/TTS/pull/3352
- @aaron-lii made their first contribution in https://github.com/coqui-ai/TTS/pull/3351
- @omahs made their first contribution in https://github.com/coqui-ai/TTS/pull/3368
- @JRMeyer made their first contribution in https://github.com/coqui-ai/TTS/pull/3381
- @freds0 made their first contribution in https://github.com/coqui-ai/TTS/pull/3404
- @joelhoward0 made their first contribution in https://github.com/coqui-ai/TTS/pull/3392
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.21.3...v0.22.0
- Python
Published by erogol about 2 years ago
tts - v0.21.3
What's Changed
- Add XTTS Fine tuning gradio demo by @Edresson in https://github.com/coqui-ai/TTS/pull/3296
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.21.2...v0.21.3
No-Code XTTS fine-tuning
We created a UI that you can use to fine-tune XTTS with your data. You can run it on Colab, locally, or on a server.
@WeberJulian has also recorded a step-by-step video tutorial.
You can also follow the XTTS docs if you are a read-and-learn type.
- Python
Published by erogol about 2 years ago
tts - v0.21.2
What's Changed
- Run XTTS models by direct name with versions by @erogol in https://github.com/coqui-ai/TTS/pull/3318
- fix: correctly strip/restore initial punctuation by @eginhard in https://github.com/coqui-ai/TTS/pull/3336
- Fix link to installation instructions by @Vuizur in https://github.com/coqui-ai/TTS/pull/3329
New Contributors
- @Vuizur made their first contribution in https://github.com/coqui-ai/TTS/pull/3329
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.21.1...v0.21.2
This release allows running XTTS models with version tags, so you can access any version you like.
```python
from TTS.api import TTS

# get v2.0.2
tts = TTS(model_name="xtts_v2.0.2", gpu=True)

# get the latest version
tts = TTS(model_name="xtts", gpu=True)

# generate speech by cloning a voice using default settings
tts.tts_to_file(
    text="Here is my sample text.",
    file_path="output.wav",
    speaker_wav=["reference.wav", "reference1.wav"],
    language="en",
)
```
Automatic sentence splitting is now optional, so you can apply any custom logic to the text before passing it to the model. To disable it, set `split_sentences=False`.
```python
from TTS.api import TTS

# get v2.0.2
tts = TTS(model_name="xtts_v2.0.2", gpu=True)

# generate speech by cloning a voice, with automatic sentence splitting disabled
tts.tts_to_file(
    text="Here is my sample text.",
    file_path="output.wav",
    speaker_wav=["reference.wav", "reference1.wav"],
    language="en",
    split_sentences=False,
)
```
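With `split_sentences=False`, chunking the text becomes the caller's job. The following is a minimal sketch of one way to do that, not part of the release itself: the splitter, its regular expression, the helper names, and the output file naming are all hypothetical. Only `tts_to_file` and its keyword arguments come from the example above.

```python
import re
from typing import List


def split_custom(text: str) -> List[str]:
    """Hypothetical splitter: break after '.', '!', '?' or ';' followed by whitespace."""
    parts = re.split(r"(?<=[.!?;])\s+", text.strip())
    return [p for p in parts if p]


def synthesize_chunks(tts, text: str, prefix: str = "chunk") -> List[str]:
    """Write one wav file per chunk, with the library's own splitting turned off.

    `tts` is expected to be a TTS.api.TTS instance loaded with an XTTS model.
    The reference file name and the output naming scheme are assumptions.
    """
    paths = []
    for i, chunk in enumerate(split_custom(text)):
        path = f"{prefix}_{i}.wav"
        tts.tts_to_file(
            text=chunk,
            file_path=path,
            speaker_wav=["reference.wav"],
            language="en",
            split_sentences=False,
        )
        paths.append(path)
    return paths
```

Calling `synthesize_chunks(tts, "First part; second part. Third part!")` would produce one file per chunk, each synthesized without further splitting by the library.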
- Python
Published by erogol about 2 years ago
tts - v0.21.0
What's Changed
- Remove duplicate/unused code by @eginhard in https://github.com/coqui-ai/TTS/pull/3243
- Making the Model Manager's Progress bar statically accessible via the class. by @FlorianEagox in https://github.com/coqui-ai/TTS/pull/3297
- More informative error for wrong --language argument by @eginhard in https://github.com/coqui-ai/TTS/pull/3294
- Don't pass quotes to espeak by @eginhard in https://github.com/coqui-ai/TTS/pull/3286
- Fix tts_with_vc by @eginhard in https://github.com/coqui-ai/TTS/pull/3275
- Misjudgment of `is_multi_lingual` When Loading Multilingual Model via `model_path` by @TITC in https://github.com/coqui-ai/TTS/pull/3273
- Introducing Development Dockerfile by @Kaszanas in https://github.com/coqui-ai/TTS/pull/3263
- update deepspeed version by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3281
New Contributors
- @FlorianEagox made their first contribution in https://github.com/coqui-ai/TTS/pull/3297
- @TITC made their first contribution in https://github.com/coqui-ai/TTS/pull/3273
- @Kaszanas made their first contribution in https://github.com/coqui-ai/TTS/pull/3263
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.6...v0.21.0
- Python
Published by erogol about 2 years ago
tts - v0.20.6
What's Changed
- Remove duplicate AudioProcessor code, fix ExtractTTSpectrogram.ipynb by @eginhard in https://github.com/coqui-ai/TTS/pull/3230
- Add sentence splitting by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3227
- Fix zh bug by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3238
- Update versions by @erogol in https://github.com/coqui-ai/TTS/pull/3248
- Ensures that only GPT model is in training mode during XTTS GPT training by @Edresson in https://github.com/coqui-ai/TTS/pull/3241
- Loosen dependencies and make k_diffusion optional by @erogol in https://github.com/coqui-ai/TTS/pull/3249
- Update XTTS v2.0.2 by @erogol in https://github.com/coqui-ai/TTS/pull/3249
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.5...v0.20.6
- Python
Published by erogol over 2 years ago
tts - v0.20.5
What's Changed
- Add speed control for inference by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3214
- Update README.md by @eltociear in https://github.com/coqui-ai/TTS/pull/3215
- Fix XTTS GPT padding and inference issues by @Edresson in https://github.com/coqui-ai/TTS/pull/3216
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.4...v0.20.5
- Python
Published by erogol over 2 years ago
tts - v0.20.4
What's Changed
- Update XTTS cloning by @erogol in https://github.com/coqui-ai/TTS/pull/3207
- fix max generation length for XTTS by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3208
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.3...v0.20.4
- Python
Published by erogol over 2 years ago
tts - v0.20.3
What's Changed
- XTTS- Torchaudio should use proper backend to load audio by @gorkemgoknar in https://github.com/coqui-ai/TTS/pull/3179
- PyTorch 2.1 Updates (Weight Norm and TorchAudio I/O) by @MattyB95 in https://github.com/coqui-ai/TTS/pull/3176
- xtts/tokenizer: merge duplicate implementations of preprocess_text by @akx in https://github.com/coqui-ai/TTS/pull/3170
- fix(formatters): set missing root_path attribute by @eginhard in https://github.com/coqui-ai/TTS/pull/3182
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.2...v0.20.3
- Python
Published by erogol over 2 years ago
tts - v0.20.2
What's Changed
- Add char limit warn by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3130
- Fix coqui api by @erogol in https://github.com/coqui-ai/TTS/pull/3168
- Fix #3153 by @erogol in https://github.com/coqui-ai/TTS/pull/3169
- Move FreeVCConfig to TTS.vc.configs (like all other config classes) by @akx in https://github.com/coqui-ai/TTS/pull/3126
- Fix ModelManager.list_models() by @eginhard in https://github.com/coqui-ai/TTS/pull/3128
- Fix for exception on streaming on last chunk by @gorkemgoknar in https://github.com/coqui-ai/TTS/pull/3160
- Add lang code in XTTS doc by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3158
- Remove v1 doc and tests by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3172
New Contributors
- @eginhard made their first contribution in https://github.com/coqui-ai/TTS/pull/3128
- @gorkemgoknar made their first contribution in https://github.com/coqui-ai/TTS/pull/3160
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.1...v0.20.2
- Python
Published by erogol over 2 years ago
tts - v0.20.1
What's Changed
- Drop diffusion from XTTS by @erogol in https://github.com/coqui-ai/TTS/pull/3150
- Bug fixes and add support for multiples speaker references on XTTS inference by @Edresson in https://github.com/coqui-ai/TTS/pull/3149
- Fix XTTS v2.0 training recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/3154
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.0...v0.20.1
- Python
Published by erogol over 2 years ago
tts - v0.20.0
What's Changed
- Run `make style` & re-enable it in CI by @akx in https://github.com/coqui-ai/TTS/pull/3127
- XTTS v2.0 by @Edresson in https://github.com/coqui-ai/TTS/pull/3137
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.19.1...v0.20.0
- Python
Published by erogol over 2 years ago
tts - v0.19.1
What's Changed
- Second round of issue fixing for XTTS v1.1 by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3103
- fix for issue 3067 by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/3109
- Bug: self.model_name needed to be initialized. by @vltmedia in https://github.com/coqui-ai/TTS/pull/2983
New Contributors
- @vltmedia made their first contribution in https://github.com/coqui-ai/TTS/pull/2983
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.19.0...v0.19.1
- Python
Published by erogol over 2 years ago
tts - v0.18.0
What's Changed
- XTTS v1.1 by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3089
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.10...v0.18.0
XTTS v1.1
This model is trained on top of XTTS v1, using output masking. We mask the part of the output that is used as the audio prompt while training and don't compute loss for that segment. This helps us to resolve the hallucination issue that V1 experienced.
Changes
- Add Japanese
- Resolve the hallucination issue (repeating the audio prompt)
- Increased expressivity
- Hash check to control model version
- Added `ne_hifigan`, which was trained without denoising, since denoising introduced an EQ and compression profile that might be unwanted for some use cases
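The output masking described above can be illustrated with a small, framework-free sketch. It is not the XTTS training code: the function name, the per-position log-probability layout, and the idea of passing the prompt as a `(start, end)` span are assumptions made for illustration. It only shows the stated principle, that positions used as the audio prompt contribute nothing to the loss.

```python
import math
from typing import List, Sequence, Tuple


def masked_nll(
    log_probs: Sequence[Sequence[float]],
    targets: Sequence[int],
    prompt_span: Tuple[int, int],
) -> float:
    """Mean negative log-likelihood over positions outside the prompt span.

    log_probs[i][k] is the log-probability of token k at position i.
    prompt_span is a half-open (start, end) range of positions that were
    used as the audio prompt and are therefore excluded from the loss.
    """
    start, end = prompt_span
    mask: List[float] = [0.0 if start <= i < end else 1.0 for i in range(len(targets))]
    total = sum(-log_probs[i][t] * m for i, (t, m) in enumerate(zip(targets, mask)))
    kept = sum(mask)
    return total / kept if kept else 0.0


# Two-token vocabulary, four positions; the first two positions are the prompt.
uniform = [math.log(0.5), math.log(0.5)]
bad = [math.log(0.01), math.log(0.99)]
loss = masked_nll([bad, bad, uniform, uniform], [0, 0, 0, 1], prompt_span=(0, 2))
```

In this example the poorly predicted prompt positions do not affect `loss`, which depends only on the last two positions.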
- Python
Published by erogol over 2 years ago
tts - v0.17.9
What's Changed
- fixed bugs in fastpitch tts synthesis by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/3058
- Update AnalyzeDataset.ipynb by @meryemsakin in https://github.com/coqui-ai/TTS/pull/2783
- Synthesizer skips over embeddings file if model only has one speaker by @wonkothesanest in https://github.com/coqui-ai/TTS/pull/2587
- fixed typo of docs\source\implementing_a_new_model.md by @Subash-Lamichhane in https://github.com/coqui-ai/TTS/pull/3066
- fixed typo of /docs by @Subash-Lamichhane in https://github.com/coqui-ai/TTS/pull/3065
- Add play and speed to cli options by @David-bfg in https://github.com/coqui-ai/TTS/pull/3027
- Fix doc dataset by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3070
- fix readme by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3071
New Contributors
- @meryemsakin made their first contribution in https://github.com/coqui-ai/TTS/pull/2783
- @wonkothesanest made their first contribution in https://github.com/coqui-ai/TTS/pull/2587
- @Subash-Lamichhane made their first contribution in https://github.com/coqui-ai/TTS/pull/3066
- @David-bfg made their first contribution in https://github.com/coqui-ai/TTS/pull/3027
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.8...v0.17.9
- Python
Published by erogol over 2 years ago
tts - v0.17.7
What's Changed
- Upgrade and Optimize TTS Code in extractttsspectrogram.ipynb by @anupammaurya6767 in https://github.com/coqui-ai/TTS/pull/3012
- None is not able to be read for "XTTS", fixes crash if its set to None. by @OPPEYRADY in https://github.com/coqui-ai/TTS/pull/3009
- Streaming inference for XTTS 🚀 by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3035
New Contributors
- @anupammaurya6767 made their first contribution in https://github.com/coqui-ai/TTS/pull/3012
- @OPPEYRADY made their first contribution in https://github.com/coqui-ai/TTS/pull/3009
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.6...v0.17.7
- Python
Published by erogol over 2 years ago
tts - v0.17.6
What's Changed
- Duplicate code removal by @akx in https://github.com/coqui-ai/TTS/pull/3003
- Loosen dependency pins by @akx in https://github.com/coqui-ai/TTS/pull/3001
- Remove unnecessary black exclude config by @akx in https://github.com/coqui-ai/TTS/pull/2999
- Ensure `tts` CLI tool readme and usage is in sync by @akx in https://github.com/coqui-ai/TTS/pull/2993
- Adding Belarusian TTS model by @erogol in https://github.com/coqui-ai/TTS/pull/2922
- Tortoise inference fix and fix zoo unit tests by @Edresson in https://github.com/coqui-ai/TTS/pull/3010
New Contributors
- @akx made their first contribution in https://github.com/coqui-ai/TTS/pull/3003
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.5...v0.17.6
- Python
Published by erogol over 2 years ago
tts - v0.17.5
What's Changed
- Fix fsspec requirement by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2970
- Add coqui blog post by @osanseviero in https://github.com/coqui-ai/TTS/pull/2949
- fix: xtts not taking into account device flag by @loupzeur in https://github.com/coqui-ai/TTS/pull/2951
- fix package versions by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2990
New Contributors
- @osanseviero made their first contribution in https://github.com/coqui-ai/TTS/pull/2949
- @loupzeur made their first contribution in https://github.com/coqui-ai/TTS/pull/2951
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.4...v0.17.5
- Python
Published by erogol over 2 years ago
tts - 👑v0.17.0
What's Changed
- 👑XTTS implementation by @erogol in https://github.com/coqui-ai/TTS/pull/2939
- Fix requests exception handling in manage.py by @Cohee1207 in https://github.com/coqui-ai/TTS/pull/2912
- Fixed spectrogram checking on librosa 0.10.x by @T145 in https://github.com/coqui-ai/TTS/pull/2899
- Add CML-TTS dataset YourTTS training recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/2934
New Contributors
- @Cohee1207 made their first contribution in https://github.com/coqui-ai/TTS/pull/2912
- @T145 made their first contribution in https://github.com/coqui-ai/TTS/pull/2899
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.6...v0.17.0
- Python
Published by erogol over 2 years ago
tts - v0.16.6
What's Changed
- Update README with new device API by @jaketae in https://github.com/coqui-ai/TTS/pull/2876
- Add device flag to TTS CLI by @jaketae in https://github.com/coqui-ai/TTS/pull/2875
- [WIP] Add phonemizer for Belarusian language by @alex73 in https://github.com/coqui-ai/TTS/pull/2856
- Updated scipy version by @Exponefrv1 in https://github.com/coqui-ai/TTS/pull/2914
- Update docs by @erogol in https://github.com/coqui-ai/TTS/pull/2919
New Contributors
- @Exponefrv1 made their first contribution in https://github.com/coqui-ai/TTS/pull/2914
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.5...v0.16.6
- Python
Published by erogol over 2 years ago
tts - v0.16.4
What's Changed
- Add customizable data home path by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2871
- Add device support in TTS and Synthesizer by @jaketae in https://github.com/coqui-ai/TTS/pull/2855
New Contributors
- @jaketae made their first contribution in https://github.com/coqui-ai/TTS/pull/2855
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.3...v0.16.4
- Python
Published by erogol over 2 years ago
tts - v0.16.3
What's Changed
- Update Studio API for XTTS by @erogol in https://github.com/coqui-ai/TTS/pull/2861
- Denote human voices in README.md by @michaelnew in https://github.com/coqui-ai/TTS/pull/2851
New Contributors
- @michaelnew made their first contribution in https://github.com/coqui-ai/TTS/pull/2851
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.2...v0.16.3
- Python
Published by erogol over 2 years ago
tts - v0.16.2
What's Changed
- add post functionality to /api/tts by @ChaseCares in https://github.com/coqui-ai/TTS/pull/2836
- Add fairseq onnx support and strict configuration, fixes some onnx errors by @SystemPanic in https://github.com/coqui-ai/TTS/pull/2831
- Fix phoneme coverage notebook imports by @erogol in https://github.com/coqui-ai/TTS/pull/2845
- Handle missing JA phonemizer by @erogol in https://github.com/coqui-ai/TTS/pull/2843
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.1...v0.16.2
- Python
Published by erogol over 2 years ago
tts - v0.16.1
What's Changed
- Adds multi-language support for VITS onnx, fixes onnx exporting and inference errors by @SystemPanic in https://github.com/coqui-ai/TTS/pull/2816
- Recipe for Belarusian TTS by @alex73 in https://github.com/coqui-ai/TTS/pull/2756
- Delightful TTS VCTK recipe fixes by @AWAS666 in https://github.com/coqui-ai/TTS/pull/2808
- Add kwargs to ignore extra arguments w/o error by @erogol in https://github.com/coqui-ai/TTS/pull/2822
- Fix DelightfulTTS by @erogol in https://github.com/coqui-ai/TTS/pull/2823
New Contributors
- @SystemPanic made their first contribution in https://github.com/coqui-ai/TTS/pull/2816
- @AWAS666 made their first contribution in https://github.com/coqui-ai/TTS/pull/2808
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.0...v0.16.1
- Python
Published by erogol over 2 years ago
tts - v0.16.0
What's Changed
- Fix #2749 by @erogol in https://github.com/coqui-ai/TTS/pull/2750
- Fix share model page URL by @alex73 in https://github.com/coqui-ai/TTS/pull/2757
- Make Japanese-specific dependencies optional by @polm in https://github.com/coqui-ai/TTS/pull/2776
- API tests by @erogol in https://github.com/coqui-ai/TTS/pull/2790
- Add Delightful-TTS model by @loganhart420 in https://github.com/coqui-ai/TTS/pull/2095
- Fix Tortoise load by @erogol in https://github.com/coqui-ai/TTS/pull/2791
New Contributors
- @alex73 made their first contribution in https://github.com/coqui-ai/TTS/pull/2757
- @polm made their first contribution in https://github.com/coqui-ai/TTS/pull/2776
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.6...v0.16.0
- Python
Published by erogol over 2 years ago
tts - 🛠️v0.15.6
What's Changed
- fix loading of model and vocoder configs by @ChaseCares in https://github.com/coqui-ai/TTS/pull/2698
- Update compute_embeddings.py by @46319943 in https://github.com/coqui-ai/TTS/pull/2668
- delete meaningless print() by @ZhouGongZaiShi in https://github.com/coqui-ai/TTS/pull/2662
- fixed small spelling mistakes finetuning.md by @Woutervdvelde in https://github.com/coqui-ai/TTS/pull/2551
- Resolve conflicts by @erogol in https://github.com/coqui-ai/TTS/pull/2741
- Export multispeaker onnx by @erogol in https://github.com/coqui-ai/TTS/pull/2743
- Fix #2745 by @erogol in https://github.com/coqui-ai/TTS/pull/2748
New Contributors
- @ChaseCares made their first contribution in https://github.com/coqui-ai/TTS/pull/2698
- @46319943 made their first contribution in https://github.com/coqui-ai/TTS/pull/2668
- @ZhouGongZaiShi made their first contribution in https://github.com/coqui-ai/TTS/pull/2662
- @Woutervdvelde made their first contribution in https://github.com/coqui-ai/TTS/pull/2551
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.5...v0.15.6
- Python
Published by erogol over 2 years ago
tts - 🛠️ v0.15.5
What's Changed
- Update docs and credits by @erogol in https://github.com/coqui-ai/TTS/pull/2733
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.4...v0.15.5
- Python
Published by erogol over 2 years ago
tts - 😄 v0.15.0
What's Changed
- Update stochastic_duration_predictor.py by @mengting7tw in https://github.com/coqui-ai/TTS/pull/2663
- Fix Tortoise load by @erogol in https://github.com/coqui-ai/TTS/pull/2697
- Inference API for 🐶Bark by @erogol in https://github.com/coqui-ai/TTS/pull/2685
- Drop Python 3.7 and 3.8 and stage Python 3.11 by @erogol in https://github.com/coqui-ai/TTS/pull/2700
Running 🐶Bark
```python
from TTS.tts.configs.bark_config import BarkConfig
from TTS.tts.models.bark import Bark

text = "Hello, my name is Manmay , how are you?"

config = BarkConfig()
model = Bark.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="path/to/model/dir/", eval=True)

# with random speaker
output_dict = model.synthesize(text, config, speaker_id="random", voice_dirs=None)

# cloning a speaker.
# It assumes that you have a speaker file in `bark_voices/speaker_n/speaker.wav` or `bark_voices/speaker_n/speaker.npz`
output_dict = model.synthesize(text, config, speaker_id="ljspeech", voice_dirs="bark_voices/")
```
Using 🐸TTS API:
```python
from TTS.api import TTS

# Load the model to GPU
# Bark is really slow on CPU, so we recommend using GPU.
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)

# Cloning a new speaker
# This expects to find an mp3 or wav file like `bark_voices/new_speaker/speaker.wav`
# It computes the cloning values and stores them in `bark_voices/new_speaker/speaker.npz`
tts.tts_to_file(
    text="Hello, my name is Manmay , how are you?",
    file_path="output.wav",
    voice_dir="bark_voices/",
    speaker="ljspeech",
)

# When you run it again, it uses the stored values to generate the voice.
tts.tts_to_file(
    text="Hello, my name is Manmay , how are you?",
    file_path="output.wav",
    voice_dir="bark_voices/",
    speaker="ljspeech",
)

# random speaker
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)
tts.tts_to_file("hello world", file_path="out.wav")
```
Using 🐸TTS Command line:
```console
# cloning the `ljspeech` voice
tts --model_name tts_models/multilingual/multi-dataset/bark \
    --text "This is an example." \
    --out_path "output.wav" \
    --voice_dir bark_voices/ \
    --speaker_idx "ljspeech" \
    --progress_bar True

# Random voice generation
tts --model_name tts_models/multilingual/multi-dataset/bark \
    --text "This is an example." \
    --out_path "output.wav" \
    --progress_bar True
```
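Voice cloning with Bark depends on the layout of the voice directory, so it can help to check that layout before running synthesis. The helper below is a rough sketch, not part of the 🐸TTS API: its name is hypothetical, and the `speaker.mp3` file name and the lookup order are assumptions based on the comments in the examples above (a cached `speaker.npz` is reused when present, otherwise a reference recording is expected).

```python
from pathlib import Path
from typing import Optional


def find_bark_speaker_file(voice_dir: str, speaker: str) -> Optional[Path]:
    """Return the file Bark cloning would presumably rely on for `speaker`.

    Looks in <voice_dir>/<speaker>/ for a cached speaker.npz first, then for a
    reference recording. Returns None when nothing usable is found.
    The speaker.mp3 name is an assumption; the notes only show speaker.wav.
    """
    speaker_dir = Path(voice_dir) / speaker
    for name in ("speaker.npz", "speaker.wav", "speaker.mp3"):
        candidate = speaker_dir / name
        if candidate.is_file():
            return candidate
    return None
```

For example, `find_bark_speaker_file("bark_voices/", "ljspeech")` returning `None` would suggest that the reference recording still needs to be placed in `bark_voices/ljspeech/`.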
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.14.3...v0.15.0
- Python
Published by erogol over 2 years ago
tts - ⛈️ v0.14.2
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.14.1...v0.14.2
- Python
Published by erogol over 2 years ago
tts - 🚗 v0.14.1
What's Changed
- Fetch all built-in speakers from API by @reuben in https://github.com/coqui-ai/TTS/pull/2626
- fix typo by @vodiylik in https://github.com/coqui-ai/TTS/pull/2647
- Port Fairseq TTS models by @erogol in https://github.com/coqui-ai/TTS/pull/2628
New Contributors
- @vodiylik made their first contribution in https://github.com/coqui-ai/TTS/pull/2647
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.14.0...v0.14.1
Example text to speech using Fairseq models in ~1100 languages 🤯.
For these models, use the following name format: `tts_models/<lang-iso_code>/fairseq/vits`.
You can find the list of language ISO codes here and learn about the Fairseq models here.
```python
from TTS.api import TTS

api = TTS(model_name="tts_models/eng/fairseq/vits", gpu=True)
api.tts_to_file("This is a test.", file_path="output.wav")

# TTS with on the fly voice conversion
api = TTS("tts_models/deu/fairseq/vits")
api.tts_with_vc_to_file(
    "Wie sage ich auf Italienisch, dass ich dich liebe?",
    speaker_wav="target/speaker.wav",
    file_path="output.wav",
)
```
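Since the Fairseq model names follow a fixed pattern, a small helper can build them from a language code. This is only a convenience sketch: the function name is hypothetical, and it does not check whether a model actually exists for the given code.

```python
def fairseq_model_name(lang_iso_code: str) -> str:
    """Build a name of the form tts_models/<lang-iso_code>/fairseq/vits.

    Only rejects obviously malformed input; it does not verify that the
    code appears in the list of supported languages.
    """
    code = lang_iso_code.strip()
    if not code or "/" in code:
        raise ValueError(f"invalid language ISO code: {lang_iso_code!r}")
    return f"tts_models/{code}/fairseq/vits"
```

The result can be passed as `model_name` when constructing `TTS`, as in the examples above.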
- Python
Published by erogol over 2 years ago
tts - v0.14.0
What's Changed
- Typos and minor fixes by @prakharpbuf in https://github.com/coqui-ai/TTS/pull/2508
- Add FR and ES gruut languages as requirement to avoid inference issues by @Edresson in https://github.com/coqui-ai/TTS/pull/2572
- Lighter docker image by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2600
- Use default_factory for audio parameter by @v4hn in https://github.com/coqui-ai/TTS/pull/2576
- Update README.md by @HighnessAtharva in https://github.com/coqui-ai/TTS/pull/2577
- Add Jenny model by @erogol in https://github.com/coqui-ai/TTS/pull/2603
- Warn when lang is not avail by @erogol in https://github.com/coqui-ai/TTS/pull/2460
- Update VAD for silence trimming. by @erogol in https://github.com/coqui-ai/TTS/pull/2604
- Tortoise TTS inference by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/2547
- Draft ONNX export for VITS by @erogol in https://github.com/coqui-ai/TTS/pull/2563
New Contributors
- @prakharpbuf made their first contribution in https://github.com/coqui-ai/TTS/pull/2508
- @v4hn made their first contribution in https://github.com/coqui-ai/TTS/pull/2576
- @HighnessAtharva made their first contribution in https://github.com/coqui-ai/TTS/pull/2577
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.13.3...v0.14.0
- Python
Published by erogol almost 3 years ago
tts - v0.14.0_models
Jenny VITS model trained by 👑@noml4u
```bash
tts --model_name tts_models/en/jenny/jenny --text "This is a test. This is also a test."
```
- Python
Published by erogol almost 3 years ago
tts - v0.13.3_models
Single speaker Bangla Male/Female models
These are single-speaker VITS models with a 22050 Hz sampling rate.
By 👑 @mobassir94
Original repo: https://github.com/mobassir94/comprehensive-bangla-tts
Male Model
```shell
tts --model_name tts_models/bn/custom/vits-male --text "এটি ডেমো করার উদ্দেশ্যে একটি ডেমো"
```
```python
from TTS.api import TTS

tts = TTS(model_name="tts_models/bn/custom/vits-male")
tts.tts_to_file(text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো", file_path="output.wav")

# TTS with voice conversion to a reference speaker in `target_speaker.wav`
tts.tts_with_vc_to_file(
    text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো",
    speaker_wav="target_speaker.wav",
    file_path="output.wav",
)
```
Female Model
```shell
tts --model_name tts_models/bn/custom/vits-female --text "এটি ডেমো করার উদ্দেশ্যে একটি ডেমো"
```
```python
from TTS.api import TTS

tts = TTS(model_name="tts_models/bn/custom/vits-female")
tts.tts_to_file(text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো", file_path="output.wav")

# TTS with voice conversion to a reference speaker in `target_speaker.wav`
tts.tts_with_vc_to_file(
    text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো",
    speaker_wav="target_speaker.wav",
    file_path="output.wav",
)
```
- Python
Published by erogol almost 3 years ago
tts - 🌈v0.13.2
What's Changed
- 🐸Studio models by `tts` by @erogol in https://github.com/coqui-ai/TTS/pull/2515
- Update VAD by @erogol in https://github.com/coqui-ai/TTS/pull/2509
- 🌈 v0.13.2 by @erogol in https://github.com/coqui-ai/TTS/pull/2519
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.13.1...v0.13.2
- Python
Published by erogol almost 3 years ago
tts - v0.13.1
What's Changed
- Update Librosa Version To V0.10.0 by @MattyB95 in https://github.com/coqui-ai/TTS/pull/2480
- Api voice conversion by @erogol in https://github.com/coqui-ai/TTS/pull/2495
- ✨ v0.13.1 by @erogol in https://github.com/coqui-ai/TTS/pull/2499
New Contributors
- @MattyB95 made their first contribution in https://github.com/coqui-ai/TTS/pull/2480
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.13.0...v0.13.1
- Python
Published by erogol almost 3 years ago
tts - v0.13.0
What's Changed
- vits.py training fixed due to return_complex by @iamkhalidbashir in https://github.com/coqui-ai/TTS/pull/2418
- Update numba version by @erogol in https://github.com/coqui-ai/TTS/pull/2435
- Implement FreeVC by @erogol in https://github.com/coqui-ai/TTS/pull/2451
- [minor] hifigan_generator.py typo by @p0p4k in https://github.com/coqui-ai/TTS/pull/2462
- [minor] batch["speaker_ids"] getting set two times by @p0p4k in https://github.com/coqui-ai/TTS/pull/2470
- fix typo by @BenoitWang in https://github.com/coqui-ai/TTS/pull/2475
- Fixes typo in README.md example code by @TCNOco in https://github.com/coqui-ai/TTS/pull/2478
- 🐸 Coqui Studio API integration by @erogol in https://github.com/coqui-ai/TTS/pull/2484
New Contributors
- @BenoitWang made their first contribution in https://github.com/coqui-ai/TTS/pull/2475
- @TCNOco made their first contribution in https://github.com/coqui-ai/TTS/pull/2478
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.12.0...v0.13.0
- Python
Published by erogol almost 3 years ago
tts - v0.12.0
What's Changed
- numpy version for py310 by @p0p4k in https://github.com/coqui-ai/TTS/pull/2316
- Fix Speaker Consistency Loss (SCL) by @Edresson in https://github.com/coqui-ai/TTS/pull/2364
- OverFlow with test sentences by @thennal10 in https://github.com/coqui-ai/TTS/pull/2253
- Basic Mary-TTS API compatibility by @fquirin in https://github.com/coqui-ai/TTS/pull/2352
- add energy by default to Fastspeech2 config by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/2326
- Remove doc bot by @erogol in https://github.com/coqui-ai/TTS/pull/2399
- Update docs by @erogol in https://github.com/coqui-ai/TTS/pull/2389
- v0.12.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2390
New Contributors
- @thennal10 made their first contribution in https://github.com/coqui-ai/TTS/pull/2253
- @fquirin made their first contribution in https://github.com/coqui-ai/TTS/pull/2352
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.11.1...v0.12.0
- Python
Published by erogol almost 3 years ago
tts - v0.11.1
What's Changed
- Add pre-trained NeuralHMM model by @erogol in https://github.com/coqui-ai/TTS/pull/2314
- v0.11.1 by @erogol in https://github.com/coqui-ai/TTS/pull/2339
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.11.0...v0.11.1
- Python
Published by erogol about 3 years ago
tts - v0.11.0
What's Changed
- v0.9.0 by @erogol in https://github.com/coqui-ai/TTS/pull/1942
- 🚀 v0.10.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2205
- v0.10.1 by @erogol in https://github.com/coqui-ai/TTS/pull/2242
- Fastspeech2 by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/2073
- Cache speaker encoder model by @erogol in https://github.com/coqui-ai/TTS/pull/2284
- Adding neural HMM TTS Model by @shivammehta25 in https://github.com/coqui-ai/TTS/pull/2272
- Add Catalan text cleaners for Catalan support by @GerrySant in https://github.com/coqui-ai/TTS/pull/2295
- Use packaging.version for version comparisons by @mweinelt in https://github.com/coqui-ai/TTS/pull/2310
- Fix tts-server for multi-lingual models by @marius851000 in https://github.com/coqui-ai/TTS/pull/2257
- API from model path by @erogol in https://github.com/coqui-ai/TTS/pull/2303
- v0.11.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2277
- v0.11.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2328
- Bump up to v0.11.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2329
- v0.11.0 (#2329) by @erogol in https://github.com/coqui-ai/TTS/pull/2337
New Contributors
- @GerrySant made their first contribution in https://github.com/coqui-ai/TTS/pull/2295
- @mweinelt made their first contribution in https://github.com/coqui-ai/TTS/pull/2310
- @marius851000 made their first contribution in https://github.com/coqui-ai/TTS/pull/2257
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.10.2...v0.11.0
- Python
Published by erogol about 3 years ago
tts - v0.11.0_models
- NeuralHMM trained on LJSpeech by 👑 @shivammehta25
- Python
Published by erogol about 3 years ago
tts - v0.10.2
What's Changed
- Multilingual tokenizer by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2229
- Fixed bug related to yourtts speaker embeddings issue by @iamkhalidbashir in https://github.com/coqui-ai/TTS/pull/2234
- Update the Trainer requirement version for a compatible one by @Edresson in https://github.com/coqui-ai/TTS/pull/2276
New Contributors
- @iamkhalidbashir made their first contribution in https://github.com/coqui-ai/TTS/pull/2234
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.10.1...v0.10.2
- Python
Published by erogol about 3 years ago
tts - v0.10.1
What's Changed
- fixed tutorial 2 incompatibility with new dev by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/2161
- Fix capacitron test when cuda is enabled by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2189
- Fix VITS multi-speaker voice conversion inference by @Edresson in https://github.com/coqui-ai/TTS/pull/2187
- Handle espeak 1.48.15 by @erogol in https://github.com/coqui-ai/TTS/pull/2203
- Python API implementation by @erogol in https://github.com/coqui-ai/TTS/pull/2195
- Update README by @erogol in https://github.com/coqui-ai/TTS/pull/2204
- Update formatters.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/2194
- Adding OverFlow by @shivammehta25 in https://github.com/coqui-ai/TTS/pull/2183
- Add YourTTS VCTK recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/2198
- Add Original YourTTS vocabulary on YourTTS recipe for full transfer learning by @Edresson in https://github.com/coqui-ai/TTS/pull/2206
- Adding pre-trained Overflow model by @erogol in https://github.com/coqui-ai/TTS/pull/2211
- Fixup overflow by @erogol in https://github.com/coqui-ai/TTS/pull/2218
- Add Ukrainian LADA (female) voice by @egorsmkv in https://github.com/coqui-ai/TTS/pull/2226
New Contributors
- @shivammehta25 made their first contribution in https://github.com/coqui-ai/TTS/pull/2183
- @egorsmkv made their first contribution in https://github.com/coqui-ai/TTS/pull/2226
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.9.0...v0.10.1
- Python
Published by erogol about 3 years ago
tts - v0.10.1_models
This release includes 3 new models
- Multilingual YourTTS updated version based on the recent recipe.
- Catalan multi-speaker VITS model by 👑@gullabi
console
tts --model_name tts_models/ca/custom/vits --text "Ei, com estàs avui? Us desitjo unes bones festes." --speaker_idx d0cd44fcdae652efb0dd428cd1b8f1911e6eb2ca3469a1f2d6f9faf97a9d05e30f28387dfb81bfb4c97eba64187a0c047c85bf06998ccaec58781f3982626bb6
Note: Speaker names are quite long 😄 for this model
- Persian single-speaker female GlowTTS model by 👑@karim23657 (without compatible vocoder)
console
tts --model_name tts_models/fa/custom/glow-tts --text "سلام امروز چطوری؟ تعطیلات خوشی را برای شما آرزو می کنم."
- Python
Published by erogol about 3 years ago
tts - v0.10.0
What's Changed
- v0.9.0 by @erogol in https://github.com/coqui-ai/TTS/pull/1942
- fixed tutorial 2 incompatibility with new dev by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/2161
- Fix capacitron test when cuda is enabled by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2189
- Fix VITS multi-speaker voice conversion inference by @Edresson in https://github.com/coqui-ai/TTS/pull/2187
- Handle espeak 1.48.15 by @erogol in https://github.com/coqui-ai/TTS/pull/2203
- Python API implementation by @erogol in https://github.com/coqui-ai/TTS/pull/2195
- Update README by @erogol in https://github.com/coqui-ai/TTS/pull/2204
- Update formatters.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/2194
- Adding OverFlow by @shivammehta25 in https://github.com/coqui-ai/TTS/pull/2183
- Add YourTTS VCTK recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/2198
- Add Original YourTTS vocabulary on YourTTS recipe for full transfer learning by @Edresson in https://github.com/coqui-ai/TTS/pull/2206
- Adding pre-trained Overflow model by @erogol in https://github.com/coqui-ai/TTS/pull/2211
- Fixup overflow by @erogol in https://github.com/coqui-ai/TTS/pull/2218
- 🚀 v0.10.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2205
New Contributors
- @shivammehta25 made their first contribution in https://github.com/coqui-ai/TTS/pull/2183
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.9.0...v0.10.0
- Python
Published by erogol about 3 years ago
tts - v0.10.0_models
- Overflow model trained on LJSpeech dataset using pretrained HifiGAN vocoder vocoder_models/en/ljspeech/hifigan_v2.
- Python
Published by erogol about 3 years ago
tts - v0.9.0
New models
- Added 25 new models covering 25 different EU languages from 👑https://github.com/NeonGeckoCom/neon-tts-plugin-coqui
What's Changed
- Trick to Upsampling to High sampling rates using VITS model by @Edresson in https://github.com/coqui-ai/TTS/pull/1456
- Update Coqpit requirement by @Edresson in https://github.com/coqui-ai/TTS/pull/1539
- Missing f prefix on f-strings fix by @code-review-doctor in https://github.com/coqui-ai/TTS/pull/1532
- tiny improvement in data_path resolvement by @taras-sereda in https://github.com/coqui-ai/TTS/pull/1567
- Fix VITS upsampling asserts by @Edresson in https://github.com/coqui-ai/TTS/pull/1550
- Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in https://github.com/coqui-ai/TTS/pull/1560
- 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in https://github.com/coqui-ai/TTS/pull/1565
- Update CI tests by @erogol in https://github.com/coqui-ai/TTS/pull/1572
- Build and publish CPU only Docker image by @erogol in https://github.com/coqui-ai/TTS/pull/1573
- Add an assert for the upsampling trick by @erogol in https://github.com/coqui-ai/TTS/pull/1538
- Add audio length sampler balancer by @Edresson in https://github.com/coqui-ai/TTS/pull/1561
- Change the VITS upsampling interpolation trick to linear by @Edresson in https://github.com/coqui-ai/TTS/pull/1564
- Capacitron by @a-froghyar in https://github.com/coqui-ai/TTS/pull/977
- Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in https://github.com/coqui-ai/TTS/pull/1587
- Training recipes for thorsten dataset by @noranraskin in https://github.com/coqui-ai/TTS/pull/1020
- fix invalid json by @s3781009 in https://github.com/coqui-ai/TTS/pull/1599
- Use fsspec and torch for embedding file IO by @erogol in https://github.com/coqui-ai/TTS/pull/1581
- Adding TTS Tutorials by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/1584
- Internal formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1629
- Update training_a_model.md by @klotlabs in https://github.com/coqui-ai/TTS/pull/1620
- Add synpaflex formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1616
- added support for model_info in CLI by @p0p4k in https://github.com/coqui-ai/TTS/pull/1623
- Add Thorsten VITS model by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1675
- Checkpoint bug fix by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1641
- docs : Adding in the arguments for CLI by @camillem in https://github.com/coqui-ai/TTS/pull/1469
- Fix Publish CI by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1597
- Fix tokenizer for punc only by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1717
- Add durations as aux input for VITS by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1694
- feat: updated capacitron recipes and lr fix by @a-froghyar in https://github.com/coqui-ai/TTS/pull/1718
- Implement VitsAudioConfig by @erogol in https://github.com/coqui-ai/TTS/pull/1556
- Fix aux tests by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1753
- Fix for FloorDiv Function Warning by @iprovalo in https://github.com/coqui-ai/TTS/pull/1760
- Update download_vctk.sh by @mengting7tw in https://github.com/coqui-ai/TTS/pull/1739
- Update decoder.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/1792
- Update requirements.txt for python 3.10 support by @p0p4k in https://github.com/coqui-ai/TTS/pull/1791
- Update README.md by @yuripourre in https://github.com/coqui-ai/TTS/pull/1776
- Fix & update WaveRNN vocoder model by @vanIvan in https://github.com/coqui-ai/TTS/pull/1749
- Update requirements.txt; inflect==5.6 by @p0p4k in https://github.com/coqui-ai/TTS/pull/1809
- Update README.md; download progress bar in CLI. by @p0p4k in https://github.com/coqui-ai/TTS/pull/1797
- Update wavenet.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/1796
- Adjust default to be able to process longer sentences by @lkiesow in https://github.com/coqui-ai/TTS/pull/1835
- Fix language flags generated by espeak-ng phonemizer by @Lokhozt in https://github.com/coqui-ai/TTS/pull/1801
- fix get_random_embeddings --> get_random_embedding by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1726
- Introduce numpy and torch transforms by @erogol in https://github.com/coqui-ai/TTS/pull/1705
- Implement bucketed weighted sampling for VITS by @erogol in https://github.com/coqui-ai/TTS/pull/1871
- capacitron_layers multi speaker bug fix by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1664
- updates to dataset analysis notebooks by @jreus in https://github.com/coqui-ai/TTS/pull/1853
- Fix BCE loss issue by @erogol in https://github.com/coqui-ai/TTS/pull/1872
- Remove deprecated files by @erogol in https://github.com/coqui-ai/TTS/pull/1873
- Handle when no batch sampler by @erogol in https://github.com/coqui-ai/TTS/pull/1882
- Fix tune wavegrad by @geth-network in https://github.com/coqui-ai/TTS/pull/1844
- Add new DE Thorsten models by @erogol in https://github.com/coqui-ai/TTS/pull/1898
- Add speaker encoder recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/1912
- Add capacitron v2 model by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1768
- Fixes a race condition with multiple simultaneous get requests. by @KyuubiYoru in https://github.com/coqui-ai/TTS/pull/1807
- Fix find unique phonemes script by @Edresson in https://github.com/coqui-ai/TTS/pull/1928
- Add YourTTS and SC-GlowTTS on available models by @Edresson in https://github.com/coqui-ai/TTS/pull/1933
- Korean Phonemizer by @harmlessman in https://github.com/coqui-ai/TTS/pull/1822
- Add espeak support for Chinese by @happylittlecat2333 in https://github.com/coqui-ai/TTS/pull/1905
- Replace pyworld by pyin by @Edresson in https://github.com/coqui-ai/TTS/pull/1946
- d-vector handling by @erogol in https://github.com/coqui-ai/TTS/pull/1945
- Fixups by @erogol in https://github.com/coqui-ai/TTS/pull/1967
- Fix VC by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1971
- Update readme by @erogol in https://github.com/coqui-ai/TTS/pull/1978
- Add metafile arg to compute embeddings script by @erogol in https://github.com/coqui-ai/TTS/pull/1977
- Fix dataset handling with the new embedding file keys by @Edresson in https://github.com/coqui-ai/TTS/pull/1991
- Fix colliding dataset cache file names by @Edresson in https://github.com/coqui-ai/TTS/pull/1994
- Write non-speech files in a TXT by @erogol in https://github.com/coqui-ai/TTS/pull/2048
- Minor bug fixes on VITS/YourTTS and inference by @Edresson in https://github.com/coqui-ai/TTS/pull/2054
- Check num of columns in coqui format by @erogol in https://github.com/coqui-ai/TTS/pull/2066
- Remove / prefix from the relative path by @erogol in https://github.com/coqui-ai/TTS/pull/2065
- Update Tutorial2trainyourfirstTTSmodel.ipynb by @CeadeS in https://github.com/coqui-ai/TTS/pull/2079
- Update forward_tts.md by @mrshu in https://github.com/coqui-ai/TTS/pull/2019
- Use "formatter" key in the datasets json array by @humada05 in https://github.com/coqui-ai/TTS/pull/2114
- capacitron training fixes by @victor-shepardson in https://github.com/coqui-ai/TTS/pull/2086
- mailabs formatter: back/forward slash in file path fix by @freezerain in https://github.com/coqui-ai/TTS/pull/1938
- Add Discord server badge by @erogol in https://github.com/coqui-ai/TTS/pull/2136
- Remove langs except en and de by @erogol in https://github.com/coqui-ai/TTS/pull/2135
- Cache fsspec downloaded files by @erogol in https://github.com/coqui-ai/TTS/pull/2132
- Update dep caching in actions by @erogol in https://github.com/coqui-ai/TTS/pull/2138
- Update README.md by @eltociear in https://github.com/coqui-ai/TTS/pull/2146
- Makes docker images lighter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2149
- Doc update docker by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2153
- Add neon models by @loganhart420 in https://github.com/coqui-ai/TTS/pull/2140
- Fix documentation by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2154
New Contributors
- @code-review-doctor made their first contribution in https://github.com/coqui-ai/TTS/pull/1532
- @taras-sereda made their first contribution in https://github.com/coqui-ai/TTS/pull/1567
- @ribeiromiranda made their first contribution in https://github.com/coqui-ai/TTS/pull/1587
- @s3781009 made their first contribution in https://github.com/coqui-ai/TTS/pull/1599
- @Aya-AlJafari made their first contribution in https://github.com/coqui-ai/TTS/pull/1584
- @klotlabs made their first contribution in https://github.com/coqui-ai/TTS/pull/1620
- @p0p4k made their first contribution in https://github.com/coqui-ai/TTS/pull/1623
- @manmay-nakhashi made their first contribution in https://github.com/coqui-ai/TTS/pull/1641
- @camillem made their first contribution in https://github.com/coqui-ai/TTS/pull/1469
- @iprovalo made their first contribution in https://github.com/coqui-ai/TTS/pull/1760
- @mengting7tw made their first contribution in https://github.com/coqui-ai/TTS/pull/1739
- @yuripourre made their first contribution in https://github.com/coqui-ai/TTS/pull/1776
- @vanIvan made their first contribution in https://github.com/coqui-ai/TTS/pull/1749
- @lkiesow made their first contribution in https://github.com/coqui-ai/TTS/pull/1835
- @Lokhozt made their first contribution in https://github.com/coqui-ai/TTS/pull/1801
- @jreus made their first contribution in https://github.com/coqui-ai/TTS/pull/1853
- @geth-network made their first contribution in https://github.com/coqui-ai/TTS/pull/1844
- @KyuubiYoru made their first contribution in https://github.com/coqui-ai/TTS/pull/1807
- @harmlessman made their first contribution in https://github.com/coqui-ai/TTS/pull/1822
- @happylittlecat2333 made their first contribution in https://github.com/coqui-ai/TTS/pull/1905
- @CeadeS made their first contribution in https://github.com/coqui-ai/TTS/pull/2079
- @mrshu made their first contribution in https://github.com/coqui-ai/TTS/pull/2019
- @humada05 made their first contribution in https://github.com/coqui-ai/TTS/pull/2114
- @victor-shepardson made their first contribution in https://github.com/coqui-ai/TTS/pull/2086
- @freezerain made their first contribution in https://github.com/coqui-ai/TTS/pull/1938
- @eltociear made their first contribution in https://github.com/coqui-ai/TTS/pull/2146
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.6.2...v0.9.0
- Python
Published by erogol over 3 years ago
tts - v0.8.0
What's Changed
- Trick to Upsampling to High sampling rates using VITS model by @Edresson in https://github.com/coqui-ai/TTS/pull/1456
- Update Coqpit requirement by @Edresson in https://github.com/coqui-ai/TTS/pull/1539
- Missing f prefix on f-strings fix by @code-review-doctor in https://github.com/coqui-ai/TTS/pull/1532
- tiny improvement in data_path resolvement by @taras-sereda in https://github.com/coqui-ai/TTS/pull/1567
- Fix VITS upsampling asserts by @Edresson in https://github.com/coqui-ai/TTS/pull/1550
- Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in https://github.com/coqui-ai/TTS/pull/1560
- 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in https://github.com/coqui-ai/TTS/pull/1565
- Update CI tests by @erogol in https://github.com/coqui-ai/TTS/pull/1572
- Build and publish CPU only Docker image by @erogol in https://github.com/coqui-ai/TTS/pull/1573
- Add an assert for the upsampling trick by @erogol in https://github.com/coqui-ai/TTS/pull/1538
- Add audio length sampler balancer by @Edresson in https://github.com/coqui-ai/TTS/pull/1561
- Change the VITS upsampling interpolation trick to linear by @Edresson in https://github.com/coqui-ai/TTS/pull/1564
- Capacitron by @a-froghyar in https://github.com/coqui-ai/TTS/pull/977
- Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in https://github.com/coqui-ai/TTS/pull/1587
- Training recipes for thorsten dataset by @noranraskin in https://github.com/coqui-ai/TTS/pull/1020
- fix invalid json by @s3781009 in https://github.com/coqui-ai/TTS/pull/1599
- Use fsspec and torch for embedding file IO by @erogol in https://github.com/coqui-ai/TTS/pull/1581
- Adding TTS Tutorials by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/1584
- Internal formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1629
- Update training_a_model.md by @klotlabs in https://github.com/coqui-ai/TTS/pull/1620
- Add synpaflex formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1616
- added support for model_info in CLI by @p0p4k in https://github.com/coqui-ai/TTS/pull/1623
- Add Thorsten VITS model by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1675
- Checkpoint bug fix by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1641
- docs : Adding in the arguments for CLI by @camillem in https://github.com/coqui-ai/TTS/pull/1469
- Fix Publish CI by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1597
- Fix tokenizer for punc only by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1717
- Add durations as aux input for VITS by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1694
- feat: updated capacitron recipes and lr fix by @a-froghyar in https://github.com/coqui-ai/TTS/pull/1718
- Implement VitsAudioConfig by @erogol in https://github.com/coqui-ai/TTS/pull/1556
- Fix aux tests by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1753
- Fix for FloorDiv Function Warning by @iprovalo in https://github.com/coqui-ai/TTS/pull/1760
- Update download_vctk.sh by @mengting7tw in https://github.com/coqui-ai/TTS/pull/1739
- Update decoder.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/1792
- Update requirements.txt for python 3.10 support by @p0p4k in https://github.com/coqui-ai/TTS/pull/1791
- Update README.md by @yuripourre in https://github.com/coqui-ai/TTS/pull/1776
- Fix & update WaveRNN vocoder model by @vanIvan in https://github.com/coqui-ai/TTS/pull/1749
- Update requirements.txt; inflect==5.6 by @p0p4k in https://github.com/coqui-ai/TTS/pull/1809
- Update README.md; download progress bar in CLI. by @p0p4k in https://github.com/coqui-ai/TTS/pull/1797
- Update wavenet.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/1796
- Adjust default to be able to process longer sentences by @lkiesow in https://github.com/coqui-ai/TTS/pull/1835
- Fix language flags generated by espeak-ng phonemizer by @Lokhozt in https://github.com/coqui-ai/TTS/pull/1801
- fix get_random_embeddings --> get_random_embedding by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1726
- Introduce numpy and torch transforms by @erogol in https://github.com/coqui-ai/TTS/pull/1705
- Implement bucketed weighted sampling for VITS by @erogol in https://github.com/coqui-ai/TTS/pull/1871
- capacitron_layers multi speaker bug fix by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1664
- updates to dataset analysis notebooks by @jreus in https://github.com/coqui-ai/TTS/pull/1853
- Fix BCE loss issue by @erogol in https://github.com/coqui-ai/TTS/pull/1872
- Remove deprecated files by @erogol in https://github.com/coqui-ai/TTS/pull/1873
- Handle when no batch sampler by @erogol in https://github.com/coqui-ai/TTS/pull/1882
- Fix tune wavegrad by @geth-network in https://github.com/coqui-ai/TTS/pull/1844
- Add new DE Thorsten models by @erogol in https://github.com/coqui-ai/TTS/pull/1898
New Contributors
- @code-review-doctor made their first contribution in https://github.com/coqui-ai/TTS/pull/1532
- @taras-sereda made their first contribution in https://github.com/coqui-ai/TTS/pull/1567
- @ribeiromiranda made their first contribution in https://github.com/coqui-ai/TTS/pull/1587
- @s3781009 made their first contribution in https://github.com/coqui-ai/TTS/pull/1599
- @Aya-AlJafari made their first contribution in https://github.com/coqui-ai/TTS/pull/1584
- @klotlabs made their first contribution in https://github.com/coqui-ai/TTS/pull/1620
- @p0p4k made their first contribution in https://github.com/coqui-ai/TTS/pull/1623
- @manmay-nakhashi made their first contribution in https://github.com/coqui-ai/TTS/pull/1641
- @camillem made their first contribution in https://github.com/coqui-ai/TTS/pull/1469
- @iprovalo made their first contribution in https://github.com/coqui-ai/TTS/pull/1760
- @mengting7tw made their first contribution in https://github.com/coqui-ai/TTS/pull/1739
- @yuripourre made their first contribution in https://github.com/coqui-ai/TTS/pull/1776
- @vanIvan made their first contribution in https://github.com/coqui-ai/TTS/pull/1749
- @lkiesow made their first contribution in https://github.com/coqui-ai/TTS/pull/1835
- @Lokhozt made their first contribution in https://github.com/coqui-ai/TTS/pull/1801
- @jreus made their first contribution in https://github.com/coqui-ai/TTS/pull/1853
- @geth-network made their first contribution in https://github.com/coqui-ai/TTS/pull/1844
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.6.2...v0.8.0
- Python
Published by erogol over 3 years ago
tts - v0.8.0 models
✨New models ✨ from @thorstenMueller
✨New models ✨ from @NeonGeckoCom
- Python
Published by erogol over 3 years ago
tts - v0.7.1 models
Add capacitron V2 model to TTS zoo. It's more stable and just as expressive!
- Python
Published by WeberJulian over 3 years ago
tts - v0.7.0
What's Changed
- Trick to Upsampling to High sampling rates using VITS model by @Edresson in https://github.com/coqui-ai/TTS/pull/1456
- Update Coqpit requirement by @Edresson in https://github.com/coqui-ai/TTS/pull/1539
- Missing f prefix on f-strings fix by @code-review-doctor in https://github.com/coqui-ai/TTS/pull/1532
- tiny improvement in data_path resolvement by @taras-sereda in https://github.com/coqui-ai/TTS/pull/1567
- Fix VITS upsampling asserts by @Edresson in https://github.com/coqui-ai/TTS/pull/1550
- Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in https://github.com/coqui-ai/TTS/pull/1560
- 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in https://github.com/coqui-ai/TTS/pull/1565
- Update CI tests by @erogol in https://github.com/coqui-ai/TTS/pull/1572
- Build and publish CPU only Docker image by @erogol in https://github.com/coqui-ai/TTS/pull/1573
- Add an assert for the upsampling trick by @erogol in https://github.com/coqui-ai/TTS/pull/1538
- Add audio length sampler balancer by @Edresson in https://github.com/coqui-ai/TTS/pull/1561
- Change the VITS upsampling interpolation trick to linear by @Edresson in https://github.com/coqui-ai/TTS/pull/1564
- Capacitron by @a-froghyar in https://github.com/coqui-ai/TTS/pull/977
- Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in https://github.com/coqui-ai/TTS/pull/1587
- Training recipes for thorsten dataset by @noranraskin in https://github.com/coqui-ai/TTS/pull/1020
- fix invalid json by @s3781009 in https://github.com/coqui-ai/TTS/pull/1599
- Use fsspec and torch for embedding file IO by @erogol in https://github.com/coqui-ai/TTS/pull/1581
- Adding TTS Tutorials by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/1584
- Internal formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1629
- Update training_a_model.md by @klotlabs in https://github.com/coqui-ai/TTS/pull/1620
- Add synpaflex formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1616
- added support for model_info in CLI by @p0p4k in https://github.com/coqui-ai/TTS/pull/1623
- v0.7.0 by @erogol in https://github.com/coqui-ai/TTS/pull/1537
New Contributors
- @code-review-doctor made their first contribution in https://github.com/coqui-ai/TTS/pull/1532
- @taras-sereda made their first contribution in https://github.com/coqui-ai/TTS/pull/1567
- @ribeiromiranda made their first contribution in https://github.com/coqui-ai/TTS/pull/1587
- @s3781009 made their first contribution in https://github.com/coqui-ai/TTS/pull/1599
- @Aya-AlJafari made their first contribution in https://github.com/coqui-ai/TTS/pull/1584
- @klotlabs made their first contribution in https://github.com/coqui-ai/TTS/pull/1620
- @p0p4k made their first contribution in https://github.com/coqui-ai/TTS/pull/1623
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.6.2...v0.7.0
- Python
Published by erogol over 3 years ago
tts - Speaker Encoder Model
Speaker encoder model and config file.
- Python
Published by Aya-AlJafari over 3 years ago
tts - v0.7.0 models
- English Capacitron-T2 models + corresponding HiFiGAN V2 vocoder. Implemented and trained by 👑@a-froghyar
- German VITS model trained on Thorsten Dataset by 👑@thorstenMueller 👑@domcross
- Python
Published by WeberJulian almost 4 years ago
tts - v0.6.2
What's Changed
- Fix multilingual recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/1354
- Fix recipes as to the recent API changes. by @erogol in https://github.com/coqui-ai/TTS/pull/1367
- Add docsqa to docs website by @nomagick in https://github.com/coqui-ai/TTS/pull/1363
- REBASED: Add support for the speaker encoder training using torch spectrograms by @Edresson in https://github.com/coqui-ai/TTS/pull/1348
- Add alphas to control language and speaker balancer by @Edresson in https://github.com/coqui-ai/TTS/pull/1216
- Add Voice conversion inference support by @Edresson in https://github.com/coqui-ai/TTS/pull/1337
- Update issue template by @erogol in https://github.com/coqui-ai/TTS/pull/1370
- Open bible dataset formatter by @Edresson in https://github.com/coqui-ai/TTS/pull/1365
- REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support by @Edresson in https://github.com/coqui-ai/TTS/pull/1349
- Fix typo workflow text by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1403
- Add CITATION.cff by @erogol in https://github.com/coqui-ai/TTS/pull/1404
- Fix default phonemizer for ja and zh by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1399
- Make Style by @erogol in https://github.com/coqui-ai/TTS/pull/1405
- Fix #1380 by @erogol in https://github.com/coqui-ai/TTS/pull/1409
- Hinge Gruut version to 2.2.3 by @erogol in https://github.com/coqui-ai/TTS/pull/1419
- Update CheckSpectrograms notebook by @erogol in https://github.com/coqui-ai/TTS/pull/1418
- Fix #1423 by @Edresson in https://github.com/coqui-ai/TTS/pull/1424
- Update model file extension by @erogol in https://github.com/coqui-ai/TTS/pull/1422
- Fix model manager by @erogol in https://github.com/coqui-ai/TTS/pull/1436
- Add formatting tests by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1437
- Update base model wrt 👟 by @erogol in https://github.com/coqui-ai/TTS/pull/1406
- Replace webrtcvad by silero-vad by @Edresson in https://github.com/coqui-ai/TTS/pull/1431
- Bug fix in freeze encoder by @Edresson in https://github.com/coqui-ai/TTS/pull/1391
- Enforce phonemizer definition for synthesis by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1441
- Fix G2P backend of the released models by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1461
- Add EmbeddingManager and BaseIDManager by @Edresson in https://github.com/coqui-ai/TTS/pull/1374
- Update requirements coqui_trainer -> trainer by @erogol in https://github.com/coqui-ai/TTS/pull/1478
- Update CONTRIBUTING.md, fix header by @Jackiexiao in https://github.com/coqui-ai/TTS/pull/1463
- Add African models by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1511
- Print Model's license when downloading by @erogol in https://github.com/coqui-ai/TTS/pull/1512
- Improve docsQA default questions by @nomagick in https://github.com/coqui-ai/TTS/pull/1411
- patch print license by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1514
- v0.6.2 by @erogol in https://github.com/coqui-ai/TTS/pull/1353
New Contributors
- @nomagick made their first contribution in https://github.com/coqui-ai/TTS/pull/1363
- @Jackiexiao made their first contribution in https://github.com/coqui-ai/TTS/pull/1463
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.6.1...v0.6.2
- Python
Published by erogol almost 4 years ago
tts - v0.6.2 models
This release adds 6 new VITS models for languages of the openbible dataset.
- ewe
- hausa
- lingala
- yoruba
- asante-twi
- akuapem-twi
Original work (audio and text) by Biblica available for free at www.biblica.com and open.bible.
- Python
Published by WeberJulian almost 4 years ago
tts - v0.6.1 models
What's Changed
- Renamed all checkpoints from model_file.pth.tar to model_file.pth
- Tested and fixed the "phonemizer" backend key in config for all tts models
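If you keep local copies of checkpoints that still use the old extension, the rename above can be mirrored with a small script. This is only a sketch: the function name rename_checkpoints and the dry_run flag are made up for illustration, and nothing here is part of 🐸TTS itself.

```python
from pathlib import Path


def rename_checkpoints(root, dry_run=True):
    """Rename every *.pth.tar file under root to *.pth.

    Hypothetical helper for local files; not a 🐸TTS function.
    Returns the list of (old_path, new_path) pairs that were (or would be) renamed.
    """
    renamed = []
    for old in sorted(Path(root).rglob("*.pth.tar")):
        # Drop only the trailing ".tar", keeping the rest of the file name.
        new = old.with_name(old.name[: -len(".tar")])
        if new.exists():
            continue  # never overwrite an existing checkpoint
        if not dry_run:
            old.rename(new)
        renamed.append((old, new))
    return renamed
```

Running it first with dry_run=True lists what would change without touching any file.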
For best performance, you should use the commit version attached to each model
- Python
Published by WeberJulian almost 4 years ago
tts - v0.6.0
What's Changed
Tokenizer API
Tokenizer API is defined by the TTSTokenizer class. It is intended to provide all the text processing functionalities to a tts model. New tokenizers can also be added by subclassing the TTSTokenizer class.
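To illustrate the subclassing idea, here is a self-contained sketch of a text-to-token-ID pipeline and a subclass that adds one processing step. ToyTokenizer and DigitSpellingTokenizer are hypothetical names and do not reproduce the actual TTSTokenizer interface; check the class in the library for the real method signatures.

```python
class ToyTokenizer:
    """Hypothetical text -> token-ID pipeline; not the 🐸TTS TTSTokenizer class."""

    def __init__(self, symbols):
        self.symbol_to_id = {s: i for i, s in enumerate(symbols)}
        self.id_to_symbol = {i: s for s, i in self.symbol_to_id.items()}

    def clean(self, text):
        # Basic normalization: lowercase and collapse whitespace.
        return " ".join(text.lower().split())

    def text_to_ids(self, text):
        # Characters missing from the vocabulary are skipped.
        cleaned = self.clean(text)
        return [self.symbol_to_id[ch] for ch in cleaned if ch in self.symbol_to_id]

    def ids_to_text(self, ids):
        return "".join(self.id_to_symbol[i] for i in ids)


class DigitSpellingTokenizer(ToyTokenizer):
    """Subclass that spells out digits before the base cleaning step."""

    DIGITS = {"0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
              "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine"}

    def clean(self, text):
        for digit, word in self.DIGITS.items():
            text = text.replace(digit, f" {word} ")
        return super().clean(text)


tokenizer = DigitSpellingTokenizer("abcdefghijklmnopqrstuvwxyz ")
ids = tokenizer.text_to_ids("Track 2")
```

Only clean is overridden, so the ID mapping stays shared between the base class and the subclass.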
Phonemizer API
Phonemizer API is defined by the BasePhonemizer class and implemented by the ESpeak and Gruut wrappers, ZHCH, JPJA phonemizers. New phonemizers can be added by implementing the BasePhonemizer class.
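The pattern of adding a phonemizer by implementing a base class can be sketched as follows. ToyBasePhonemizer and LookupPhonemizer are invented for this example, the two-word lexicon is made up, and the real BasePhonemizer may require different methods.

```python
from abc import ABC, abstractmethod


class ToyBasePhonemizer(ABC):
    """Hypothetical base class; not the 🐸TTS BasePhonemizer interface."""

    def __init__(self, language):
        self.language = language

    @abstractmethod
    def _phonemize(self, text, separator):
        """Convert already-stripped text into a phoneme string."""

    def phonemize(self, text, separator="|"):
        # Shared pre-processing lives in the base class.
        return self._phonemize(text.strip(), separator)


class LookupPhonemizer(ToyBasePhonemizer):
    """Toy backend that looks words up in a tiny hand-written lexicon."""

    LEXICON = {
        "hello": ["h", "ə", "l", "oʊ"],
        "world": ["w", "ɜː", "l", "d"],
    }

    def _phonemize(self, text, separator):
        words = text.lower().split()
        # Unknown words fall back to their spelling, one symbol per letter.
        return " ".join(separator.join(self.LEXICON.get(w, list(w))) for w in words)


phonemes = LookupPhonemizer("en").phonemize("  Hello world ")
```

A real backend would replace the lexicon lookup with a call into a G2P tool, while callers keep using phonemize.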
BaseCharacters
The BaseCharacters class provides an API to define the model vocabulary and the dictionary that maps characters to token IDs and back. Two pre-defined classes inherit from BaseCharacters: IPAPhonemes defines the IPA phoneme character set for models using phonemes, and Graphemes defines the grapheme set for models using raw characters.
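The character-to-ID mapping described here can be pictured with a minimal vocabulary class. ToyCharacters is a hypothetical stand-in written for this note and assumes each character appears only once in the input set; it is not the BaseCharacters implementation.

```python
class ToyCharacters:
    """Hypothetical vocabulary class; not the 🐸TTS BaseCharacters API."""

    def __init__(self, characters, pad="<PAD>"):
        # Reserve ID 0 for padding, then assign IDs in the order given.
        self.vocab = [pad] + list(characters)
        self._char_to_id = {ch: i for i, ch in enumerate(self.vocab)}
        self._id_to_char = dict(enumerate(self.vocab))

    def char_to_id(self, char):
        return self._char_to_id[char]

    def id_to_char(self, idx):
        return self._id_to_char[idx]

    def encode(self, text):
        # Characters missing from the vocabulary are skipped.
        return [self._char_to_id[ch] for ch in text if ch in self._char_to_id]

    def decode(self, ids):
        return "".join(self._id_to_char[i] for i in ids)


graphemes = ToyCharacters("abcdefghijklmnopqrstuvwxyz ")
ids = graphemes.encode("hello world")
```

A phoneme-based model would use the same mapping with an IPA symbol set in place of the alphabet.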
Punctuations class
The Punctuations class strips out punctuation and restores it when needed.
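One way to picture strip-and-restore is to split the text on punctuation runs, keep the runs aside, and interleave them again afterwards. The sketch below uses hypothetical function names and an assumed punctuation set; it does not mirror the actual Punctuations class.

```python
import re

PUNCS = "!?.,;:"  # assumed punctuation set for this example
_PUNC_RE = re.compile(rf"(\s*[{re.escape(PUNCS)}]+\s*)")


def strip_with_marks(text):
    """Split text into punctuation-free chunks and the punctuation runs between them."""
    parts = _PUNC_RE.split(text)
    chunks = parts[0::2]  # text between punctuation runs
    marks = parts[1::2]   # punctuation runs, including surrounding spaces
    return chunks, marks


def restore(chunks, marks):
    """Interleave chunks and punctuation runs back into a single string."""
    out = []
    for i, chunk in enumerate(chunks):
        out.append(chunk)
        if i < len(marks):
            out.append(marks[i])
    return "".join(out)


chunks, marks = strip_with_marks("Hello, world! How are you?")
```

The chunks can be processed separately (for example, phonemized) before restore puts the punctuation back in place.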
Language specific text normalization routines under TTS.tts.utils.text
Under TTS.tts.utils.text there are folders for each language to accommodate the text normalization routines that are designed for the language.
👟Trainer
We separated the trainer into a new repo, 👟Trainer. It is a general-purpose model trainer for PyTorch with certain design choices in mind.
- Support for different experiment tracking dashboards like ClearML, Tensorboard, MLFlow, and W&B.
- Flexible to train any kind of DL model.
- Simple code base and easily expandable.
- Easy to debug.
It is currently a very early-stage, monolithic library. Feel free to share your ✨feedback✨ and ✨contribute✨.
VITS implementation update
With this version of the VITS model, we get rid of some of the issues that affect model performance. It also illustrates well how you could adapt any open-source model implementation to 🐸TTS and 👟Trainer without even knowing the rest of the 🐸TTS library.
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.5.0...v0.6.0
New Models
GlowTTS + HifiGAN Turkish by 👑Fatih Akademi
console
$ tts --model_name tts_models/tr/common-voice/glow-tts --text "Bu bizim için oluşturulmuş bir örnek sevgili dostum."
VITS and GlowTTS Italian by 👑@nicolalandro using MAI Italian male and female subsets.
Female VITS model
console
$ tts --model_name tts_models/it/mai_female/vits --text "Questo è un esempio per noi, mio caro amico."
Male VITS model
console
$ tts --model_name tts_models/it/mai_male/vits --text "Questo è un esempio per noi, mio caro amico."
- Python
Published by erogol almost 4 years ago
tts - v0.5.0
What's Changed
- Fix some setup papercuts by @reuben in https://github.com/coqui-ai/TTS/pull/1022
- Add additional datasets by @loganhart420 in https://github.com/coqui-ai/TTS/pull/1021
- Add UK vocoder models by @erogol in https://github.com/coqui-ai/TTS/pull/1031
- Add multilingual models support by @erogol in https://github.com/coqui-ai/TTS/pull/1007
- Implement YourTTS by @WeberJulian and @Edresson
- Fixes before YourTTS merge by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1044
- Fix language assignment by @erogol in https://github.com/coqui-ai/TTS/pull/1047
- Fix if else statement by @erogol in https://github.com/coqui-ai/TTS/pull/1050
- Fix train_tts.py and uncomment code by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1051
- v0.5.0 by @erogol in https://github.com/coqui-ai/TTS/pull/1027
New Contributors
- @reuben made their first contribution in https://github.com/coqui-ai/TTS/pull/1022
- @loganhart420 made their first contribution in https://github.com/coqui-ai/TTS/pull/1021
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.4.2...v0.5.0
- Python
Published by erogol about 4 years ago
tts - v0.5.0_models
Model releases accompanying v0.5.0
- Python
Published by erogol about 4 years ago
tts - v0.4.2
What's Changed
- Model zoo tests by @erogol in https://github.com/coqui-ai/TTS/pull/900
- v0.4.2 by @erogol in https://github.com/coqui-ai/TTS/pull/901
- Optional silence trimming during inference and find_endpoint() fix by @george-roussos in https://github.com/coqui-ai/TTS/pull/898
- Update gruut to version 2.0 by @synesthesiam in https://github.com/coqui-ai/TTS/pull/882
- Documentation corrections for finetuning and data preparation by @gullabi in https://github.com/coqui-ai/TTS/pull/931
- server: fix compatibility with tts_models/en/ljspeech/fast_pitch by @Mic92 in https://github.com/coqui-ai/TTS/pull/893
- v0.4.2 by @erogol in https://github.com/coqui-ai/TTS/pull/914
New Contributors
- @george-roussos made their first contribution in https://github.com/coqui-ai/TTS/pull/898
- @gullabi made their first contribution in https://github.com/coqui-ai/TTS/pull/931
Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.4.1...v0.4.2
- Python
Published by erogol about 4 years ago
tts - v0.4.0
🐸 v0.4.0
- Update multi-speaker training API.
- VCTK recipes for all the TTS models.
- Documentation for multi-speaker training.
- Pre-trained Ukrainian GlowTTS model from 👑 https://github.com/robinhad/ukrainian-tts
- Pre-trained FastPitch VCTK model
- Dataset downloaders for LJSpeech and VCTK under TTS.utils.downloaders
- Documentation reformatting.
Trainer V2 and compat. updates in model implementations.
This update makes Trainer V2 responsible only for training a model. Everything else is excluded from the trainer and must be done either in the model or before calling the trainer.
Try out new models
- Pre-trained FastPitch VCTK model
bash
tts --model_name tts_models/en/vctk/fast_pitch --text "This is my sample text to voice." --speaker_idx VCTK_p229
- Pre-trained Ukrainian GlowTTS model from 👑 https://github.com/robinhad/ukrainian-tts
bash
tts --model_name tts_models/uk/mai/glow-tts --text "Це зразок тексту, щоб спробувати нашу модель."
- Python
Published by erogol over 4 years ago
tts - v0.3.0
🐸 v0.3.0
New ForwardTTS implementation.
This version implements a new ForwardTTS interface that can be configured as any feed-forward TTS model that uses a duration predictor at inference time. Currently, we provide 3 pre-configured models and plan to implement one more.
- SpeedySpeech
- FastSpeech
- FastPitch
- FastSpeech 2 (TODO)
Through this API, any model can be trained in two ways: either using pre-computed durations from a pre-trained Tacotron model or using an alignment network to learn durations from the dataset. The alignment network is only used at training and discarded at inference. You can choose the mode by setting the use_aligner field in the configuration.
This new API will help us design a more efficient inference runtime for all these models using ONNX-like runtime optimizers.
Old FastPitch and SpeedySpeech implementations are deprecated for the sake of this new implementation.
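The two training modes described above can be pictured with a small sketch. Only the `use_aligner` field comes from these notes; the `ForwardTTSSettings` dataclass, the `pick_duration_source` helper, and the returned labels are hypothetical names used for illustration and are not the 🐸TTS API.

```python
from dataclasses import dataclass


@dataclass
class ForwardTTSSettings:
    """Hypothetical stand-in for a model config; only `use_aligner` is from the notes."""
    use_aligner: bool = True


def pick_duration_source(settings: ForwardTTSSettings, training: bool) -> str:
    """Return which component supplies durations in this sketch."""
    if not training:
        # At inference the duration predictor is used; the aligner is discarded.
        return "duration_predictor"
    if settings.use_aligner:
        # Durations are learned from the dataset by the alignment network.
        return "alignment_network"
    # Otherwise durations are pre-computed with a pre-trained Tacotron model.
    return "precomputed_tacotron_durations"


print(pick_duration_source(ForwardTTSSettings(use_aligner=False), training=True))
```

In this reading, the flag only changes where training-time durations come from; the inference path is the same in both modes.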
Fine-Tuning Documentation
This version introduces documentation for model fine-tuning. You can see it under https://tts.readthedocs.io/ when this is merged.
New Model Releases
- English Speedy Speech model on LJSpeech
Try out:
bash
tts --text "This is a sample text for my model to speak." --model_name tts_models/en/ljspeech/speedy-speech
- Fine-tuned UnivNet Vocoder
Try out:
bash
tts --text "This is how it is." --model_name tts_models/en/ljspeech/tacotron2-DDC_ph
- Python
Published by erogol over 4 years ago
tts - v0.2.2
🐸 v0.2.2
FastPitch model with an Aligner Network is implemented with other changes accompanying it.
- Alignment Network: https://arxiv.org/abs/2108.10447
- Fast Pitch Model: https://arxiv.org/abs/2006.06873
Thanks to 👑 @kaiidams for his Japanese g2p update.
Try FastPitch model:
bash
tts --model_name tts_models/en/ljspeech/fast_pitch --text "This is my sample text to voice."
- Python
Published by erogol over 4 years ago
tts - v0.2.1
🐸 v0.2.1
🐞Bug Fixes
- Fix distributed training and solve compat issues with the Trainer API.
- Fix bugs in the VITS model implementation that caused training instabilities.
- Fix some Abstract Class usage issues in WaveRNN and WaveGrad models.
💾 Code updates
- Use a single gradient scaler for all the optimizers in TrainerAPI. Previously, we used one scaler per optimizer.
🏃♀️Operational Updates
- Update to Pylint 2.10.2
Thanks to 👑 @fijipants for his fixes 🛠️
Thanks to 👑 @agrinh for his flag and discussion in DDP issues
- Python
Published by erogol over 4 years ago
tts - v0.2.0
🐸 v0.2.0
🐞Bug Fixes
- Fix phoneme pre-compute issue.
- Fix multi-speaker setup in Tacotron models.
- Fix small issues in the Trainer regarding multi-optimizer training.
💾 Code updates
- W&B integration for model logging and experiment tracking. (👑 @AyushExel)
  The code uses Tensorboard by default. For W&B, you need to set the log_dashboard option in the config and define project_name and wandb_entity.
- Use fsspec for model saving/loading (👑 @agrinh)
- Allow models to define their own symbol list with in-class make_symbols()
- Allow choosing after epoch or after step LR scheduler update with scheduler_after_epoch.
- Make converting spectrogram from amplitude to DB optional with do_amp_to_db_linear and do_amp_to_db_mel options.
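As a rough illustration of what scheduler_after_epoch toggles, the toy loop below counts how often a learning-rate scheduler would be stepped under each setting. The `count_scheduler_updates` helper is hypothetical and stands in for a real training loop; only the option name is taken from the notes.

```python
def count_scheduler_updates(num_epochs: int, steps_per_epoch: int,
                            scheduler_after_epoch: bool) -> int:
    """Count how often the LR scheduler would be stepped in a toy training loop."""
    updates = 0
    for _ in range(num_epochs):
        for _ in range(steps_per_epoch):
            # ... forward pass, backward pass, optimizer step would go here ...
            if not scheduler_after_epoch:
                updates += 1  # step the scheduler after every training step
        if scheduler_after_epoch:
            updates += 1  # step the scheduler once per epoch
    return updates


print(count_scheduler_updates(3, 100, scheduler_after_epoch=True))   # 3
print(count_scheduler_updates(3, 100, scheduler_after_epoch=False))  # 300
```

The difference matters when a schedule is defined in terms of steps (for example warm-up) rather than epochs.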
🗒️ Docs updates
- Add GlowTTS and VITS docs.
🤖 Model implementations
- VITS implementation with pre-trained models (https://arxiv.org/abs/2106.06103)
🚀 Model releases
vocoder_models--ja--kokoro--hifigan_v1 (👑 @kaiidams)
HiFiGAN model trained on Kokoro dataset to complement the existing Japanese model.
Try it out:
bash
tts --model_name tts_models/ja/kokoro/tacotron2-DDC --text "こんにちは、今日はいい天気ですか?"
tts_models--en--ljspeech--tacotronDDC_ph
TacotronDDC with phonemes trained on LJSpeech. It is to fix the pronunciation errors caused by the raw text in the released TacotronDDC model.
Try it out:
bash
tts --model_name tts_models/en/ljspeech/tacotronDDC_ph --text "hello, how are you today?"
tts_models--en--ljspeech--vits
VITS model trained on LJSpeech.
Try it out:
bash
tts --model_name tts_models/en/ljspeech/vits --text "hello, how are you today?"
tts_models--en--vctk--vits
VITS model trained on VCTK with multi-speaker support.
Try it out:
bash
tts-server --model_name tts_models/en/vctk/vits
vocoder_models--en--ljspeech--univnet
UnivNet model trained on LJSpeech to complement the TacotronDDC model above.
Try it out:
bash
tts --model_name tts_models/en/ljspeech/tacotronDDC_ph --text "hello, how are you today?"
- Python
Published by erogol over 4 years ago
tts - v0.1.3
🐸 v0.1.3
🐞Bug Fixes
- Fix Tacotron stopnet training
  Models trained after v0.1 had the problem that the stopnet was not trained. It caused models not to generate audio at evaluation and inference time.
- Fix test_run at training. (👑 @WeberJulian)
  In training :frog: TTS would skip the test_run and not generate test audio samples. Now it is fixed :).
- Fix server.py for multi-speaker models.
💾 Code updates
- Refactoring in compute_embeddings.py for efficiency and compatibility with the latest speaker encoder. (👑 @Edresson)
🚀 Model releases
- New Fullband-MelGAN model for Thorsten German dataset. (👑 @thorstenMueller)
Try it:
bash
tts --model_name tts_models/de/thorsten/tacotron2-DCA --text "Was geschehen ist geschehen, es ist geschichte."
- Python
Published by erogol over 4 years ago
tts - v0.1.0
🐸 v0.1.0
In a nutshell, there are a ton of updates in this release. I don't know if we can cover them all here but let's try.
After this release, 🐸 TTS stands on the following architecture.
- Trainer API for training.
- Synthesizer API for inference.
- ModelManager API for managing 🐸TTS model zoo.
- SpeakerManager API for managing speakers in a multi-speaker setting.
- (TBI) Exporter API for exporting models to ONNX, TorchScript, etc.
- (TBI) Data Processing API for making a dataset ready for training.
- Model API for implementing models, compatible with all the other components above.
Updates
💾 Code updates
- Brand new Trainer API
  We unified all the training code in a lightweight but feature complete Trainer API. From now on all the 🐸TTS models will use this new API for training.
  It provides mixed precision (with Nvidia's APEX or torch.amp) and multi-gpu training for all the models.
- Brand new Model API
  Abstract BaseModel and its BaseTTS, BaseVocoder child classes are used as the basis of the 🐸TTS models now. Any model that implements one of these classes works seamlessly with the Trainer and Synthesizer.
- Brand new 🐸TTS recipes.
  We decided to merge the recipes to the main project. Now we host recipes for the LJspeech dataset, covering all the implemented models. So you can pick the model you want, change the parameters, and train your own model easily.
  Thanks to the new Trainer API and 👩✈️Coqpit integration, we could implement these recipes with pure python.
- Updates SpeakerManager API
  TTS.utils SpeakerManager is now the core unit to manage speakers in a multi-speaker model and interface a SpeakerEncoder model with the tts and vocoder models.
- Updated model training mechanics.
  You can now use pure Python to define your model and run the training. It is useful to train models on a Jupyter Notebook or the other python environments.
  We also keep the old mechanics by using TTS/bin/train_tts.py or TTS/bin/train_vocoder.py. You just need to replace the previous training script name with one of these two based on your model.
  bash
  python TTS/bin/train_tacotron.py --config_path config.json
  becomes
  bash
  python TTS/bin/train_tts.py --config_path config.json
- Use 👩✈️Coqpit for managing model class arguments.
  Now all the model arguments are defined in a coqpit class and imported by the model config.
- gruut based character to phoneme conversion. (👑 @synesthesiam)
  As a drop-in replacement for the previous solution that is compatible with the released models. So now all these models are functional again without version nitpicking.
- Set test_sentences in the config rather than providing a txt file.
- Set the maximum number of decoder steps of Tacotron1-2 models in the config.
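The Model API idea above, where training code depends only on an abstract base class, can be sketched as follows. The class names BaseModel and BaseTTS are taken from the notes, but the `train_step` method, the `ToyTTS` model, and the `run_one_step` helper are assumptions made for illustration; the real 🐸TTS classes have a richer interface.

```python
from abc import ABC, abstractmethod


class BaseModel(ABC):
    """Illustrative abstract base; not the actual 🐸TTS class."""

    @abstractmethod
    def train_step(self, batch: list) -> float:
        """Consume one batch and return a loss value."""


class BaseTTS(BaseModel):
    """Illustrative child class that would hold logic shared by tts models."""


class ToyTTS(BaseTTS):
    """Hypothetical model: its 'loss' is just the mean of the batch."""

    def train_step(self, batch: list) -> float:
        return sum(batch) / len(batch)


def run_one_step(model: BaseModel, batch: list) -> float:
    """Trainer-like helper that relies only on the abstract interface."""
    return model.train_step(batch)


print(run_one_step(ToyTTS(), [1.0, 2.0, 3.0]))  # 2.0
```

Because `run_one_step` never names a concrete model, any class that implements the abstract method can be passed in unchanged.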
🏃♀️ Operational Updates
- FINALLY DOCUMENTATION!! https://tts.readthedocs.io
- Enable support for Python 3.9
- Changes for PyTorch 1.9.0
🏅 Model implementations
- Univnet GAN Vocoder: https://arxiv.org/pdf/2106.07889.pdf (👑 @rishikksh20)
🚀 Model releases
We solved the compat issues and re-released some of the models. You can see them in the released binaries section.
You don't need to change anything. If you use v0.1.0, by default, it uses these new models.
- Python
Published by erogol over 4 years ago
tts - v0.0.15
🐸 v0.0.15
🐞Bug Fixes
- [x] Fix tb_logger init for rank > 0 processes in distributed training.
💾 Code updates
- [x] Refactoring and optimization in the speaker encoder module. (:crown: @Edresson )
- [x] Replacing unidecode with anyascii
- [x] Japanese text to phoneme conversion. (:crown: @kaiidams)
- [x] Japanese tts recipe to train Tacotron2-DDC on Kokoro dataset (:crown: @kaiidams)
:walking_woman: Operational Updates
- [x] Start using pylint == 2.8.3
- [x] Reorg tests files.
- [x] Upload to pypi automatically on release.
- [x] Move VERSION file under TTS folder.
🏅 Model implementations
- [x] New Speaker Encoder implementation based on https://arxiv.org/abs/2009.14153 (:crown: @Edresson )
🚀 New Pre-Trained Model Releases
- [x] Japanese Tacotron model (:crown: @kaiidams)
:bulb: All the models below are available by tts or tts-server endpoints on CLI as explained here.
- Python
Published by erogol over 4 years ago
tts - v0.0.14
:frog: v0.0.14
🐞Bug Fixes
- [x] Remove breaking line from Tacotron models. (👑 @a-froghyar)
💾 Code updates
- [x] BREAKING: Coqpit integration for config management and the first 🐸TTS recipe, for LJSpeech. Check #476.
Every model is now tied to a Python class that defines the configuration scheme. It provides a better interface and makes the default values, expected value types, and mandatory fields explicit to the user.
Specific model configs are defined under TTS/tts/configs and TTS/vocoder/configs. TTS/config/shared_configs.py hosts configs that are shared by all the :frog: TTS models. Configs shared by tts models are hosted under TTS/tts/configs/shared_configs.py and those shared by vocoder models are under TTS/vocoder/configs/shared_config.py.
For example, TacotronConfig follows BaseTrainingConfig -> BaseTTSConfig -> TacotronConfig.
- [x] BREAKING: Remove phonemizer support due to License conflict.
This essentially deprecates the support for all the models using phonemes as input. Feel free to suggest in-place options if you are affected by this change.
- [x] Start hosting :woman_cook: recipes under :frog: TTS. The first recipe is for Tacotron2-DDC with LJspeech dataset under TTS/recipes/.
Please check here for more details.
- [x] Add extract_tts_spectrograms.py that supports GlowTTS and Tacotron1-2. (👑 @Edresson)
- [x] Add version.py (👑 @chmodsss)
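The config hierarchy named in these notes (BaseTrainingConfig -> BaseTTSConfig -> TacotronConfig) can be pictured with plain dataclasses. This is only a sketch: it uses the standard library instead of Coqpit, and every field name and default below is a hypothetical placeholder rather than the real 🐸TTS configuration.

```python
from dataclasses import dataclass, fields


@dataclass
class BaseTrainingConfig:
    # Hypothetical fields shared by every model.
    batch_size: int = 32
    epochs: int = 1000


@dataclass
class BaseTTSConfig(BaseTrainingConfig):
    # Hypothetical field shared by tts models only.
    use_phonemes: bool = False


@dataclass
class TacotronConfig(BaseTTSConfig):
    # Hypothetical model-specific field.
    r: int = 2


config = TacotronConfig(batch_size=16)
# Defaults and expected types are discoverable from the class itself.
for f in fields(config):
    print(f.name, f.type.__name__, getattr(config, f.name))
```

Tying each model to a class like this is what lets a user inspect defaults and value types without reading a JSON file.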
- Python
Published by erogol almost 5 years ago
tts - v0.0.13
:frog: v0.0.13
🐞Bug Fixes
💾 Code updates
- SpeakerManager class for handling multi-speaker model management and interfacing speaker.json file.
- Enabling multi-speaker models with tts and tts-server endpoints. (:crown: @kirianguiller )
- Allow choosing a different noise scale for GlowTTS at inference.
- Glow-TTS updates to import SC-Glow Models.
- Fixing windows support (:crown: @WeberJulian )
:walking_woman: Operational Updates
- Refactoring :frog: TTS installation and allow selecting different scopes (all, tf, notebooks) for installation depending on the specific needs.
🏅 Model implementations
🚀 New Pre-Trained Model Releases
- SC-GlowTTS multi-speaker English model from our work https://arxiv.org/abs/2104.05557 (:crown: @Edresson )
- HiFiGAN vocoder finetuned for the above model.
- Tacotron DDC Non-Binary English model using Accenture's Sam dataset.
- HiFiGAN vocoder trained for the models above.
Released Models
:bulb: All the models below are available by tts or tts-server endpoints on CLI as explained here.
Models with ✨️ below are new with this release.
- SC-GlowTTS model is from our latest paper in a collaboration with @Edresson and @mueller91.
- The new non-binary TTS model is trained using the SAM dataset from Accenture Labs. Check out their blog post
| Language | Dataset | Model Name | Model Type | TTS version | Download |
| -- | -- | -- | -- | -- | :-: |
| :sparkles: English (non-binary) | sam (accenture) | Tacotron2-DDC | tts | :smile: v0.0.13 | :floppy_disk: |
| :sparkles: English (multi-speaker) | VCTK | SC-GlowTTS | tts | :smile: v0.0.13 | :floppy_disk: |
| English | LJSpeech | Tacotron-DDC | tts | v0.0.12 | :floppy_disk: |
| German | Thorsten-DE | Tacotron-DCA | tts | v0.0.11 | :floppy_disk: |
| German | Thorsten-DE | Wavegrad | vocoder | v0.0.11 | :floppy_disk: |
| English | LJSpeech | SpeedySpeech | tts | v0.0.10 | :floppy_disk: |
| English | EK1 | Tacotron2 | tts | v0.0.10 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | :floppy_disk: |
| English | LJSpeech | TacotronDCA | tts | v0.0.9 | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | v0.0.9 | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| :sparkles: English | sam (accenture) | HiFiGAN | vocoder | :smile: v0.0.13 | :floppy_disk: |
| :sparkles: English | VCTK | HiFiGAN | vocoder | :smile: v0.0.13 | :floppy_disk: |
| English | LJSpeech | HiFiGAN | vocoder | v0.0.12 | :floppy_disk: |
| English | EK1 | WaveGrad | vocoder | v0.0.10 | :floppy_disk: |
| Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | :floppy_disk: |
Update Jun 7 2021: Ruslan (Russian) model has been removed due to the license conflict.
- Python
Published by erogol almost 5 years ago
tts - v0.0.12
:frog: v0.0.12
🐞Bug Fixes
- [x] fix #419 (This is a crucial bug fix).
- [x] fix #408
💾 Code updates
- [x] Enable logging model config.json on Tensorboard. #418
- [x] Update code style standards and use a Makefile to ease regular tasks. #423
- [x] Enable using Tacotron.prenet.dropout at inference time. This leads to a better quality with some models.
- [x] Update default tts model to LJspeech TacotronDDC.
- [x] Show the real waveform on Tensorboard in GAN vocoder training.
:walking_woman: Operational Updates
🏅 Model implementations
- [x] initial HiFiGAN implementation (:crown: @rishikksh20 @erogol) #422
🚀 New Pre-Trained Model Releases
- [ ] ~~Universal HifiGAN model~~(postponed to the next version for :crown: @Edresson's updated model.)
- [x] LJSpeech, Tacotron2 Double Decoder Consistency v2 model. Check our blog post to learn more about Double Decoder Consistency.
- [x] LJSpeech HifiGAN model.
Released Models
:bulb: All the models below are available by tts end point as explained here.
| Language | Dataset | Model Name | Model Type | TTS version | Download |
| -- | -- | -- | -- | -- | :-: |
| :sparkles: English | LJSpeech | Tacotron-DDC | tts | :smiley: v0.0.12 | :floppy_disk: |
| German | Thorsten-DE | Tacotron-DCA | tts | v0.0.11 | :floppy_disk: |
| German | Thorsten-DE | Wavegrad | vocoder | v0.0.11 | :floppy_disk: |
| English | LJSpeech | SpeedySpeech | tts | v0.0.10 | :floppy_disk: |
| English | EK1 | Tacotron2 | tts | v0.0.10 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | :floppy_disk: |
| English | LJSpeech | TacotronDCA | tts | v0.0.9 | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | v0.0.9 | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| :sparkles: English | LJSpeech | HiFiGAN | vocoder | :smiley: v0.0.12 | :floppy_disk: |
| English | EK1 | WaveGrad | vocoder | v0.0.10 | :floppy_disk: |
| Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | :floppy_disk: |
- Python
Published by erogol almost 5 years ago
tts - v0.0.11
:frog: v0.0.11
🐞Bug Fixes
- [x] Fixed #374. (Thx for reporting @a-froghyar )
💾 Code updates
- [x] /bin/resample.py to resample wavefiles (:crown: @WeberJulian)
- [x] Some updates for Windows compat. (:crown: @GuyPaddock)
- [x] Fixing CheckSpectrogram notebook. (:crown: @GuyPaddock)
- [x] Fix #392
:walking_woman: Operational Updates
🏅 Model implementations
- [x] initial AlignTTS implementation. (https://github.com/coqui-ai/TTS/pull/398)
- [ ] initial HiFiGAN implementation (:crown: @rishikksh20) (postponed to the next release)
🚀 New Pre-Trained Model Releases
- [x] German - Tacotron2-DCA trained with thorsten_dataset. (:crown: @thorstenMueller )
- [x] German - Wavegrad vocoder with thorsten_dataset. (:crown: @thorstenMueller)
Released Models
:bulb: All the models below are available by tts end point as explained here.
| Language | Dataset | Model Name | Model Type | TTS version | Download |
| -- | -- | -- | -- | -- | :-: |
| :sparkles: German | Thorsten-DE | Tacotron-DCA | tts | :smiley: v0.0.11 | :floppy_disk: |
| :sparkles: German | Thorsten-DE | Wavegrad | vocoder | :smiley: v0.0.11 | :floppy_disk: |
| English | LJSpeech | SpeedySpeech | tts | v0.0.10 | :floppy_disk: |
| English | EK1 | Tacotron2 | tts | v0.0.10 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | :floppy_disk: |
| English | LJSpeech | TacotronDCA | tts | v0.0.9 | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | v0.0.9 | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| English | EK1 | WaveGrad | vocoder | v0.0.10 | :floppy_disk: |
| Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | :floppy_disk: |
- Python
Published by erogol almost 5 years ago
tts - v0.0.10
:frog: v0.0.10
🐞Bug Fixes
- [x] Make synthesizer.py save the output audio with the vocoder sampling rate. This is necessary when the sampling rates of the tts and the vocoder models differ and interpolation is applied to the tts model output before running the vocoder. Practically, it fixes the Spanish and French voices generated by tts or tts-server on the terminal.
- [x] Handling utf-8 on Windows. (by @adonispujols)
- [x] Fix loading the last model when --continue_training is used. It was loading the best_model regardless.
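The sampling-rate fix above can be illustrated with a little arithmetic. The sketch assumes the tts output is stretched along the time axis by the ratio of the two rates before vocoding; the `plan_vocoder_input` helper and the example rates are hypothetical and are not taken from synthesizer.py.

```python
def plan_vocoder_input(tts_sample_rate: int, vocoder_sample_rate: int,
                       num_frames: int) -> tuple:
    """Return (scale_factor, resized_frames, output_sample_rate) for this sketch."""
    # The tts output is stretched along time so it matches the vocoder's rate.
    scale_factor = vocoder_sample_rate / tts_sample_rate
    resized_frames = round(num_frames * scale_factor)
    # The audio comes out of the vocoder, so it is saved at the vocoder's rate.
    return scale_factor, resized_frames, vocoder_sample_rate


# Assumed example: a 16 kHz tts model paired with a 24 kHz vocoder.
print(plan_vocoder_input(16000, 24000, num_frames=100))  # (1.5, 150, 24000)
```

Saving the file with the tts rate instead would play the 24 kHz waveform too slowly, which matches the symptom the fix describes for mismatched model pairs.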
💾 Code updates
- [x] Breaking Change: Update default set of characters in symbols.py. This might require you to set your character set in config.json if you want to use this version with models trained with the previous version.
- [x] Chinese backend for text processing (#654 by @kirianguiller)
- [x] Enable torch.hub integration for the released models.
- [x] First github release.
- [x] dep. version fixes. Using numpy > 1.17.5 breaks some tests.
- [x] WaveRNN fix (by @gerazov )
- [x] Big refactoring for the training scripts to share the init part of the code. (by @gerazov)
- [x] Enable ModelManager to download models from Github releases.
- [x] Add a test for compute_statistics.py
- [x] light-touch updates in tts and tts-server entry points. (thanks @thorstenMueller )
- [x] Define default vocoder models for each tts model in .models.json. tts and tts-server entry points use the default vocoder if the user does not specify one.
- [x] find_unique_chars.py to find all the unique characters in a dataset.
- [x] A better way of handling best models through training. (thx @gerazov )
- [x] Pass used characters to the model config.json at the beginning of the training. This prevents later code updates from affecting the trained models.
- [x] Migration to Github Actions for CI.
- [x] Deprecate wheel based use of tts-server for the sake of the new design.
- [x] :frog:
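The default-vocoder behaviour listed above can be sketched as a simple lookup with a user override. The JSON excerpt, the "default_vocoder" key, the example model names, and the `resolve_vocoder` helper are all assumptions made for illustration; the actual layout of .models.json may differ.

```python
import json
from typing import Optional

# Hypothetical excerpt; the entries and the "default_vocoder" key are assumptions.
MODELS_JSON = """
{
  "tts_models": {
    "en": {
      "ljspeech": {
        "glow-tts": {"default_vocoder": "vocoder_models/en/ljspeech/multiband-melgan"}
      }
    }
  }
}
"""


def resolve_vocoder(models: dict, model_name: str,
                    user_vocoder: Optional[str] = None) -> Optional[str]:
    """Prefer the user's vocoder; otherwise fall back to the model's default."""
    if user_vocoder is not None:
        return user_vocoder
    model_type, lang, dataset, model = model_name.split("/")
    return models[model_type][lang][dataset][model].get("default_vocoder")


models = json.loads(MODELS_JSON)
print(resolve_vocoder(models, "tts_models/en/ljspeech/glow-tts"))
```

Keeping the pairing in the model registry means the command line only needs a tts model name in the common case.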
:walking_woman: Operational Updates
- [x] Move released models to Github Releases and deprecate GDrive being the first option.
🏅 Model implementations
- No updates 😓
🚀 New Pre-Trained Model Releases
- [x] English ek1 - Tacotron2 model and WaveGrad vocoder under .models.json. (huge THX!! to @nmstoker)
- [x] Russian Ruslan - Tacotron2-DDC model.
- [x] Dutch model. (huge THX!! to @r-dh )
- [x] Chinese Tacotron2 model. (huge THX!! to @kirianguiller)
- [x] English LJSpeech - SpeedySpeech with WaveNet decoder.
Released Models
:bulb: All the models below are available by tts end point as explained here.
| Language | Dataset | Model Name | Model Type | TTS version | Download |
| -- | -- | -- | -- | -- | :-: |
| English | LJSpeech | SpeedySpeech | tts | :smiley: v0.0.10 | :floppy_disk: |
| English | EK1 | Tacotron2 | tts | :smiley: v0.0.10 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | :smiley: v0.0.10 | :floppy_disk: |
| Chinese | Baker | TacotronDDC-GST | tts | :smiley: v0.0.10 | :floppy_disk: |
| English | LJSpeech | TacotronDCA | tts | v0.0.9 | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | v0.0.9 | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| English | EK1 | WaveGrad | vocoder | :smiley: v0.0.10 | :floppy_disk: |
| Dutch | MAI | ParallelWaveGAN | vocoder | :smiley: v0.0.10 | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | :floppy_disk: |
- Python
Published by erogol almost 5 years ago
tts - v0.0.9
:frog: TTS v0.0.9 - the first release :tada:
This is the first release of :frog:TTS, v0.0.9. :frog:TTS is still an evolving project and any upcoming release might be significantly different and not backward compatible.
In this release, we provide the following models.
| Language | Dataset | Model Name | Model Type | Download |
| -- | -- | -- | -- | -- |
| English | LJSpeech | TacotronDCA | tts | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | :floppy_disk: |
Notes
- Multi-Lang vocoder models are intended for non-English models.
- Vocoder models are independently trained from the tts models with possibly different sampling rates. Therefore, the performance is not optimal.
- All models are trained with phonemes generated by espeak back-end (not espeak-ng).
- This release has been tested under Python 3.6, 3.7, and 3.8. It is strongly suggested to use conda to install the dependencies and set up the environment.
Edit:
(22.03.2021) - Fullband Universal Vocoder is corrected with the right model files. Previously, we released the wrong model with that name.
- Python
Published by erogol almost 5 years ago