Recent Releases of tts

tts - v0.22.0

What's Changed

  • fix: Few typos in Tortoise docs. by @VladCuciureanu in https://github.com/coqui-ai/TTS/pull/3352
  • fix pause problem of Chinese speech by @aaron-lii in https://github.com/coqui-ai/TTS/pull/3351
  • Fix typos by @omahs in https://github.com/coqui-ai/TTS/pull/3368
  • Print message for either commercial license or CPML by @JRMeyer in https://github.com/coqui-ai/TTS/pull/3381
  • Add inference parameters by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3373
  • Training fastspeech2 with External Speaker Embeddings by @freds0 in https://github.com/coqui-ai/TTS/pull/3404
  • fixes a typo by @joelhoward0 in https://github.com/coqui-ai/TTS/pull/3392
  • support multiple GPU training for XTTS by @aaron-lii in https://github.com/coqui-ai/TTS/pull/3391
  • Add studio speakers to open source XTTS! by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3405

New Contributors

  • @VladCuciureanu made their first contribution in https://github.com/coqui-ai/TTS/pull/3352
  • @aaron-lii made their first contribution in https://github.com/coqui-ai/TTS/pull/3351
  • @omahs made their first contribution in https://github.com/coqui-ai/TTS/pull/3368
  • @JRMeyer made their first contribution in https://github.com/coqui-ai/TTS/pull/3381
  • @freds0 made their first contribution in https://github.com/coqui-ai/TTS/pull/3404
  • @joelhoward0 made their first contribution in https://github.com/coqui-ai/TTS/pull/3392

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.21.3...v0.22.0

- Python
Published by erogol about 2 years ago

tts - v0.21.3

What's Changed

  • Add XTTS Fine tuning gradio demo by @Edresson in https://github.com/coqui-ai/TTS/pull/3296

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.21.2...v0.21.3

No-Code XTTS fine-tuning

We created a UI that you can use to fine-tune XTTS with your data. You can run it on Colab, locally, or on a server.

@WeberJulian has also recorded a step-by-step video tutorial.

You can also follow the XTTS docs if you are a read-and-learn type.

- Python
Published by erogol about 2 years ago

tts - v0.21.2

What's Changed

  • Run XTTS models by direct name with versions by @erogol in https://github.com/coqui-ai/TTS/pull/3318
  • fix: correctly strip/restore initial punctuation by @eginhard in https://github.com/coqui-ai/TTS/pull/3336
  • Fix link to installation instructions by @Vuizur in https://github.com/coqui-ai/TTS/pull/3329

New Contributors

  • @Vuizur made their first contribution in https://github.com/coqui-ai/TTS/pull/3329

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.21.1...v0.21.2

This release allows running XTTS models by name with a version tag, so you can use any version you like.

```python
from TTS.api import TTS

# get v2.0.2
tts = TTS(model_name="xtts_v2.0.2", gpu=True)

# get the latest version
tts = TTS(model_name="xtts", gpu=True)

# generate speech by cloning a voice using default settings
tts.tts_to_file(
    text="Here is my sample text.",
    file_path="output.wav",
    speaker_wav=["reference.wav", "reference1.wav"],
    language="en",
)
```
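The naming pattern used in the example above can be captured in a small helper. This is only a sketch: `xtts_model_name` is a hypothetical function, not part of the TTS package, and the `xtts_v<version>` form is an assumption inferred from the versioned name in the example rather than a documented rule.

```python
from typing import Optional


def xtts_model_name(version: Optional[str] = None) -> str:
    """Build an XTTS model name, optionally pinned to a version tag.

    Hypothetical helper: assumes versioned names look like "xtts_v<version>"
    and that the bare name "xtts" selects the latest release.
    """
    if version is None:
        return "xtts"
    # Accept both "2.0.2" and "v2.0.2" as input.
    return f"xtts_v{version.lstrip('v')}"


print(xtts_model_name())         # name for the latest version
print(xtts_model_name("2.0.2"))  # name pinned to a specific version
```

The returned string would then be passed as `model_name` when constructing `TTS`.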

Automatic sentence splitting is now optional, so you can apply any custom logic to the text before passing it to the model. To disable it, set split_sentences=False.

```python
from TTS.api import TTS

# get v2.0.2
tts = TTS(model_name="xtts_v2.0.2", gpu=True)

# generate speech by cloning a voice using default settings
tts.tts_to_file(
    text="Here is my sample text.",
    file_path="output.wav",
    speaker_wav=["reference.wav", "reference1.wav"],
    language="en",
    split_sentences=False,
)
```
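As an illustration of the kind of custom logic this enables, the sketch below splits a script at blank lines instead of at sentence boundaries. The `split_on_blank_lines` function is a hypothetical helper written for this example, and the commented-out synthesis loop assumes a `tts` object created as in the block above.

```python
import re
from typing import List


def split_on_blank_lines(text: str) -> List[str]:
    """Split text into chunks at blank lines, dropping empty chunks."""
    chunks = re.split(r"\n\s*\n", text)
    # Collapse internal whitespace so each chunk is a single clean line.
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


script = "First paragraph, kept as one unit. Even with two sentences.\n\nSecond paragraph."
chunks = split_on_blank_lines(script)
print(chunks)

# Each chunk could then be synthesized on its own, for example:
# for i, chunk in enumerate(chunks):
#     tts.tts_to_file(text=chunk, file_path=f"chunk_{i}.wav",
#                     speaker_wav=["reference.wav"], language="en",
#                     split_sentences=False)
```

Passing split_sentences=False in the loop keeps the model from splitting the chunks a second time.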

- Python
Published by erogol about 2 years ago

tts - v0.21.1

  • Adding a basic Hindi text cleaner. https://github.com/coqui-ai/TTS/commit/32065139e713b3e44aa88e72c4d35012bb888238

- Python
Published by erogol about 2 years ago

tts - v0.21.0

What's Changed

  • Remove duplicate/unused code by @eginhard in https://github.com/coqui-ai/TTS/pull/3243
  • Making the Model Manager's Progress bar statically accessible via the class. by @FlorianEagox in https://github.com/coqui-ai/TTS/pull/3297
  • More informative error for wrong --language argument by @eginhard in https://github.com/coqui-ai/TTS/pull/3294
  • Don't pass quotes to espeak by @eginhard in https://github.com/coqui-ai/TTS/pull/3286
  • Fix tts_with_vc by @eginhard in https://github.com/coqui-ai/TTS/pull/3275
  • Misjudgment of is_multi_lingual When Loading Multilingual Model via model_path by @TITC in https://github.com/coqui-ai/TTS/pull/3273
  • Introducing Development Dockerfile by @Kaszanas in https://github.com/coqui-ai/TTS/pull/3263
  • update deepspeed version by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3281

New Contributors

  • @FlorianEagox made their first contribution in https://github.com/coqui-ai/TTS/pull/3297
  • @TITC made their first contribution in https://github.com/coqui-ai/TTS/pull/3273
  • @Kaszanas made their first contribution in https://github.com/coqui-ai/TTS/pull/3263

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.6...v0.21.0

- Python
Published by erogol about 2 years ago

tts - v0.20.6

What's Changed

  • Remove duplicate AudioProcessor code, fix ExtractTTSpectrogram.ipynb by @eginhard in https://github.com/coqui-ai/TTS/pull/3230
  • Add sentence splitting by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3227
  • Fix zh bug by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3238
  • Update versions by @erogol in https://github.com/coqui-ai/TTS/pull/3248
  • Ensures that only GPT model is in training mode during XTTS GPT training by @Edresson in https://github.com/coqui-ai/TTS/pull/3241
  • Loosen dependencies and make k_diffusion optional by @erogol in https://github.com/coqui-ai/TTS/pull/3249
  • Update XTTS v2.0.2 by @erogol in https://github.com/coqui-ai/TTS/pull/3249

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.5...v0.20.6

- Python
Published by erogol over 2 years ago

tts - v0.20.5

What's Changed

  • Add speed control for inference by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3214
  • Update README.md by @eltociear in https://github.com/coqui-ai/TTS/pull/3215
  • Fix XTTS GPT padding and inference issues by @Edresson in https://github.com/coqui-ai/TTS/pull/3216

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.4...v0.20.5

- Python
Published by erogol over 2 years ago

tts - v0.20.4

What's Changed

  • Update XTTS cloning by @erogol in https://github.com/coqui-ai/TTS/pull/3207
  • fix max generation length for XTTS by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3208

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.3...v0.20.4

- Python
Published by erogol over 2 years ago

tts - v0.20.3

What's Changed

  • XTTS- Torchaudio should use proper backend to load audio by @gorkemgoknar in https://github.com/coqui-ai/TTS/pull/3179
  • PyTorch 2.1 Updates (Weight Norm and TorchAudio I/O) by @MattyB95 in https://github.com/coqui-ai/TTS/pull/3176
  • xtts/tokenizer: merge duplicate implementations of preprocess_text by @akx in https://github.com/coqui-ai/TTS/pull/3170
  • fix(formatters): set missing root_path attribute by @eginhard in https://github.com/coqui-ai/TTS/pull/3182

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.2...v0.20.3

- Python
Published by erogol over 2 years ago

tts - v0.20.2

What's Changed

  • Add char limit warn by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3130
  • Fix coqui api by @erogol in https://github.com/coqui-ai/TTS/pull/3168
  • Fix #3153 by @erogol in https://github.com/coqui-ai/TTS/pull/3169
  • Move FreeVCConfig to TTS.vc.configs (like all other config classes) by @akx in https://github.com/coqui-ai/TTS/pull/3126
  • Fix ModelManager.list_models() by @eginhard in https://github.com/coqui-ai/TTS/pull/3128
  • Fix for exception on streaming on last chunk by @gorkemgoknar in https://github.com/coqui-ai/TTS/pull/3160
  • Add lang code in XTTS doc by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3158
  • Remove v1 doc and tests by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3172

New Contributors

  • @eginhard made their first contribution in https://github.com/coqui-ai/TTS/pull/3128
  • @gorkemgoknar made their first contribution in https://github.com/coqui-ai/TTS/pull/3160

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.1...v0.20.2

- Python
Published by erogol over 2 years ago

tts - v0.20.1

What's Changed

  • Drop diffusion from XTTS by @erogol in https://github.com/coqui-ai/TTS/pull/3150
  • Bug fixes and add support for multiples speaker references on XTTS inference by @Edresson in https://github.com/coqui-ai/TTS/pull/3149
  • Fix XTTS v2.0 training recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/3154

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.20.0...v0.20.1

- Python
Published by erogol over 2 years ago

tts - v0.20.0

What's Changed

  • Run make style & re-enable it in CI by @akx in https://github.com/coqui-ai/TTS/pull/3127
  • XTTS v2.0 by @Edresson in https://github.com/coqui-ai/TTS/pull/3137

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.19.1...v0.20.0

- Python
Published by erogol over 2 years ago

tts - v0.19.1

What's Changed

  • Second round of issue fixing for XTTS v1.1 by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3103
  • fix for issue 3067 by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/3109
  • Bug: self.model_name needed to be initialized. by @vltmedia in https://github.com/coqui-ai/TTS/pull/2983

New Contributors

  • @vltmedia made their first contribution in https://github.com/coqui-ai/TTS/pull/2983

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.19.0...v0.19.1

- Python
Published by erogol over 2 years ago

tts - v0.19.0

What's Changed

  • XTTS v1.1 GPT Trainer by @Edresson in https://github.com/coqui-ai/TTS/pull/3086

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.18.2...v0.19.0

- Python
Published by erogol over 2 years ago

tts - v0.18.2

What's Changed

  • Fix xtts v1.1 by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3096

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.18.1...v0.18.2

- Python
Published by erogol over 2 years ago

tts - v0.18.1

What's Changed

  • Bug fix on XTTS v1.1 inference by @Edresson and @WeberJulian in https://github.com/coqui-ai/TTS/pull/3093

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.18.0...v0.18.1

- Python
Published by Edresson over 2 years ago

tts - v0.18.0

What's Changed

  • XTTS v1.1 by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3089

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.10...v0.18.0

XTTS v1.1

This model is trained on top of XTTS v1, using output masking. We mask the part of the output that is used as the audio prompt while training and don't compute loss for that segment. This helps us to resolve the hallucination issue that V1 experienced.

Changes

  • Add Japanese
  • Resolve the hallucination issue (repeating the audio prompt)
  • Increased expressivity
  • Hash check to control model version
  • Added ne_hifigan, which was trained without denoising; the denoising introduced an EQ and compression profile that might be unwanted for some use cases
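The output masking described above can be pictured with a small, generic sketch. This is not the XTTS training code: `masked_nll` and its inputs are hypothetical names, and the example only shows the general idea of leaving the audio-prompt positions out of the loss.

```python
import math
from typing import List, Sequence


def masked_nll(target_probs: Sequence[float], is_prompt: Sequence[bool]) -> float:
    """Mean negative log-likelihood over the positions that are not masked.

    target_probs[i] is the probability the model assigned to the correct
    output token at position i; is_prompt[i] marks positions that belong to
    the audio prompt and are therefore excluded from the loss.
    """
    losses: List[float] = [
        -math.log(p) for p, masked in zip(target_probs, is_prompt) if not masked
    ]
    if not losses:
        return 0.0  # everything was masked, so nothing contributes
    return sum(losses) / len(losses)


probs = [0.9, 0.8, 0.5, 0.25]
prompt_mask = [True, True, False, False]  # first two positions are the prompt
print(masked_nll(probs, prompt_mask))
```

Because the prompt positions never enter the average, the model gets no training signal that rewards reproducing the prompt.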

- Python
Published by erogol over 2 years ago

tts - v0.17.10

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.9...v0.17.10

- Python
Published by erogol over 2 years ago

tts - v0.17.9

What's Changed

  • fixed bugs in fastpitch tts synthesis by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/3058
  • Update AnalyzeDataset.ipynb by @meryemsakin in https://github.com/coqui-ai/TTS/pull/2783
  • Synthesizer skips over embeddings file if model only has one speaker by @wonkothesanest in https://github.com/coqui-ai/TTS/pull/2587
  • fixed typo of docs\source\implementing_a_new_model.md by @Subash-Lamichhane in https://github.com/coqui-ai/TTS/pull/3066
  • fixed typo of /docs by @Subash-Lamichhane in https://github.com/coqui-ai/TTS/pull/3065
  • Add play and speed to cli options by @David-bfg in https://github.com/coqui-ai/TTS/pull/3027
  • Fix doc dataset by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3070
  • fix readme by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3071

New Contributors

  • @meryemsakin made their first contribution in https://github.com/coqui-ai/TTS/pull/2783
  • @wonkothesanest made their first contribution in https://github.com/coqui-ai/TTS/pull/2587
  • @Subash-Lamichhane made their first contribution in https://github.com/coqui-ai/TTS/pull/3066
  • @David-bfg made their first contribution in https://github.com/coqui-ai/TTS/pull/3027

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.8...v0.17.9

- Python
Published by erogol over 2 years ago

tts - v0.17.8

What's Changed

  • XTTS redownload if needed by @Edresson in https://github.com/coqui-ai/TTS/pull/3038

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.7...v0.17.8

- Python
Published by gorkemgoknar over 2 years ago

tts - v0.17.7

What's Changed

  • Upgrade and Optimize TTS Code in extract_tts_spectrogram.ipynb by @anupammaurya6767 in https://github.com/coqui-ai/TTS/pull/3012
  • None is not able to be read for "XTTS", fixes crash if its set to None. by @OPPEYRADY in https://github.com/coqui-ai/TTS/pull/3009
  • Streaming inference for XTTS 🚀 by @WeberJulian in https://github.com/coqui-ai/TTS/pull/3035

New Contributors

  • @anupammaurya6767 made their first contribution in https://github.com/coqui-ai/TTS/pull/3012
  • @OPPEYRADY made their first contribution in https://github.com/coqui-ai/TTS/pull/3009

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.6...v0.17.7

- Python
Published by erogol over 2 years ago

tts - v0.17.6

What's Changed

  • Duplicate code removal by @akx in https://github.com/coqui-ai/TTS/pull/3003
  • Loosen dependency pins by @akx in https://github.com/coqui-ai/TTS/pull/3001
  • Remove unnecessary black exclude config by @akx in https://github.com/coqui-ai/TTS/pull/2999
  • Ensure tts CLI tool readme and usage is in sync by @akx in https://github.com/coqui-ai/TTS/pull/2993
  • Adding Belarusian TTS model by @erogol in https://github.com/coqui-ai/TTS/pull/2922
  • Tortoise inference fix and fix zoo unit tests by @Edresson in https://github.com/coqui-ai/TTS/pull/3010

New Contributors

  • @akx made their first contribution in https://github.com/coqui-ai/TTS/pull/3003

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.5...v0.17.6

- Python
Published by erogol over 2 years ago

tts - v0.17.5

What's Changed

  • Fix fsspec requirement by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2970
  • Add coqui blog post by @osanseviero in https://github.com/coqui-ai/TTS/pull/2949
  • fix: xtts not taking into account device flag by @loupzeur in https://github.com/coqui-ai/TTS/pull/2951
  • fix package versions by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2990

New Contributors

  • @osanseviero made their first contribution in https://github.com/coqui-ai/TTS/pull/2949
  • @loupzeur made their first contribution in https://github.com/coqui-ai/TTS/pull/2951

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.4...v0.17.5

- Python
Published by erogol over 2 years ago

tts - v0.17.4

Identical to v0.17.3 - retriggered due to failed CI and missing package on PyPI

- Python
Published by reuben over 2 years ago

tts - v0.17.3

You can set COQUI_TOS_AGREED=1 to pass the ToS when using XTTS
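A minimal sketch of setting the variable from Python, assuming it is read when the XTTS model is fetched, so it has to be set before the model is loaded:

```python
import os

# Agree to the terms of service up front so that no interactive prompt is needed.
# This must run before the XTTS model is loaded.
os.environ["COQUI_TOS_AGREED"] = "1"

# ... load the XTTS model here, for example through TTS.api.TTS
```

The same effect can be had by exporting the variable in the shell that launches the script.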

- Python
Published by erogol over 2 years ago

tts - v0.17.2

What's Changed

  • Fix model tests by @erogol in https://github.com/coqui-ai/TTS/pull/2943

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.1...v0.17.2

- Python
Published by erogol over 2 years ago

tts - v0.17.1

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.17.0...v0.17.1

- Python
Published by erogol over 2 years ago

tts - 👑v0.17.0

What's Changed

  • 👑XTTS implementation by @erogol in https://github.com/coqui-ai/TTS/pull/2939
  • Fix requests exception handling in manage.py by @Cohee1207 in https://github.com/coqui-ai/TTS/pull/2912
  • Fixed spectrogram checking on librosa 0.10.x by @T145 in https://github.com/coqui-ai/TTS/pull/2899
  • Add CML-TTS dataset YourTTS training recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/2934

New Contributors

  • @Cohee1207 made their first contribution in https://github.com/coqui-ai/TTS/pull/2912
  • @T145 made their first contribution in https://github.com/coqui-ai/TTS/pull/2899

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.6...v0.17.0

- Python
Published by erogol over 2 years ago

tts - v0.16.6

What's Changed

  • Update README with new device API by @jaketae in https://github.com/coqui-ai/TTS/pull/2876
  • Add device flag to TTS CLI by @jaketae in https://github.com/coqui-ai/TTS/pull/2875
  • [WIP] Add phonemizer for Belarusian language by @alex73 in https://github.com/coqui-ai/TTS/pull/2856
  • Updated scipy version by @Exponefrv1 in https://github.com/coqui-ai/TTS/pull/2914
  • Update docs by @erogol in https://github.com/coqui-ai/TTS/pull/2919

New Contributors

  • @Exponefrv1 made their first contribution in https://github.com/coqui-ai/TTS/pull/2914

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.5...v0.16.6

- Python
Published by erogol over 2 years ago

tts - v0.16.5

What's Changed

  • Fix loading Bark by @erogol in https://github.com/coqui-ai/TTS/pull/2893

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.4...v0.16.5

- Python
Published by erogol over 2 years ago

tts - v0.16.4

What's Changed

  • Add customizable data home path by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2871
  • Add device support in TTS and Synthesizer by @jaketae in https://github.com/coqui-ai/TTS/pull/2855

New Contributors

  • @jaketae made their first contribution in https://github.com/coqui-ai/TTS/pull/2855

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.3...v0.16.4

- Python
Published by erogol over 2 years ago

tts - v0.16.3

What's Changed

  • Update Studio API for XTTS by @erogol in https://github.com/coqui-ai/TTS/pull/2861
  • Denote human voices in README.md by @michaelnew in https://github.com/coqui-ai/TTS/pull/2851

New Contributors

  • @michaelnew made their first contribution in https://github.com/coqui-ai/TTS/pull/2851

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.2...v0.16.3

- Python
Published by erogol over 2 years ago

tts - v0.16.2

What's Changed

  • add post functionality to /api/tts by @ChaseCares in https://github.com/coqui-ai/TTS/pull/2836
  • Add fairseq onnx support and strict configuration, fixes some onnx errors by @SystemPanic in https://github.com/coqui-ai/TTS/pull/2831
  • Fix phoneme coverage notebook imports by @erogol in https://github.com/coqui-ai/TTS/pull/2845
  • Handle missing JA phonemizer by @erogol in https://github.com/coqui-ai/TTS/pull/2843

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.1...v0.16.2

- Python
Published by erogol over 2 years ago

tts - v0.16.1

What's Changed

  • Adds multi-language support for VITS onnx, fixes onnx exporting and inference errors by @SystemPanic in https://github.com/coqui-ai/TTS/pull/2816
  • Recipe for Belarusian TTS by @alex73 in https://github.com/coqui-ai/TTS/pull/2756
  • Delightful TTS VCTK recipe fixes by @AWAS666 in https://github.com/coqui-ai/TTS/pull/2808
  • Add kwargs to ignore extra arguments w/o error by @erogol in https://github.com/coqui-ai/TTS/pull/2822
  • Fix DelightfulTTS by @erogol in https://github.com/coqui-ai/TTS/pull/2823

New Contributors

  • @SystemPanic made their first contribution in https://github.com/coqui-ai/TTS/pull/2816
  • @AWAS666 made their first contribution in https://github.com/coqui-ai/TTS/pull/2808

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.16.0...v0.16.1

- Python
Published by erogol over 2 years ago

tts - v0.16.0

What's Changed

  • Fix #2749 by @erogol in https://github.com/coqui-ai/TTS/pull/2750
  • Fix share model page URL by @alex73 in https://github.com/coqui-ai/TTS/pull/2757
  • Make Japanese-specific dependencies optional by @polm in https://github.com/coqui-ai/TTS/pull/2776
  • API tests by @erogol in https://github.com/coqui-ai/TTS/pull/2790
  • Add Delightful-TTS model by @loganhart420 in https://github.com/coqui-ai/TTS/pull/2095
  • Fix Tortoise load by @erogol in https://github.com/coqui-ai/TTS/pull/2791

New Contributors

  • @alex73 made their first contribution in https://github.com/coqui-ai/TTS/pull/2757
  • @polm made their first contribution in https://github.com/coqui-ai/TTS/pull/2776

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.6...v0.16.0

- Python
Published by erogol over 2 years ago

tts - 🛠️v0.15.6

What's Changed

  • fix loading of model and vocoder configs by @ChaseCares in https://github.com/coqui-ai/TTS/pull/2698
  • Update compute_embeddings.py by @46319943 in https://github.com/coqui-ai/TTS/pull/2668
  • delete meaningless print() by @ZhouGongZaiShi in https://github.com/coqui-ai/TTS/pull/2662
  • fixed small spelling mistakes finetuning.md by @Woutervdvelde in https://github.com/coqui-ai/TTS/pull/2551
  • Resolve conflicts by @erogol in https://github.com/coqui-ai/TTS/pull/2741
  • Export multispeaker onnx by @erogol in https://github.com/coqui-ai/TTS/pull/2743
  • Fix #2745 by @erogol in https://github.com/coqui-ai/TTS/pull/2748

New Contributors

  • @ChaseCares made their first contribution in https://github.com/coqui-ai/TTS/pull/2698
  • @46319943 made their first contribution in https://github.com/coqui-ai/TTS/pull/2668
  • @ZhouGongZaiShi made their first contribution in https://github.com/coqui-ai/TTS/pull/2662
  • @Woutervdvelde made their first contribution in https://github.com/coqui-ai/TTS/pull/2551

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.5...v0.15.6

- Python
Published by erogol over 2 years ago

tts - 🛠️ v0.15.5

What's Changed

  • Update docs and credits by @erogol in https://github.com/coqui-ai/TTS/pull/2733

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.4...v0.15.5

- Python
Published by erogol over 2 years ago

tts - 🛠️v0.15.4

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.2...v0.15.4

- Python
Published by erogol over 2 years ago

tts - 🛠️v0.15.2

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.1...v0.15.2

- Python
Published by erogol over 2 years ago

tts - 🛠️v0.15.1

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.15.0...v0.15.1

  • Fix docs

- Python
Published by erogol over 2 years ago

tts - 😄 v0.15.0

What's Changed

  • Update stochastic_duration_predictor.py by @mengting7tw in https://github.com/coqui-ai/TTS/pull/2663
  • Fix Tortoise load by @erogol in https://github.com/coqui-ai/TTS/pull/2697
  • Inference API for 🐶Bark by @erogol in https://github.com/coqui-ai/TTS/pull/2685
  • Drop Python 3.7 and 3.8 and stage Python 3.11 by @erogol in https://github.com/coqui-ai/TTS/pull/2700

Running 🐶Bark

```python
text = "Hello, my name is Manmay , how are you?"

from TTS.tts.configs.bark_config import BarkConfig
from TTS.tts.models.bark import Bark

config = BarkConfig()
model = Bark.init_from_config(config)
model.load_checkpoint(config, checkpoint_dir="path/to/model/dir/", eval=True)

# with random speaker
output_dict = model.synthesize(text, config, speaker_id="random", voice_dirs=None)

# cloning a speaker.
# It assumes that you have a speaker file in `bark_voices/speaker_n/speaker.wav` or `bark_voices/speaker_n/speaker.npz`
output_dict = model.synthesize(text, config, speaker_id="ljspeech", voice_dirs="bark_voices/")
```

Using 🐸TTS API:

```python
from TTS.api import TTS

# Load the model to GPU
# Bark is really slow on CPU, so we recommend using GPU.
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)

# Cloning a new speaker
# This expects to find a mp3 or wav file like `bark_voices/new_speaker/speaker.wav`
# It computes the cloning values and stores in `bark_voices/new_speaker/speaker.npz`
tts.tts_to_file(text="Hello, my name is Manmay , how are you?",
                file_path="output.wav",
                voice_dir="bark_voices/",
                speaker="ljspeech")

# When you run it again it uses the stored values to generate the voice.
tts.tts_to_file(text="Hello, my name is Manmay , how are you?",
                file_path="output.wav",
                voice_dir="bark_voices/",
                speaker="ljspeech")

# random speaker
tts = TTS("tts_models/multilingual/multi-dataset/bark", gpu=True)
tts.tts_to_file("hello world", file_path="out.wav")
```
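The voice directory layout described in the comments above can be checked before running synthesis. The sketch below is a hypothetical helper, not part of the TTS package. It assumes the layout `<voice_dir>/<speaker>/speaker.<ext>`, and the `speaker.mp3` file name in particular is an assumption, since the notes only name the `.wav` and `.npz` files.

```python
from pathlib import Path
from typing import Optional


def find_speaker_file(voice_dir: str, speaker: str) -> Optional[Path]:
    """Return the stored .npz values if present, else a .wav or .mp3 reference.

    Hypothetical helper that mirrors the layout described above:
    <voice_dir>/<speaker>/speaker.{npz,wav,mp3}
    """
    speaker_dir = Path(voice_dir) / speaker
    # Prefer the cached cloning values over the raw reference audio.
    for suffix in (".npz", ".wav", ".mp3"):
        candidate = speaker_dir / f"speaker{suffix}"
        if candidate.is_file():
            return candidate
    return None


print(find_speaker_file("bark_voices/", "ljspeech"))
```

A return value of `None` means there is neither a reference recording nor stored values for that speaker.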

Using 🐸TTS Command line:

```console
# cloning the `ljspeech` voice
tts --model_name tts_models/multilingual/multi-dataset/bark \
--text "This is an example." \
--out_path "output.wav" \
--voice_dir bark_voices/ \
--speaker_idx "ljspeech" \
--progress_bar True

# Random voice generation
tts --model_name tts_models/multilingual/multi-dataset/bark \
--text "This is an example." \
--out_path "output.wav" \
--progress_bar True
```

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.14.3...v0.15.0

- Python
Published by erogol over 2 years ago

tts - 👉 v0.14.3

- Python
Published by erogol over 2 years ago

tts - ⛈️ v0.14.2

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.14.1...v0.14.2

- Python
Published by erogol over 2 years ago

tts - 🚗 v0.14.1

What's Changed

  • Fetch all built-in speakers from API by @reuben in https://github.com/coqui-ai/TTS/pull/2626
  • fix typo by @vodiylik in https://github.com/coqui-ai/TTS/pull/2647
  • Port Fairseq TTS models by @erogol in https://github.com/coqui-ai/TTS/pull/2628

New Contributors

  • @vodiylik made their first contribution in https://github.com/coqui-ai/TTS/pull/2647

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.14.0...v0.14.1

Example text to speech using Fairseq models in ~1100 languages 🤯.

For these models use the following name format: tts_models/<lang-iso_code>/fairseq/vits.

You can find the list of language ISO codes here and learn about the Fairseq models here.

```python
from TTS.api import TTS

api = TTS(model_name="tts_models/eng/fairseq/vits", gpu=True)
api.tts_to_file("This is a test.", file_path="output.wav")

# TTS with on the fly voice conversion
api = TTS("tts_models/deu/fairseq/vits")
api.tts_with_vc_to_file(
    "Wie sage ich auf Italienisch, dass ich dich liebe?",
    speaker_wav="target/speaker.wav",
    file_path="output.wav"
)
```
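The name format given above, tts_models/<lang-iso_code>/fairseq/vits, can be built programmatically when switching between languages. `fairseq_model_name` is a hypothetical helper for illustration. Lower-casing the code is an assumption based on the `eng` and `deu` examples, and the helper does not check that a model exists for the given code.

```python
def fairseq_model_name(lang_iso_code: str) -> str:
    """Build a model name of the form tts_models/<lang-iso_code>/fairseq/vits."""
    code = lang_iso_code.strip().lower()
    if not code:
        raise ValueError("language ISO code must not be empty")
    return f"tts_models/{code}/fairseq/vits"


print(fairseq_model_name("eng"))
print(fairseq_model_name("deu"))
```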

- Python
Published by erogol over 2 years ago

tts - v0.14.0

What's Changed

  • Typos and minor fixes by @prakharpbuf in https://github.com/coqui-ai/TTS/pull/2508
  • Add FR and ES gruut languages as requirement to avoid inference issues by @Edresson in https://github.com/coqui-ai/TTS/pull/2572
  • Lighter docker image by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2600
  • Use default_factory for audio parameter by @v4hn in https://github.com/coqui-ai/TTS/pull/2576
  • Update README.md by @HighnessAtharva in https://github.com/coqui-ai/TTS/pull/2577
  • Add Jenny model by @erogol in https://github.com/coqui-ai/TTS/pull/2603
  • Warn when lang is not avail by @erogol in https://github.com/coqui-ai/TTS/pull/2460
  • Update VAD for silence trimming. by @erogol in https://github.com/coqui-ai/TTS/pull/2604
  • Tortoise TTS inference by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/2547
  • Draft ONNX export for VITS by @erogol in https://github.com/coqui-ai/TTS/pull/2563

New Contributors

  • @prakharpbuf made their first contribution in https://github.com/coqui-ai/TTS/pull/2508
  • @v4hn made their first contribution in https://github.com/coqui-ai/TTS/pull/2576
  • @HighnessAtharva made their first contribution in https://github.com/coqui-ai/TTS/pull/2577

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.13.3...v0.14.0

- Python
Published by erogol almost 3 years ago

tts - v0.14.0_models

Jenny VITS model trained by 👑@noml4u

tts --model_name tts_models/en/jenny/jenny --text "This is a test. This is also a test."

- Python
Published by erogol almost 3 years ago

tts - v0.14.1_models

- Python
Published by erogol almost 3 years ago

tts - 🐶 v0.13.3

What's Changed

  • Bangla models by @erogol in https://github.com/coqui-ai/TTS/pull/2532

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.13.2...v0.13.3

- Python
Published by erogol almost 3 years ago

tts - v0.13.3_models

Single speaker Bangla Male/Female models

These are single-speaker VITS models with a 22050 Hz sampling rate.

By 👑 @mobassir94

Original repo: https://github.com/mobassir94/comprehensive-bangla-tts

Male Model

tts --model_name tts_models/bn/custom/vits-male --text "এটি ডেমো করার উদ্দেশ্যে একটি ডেমো"

```python
from TTS.api import TTS

tts = TTS(model_name="tts_models/bn/custom/vits-male")
tts.tts_to_file(text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো", file_path="output.wav")

# TTS with voice conversion to a reference speaker in `target_speaker.wav`
tts.tts_with_vc_to_file(
    text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো",
    speaker_wav="target_speaker.wav",
    file_path="output.wav",
)
```

Female Model

tts --model_name tts_models/bn/custom/vits-female --text "এটি ডেমো করার উদ্দেশ্যে একটি ডেমো"

```python
from TTS.api import TTS

tts = TTS(model_name="tts_models/bn/custom/vits-female")
tts.tts_to_file(text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো", file_path="output.wav")

# TTS with voice conversion to a reference speaker in `target_speaker.wav`
tts.tts_with_vc_to_file(
    text="এটি ডেমো করার উদ্দেশ্যে একটি ডেমো",
    speaker_wav="target_speaker.wav",
    file_path="output.wav",
)
```

- Python
Published by erogol almost 3 years ago

tts - 🌈v0.13.2

What's Changed

  • 🐸Studio models by tts by @erogol in https://github.com/coqui-ai/TTS/pull/2515
  • Update VAD by @erogol in https://github.com/coqui-ai/TTS/pull/2509
  • 🌈 v0.13.2 by @erogol in https://github.com/coqui-ai/TTS/pull/2519

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.13.1...v0.13.2

- Python
Published by erogol almost 3 years ago

tts - v0.13.1

What's Changed

  • Update Librosa Version To V0.10.0 by @MattyB95 in https://github.com/coqui-ai/TTS/pull/2480
  • Api voice conversion by @erogol in https://github.com/coqui-ai/TTS/pull/2495
  • ✨ v0.13.1 by @erogol in https://github.com/coqui-ai/TTS/pull/2499

New Contributors

  • @MattyB95 made their first contribution in https://github.com/coqui-ai/TTS/pull/2480

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.13.0...v0.13.1

- Python
Published by erogol almost 3 years ago

tts - v0.13.0

What's Changed

  • vits.py training fixed due to return_complex by @iamkhalidbashir in https://github.com/coqui-ai/TTS/pull/2418
  • Update numba version by @erogol in https://github.com/coqui-ai/TTS/pull/2435
  • Implement FreeVC by @erogol in https://github.com/coqui-ai/TTS/pull/2451
  • [minor] hifigan_generator.py typo by @p0p4k in https://github.com/coqui-ai/TTS/pull/2462
  • [minor] batch["speaker_ids"] getting set two times by @p0p4k in https://github.com/coqui-ai/TTS/pull/2470
  • fix typo by @BenoitWang in https://github.com/coqui-ai/TTS/pull/2475
  • Fixes typo in README.md example code by @TCNOco in https://github.com/coqui-ai/TTS/pull/2478
  • 🐸 Coqui Studio API integration by @erogol in https://github.com/coqui-ai/TTS/pull/2484

New Contributors

  • @BenoitWang made their first contribution in https://github.com/coqui-ai/TTS/pull/2475
  • @TCNOco made their first contribution in https://github.com/coqui-ai/TTS/pull/2478

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.12.0...v0.13.0

- Python
Published by erogol almost 3 years ago

tts - FreeVC Models

Landed with v0.12.0

- Python
Published by erogol almost 3 years ago

tts - v0.12.0

What's Changed

  • numpy version for py310 by @p0p4k in https://github.com/coqui-ai/TTS/pull/2316
  • Fix Speaker Consistency Loss (SCL) by @Edresson in https://github.com/coqui-ai/TTS/pull/2364
  • OverFlow with test sentences by @thennal10 in https://github.com/coqui-ai/TTS/pull/2253
  • Basic Mary-TTS API compatibility by @fquirin in https://github.com/coqui-ai/TTS/pull/2352
  • add energy by default to Fastspeech2 config by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/2326
  • Remove doc bot by @erogol in https://github.com/coqui-ai/TTS/pull/2399
  • Update docs by @erogol in https://github.com/coqui-ai/TTS/pull/2389
  • v0.12.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2390

New Contributors

  • @thennal10 made their first contribution in https://github.com/coqui-ai/TTS/pull/2253
  • @fquirin made their first contribution in https://github.com/coqui-ai/TTS/pull/2352

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.11.1...v0.12.0

- Python
Published by erogol almost 3 years ago

tts - v0.11.1

What's Changed

  • Add pre-trained NeuralHMM model by @erogol in https://github.com/coqui-ai/TTS/pull/2314
  • v0.11.1 by @erogol in https://github.com/coqui-ai/TTS/pull/2339

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.11.0...v0.11.1

- Python
Published by erogol about 3 years ago

tts - v0.11.0

What's Changed

  • v0.9.0 by @erogol in https://github.com/coqui-ai/TTS/pull/1942
  • 🚀 v0.10.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2205
  • v0.10.1 by @erogol in https://github.com/coqui-ai/TTS/pull/2242
  • Fastspeech2 by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/2073
  • Cache speaker encoder model by @erogol in https://github.com/coqui-ai/TTS/pull/2284
  • Adding neural HMM TTS Model by @shivammehta25 in https://github.com/coqui-ai/TTS/pull/2272
  • Add Catalan text cleaners for Catalan support by @GerrySant in https://github.com/coqui-ai/TTS/pull/2295
  • Use packaging.version for version comparisons by @mweinelt in https://github.com/coqui-ai/TTS/pull/2310
  • Fix tts-server for multi-lingual models by @marius851000 in https://github.com/coqui-ai/TTS/pull/2257
  • API from model path by @erogol in https://github.com/coqui-ai/TTS/pull/2303
  • v0.11.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2277
  • v0.11.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2328
  • Bump up to v0.11.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2329
  • v0.11.0 (#2329) by @erogol in https://github.com/coqui-ai/TTS/pull/2337

New Contributors

  • @GerrySant made their first contribution in https://github.com/coqui-ai/TTS/pull/2295
  • @mweinelt made their first contribution in https://github.com/coqui-ai/TTS/pull/2310
  • @marius851000 made their first contribution in https://github.com/coqui-ai/TTS/pull/2257

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.10.2...v0.11.0

- Python
Published by erogol about 3 years ago

tts - v0.11.0_models

  • NeuralHMM trained on LJSpeech by 👑 @shivammehta25

- Python
Published by erogol about 3 years ago

tts - v0.10.2

What's Changed

  • Multilingual tokenizer by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2229
  • Fixed bug related to yourtts speaker embeddings issue by @iamkhalidbashir in https://github.com/coqui-ai/TTS/pull/2234
  • Update the Trainer requirement version for a compatible one by @Edresson in https://github.com/coqui-ai/TTS/pull/2276

New Contributors

  • @iamkhalidbashir made their first contribution in https://github.com/coqui-ai/TTS/pull/2234

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.10.1...v0.10.2

- Python
Published by erogol about 3 years ago

tts - v0.10.1

What's Changed

  • fixed tutorial 2 incompatibility with new dev by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/2161
  • Fix capacitron test when cuda is enabled by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2189
  • Fix VITS multi-speaker voice conversion inference by @Edresson in https://github.com/coqui-ai/TTS/pull/2187
  • Handle espeak 1.48.15 by @erogol in https://github.com/coqui-ai/TTS/pull/2203
  • Python API implementation by @erogol in https://github.com/coqui-ai/TTS/pull/2195
  • Update README by @erogol in https://github.com/coqui-ai/TTS/pull/2204
  • Update formatters.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/2194
  • Adding OverFlow by @shivammehta25 in https://github.com/coqui-ai/TTS/pull/2183
  • Add YourTTS VCTK recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/2198
  • Add Original YourTTS vocabulary on YourTTS recipe for full transfer learning by @Edresson in https://github.com/coqui-ai/TTS/pull/2206
  • Adding pre-trained Overflow model by @erogol in https://github.com/coqui-ai/TTS/pull/2211
  • Fixup overflow by @erogol in https://github.com/coqui-ai/TTS/pull/2218
  • Add Ukrainian LADA (female) voice by @egorsmkv in https://github.com/coqui-ai/TTS/pull/2226

New Contributors

  • @shivammehta25 made their first contribution in https://github.com/coqui-ai/TTS/pull/2183
  • @egorsmkv made their first contribution in https://github.com/coqui-ai/TTS/pull/2226

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.9.0...v0.10.1

- Python
Published by erogol about 3 years ago

tts - v0.10.1_models

This release includes 3 new models

  • Multilingual YourTTS updated version based on the recent recipe.
  • Catalan multi-speaker VITS model by 👑@gullabi

    tts --model_name tts_models/ca/custom/vits --text "Ei, com estàs avui? Us desitjo unes bones festes." --speaker_idx d0cd44fcdae652efb0dd428cd1b8f1911e6eb2ca3469a1f2d6f9faf97a9d05e30f28387dfb81bfb4c97eba64187a0c047c85bf06998ccaec58781f3982626bb6

Note: Speaker names are quite long 😄 for this model.

  • Persian single-speaker female GlowTTS model by 👑@karim23657 (without compatible vocoder)

    tts --model_name tts_models/fa/custom/glow-tts --text "سلام امروز چطوری؟ تعطیلات خوشی را برای شما آرزو می کنم."

- Python
Published by erogol about 3 years ago

tts - v0.10.0

What's Changed

  • v0.9.0 by @erogol in https://github.com/coqui-ai/TTS/pull/1942
  • fixed tutorial 2 incompatibility with new dev by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/2161
  • Fix capacitron test when cuda is enabled by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2189
  • Fix VITS multi-speaker voice conversion inference by @Edresson in https://github.com/coqui-ai/TTS/pull/2187
  • Handle espeak 1.48.15 by @erogol in https://github.com/coqui-ai/TTS/pull/2203
  • Python API implementation by @erogol in https://github.com/coqui-ai/TTS/pull/2195
  • Update README by @erogol in https://github.com/coqui-ai/TTS/pull/2204
  • Update formatters.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/2194
  • Adding OverFlow by @shivammehta25 in https://github.com/coqui-ai/TTS/pull/2183
  • Add YourTTS VCTK recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/2198
  • Add Original YourTTS vocabulary on YourTTS recipe for full transfer learning by @Edresson in https://github.com/coqui-ai/TTS/pull/2206
  • Adding pre-trained Overflow model by @erogol in https://github.com/coqui-ai/TTS/pull/2211
  • Fixup overflow by @erogol in https://github.com/coqui-ai/TTS/pull/2218
  • 🚀 v0.10.0 by @erogol in https://github.com/coqui-ai/TTS/pull/2205

New Contributors

  • @shivammehta25 made their first contribution in https://github.com/coqui-ai/TTS/pull/2183

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.9.0...v0.10.0

- Python
Published by erogol about 3 years ago

tts - v0.10.0_models

  • Overflow model trained on LJSpeech dataset using pretrained HifiGAN vocoder vocoder_models/en/ljspeech/hifigan_v2.

- Python
Published by erogol about 3 years ago

tts - v0.9.0

New models

  • Added 25 new models covering 25 different EU languages from 👑https://github.com/NeonGeckoCom/neon-tts-plugin-coqui

What's Changed

  • Trick to Upsampling to High sampling rates using VITS model by @Edresson in https://github.com/coqui-ai/TTS/pull/1456
  • Update Coqpit requirement by @Edresson in https://github.com/coqui-ai/TTS/pull/1539
  • Missing f prefix on f-strings fix by @code-review-doctor in https://github.com/coqui-ai/TTS/pull/1532
  • tiny improvement in data_path resolvement by @taras-sereda in https://github.com/coqui-ai/TTS/pull/1567
  • Fix VITS upsampling asserts by @Edresson in https://github.com/coqui-ai/TTS/pull/1550
  • Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in https://github.com/coqui-ai/TTS/pull/1560
  • 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in https://github.com/coqui-ai/TTS/pull/1565
  • Update CI tests by @erogol in https://github.com/coqui-ai/TTS/pull/1572
  • Build and publish CPU only Docker image by @erogol in https://github.com/coqui-ai/TTS/pull/1573
  • Add an assert for the upsampling trick by @erogol in https://github.com/coqui-ai/TTS/pull/1538
  • Add audio length sampler balancer by @Edresson in https://github.com/coqui-ai/TTS/pull/1561
  • Change the VITS upsampling interpolation trick to linear by @Edresson in https://github.com/coqui-ai/TTS/pull/1564
  • Capacitron by @a-froghyar in https://github.com/coqui-ai/TTS/pull/977
  • Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in https://github.com/coqui-ai/TTS/pull/1587
  • Training recipes for thorsten dataset by @noranraskin in https://github.com/coqui-ai/TTS/pull/1020
  • fix invalid json by @s3781009 in https://github.com/coqui-ai/TTS/pull/1599
  • Use fsspec and torch for embedding file IO by @erogol in https://github.com/coqui-ai/TTS/pull/1581
  • Adding TTS Tutorials by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/1584
  • Internal formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1629
  • Update training_a_model.md by @klotlabs in https://github.com/coqui-ai/TTS/pull/1620
  • Add synpaflex formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1616
  • added support for model_info in CLI by @p0p4k in https://github.com/coqui-ai/TTS/pull/1623
  • Add Thorsten VITS model by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1675
  • Checkpoint bug fix by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1641
  • docs : Adding in the arguments for CLI by @camillem in https://github.com/coqui-ai/TTS/pull/1469
  • Fix Publish CI by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1597
  • Fix tokenizer for punc only by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1717
  • Add durations as aux input for VITS by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1694
  • feat: updated capacitron recipes and lr fix by @a-froghyar in https://github.com/coqui-ai/TTS/pull/1718
  • Implement VitsAudioConfig by @erogol in https://github.com/coqui-ai/TTS/pull/1556
  • Fix aux tests by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1753
  • Fix for FloorDiv Function Warning by @iprovalo in https://github.com/coqui-ai/TTS/pull/1760
  • Update download_vctk.sh by @mengting7tw in https://github.com/coqui-ai/TTS/pull/1739
  • Update decoder.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/1792
  • Update requirements.txt for python 3.10 support by @p0p4k in https://github.com/coqui-ai/TTS/pull/1791
  • Update README.md by @yuripourre in https://github.com/coqui-ai/TTS/pull/1776
  • Fix & update WaveRNN vocoder model by @vanIvan in https://github.com/coqui-ai/TTS/pull/1749
  • Update requirements.txt; inflect==5.6 by @p0p4k in https://github.com/coqui-ai/TTS/pull/1809
  • Update README.md; download progress bar in CLI. by @p0p4k in https://github.com/coqui-ai/TTS/pull/1797
  • Update wavenet.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/1796
  • Adjust default to be able to process longer sentences by @lkiesow in https://github.com/coqui-ai/TTS/pull/1835
  • Fix language flags generated by espeak-ng phonemizer by @Lokhozt in https://github.com/coqui-ai/TTS/pull/1801
  • fix get_random_embeddings --> get_random_embedding by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1726
  • Introduce numpy and torch transforms by @erogol in https://github.com/coqui-ai/TTS/pull/1705
  • Implement bucketed weighted sampling for VITS by @erogol in https://github.com/coqui-ai/TTS/pull/1871
  • capacitron_layers multi speaker bug fix by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1664
  • updates to dataset analysis notebooks by @jreus in https://github.com/coqui-ai/TTS/pull/1853
  • Fix BCE loss issue by @erogol in https://github.com/coqui-ai/TTS/pull/1872
  • Remove deprecated files by @erogol in https://github.com/coqui-ai/TTS/pull/1873
  • Handle when no batch sampler by @erogol in https://github.com/coqui-ai/TTS/pull/1882
  • Fix tune wavegrad by @geth-network in https://github.com/coqui-ai/TTS/pull/1844
  • Add new DE Thorsten models by @erogol in https://github.com/coqui-ai/TTS/pull/1898
  • Add speaker encoder recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/1912
  • Add capacitron v2 model by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1768
  • Fixes a race condition with multiple simultaneous get requests. by @KyuubiYoru in https://github.com/coqui-ai/TTS/pull/1807
  • Fix find unique phonemes script by @Edresson in https://github.com/coqui-ai/TTS/pull/1928
  • Add YourTTS and SC-GlowTTS on available models by @Edresson in https://github.com/coqui-ai/TTS/pull/1933
  • Korean Phonemizer by @harmlessman in https://github.com/coqui-ai/TTS/pull/1822
  • Add espeak support for Chinese by @happylittlecat2333 in https://github.com/coqui-ai/TTS/pull/1905
  • Replace pyworld by pyin by @Edresson in https://github.com/coqui-ai/TTS/pull/1946
  • d-vector handling by @erogol in https://github.com/coqui-ai/TTS/pull/1945
  • Fixups by @erogol in https://github.com/coqui-ai/TTS/pull/1967
  • Fix VC by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1971
  • Update readme by @erogol in https://github.com/coqui-ai/TTS/pull/1978
  • Add metafile arg to compute embeddings script by @erogol in https://github.com/coqui-ai/TTS/pull/1977
  • Fix dataset handling with the new embedding file keys by @Edresson in https://github.com/coqui-ai/TTS/pull/1991
  • Fix colliding dataset cache file names by @Edresson in https://github.com/coqui-ai/TTS/pull/1994
  • Write non-speech files in a TXT by @erogol in https://github.com/coqui-ai/TTS/pull/2048
  • Minor bug fixes on VITS/YourTTS and inference by @Edresson in https://github.com/coqui-ai/TTS/pull/2054
  • Check num of columns in coqui format by @erogol in https://github.com/coqui-ai/TTS/pull/2066
  • Remove / prefix from the relative path by @erogol in https://github.com/coqui-ai/TTS/pull/2065
  • Update Tutorial_2_train_your_first_TTS_model.ipynb by @CeadeS in https://github.com/coqui-ai/TTS/pull/2079
  • Update forward_tts.md by @mrshu in https://github.com/coqui-ai/TTS/pull/2019
  • Use "formatter" key in the datasets json array by @humada05 in https://github.com/coqui-ai/TTS/pull/2114
  • capacitron training fixes by @victor-shepardson in https://github.com/coqui-ai/TTS/pull/2086
  • mailabs formatter: back/forward slash in file path fix by @freezerain in https://github.com/coqui-ai/TTS/pull/1938
  • Add Discord server badge by @erogol in https://github.com/coqui-ai/TTS/pull/2136
  • Remove langs except en and de by @erogol in https://github.com/coqui-ai/TTS/pull/2135
  • Cache fsspec downloaded files by @erogol in https://github.com/coqui-ai/TTS/pull/2132
  • Update dep caching in actions by @erogol in https://github.com/coqui-ai/TTS/pull/2138
  • Update README.md by @eltociear in https://github.com/coqui-ai/TTS/pull/2146
  • Makes docker images lighter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2149
  • Doc update docker by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2153
  • Add neon models by @loganhart420 in https://github.com/coqui-ai/TTS/pull/2140
  • Fix documentation by @WeberJulian in https://github.com/coqui-ai/TTS/pull/2154

New Contributors

  • @code-review-doctor made their first contribution in https://github.com/coqui-ai/TTS/pull/1532
  • @taras-sereda made their first contribution in https://github.com/coqui-ai/TTS/pull/1567
  • @ribeiromiranda made their first contribution in https://github.com/coqui-ai/TTS/pull/1587
  • @s3781009 made their first contribution in https://github.com/coqui-ai/TTS/pull/1599
  • @Aya-AlJafari made their first contribution in https://github.com/coqui-ai/TTS/pull/1584
  • @klotlabs made their first contribution in https://github.com/coqui-ai/TTS/pull/1620
  • @p0p4k made their first contribution in https://github.com/coqui-ai/TTS/pull/1623
  • @manmay-nakhashi made their first contribution in https://github.com/coqui-ai/TTS/pull/1641
  • @camillem made their first contribution in https://github.com/coqui-ai/TTS/pull/1469
  • @iprovalo made their first contribution in https://github.com/coqui-ai/TTS/pull/1760
  • @mengting7tw made their first contribution in https://github.com/coqui-ai/TTS/pull/1739
  • @yuripourre made their first contribution in https://github.com/coqui-ai/TTS/pull/1776
  • @vanIvan made their first contribution in https://github.com/coqui-ai/TTS/pull/1749
  • @lkiesow made their first contribution in https://github.com/coqui-ai/TTS/pull/1835
  • @Lokhozt made their first contribution in https://github.com/coqui-ai/TTS/pull/1801
  • @jreus made their first contribution in https://github.com/coqui-ai/TTS/pull/1853
  • @geth-network made their first contribution in https://github.com/coqui-ai/TTS/pull/1844
  • @KyuubiYoru made their first contribution in https://github.com/coqui-ai/TTS/pull/1807
  • @harmlessman made their first contribution in https://github.com/coqui-ai/TTS/pull/1822
  • @happylittlecat2333 made their first contribution in https://github.com/coqui-ai/TTS/pull/1905
  • @CeadeS made their first contribution in https://github.com/coqui-ai/TTS/pull/2079
  • @mrshu made their first contribution in https://github.com/coqui-ai/TTS/pull/2019
  • @humada05 made their first contribution in https://github.com/coqui-ai/TTS/pull/2114
  • @victor-shepardson made their first contribution in https://github.com/coqui-ai/TTS/pull/2086
  • @freezerain made their first contribution in https://github.com/coqui-ai/TTS/pull/1938
  • @eltociear made their first contribution in https://github.com/coqui-ai/TTS/pull/2146

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.6.2...v0.9.0

- Python
Published by erogol over 3 years ago

tts - v0.8.0

What's Changed

  • Trick to Upsampling to High sampling rates using VITS model by @Edresson in https://github.com/coqui-ai/TTS/pull/1456
  • Update Coqpit requirement by @Edresson in https://github.com/coqui-ai/TTS/pull/1539
  • Missing f prefix on f-strings fix by @code-review-doctor in https://github.com/coqui-ai/TTS/pull/1532
  • tiny improvement in data_path resolvement by @taras-sereda in https://github.com/coqui-ai/TTS/pull/1567
  • Fix VITS upsampling asserts by @Edresson in https://github.com/coqui-ai/TTS/pull/1550
  • Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in https://github.com/coqui-ai/TTS/pull/1560
  • 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in https://github.com/coqui-ai/TTS/pull/1565
  • Update CI tests by @erogol in https://github.com/coqui-ai/TTS/pull/1572
  • Build and publish CPU only Docker image by @erogol in https://github.com/coqui-ai/TTS/pull/1573
  • Add an assert for the upsampling trick by @erogol in https://github.com/coqui-ai/TTS/pull/1538
  • Add audio length sampler balancer by @Edresson in https://github.com/coqui-ai/TTS/pull/1561
  • Change the VITS upsampling interpolation trick to linear by @Edresson in https://github.com/coqui-ai/TTS/pull/1564
  • Capacitron by @a-froghyar in https://github.com/coqui-ai/TTS/pull/977
  • Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in https://github.com/coqui-ai/TTS/pull/1587
  • Training recipes for thorsten dataset by @noranraskin in https://github.com/coqui-ai/TTS/pull/1020
  • fix invalid json by @s3781009 in https://github.com/coqui-ai/TTS/pull/1599
  • Use fsspec and torch for embedding file IO by @erogol in https://github.com/coqui-ai/TTS/pull/1581
  • Adding TTS Tutorials by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/1584
  • Internal formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1629
  • Update training_a_model.md by @klotlabs in https://github.com/coqui-ai/TTS/pull/1620
  • Add synpaflex formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1616
  • added support for model_info in CLI by @p0p4k in https://github.com/coqui-ai/TTS/pull/1623
  • Add Thorsten VITS model by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1675
  • Checkpoint bug fix by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1641
  • docs : Adding in the arguments for CLI by @camillem in https://github.com/coqui-ai/TTS/pull/1469
  • Fix Publish CI by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1597
  • Fix tokenizer for punc only by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1717
  • Add durations as aux input for VITS by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1694
  • feat: updated capacitron recipes and lr fix by @a-froghyar in https://github.com/coqui-ai/TTS/pull/1718
  • Implement VitsAudioConfig by @erogol in https://github.com/coqui-ai/TTS/pull/1556
  • Fix aux tests by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1753
  • Fix for FloorDiv Function Warning by @iprovalo in https://github.com/coqui-ai/TTS/pull/1760
  • Update download_vctk.sh by @mengting7tw in https://github.com/coqui-ai/TTS/pull/1739
  • Update decoder.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/1792
  • Update requirements.txt for python 3.10 support by @p0p4k in https://github.com/coqui-ai/TTS/pull/1791
  • Update README.md by @yuripourre in https://github.com/coqui-ai/TTS/pull/1776
  • Fix & update WaveRNN vocoder model by @vanIvan in https://github.com/coqui-ai/TTS/pull/1749
  • Update requirements.txt; inflect==5.6 by @p0p4k in https://github.com/coqui-ai/TTS/pull/1809
  • Update README.md; download progress bar in CLI. by @p0p4k in https://github.com/coqui-ai/TTS/pull/1797
  • Update wavenet.py by @p0p4k in https://github.com/coqui-ai/TTS/pull/1796
  • Adjust default to be able to process longer sentences by @lkiesow in https://github.com/coqui-ai/TTS/pull/1835
  • Fix language flags generated by espeak-ng phonemizer by @Lokhozt in https://github.com/coqui-ai/TTS/pull/1801
  • fix get_random_embeddings --> get_random_embedding by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1726
  • Introduce numpy and torch transforms by @erogol in https://github.com/coqui-ai/TTS/pull/1705
  • Implement bucketed weighted sampling for VITS by @erogol in https://github.com/coqui-ai/TTS/pull/1871
  • capacitron_layers multi speaker bug fix by @manmay-nakhashi in https://github.com/coqui-ai/TTS/pull/1664
  • updates to dataset analysis notebooks by @jreus in https://github.com/coqui-ai/TTS/pull/1853
  • Fix BCE loss issue by @erogol in https://github.com/coqui-ai/TTS/pull/1872
  • Remove deprecated files by @erogol in https://github.com/coqui-ai/TTS/pull/1873
  • Handle when no batch sampler by @erogol in https://github.com/coqui-ai/TTS/pull/1882
  • Fix tune wavegrad by @geth-network in https://github.com/coqui-ai/TTS/pull/1844
  • Add new DE Thorsten models by @erogol in https://github.com/coqui-ai/TTS/pull/1898

New Contributors

  • @code-review-doctor made their first contribution in https://github.com/coqui-ai/TTS/pull/1532
  • @taras-sereda made their first contribution in https://github.com/coqui-ai/TTS/pull/1567
  • @ribeiromiranda made their first contribution in https://github.com/coqui-ai/TTS/pull/1587
  • @s3781009 made their first contribution in https://github.com/coqui-ai/TTS/pull/1599
  • @Aya-AlJafari made their first contribution in https://github.com/coqui-ai/TTS/pull/1584
  • @klotlabs made their first contribution in https://github.com/coqui-ai/TTS/pull/1620
  • @p0p4k made their first contribution in https://github.com/coqui-ai/TTS/pull/1623
  • @manmay-nakhashi made their first contribution in https://github.com/coqui-ai/TTS/pull/1641
  • @camillem made their first contribution in https://github.com/coqui-ai/TTS/pull/1469
  • @iprovalo made their first contribution in https://github.com/coqui-ai/TTS/pull/1760
  • @mengting7tw made their first contribution in https://github.com/coqui-ai/TTS/pull/1739
  • @yuripourre made their first contribution in https://github.com/coqui-ai/TTS/pull/1776
  • @vanIvan made their first contribution in https://github.com/coqui-ai/TTS/pull/1749
  • @lkiesow made their first contribution in https://github.com/coqui-ai/TTS/pull/1835
  • @Lokhozt made their first contribution in https://github.com/coqui-ai/TTS/pull/1801
  • @jreus made their first contribution in https://github.com/coqui-ai/TTS/pull/1853
  • @geth-network made their first contribution in https://github.com/coqui-ai/TTS/pull/1844

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.6.2...v0.8.0

- Python
Published by erogol over 3 years ago

tts - v0.8.0 models

✨New models ✨ from @thorstenMueller
✨New models ✨ from @NeonGeckoCom

- Python
Published by erogol over 3 years ago

tts - v0.7.1 models

Add capacitron V2 model to TTS zoo. It's more stable and just as expressive!

- Python
Published by WeberJulian over 3 years ago

tts - v0.7.1

What's Changed

  • v0.7.1 by @erogol in https://github.com/coqui-ai/TTS/pull/1676

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.7.0...v0.7.1

- Python
Published by erogol over 3 years ago

tts - v0.7.0

What's Changed

  • Trick to Upsampling to High sampling rates using VITS model by @Edresson in https://github.com/coqui-ai/TTS/pull/1456
  • Update Coqpit requirement by @Edresson in https://github.com/coqui-ai/TTS/pull/1539
  • Missing f prefix on f-strings fix by @code-review-doctor in https://github.com/coqui-ai/TTS/pull/1532
  • tiny improvement in data_path resolvement by @taras-sereda in https://github.com/coqui-ai/TTS/pull/1567
  • Fix VITS upsampling asserts by @Edresson in https://github.com/coqui-ai/TTS/pull/1550
  • Fix the bug in eSpeak wrapper for eSpeak version 1.48.15 by @Edresson in https://github.com/coqui-ai/TTS/pull/1560
  • 🐍 Python 3.10.x support and drop Python 3.6 support by @erogol in https://github.com/coqui-ai/TTS/pull/1565
  • Update CI tests by @erogol in https://github.com/coqui-ai/TTS/pull/1572
  • Build and publish CPU only Docker image by @erogol in https://github.com/coqui-ai/TTS/pull/1573
  • Add an assert for the upsampling trick by @erogol in https://github.com/coqui-ai/TTS/pull/1538
  • Add audio length sampler balancer by @Edresson in https://github.com/coqui-ai/TTS/pull/1561
  • Change the VITS upsampling interpolation trick to linear by @Edresson in https://github.com/coqui-ai/TTS/pull/1564
  • Capacitron by @a-froghyar in https://github.com/coqui-ai/TTS/pull/977
  • Fixed use_cuda issue in compute_embeddings.py by @ribeiromiranda in https://github.com/coqui-ai/TTS/pull/1587
  • Training recipes for thorsten dataset by @noranraskin in https://github.com/coqui-ai/TTS/pull/1020
  • fix invalid json by @s3781009 in https://github.com/coqui-ai/TTS/pull/1599
  • Use fsspec and torch for embedding file IO by @erogol in https://github.com/coqui-ai/TTS/pull/1581
  • Adding TTS Tutorials by @Aya-AlJafari in https://github.com/coqui-ai/TTS/pull/1584
  • Internal formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1629
  • Update training_a_model.md by @klotlabs in https://github.com/coqui-ai/TTS/pull/1620
  • Add synpaflex formatter by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1616
  • added support for model_info in CLI by @p0p4k in https://github.com/coqui-ai/TTS/pull/1623
  • v0.7.0 by @erogol in https://github.com/coqui-ai/TTS/pull/1537

New Contributors

  • @code-review-doctor made their first contribution in https://github.com/coqui-ai/TTS/pull/1532
  • @taras-sereda made their first contribution in https://github.com/coqui-ai/TTS/pull/1567
  • @ribeiromiranda made their first contribution in https://github.com/coqui-ai/TTS/pull/1587
  • @s3781009 made their first contribution in https://github.com/coqui-ai/TTS/pull/1599
  • @Aya-AlJafari made their first contribution in https://github.com/coqui-ai/TTS/pull/1584
  • @klotlabs made their first contribution in https://github.com/coqui-ai/TTS/pull/1620
  • @p0p4k made their first contribution in https://github.com/coqui-ai/TTS/pull/1623

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.6.2...v0.7.0

- Python
Published by erogol over 3 years ago

tts - Speaker Encoder Model

Speaker encoder model and config file.

- Python
Published by Aya-AlJafari over 3 years ago

tts - v0.7.0 models

  • English Capacitron-T2 models + corresponding HiFiGAN V2 vocoder. Implemented and trained by 👑@a-froghyar
  • German VITS model trained on Thorsten Dataset by 👑@thorstenMueller 👑@domcross

- Python
Published by WeberJulian almost 4 years ago

tts - v0.6.2

What's Changed

  • Fix multilingual recipe by @Edresson in https://github.com/coqui-ai/TTS/pull/1354
  • Fix recipes as to the recent API changes. by @erogol in https://github.com/coqui-ai/TTS/pull/1367
  • Add docsqa to docs website by @nomagick in https://github.com/coqui-ai/TTS/pull/1363
  • REBASED: Add support for the speaker encoder training using torch spectrograms by @Edresson in https://github.com/coqui-ai/TTS/pull/1348
  • Add alphas to control language and speaker balancer by @Edresson in https://github.com/coqui-ai/TTS/pull/1216
  • Add Voice conversion inference support by @Edresson in https://github.com/coqui-ai/TTS/pull/1337
  • Update issue template by @erogol in https://github.com/coqui-ai/TTS/pull/1370
  • Open bible dataset formatter by @Edresson in https://github.com/coqui-ai/TTS/pull/1365
  • REBASED: Transform Speaker Encoder in a Generic Encoder and Implement Emotion Encoder training support by @Edresson in https://github.com/coqui-ai/TTS/pull/1349
  • Fix typo workflow text by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1403
  • Add CITATION.cff by @erogol in https://github.com/coqui-ai/TTS/pull/1404
  • Fix default phonemizer for ja and zh by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1399
  • Make Style by @erogol in https://github.com/coqui-ai/TTS/pull/1405
  • Fix #1380 by @erogol in https://github.com/coqui-ai/TTS/pull/1409
  • Hinge Gruut version to 2.2.3 by @erogol in https://github.com/coqui-ai/TTS/pull/1419
  • Update CheckSpectrograms notebook by @erogol in https://github.com/coqui-ai/TTS/pull/1418
  • Fix #1423 by @Edresson in https://github.com/coqui-ai/TTS/pull/1424
  • Update model file extension by @erogol in https://github.com/coqui-ai/TTS/pull/1422
  • Fix model manager by @erogol in https://github.com/coqui-ai/TTS/pull/1436
  • Add formatting tests by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1437
  • Update base model wrt 👟 by @erogol in https://github.com/coqui-ai/TTS/pull/1406
  • Replace webrtcvad by silero-vad by @Edresson in https://github.com/coqui-ai/TTS/pull/1431
  • Bug fix in freeze encoder by @Edresson in https://github.com/coqui-ai/TTS/pull/1391
  • Enforce phonemizer definition for synthesis by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1441
  • Fix G2P backend of the released models by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1461
  • Add EmbeddingManager and BaseIDManager by @Edresson in https://github.com/coqui-ai/TTS/pull/1374
  • Update requirements coqui_trainer -> trainer by @erogol in https://github.com/coqui-ai/TTS/pull/1478
  • Update CONTRIBUTING.md, fix header by @Jackiexiao in https://github.com/coqui-ai/TTS/pull/1463
  • Add African models by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1511
  • Print Model's license when downloading by @erogol in https://github.com/coqui-ai/TTS/pull/1512
  • Improve docsQA default questions by @nomagick in https://github.com/coqui-ai/TTS/pull/1411
  • patch print license by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1514
  • v0.6.2 by @erogol in https://github.com/coqui-ai/TTS/pull/1353

New Contributors

  • @nomagick made their first contribution in https://github.com/coqui-ai/TTS/pull/1363
  • @Jackiexiao made their first contribution in https://github.com/coqui-ai/TTS/pull/1463

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.6.1...v0.6.2

- Python
Published by erogol almost 4 years ago

tts - v0.6.2 models

This release adds 6 new VITS models for languages of the openbible dataset.

  • ewe
  • hausa
  • lingala
  • yoruba
  • asante-twi
  • akuapem-twi

Original work (audio and text) by Biblica available for free at www.biblica.com and open.bible.

- Python
Published by WeberJulian almost 4 years ago

tts - v0.6.1 models

What's Changed

  • Renamed all checkpoints from model_file.pth.tar to model_file.pth
  • Tested and fixed the "phonemizer" backend key in the config of all tts models

For best performance, use the commit version attached to each model.

- Python
Published by WeberJulian almost 4 years ago

tts - v0.6.1

- Python
Published by erogol almost 4 years ago

tts - v0.6.0

What's Changed

Tokenizer API

Tokenizer API is defined by the TTSTokenizer class. It is intended to provide all the text processing functionalities to a tts model. New tokenizers can also be added by subclassing the TTSTokenizer class.

Phonemizer API

Phonemizer API is defined by the BasePhonemizer class and implemented by the ESpeak and Gruut wrappers, ZHCH, JPJA phonemizers. New phonemizers can be added by implementing the BasePhonemizer class.
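As a rough illustration of the subclassing pattern described above, here is a minimal, self-contained sketch. It does not use the 🐸TTS classes: ToyBasePhonemizer, ToyLexiconPhonemizer, the _phonemize hook, and the tiny lexicon are all hypothetical names invented for this example.

```python
from abc import ABC, abstractmethod


class ToyBasePhonemizer(ABC):
    """Hypothetical stand-in for a phonemizer base class (not the 🐸TTS API)."""

    def __init__(self, language: str):
        self.language = language

    @abstractmethod
    def _phonemize(self, text: str) -> str:
        """Convert normalized text into a phoneme string."""

    def phonemize(self, text: str) -> str:
        # Shared pre-processing lives in the base class; subclasses only
        # implement the language-specific conversion.
        return self._phonemize(text.strip().lower())


class ToyLexiconPhonemizer(ToyBasePhonemizer):
    """Looks words up in a tiny hand-written lexicon (illustrative only)."""

    LEXICON = {"hello": "həloʊ", "world": "wɜːld"}

    def _phonemize(self, text: str) -> str:
        # Unknown words are passed through unchanged.
        return " ".join(self.LEXICON.get(word, word) for word in text.split())


phonemizer = ToyLexiconPhonemizer(language="en-us")
print(phonemizer.phonemize("Hello world"))
```

A real backend would replace the lexicon lookup with a call into a tool such as espeak or gruut; the point here is only the split between shared logic in the base class and one overridden method per backend.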

BaseCharacters

The BaseCharacters class provides an API to define the model vocabulary and the dictionary that maps characters to token IDs and back. Two pre-defined classes inherit from BaseCharacters: IPAPhonemes and Graphemes, which respectively define the IPA phoneme character set for models using phonemes and the grapheme set for models using raw characters.
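The character-to-ID mapping can be pictured with a small sketch. ToyCharacters below is a hypothetical class written for this example, not the BaseCharacters implementation; the padding symbol and the sorted ID order are assumptions.

```python
class ToyCharacters:
    """Hypothetical vocabulary class; not the BaseCharacters implementation."""

    def __init__(self, characters: str, pad: str = "<PAD>"):
        # Reserve ID 0 for padding, then assign IDs in a stable sorted order.
        self.vocab = [pad] + sorted(set(characters))
        self.char_to_id = {ch: i for i, ch in enumerate(self.vocab)}
        self.id_to_char = {i: ch for i, ch in enumerate(self.vocab)}

    def encode(self, text: str) -> list:
        # Raises KeyError on characters outside the vocabulary.
        return [self.char_to_id[ch] for ch in text]

    def decode(self, ids: list) -> str:
        return "".join(self.id_to_char[i] for i in ids)


graphemes = ToyCharacters("abcdefghijklmnopqrstuvwxyz ")
ids = graphemes.encode("hello tts")
print(ids)
print(graphemes.decode(ids))
```

A phoneme vocabulary would be built the same way, only with an IPA symbol set passed in place of the Latin alphabet.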

Punctuations class

The Punctuations class strips punctuation marks from text and restores them when needed.
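One way to picture strip-and-restore is to split the text on punctuation runs, process the remaining chunks, and interleave the marks again. The sketch below uses hypothetical helper names (strip_punctuation, restore_punctuation) and an assumed punctuation set; it is not the Punctuations class itself.

```python
import re

# Assumed punctuation set for this example.
PUNCT_PATTERN = re.compile(r"([!,.;:?]+)")


def strip_punctuation(text: str):
    """Split text into punctuation-free chunks and the marks between them."""
    parts = PUNCT_PATTERN.split(text)
    chunks = parts[0::2]  # text between punctuation runs
    marks = parts[1::2]   # the punctuation runs themselves
    return chunks, marks


def restore_punctuation(chunks, marks):
    """Interleave processed chunks with the original punctuation marks."""
    restored = []
    for i, chunk in enumerate(chunks):
        restored.append(chunk)
        if i < len(marks):
            restored.append(marks[i])
    return "".join(restored)


chunks, marks = strip_punctuation("Hello, world! How are you?")
# Stand-in for phonemization: any per-chunk processing works here.
processed = [chunk.upper() for chunk in chunks]
print(restore_punctuation(processed, marks))
```

This is useful when a phonemizer backend drops or mangles punctuation: the marks are kept aside and put back around the phonemized chunks.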

Language specific text normalization routines under TTS.tts.utils.text

Under TTS.tts.utils.text there are folders for each language to accommodate the text normalization routines that are designed for the language.

👟Trainer

We separated the trainer into a new repo, 👟Trainer. It is a general-purpose model trainer for PyTorch with certain design choices in mind.

  • Support for different experiment tracking dashboards like ClearML, Tensorboard, MLFlow, and W&B.
  • Flexible to train any kind of DL model.
  • Simple code base and easily expandable.
  • Easy to debug.

It is currently a very early-stage, monolithic library. Feel free to share your ✨feedback✨ and ✨contribute✨.

VITS implementation update

With this version of the VITS model, we get rid of some of the issues that affect the model performance. It also illustrates well how you could adapt any open-source model implementation to 🐸TTS and 👟Trainer without even knowing the rest of the 🐸TTS library.

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.5.0...v0.6.0

New Models

  • GlowTTS + HifiGAN Turkish by 👑Fatih Akademi

    $ tts --model_name tts_models/tr/common-voice/glow-tts --text "Bu bizim için oluşturulmuş bir örnek sevgili dostum."

  • VITS and GlowTTS Italian by 👑@nicolalandro using MAI Italian male and female subsets.

    Female VITS model:

    $ tts --model_name tts_models/it/mai_female/vits --text "Questo è un esempio per noi, mio caro amico."

    Male VITS model:

    $ tts --model_name tts_models/it/mai_male/vits --text "Questo è un esempio per noi, mio caro amico."

- Python
Published by erogol almost 4 years ago

tts - v0.6.0 models

- Python
Published by erogol almost 4 years ago

tts - v0.5.0

What's Changed

  • Fix some setup papercuts by @reuben in https://github.com/coqui-ai/TTS/pull/1022
  • Add additional datasets by @loganhart420 in https://github.com/coqui-ai/TTS/pull/1021
  • Add UK vocoder models by @erogol in https://github.com/coqui-ai/TTS/pull/1031
  • Add multilingual models support by @erogol in https://github.com/coqui-ai/TTS/pull/1007
  • Implement YourTTS by @WeberJulian and @Edresson
  • Fixes before YourTTS merge by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1044
  • Fix language assignment by @erogol in https://github.com/coqui-ai/TTS/pull/1047
  • Fix if else statement by @erogol in https://github.com/coqui-ai/TTS/pull/1050
  • Fix train_tts.py and uncomment code by @WeberJulian in https://github.com/coqui-ai/TTS/pull/1051
  • v0.5.0 by @erogol in https://github.com/coqui-ai/TTS/pull/1027

New Contributors

  • @reuben made their first contribution in https://github.com/coqui-ai/TTS/pull/1022
  • @loganhart420 made their first contribution in https://github.com/coqui-ai/TTS/pull/1021

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.4.2...v0.5.0

- Python
Published by erogol about 4 years ago

tts - v0.5.0_models

Model releases accompanying v0.5.0

- Python
Published by erogol about 4 years ago

tts - v0.4.2

What's Changed

  • Model zoo tests by @erogol in https://github.com/coqui-ai/TTS/pull/900
  • v0.4.2 by @erogol in https://github.com/coqui-ai/TTS/pull/901
  • Optional silence trimming during inference and find_endpoint() fix by @george-roussos in https://github.com/coqui-ai/TTS/pull/898
  • Update gruut to version 2.0 by @synesthesiam in https://github.com/coqui-ai/TTS/pull/882
  • Documentation corrections for finetuning and data preparation by @gullabi in https://github.com/coqui-ai/TTS/pull/931
  • server: fix compatibility with tts_models/en/ljspeech/fast_pitch by @Mic92 in https://github.com/coqui-ai/TTS/pull/893
  • v0.4.2 by @erogol in https://github.com/coqui-ai/TTS/pull/914

New Contributors

  • @george-roussos made their first contribution in https://github.com/coqui-ai/TTS/pull/898
  • @gullabi made their first contribution in https://github.com/coqui-ai/TTS/pull/931

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.4.1...v0.4.2

- Python
Published by erogol about 4 years ago

tts - v0.4.1

What's Changed

  • v0.4.1 by @erogol in https://github.com/coqui-ai/TTS/pull/891

Full Changelog: https://github.com/coqui-ai/TTS/compare/v0.4.0...v0.4.1

- Python
Published by erogol over 4 years ago

tts - v0.4.0

🐸 v0.4.0

  • Update multi-speaker training API.
  • VCTK recipes for all the TTS models.
  • Documentation for multi-speaker training.
  • Pre-trained Ukrainian GlowTTS model from 👑 https://github.com/robinhad/ukrainian-tts
  • Pre-trained FastPitch VCTK model
  • Dataset downloaders for LJSpeech and VCTK under TTS.utils.downloaders
  • Documentation reformatting.
  • Trainer V2 and compat. updates in model implementations.

    This update makes the Trainer V2 responsible only for training a model. Everything else is excluded from the trainer and needs to be done either in the model or before calling the trainer.

Try out new models

  • Pre-trained FastPitch VCTK model

    tts --model_name tts_models/en/vctk/fast_pitch --text "This is my sample text to voice." --speaker_idx VCTK_p229

  • Pre-trained Ukrainian GlowTTS model from 👑 https://github.com/robinhad/ukrainian-tts

    tts --model_name tts_models/uk/mai/glow-tts --text "Це зразок тексту, щоб спробувати нашу модель."

- Python
Published by erogol over 4 years ago

tts - v0.3.1

Mostly fixes for GlowTTS training.

- Python
Published by erogol over 4 years ago

tts - v0.3.0

🐸 v0.3.0

New ForwardTTS implementation.

This version implements a new ForwardTTS interface that can be configured as any feed-forward TTS model that uses a duration predictor at inference time. Currently, we provide 3 pre-configured models and plan to implement one more.

  1. SpeedySpeech
  2. FastSpeech
  3. FastPitch
  4. FastSpeech 2 (TODO)

Through this API, any model can be trained in two ways: either using pre-computed durations from a pre-trained Tacotron model, or using an alignment network to learn durations from the dataset. The alignment network is only used during training and discarded at inference. You can choose the mode by setting the use_aligner field in the configuration.
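To make the role of durations concrete, here is a minimal sketch of length regulation, the step in duration-based feed-forward TTS models that expands one vector per input token into one vector per output frame. The function name length_regulate and the toy values are assumptions for illustration; this is not the ForwardTTS code.

```python
def length_regulate(encoder_outputs, durations):
    """Repeat each input-token vector by its duration (in output frames)."""
    if len(encoder_outputs) != len(durations):
        raise ValueError("one duration is required per input token")
    frames = []
    for vector, duration in zip(encoder_outputs, durations):
        frames.extend([vector] * duration)
    return frames


# Three token vectors and their durations. During training the durations come
# either from a pre-trained Tacotron model or from the alignment network; at
# inference they come from the duration predictor.
tokens = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
durations = [2, 0, 3]
frames = length_regulate(tokens, durations)
print(len(frames))
```

Whichever mode supplies the durations, the decoder only ever sees the expanded frame sequence, which is why the alignment network can be dropped at inference.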

This new API will help us design a more efficient inference runtime for all these models using ONNX-like runtime optimizers.

Old FastPitch and SpeedySpeech implementations are deprecated for the sake of this new implementation.

Fine-Tuning Documentation

This version introduces documentation for model fine-tuning. You can see it under https://tts.readthedocs.io/ when this is merged.

New Model Releases

  • English Speedy Speech model on LJSpeech

Try out:

    tts --text "This is a sample text for my model to speak." --model_name tts_models/en/ljspeech/speedy-speech

  • Fine-tuned UnivNet Vocoder

Try out:

    tts --text "This is how it is." --model_name tts_models/en/ljspeech/tacotron2-DDC_ph

- Python
Published by erogol over 4 years ago

tts - v0.2.2

🐸 v0.2.2

FastPitch model with an Aligner Network is implemented with other changes accompanying it.

  • Alignment Network: https://arxiv.org/abs/2108.10447
  • Fast Pitch Model: https://arxiv.org/abs/2006.06873

Thanks to 👑 @kaiidams for his Japanese g2p update.

Try FastPitch model:

    tts --model_name tts_models/en/ljspeech/fast_pitch --text "This is my sample text to voice."

- Python
Published by erogol over 4 years ago

tts - v0.2.1

🐸 v0.2.1

🐞Bug Fixes

  • Fix distributed training and solve compat issues with the Trainer API.
  • Fix bugs in the VITS model implementation that caused training instabilities.
  • Fix some Abstract Class usage issues in WaveRNN and WaveGrad models.

💾 Code updates

  • Use a single gradient scaler for all the optimizers in TrainerAPI. Previously, we used one scaler per optimizer.

🏃‍♀️Operational Updates

  • Update to Pylint 2.10.2

Thanks to 👑 @fijipants for his fixes 🛠️

Thanks to 👑 @agrinh for his flag and discussion in DDP issues

- Python
Published by erogol over 4 years ago

tts - v0.2.0

🐸 v0.2.0

🐞Bug Fixes

  • Fix phoneme pre-compute issue.
  • Fix multi-speaker setup in Tacotron models.
  • Fix small issues in the Trainer regarding multi-optimizer training.

💾 Code updates

  • W&B integration for model logging and experiment tracking (👑 @AyushExel). The code uses Tensorboard by default. For W&B, set the log_dashboard option in the config and define project_name and wandb_entity.
  • Use fsspec for model saving/loading (👑 @agrinh)
  • Allow models to define their own symbol list with in-class make_symbols()
  • Allow choosing after epoch or after step LR scheduler update with scheduler_after_epoch.
  • Make converting spectrogram from amplitude to DB optional with do_amp_to_db_linear and do_amp_to_db_mel options.
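The effect of scheduler_after_epoch listed above can be sketched with a toy loop that only counts scheduler updates. The function run_training is hypothetical, and treating the option as a boolean flag is an assumption; a real loop would call the scheduler's step method where the counter is incremented.

```python
def run_training(num_epochs, steps_per_epoch, scheduler_after_epoch):
    """Count how often a toy LR scheduler would be stepped."""
    scheduler_steps = 0
    for _ in range(num_epochs):
        for _ in range(steps_per_epoch):
            # ... forward, backward, and the optimizer update would happen here ...
            if not scheduler_after_epoch:
                scheduler_steps += 1  # update the LR after every step
        if scheduler_after_epoch:
            scheduler_steps += 1  # update the LR once per epoch
    return scheduler_steps


print(run_training(3, 100, scheduler_after_epoch=True))
print(run_training(3, 100, scheduler_after_epoch=False))
```

The choice matters because schedule hyperparameters such as warmup length are counted in scheduler updates, so the same value means very different things in the two modes.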

🗒️ Docs updates

  • Add GlowTTS and VITS docs.

🤖 Model implementations

  • VITS implementation with pre-trained models (https://arxiv.org/abs/2106.06103)

🚀 Model releases

  • vocoder_models--ja--kokoro--hifigan_v1 (👑 @kaiidams)

    HiFiGAN model trained on Kokoro dataset to complement the existing Japanese model.

    Try it out:

    tts --model_name tts_models/ja/kokoro/tacotron2-DDC --text "こんにちは、今日はいい天気ですか?"

  • tts_models--en--ljspeech--tacotronDDC_ph

    TacotronDDC with phonemes trained on LJSpeech. It fixes the pronunciation errors caused by the raw text in the released TacotronDDC model.

    Try it out:

    tts --model_name tts_models/en/ljspeech/tacotronDDC_ph --text "hello, how are you today?"

  • tts_models--en--ljspeech--vits

    VITS model trained on LJSpeech.

    Try it out:

    tts --model_name tts_models/en/ljspeech/vits --text "hello, how are you today?"

  • tts_models--en--vctk--vits

    VITS model trained on VCTK with multi-speaker support.

    Try it out:

    tts-server --model_name tts_models/en/vctk/vits

  • vocoder_models--en--ljspeech--univnet

    UnivNet model trained on LJSpeech to complement the TacotronDDC model above.

    Try it out:

    tts --model_name tts_models/en/ljspeech/tacotronDDC_ph --text "hello, how are you today?"

- Python
Published by erogol over 4 years ago

tts - v0.1.3

🐸 v0.1.3

🐞Bug Fixes

  • Fix Tacotron stopnet training

    Models trained after v0.1 had the problem that the stopnet was not trained. It caused models not to generate audio at evaluation and inference time.

  • Fix test_run at training. (👑 @WeberJulian)

    In training :frog: TTS would skip the test_run and not generate test audio samples. Now it is fixed :).

  • Fix server.py for multi-speaker models.

💾 Code updates

  • Refactoring in compute_embeddings.py for efficiency and compatibility with the latest speaker encoder. (👑 @Edresson)

🚀 Model releases

  • New Fullband-MelGAN model for Thorsten German dataset. (👑 @thorstenMueller)

Try it:

    tts --model_name tts_models/de/thorsten/tacotron2-DCA --text "Was geschehen ist geschehen, es ist geschichte."

- Python
Published by erogol over 4 years ago

tts - v0.1.2

- Python
Published by erogol over 4 years ago

tts - v0.1.1

  • Fix #607 and #608

- Python
Published by erogol over 4 years ago

tts - v0.1.0

🐸 v0.1.0

In a nutshell, there are a ton of updates in this release. I don't know if we can cover them all here but let's try.

After this release, 🐸 TTS stands on the following architecture.

  • Trainer API for training.
  • Synthesizer API for inference.
  • ModelManager API for managing 🐸TTS model zoo.
  • SpeakerManager API for managing speakers in a multi-speaker setting.
  • (TBI) Exporter API for exporting models to ONNX, TorchScript, etc.
  • (TBI) Data Processing API for making a dataset ready for training.
  • Model API for implementing models, compatible with all the other components above.

Updates

💾 Code updates

  • Brand new Trainer API

    We unified all the training code in a lightweight but feature complete Trainer API. From now on all the 🐸TTS models will use this new API for training.

    It provides mixed precision (with Nvidia's APEX or torch.amp) and multi-gpu training for all the models.

  • Brand new Model API

    The abstract BaseModel and its BaseTTS and BaseVocoder child classes are now used as the basis of the 🐸TTS models. Any model that implements one of these classes works seamlessly with the Trainer and Synthesizer.

  • Brand new 🐸TTS recipes.

    We decided to merge the recipes to the main project. Now we host recipes for the LJspeech dataset, covering all the implemented models. So you can pick the model you want, change the parameters, and train your own model easily.

    Thanks to the new Trainer API and 👩‍✈️Coqpit integration, we could implement these recipes with pure python.

  • Updates SpeakerManager API

    TTS.utils.SpeakerManager is now the core unit to manage speakers in a multi-speaker model and interface a SpeakerEncoder model with the tts and vocoder models.

  • Updated model training mechanics.

    You can now use pure Python to define your model and run the training. This is useful for training models in a Jupyter Notebook or other Python environments.

    We also keep the old mechanics via TTS/bin/train_tts.py or TTS/bin/train_vocoder.py. You just need to replace the previous training script name with one of these two, based on your model.

    python TTS/bin/train_tacotron.py --config_path config.json

    becomes

    python TTS/bin/train_tts.py --config_path config.json

  • Use 👩‍✈️Coqpit for managing model class arguments.

    Now all the model arguments are defined in a coqpit class and imported by the model config.

  • gruut based character to phoneme conversion. (👑 @synesthesiam)

    This is a drop-in replacement for the previous solution and is compatible with the released models, so all these models are functional again without version nitpicking.

  • Set test_sentences in the config rather than providing a txt file.

  • Set the maximum number of decoder steps of Tacotron1-2 models in the config.

🏃‍♀️ Operational Updates

  • FINALLY DOCUMENTATION!! https://tts.readthedocs.io
  • Enable support for Python 3.9
  • Changes for PyTorch 1.9.0

🏅 Model implementations

  • Univnet GAN Vocoder: https://arxiv.org/pdf/2106.07889.pdf (👑 @rishikksh20)

🚀 Model releases

We solved the compat issues and re-released some of the models. You can see them in the released binaries section.

You don't need to change anything. If you use v0.1.0, by default, it uses these new models.

- Python
Published by erogol over 4 years ago

tts - v0.0.15.1

  • Fix manifest file to include VERSION for pypi distribution.

- Python
Published by erogol over 4 years ago

tts - v0.0.15

🐸 v0.0.15

🐞Bug Fixes

  • [x] Fix tb_logger init for rank > 0 processes in distributed training.

💾 Code updates

  • [x] Refactoring and optimization in the speaker encoder module. (:crown: @Edresson )
  • [x] Replacing unidecode with anyascii
  • [x] Japanese text to phoneme conversion. (:crown: @kaiidams)
  • [x] Japanese tts recipe to train Tacotron2-DDC on Kokoro dataset (:crown: @kaiidams)

:walking_woman: Operational Updates

  • [x] Start using pylint == 2.8.3
  • [x] Reorg tests files.
  • [x] Upload to pypi automatically on release.
  • [x] Move VERSION file under TTS folder.

🏅 Model implementations

  • [x] New Speaker Encoder implementation based on https://arxiv.org/abs/2009.14153 (:crown: @Edresson )

🚀 New Pre-Trained Model Releases

  • [x] Japanese Tacotron model (:crown: @kaiidams)

:bulb: All the models below are available by tts or tts-server endpoints on CLI as explained here.

- Python
Published by erogol over 4 years ago

tts - v0.0.14

:frog: v0.0.14

🐞Bug Fixes

  • [x] Remove breaking line from Tacotron models. (👑 @a-froghyar)

💾 Code updates

  • [x] BREAKING: Coqpit integration for config management and the first 🐸TTS recipe, for LJSpeech. Check #476.

Every model is now tied to a Python class that defines the configuration scheme. It provides a better interface and makes the default values, expected value types, and mandatory fields explicit to the user.

Specific model configs are defined under TTS/tts/configs and TTS/vocoder/configs. TTS/config/shared_configs.py hosts configs that are shared by all the :frog: TTS models. Configs shared by tts models are hosted under TTS/tts/configs/shared_configs.py and those shared by vocoder models are under TTS/vocoder/configs/shared_config.py.
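The layering of shared and model-specific configs can be sketched with plain dataclasses standing in for Coqpit. The class names mirror the chain named in these notes, but every field and default value below is invented for illustration and does not reflect the real config classes.

```python
from dataclasses import dataclass, asdict


@dataclass
class BaseTrainingConfig:
    # Hypothetical fields shared by every model.
    batch_size: int = 32
    epochs: int = 1000


@dataclass
class BaseTTSConfig(BaseTrainingConfig):
    # Hypothetical fields shared by the tts models.
    use_phonemes: bool = False
    text_cleaner: str = "basic_cleaners"


@dataclass
class TacotronConfig(BaseTTSConfig):
    # Hypothetical model-specific fields.
    model: str = "tacotron"
    r: int = 2


# Override one shared field; everything else keeps its documented default.
config = TacotronConfig(batch_size=16)
print(asdict(config))
```

Because each level only declares what it adds, a user can read one class to see the defaults and types for a model, and overriding a shared field works the same way for every model.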

For example, TacotronConfig follows BaseTrainingConfig -> BaseTTSConfig -> TacotronConfig.

  • [x] BREAKING: Remove phonemizer support due to License conflict.

This essentially deprecates the support for all the models using phonemes as input. Feel free to suggest in-place options if you are affected by this change.

  • [x] Start hosting :woman_cook: recipes under :frog: TTS. The first recipe is for Tacotron2-DDC with LJspeech dataset under TTS/recipes/.

Please check here for more details.

  • [x] Add extract_tts_spectrograms.py that supports GlowTTS and Tacotron1-2. (👑 @Edresson)
  • [x] Add version.py (👑 @chmodsss)

- Python
Published by erogol almost 5 years ago

tts - v0.0.13

:frog: v0.0.13

🐞Bug Fixes

💾 Code updates

  • SpeakerManager class for handling multi-speaker model management and interfacing speaker.json file.
  • Enabling multi-speaker models with tts and tts-server endpoints. (:crown: @kirianguiller )
  • Allow choosing a different noise scale for GlowTTS at inference.
  • Glow-TTS updates to import SC-Glow Models.
  • Fixing windows support (:crown: @WeberJulian )

:walking_woman: Operational Updates

  • Refactoring :frog: TTS installation to allow selecting different scopes (all, tf, notebooks) for installation depending on the specific needs.

🏅 Model implementations

🚀 New Pre-Trained Model Releases

  • SC-GlowTTS multi-speaker English model from our work https://arxiv.org/abs/2104.05557 (:crown: @Edresson )
  • HiFiGAN vocoder finetuned for the above model.
  • Tacotron DDC Non-Binary English model using Accenture's Sam dataset.
  • HiFiGAN vocoder trained for the models above.

Released Models

:bulb: All the models below are available by tts or tts-server endpoints on CLI as explained here.

Models with ✨️ below are new with this release.

  • SC-GlowTTS model is from our latest paper in a collaboration with @Edresson and @mueller91.
  • The new non-binary TTS model is trained using the SAM dataset from Accenture Labs. Check out their blog post

| Language | Dataset | Model Name | Model Type | TTS version | Download |
| -- | -- | -- | -- | -- | :-: |
| :sparkles: English (non-binary) | sam (accenture) | Tacotron2-DDC | tts | :smile: v0.0.13 | :floppy_disk: |
| :sparkles: English (multi-speaker) | VCTK | SC-GlowTTS | tts | :smile: v0.0.13 | :floppy_disk: |
| English | LJSpeech | Tacotron-DDC | tts | v0.0.12 | :floppy_disk: |
| German | Thorsten-DE | Tacotron-DCA | tts | v0.0.11 | :floppy_disk: |
| German | Thorsten-DE | Wavegrad | vocoder | v0.0.11 | :floppy_disk: |
| English | LJSpeech | SpeedySpeech | tts | v0.0.10 | :floppy_disk: |
| English | EK1 | Tacotron2 | tts | v0.0.10 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | :floppy_disk: |
| English | LJSpeech | TacotronDCA | tts | v0.0.9 | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | v0.0.9 | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| :sparkles: English | sam (accenture) | HiFiGAN | vocoder | :smile: v0.0.13 | :floppy_disk: |
| :sparkles: English | VCTK | HiFiGAN | vocoder | :smile: v0.0.13 | :floppy_disk: |
| English | LJSpeech | HiFiGAN | vocoder | v0.0.12 | :floppy_disk: |
| English | EK1 | WaveGrad | vocoder | v0.0.10 | :floppy_disk: |
| Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | :floppy_disk: |

Update Jun 7 2021: Ruslan (Russian) model has been removed due to the license conflict.

- Python
Published by erogol almost 5 years ago

tts - v0.0.12

:frog: v0.0.12

🐞Bug Fixes

  • [x] fix #419 (This is a crucial bug fix).
  • [x] fix #408

💾 Code updates

  • [x] Enable logging model config.json on Tensorboard. #418
  • [x] Update code style standards and use a Makefile to ease regular tasks. #423
  • [x] Enable using Tacotron.prenet.dropout at inference time. This leads to a better quality with some models.
  • [x] Update default tts model to LJspeech TacotronDDC.
  • [x] Show the real waveform on Tensorboard in GAN vocoder training.

:walking_woman: Operational Updates

🏅 Model implementations

  • [x] initial HiFiGAN implementation (:crown: @rishikksh20 @erogol) #422

🚀 New Pre-Trained Model Releases

  • [ ] ~~Universal HifiGAN model~~(postponed to the next version for :crown: @Edresson's updated model.)
  • [x] LJSpeech, Tacotron2 Double Decoder Consistency v2 model. Check our blog post to learn more about Double Decoder Consistency.
  • [x] LJSpeech HifiGAN model.

Released Models

:bulb: All the models below are available by tts end point as explained here.

| Language | Dataset | Model Name | Model Type | TTS version | Download |
| -- | -- | -- | -- | -- | :-: |
| :sparkles: English | LJSpeech | Tacotron-DDC | tts | :smiley: v0.0.12 | :floppy_disk: |
| German | Thorsten-DE | Tacotron-DCA | tts | v0.0.11 | :floppy_disk: |
| German | Thorsten-DE | Wavegrad | vocoder | v0.0.11 | :floppy_disk: |
| English | LJSpeech | SpeedySpeech | tts | v0.0.10 | :floppy_disk: |
| English | EK1 | Tacotron2 | tts | v0.0.10 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | :floppy_disk: |
| English | LJSpeech | TacotronDCA | tts | v0.0.9 | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | v0.0.9 | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| :sparkles: English | LJSpeech | HiFiGAN | vocoder | :smiley: v0.0.12 | :floppy_disk: |
| English | EK1 | WaveGrad | vocoder | v0.0.10 | :floppy_disk: |
| Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | :floppy_disk: |

- Python
Published by erogol almost 5 years ago

tts - v0.0.11

:frog: v0.0.11

🐞Bug Fixes

  • [x] Fixed #374. (Thx for reporting @a-froghyar )

💾 Code updates

  • [x] /bin/resample.py to resample wavefiles (:crown: @WeberJulian)
  • [x] Some updates for Windows compat. (:crown: @GuyPaddock)
  • [x] Fixing CheckSpectrogram notebook. (:crown: @GuyPaddock)
  • [x] Fix #392

:walking_woman: Operational Updates

🏅 Model implementations

  • [x] initial AlignTTS implementation. (https://github.com/coqui-ai/TTS/pull/398)
  • [ ] initial HiFiGAN implementation (:crown: @rishikksh20) (postponed to the next release)

🚀 New Pre-Trained Model Releases

Released Models

:bulb: All the models below are available by tts end point as explained here.

| Language | Dataset | Model Name | Model Type | TTS version | Download |
| -- | -- | -- | -- | -- | :-: |
| :sparkles: German | Thorsten-DE | Tacotron-DCA | tts | :smiley: v0.0.11 | :floppy_disk: |
| :sparkles: German | Thorsten-DE | Wavegrad | vocoder | :smiley: v0.0.11 | :floppy_disk: |
| English | LJSpeech | SpeedySpeech | tts | v0.0.10 | :floppy_disk: |
| English | EK1 | Tacotron2 | tts | v0.0.10 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| Chinese | Baker | TacotronDDC-GST | tts | v0.0.10 | :floppy_disk: |
| English | LJSpeech | TacotronDCA | tts | v0.0.9 | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | v0.0.9 | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | v0.0.10 | :floppy_disk: |
| English | EK1 | WaveGrad | vocoder | v0.0.10 | :floppy_disk: |
| Dutch | MAI | ParallelWaveGAN | vocoder | v0.0.10 | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | :floppy_disk: |

- Python
Published by erogol almost 5 years ago

tts - v0.0.10

:frog: v0.0.10

🐞Bug Fixes

  • [x] Make synthesizer.py save the output audio with the vocoder sampling rate. This is necessary when the sampling rates of the tts and the vocoder models differ and interpolation is applied to the tts model output before running the vocoder. In practice, it fixes the Spanish and French voices generated by tts or tts-server on the terminal.
  • [x] Handling utf-8 on Windows. (by @adonispujols)
  • [x] Fix Loading the last model when --continue_training. It was loading the best_model regardless.
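The sampling-rate fix in the first item above can be pictured with a toy example: the tts output is stretched by the ratio of the two sampling rates, and the audio is then saved with the vocoder's rate. This is a pure-Python sketch under assumed rates and a hypothetical helper (interpolate_frames) working on a 1-D sequence; the real synthesizer operates on spectrograms and may use a different interpolation routine.

```python
def interpolate_frames(frames, scale):
    """Linearly resample a 1-D sequence to round(len * scale) points."""
    target_len = max(1, round(len(frames) * scale))
    if target_len == 1 or len(frames) == 1:
        return [frames[0]] * target_len
    step = (len(frames) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        left = int(pos)
        right = min(left + 1, len(frames) - 1)
        frac = pos - left
        # Weighted average of the two nearest original points.
        out.append(frames[left] * (1 - frac) + frames[right] * frac)
    return out


tts_sample_rate = 16000      # assumed tts model rate
vocoder_sample_rate = 24000  # assumed vocoder rate
scale = vocoder_sample_rate / tts_sample_rate

tts_output = [0.0, 1.0, 2.0, 3.0]
vocoder_input = interpolate_frames(tts_output, scale)

# The waveform produced by the vocoder must be written with the vocoder's
# rate; using the tts rate would play the audio back at the wrong speed.
output_sample_rate = vocoder_sample_rate
print(len(vocoder_input), output_sample_rate)
```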

💾 Code updates

  • [x] Breaking Change: Update default set of characters in symbols.py. This might require you to set your character set in config.json if you would like to use this version with models trained with the previous version.
  • [x] Chinese backend for text processing (#654 by @kirianguiller)
  • [x] Enable torch.hub integration for the released models.
  • [x] First github release.
  • [x] dep. version fixes. Using numpy > 1.17.5 breaks some tests.
  • [x] WaveRNN fix (by @gerazov )
  • [x] Big refactoring for the training scripts to share the init part of the code. (by @gerazov)
  • [x] Enable ModelManager to download models from Github releases.
  • [x] Add a test for compute_statistics.py
  • [x] light-touch updates in tts and tts-server entry points. (thanks @thorstenMueller )
  • [x] Define default vocoder models for each tts model in .models.json. tts and tts-server entry points use the default vocoder if the user does not specify.
  • [x] find_unique_chars.py to find all the unique characters in a dataset.
  • [x] A better way to handle best models through training. (thx @gerazov )
  • [x] Pass used characters to the model config.json at the beginning of the training. This prevents any later code update from affecting the trained models.
  • [x] Migration to Github Actions for CI.
  • [x] Deprecate wheel based use of tts-server for the sake of the new design.
  • [x] :frog:

:walking_woman: Operational Updates

  • [x] Move released models to Github Releases and deprecate GDrive being the first option.

🏅 Model implementations

  • No updates 😓

🚀 New Pre-Trained Model Releases

  • [x] English ek1 - Tacotron2 model and WaveGrad vocoder under .models.json. (huge THX!! to @nmstoker)
  • [x] Russian Ruslan - Tacotron2-DDC model.
  • [x] Dutch model. (huge THX!! to @r-dh )
  • [x] Chinese Tacotron2 model. (huge THX!! to @kirianguiller)
  • [x] English LJSpeech - SpeedySpeech with WaveNet decoder.

Released Models

:bulb: All the models below are available by tts end point as explained here.

| Language | Dataset | Model Name | Model Type | TTS version | Download |
| -- | -- | -- | -- | -- | :-: |
| English | LJSpeech | SpeedySpeech | tts | :smiley: v0.0.10 | :floppy_disk: |
| English | EK1 | Tacotron2 | tts | :smiley: v0.0.10 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | :smiley: v0.0.10 | :floppy_disk: |
| Chinese | Baker | TacotronDDC-GST | tts | :smiley: v0.0.10 | :floppy_disk: |
| English | LJSpeech | TacotronDCA | tts | v0.0.9 | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | v0.0.9 | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | v0.0.9 | :floppy_disk: |
| Dutch | MAI | TacotronDDC | tts | :smiley: v0.0.10 | :floppy_disk: |
| English | EK1 | WaveGrad | vocoder | :smiley: v0.0.10 | :floppy_disk: |
| Dutch | MAI | ParallelWaveGAN | vocoder | :smiley: v0.0.10 | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | v0.0.9 | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | v0.0.9 | :floppy_disk: |

- Python
Published by erogol almost 5 years ago

tts - v0.0.9

:frog: TTS v0.0.9 - the first release :tada:

This is the first release of :frog:TTS, v0.0.9. :frog:TTS is still an evolving project, and any upcoming release might be significantly different and not backward compatible.

In this release, we provide the following models.

| Language | Dataset | Model Name | Model Type | Download |
| -- | -- | -- | -- | -- |
| English | LJSpeech | TacotronDCA | tts | :floppy_disk: |
| English | LJSpeech | Glow-TTS | tts | :floppy_disk: |
| Spanish | M-AILabs | TacotronDDC | tts | :floppy_disk: |
| French | MAILabs | TacotronDDC | tts | :floppy_disk: |
| English | LJSpeech | MB-MelGAN | vocoder | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | FullBand-MelGAN | vocoder | :floppy_disk: |
| :earth_africa: Multi-Lang | LibriTTS | WaveGrad | vocoder | :floppy_disk: |

Notes

  • Multi-Lang vocoder models are intended for non-English models.
  • Vocoder models are independently trained from the tts models with possibly different sampling rates. Therefore, the performance is not optimal.
  • All models are trained with phonemes generated by espeak back-end (not espeak-ng).
  • This release has been tested under Python 3.6, 3.7, and 3.8. It is strongly suggested to use conda to install the dependencies and set up the environment.

Edit:

(22.03.2021) - Fullband Universal Vocoder is corrected with the right model files. Previously, we released the wrong model with that name.

- Python
Published by erogol almost 5 years ago