vits_text_to_speech
Science Score: 44.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file (found)
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references (not found)
- ○ Academic publication links (not found)
- ○ Academic email domains (not found)
- ○ Institutional organization owner (not found)
- ○ JOSS paper metadata (not found)
- ○ Scientific vocabulary similarity: low similarity (12.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: Ishank56
- License: mpl-2.0
- Language: Python
- Default Branch: master
- Size: 133 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Colab link for training the Marathi model:
https://colab.research.google.com/drive/10T9VFTJ5uCg679Dg7IBQCIr7EQtMn_ZN?usp=sharing
Dataset used (VCTK format):
https://www.openslr.org/103/
Training VITS for Hindi TTS
Introduction
This repository contains code for training a Text-to-Speech (TTS) model for the Hindi language using the VITS architecture, which is known for its high-quality speech synthesis.
Installation
To get started with training the Hindi TTS model, follow these steps:
1. Clone the repository and install it in editable mode:
```shell
git clone https://github.com/Ishank56/vits_using_coqui.tts.git
cd vits_using_coqui.tts
pip install -e .
```
2. Ensure that the Hindi dataset is available inside the Dataset folder. The data should be formatted like the LJSpeech-1.1 dataset for compatibility. Dataset I used for training: https://keithito.com/LJ-Speech-Dataset/
3. Install all libraries required for phonemizing Hindi text; the espeak library is particularly useful for this purpose. When using the VITS model, config.json needs to be set accordingly for your files.
4. Adjust the parameters in the code to your requirements; the remaining parameters should already be set for the Hindi dataset.
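The config.json adjustments from step 3 can be sketched as a small script. This is a minimal sketch, not the repository's own tooling: the keys `use_phonemes`, `phoneme_language`, and `datasets` are assumed from the usual Coqui TTS config schema, so check the keys in your actual config.json before applying.

```python
import json

def update_config(path, dataset_path, language="hi"):
    """Point a Coqui-style TTS config at a Hindi dataset (key names assumed)."""
    with open(path, encoding="utf-8") as f:
        cfg = json.load(f)
    cfg["use_phonemes"] = True          # phonemize Hindi text (e.g. via espeak)
    cfg["phoneme_language"] = language  # espeak language code for Hindi
    cfg["datasets"] = [{
        "formatter": "ljspeech",        # LJSpeech-style metadata.csv layout
        "path": dataset_path,
    }]
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cfg, f, indent=2, ensure_ascii=False)
    return cfg
```

Run it once against the generated config before launching training, e.g. `update_config("config.json", "Dataset/")`.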
Dataset
The dataset provided in the Dataset folder contains text data and corresponding WAV files in Hindi. It follows the same format as the LJSpeech-1.1 dataset for consistency. Ensure that the dataset is properly formatted and organized before proceeding with training.
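Following the LJSpeech-1.1 layout means a `metadata.csv` whose pipe-delimited rows map a WAV id to its transcription. The loader below is a minimal sketch for checking that layout before training; the helper name and the choice to prefer the last (normalized) column are mine, not from this repository.

```python
import csv
from pathlib import Path

# LJSpeech-1.1 layout: <root>/wavs/<id>.wav plus <root>/metadata.csv with rows
#   <id>|<raw transcription>|<normalized transcription>
def load_metadata(root):
    rows = []
    with open(Path(root) / "metadata.csv", encoding="utf-8", newline="") as f:
        for parts in csv.reader(f, delimiter="|", quoting=csv.QUOTE_NONE):
            if len(parts) < 2:
                raise ValueError(f"malformed metadata row: {parts!r}")
            wav_id, text = parts[0], parts[-1]  # use normalized text when present
            rows.append((Path(root) / "wavs" / f"{wav_id}.wav", text))
    return rows
```

Calling `load_metadata("Dataset")` returns (wav path, text) pairs, which makes it easy to assert that every referenced WAV file actually exists before starting a long training run.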
Training
To train the TTS model for Hindi, run the train.py file. It contains the code for training the model with the VITS architecture. Make sure all dependencies are installed and the dataset is properly configured before initiating training.
```shell
nvidia-smi  # check the available GPUs
CUDA_VISIBLE_DEVICES="5" python train.py  # select the GPU and run the training script
```
Inference
```shell
pip install TTS
tts --text "यह अपनत्व और उत्कर्ष गुलज़ार की पूरी ज़िदगी और उनके अनेक अन्य कार्यों में आसानी से लक्षित हो जा सकती है." \
    --model_path path/to/model.pth \
    --config_path path/to/config.json \
    --out_path folder/to/save/output.wav
```
or
```shell
python testing.py
```
Contributing
The project is working and generates good synthesized output. Contributions are welcome! If you have any improvements or suggestions, feel free to open an issue or submit a pull request.
License
This project is licensed under MPL-2.0, inherited from Coqui TTS. Credits to https://github.com/coqui-ai/TTS and the dataset contributors mentioned above.
Owner
- Login: Ishank56
- Kind: user
- Repositories: 1
- Profile: https://github.com/Ishank56
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you want to cite 🐸💬, feel free to use this (but only if you loved it 😊)"
title: "Coqui TTS"
abstract: "A deep learning toolkit for Text-to-Speech, battle-tested in research and production"
date-released: 2021-01-01
authors:
- family-names: "Eren"
given-names: "Gölge"
- name: "The Coqui TTS Team"
version: 1.4
doi: 10.5281/zenodo.6334862
license: "MPL-2.0"
url: "https://www.coqui.ai"
repository-code: "https://github.com/coqui-ai/TTS"
keywords:
- machine learning
- deep learning
- artificial intelligence
- text to speech
- TTS
Dependencies
- actions/checkout v2, v3 composite
- actions/setup-python v2, v4 composite
- actions/download-artifact v2 composite
- actions/upload-artifact v2 composite
- docker/build-push-action v2 composite
- docker/login-action v1 composite
- docker/setup-buildx-action v1 composite
- docker/setup-qemu-action v1 composite
- ${BASE} latest build
- ubuntu 22.04 build
- faster_whisper ==0.9.0
- gradio ==4.7.1
- numpy >=1.17.0
- umap-learn *
- furo *
- linkify-it-py *
- myst-parser ==2.0.0
- sphinx ==7.2.5
- sphinx_copybutton *
- sphinx_inline_tabs *
- black * development
- coverage * development
- isort * development
- nose2 * development
- pylint ==2.10.2 development
- cutlet *
- mecab-python3 ==1.0.6
- unidic-lite ==1.0.8
- bokeh ==1.4.0
- aiohttp >=3.8.1
- anyascii >=0.3.0
- bangla *
- bnnumerizer *
- bnunicodenormalizer *
- coqpit >=0.0.16
- cython >=0.29.30
- einops >=0.6.0
- encodec >=0.1.1
- flask >=2.0.1
- fsspec >=2023.6.0
- g2pkk >=0.1.1
- gruut ==2.2.3
- hangul_romanize *
- inflect >=5.6.0
- jamo *
- jieba *
- librosa >=0.10.0
- matplotlib >=3.7.0
- mutagen ==1.47.0
- nltk *
- num2words *
- numba >=0.57.0
- numba ==0.55.1
- numpy >=1.24.3
- numpy ==1.22.0
- packaging >=23.1
- pandas >=1.4,<2.0
- pypinyin *
- pysbd >=0.3.4
- pyyaml >=6.0
- scikit-learn >=1.3.0
- scipy >=1.11.2
- soundfile >=0.12.0
- spacy >=3
- torch >=2.1
- torchaudio *
- tqdm >=4.64.1
- trainer >=0.0.36
- transformers >=4.33.0
- umap-learn >=0.5.1
- unidecode >=1.3.2