Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.8%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Ishank56
  • License: mpl-2.0
  • Language: Python
  • Default Branch: master
  • Size: 133 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

Colab link for training the Marathi model:

https://colab.research.google.com/drive/10T9VFTJ5uCg679Dg7IBQCIr7EQtMn_ZN?usp=sharing

Dataset used from:

https://www.openslr.org/103/ (VCTK format)

Training VITS Hindi TTS

Introduction

This repository contains code for training a Text-to-Speech (TTS) model for the Hindi language using the VITS model. VITS is known for its high-quality speech synthesis.

Installation

To get started with training the Hindi TTS model, follow these steps:

  1. Clone the repository and install it in editable mode:

     git clone https://github.com/Ishank56/vits_using_coqui.tts.git
     cd vits_using_coqui.tts
     pip install -e .

  2. Ensure that the Hindi dataset is available inside the Dataset folder. The Hindi data I used can be downloaded from here. The data should be formatted like the LJSpeech-1.1 dataset for compatibility.

Dataset I used for training: https://keithito.com/LJ-Speech-Dataset/

  1. Install all required libraries for phonemizing Hindi text; the eSpeak library is particularly useful for this purpose. When using the VITS model, config.json must be configured accordingly for your files.
  2. Adjust the parameters in the code according to your requirements; the remaining parameters should already be set for the Hindi dataset.
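As a rough sketch, the phoneme-related fields of config.json for Hindi with eSpeak might look like the dictionary below. The key names follow Coqui TTS conventions (`use_phonemes`, `phonemizer`, `phoneme_language`, `phoneme_cache_path`), but verify them against the config schema of your installed version:

```python
import json

# Hypothetical excerpt of a VITS config.json for Hindi. Key names follow
# Coqui TTS conventions but should be checked against your installed version.
hindi_phoneme_config = {
    "use_phonemes": True,           # convert text to phonemes before training
    "phonemizer": "espeak",         # eSpeak can phonemize Devanagari input
    "phoneme_language": "hi",       # eSpeak language code for Hindi
    "phoneme_cache_path": "phoneme_cache",  # reuse phonemized text between runs
}

print(json.dumps(hindi_phoneme_config, indent=2))
```

These fields would be merged into the full config.json alongside the dataset and audio settings.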

Dataset

The dataset provided in the Dataset folder contains Hindi text data and the corresponding WAV files. It follows the same format as the LJSpeech-1.1 dataset for consistency. Ensure that the dataset is properly formatted and organized before proceeding with training.
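For reference, LJSpeech-style metadata is a pipe-separated metadata.csv with one row per audio clip: clip id, raw text, normalized text. A minimal sketch of writing and reading such a file (the ids and Hindi sentences are illustrative):

```python
import csv
import tempfile
from pathlib import Path

# One LJSpeech-style row per clip: <wav id>|<raw text>|<normalized text>.
# The id names Dataset/wavs/<id>.wav; the Hindi sentences are illustrative.
rows = [
    ("hi_0001", "यह एक परीक्षण वाक्य है.", "यह एक परीक्षण वाक्य है."),
    ("hi_0002", "नमस्ते दुनिया.", "नमस्ते दुनिया."),
]

meta = Path(tempfile.mkdtemp()) / "metadata.csv"
with meta.open("w", encoding="utf-8", newline="") as f:
    csv.writer(f, delimiter="|").writerows(rows)

# Read it back the way a TTS data loader typically would.
with meta.open(encoding="utf-8", newline="") as f:
    parsed = [tuple(r) for r in csv.reader(f, delimiter="|")]
```

Each id must match a WAV file in the dataset's wavs directory, mirroring LJSpeech-1.1's layout.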

Training

To train the TTS model for Hindi, run the train_tts.py file. It contains the code for training the model using the VITS architecture. Make sure all dependencies are installed and the dataset is properly configured before starting training.

```bash
nvidia-smi                                 # check which GPUs are available
CUDA_VISIBLE_DEVICES="5" python train.py   # pick the GPU and run the training script
```
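Equivalently, the GPU can be pinned from inside Python, provided it happens before any CUDA-using library is imported (a sketch; the device index "5" mirrors the command above):

```python
import os

# CUDA_VISIBLE_DEVICES must be set before torch (or any CUDA runtime) is
# imported; setting it afterwards has no effect on the current process.
os.environ["CUDA_VISIBLE_DEVICES"] = "5"

# From this point on, physical GPU 5 appears to the process as device 0, e.g.:
# import torch; torch.cuda.set_device(0)
print(os.environ["CUDA_VISIBLE_DEVICES"])
```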

Inference

```bash
pip install TTS

tts --text "यह अपनत्व और उत्कर्ष गुलज़ार की पूरी ज़िदगी और उनके अनेक अन्य कार्यों में आसानी से लक्षित हो जा सकती है." \
    --model_path path/to/model.pth \
    --config_path path/to/config.json \
    --out_path folder/to/save/output.wav
```

Alternatively, run the test script:

```bash
python testing.py
```
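When synthesizing many sentences, the same tts invocation can be assembled and launched from Python. A sketch (the paths are placeholders, and the flags mirror the command above):

```python
import subprocess

def build_tts_cmd(text: str, model: str, config: str, out: str) -> list[str]:
    """Assemble the tts CLI command shown above as an argument list."""
    return [
        "tts",
        "--text", text,
        "--model_path", model,
        "--config_path", config,
        "--out_path", out,
    ]

cmd = build_tts_cmd(
    "नमस्ते दुनिया.",          # illustrative sentence
    "path/to/model.pth",       # placeholder checkpoint path
    "path/to/config.json",     # placeholder config path
    "output/sample_0.wav",     # placeholder output path
)
# subprocess.run(cmd, check=True)  # uncomment once TTS is installed
```

Passing the command as a list avoids shell quoting issues with Devanagari text.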

Contributing

The project is working and generates good synthesized output. Contributions are welcome! If you have any improvements or suggestions, feel free to open an issue or submit a pull request.

License

This project is licensed under MPL-2.0. Credit to https://github.com/coqui-ai/TTS and the dataset contributors mentioned above.

Owner

  • Login: Ishank56
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you want to cite 🐸💬, feel free to use this (but only if you loved it 😊)"
title: "Coqui TTS"
abstract: "A deep learning toolkit for Text-to-Speech, battle-tested in research and production"
date-released: 2021-01-01
authors:
  - family-names: "Eren"
    given-names: "Gölge"
  - name: "The Coqui TTS Team"
version: 1.4
doi: 10.5281/zenodo.6334862
license: "MPL-2.0"
url: "https://www.coqui.ai"
repository-code: "https://github.com/coqui-ai/TTS"
keywords:
  - machine learning
  - deep learning
  - artificial intelligence
  - text to speech
  - TTS

GitHub Events


Dependencies

.github/workflows/aux_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/data_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/docker.yaml actions
  • actions/checkout v2 composite
  • docker/build-push-action v2 composite
  • docker/login-action v1 composite
  • docker/setup-buildx-action v1 composite
  • docker/setup-qemu-action v1 composite
.github/workflows/inference_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/pypi-release.yml actions
  • actions/checkout v3 composite
  • actions/download-artifact v2 composite
  • actions/setup-python v2 composite
  • actions/upload-artifact v2 composite
.github/workflows/style_check.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/text_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/tts_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/tts_tests2.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/vocoder_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/xtts_tests.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/zoo_tests0.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/zoo_tests1.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/zoo_tests2.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
Dockerfile docker
  • ${BASE} latest build
recipes/bel-alex73/docker-prepare/Dockerfile docker
  • ubuntu 22.04 build
TTS/demos/xtts_ft_demo/requirements.txt pypi
  • faster_whisper ==0.9.0
  • gradio ==4.7.1
TTS/encoder/requirements.txt pypi
  • numpy >=1.17.0
  • umap-learn *
TTS/tts/utils/monotonic_align/setup.py pypi
docs/requirements.txt pypi
  • furo *
  • linkify-it-py *
  • myst-parser ==2.0.0
  • sphinx ==7.2.5
  • sphinx_copybutton *
  • sphinx_inline_tabs *
pyproject.toml pypi
requirements.dev.txt pypi
  • black * development
  • coverage * development
  • isort * development
  • nose2 * development
  • pylint ==2.10.2 development
requirements.ja.txt pypi
  • cutlet *
  • mecab-python3 ==1.0.6
  • unidic-lite ==1.0.8
requirements.notebooks.txt pypi
  • bokeh ==1.4.0
requirements.txt pypi
  • aiohttp >=3.8.1
  • anyascii >=0.3.0
  • bangla *
  • bnnumerizer *
  • bnunicodenormalizer *
  • coqpit >=0.0.16
  • cython >=0.29.30
  • einops >=0.6.0
  • encodec >=0.1.1
  • flask >=2.0.1
  • fsspec >=2023.6.0
  • g2pkk >=0.1.1
  • gruut ==2.2.3
  • hangul_romanize *
  • inflect >=5.6.0
  • jamo *
  • jieba *
  • librosa >=0.10.0
  • matplotlib >=3.7.0
  • mutagen ==1.47.0
  • nltk *
  • num2words *
  • numba >=0.57.0
  • numba ==0.55.1
  • numpy >=1.24.3
  • numpy ==1.22.0
  • packaging >=23.1
  • pandas >=1.4,<2.0
  • pypinyin *
  • pysbd >=0.3.4
  • pyyaml >=6.0
  • scikit-learn >=1.3.0
  • scipy >=1.11.2
  • soundfile >=0.12.0
  • spacy >=3
  • torch >=2.1
  • torchaudio *
  • tqdm >=4.64.1
  • trainer >=0.0.36
  • transformers >=4.33.0
  • umap-learn >=0.5.1
  • unidecode >=1.3.2
setup.py pypi