radtts-uk
High-fidelity speech synthesis for Ukrainian using modern neural networks.
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: egorsmkv
- License: MIT
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://huggingface.co/spaces/Yehor/radtts-uk-vocos-demo
- Size: 1.77 MB
Statistics
- Stars: 8
- Watchers: 1
- Forks: 0
- Open Issues: 5
- Releases: 7
Metadata Files
README.md
Text-to-Speech for Ukrainian
High-fidelity speech synthesis for Ukrainian using modern neural networks.
Demo
Try the demo in our Hugging Face Space, or listen to samples here.
Features
- Multi-speaker model: 2 female (Tetiana, Lada) + 1 male (Mykyta) voices;
- Fine-grained control over speech parameters, including duration, fundamental frequency (F0), and energy;
- High-fidelity speech generation using the RAD-TTS++ acoustic model;
- Fast vocoding using Vocos;
- Synthesizes long sentences effectively;
- Supports a sampling rate of 44.1 kHz;
- Tested on Linux environments and Windows/WSL;
- Python API (requires Python 3.9 or later);
- CUDA-enabled for GPU acceleration.
Installation
```shell
# Install from PyPI
pip install tts-uk

# OR, for the latest development version:
pip install git+https://github.com/egorsmkv/tts_uk

# OR, use git and a local setup
git clone https://github.com/egorsmkv/tts_uk
cd tts_uk
uv sync # uv will handle the virtual environment
```
If uv is not installed yet, read uv's installation section.
Also, you can download the repository as a ZIP archive.
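Before running the examples, it can help to confirm that the environment matches the stated requirements. The following is a minimal sketch, not part of the package: the helper names `python_is_supported` and `package_is_installed` are hypothetical, and the only facts taken from this README are the minimum Python version (3.9) and the module name `tts_uk`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 9)  # minimum version stated in the feature list


def python_is_supported(version_info=None):
    """Return True if the interpreter meets the stated minimum version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def package_is_installed(module_name="tts_uk"):
    """Return True if the module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("tts_uk installed:", package_is_installed())
```

The check uses `importlib.util.find_spec` rather than a plain import so that it does not trigger any model loading that an import might cause.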
Getting started
Code example:
```python
import torchaudio

from tts_uk.inference import synthesis

sampling_rate = 44_100

# Perform the synthesis. The synthesis function returns:
# - mels: Mel spectrograms of the generated audio.
# - wave: The waveform synthesized by the vocoder, as a PyTorch tensor.
# - stats: A dictionary containing synthesis statistics
#   (processing time, duration, speech rate, etc.).
mels, wave, stats = synthesis(
    text="Ви можете протестувати синтез мовлення українською мовою. Просто введіть текст, який ви хочете прослухати.",
    voice="tetiana",  # tetiana, mykyta, lada
    n_takes=1,
    use_latest_take=False,
    token_dur_scaling=1,
    f0_mean=0,
    f0_std=0,
    energy_mean=0,
    energy_std=0,
    sigma_decoder=0.8,
    sigma_token_duration=0.666,
    sigma_f0=1,
    sigma_energy=1,
)

print(stats)

# Save the generated audio to a WAV file.
torchaudio.save("audio.wav", wave.cpu(), sampling_rate, encoding="PCM_S")
```
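The relationship between sample count, sampling rate, and audio duration can be sketched independently of the library. This is an illustration only: the function names are hypothetical, and it does not reproduce how the returned statistics are actually computed. With the waveform from the example above, the sample count would presumably be `wave.shape[-1]`, assuming the last dimension is time.

```python
SAMPLING_RATE = 44_100  # Hz, as stated in the feature list


def audio_duration_seconds(num_samples, sampling_rate=SAMPLING_RATE):
    """Length of the audio in seconds for a given number of samples."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    return num_samples / sampling_rate


def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio length.

    A value below 1.0 means synthesis ran faster than real time.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


duration = audio_duration_seconds(220_500)  # 220,500 samples at 44.1 kHz
print(duration)  # 5.0
print(real_time_factor(1.25, duration))  # 0.25
```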
You can also use these Google Colab notebooks:
- CPU inference
- GPU inference on T4 card (long document to synthesize)
Or run synthesis in a terminal:
```shell
uv run example.py
```
If you need to synthesize whole articles, we recommend splitting the text into sentences first, for example with wtpsplit.
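To show the idea of preparing a long text for synthesis, here is a minimal sketch that uses a naive punctuation rule from the standard library. It is a stand-in, not wtpsplit and not part of this package. The function names and the `max_chars=300` default are assumptions chosen for illustration.

```python
import re

# Naive rule: a sentence ends at ., !, ? or … when followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?…])\s+")


def split_sentences(text):
    """Split text into sentences using the naive punctuation rule above."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def chunk_text(text, max_chars=300):
    """Group consecutive sentences into chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for sentence in split_sentences(text):
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


article = "Перше речення. Друге речення! Третє?"
for chunk in chunk_text(article, max_chars=30):
    print(chunk)
```

Each chunk could then be passed as the `text` argument of a separate synthesis call. A dedicated segmenter will handle abbreviations and quotations far better than this rule does.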
Get help and support
Feel free to reach out to us through the Issues section.
License
The code is released under the MIT license.
Model authors
Acoustic
Vocoder
Community
- Discord: https://bit.ly/discord-uds
- Speech Recognition: https://t.me/speechrecognitionuk
- Speech Synthesis: https://t.me/speechsynthesisuk
Also, follow our Speech-UK initiative on Hugging Face!
Acknowledgements
Owner
- Name: Yehor Smoliakov
- Login: egorsmkv
- Kind: user
- Location: 50.4501° N, 30.5234° E
- Twitter: yehor_smoliakov
- Repositories: 22
- Profile: https://github.com/egorsmkv
Speech-to-Text, Text-to-Speech, Voice over Internet Protocol
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Smoliakov"
    given-names: "Yehor"
    orcid: "https://orcid.org/0000-0002-8272-2095"
title: "High-fidelity speech synthesis for Ukrainian using modern neural networks."
version: 1.3.5
doi: 10.5281/zenodo.14966501
date-released: 2025-03-04
url: "https://github.com/egorsmkv/tts_uk"
GitHub Events
Total
- Create event: 11
- Release event: 7
- Issues event: 19
- Watch event: 8
- Delete event: 3
- Issue comment event: 1
- Push event: 89
- Pull request event: 2
Last Year
- Create event: 11
- Release event: 7
- Issues event: 19
- Watch event: 8
- Delete event: 3
- Issue comment event: 1
- Push event: 89
- Pull request event: 2
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 13
- Total pull requests: 1
- Average time to close issues: about 14 hours
- Average time to close pull requests: about 6 hours
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 0.08
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 13
- Pull requests: 1
- Average time to close issues: about 14 hours
- Average time to close pull requests: about 6 hours
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.08
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- egorsmkv (15)
Pull Request Authors
- dependabot[bot] (2)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 2
- Total downloads: 84 last month (PyPI)
- Total dependent packages: 0 (may contain duplicates)
- Total dependent repositories: 0 (may contain duplicates)
- Total versions: 11
- Total maintainers: 1
pypi.org: radtts-uk
RAD-TTS++ for Ukrainian
- Documentation: https://radtts-uk.readthedocs.io/
- License: MIT License
- Latest release: 1.1.0 (published 12 months ago)
Rankings
pypi.org: tts-uk
High-fidelity speech synthesis for Ukrainian using modern neural networks.
- Homepage: https://github.com/egorsmkv/tts_uk
- Documentation: https://tts-uk.readthedocs.io/
- License: MIT License
- Latest release: 1.3.7 (published 11 months ago)
Rankings
Maintainers (1)
Dependencies
- huggingface_hub >=0.29.1
- librosa >=0.10.2.post1
- numba >=0.60
- scipy >=1
- torch >=2.2.0
- torchaudio >=2.2.0
- vocos @ git+https://github.com/langtech-bsc/vocos.git@matcha
- audioread 3.0.1
- certifi 2025.1.31
- cffi 1.17.1
- charset-normalizer 3.4.1
- colorama 0.4.6
- decorator 5.2.1
- einops 0.8.1
- encodec 0.1.1
- filelock 3.17.0
- fsspec 2025.2.0
- huggingface-hub 0.29.1
- idna 3.10
- jinja2 3.1.5
- joblib 1.4.2
- lazy-loader 0.4
- librosa 0.10.2.post1
- llvmlite 0.43.0
- llvmlite 0.44.0
- markupsafe 3.0.2
- mpmath 1.3.0
- msgpack 1.1.0
- networkx 3.2.1
- networkx 3.4.2
- numba 0.60.0
- numba 0.61.0
- numpy 2.0.2
- numpy 2.1.3
- nvidia-cublas-cu12 12.4.5.8
- nvidia-cuda-cupti-cu12 12.4.127
- nvidia-cuda-nvrtc-cu12 12.4.127
- nvidia-cuda-runtime-cu12 12.4.127
- nvidia-cudnn-cu12 9.1.0.70
- nvidia-cufft-cu12 11.2.1.3
- nvidia-curand-cu12 10.3.5.147
- nvidia-cusolver-cu12 11.6.1.9
- nvidia-cusparse-cu12 12.3.1.170
- nvidia-cusparselt-cu12 0.6.2
- nvidia-nccl-cu12 2.21.5
- nvidia-nvjitlink-cu12 12.4.127
- nvidia-nvtx-cu12 12.4.127
- packaging 24.2
- platformdirs 4.3.6
- pooch 1.8.2
- pycparser 2.22
- pyyaml 6.0.2
- radtts-uk 1.0.0
- requests 2.32.3
- ruff 0.9.9
- scikit-learn 1.6.1
- scipy 1.13.1
- scipy 1.15.2
- setuptools 75.8.2
- soundfile 0.13.1
- soxr 0.5.0.post1
- sympy 1.13.1
- threadpoolctl 3.5.0
- torch 2.6.0
- torchaudio 2.6.0
- tqdm 4.67.1
- triton 3.2.0
- typing-extensions 4.12.2
- urllib3 2.3.0
- vocos 0.1.0