radtts-uk

High-fidelity speech synthesis for Ukrainian using modern neural networks.

https://github.com/egorsmkv/tts_uk

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 3 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (15.4%) to scientific vocabulary

Keywords

audio sound speech-uk synthesis text-to-speech tts ukrainian vocos wav wave

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: egorsmkv
License: mit
Language: Jupyter Notebook
Default Branch: main
Homepage: https://huggingface.co/spaces/Yehor/radtts-uk-vocos-demo
Size: 1.77 MB

Statistics

Stars: 8
Watchers: 1
Forks: 0
Open Issues: 5
Releases: 7

Topics

audio sound speech-uk synthesis text-to-speech tts ukrainian vocos wav wave

Created over 1 year ago · Last pushed 11 months ago

Metadata Files

Readme Funding Citation

Text-to-Speech for Ukrainian

High-fidelity speech synthesis for Ukrainian using modern neural networks.

Demo

Check out our demo on the Hugging Face Space (https://huggingface.co/spaces/Yehor/radtts-uk-vocos-demo), or just listen to the samples.

Features

Multi-speaker model: 2 female (Tetiana, Lada) + 1 male (Mykyta) voices;
Fine-grained control over speech parameters, including duration, fundamental frequency (F0), and energy;
High-fidelity speech generation using the RAD-TTS++ acoustic model;
Fast vocoding using Vocos;
Synthesizes long sentences effectively;
Supports a sampling rate of 44.1 kHz;
Tested on Linux environments and Windows/WSL;
Python API (requires Python 3.9 or later);
CUDA-enabled for GPU acceleration.

Installation

```shell
# Install from PyPI
pip install tts-uk

# OR, for the latest development version:
pip install git+https://github.com/egorsmkv/tts_uk

# OR, use git and local setup
git clone https://github.com/egorsmkv/tts_uk
cd tts_uk
uv sync # uv will handle the virtual environment
```

If you do not have uv yet, read the installation section of uv's documentation.

Also, you can download the repository as a ZIP archive.

Getting started

Code example:

```python
import torchaudio

from tts_uk.inference import synthesis

sampling_rate = 44100

# Perform the synthesis, `synthesis` function returns:
# - mels: Mel spectrograms of the generated audio.
# - wave: The synthesized waveform by a Vocoder as a PyTorch tensor.
# - stats: A dictionary containing synthesis statistics (processing time, duration, speech rate, etc).
mels, wave, stats = synthesis(
    text="Ви можете протестувати синтез мовлення українською мовою. Просто введіть текст, який ви хочете прослухати.",
    voice="tetiana",  # tetiana, mykyta, lada
    n_takes=1,
    use_latest_take=False,
    token_dur_scaling=1,
    f0_mean=0,
    f0_std=0,
    energy_mean=0,
    energy_std=0,
    sigma_decoder=0.8,
    sigma_token_duration=0.666,
    sigma_f0=1,
    sigma_energy=1,
)

print(stats)

# Save the generated audio to a WAV file.
torchaudio.save("audio.wav", wave.cpu(), sampling_rate, encoding="PCM_S")
```
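To sanity-check a result, it can help to relate the number of generated samples to the 44.1 kHz sampling rate. The sketch below is not part of the tts_uk API: `audio_duration_seconds` and `real_time_factor` are hypothetical helper names, and the sample count and processing time are made-up inputs. In practice you would probably take the sample count from the last dimension of `wave` (an assumption about the tensor layout) and measure the processing time yourself.

```python
SAMPLING_RATE = 44100  # Hz, the rate stated in the feature list


def audio_duration_seconds(num_samples: int, sampling_rate: int = SAMPLING_RATE) -> float:
    """Length of the audio in seconds for a given number of samples."""
    if sampling_rate <= 0:
        raise ValueError("sampling_rate must be positive")
    return num_samples / sampling_rate


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio length; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Illustrative numbers only: 220500 samples synthesized in 1.25 seconds.
duration = audio_duration_seconds(220500)
rtf = real_time_factor(1.25, duration)
print(f"duration={duration:.2f}s rtf={rtf:.2f}")
```

A real-time factor well below 1.0 suggests the setup can keep up with streaming-style use; values above 1.0 mean synthesis is slower than playback.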

You can also try these Google Colab notebooks:

CPU inference
GPU inference on T4 card (long document to synthesize)

Or run synthesis in a terminal:

```shell
uv run example.py
```

If you need to synthesize whole articles, we recommend splitting the text into sentences first, for example with wtpsplit.
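One way to prepare a long article is to split it into sentences and then pack consecutive sentences into chunks of bounded length, synthesizing one chunk at a time. The sketch below is only an illustration under stated assumptions: `naive_split` is a crude regex stand-in for a real sentence segmenter such as wtpsplit, and `chunk_sentences` and the `max_chars` budget are hypothetical names and values, not part of tts_uk.

```python
import re
from typing import List


def naive_split(text: str) -> List[str]:
    """Stand-in splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def chunk_sentences(sentences: List[str], max_chars: int = 300) -> List[str]:
    """Greedily pack consecutive sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}" if current else sentence
        if current and len(candidate) > max_chars:
            # The budget would be exceeded: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


article = "Перше речення. Друге речення! Третє речення?"
chunks = chunk_sentences(naive_split(article), max_chars=30)
for chunk in chunks:
    # Each chunk would then be passed on as the `text` argument of synthesis.
    print(chunk)
```

For real articles, replace `naive_split` with a proper segmenter, since a regex like this one mishandles abbreviations, initials, and decimal numbers.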

Get help and support

Please feel free to connect with us using the Issues section.

License

The code is released under the MIT license.

Model authors

Acoustic

Yehor Smoliakov, HF profile

Vocoder

Serhiy Stetskovych, HF profile

Community

Discord: https://bit.ly/discord-uds
Speech Recognition: https://t.me/speechrecognitionuk
Speech Synthesis: https://t.me/speechsynthesisuk

Also, follow our Speech-UK initiative on Hugging Face!

Acknowledgements

Owner

Name: Yehor Smoliakov
Login: egorsmkv
Kind: user
Location: 50.4501° N, 30.5234° E

Twitter: yehor_smoliakov
Repositories: 22
Profile: https://github.com/egorsmkv

Speech-to-Text, Text-to-Speech, Voice over Internet Protocol

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Smoliakov"
  given-names: "Yehor"
  orcid: "https://orcid.org/0000-0002-8272-2095"
title: "High-fidelity speech synthesis for Ukrainian using modern neural networks."
version: 1.3.5
doi: 10.5281/zenodo.14966501
date-released: 2025-03-04
url: "https://github.com/egorsmkv/tts_uk"

GitHub Events

Total

Create event: 11
Release event: 7
Issues event: 19
Watch event: 8
Delete event: 3
Issue comment event: 1
Push event: 89
Pull request event: 2

Last Year

Create event: 11
Release event: 7
Issues event: 19
Watch event: 8
Delete event: 3
Issue comment event: 1
Push event: 89
Pull request event: 2

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 13
Total pull requests: 1
Average time to close issues: about 14 hours
Average time to close pull requests: about 6 hours
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 0.08
Average comments per pull request: 0.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 1

Past Year

Issues: 13
Pull requests: 1
Average time to close issues: about 14 hours
Average time to close pull requests: about 6 hours
Issue authors: 1
Pull request authors: 1
Average comments per issue: 0.08
Average comments per pull request: 0.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 1

Top Authors

Issue Authors

egorsmkv (15)

Pull Request Authors

dependabot[bot] (2)

Top Labels

Issue Labels

Pull Request Labels

dependencies (2) github_actions (2)

Packages

Total packages: 2
Total downloads:
- pypi 84 last-month

Total dependent packages: 0
(may contain duplicates)
Total dependent repositories: 0
(may contain duplicates)
Total versions: 11
Total maintainers: 1

pypi.org: radtts-uk

RAD-TTS++ for Ukrainian

Documentation: https://radtts-uk.readthedocs.io/
License: MIT License
Latest release: 1.1.0
published over 1 year ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Dependent packages count: 9.5%

Average: 31.7%

Dependent repos count: 53.8%

Last synced: over 1 year ago

pypi.org: tts-uk

High-fidelity speech synthesis for Ukrainian using modern neural networks.

Homepage: https://github.com/egorsmkv/tts_uk
Documentation: https://tts-uk.readthedocs.io/
License: MIT License
Latest release: 1.3.7
published over 1 year ago

Versions: 10
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 84 Last month

Rankings

Dependent packages count: 9.5%

Average: 31.7%

Dependent repos count: 53.8%

Maintainers (1)

Egor.Smolyakov

Last synced: 11 months ago

Dependencies

pyproject.toml pypi

huggingface_hub >=0.29.1
librosa >=0.10.2.post1
numba >=0.60
scipy >=1
torch >=2.2.0
torchaudio >=2.2.0
vocos @ git+https://github.com/langtech-bsc/vocos.git@matcha

uv.lock pypi

audioread 3.0.1
certifi 2025.1.31
cffi 1.17.1
charset-normalizer 3.4.1
colorama 0.4.6
decorator 5.2.1
einops 0.8.1
encodec 0.1.1
filelock 3.17.0
fsspec 2025.2.0
huggingface-hub 0.29.1
idna 3.10
jinja2 3.1.5
joblib 1.4.2
lazy-loader 0.4
librosa 0.10.2.post1
llvmlite 0.43.0
llvmlite 0.44.0
markupsafe 3.0.2
mpmath 1.3.0
msgpack 1.1.0
networkx 3.2.1
networkx 3.4.2
numba 0.60.0
numba 0.61.0
numpy 2.0.2
numpy 2.1.3
nvidia-cublas-cu12 12.4.5.8
nvidia-cuda-cupti-cu12 12.4.127
nvidia-cuda-nvrtc-cu12 12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.1.3
nvidia-curand-cu12 10.3.5.147
nvidia-cusolver-cu12 11.6.1.9
nvidia-cusparse-cu12 12.3.1.170
nvidia-cusparselt-cu12 0.6.2
nvidia-nccl-cu12 2.21.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.4.127
packaging 24.2
platformdirs 4.3.6
pooch 1.8.2
pycparser 2.22
pyyaml 6.0.2
radtts-uk 1.0.0
requests 2.32.3
ruff 0.9.9
scikit-learn 1.6.1
scipy 1.13.1
scipy 1.15.2
setuptools 75.8.2
soundfile 0.13.1
soxr 0.5.0.post1
sympy 1.13.1
threadpoolctl 3.5.0
torch 2.6.0
torchaudio 2.6.0
tqdm 4.67.1
triton 3.2.0
typing-extensions 4.12.2
urllib3 2.3.0
vocos 0.1.0
