https://github.com/bookpoint54354/xtts_tune

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.9%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: bookpoint54354
Language: Python
Default Branch: main
Size: 120 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed over 1 year ago

Metadata Files

Readme

README.md

xtts-finetune-webui

This web UI is a slightly modified copy of the official web UI for fine-tuning XTTS.

If you are looking for a way to use XTTS normally, without fine-tuning, see https://github.com/daswer123/xtts-webui

TODO

[ ] Add the ability to run it from the console

Key features:

Data processing

Updated faster-whisper to 0.10.0, with the ability to select the large-v3 model.
Changed the output folder to an output folder inside the main folder.
If the output folder already contains a dataset and you want to add new data, simply add the new audio. Files that were already processed are not processed again, and the new data is added automatically.
Turned on the VAD filter.
After the dataset is created, a file is written that specifies the language of the dataset. This file is read before training so that the language always matches, which is convenient when you restart the interface.
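
The incremental behaviour described above (new audio is added, existing audio is skipped) can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the metadata file layout, the `source_file` column, and the helper names `find_new_audio` and `record_processed` are all assumptions made for the example.

```python
import csv
from pathlib import Path

# Hypothetical set of extensions treated as audio in this sketch.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}
# Hypothetical column layout for the metadata file.
FIELDNAMES = ["source_file", "text"]


def already_processed(metadata_csv: Path) -> set:
    """Return the source file names recorded in the metadata file, if it exists."""
    if not metadata_csv.exists():
        return set()
    with metadata_csv.open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter="|")
        return {row["source_file"] for row in reader}


def find_new_audio(input_dir: Path, metadata_csv: Path) -> list:
    """List audio files in input_dir that are not yet recorded in metadata_csv."""
    seen = already_processed(metadata_csv)
    return sorted(
        path
        for path in input_dir.iterdir()
        if path.suffix.lower() in AUDIO_EXTENSIONS and path.name not in seen
    )


def record_processed(metadata_csv: Path, source_file: str, text: str) -> None:
    """Append one processed file to the metadata file, writing a header if needed."""
    is_new = not metadata_csv.exists()
    with metadata_csv.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES, delimiter="|")
        if is_new:
            writer.writeheader()
        writer.writerow({"source_file": source_file, "text": text})
```

With this layout, rerunning the processing step only transcribes what `find_new_audio` returns, and each finished file is recorded with `record_processed` so it is skipped next time.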

Fine-tuning XTTS Encoder

Added the ability to select the base model for XTTS. When you retrain, the model does not need to be downloaded again.
Added the ability to select a custom model as the base model during training, which allows fine-tuning an already fine-tuned model.
Added the ability to get an optimized version of the model in one click (step 2.5 puts the optimized version in the output folder).
You can choose whether to delete training folders after you have optimized the model
When you optimize the model, the example reference audio is moved to the output folder
Added a check that the specified language matches the dataset language.
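
The language check can be pictured as reading the language file written during data processing and comparing it with the language chosen in the interface. The sketch below is one possible policy under stated assumptions: the file name `lang.txt`, the helper names, and the choice to prefer the stored language over the selected one are all illustrative and not confirmed by the README.

```python
from pathlib import Path
from typing import Optional

LANG_FILE_NAME = "lang.txt"  # hypothetical file name for this sketch


def write_dataset_language(dataset_dir: Path, language: str) -> None:
    """Store the dataset language next to the dataset."""
    (dataset_dir / LANG_FILE_NAME).write_text(language.strip().lower(), encoding="utf-8")


def read_dataset_language(dataset_dir: Path) -> Optional[str]:
    """Return the stored dataset language, or None if no language file exists."""
    lang_file = dataset_dir / LANG_FILE_NAME
    if not lang_file.exists():
        return None
    return lang_file.read_text(encoding="utf-8").strip().lower() or None


def resolve_training_language(dataset_dir: Path, selected: str) -> str:
    """Pick the training language, preferring the one stored with the dataset."""
    stored = read_dataset_language(dataset_dir)
    selected = selected.strip().lower()
    if stored is None:
        return selected
    if stored != selected:
        # Assumed policy: warn and fall back to the dataset's own language.
        print(f"Selected language {selected!r} differs from dataset language {stored!r}.")
    return stored
```

Preferring the stored value means a restarted interface with a default language selection cannot silently train on the wrong language.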

Inference

Added the ability to customize inference settings while checking the model.
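
The README does not list which inference settings are exposed. As a rough sketch, such settings could be grouped in a small validated container before being passed to the model. The field names and default values below are assumptions modeled on common sampling parameters for text generation; check the Coqui TTS documentation for what XTTS actually accepts.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class InferenceSettings:
    """Hypothetical bundle of sampling settings; names and defaults are assumptions."""

    temperature: float = 0.75
    length_penalty: float = 1.0
    repetition_penalty: float = 5.0
    top_k: int = 50
    top_p: float = 0.85

    def __post_init__(self) -> None:
        # Reject values that make no sense for sampling before they reach the model.
        if self.temperature <= 0:
            raise ValueError("temperature must be positive")
        if not 0 < self.top_p <= 1:
            raise ValueError("top_p must be in (0, 1]")
        if self.top_k < 1:
            raise ValueError("top_k must be at least 1")

    def as_kwargs(self) -> dict:
        """Return the settings as keyword arguments for an inference call."""
        return asdict(self)
```

A UI handler could then build `InferenceSettings(temperature=0.6)` from the form values and unpack `as_kwargs()` into whatever inference function is in use.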

Other

If you accidentally restart the interface during one of the steps, you can reload the data using the additional buttons.
Removed the display of logs, as it was causing problems when the interface was restarted.
The finished result is copied to the ready folder. These are fully finished files: you can move them anywhere and use them as a standard model.
Added support for fine-tuning Japanese.

Changes in webui

1 - Data processing

2 - Fine-tuning XTTS Encoder

3 - Inference

Google colab

🐳 Run in Docker

docker run -it --gpus all --pull always -p 7860:7860 --platform=linux/amd64 athomasson2/fine_tune_xtts:huggingface python app.py

Install

Make sure you have CUDA installed.
git clone https://github.com/daswer123/xtts-finetune-webui
cd xtts-finetune-webui
pip install torch==2.1.1+cu118 torchaudio==2.1.1+cu118 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
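
After installing, a short check like the one below can confirm that PyTorch was installed and can see a CUDA device. It is only a convenience sketch and not part of the project; the function name `cuda_status` is made up for this example.

```python
def cuda_status() -> str:
    """Describe whether PyTorch is installed and whether it can see a CUDA device."""
    try:
        import torch  # imported lazily so the check also works before installation
    except ImportError:
        return "PyTorch is not installed"
    if not torch.cuda.is_available():
        return "PyTorch is installed, but no CUDA device is available"
    return f"CUDA is available: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(cuda_status())
```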

If you're using Windows

First, run install.bat
To start the server, run start.bat
Open the local address 127.0.0.1:5003 in your browser

On Linux

Run bash install.sh
To start the server, run start.sh
Open the local address 127.0.0.1:5003 in your browser

On Apple Silicon Mac (Python 3.10 environment)

Run pip install --no-deps -r apple_silicon_requirements.txt
To start the server, run python xtts_demo.py
Open the local address 127.0.0.1:5003 in your browser
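
If the page does not load, it can help to check whether anything is listening on the port at all. The sketch below uses only the standard library; the helper name `is_server_up` is hypothetical, and the default host and port are taken from the address given above.

```python
import socket


def is_server_up(host: str = "127.0.0.1", port: int = 5003, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("server reachable" if is_server_up() else "nothing is listening on 127.0.0.1:5003")
```

A successful connection only shows that some process accepts connections on that port, not that it is the web UI.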

Owner

Login: bookpoint54354
Kind: user

Repositories: 1
Profile: https://github.com/bookpoint54354

GitHub Events

Total

Push event: 4
Create event: 3

Last Year

Push event: 4
Create event: 3

Dependencies

Dockerfile docker

python 3.11-slim-bookworm build

apple_silicon_requirements.txt pypi

Babel ==2.15.0
Cython ==3.0.10
Flask ==3.0.3
Jinja2 ==3.1.4
Markdown ==3.6
MarkupSafe ==2.1.5
PyYAML ==6.0.1
Pygments ==2.18.0
SudachiDict-core ==20240409
SudachiPy ==0.6.8
TTS ==0.21.3
Unidecode ==1.3.8
Werkzeug ==3.0.3
absl-py ==2.1.0
aiofiles ==23.2.1
aiohttp ==3.9.5
aiosignal ==1.3.1
altair ==5.3.0
annotated-types ==0.7.0
anyascii ==0.3.2
anyio ==3.7.1
async-timeout ==4.0.3
attrs ==23.2.0
audioread ==3.0.1
av ==12.2.0
bangla ==0.0.2
blinker ==1.8.2
blis ==0.7.11
bnnumerizer ==0.0.2
bnunicodenormalizer ==0.1.7
catalogue ==2.0.10
certifi ==2024.7.4
cffi ==1.16.0
charset-normalizer ==3.3.2
click ==8.1.7
cloudpathlib ==0.16.0
colorama ==0.4.6
coloredlogs ==15.0.1
confection ==0.1.5
contourpy ==1.2.1
coqpit ==0.0.17
coqui-tts ==0.24.2
coqui-tts-trainer ==0.1.4
ctranslate2 ==4.3.1
cutlet ==0.4.0
cycler ==0.12.1
cymem ==2.0.8
dateparser ==1.1.8
decorator ==5.1.1
dnspython ==2.6.1
docopt ==0.6.2
einops ==0.8.0
email_validator ==2.2.0
encodec ==0.1.1
exceptiongroup ==1.2.2
fastapi ==0.103.1
fastapi-cli ==0.0.4
faster-whisper ==1.0.2
ffmpy ==0.3.2
filelock ==3.15.4
flatbuffers ==24.3.25
fonttools ==4.53.1
frozenlist ==1.4.1
fsspec ==2024.6.1
fugashi ==1.3.2
g2pkk ==0.1.2
gradio ==4.44.1
gradio_client ==1.3.0
grpcio ==1.64.1
gruut ==2.4.0
gruut-ipa ==0.13.0
gruut_lang_de ==2.0.1
gruut_lang_en ==2.0.1
gruut_lang_es ==2.0.1
gruut_lang_fr ==2.0.2
h11 ==0.14.0
hangul-romanize ==0.1.0
httpcore ==1.0.5
httptools ==0.6.1
httpx ==0.27.0
huggingface-hub ==0.23.5
humanfriendly ==10.0
idna ==3.7
importlib_resources ==6.4.0
inflect ==7.3.1
itsdangerous ==2.2.0
jaconv ==0.4.0
jamo ==0.4.1
jieba ==0.42.1
joblib ==1.4.2
jsonlines ==1.2.0
jsonschema ==4.23.0
jsonschema-specifications ==2023.12.1
kiwisolver ==1.4.5
langcodes ==3.4.0
language_data ==1.2.0
lazy_loader ==0.4
librosa ==0.10.2.post1
llvmlite ==0.43.0
marisa-trie ==1.2.0
markdown-it-py ==3.0.0
matplotlib ==3.8.4
mdurl ==0.1.2
mecab-python3 ==1.0.9
mojimoji ==0.0.13
more-itertools ==10.3.0
mpmath ==1.3.0
msgpack ==1.0.8
multidict ==6.0.5
murmurhash ==1.0.10
networkx ==2.8.8
nltk ==3.8.1
num2words ==0.5.13
numba ==0.60.0
numpy ==1.26.4
onnxruntime ==1.18.1
orjson ==3.10.6
packaging ==24.1
pandas ==1.5.3
pillow ==10.4.0
platformdirs ==4.2.2
pooch ==1.8.2
preshed ==3.0.9
protobuf ==4.25.3
psutil ==6.0.0
pycparser ==2.22
pydantic ==2.3.0
pydantic_core ==2.6.3
pydub ==0.25.1
pygame ==2.6.0
pynndescent ==0.5.13
pyparsing ==3.1.2
pypinyin ==0.51.0
pysbd ==0.3.4
python-crfsuite ==0.9.10
python-dateutil ==2.9.0.post0
python-dotenv ==1.0.1
python-multipart ==0.0.9
pytz ==2024.1
referencing ==0.35.1
regex ==2024.5.15
requests ==2.32.3
rich ==13.7.1
rpds-py ==0.19.0
ruff ==0.5.2
safetensors ==0.4.3
scikit-learn ==1.5.1
scipy ==1.11.4
semantic-version ==2.10.0
shellingham ==1.5.4
six ==1.16.0
smart-open ==6.4.0
sniffio ==1.3.1
soundfile ==0.12.1
soxr ==0.3.7
spacy ==3.7.4
spacy-legacy ==3.0.12
spacy-loggers ==1.0.5
srsly ==2.4.8
starlette ==0.27.0
sympy ==1.13.0
tensorboard ==2.17.0
tensorboard-data-server ==0.7.2
thinc ==8.2.5
threadpoolctl ==3.5.0
tokenizers ==0.19.1
tomlkit ==0.12.0
toolz ==0.12.1
torch ==2.3.1
torchaudio ==2.3.1
tqdm ==4.66.4
trainer ==0.0.36
transformers ==4.42.4
typeguard ==4.3.0
typer ==0.12.5
typing_extensions ==4.12.2
tzdata ==2024.1
tzlocal ==5.2
umap-learn ==0.5.6
unidic-lite ==1.0.8
urllib3 ==2.2.2
uvicorn ==0.30.1
uvloop ==0.19.0
wasabi ==1.1.3
watchfiles ==0.22.0
weasel ==0.3.4
websockets ==11.0.3
wrapt ==1.16.0
yarl ==1.9.4

requirements.txt pypi

coqui-tts ==0.24.2
cutlet *
faster_whisper ==1.0.3
fugashi *
gradio ==5.1.0
spacy ==3.7.5
