https://github.com/academic-hammer/talkingface-toolkit

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: arxiv.org, researchgate.net)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 5.6%, to scientific vocabulary)

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: Academic-Hammer
Language: Python
Default Branch: main
Size: 448 MB

Statistics

Stars: 9
Watchers: 2
Forks: 40
Open Issues: 25
Releases: 0

Created over 2 years ago · Last pushed about 2 years ago

Metadata Files

Readme

talkingface-toolkit

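Framework overview

checkpoints

Holds the additional pretrained models needed to train and evaluate models. The README in each subfolder gives more detail.

dataset

Holds the datasets and their preprocessed data. See the README in dataset for details.

saved

Holds the model checkpoints saved during training. The folder is created automatically when a model is saved during training.

talkingface

The main functional module, containing all the core code.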

config

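Automatically generates all configuration for the model, dataset, training, evaluation, and so on, based on the model and dataset names.

```
config/
├── configurator.py
```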
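As an illustration of how configuration might be assembled from a model name and a dataset name, here is a minimal sketch. It is not the code in configurator.py: the function name `build_config`, the precedence order (general, then dataset, then model, then command line), the dataset name `lrs2`, and the in-memory dicts standing in for YAML files are all assumptions made for the example.

```python
from typing import Any, Dict, Optional


def build_config(
    model: str,
    dataset: str,
    overall: Dict[str, Any],
    dataset_cfgs: Dict[str, Dict[str, Any]],
    model_cfgs: Dict[str, Dict[str, Any]],
    overrides: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge config layers; later layers win (an assumed precedence)."""
    config: Dict[str, Any] = {}
    config.update(overall)                        # general defaults
    config.update(dataset_cfgs.get(dataset, {}))  # dataset-specific values
    config.update(model_cfgs.get(model, {}))      # model-specific values
    config.update(overrides or {})                # e.g. command-line arguments
    config["model"] = model
    config["dataset"] = dataset
    return config


# Hypothetical values standing in for overall.yaml, dataset/xxx.yaml, model/xxx.yaml.
overall = {"epochs": 10, "batch_size": 16}
dataset_cfgs = {"lrs2": {"data_root": "dataset/lrs2"}}
model_cfgs = {"wav2lip": {"batch_size": 8}}

config = build_config(
    "wav2lip", "lrs2", overall, dataset_cfgs, model_cfgs, overrides={"epochs": 2}
)
print(config)
```

Layering the sources this way keeps the general file untouched while letting a model or a single run change individual values.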

data

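dataprocess: model-specific data processing code (for example, audio feature extraction or inference-time data processing implemented in the original repository). If the model you implement needs this, create a corresponding file.
dataset: every model must subclass torch.utils.data.Dataset to load its data. Every model must have a model_name+'_dataset.py' file. The return value of __getitem__() should be a dict. (core part)

```
data/
├── dataprocess
|   ├── wav2lip_process.py
|   ├── xxxx_process.py
├── dataset
|   ├── wav2lip_dataset.py
|   ├── xxx_dataset.py
```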
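The dataset convention described above could look roughly like the following sketch. The class name `XxxDataset`, the `(video_path, audio_path)` sample layout, and the dict keys are hypothetical; a real dataset would load and preprocess frames and audio instead of returning paths. The `ImportError` fallback is only there so the example runs without torch installed.

```python
try:
    from torch.utils.data import Dataset  # real base class when torch is installed
except ImportError:  # fallback so the sketch still runs without torch
    Dataset = object


class XxxDataset(Dataset):
    """Hypothetical dataset for a model named 'xxx' (would live in xxx_dataset.py)."""

    def __init__(self, samples):
        # samples: list of (video_path, audio_path) pairs; an assumed layout.
        self.samples = list(samples)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        video_path, audio_path = self.samples[index]
        # Return a dict so downstream code can look fields up by name.
        return {"video_path": video_path, "audio_path": audio_path, "index": index}


dataset = XxxDataset([("a.mp4", "a.wav"), ("b.mp4", "b.wav")])
item = dataset[1]
print(item)
```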

evaluate

Code for model evaluation. The LSE metric takes the list of generated videos. The SSIM metric takes the list of generated videos and the list of real videos.
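To show why SSIM needs both generated and real videos, here is a simplified sketch that scores paired frames. It is not the toolkit's metric: it computes a single-window (global) SSIM over flattened grayscale pixel values, whereas standard SSIM averages the same formula over local windows. The function names and the frame representation (flat lists of pixel values) are assumptions.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM over two equal-length sequences of pixel values."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("inputs must be non-empty and the same length")
    # Stabilising constants from the standard SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def mean_ssim(generated_frames, real_frames):
    """Average SSIM over frames paired by position in the two lists."""
    if not generated_frames or len(generated_frames) != len(real_frames):
        raise ValueError("need the same non-zero number of generated and real frames")
    scores = [global_ssim(g, r) for g, r in zip(generated_frames, real_frames)]
    return sum(scores) / len(scores)


same = mean_ssim([[0, 128, 255, 64]], [[0, 128, 255, 64]])
different = mean_ssim([[0, 128, 255, 64]], [[255, 128, 0, 64]])
print(same, different)
```

SSIM is a full-reference measure, so every generated frame needs a real counterpart. A metric that takes only the generated videos, as LSE does here, has no such pairing requirement.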

model

Networks and corresponding methods of the implemented models (core part).

Three main categories:
- audio-driven
- image-driven
- nerf-based (methods based on neural radiance fields)

```
model/
├── audiodriventalkingface
|   ├── wav2lip.py
├── imagedriventalkingface
|   ├── xxxx.py
├── nerfbasedtalkingface
|   ├── xxxx.py
├── abstract_talkingface.py
```

properties

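Holds the default configuration files, including:
- dataset configuration files
- model configuration files
- the general configuration file

Add configuration files for your model and dataset as needed. The general configuration file overall.yaml normally should not be modified.

```
properties/
├── dataset
|   ├── xxx.yaml
├── model
|   ├── xxx.yaml
├── overall.yaml
```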

quick_start

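The general entry point. It configures the dataset and model automatically from the arguments passed in, then trains and evaluates (normally no changes are needed).

```
quick_start/
├── quick_start.py
```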

trainer

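The main class for training and evaluation. If the base class Trainer can do everything the model needs, there is no need to write a new one. If training the model has model-specific parts, override Trainer. The parts to override are most likely _train_epoch() and _valid_epoch(). The overriding Trainer should be named {model_name}Trainer.

```
trainer/
├── trainer.py
```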
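The override pattern described above can be sketched as follows. The `Trainer` class here is a minimal stand-in, not the toolkit's base class: its constructor, the `fit` method, and the `model_step` callable are all assumptions. Only the method names `_train_epoch` and `_valid_epoch` and the `{model_name}Trainer` naming come from the text.

```python
class Trainer:
    """Minimal stand-in for the toolkit's base Trainer (not its real code)."""

    def __init__(self, model_step):
        # model_step: callable mapping a batch to a float loss (assumed interface).
        self.model_step = model_step

    def _train_epoch(self, batches):
        return sum(self.model_step(batch) for batch in batches)

    def _valid_epoch(self, batches):
        return sum(self.model_step(batch) for batch in batches)

    def fit(self, train_batches, valid_batches, epochs=1):
        history = []
        for _ in range(epochs):
            train_loss = self._train_epoch(train_batches)
            valid_loss = self._valid_epoch(valid_batches)
            history.append((train_loss, valid_loss))
        return history


class XxxTrainer(Trainer):
    """Named {model_name}Trainer; overrides only the model-specific epoch logic."""

    def _train_epoch(self, batches):
        # Hypothetical model-specific behaviour: report the mean loss, not the sum.
        batches = list(batches)
        return super()._train_epoch(batches) / max(len(batches), 1)


trainer = XxxTrainer(model_step=lambda batch: float(len(batch)))
history = trainer.fit(
    train_batches=[[1, 2], [3, 4, 5, 6]], valid_batches=[[7]], epochs=2
)
print(history)
```

Because the training loop lives in the base class, the subclass changes one epoch method and inherits the rest.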

utils

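Shared utilities, including s3fd face detection and methods for extracting frames and audio from video. Also includes methods that look up the model class, dataset class, and so on from the configuration. Normally no changes are needed, but you may add data processing files that are necessary and reasonably general.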
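Looking up a class from a configured name is commonly done with `importlib`. The sketch below shows one way such a lookup might work; the function name `find_class` and the search strategy are assumptions, not the toolkit's utilities. It is demonstrated on standard-library modules so that it runs anywhere.

```python
import importlib


def find_class(class_name, module_names):
    """Return the first attribute called class_name found in module_names."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # skip modules that are not installed
        if hasattr(module, class_name):
            return getattr(module, class_name)
    raise ValueError(f"{class_name!r} not found in {list(module_names)}")


# In the toolkit the candidates would presumably be the model or dataset
# submodules; standard-library modules are used here for illustration only.
cls = find_class("OrderedDict", ["json", "collections"])
print(cls)
```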

Usage

Requirements

python=3.8
torch==1.13.1+cu116 (GPU build; use the CPU build if the device does not support CUDA)
numpy==1.20.3
librosa==0.10.1

Try to keep the versions of the packages above the same as listed.
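One way to check an existing environment against these pins is sketched below using the standard-library `importlib.metadata`. The helper name `check_versions` and the choice to ignore local build tags such as `+cu116` are assumptions made for the example.

```python
import sys
from importlib import metadata

# Versions pinned in the requirements above.
PINNED = {"torch": "1.13.1", "numpy": "1.20.3", "librosa": "0.10.1"}


def check_versions(pinned):
    """Map each package that is missing or mismatched to what was found."""
    problems = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        # Ignore local build tags such as "+cu116" when comparing.
        if found.split("+")[0] != wanted:
            problems[name] = found
    return problems


print("python", sys.version_info[:2])
print(check_versions(PINNED))
```

An empty dict means every pinned package is installed at the listed version.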

Two ways are provided to set up the rest of the environment:

```
pip install -r requirements.txt

conda env create -f environment.yml
```

A conda virtual environment is strongly recommended!

Training and evaluation

```bash
python run_talkingface.py --model=xxxx --dataset=xxxx (--other_parameters=xxxxxx)
```
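A command line of this shape could be parsed along the following lines. This is a sketch with `argparse`, not the contents of run_talkingface.py: the function name `parse_args`, the `--epochs` flag in the demo, and the treatment of unrecognised `--key=value` pairs as configuration overrides are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse --model and --dataset; collect other --key=value pairs as overrides."""
    parser = argparse.ArgumentParser(
        description="Train and evaluate a model", allow_abbrev=False
    )
    parser.add_argument("--model", required=True, help="model name, e.g. wav2lip")
    parser.add_argument("--dataset", required=True, help="dataset name")
    # parse_known_args returns unrecognised arguments instead of failing on them.
    args, extras = parser.parse_known_args(argv)
    overrides = {}
    for extra in extras:
        if extra.startswith("--") and "=" in extra:
            key, value = extra[2:].split("=", 1)
            overrides[key] = value
    return args.model, args.dataset, overrides


model, dataset, overrides = parse_args(
    ["--model=wav2lip", "--dataset=xxxx", "--epochs=5"]
)
print(model, dataset, overrides)
```

Override values arrive as strings, so a real entry point would still need to convert them to the types the configuration expects.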

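Weight files

Weights needed for LSE evaluation: syncnet_v2.model (download from Baidu Netdisk)
Lip expert weights needed by wav2lip: lipsync_expert.pth (download from Baidu Netdisk)

Optional papers:

Audio_driven talkingface

Image_driven talkingface

Nerf-based talkingface

texttospeech

voice_conversion

Assignment requirements

Make sure training and validation can be run by entering only the model and dataset names on the command line. (If a repository provides no training code, training may be skipped.)
Each group must submit a README file describing the completed features, screenshots of the final training and validation runs, the dependencies used, the division of work among members, and so on.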

Owner

Name: DataHammer
Login: Academic-Hammer
Kind: organization

Repositories: 12
Profile: https://github.com/Academic-Hammer

GitHub Events

Total

Watch event: 6
Pull request event: 1
Fork event: 1

Last Year

Watch event: 6
Pull request event: 1
Fork event: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time

Total issues: 2
Total pull requests: 36
Average time to close issues: 13 days
Average time to close pull requests: 3 days
Total issue authors: 2
Total pull request authors: 27
Average comments per issue: 0.5
Average comments per pull request: 0.08
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 1.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

Hujiazeng (1)
Klayand (1)

Pull Request Authors

Kline-song (7)
18pwp81 (6)
chuyi369 (4)
Atlus99 (3)
Aquariuslyh (3)
happy-fishingman (2)
zhouchushu03 (2)
Abstractjkc (2)
huanranchen (2)
LynxPeng (2)
FFFXX0319 (2)
ShenShuo137 (2)
Pissohappy (2)
yang-kun-long (2)
vhthree (2)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

environment.yml pypi

absl-py ==2.0.0
addict ==2.4.0
aiosignal ==1.3.1
appdirs ==1.4.4
attrs ==23.1.0
audioread ==3.0.1
basicsr ==1.3.4.7
cachetools ==5.3.2
certifi ==2020.12.5
cffi ==1.16.0
charset-normalizer ==3.3.2
click ==8.1.7
cloudpickle ==3.0.0
colorama ==0.4.6
colorlog ==6.7.0
contourpy ==1.1.1
cycler ==0.12.1
decorator ==5.1.1
dlib ==19.22.1
docker-pycreds ==0.4.0
face-alignment ==1.3.5
ffmpeg ==1.4
filelock ==3.13.1
fonttools ==4.44.0
frozenlist ==1.4.0
future ==0.18.3
gitdb ==4.0.11
gitpython ==3.1.40
glob2 ==0.7
google-auth ==2.23.4
google-auth-oauthlib ==0.4.6
grpcio ==1.59.2
hyperopt ==0.2.5
idna ==3.4
imageio ==2.9.0
imageio-ffmpeg ==0.4.5
importlib-metadata ==6.8.0
importlib-resources ==6.1.0
joblib ==1.3.2
jsonschema ==4.19.2
jsonschema-specifications ==2023.7.1
kiwisolver ==1.4.5
kornia ==0.5.5
lazy-loader ==0.3
librosa ==0.10.1
llvmlite ==0.37.0
lmdb ==1.2.1
lws ==1.2.7
markdown ==3.5.1
markupsafe ==2.1.3
matplotlib ==3.6.3
msgpack ==1.0.7
networkx ==3.1
numba ==0.54.1
numpy ==1.20.3
oauthlib ==3.2.2
opencv-python ==3.4.9.33
packaging ==23.2
pandas ==1.3.4
pathtools ==0.1.2
pillow ==6.2.1
pkgutil-resolve-name ==1.3.10
platformdirs ==3.11.0
plotly ==5.18.0
pooch ==1.8.0
protobuf ==4.25.0
psutil ==5.9.6
pyasn1 ==0.5.0
pyasn1-modules ==0.3.0
pycparser ==2.21
pyparsing ==3.1.1
python-dateutil ==2.8.2
python-speech-features ==0.6
pytorch-fid ==0.3.0
pytz ==2023.3.post1
pywavelets ==1.4.1
pyyaml ==5.3.1
ray ==2.6.3
referencing ==0.30.2
requests ==2.31.0
requests-oauthlib ==1.3.1
rpds-py ==0.12.0
rsa ==4.9
scikit-image ==0.16.2
scikit-learn ==1.3.2
scipy ==1.5.0
sentry-sdk ==1.34.0
setproctitle ==1.3.3
six ==1.16.0
smmap ==5.0.1
soundfile ==0.12.1
soxr ==0.3.7
tabulate ==0.9.0
tb-nightly ==2.12.0a20230126
tenacity ==8.2.3
tensorboard ==2.7.0
tensorboard-data-server ==0.6.1
tensorboard-plugin-wit ==1.8.1
texttable ==1.7.0
thop ==0.1.1
threadpoolctl ==3.2.0
tomli ==2.0.1
torch ==1.13.1
torchaudio ==0.13.1
torchvision ==0.14.1
tqdm ==4.66.1
trimesh ==3.9.20
typing-extensions ==4.8.0
tzdata ==2023.3
urllib3 ==2.0.7
wandb ==0.15.12
werkzeug ==3.0.1
yapf ==0.40.2
zipp ==3.17.0

requirements.txt pypi

GitPython ==3.1.40
Markdown ==3.5.1
MarkupSafe ==2.1.3
Pillow ==6.2.1
PyWavelets ==1.4.1
PyYAML ==5.3.1
Werkzeug ==3.0.1
absl-py ==2.0.0
addict ==2.4.0
aiosignal ==1.3.1
appdirs ==1.4.4
attrs ==23.1.0
audioread ==3.0.1
basicsr ==1.3.4.7
cachetools ==5.3.2
certifi ==2020.12.5
cffi ==1.16.0
charset-normalizer ==3.3.2
click ==8.1.7
cloudpickle ==3.0.0
colorama ==0.4.6
colorlog ==6.7.0
contourpy ==1.1.1
cycler ==0.12.1
decorator ==5.1.1
dlib ==19.22.1
docker-pycreds ==0.4.0
face-alignment ==1.3.5
ffmpeg ==1.4
filelock ==3.13.1
fonttools ==4.44.0
frozenlist ==1.4.0
future ==0.18.3
gitdb ==4.0.11
glob2 ==0.7
google-auth ==2.23.4
google-auth-oauthlib ==0.4.6
grpcio ==1.59.2
hyperopt ==0.2.5
idna ==3.4
imageio ==2.9.0
imageio-ffmpeg ==0.4.5
importlib-metadata ==6.8.0
importlib-resources ==6.1.0
joblib ==1.3.2
jsonschema ==4.19.2
jsonschema-specifications ==2023.7.1
kiwisolver ==1.4.5
kornia ==0.5.5
lazy_loader ==0.3
librosa ==0.10.1
llvmlite ==0.37.0
lmdb ==1.2.1
lws ==1.2.7
matplotlib ==3.6.3
msgpack ==1.0.7
networkx ==3.1
numba ==0.54.1
numpy ==1.20.3
oauthlib ==3.2.2
opencv-python ==3.4.9.33
packaging ==23.2
pandas ==1.3.4
pathtools ==0.1.2
pkgutil_resolve_name ==1.3.10
platformdirs ==3.11.0
plotly ==5.18.0
pooch ==1.8.0
protobuf ==4.25.0
psutil ==5.9.6
pyasn1 ==0.5.0
pyasn1-modules ==0.3.0
pycparser ==2.21
pyparsing ==3.1.1
python-dateutil ==2.8.2
python-speech-features ==0.6
pytorch-fid ==0.3.0
pytz ==2023.3.post1
ray ==2.6.3
referencing ==0.30.2
requests ==2.31.0
requests-oauthlib ==1.3.1
rpds-py ==0.12.0
rsa ==4.9
scikit-image ==0.16.2
scikit-learn ==1.3.2
scipy ==1.5.0
sentry-sdk ==1.34.0
setproctitle ==1.3.3
six ==1.16.0
smmap ==5.0.1
soundfile ==0.12.1
soxr ==0.3.7
tabulate ==0.9.0
tb-nightly ==2.12.0a20230126
tenacity ==8.2.3
tensorboard ==2.7.0
tensorboard-data-server ==0.6.1
tensorboard-plugin-wit ==1.8.1
texttable ==1.7.0
thop ==0.1.1.post2209072238
threadpoolctl ==3.2.0
tomli ==2.0.1
torch ==1.13.1
torchaudio ==0.13.1
torchvision ==0.14.1
tqdm ==4.66.1
trimesh ==3.9.20
typing_extensions ==4.8.0
tzdata ==2023.3
urllib3 ==2.0.7
wandb ==0.15.12
yapf ==0.40.2
zipp ==3.17.0