audiodeepfakedetection

SUTD 50.039 Deep Learning Course Project (2022 Spring)

https://github.com/markhershey/audiodeepfakedetection

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to zenodo.org
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (10.4%) to scientific vocabulary

Keywords

audio audio-deepfake-detection deep-learning deepfake-detection

Last synced: 6 months ago

Repository


Basic Info

Host: GitHub
Owner: MarkHershey
License: mit
Language: Python
Default Branch: master
Homepage: https://markhh.com/AudioDeepFakeDetection/
Size: 197 MB

Statistics

Stars: 76
Watchers: 3
Forks: 19
Open Issues: 0
Releases: 0

Topics

audio audio-deepfake-detection deep-learning deepfake-detection

Created almost 4 years ago · Last pushed about 2 years ago

Metadata Files

Readme License Citation

Audio Deep Fake Detection

A Course Project for SUTD 50.039 Theory and Practice of Deep Learning (2022 Spring)

Created by Mark He Huang, Peiyuan Zhang, James Raphael Tiovalen, Madhumitha Balaji, and Shyam Sridhar.

Check out our: Project Report | Interactive Website

Setup Environment

```bash
# Set up Python virtual environment
python3 -m venv venv && source venv/bin/activate

# Make sure your pip is up to date
pip install -U pip wheel setuptools

# Install required dependencies
pip install -r requirements.txt
```

Install PyTorch that suits your machine: https://pytorch.org/get-started/locally/

Setup Datasets

You may download the datasets used in the project from the following URLs:

- (Real) Human Voice Dataset: LJ Speech (v1.1)
    - This dataset consists of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books.
- (Fake) Synthetic Voice Dataset: WaveFake (v1.20)
    - This dataset consists of 104,885 generated audio clips (16-bit PCM WAV).

After downloading the datasets, you may extract them under data/real and data/fake respectively. In the end, the data directory should look like this:

    data
    ├── real
    │   └── wavs
    └── fake
        ├── common_voices_prompts_from_conformer_fastspeech2_pwg_ljspeech
        ├── jsut_multi_band_melgan
        ├── jsut_parallel_wavegan
        ├── ljspeech_full_band_melgan
        ├── ljspeech_hifiGAN
        ├── ljspeech_melgan
        ├── ljspeech_melgan_large
        ├── ljspeech_multi_band_melgan
        ├── ljspeech_parallel_wavegan
        └── ljspeech_waveglow
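Before training, it can help to confirm that the extracted datasets match this layout. The following is a minimal sketch, not part of the repository: the helper name `missing_data_dirs` is hypothetical, and it assumes `wavs` sits under `data/real` and every generator folder sits directly under `data/fake`.

```python
from pathlib import Path

# Generator sub-directories copied from the layout described above.
EXPECTED_FAKE_DIRS = [
    "common_voices_prompts_from_conformer_fastspeech2_pwg_ljspeech",
    "jsut_multi_band_melgan",
    "jsut_parallel_wavegan",
    "ljspeech_full_band_melgan",
    "ljspeech_hifiGAN",
    "ljspeech_melgan",
    "ljspeech_melgan_large",
    "ljspeech_multi_band_melgan",
    "ljspeech_parallel_wavegan",
    "ljspeech_waveglow",
]


def missing_data_dirs(data_root="data"):
    """Return the expected directories that are absent under data_root.

    Hypothetical helper: it only checks that directories exist,
    not that they contain the right audio files.
    """
    root = Path(data_root)
    expected = [root / "real" / "wavs"]
    expected += [root / "fake" / name for name in EXPECTED_FAKE_DIRS]
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing directories:")
        for path in missing:
            print("  " + path)
    else:
        print("Data layout looks complete.")
```

An empty list means every expected directory was found.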

Model Checkpoints

You may download the model checkpoints from here: Google Drive. Unzip the files and replace the saved directory with the extracted files.

Training

Use the train.py script to train the model.

```
usage: train.py [-h] [--real_dir REAL_DIR] [--fake_dir FAKE_DIR] [--batch_size BATCH_SIZE]
                [--epochs EPOCHS] [--seed SEED] [--feature_classname {wave,lfcc,mfcc}]
                [--model_classname {MLP,WaveRNN,WaveLSTM,SimpleLSTM,ShallowCNN,TSSD}]
                [--in_distribution {True,False}] [--device DEVICE] [--deterministic]
                [--restore] [--eval_only] [--debug] [--debug_all]

optional arguments:
  -h, --help            show this help message and exit
  --real_dir REAL_DIR, --real REAL_DIR
                        Directory containing real data. (default: 'data/real')
  --fake_dir FAKE_DIR, --fake FAKE_DIR
                        Directory containing fake data. (default: 'data/fake')
  --batch_size BATCH_SIZE
                        Batch size. (default: 256)
  --epochs EPOCHS       Number of maximum epochs to train. (default: 20)
  --seed SEED           Random seed. (default: 42)
  --feature_classname {wave,lfcc,mfcc}
                        Feature classname. (default: 'lfcc')
  --model_classname {MLP,WaveRNN,WaveLSTM,SimpleLSTM,ShallowCNN,TSSD}
                        Model classname. (default: 'ShallowCNN')
  --in_distribution {True,False}, --in_dist {True,False}
                        Whether to use in distribution experiment setup. (default: True)
  --device DEVICE       Device to use. (default: 'cuda' if possible)
  --deterministic       Whether to use deterministic training (reproducible results).
  --restore             Whether to restore from checkpoint.
  --eval_only           Whether to evaluate only.
  --debug               Whether to use debug mode.
  --debug_all           Whether to use debug mode for all models.
```
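For readers who want to script around these options, the help text above can be approximated with `argparse`. This is a sketch reconstructed from the help text only, not the actual parser in `train.py`. Flag spellings follow the example commands in this README (such as `--batch_size` and `--debug_all`); `--real_dir`, `--fake_dir` and `--in_dist` are assumed to use the same underscore convention. The helper names `str2bool` and `build_parser` are hypothetical.

```python
import argparse


def str2bool(value):
    """Convert the literal strings 'True' / 'False' to booleans."""
    if value in ("True", "False"):
        return value == "True"
    raise argparse.ArgumentTypeError("expected True or False, got %r" % value)


def build_parser():
    """Approximate the train.py command line (assumption, see lead-in)."""
    p = argparse.ArgumentParser(prog="train.py")
    p.add_argument("--real_dir", "--real", default="data/real",
                   help="Directory containing real data.")
    p.add_argument("--fake_dir", "--fake", default="data/fake",
                   help="Directory containing fake data.")
    p.add_argument("--batch_size", type=int, default=256, help="Batch size.")
    p.add_argument("--epochs", type=int, default=20,
                   help="Number of maximum epochs to train.")
    p.add_argument("--seed", type=int, default=42, help="Random seed.")
    p.add_argument("--feature_classname", choices=["wave", "lfcc", "mfcc"],
                   default="lfcc", help="Feature classname.")
    p.add_argument("--model_classname", default="ShallowCNN",
                   choices=["MLP", "WaveRNN", "WaveLSTM", "SimpleLSTM",
                            "ShallowCNN", "TSSD"],
                   help="Model classname.")
    p.add_argument("--in_distribution", "--in_dist", type=str2bool,
                   default=True, metavar="{True,False}",
                   help="Whether to use in distribution experiment setup.")
    # None stands in for "use 'cuda' if possible"; resolving it is left out.
    p.add_argument("--device", default=None, help="Device to use.")
    for flag in ("--deterministic", "--restore", "--eval_only",
                 "--debug", "--debug_all"):
        p.add_argument(flag, action="store_true")
    return p


if __name__ == "__main__":
    args = build_parser().parse_args(["--real", "data/real", "--batch_size", "128"])
    print(args.real_dir, args.batch_size, args.model_classname)
```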

Example:

To make sure all models can run successfully on your device, you can run the following command to test:

```bash
python train.py --debug_all
```

To train the model ShallowCNN with lfcc features in the in-distribution setting, you can run the following command:

```bash
python train.py --real data/real --fake data/fake --batch_size 128 --epochs 20 --seed 42 --feature_classname lfcc --model_classname ShallowCNN
```

Set the environment variable CUDA_VISIBLE_DEVICES inline to select which GPU device(s) to use. For example:

```bash
CUDA_VISIBLE_DEVICES=0 python train.py
```
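To see how such a setting interacts with the `--device` default ("'cuda' if possible"), the sketch below reads `CUDA_VISIBLE_DEVICES` and falls back to the CPU when PyTorch or a GPU is unavailable. Both helpers, `visible_gpu_ids` and `pick_device`, are hypothetical and are not taken from `train.py`; only `torch.cuda.is_available()` is a real PyTorch call.

```python
import os


def visible_gpu_ids(env=None):
    """Return the entries of CUDA_VISIBLE_DEVICES, or None if it is unset.

    Entries are kept as strings because the variable may hold
    device indices or device UUIDs.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # unset: all GPUs stay visible
    return [part.strip() for part in raw.split(",") if part.strip()]


def pick_device(requested=None):
    """Choose a device string: an explicit request wins, else CUDA if usable."""
    if requested:
        return requested
    try:
        import torch  # imported lazily so the sketch runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print("Visible GPUs:", visible_gpu_ids())
    print("Selected device:", pick_device())
```

An empty value for the variable yields an empty list, which hides every GPU from CUDA.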

Evaluation

By default, the test set is used directly for validation during training, and the best model and its predictions are saved automatically in the saved directory during training and testing. See the saved directory for the evaluation results.

To evaluate a trained model on the test set, you can run the following command:

```bash
python train.py --feature_classname lfcc --model_classname ShallowCNN --restore --eval_only
```

Run the following command to re-compute the evaluation results based on saved predictions and labels:

```bash
python metrics.py
```
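As an illustration of what recomputing results from saved predictions and labels can involve, here is a minimal NumPy sketch of two metrics commonly used for spoof detection: accuracy and equal error rate (EER). This README does not state which metrics `metrics.py` reports or how predictions are stored, so the function names, the label convention (1 = fake, 0 = real) and the score convention (higher = more likely fake) are all assumptions.

```python
import numpy as np


def accuracy(labels, scores, threshold=0.5):
    """Fraction of clips classified correctly at a fixed score threshold."""
    labels = np.asarray(labels, dtype=int)
    preds = (np.asarray(scores, dtype=float) >= threshold).astype(int)
    return float((preds == labels).mean())


def equal_error_rate(labels, scores):
    """Approximate EER: the error rate where false positive and false
    negative rates are closest, searched over the observed scores."""
    labels = np.asarray(labels, dtype=int)
    scores = np.asarray(scores, dtype=float)
    pos, neg = labels == 1, labels == 0
    if pos.sum() == 0 or neg.sum() == 0:
        raise ValueError("both classes must be present")
    best_gap, eer = np.inf, 1.0
    # The extra +inf threshold covers the "predict nothing as fake" case.
    for t in np.append(np.unique(scores), np.inf):
        pred_pos = scores >= t
        fpr = (pred_pos & neg).sum() / neg.sum()
        fnr = (~pred_pos & pos).sum() / pos.sum()
        gap = abs(fpr - fnr)
        if gap < best_gap:
            best_gap, eer = gap, (fpr + fnr) / 2
    return float(eer)


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.6, 0.4, 0.9]
    print("accuracy:", accuracy(labels, scores))
    print("EER:", equal_error_rate(labels, scores))
```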

Acknowledgements

We thank Dr. Matthieu De Mari and Prof. Berrak Sisman for their teaching and guidance.
We thank Joel Frank and Lea Schönherr. Our code is partially adapted from their repository WaveFake.
We thank Prof. Liu Jun for providing GPU resources for conducting experiments for this project.

License

Our project is licensed under the MIT License.

Owner

Name: Mark Huang
Login: MarkHershey
Kind: user
Location: Singapore

Website: markhh.com
Twitter: markkkhh
Repositories: 17
Profile: https://github.com/MarkHershey

ML Research | PhD Student at SUTD

GitHub Events

Total

Watch event: 7
Fork event: 3

Last Year

Watch event: 7
Fork event: 3

Committers

Last synced: 9 months ago

All Time

Total Commits: 99
Total Committers: 5
Avg Commits per committer: 19.8
Development Distribution Score (DDS): 0.404

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Mark Huang	h**s@g**m	59
James R T	j**o@g**m	24
SHSR2001	s**t@g**m	7
github-actions	g**s@g**m	6
Madhu-balaji-01	m**3@g**m	3
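
The summary figures above can be reproduced from the commit counts in this table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits. That definition is an assumption, but it agrees with the reported 0.404. The function name `committer_stats` is hypothetical.

```python
# Commit counts copied from the Top Committers table.
COMMITS = {
    "Mark Huang": 59,
    "James R T": 24,
    "SHSR2001": 7,
    "github-actions": 6,
    "Madhu-balaji-01": 3,
}


def committer_stats(commits):
    """Return (total commits, average per committer, DDS).

    DDS is taken here to be 1 - (top committer's commits / total commits),
    an assumed definition that matches the figures reported on this page.
    """
    total = sum(commits.values())
    average = total / len(commits)
    dds = 1 - max(commits.values()) / total
    return total, average, dds


if __name__ == "__main__":
    total, average, dds = committer_stats(COMMITS)
    print("Total commits: %d" % total)
    print("Avg commits per committer: %.1f" % average)
    print("DDS: %.3f" % dds)
```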

Committer Domains (Top 20 + Academic)

github.com: 1

Issues and Pull Requests

Last synced: 9 months ago

All Time

Total issues: 8
Total pull requests: 2
Average time to close issues: 26 days
Average time to close pull requests: 1 minute
Total issue authors: 7
Total pull request authors: 1
Average comments per issue: 2.25
Average comments per pull request: 0.0
Merged pull requests: 2
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

pradeepkc11 (2)
Miapata (1)
ronakker (1)
Epanhua622 (1)
FaisalAhmed-NSL (1)
VictorMEY (1)
soumyajee (1)

Pull Request Authors

MarkHershey (2)

Top Labels

Issue Labels

Pull Request Labels

Dependencies

.github/workflows/compile_results.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

requirements.txt pypi

librosa *
matplotlib *
numpy *
puts ==0.0.8
scikit-learn *
scipy *
torchinfo *
