Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: rizwann2912
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 6.15 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 9 months ago · Last pushed 9 months ago
Metadata Files
Readme License Citation

README.md

Audio Deep Fake Detection

Created by Mohd Rizwan and Niresh Kumar

Check out our: Project Report | Interactive Website

Setup Environment

```bash
# Set up Python virtual environment
python3 -m venv venv && source venv/bin/activate

# Make sure your PIP is up to date
pip install -U pip wheel setuptools

# Install required dependencies
pip install -r requirements.txt
```

Setup Datasets

You may download the datasets used in the project from the following URLs:

  • (Real) Human Voice Dataset: LJ Speech (v1.1)
    • This dataset consists of 13,100 short audio clips of a single speaker reading passages from 7 non-fiction books.
  • (Fake) Synthetic Voice Dataset: WaveFake (v1.20)
    • The dataset consists of 104,885 generated audio clips (16-bit PCM wav).

After downloading the datasets, you may extract them under data/real and data/fake respectively. In the end, the data directory should look like this:

    data
    ├── real
    │   └── wavs
    └── fake
        ├── common_voices_prompts_from_conformer_fastspeech2_pwg_ljspeech
        ├── jsut_multi_band_melgan
        ├── jsut_parallel_wavegan
        ├── ljspeech_full_band_melgan
        ├── ljspeech_hifiGAN
        ├── ljspeech_melgan
        ├── ljspeech_melgan_large
        ├── ljspeech_multi_band_melgan
        ├── ljspeech_parallel_wavegan
        └── ljspeech_waveglow
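Before training, it can help to confirm that the extracted datasets match the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_data_dirs` is hypothetical, and the directory names are copied from the tree shown above.

```python
from pathlib import Path

# Sub-directories of data/fake, copied from the layout shown above.
EXPECTED_FAKE_DIRS = [
    "common_voices_prompts_from_conformer_fastspeech2_pwg_ljspeech",
    "jsut_multi_band_melgan",
    "jsut_parallel_wavegan",
    "ljspeech_full_band_melgan",
    "ljspeech_hifiGAN",
    "ljspeech_melgan",
    "ljspeech_melgan_large",
    "ljspeech_multi_band_melgan",
    "ljspeech_parallel_wavegan",
    "ljspeech_waveglow",
]


def missing_data_dirs(data_root="data"):
    """Return the expected sub-directories that are absent under data_root.

    Hypothetical helper for illustration; it only checks that the
    directories exist, not that they contain the right audio files.
    """
    root = Path(data_root)
    expected = [root / "real" / "wavs"]
    expected += [root / "fake" / name for name in EXPECTED_FAKE_DIRS]
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing directories:")
        for path in missing:
            print("  " + path)
    else:
        print("Data layout looks complete.")
```

An empty return value means every directory from the layout is present.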

Model Checkpoints

You may download the model checkpoints from here: Google Drive. Unzip the files and replace the saved directory with the extracted files.

Training

Use the train.py script to train the model.

```
usage: train.py [-h] [--real_dir REAL_DIR] [--fake_dir FAKE_DIR]
                [--batch_size BATCH_SIZE] [--epochs EPOCHS] [--seed SEED]
                [--feature_classname {wave,lfcc,mfcc}]
                [--model_classname {MLP,WaveRNN,WaveLSTM,SimpleLSTM,ShallowCNN,TSSD}]
                [--in_distribution {True,False}] [--device DEVICE]
                [--deterministic] [--restore] [--eval_only] [--debug]
                [--debug_all]

optional arguments:
  -h, --help            show this help message and exit
  --real_dir REAL_DIR, --real REAL_DIR
                        Directory containing real data. (default: 'data/real')
  --fake_dir FAKE_DIR, --fake FAKE_DIR
                        Directory containing fake data. (default: 'data/fake')
  --batch_size BATCH_SIZE
                        Batch size. (default: 256)
  --epochs EPOCHS       Number of maximum epochs to train. (default: 20)
  --seed SEED           Random seed. (default: 42)
  --feature_classname {wave,lfcc,mfcc}
                        Feature classname. (default: 'lfcc')
  --model_classname {MLP,WaveRNN,WaveLSTM,SimpleLSTM,ShallowCNN,TSSD}
                        Model classname. (default: 'ShallowCNN')
  --in_distribution {True,False}, --in_dist {True,False}
                        Whether to use in distribution experiment setup. (default: True)
  --device DEVICE       Device to use. (default: 'cuda' if possible)
  --deterministic       Whether to use deterministic training (reproducible results).
  --restore             Whether to restore from checkpoint.
  --eval_only           Whether to evaluate only.
  --debug               Whether to use debug mode.
  --debug_all           Whether to use debug mode for all models.
```
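For readers who want to see how options like these are typically declared, here is a minimal `argparse` sketch covering a subset of the flags. It is an illustration only and not the code in train.py: the `build_parser` and `str_to_bool` helpers are hypothetical, and the flag spellings and defaults are taken from the help text and example commands in this README.

```python
import argparse


def str_to_bool(value):
    """Parse the literal strings 'True' / 'False' into booleans."""
    if value not in ("True", "False"):
        raise argparse.ArgumentTypeError("expected True or False")
    return value == "True"


def build_parser():
    """Build a parser for a subset of the documented options (illustrative)."""
    parser = argparse.ArgumentParser(prog="train.py")
    # Only the short aliases used in the example commands are declared here.
    parser.add_argument("--real", dest="real_dir", default="data/real")
    parser.add_argument("--fake", dest="fake_dir", default="data/fake")
    parser.add_argument("--batch_size", type=int, default=256)
    parser.add_argument("--epochs", type=int, default=20)
    parser.add_argument("--seed", type=int, default=42)
    parser.add_argument(
        "--feature_classname", choices=["wave", "lfcc", "mfcc"], default="lfcc"
    )
    parser.add_argument(
        "--model_classname",
        choices=["MLP", "WaveRNN", "WaveLSTM", "SimpleLSTM", "ShallowCNN", "TSSD"],
        default="ShallowCNN",
    )
    parser.add_argument("--in_distribution", type=str_to_bool, default=True)
    parser.add_argument("--restore", action="store_true")
    parser.add_argument("--eval_only", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--batch_size", "128", "--model_classname", "ShallowCNN"]
    )
    print(args)
```

Passing `choices` lets `argparse` reject an unknown feature or model name before any data is loaded.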

Example:

To make sure all models can run successfully on your device, you can run the following command to test:

    python train.py --debug_all

To train the model ShallowCNN with lfcc features in the in-distribution setting, you can run the following command:

    python train.py --real data/real --fake data/fake --batch_size 128 --epochs 20 --seed 42 --feature_classname lfcc --model_classname ShallowCNN

Use the inline environment variable CUDA_VISIBLE_DEVICES to specify which GPU device(s) to use. For example:

    CUDA_VISIBLE_DEVICES=0 python train.py
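If you launch training from another Python script rather than from a shell, the same variable can be set on the child process. This is a minimal sketch under that assumption; the helper names `env_with_visible_gpus` and `run_training` are hypothetical and not part of the repository.

```python
import os
import subprocess
import sys


def env_with_visible_gpus(gpu_ids):
    """Return a copy of the current environment exposing only the given GPU indices."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_training(gpu_ids, extra_args=()):
    """Launch train.py in a child process restricted to gpu_ids.

    Assumes train.py is in the current working directory.
    """
    cmd = [sys.executable, "train.py", *extra_args]
    return subprocess.run(cmd, env=env_with_visible_gpus(gpu_ids), check=True)
```

For example, `run_training([0], ["--epochs", "20"])` would correspond to the shell command above with `--epochs 20` appended. The parent process environment is left unchanged.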

Evaluation

By default, the test set is used directly for validation during training, and the best model and the best predictions are saved automatically in the saved directory during training/testing. See the saved directory for the evaluation results.

To evaluate a trained model on the test set, you can run the following command:

    python train.py --feature_classname lfcc --model_classname ShallowCNN --restore --eval_only

Run the following command to re-compute the evaluation results based on saved predictions and labels:

    python metrics.py
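This README does not list which metrics metrics.py reports. As a rough illustration of re-computing results from saved predictions and labels, the sketch below derives common binary classification metrics in plain Python. The function name `binary_metrics` and the convention that 1 marks the positive class are assumptions made for this example only.

```python
def binary_metrics(labels, predictions):
    """Compute accuracy, precision, recall and F1 for 0/1 labels and predictions.

    Assumption: 1 is the positive class. Which class the project treats
    as positive (real or fake) is not stated in this README.
    """
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("labels must not be empty")

    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 1, 0]))
```

Computing the metrics from stored labels and predictions, rather than re-running the model, keeps the evaluation step cheap and repeatable.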

License

Our project is licensed under the MIT License.

Owner

  • Login: rizwann2912
  • Kind: user

Citation (CITATION.cff)

# YAML 1.2
---
cff-version: 1.2.0
message: "If you use this repository, please cite it as below."
authors:
    - affiliation: "Singapore University of Technology and Design"
      family-names: "Huang"
      given-names: "Mark He"
    - affiliation: "Singapore University of Technology and Design"
      family-names: "Zhang"
      given-names: "Peiyuan"
    - affiliation: "Singapore University of Technology and Design"
      family-names: "Tiovalen"
      given-names: "James Raphael"
    - affiliation: "Singapore University of Technology and Design"
      family-names: "Balaji"
      given-names: "Madhumitha"
    - affiliation: "Singapore University of Technology and Design"
      family-names: "Shyam"
      given-names: "Sridhar"
title: "Audio Deep Fake Detection"
version: 0.0.1
date-released: 2025-04-25
url: "https://github.com/rizwann2912/Audio-DeepfakeDetetction"

GitHub Events

Total
  • Watch event: 1
  • Push event: 4
  • Create event: 1
Last Year
  • Watch event: 1
  • Push event: 4
  • Create event: 1

Dependencies

requirements.txt pypi
  • flask *
  • librosa *
  • matplotlib *
  • numpy *
  • puts ==0.0.8
  • scikit-learn *
  • scipy *
  • torchinfo *