leaf-pytorch

PyTorch implementation of the LEAF audio frontend

https://github.com/sarthakyadav/leaf-pytorch

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (11.6%) to scientific vocabulary


Repository


Basic Info

Host: GitHub
Owner: SarthakYadav
Language: Python
Default Branch: master
Size: 135 KB

Statistics

Stars: 71
Watchers: 2
Forks: 9
Open Issues: 2
Releases: 0

Created about 4 years ago · Last pushed almost 3 years ago

Metadata Files

Readme License Citation

leaf-pytorch

Sponsors
Attention
About
Key Points
Dependencies
Running experiments
Results
Loading Pretrained Models
References

Attention

The leaf-pytorch implementation is now officially a part of SpeechBrain, with a sample recipe on SpeechCommands-v2 here. I recommend that anyone working with LEAF use the SpeechBrain implementation instead, for its overall ecosystem and better documentation. Thanks for your interest!

About

This is a PyTorch implementation of the LEAF audio frontend [1], written using the official TensorFlow implementation as a direct reference.
This implementation supports training on TPUs using torch-xla.

Key Points

- Will be evaluated on the AudioSet, SpeechCommands and VoxCeleb1 datasets, and pretrained weights will be made available.
- Currently, torch-xla has some issues with certain complex64 operations: torch.view_as_real(comp), comp.real, comp.imag, as highlighted in Issue #3070. These are used primarily for generating Gabor impulse responses. To bypass this shortcoming, an alternate implementation using manual complex number operations is provided.
- Matched performance on SpeechCommands; experiments on other datasets are ongoing.
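To illustrate what "manual complex number operations" can look like, here is a minimal sketch that builds a Gabor impulse response as separate real and imaginary parts, so no complex dtype is needed. It uses plain Python rather than tensors to stay self-contained; the function names, the radians-per-sample convention for the center frequency, and the example values are assumptions for illustration, not the repository's actual code.

```python
import math


def gabor_impulse_response(center, sigma, size):
    """Return (real, imag) lists for one Gabor filter, using real arithmetic only.

    center: center frequency in radians per sample (assumed convention)
    sigma:  width of the Gaussian envelope, in samples
    size:   number of taps (odd, so the window is centered on t = 0)
    """
    half = size // 2
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    real, imag = [], []
    for t in range(-half, half + 1):
        envelope = norm * math.exp(-(t * t) / (2.0 * sigma * sigma))
        # exp(i * center * t) = cos(center * t) + i * sin(center * t)
        real.append(envelope * math.cos(center * t))
        imag.append(envelope * math.sin(center * t))
    return real, imag


def complex_mul(a_re, a_im, b_re, b_im):
    """Multiply (a_re + i*a_im) by (b_re + i*b_im) without a complex dtype."""
    return a_re * b_re - a_im * b_im, a_re * b_im + a_im * b_re


# Hypothetical example values, chosen only for illustration.
real, imag = gabor_impulse_response(center=0.5, sigma=4.0, size=9)
```

The same elementwise arithmetic carries over to tensors: keeping the real and imaginary parts as two real-valued arrays avoids the complex64 operations listed above.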

Dependencies

- torch >= 1.9.0
- torchaudio >= 0.9.0
- torch-audiomentations==0.9.0
- SoundFile==0.10.3.post1
- msgpack
- msgpack-numpy
- wandb
- transformers
- lmdb
- [Optional] torch_xla == 1.9

Additional dependencies include

```
# needed for augmentations
WavAugment
```

Running experiments

Setup

The only thing the cfgs (such as the efficientnet-b0 default cfg) need is a path to the "metaroot" under the data section. The meta dir needs to contain a file manifest for each split as well as a lblmap. A sample meta dir for SpeechCommands can be found here
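As a rough illustration of such a meta dir, the sketch below writes one CSV manifest per split plus a label map. The file names (`lbl_map.json`, `<split>.csv`), the column names, and the audio paths are all hypothetical; check the sample meta dir for the layout the repository actually expects.

```python
import csv
import json
import os
import tempfile


def write_meta_dir(meta_dir, splits, labels):
    """Write one CSV manifest per split plus a label map (hypothetical layout)."""
    os.makedirs(meta_dir, exist_ok=True)
    # Map each label to an integer index, in sorted order for reproducibility.
    lbl_map = {label: idx for idx, label in enumerate(sorted(labels))}
    with open(os.path.join(meta_dir, "lbl_map.json"), "w") as fp:
        json.dump(lbl_map, fp, indent=2)
    for split, rows in splits.items():
        with open(os.path.join(meta_dir, f"{split}.csv"), "w", newline="") as fp:
            writer = csv.writer(fp)
            writer.writerow(["files", "labels"])  # assumed column names
            writer.writerows(rows)
    return lbl_map


# Made-up file paths and labels, for illustration only.
meta_dir = os.path.join(tempfile.mkdtemp(), "speechcommands_v2_meta")
splits = {
    "train": [("audio/yes/0001.wav", "yes"), ("audio/no/0002.wav", "no")],
    "test": [("audio/yes/0003.wav", "yes")],
}
lbl_map = write_meta_dir(meta_dir, splits, labels={"yes", "no"})
```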

Training

To train a model on SpeechCommands, run the following:

    python train.py --cfg_file cfgs/speechcommands/efficientnet-b0-leaf-default.cfg \
        --expdir ./exps/scv2/efficientnet-b0_default_leaf_bs1x256_adam_warmupcosine_wd_1e-4_rs8881 \
        --epochs 100 --num_workers 8 --log_steps 50 --random_seed 8881 --no_wandb

Testing

To evaluate the trained model, run:

    python test.py --test_csv_name ./speechcommands_v2_meta/test.csv \
        --exp_dir ./exps/scv2/efficientnet-b0_default_leaf_bs1x256_adam_warmupcosine_wd_1e-4_rs8881 \
        --meta_dir ./speechcommands_v2_meta

Results

All experiments on VoxCeleb1 and SpeechCommands were repeated at least 5 times, and 95% confidence intervals (CI) are reported.

| Model | Dataset | Metric | Features | Official | This repo | Weights |
| ----- | ----- | ----- | ----- | ----- | ----- | ----- |
| EfficientNet-b0 | SpeechCommands v2 | Accuracy | LEAF | 93.4±0.3 | 94.5±0.3 | ckpt |
| ResNet-18 | SpeechCommands v2 | Accuracy | LEAF | N/A | 94.05±0.3 | ckpt |
| EfficientNet-b0 | VoxCeleb1 | Accuracy | LEAF | 33.1±0.7 | 40.9±1.8 | ckpt |
| ResNet-18 | VoxCeleb1 | Accuracy | LEAF | N/A | 44.7±2.9 | ckpt |
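For readers who want to produce a comparable "mean ± 95% CI" figure from their own repeated runs, here is one common way to do it, using a Student's t interval over per-run accuracies. This is only a sketch: the source does not state which CI method was used, and the accuracies below are made-up numbers, not results from this repository.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, indexed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def mean_ci95(values):
    """Return (mean, half_width) of a 95% t-interval for 2 to 10 runs."""
    n = len(values)
    mean = statistics.mean(values)
    # Standard error of the mean from the sample standard deviation.
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, T_CRIT_95[n - 1] * sem


# Made-up accuracies from five hypothetical runs, for illustration only.
accs = [93.9, 94.3, 94.1, 94.5, 94.2]
mean, half_width = mean_ci95(accs)
print(f"{mean:.1f}±{half_width:.1f}")
```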

Observations

ResNet-18 likely works better than EfficientNet-b0 on VoxCeleb1 simply because VoxCeleb1 is a more difficult task than SpeechCommands and ResNet-18 has more parameters.

Evaluating different init schemes for `complex_conv` init

To evaluate how non-Mel initialization schemes for complex_conv perform, the SpeechCommands experiments were repeated with the xavier_normal, kaiming_normal and randn init schemes.

| Model | Features | Init | Test Accuracy |
| ----- | ----- | ----- | ----- |
| EfficientNet-b0 | LEAF | Default (Mel) | 94.5±0.3 |
| EfficientNet-b0 | LEAF | randn | 84.7±1.6 |
| EfficientNet-b0 | LEAF | kaiming_normal | 84.7±2.3 |
| EfficientNet-b0 | LEAF | xavier_normal | 79.1±0.7 |
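For context on what a Mel-based starting point means, the sketch below computes center frequencies that are evenly spaced on the mel scale, which the random schemes above do not give. It assumes the common HTK-style mel formula; the filter count and frequency range are illustrative values, and the repository's actual initializer (which also has to set filter bandwidths) may differ.

```python
import math


def hz_to_mel(f):
    """HTK-style Hz to mel conversion (assumed formula)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, min_freq, max_freq):
    """Center frequencies (Hz) evenly spaced on the mel scale, edges excluded."""
    lo, hi = hz_to_mel(min_freq), hz_to_mel(max_freq)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(n_filters)]


# Illustrative values only: 40 filters between 60 Hz and 7800 Hz.
centers = mel_center_frequencies(n_filters=40, min_freq=60.0, max_freq=7800.0)
```

Mel spacing packs filters densely at low frequencies and sparsely at high ones, which is one way such an initialization differs from a random draw.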

Loading Pretrained Models

Download and extract the desired ckpt from Results.

```python
import os
import pickle

import torch

from models.classifier import Classifier

# Set results_dir to the extracted results folder and fill in the checkpoint
# file name in ckpt_path; both are left blank here.
results_dir = ""
hparams_path = os.path.join(results_dir, "hparams.pickle")
ckpt_path = os.path.join(results_dir, "ckpts", "")
checkpoint = torch.load(ckpt_path)
with open(hparams_path, "rb") as fp:
    hparams = pickle.load(fp)
model = Classifier(hparams.cfg)
# load_state_dict reports any missing or unexpected keys
print(model.load_state_dict(checkpoint['model_state_dict']))

# to access just the pretrained LEAF frontend
frontend = model.features
```

References

[1] If you use this repository, kindly cite the LEAF paper:

    @article{zeghidour2021leaf,
      title={LEAF: A Learnable Frontend for Audio Classification},
      author={Zeghidour, Neil and Teboul, Olivier and de Chaumont Quitry, F{\'e}lix and Tagliasacchi, Marco},
      journal={ICLR},
      year={2021}
    }

Please also consider citing this implementation using the following bibtex or from the citation widget on the sidebar.

    @software{Yadav_leaf-pytorch_2021,
      author = {Yadav, Sarthak},
      month = {12},
      title = {{leaf-pytorch}},
      version = {0.0.1},
      year = {2021}
    }

Owner

Name: Sarthak Yadav
Login: SarthakYadav
Kind: user
Location: Martigny, Switzerland
Company: IDIAP

Website: https://sarthakyadav.github.io/
Twitter: yadav_sar
Repositories: 5
Profile: https://github.com/SarthakYadav

Research Intern at IDIAP Research Institute | MSc(R), University of Glasgow

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Yadav
    given-names: Sarthak
    orcid: https://orcid.org/0000-0002-1979-9460
title: "leaf-pytorch"
version: 0.0.1
date-released: 2021-12-07
