https://github.com/google-deepmind/slowfast_nfnets
Science Score: 10.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org, ieee.org)
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 10.5%, to scientific vocabulary)
Repository
Basic Info
- Host: GitHub
- Owner: google-deepmind
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 13.7 KB
Statistics
- Stars: 30
- Watchers: 3
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Towards Learning Universal Audio Representations
In Towards Learning Universal Audio Representations, we introduce a Holistic Audio Representation Evaluation Suite (HARES), containing 12 downstream tasks spanning the speech, music, and environmental sound domains, with the hope that this will spur research on developing better models for universal audio representations. Together with the benchmark, we also propose a new Slowfast NFNet architecture in the paper.
HARES tasks
Below is a summary of all 12 HARES tasks, with links for obtaining these freely available datasets. Note that the labels of the original test sets of Birdsong and TUT18 are not publicly available; we therefore use the splits created by the authors of Pre-Training Audio Representations with Self-Supervision, which are based on the original training subsets. For more details on how to assemble these tasks, please refer to Appendix A of the arXiv version of our paper.
| Dataset | Task | #Samples | #Classes | Domain |
|----------|:-------------|------:|------:|:------|
| AudioSet | audio tagging | 1.9m | 527 | environment |
| Birdsong | animal sound | 36k | 2 | environment |
| TUT18 | acoustic scenes | 8.6k | 10 | environment |
| ESC-50 | acoustic scenes | 2.0k | 50 | environment |
| Speech Commands v1 | keyword | 90k | 12 | speech |
| Speech Commands v2 | keyword | 96k | 35 | speech |
| Fluent Speech Commands | intention | 27k | 31 | speech |
| VoxForge | language id | 145k | 6 | speech |
| VoxCeleb | speaker id | 147k | 1251 | speech |
| NSynth-instrument | instrument id | 293k | 11 | music |
| NSynth-pitch | pitch estimation | 293k | 128 | music |
| MagnaTagATune | music tagging | 26k | 50 | music |
Audio Slowfast NFNets, a JAX implementation
We provide a JAX/Haiku implementation of the Slowfast NFNet-F0. This convolutional neural network combines the ability of Slowfast networks to model both transient and long-range signals in audio with the strong, accelerator-optimized performance of NFNets. It achieves state-of-the-art results on the HARES benchmark.
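The two-pathway idea above can be sketched in a few lines of NumPy. This is a conceptual illustration only, not the paper's architecture: the fast pathway keeps the full temporal resolution of the input spectrogram while the slow pathway subsamples time by a stride `alpha`, and the pathway ratio and the mean-pool "networks" here are stand-in assumptions.

```python
# Conceptual sketch (not the repository's implementation): a Slowfast-style
# model processes the input at two temporal resolutions and fuses the results.
import numpy as np

def two_pathway_features(spectrogram, alpha=8):
    """spectrogram: (time, freq) array; alpha: temporal stride of the slow path."""
    fast = spectrogram            # full temporal resolution, fine transients
    slow = spectrogram[::alpha]   # heavily subsampled view, long-range context
    # In the real model each pathway runs through its own conv stack;
    # here we just pool each view to a fixed-size embedding and concatenate.
    fast_emb = fast.mean(axis=0)
    slow_emb = slow.mean(axis=0)
    return np.concatenate([fast_emb, slow_emb])

spec = np.random.randn(1000, 64)  # e.g. 1000 frames x 64 mel bins
emb = two_pathway_features(spec)
print(emb.shape)                  # (128,)
```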
You may use our unit tests to check your development environment and to learn more about how the models are used. They can be executed with pytest:

```bash
$ pip install -r requirements.txt
$ python -m pytest [-n <NUMCPUS>] slowfast_nfnets
```
Usage
The unit tests provided with the model show a few ways in which it can be run.
Citing this work
BibTeX for citing the paper:

```bibtex
@inproceedings{wang2022towards,
  title={Towards Learning Universal Audio Representations},
  author={Wang, Luyu and Luc, Pauline and Wu, Yan and Recasens, Adria and Smaira, Lucas and Brock, Andrew and Jaegle, Andrew and Alayrac, Jean-Baptiste and Dieleman, Sander and Carreira, Joao and van den Oord, Aaron},
  booktitle={IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  pages={4593--4597},
  year={2022},
  organization={IEEE}
}
```
Disclaimer
This is not an official Google product.
Owner
- Name: Google DeepMind
- Login: google-deepmind
- Kind: organization
- Website: https://www.deepmind.com/
- Repositories: 245
- Profile: https://github.com/google-deepmind
Dependencies
- chex >=0.0.6
- dm-haiku *
- numpy *
- pytest-xdist *