https://github.com/google-research/perch
Science Score: 49.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 1 DOI reference in README
- ✓ Academic publication links: links to arxiv.org, nature.com, frontiersin.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.1%) to scientific vocabulary
Keywords from Contributors
Repository
Basic Info
- Host: GitHub
- Owner: google-research
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 14.8 MB
Statistics
- Stars: 244
- Watchers: 10
- Forks: 53
- Open Issues: 23
- Releases: 1
Metadata Files
README.md
Perch
A bioacoustics research project.
Directory of Things
We have published quite a few things which utilize this repository!
Perch (and SurfPerch!)
We provide a bird species classifier trained on over 10k species.
- The current released Perch model is available from Kaggle Models.
- The current-best citation for the model is our paper: Global birdsong embeddings enable superior transfer learning for bioacoustic classification.
- The SurfPerch model, trained on a combination of birds, coral reef sounds, and general audio, is also available from Kaggle Models. The associated paper is (as of this writing) available as a preprint.
The major parts of the Perch model training code are split across the following files:
- Model frontend - we use a PCEN melspectrogram.
- EfficientNet model
- Training loop
- Training launch script
- Export from JAX to Tensorflow and TFLite
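To illustrate the PCEN (per-channel energy normalization) frontend mentioned above, here is a minimal NumPy sketch of the standard PCEN transform. This is a hypothetical illustration, not the repository's implementation; the smoothing coefficient `s`, gain `alpha`, bias `delta`, and root `r` are common illustrative defaults, not the model's actual configuration.

```python
import numpy as np

def pcen(mel_energy, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization over a (time, freq) mel-energy array.

    A smoothed energy M is tracked per frequency channel with a first-order
    IIR filter, then used to gain-normalize and compress each frame:
        PCEN = (E / (eps + M)**alpha + delta)**r - delta**r
    """
    m = np.zeros_like(mel_energy)
    m[0] = mel_energy[0]
    for t in range(1, len(mel_energy)):
        m[t] = (1 - s) * m[t - 1] + s * mel_energy[t]
    return (mel_energy / (eps + m) ** alpha + delta) ** r - delta ** r

# A constant-energy signal is normalized to a stable, compressed value.
frames = np.full((10, 4), 100.0)
out = pcen(frames)
```

PCEN's appeal for bioacoustics is that the per-channel gain normalization suppresses slowly varying background noise while preserving transient calls.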
Agile Modeling
Agile modeling combines search and active learning to produce classifiers for novel concepts quickly.
Here's the tutorial Colab notebook we produced for Climate Change AI and presented at their workshop at NeurIPS 2023.
We maintain three 'working' notebooks for agile modeling in this repository:
- `embed_audio.ipynb` for performing mass-embedding of audio.
- `agile_modeling.ipynb` for search and active learning over embeddings.
- `analysis.ipynb` for running inference and performing call density estimation (see below).

The code for agile modeling is largely contained in the inference directory, which contains its own extensive README.
The agile modeling work supports a number of different models, including our own (Perch, SurfPerch, and the multi-species whale classifier), BirdNET, and some general audio models like YAMNet and VGGish. Adding support for additional models is fairly straightforward.
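The search half of agile modeling boils down to a nearest-neighbor lookup over precomputed embeddings. Here is a minimal cosine-similarity sketch of that step; the function name `top_k_matches` and the random data are hypothetical, and this is not the repository's API.

```python
import numpy as np

def top_k_matches(query, embeddings, k=3):
    """Return indices of the k embeddings most cosine-similar to the query."""
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = e @ q  # cosine similarity of every embedding against the query
    return np.argsort(sims)[::-1][:k]

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 8))   # stand-in for audio-window embeddings
query = embeddings[42] + 0.01 * rng.normal(size=8)  # near-duplicate of item 42
matches = top_k_matches(query, embeddings)
```

In the active-learning loop, the top matches are presented to the user for labeling, and the labels are then used to train a small classifier on top of the frozen embeddings.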
Call Density
We provide some tooling for estimating the proportion of audio windows in a dataset that contain a target call type or species, for any target you have a classifier for.
Paper: All Thresholds Barred: Direct Estimation of Call Density in Bioacoustic Data
Code: See `call_density.py` and `call_density_test.py`. Note that the code contains some interactions with our broader agile modeling work, though we have endeavoured to isolate the underlying mathematics in more modular functions.
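As a simplified sketch of the idea (not the paper's exact estimator), call density can be estimated by stratifying windows into classifier-score bins, hand-validating a sample from each bin, and weighting each bin's validated positive rate by the bin's share of all windows. All names and numbers below are hypothetical.

```python
import numpy as np

def estimate_call_density(scores, bin_edges, bin_positive_rates):
    """Stratified estimate of the fraction of windows containing a call.

    `scores` are classifier scores for every audio window;
    `bin_positive_rates` are positive rates measured by hand-validating a
    sample of windows from each score bin.
    """
    counts, _ = np.histogram(scores, bins=bin_edges)
    weights = counts / counts.sum()          # each bin's share of all windows
    return float(np.dot(weights, bin_positive_rates))

scores = np.array([0.1, 0.2, 0.3, 0.8, 0.9, 0.95])
bin_edges = [0.0, 0.5, 1.0]        # low-score and high-score strata
bin_positive_rates = [0.05, 0.9]   # from validating a sample in each bin
density = estimate_call_density(scores, bin_edges, bin_positive_rates)
```

The benefit of stratifying by score rather than thresholding is that windows below any fixed threshold still contribute to the estimate, which is the motivation suggested by the paper's title.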
BIRB Benchmark
We produced a benchmark paper for understanding model generalization when transferring from focal to passive acoustic datasets. The preprint is available here.
For details on setting up the benchmark and evaluation protocol, please refer to this brief readme with instructions. The evaluation codebase is in perch/chirp/eval.
To build the BIRB evaluation data, after installing the chirp package, run the following command from the repository's root directory:
```bash
poetry run tfds build -i chirp.data.bird_taxonomy,chirp.data.soundscapes \
    soundscapes/{ssw,hawaii,coffee_farms,sierras_kahl,high_sierras,peru}_full_length \
    bird_taxonomy/{downstream_full_length,class_representatives_slice_peaked}
```
The process should take 36 to 48 hours to complete and use around 256 GiB of disk space.
Source-Free Domain Adaptation and NOTELA
We have a paper on source-free domain adaptation, which involves automatically adapting a model to data from a shifted domain without access to the original training data. You can read more about it in our blog post. The paper was published at ICML 2023. The code for this project has been archived; a snapshot of the repository containing the code, in the chirp/projects/sfda directory, can still be downloaded.
Installation
We support installation on a generic Linux workstation. A GPU is recommended, especially when working with large datasets. The recipe below is the same one used by our continuous integration testing.
Some users have successfully used our repository with the Windows Subsystem for Linux, or with Docker in a cloud-based virtual machine. Anecdotally, installation on OS X is difficult.
You will need the following dependencies.
```bash
# Install Poetry for package management
curl -sSL https://install.python-poetry.org | python3 -

# Install dependencies for librosa
sudo apt-get install libsndfile1 ffmpeg

# Install all dependencies specified in the poetry configs.
# Note that for Windows machines, you can remove the --with nonwindows
# option to drop some optional dependencies which do not build for Windows.
poetry install --with jaxtrain --with nonwindows
```
Running poetry install installs all Perch dependencies into a new virtual environment, in which you can run the Perch code base. To run the tests, use:
```bash
poetry run python -m unittest discover -s chirp/tests -p "*test.py"
poetry run python -m unittest discover -s chirp/inference/tests -p "*test.py"
```
Lightweight Inference
Note that if you only need the Python notebooks for use with pre-trained models, you can install with lighter dependencies:
```bash
# Install inference-only dependencies specified in the poetry configs
poetry install
```
And check that the inference tests succeed:
```bash
poetry run python -m unittest discover -s chirp/inference/tests -p "*test.py"
```
Using a container
Alternatively, you can install and run this project using a container via Docker. To build a container using the tag perch, run:
```bash
git clone https://github.com/google-research/perch
cd perch
docker build . --tag perch
```
After building the container, to run the unit tests, use:
```bash
docker run --rm -t perch python -m unittest discover -s chirp/tests -p "*test.py"
```
This is not an officially supported Google product.
Owner
- Name: Google Research
- Login: google-research
- Kind: organization
- Location: Earth
- Website: https://research.google
- Repositories: 226
- Profile: https://github.com/google-research
GitHub Events
Total
- Issues event: 10
- Watch event: 70
- Delete event: 23
- Issue comment event: 30
- Push event: 113
- Pull request event: 51
- Fork event: 14
- Create event: 24
Last Year
- Issues event: 10
- Watch event: 70
- Delete event: 23
- Issue comment event: 30
- Push event: 113
- Pull request event: 51
- Fork event: 14
- Create event: 24
Committers
Last synced: 11 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Tom Denton | t****n@g****m | 298 |
| Vincent Dumoulin | v****n@g****m | 93 |
| Bart van Merriënboer | b****m@g****m | 80 |
| Malik Boudiaf | m****f@g****m | 49 |
| Chirp Team | c****o@g****m | 36 |
| Eleni Triantafillou | e****u@g****m | 35 |
| Jenny Hamer | h****r@g****m | 28 |
| mschulist | m****2@g****m | 9 |
| Matt Harvey | m****y@g****m | 8 |
| malik | m****f@h****r | 7 |
| Dan Morris | a****s@g****m | 7 |
| Marcus Chiam | m****m@g****m | 6 |
| Lauren Harrell | l****l@g****m | 5 |
| dependabot[bot] | 4****] | 5 |
| Peter Hawkins | p****s@g****m | 5 |
| Jake VanderPlas | v****s@g****m | 5 |
| Laura Pak | l****k@g****m | 4 |
| Rebecca Chen | r****n@g****m | 3 |
| jeffgeoff4 | j****s@g****m | 3 |
| Bart van Merriënboer | b****r@g****m | 3 |
| Vamsi Manchala | v****a@g****m | 2 |
| DongHyun Choi | d****i@g****m | 1 |
| Ethan Manilow | e****w@g****m | 1 |
| Hana Joo | h****o@g****m | 1 |
| Oscar Wahltinez | o****z@g****m | 1 |
| Yilei Yang | y****g@g****m | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 32
- Total pull requests: 264
- Average time to close issues: 3 months
- Average time to close pull requests: about 1 month
- Total issue authors: 14
- Total pull request authors: 10
- Average comments per issue: 1.69
- Average comments per pull request: 0.27
- Merged pull requests: 66
- Bot issues: 0
- Bot pull requests: 252
Past Year
- Issues: 7
- Pull requests: 48
- Average time to close issues: 16 days
- Average time to close pull requests: 6 days
- Issue authors: 4
- Pull request authors: 4
- Average comments per issue: 1.71
- Average comments per pull request: 0.19
- Merged pull requests: 21
- Bot issues: 0
- Bot pull requests: 45
Top Authors
Issue Authors
- sdenton4 (13)
- sammlapp (5)
- joshctaylor (3)
- IamJeffG (2)
- Shiro-LK (2)
- ilyassmoummad (1)
- KasparSoltero (1)
- jongalon (1)
- mschulist (1)
- rudrakshkarpe (1)
- cparcerisas (1)
- Tindtily (1)
- ryanz22 (1)
- copybara-service[bot] (1)
- nnbuainain (1)
Pull Request Authors
- copybara-service[bot] (221)
- dependabot[bot] (72)
- IamJeffG (4)
- peichins (4)
- mschulist (2)
- cparcerisas (2)
- agentmorris (2)
- BenCretois (1)
- vdumoulin (1)
- bringingjoy (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- 184 dependencies
- SPARQLWrapper ^2.0.0
- absl-py ^1.0.0
- clu ^0.0.7
- etils ^0.6.0
- flax ^0.5.1
- jax ^0.3.9
- ml-collections ^0.1.1
- optax ^0.1.2
- pandas ==1.3.5
- python >=3.7.1,<3.11
- ratelimiter ^1.2.0.post0
- tensorflow ^2.8.0
- tensorflow-datasets ^4.6.0
- actions/checkout v3 composite
- actions/setup-python v4 composite
- absl-py ==1.0.0
- apache-beam ==2.38.0
- ml-collections ==0.1.1
- scipy ==1.8.0
- tensorflow ==2.8.0
- python 3.11 build