birdsongs

Python package implementing the physical model of motor gestures to characterize and create birdsongs

https://github.com/saguileran/birdsongs

Science Score: 59.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 16 DOI reference(s) in README
  • Academic publication links
    Links to: scholar.google
  • Committers with academic emails
    1 of 5 committers (20.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.5%) to scientific vocabulary

Keywords

bioacoustics birdsong numerical-optimization syrinx
Last synced: 6 months ago

Repository

Python package implementing the physical model of motor gestures to characterize and create birdsongs

Basic Info
Statistics
  • Stars: 12
  • Watchers: 2
  • Forks: 6
  • Open Issues: 2
  • Releases: 1
Topics
bioacoustics birdsong numerical-optimization syrinx
Created over 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

This repository is migrating to WaveSongs

Birdsongs

A Python package for analyzing, visualizing, and generating synthetic birdsongs from recorded audio.




Objective

Design, development, and evaluation of a computational-physical model for generating synthetic birdsongs from recorded samples.

Overview

Study and Python implementation of the motor gestures for birdsongs model created by Prof. G. Mindlin. This model explains the physics of birdsong by simulating the organs involved in sound production in birds, including the syrinx, trachea, glottis, oro-esophageal cavity (OEC), and beak, using ordinary differential equations (ODEs).

This work presents an automated model for generating synthetic birdsongs that are comparable to real birdsongs in both spectrographic and temporal aspects. The model takes the motor gestures for birdsongs model and an audio recording of a real birdsong as input. Automation is achieved by formulating a minimization problem with three control parameters: the air sac pressure of the bird's bronchi, the labial tension of the syrinx walls, and a time scale constant. This optimization problem is solved using numerical methods, signal processing tools, and numerical optimization techniques. The objective function is based on the Fundamental Frequency (also called pitch, denoted FF or F0) and the Spectral Content Index (SCI) of both the synthetic and real syllables.
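As a rough illustration of such an objective function (a sketch, not the package's actual implementation), the relative errors of FF and SCI between real and synthetic syllables can be combined additively:

```python
import numpy as np

def objective(ff_real, ff_synth, sci_real, sci_synth):
    """Toy objective: mean relative error of the fundamental frequency
    (FF) curve plus the relative error of the Spectral Content Index
    (SCI). Illustrative only; the package builds its residual through
    lmfit's machinery."""
    ff_err = np.mean(np.abs(ff_synth - ff_real) / np.abs(ff_real))
    sci_err = abs(sci_synth - sci_real) / abs(sci_real)
    return ff_err + sci_err

# identical curves give a zero objective
ff = np.array([440.0, 450.0, 460.0])
print(objective(ff, ff, 1.2, 1.2))  # → 0.0
```

A synthetic syllable whose FF deviates from the real one increases the objective, which is what the optimizer drives down.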

The model is tested and evaluated on three different Colombian bird species: Zonotrichia capensis, Ocellated Tapaculo, and Mimus gilvus, using recorded samples downloaded from Xeno-Canto and eBird audio libraries. The results show relative errors in FF and SCI of less than 10%, with comparable spectral harmonics in terms of number and frequency, as detailed in the Results section.

Repository Contents

This repository contains the documentation, scripts, and results developed to achieve the proposed objective. The files and information are divided into branches as follows:

  • main: Python package with model code implementation, tutorial examples, example data, and results.
  • dissertation: LaTeX document of the bachelor's dissertation: Design, development, and evaluation of a computational physical model to generate synthetic birdsongs from recorded samples.
  • gh-pages: Archives for the BirdSongs website, a more in-depth description of the package.
  • results: Some results obtained from the tutorial examples: images, audios, and motor gesture parameters (.csv).

The physical model used, Motor Gestures for birdsongs [1], was developed by Prof. G. Mindlin at the Dynamical Systems Laboratory (LSD, for its Spanish initials) of the University of Buenos Aires, Argentina.

Python Implementation

Physical Model

Schematic description of the physical model motor gestures for birdsongs with the organs involved in sound production (syrinx, trachea, glottis, OEC, and beak) and their corresponding ODEs.

Figure 1. Motor gestures model diagram.

Object-Oriented Thinking

By leveraging the Object-Oriented Programming (OOP) paradigm, the need for lengthy code is minimized. Additionally, the execution and implementation of the model are efficient and straightforward, allowing for the creation and comparison of several syllables with a single line of code. To solve the optimization problem and to analyze and compare real and synthetic birdsong, five objects are created:

  • BirdSong: Reads an audio file given its file name and a Paths object, and computes the audio's spectral and temporal features. It can also split the audio into syllables (in progress).
  • Syllable: Creates a birdsong syllable from a birdsong object using a time interval that can be selected in the plot or defined as a list. The spectral and temporal features of the syllable are computed automatically.
  • Optimizer: Creates an optimizer that solves the minimization problem with the chosen method (the default is brute force, but it can be changed to leastsq, bfgs, newton, etc.; methods other than brute force require additional parameters, see the lmfit documentation) over a feasible region that can be modified.
  • Plotter: Visualizes the birdsong and syllable objects and their spectral and temporal features. It also includes a functionality to select points from the spectrum.
  • Paths: Manages the package paths, audio files, and results directories.

For each object an icon is defined as follows:

Figure 2. Objects implemented.

This approach simplifies the interpretation of the methodology diagram. Each icon represents an object that handles different tasks. The major advantage of this implementation is the ability to easily compare features between syllable or chunk (small part of a syllable) objects.

Methodology

Using the previously defined objects, the optimization problem is solved by following the steps below:

Figure 3. Methodology diagram.

Each step includes the icon of the object involved. The final output is a parameters object (a data frame similar to the lmfit library's Parameters objects) containing the optimal control parameter coefficients for the motor gestures that best reproduce the real birdsong.
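To illustrate what such a parameters object carries, the sketch below models each control parameter as an optimal value with its feasible-region bounds; every name and number here is hypothetical, not taken from the package:

```python
def within_bounds(params):
    """Check that every optimal value lies inside its feasible region."""
    return all(p["min"] <= p["value"] <= p["max"] for p in params.values())

# Hypothetical optimal coefficients for the motor-gesture control
# parameters; real runs produce values fitted to the recorded syllable.
optimal_params = {
    "gamma": {"value": 4.0e4, "min": 1.0e4, "max": 1.0e5},  # time scale constant
    "a0":    {"value": 0.11,  "min": -0.10, "max": 0.25},   # air sac pressure offset
    "b0":    {"value": -0.20, "min": -1.00, "max": 0.50},   # labial tension offset
    "b1":    {"value": 1.00,  "min": 0.20,  "max": 2.00},   # labial tension slope
}
print(within_bounds(optimal_params))  # → True
```

The feasible region being explicit in the object is what lets the user shrink or shift it between optimizer runs.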

Installation

Requirements

birdsongs is implemented in Python 3.8 and is also tested with Python 3.10 and 3.11. The required packages are listed in the file requirements.txt.

Download

To use birdsongs, clone the main branch of the repository and go to its root folder.

```bat
git clone -b main --single-branch https://github.com/saguileran/birdsongs.git
cd birdsongs
```

You can clone the whole repository with `git clone https://github.com/saguileran/birdsongs.git`, but since it is very large, the main branch alone is enough. To change the branch, use the command `git checkout` followed by the branch name of interest.

The next step is to install the required packages; either of the following command lines will work:

```bat
pip install -r ./requirements.txt
python -m pip install -r ./requirements.txt
```

If you are using a Python version higher than 3.10, to listen to the audios you must run

```bat
pip install playsound@git+https://github.com/taconi/playsound
```

Now, install the birdsongs package.

```bat
python .\setup.py install
```

or, using pip, either of the following command lines should work:

```bat
pip install -e .
pip install .
```

That's all. Now let's create a synthetic birdsong!

Take a look at the tutorial notebooks for basic uses: physical model implementation, motor-gestures.ipynb; defining and generating a syllable from a recorded birdsong, syllable.ipynb; or generating a whole birdsong of several syllables, birdsong.ipynb.

Use

Import Libraries

Import the birdsongs package as bs

```python
import birdsongs as bs
from birdsongs.utils import *
```

Define Objects

Path and Plotter

First, define the plotter and paths objects; optionally, you can specify the audio folder or enable the plotter to save figures.

```python
root = r"root\path\files"      # default ..\examples\
audios = r"path\to\audios"     # default ..\examples\audios\
results = r"path\to\results"   # default ..\examples\results\

paths = bs.Paths(root, audios, results)
plotter = bs.Ploter(save=True)  # images are saved at ./examples/results/Images/
```

Display the audio file names found with the paths.AudiosFiles(True) function; if the folder has a spreadsheet.csv file, this function displays all the information about the files inside the folder, otherwise it displays the audio file names found.

Birdsong

Define and plot the sound wave and spectrogram of the sample XC11293. You can use both mp3 and wav files, but on Windows you may get errors from librosa.load.

```python
birdsong = bs.BirdSong(paths, file_id="XC11293", NN=512, umbral_FF=1.,
                       Nt=500, tlim=(0, 60), flim=(100, 20e3))  # other features

plotter.Plot(birdsong, FF_on=False)  # plot the sound wave and spectrogram without FF
birdsong.Play()                      # listen to the birdsong
```

[!NOTE] The parameter Nt is related to the envelope of the waveform: use a large Nt for long audios and a small Nt for short ones.
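The role of Nt can be illustrated with a toy envelope computation (a sketch; the package's own envelope routine may differ):

```python
import numpy as np

def envelope(waveform, Nt):
    """Coarse amplitude envelope: split the signal into Nt windows and
    take the peak absolute amplitude of each. A longer audio needs a
    larger Nt to keep the window duration roughly constant."""
    windows = np.array_split(np.abs(waveform), Nt)
    return np.array([w.max() for w in windows])

# decaying 440 Hz tone, 1 s at 44.1 kHz
t = np.linspace(0, 1, 44100)
wave = np.sin(2 * np.pi * 440 * t) * np.exp(-3 * t)
env = envelope(wave, Nt=50)
print(len(env), env[0] > env[-1])  # → 50 True
```

With too few windows a long recording's envelope smears over distinct syllables; with too many, a short syllable's envelope becomes noisy.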

Syllable

Define the syllable using a time interval of interest and the previous birdsong object. The syllable inherits the birdsong attributes (NN, flim, paths, etc.). To select the two time interval points (start and end of the syllable) from the plot, change the SelectTime_on argument of the plotter.Plot() function to True.

```python
# select time intervals from the birdsong plot; you can select a single pair
plotter.Plot(birdsong, FF_on=False, SelectTime_on=True)  # select
```

Then, define a birdsong syllable with the time interval selected

```python
time_intervals = Positions(plotter.klicker)  # save
print(time_intervals)                        # display

# define the syllable object
syllable = bs.Syllable(birdsong, tlim=time_intervals[0], Nt=30, ide="syllable")
plotter.Plot(syllable, FF_on=True)
```

[!IMPORTANT] The algorithm used to calculate the fundamental frequency does not perform well at the extremes of the syllable. To avoid issues, do not select the exact extremes; instead, choose a slightly shorter segment of the syllable.
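This trimming can be sketched with a small hypothetical helper (not part of the package):

```python
def shrink_interval(t_start, t_end, margin=0.05):
    """Trim a fraction of the syllable from each end so the
    fundamental-frequency estimate avoids the unstable extremes.
    The 5% default margin is an illustrative choice."""
    span = t_end - t_start
    return t_start + margin * span, t_end - margin * span

print(shrink_interval(0.0, 2.0, margin=0.25))  # → (0.5, 1.5)
```

In practice you would apply something like this to the interval selected from the plot before constructing the Syllable object.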

Solve

Now let's define the optimizer object to generate the synthetic syllable by solving the optimization problem. First, create the optimizer object by specifying the optimization method, its parameters, and the syllable of interest.

```python
brute_kwargs = {'method': 'brute', 'Ns': 11}      # optimization method; Ns is the number of grid points
optimizer = bs.Optimizer(syllable, brute_kwargs)  # optimizer object
```

Then, execute the solver to find the optimal time scale constant and the optimal motor gesture parameters (labial tension and air sac pressure variables):

```python
optimal_gm = optimizer.OptimalGamma(syllable)  # find the optimal gamma (time scale constant)
optimizer.OptimalParams(syllable, Ns=11)       # find the optimal parameter coefficients

# find optimal parameters over several time intervals
syllable, synth_syllable = optimizer.SongByTimes(time_intervals)
```

Finally, define the synthetic syllable object with the optimal values found above.

```python
synth_syllable = syllable.Solve(syllable.p)
```

Visualize

Finally, visualize and write the optimal synthetic audio.

```python
plotter.Plot(synth_syllable);                 # sound wave and spectrogram of the synthetic syllable
plotter.PlotVs(synth_syllable);               # physical model variables over time
plotter.PlotAlphaBeta(synth_syllable);        # motor gesture curve in the parametric space
plotter.Syllables(syllable, synth_syllable);  # spectrograms and waveforms of the synthetic and real syllables
plotter.Result(syllable, synth_syllable);     # scoring variables and other spectral features

birdsong.WriteAudio(); synth_syllable.WriteAudio();  # write both audios at ./examples/results/Audios
```

[!NOTE]

To generate a single synthetic syllable (or chunk), you must have defined a birdsong (or syllable) object; the process is as follows:

  1. Define a path object.
  2. Define a birdsong object using the above path object; it requires the audio file ID. You can also enter the length of the FFT window and the umbral (threshold) for computing the FF, among others.
  3. Select or define the time intervals of interest.
  4. Define an optimization object with a dictionary of the method name and its parameters.
  5. Find the optimal gammas for all the time intervals, or for a single one, and average them.
  6. Find and export the optimal labial parameters (the motor gesture curve) for each syllable.
  7. Generate synthetic birdsong from the previous control parameters found.
  8. Visualize and save all the syrinx, scoring, and result variables.
  9. Save both synthetic and real syllable audios.

The repository has some audio examples in the ./examples/audios folder. You can download and store your own audios in the same folder or enter the audio folder path to the Paths object. The audios can be in WAV or MP3 format. If you prefer WAV format, we suggest using Audacity to convert the audios without any issue.
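As a side note on the WAV layout, a valid 16-bit mono WAV file can be written with Python's standard library alone; the snippet below writes and re-reads a short test tone (illustrative only, not a replacement for converting real recordings):

```python
import math
import struct
import wave

# Write a 0.1 s, 440 Hz test tone as a 16-bit mono WAV file using
# only the standard library. Converted field recordings share the
# same container layout (sample rate, sample width, channel count).
rate = 44100
samples = [int(32767 * math.sin(2 * math.pi * 440 * n / rate))
           for n in range(rate // 10)]

with wave.open("tone.wav", "wb") as f:
    f.setnchannels(1)
    f.setsampwidth(2)      # 2 bytes = 16-bit samples
    f.setframerate(rate)
    f.writeframes(struct.pack("<%dh" % len(samples), *samples))

with wave.open("tone.wav", "rb") as f:
    print(f.getframerate(), f.getnframes())  # → 44100 4410
```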

Results

The model is tested and evaluated with different syllables of the song of the Rufous Collared Sparrow. Results (images and audios) are located in the examples folder. For more information, visit the project website, birdsongs, or the results page.

Simple syllable of a birdsong of the Rufous Collared Sparrow

Figure 4. Real and synthetic syllable results for the Rufous Collared Sparrow.

Simple syllable of a birdsong of the Ocellated Tapaculo - Acropternis

Figure 5. Real and synthetic syllable results for the Ocellated Tapaculo.

The PDF document of the bachelor's thesis, Design, development, and evaluation of a computational physical model to generate synthetic birdsongs from recorded samples, is stored in the dissertation branch of this repository.

Figure 6. Bachelor's thesis PDF document.

Applications

Some of the applications of this model are:

  • Data Augmentation: use the model to create numerous synthetic syllables; this can be done by creating a synthetic birdsong and then varying its motor gesture parameters to get similar birdsongs.
  • Birdsongs Descriptions: characterize and compare birdsongs using the motor gestures parameters.
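The data-augmentation idea can be sketched as follows; the coefficient names, values, and noise level are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical optimal motor-gesture coefficients, e.g. as exported
# by the optimizer to a .csv file.
base = {"a0": 0.11, "b0": -0.20, "b1": 1.00}

def augment(params, n=5, scale=0.05):
    """Create n perturbed copies of the coefficients by adding Gaussian
    noise proportional to each value (5% by default). Each copy, fed
    back through the model, yields a similar but distinct syllable."""
    variants = []
    for _ in range(n):
        variants.append({k: v * (1 + scale * rng.standard_normal())
                         for k, v in params.items()})
    return variants

variants = augment(base)
print(len(variants))  # → 5
```

Re-synthesizing each perturbed parameter set produces a family of plausible syllables around the original, which is the augmentation payoff.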

References

Literature

[1] Amador, A., Perl, Y. S., Mindlin, G. B., & Margoliash, D. (2013). Elemental gesture dynamics are encoded by song premotor cortical neurons. Nature, 495(7439), 59–64. https://doi.org/10.1038/nature11967


Software

[2] Newville, M., Stensitzki, T., Allen, D. B., & Ingargiola, A. (2014). LMFIT: Non-Linear Least-Square Minimization and Curve-Fitting for Python. https://doi.org/10.5281/ZENODO.11813.


[3] Ulloa, J. S., Haupert, S., Latorre, J. F., Aubin, T., & Sueur, J. (2021). scikit-maad: An open-source and modular toolbox for quantitative soundscape analysis in Python. Methods in Ecology and Evolution, 12(12), 2334–2340. https://doi.org/10.1111/2041-210X.13711


[4] McFee, B., Raffel, C., Liang, D., Ellis, D. P., McVicar, M., Battenberg, E., & Nieto, O. (2015). librosa: Audio and music signal analysis in Python. In Proceedings of the 14th Python in Science Conference (Vol. 8).

Audios Dataset

[5] Xeno-canto Foundation and Naturalis Biodiversity Center (2005). Xeno-Canto: Sharing bird sounds from around the world.


[6] Fink, D., T. Auer, A. Johnston, M. Strimas-Mackey, S. Ligocki, O. Robinson, W. Hochachka, L. Jaromczyk, C. Crowley, K. Dunham, A. Stillman, I. Davies, A. Rodewald, V. Ruiz-Gutierrez, C. Wood. 2023. eBird Status and Trends, Data Version: 2022; Released: 2023. Cornell Lab of Ornithology, Ithaca, New York. https://doi.org/10.2173/ebirdst.2022.

Owner

  • Name: Sebastian Aguilera Novoa
  • Login: saguileran
  • Kind: user
  • Location: Bogotá - Colombia
  • Company: Universidad Nacional de Colombia

Physicist interested in numerical methods and machine learning applications to physics. I also like electronics and love wave simulations.

GitHub Events

Total
  • Issues event: 6
  • Watch event: 6
  • Issue comment event: 2
  • Push event: 1
  • Pull request event: 2
  • Fork event: 3
  • Create event: 1
Last Year
  • Issues event: 6
  • Watch event: 6
  • Issue comment event: 2
  • Push event: 1
  • Pull request event: 2
  • Fork event: 3
  • Create event: 1

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 259
  • Total Committers: 5
  • Avg Commits per committer: 51.8
  • Development Distribution Score (DDS): 0.637
Past Year
  • Commits: 16
  • Committers: 2
  • Avg Commits per committer: 8.0
  • Development Distribution Score (DDS): 0.375
Top Committers
Name Email Commits
Sebastian Aguilera Novoa 4****n 94
Sebastian Aguilera Novoa s****n@u****o 91
juan d****n@g****m 71
Ian Hunt-Isaak i****k@g****m 2
root r****t@D****n 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 7
  • Total pull requests: 6
  • Average time to close issues: over 1 year
  • Average time to close pull requests: about 3 hours
  • Total issue authors: 2
  • Total pull request authors: 3
  • Average comments per issue: 0.71
  • Average comments per pull request: 0.17
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • saguileran (4)
  • masonyoungblood (3)
Pull Request Authors
  • saguileran (5)
  • ianhi (2)
  • LFSaw (1)

Dependencies

Gemfile.lock rubygems
  • activesupport 6.0.6
  • addressable 2.8.1
  • bundler 2.3.24
  • coffee-script 2.4.1
  • coffee-script-source 1.11.1
  • colorator 1.1.0
  • commonmarker 0.23.6
  • concurrent-ruby 1.1.10
  • dnsruby 1.61.9
  • em-websocket 0.5.3
  • ethon 0.15.0
  • eventmachine 1.2.7
  • execjs 2.8.1
  • faraday 2.6.0
  • faraday-net_http 3.0.1
  • ffi 1.15.5
  • forwardable-extended 2.6.0
  • gemoji 3.0.1
  • github-pages 227
  • github-pages-health-check 1.17.9
  • html-pipeline 2.14.3
  • http_parser.rb 0.8.0
  • i18n 0.9.5
  • jekyll 3.9.2
  • jekyll-avatar 0.7.0
  • jekyll-coffeescript 1.1.1
  • jekyll-commonmark 1.4.0
  • jekyll-commonmark-ghpages 0.2.0
  • jekyll-default-layout 0.1.4
  • jekyll-feed 0.15.1
  • jekyll-gist 1.5.0
  • jekyll-github-metadata 2.13.0
  • jekyll-include-cache 0.2.1
  • jekyll-mentions 1.6.0
  • jekyll-optional-front-matter 0.3.2
  • jekyll-paginate 1.1.0
  • jekyll-readme-index 0.3.0
  • jekyll-redirect-from 0.16.0
  • jekyll-relative-links 0.6.1
  • jekyll-remote-theme 0.4.3
  • jekyll-sass-converter 1.5.2
  • jekyll-seo-tag 2.8.0
  • jekyll-sitemap 1.4.0
  • jekyll-swiss 1.0.0
  • jekyll-theme-architect 0.2.0
  • jekyll-theme-cayman 0.2.0
  • jekyll-theme-dinky 0.2.0
  • jekyll-theme-hacker 0.2.0
  • jekyll-theme-leap-day 0.2.0
  • jekyll-theme-merlot 0.2.0
  • jekyll-theme-midnight 0.2.0
  • jekyll-theme-minimal 0.2.0
  • jekyll-theme-modernist 0.2.0
  • jekyll-theme-primer 0.6.0
  • jekyll-theme-slate 0.2.0
  • jekyll-theme-tactile 0.2.0
  • jekyll-theme-time-machine 0.2.0
  • jekyll-titles-from-headings 0.5.3
  • jekyll-watch 2.2.1
  • jemoji 0.12.0
  • kramdown 2.3.2
  • kramdown-parser-gfm 1.1.0
  • liquid 4.0.3
  • listen 3.7.1
  • mercenary 0.3.6
  • minima 2.5.1
  • minitest 5.16.3
  • nokogiri 1.13.9
  • octokit 4.25.1
  • pathutil 0.16.2
  • public_suffix 4.0.7
  • racc 1.6.0
  • rb-fsevent 0.11.2
  • rb-inotify 0.10.1
  • rexml 3.2.5
  • rouge 3.26.0
  • ruby2_keywords 0.0.5
  • rubyzip 2.3.2
  • safe_yaml 1.0.5
  • sass 3.7.4
  • sass-listen 4.0.0
  • sawyer 0.9.2
  • simpleidn 0.2.1
  • terminal-table 1.8.0
  • thread_safe 0.3.6
  • typhoeus 1.4.0
  • tzinfo 1.2.10
  • unf 0.1.4
  • unf_ext 0.0.8.2
  • unicode-display_width 1.8.0
  • webrick 1.7.0
  • zeitwerk 2.6.4