augmentednet

A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks

https://github.com/napulen/augmentednet

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 5 DOI reference(s) in README
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (11.9%) to scientific vocabulary

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: napulen
License: mit
Language: Jupyter Notebook
Default Branch: main
Homepage:
Size: 47.8 MB

Statistics

Stars: 38
Watchers: 2
Forks: 8
Open Issues: 17
Releases: 25

Created over 5 years ago · Last pushed over 2 years ago

Metadata Files

Readme Changelog License Citation

AugmentedNet

AugmentedNet is an automatic Roman numeral analysis neural network.

The network was developed by Néstor Nápoles López as part of his PhD research. It was first mentioned in a co-authored ISMIR paper in 2021, and later on in the body of the dissertation.

It has been used to power the analysis features in at least the following projects:

- Sibelius
- Vimu.app
- MusicLang

The version documented in the PhD dissertation matches the v1.9.1 release of this repository.

The older version of the model described in the ISMIR paper matches the v1.0.0 release of this repository.

In general, v1.9.1 produces better results, so you are encouraged to use (and compare against) that version.

PhD Dissertation

Nápoles López, Néstor. 2022. Automatic Roman Numeral Analysis in Symbolic Music Representations. PhD Thesis, McGill University. https://escholarship.mcgill.ca/concern/theses/qr46r6307.

BibTeX:

    @phdthesis{napoleslopez22automatic,
      type = {{PhD} {Thesis}},
      title = {Automatic {Roman} {Numeral} {Analysis} in {Symbolic} {Music} {Representations}},
      url = {https://escholarship.mcgill.ca/concern/theses/qr46r6307},
      school = {McGill University},
      author = {Nápoles López, Néstor},
      month = dec,
      year = {2022}
    }

ISMIR Paper

N. Nápoles López, M. Gotham, and I. Fujinaga, "AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks," in Proceedings of the 22nd International Society for Music Information Retrieval Conference, 2021, pp. 404-411. https://doi.org/10.5281/zenodo.5624533

BibTeX:

    @inproceedings{napoleslopez21augmentednet,
      author = {Nápoles López, Néstor and Gotham, Mark and Fujinaga, Ichiro},
      title = {{AugmentedNet: A Roman Numeral Analysis Network with Synthetic Training Examples and Additional Tonal Tasks}},
      booktitle = {{Proceedings of the 22nd International Society for Music Information Retrieval Conference}},
      year = 2021,
      pages = {404-411},
      publisher = {ISMIR},
      address = {Online},
      month = nov,
      venue = {Online},
      doi = {10.5281/zenodo.5624533},
      url = {https://doi.org/10.5281/zenodo.5624533}
    }

Try out the pre-trained network

Clone, create a virtual environment, and get the python dependencies.

```bash
git clone https://github.com/napulen/AugmentedNet.git
cd AugmentedNet
python3 -m venv .env
source .env/bin/activate

(.env) pip install -r requirements.txt
```

In my experience, pip sometimes fails to install specific package versions, depending on your environment. This requirements.txt was tested on a vanilla Ubuntu 20.04, both on native Linux and on Windows 10 WSL2. A docker tensorflow/tensorflow:2.5-gpu image should also work.

Run the pre-trained model for inference on a MusicXML file

    python -m AugmentedNet.inference AugmentedNetv.hdf5 <input_file>.musicxml

Two files will be generated:

<input_file>_annotated.xml
<input_file>_annotated.csv

These are an annotated MusicXML file and a CSV file with the predictions for every time step.
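If you script the inference step, it can help to know in advance where the two files will land. The following is a minimal sketch of the naming pattern described above; the helper name `expected_outputs` is hypothetical, and it assumes the outputs are written next to the input file.

```python
from pathlib import Path


def expected_outputs(input_file):
    """Return the (annotated MusicXML, CSV) paths derived from the input name.

    Assumption: both files are written in the same directory as the input.
    """
    source = Path(input_file)
    stem = source.with_suffix("")  # drop the ".musicxml" extension
    return (
        stem.with_name(stem.name + "_annotated.xml"),
        stem.with_name(stem.name + "_annotated.csv"),
    )


if __name__ == "__main__":
    xml_path, csv_path = expected_outputs("scores/sonata.musicxml")
    print(xml_path)
    print(csv_path)
```

A script can check that both paths exist after the inference command returns, before reading the predictions.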

Training the network from scratch

Clone recursively (needed to collect the third-party datasets), create a virtual environment, and get the python dependencies

```bash
git clone --recursive https://github.com/napulen/AugmentedNet.git
cd AugmentedNet
python3 -m venv .env
source .env/bin/activate

(.env) pip install -r requirements.txt
```

Using accompanying data

To save you some time, we include the preprocessed tsv files of the real data, as well as the synthetic block-chord templates for texturization. These are available in the release of the latest version.

    wget https://github.com/napulen/AugmentedNet/releases/latest/download/dataset.zip
    unzip dataset.zip

Now you are ready to train the network.

Generating synthetic examples

There are two ways to generate texturizations: at the tsv-level (legacy) and at the npz-level, that is, when encoding the numpy arrays (newer).

Texturizations at the tsv-level (legacy)

You can generate one texturization per file in the dataset with this script

    (.env) python -m AugmentedNet.dataset_tsv_generator --synthesize --texturize

This is how I originally trained v1.0.0. With one texturization per file, the network was trained with exactly twice the amount of real training data.

Texturizations at the npz-level (newer)

After v1.5.0, the tsv-dataset only includes the templates (i.e., block chords). The texturization is done when encoding the numpy arrays for the neural network, right before training.

There are two options for texturization in this way:

1. Generate one texturization per file
2. Generate one texturization per transposition

In the second approach, every time you transpose a synthetic example to a different key, you re-texturize it. The amount of "new" training examples seen by the network is much larger this way.
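To make the difference between the two options concrete, here is a toy sketch. It is not the repository's implementation: `texturize`, `build_examples`, the pattern names, and the chord strings are all hypothetical stand-ins, and the transposition itself is only recorded as a number of semitones rather than applied to the pitches.

```python
import random

# Hypothetical block-chord templates, one per file.
TEMPLATES = {
    "file_a": ["C4 E4 G4", "F4 A4 C5"],
    "file_b": ["G3 B3 D4", "C4 E4 G4"],
}


def texturize(chords, rng):
    """Toy stand-in: assign a random texture pattern to every block chord."""
    patterns = ("block", "arpeggio_up", "arpeggio_down", "alberti")
    return [(chord, rng.choice(patterns)) for chord in chords]


def build_examples(templates, transpositions, each_transposition, seed=0):
    """Return (examples, number of texturize calls) for the chosen strategy."""
    rng = random.Random(seed)
    examples, calls = [], 0
    for name, chords in templates.items():
        texture = None
        for semitones in transpositions:
            # Re-texturize on every transposition only in the second strategy.
            if texture is None or each_transposition:
                texture = texturize(chords, rng)
                calls += 1
            examples.append((name, semitones, texture))
    return examples, calls


if __name__ == "__main__":
    for flag in (False, True):
        examples, calls = build_examples(TEMPLATES, range(3), flag)
        print(flag, len(examples), calls)
```

Both strategies produce the same number of training examples. What changes is how many distinct texturizations back them: one per file in the first case, one per (file, transposition) pair in the second.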

You can control those settings when training the network.

```bash
# No synthetic examples
(.env) python -m AugmentedNet.train <compulsory_args>

# Synthetic examples, one texturization per file
(.env) python -m AugmentedNet.train <compulsory_args> --syntheticDataStrategy concatenate

# Synthetic examples, one texturization per transposition
(.env) python -m AugmentedNet.train <compulsory_args> --syntheticDataStrategy concatenate --texturizeEachTransposition
```

See next section for the <compulsory_args>.

At the moment, the texturization code is not easy to use on its own, if that is all you want to do. If you need it, raise an issue or reach out, and I'll do my best to help with your use case.

Training the network

If you want to train the network, the minimum call looks like this.

    (.env) python -m AugmentedNet.train debug testexperiment

The code is integrated with mlflow. In the training script, debug and testexperiment refer to the experiment and run names passed down to mlflow. You can access more CLI parameters by running python -m AugmentedNet.train --help.
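If you launch several training runs from a script, a small helper can assemble the command line. This is only a sketch: `build_train_command` is a hypothetical name, it uses just the positional arguments and the two flags mentioned in this README, and the rule that `--texturizeEachTransposition` requires the synthetic strategy is an assumption drawn from the examples above.

```python
import subprocess


def build_train_command(experiment, run, synthetic=False, each_transposition=False):
    """Assemble the argument list for `python -m AugmentedNet.train`."""
    if each_transposition and not synthetic:
        # Assumption: re-texturizing only makes sense with synthetic examples.
        raise ValueError("each_transposition requires synthetic=True")
    command = ["python", "-m", "AugmentedNet.train", experiment, run]
    if synthetic:
        command += ["--syntheticDataStrategy", "concatenate"]
    if each_transposition:
        command.append("--texturizeEachTransposition")
    return command


def launch(command):
    """Run the training command from the repository root (not called here)."""
    return subprocess.run(command, check=True)


if __name__ == "__main__":
    print(" ".join(build_train_command("debug", "testexperiment", synthetic=True)))
```

Run `python -m AugmentedNet.train --help` to confirm the flags available in your checkout before relying on this.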

After training the network, you will get a path to the trained hdf5 model, which looks something like this:

The trained model is available in: .model_checkpoint/debug/testexperiment-220101T000000/81-6.000-0.78.hdf5
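The checkpoint path encodes the experiment and run names, so it can be split back into its parts when you collect results. The sketch below is hedged: `parse_checkpoint_path` is a hypothetical helper, and reading the run suffix as a `YYMMDDTHHMMSS` timestamp is an assumption based on the example path. The fields inside the file name are left uninterpreted.

```python
from datetime import datetime
from pathlib import PurePosixPath


def parse_checkpoint_path(path):
    """Split a checkpoint path into (experiment, run name, start time, file name)."""
    parts = PurePosixPath(path).parts
    experiment, run_dir, filename = parts[-3], parts[-2], parts[-1]
    # The run directory looks like "<run name>-<timestamp>".
    run_name, stamp = run_dir.rsplit("-", 1)
    # Assumption: the suffix is a two-digit-year timestamp.
    started = datetime.strptime(stamp, "%y%m%dT%H%M%S")
    return experiment, run_name, started, filename


if __name__ == "__main__":
    example = ".model_checkpoint/debug/testexperiment-220101T000000/81-6.000-0.78.hdf5"
    print(parse_checkpoint_path(example))
```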

You can use that trained model for inference, using the same workflow shown above.

About the AugmentedNet

The neural network architecture

The architecture is a CRNN (Convolutional Recurrent Neural Network) with an alternative representation of pitch spelling at the input.

More information about the neural network architecture can be found in the paper.

[Figure: AugmentedNet Architecture]

Organization of the repo

This repository is organized in the following way

AugmentedNet: all the source code of the network
img: the image diagrams of the network and code organization
misc: useful, but non-essential, stand-alone scripts that I wrote while developing this project
notebooks: Jupyter notebook playgrounds used throughout the project (e.g., data exploration)
test: unit tests for all relevant modules of the network

The AugmentedNet source code

The general organization of the code is summarized by the following diagram.

[Figure: AugmentedNet code organization]

Each of the blue rectangles roughly corresponds to a Python module.

The inputs of the network are pairs of (MusicXML, RomanText) files.

The input pairs are converted into pandas DataFrame objects, stored as .tsv files.

Later on, these are encoded in a representation that can be dispatched to the neural network.

The module documentation is located here.

Experiments

Visualizing the results with mlflow

All the experiments presented in the paper were monitored using mlflow.

If you want to visualize the experiments with the mlflow ui:

pip install mlflow
Download our mlruns with the AugmentedNet experiments
Unzip anywhere
Run mlflow ui from the terminal; make sure that ./mlruns/ is reachable from the current directory
Visit localhost:5000
That's it! The experiments should be available in the browser

For extra convenience, I also uploaded the logs to TensorBoard.dev.

Below are the tables from the paper, each followed by a link to the runs of each model in TensorBoard.dev.

Paper results and tensorboard visualizations

AugmentedNet configurations

These are the results for the four different configurations of the AugmentedNet.

| Model | Key | Deg. | Qual. | Inv. | Root | RN |
|---|---|---|---|---|---|---|
| AugmentedNet6 | 82.7 | 64.4 | 76.6 | 77.4 | 82.5 | 43.3 |
| AugmentedNet6+ | 83.0 | 65.1 | 77.5 | 78.6 | 83.0 | 44.6 |
| AugmentedNet11 | 81.3 | 64.2 | 77.2 | 76.1 | 82.9 | 43.1 |
| AugmentedNet11+ | 83.7 | 66.0 | 77.6 | 77.2 | 83.2 | 45.0 |

Visualize experiments in TensorBoard.dev!

6 and 11 indicate the number of tasks in the multitask learning layout.

+ indicates the use of synthetic training data.
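To read the effect of the synthetic data off the table, compare each `+` configuration with its counterpart. The short sketch below does that arithmetic for the RN column; the values are copied from the table above, and the helper name `synthetic_gain` is hypothetical.

```python
# RN accuracy (%) copied from the configurations table.
RN_ACCURACY = {
    "AugmentedNet6": 43.3,
    "AugmentedNet6+": 44.6,
    "AugmentedNet11": 43.1,
    "AugmentedNet11+": 45.0,
}


def synthetic_gain(scores, base):
    """Percentage-point difference between the '+' variant and its base model."""
    return round(scores[base + "+"] - scores[base], 1)


if __name__ == "__main__":
    for base in ("AugmentedNet6", "AugmentedNet11"):
        print(base, synthetic_gain(RN_ACCURACY, base))
```

In this column, the synthetic examples add 1.3 points in the 6-task layout and 1.9 points in the 11-task layout.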

AugmentedNet vs. other models

These are the results for the best AugmentedNet configuration (11+) against other models.

| Test set | Training set | Model | Key | Degree | Quality | Inversion | Root | ComRN | RNconv | RNalt |
|---|---|---|---|---|---|---|---|---|---|---|
| Full test set | Full dataset | AugN | 82.9 | 67.0 | 79.7 | 78.8 | 83.0 | 65.6 | 46.4 | 51.5 |
| WiR | Full dataset | AugN | 81.8 | 69.2 | 85.9 | 90.3 | 90.3 | 70.2 | 56.4 | 62.4 |
| HaydnSun | Full dataset | AugN | 81.2 | 62.9 | 80.2 | 82.7 | 86.5 | 60.4 | 48.6 | 52.1 |
| ABC | Full dataset | AugN | 83.6 | 65.6 | 78.0 | 76.9 | 78.9 | 62.6 | 44.5 | 48.4 |
| TAVERN | Full dataset | AugN | 88.7 | 60.0 | 77.4 | 78.8 | 81.5 | 66.3 | 42.6 | 52.9 |
| WTC | Full dataset | AugN | 77.2 | 69.7 | 75.0 | 74.4 | 82.7 | 61.7 | 46.2 | 47.9 |
| WTCcrossval | BPS+WTC | AugN | 85.1(4.0) | 62.9(5.5) | 69.1(1.9) | 70.1(3.7) | 79.2(1.8) | 59.9(3.4) | 42.9(4.2) | 46.9(4.7) |
| WTCcrossval | BPS+WTC | CS21 | 56.3(2.5) | - | - | - | - | - | 26.0(1.7) | - |
| BPS | Full dataset | AugN | 85.0 | 73.4 | 79.0 | 73.4 | 84.4 | 68.3 | 45.4 | 49.3 |
| BPS | All data | Mi20 | 82.9 | 68.3 | 76.6 | 72.0 | - | - | 42.8 | - |
| BPS | BPS+WTC | AugN | 82.9 | 70.9 | 80.7 | 72.0 | 85.3 | 67.6 | 44.1 | 47.5 |
| BPS | BPS+WTC | CS21 | 79.0 | - | - | - | - | - | 41.7 | - |
| BPS | BPS | AugN | 83.0 | 71.2 | 80.3 | 71.1 | 84.1 | 68.5 | 44.0 | 47.4 |
| BPS | BPS | Mi20 | 80.6 | 66.5 | 76.3 | 68.1 | - | - | 39.1 | - |
| BPS | BPS | CS19 | 78.4 | 65.1 | 74.6 | 62.1 | - | - | - | - |
| BPS | BPS | CS18 | 66.7 | 51.8 | 60.6 | 59.1 | - | - | 25.7 | - |

Visualize experiments in TensorBoard.dev!

Owner

Name: Néstor Nápoles López
Login: napulen
Kind: user
Location: Montreal, Québec
Company: Avid Technology (Sibelius)

Website: https://napulen.github.io/
Twitter: napulen
Repositories: 8
Profile: https://github.com/napulen

PhD in Music Technology by McGill University. Senior Software Developer at Avid Technology.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Nápoles López"
  given-names: "Néstor"
  orcid: "https://orcid.org/0000-0001-7347-2613"
- family-names: "Gotham"
  given-names: "Mark"
  orcid: "https://orcid.org/0000-0003-0722-3074"
- family-names: "Fujinaga"
  given-names: "Ichiro"
  orcid: "https://orcid.org/0000-0003-2524-8582"
title: "AugmentedNet (source code)"
version: 1.0.0
doi: ""
date-released: 2021-08-05
url: "https://github.com/napulen/AugmentedNet"

GitHub Events

Total

Watch event: 5
Fork event: 1

Last Year

Watch event: 5
Fork event: 1

Issues and Pull Requests

Last synced: about 1 year ago

All Time

Total issues: 54
Total pull requests: 46
Average time to close issues: 20 days
Average time to close pull requests: about 1 hour
Total issue authors: 5
Total pull request authors: 4
Average comments per issue: 0.81
Average comments per pull request: 0.33
Merged pull requests: 43
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 1.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

napulen (49)
adityac95 (2)
MarkGotham (1)
luto65 (1)
clariguy (1)

Pull Request Authors

napulen (43)
clariguy (1)
giamic (1)
Alicelavander (1)

Top Labels

Issue Labels

high priority (8) medium priority (2) low priority (1) enhancement (1) documentation (1)

Pull Request Labels

Dependencies

.github/workflows/main.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite
codecov/codecov-action v1 composite

.github/workflows/pages.yml actions

JamesIves/github-pages-deploy-action 4.1.4 composite
actions/checkout v2.3.1 composite
actions/setup-python v2 composite

requirements.txt pypi

Flask ==2.0.3
GitPython ==3.1.27
Jinja2 ==3.0.3
Keras-Preprocessing ==1.1.2
Mako ==1.1.6
Markdown ==3.3.6
MarkupSafe ==2.1.0
Pillow ==9.0.1
PyYAML ==6.0
Pygments ==2.11.2
SQLAlchemy ==1.4.32
Werkzeug ==2.0.3
absl-py ==0.15.0
alembic ==1.7.6
appdirs ==1.4.4
asttokens ==2.0.5
astunparse ==1.6.3
backcall ==0.2.0
black ==20.8b1
cachetools ==5.0.0
certifi ==2021.10.8
chardet ==4.0.0
charset-normalizer ==2.0.12
click ==8.0.4
cloudpickle ==2.0.0
coverage ==6.3.1
cycler ==0.11.0
databricks-cli ==0.16.4
debugpy ==1.5.1
decorator ==5.1.1
docker ==5.0.3
entrypoints ==0.4
executing ==0.8.3
flatbuffers ==1.12
fonttools ==4.30.0
gast ==0.4.0
gitdb ==4.0.9
google-auth ==2.6.0
google-auth-oauthlib ==0.4.6
google-pasta ==0.2.0
greenlet ==1.1.2
grpcio ==1.34.1
gunicorn ==20.1.0
h5py ==3.1.0
idna ==3.3
importlib-metadata ==4.11.2
importlib-resources ==5.4.0
ipykernel ==6.9.2
ipython ==8.1.1
itsdangerous ==2.1.0
jedi ==0.18.1
joblib ==1.1.0
jsonpickle ==2.1.0
jupyter-client ==7.1.2
jupyter-core ==4.9.2
keras-nightly ==2.5.0.dev2021032900
kiwisolver ==1.4.0
matplotlib ==3.5.1
matplotlib-inline ==0.1.3
mido ==1.2.10
mlflow ==1.23.1
more-itertools ==8.12.0
music21 ==6.7.1
mypy-extensions ==0.4.3
nest-asyncio ==1.5.4
numpy ==1.19.5
oauthlib ==3.2.0
opt-einsum ==3.3.0
packaging ==21.3
pandas ==1.4.1
parso ==0.8.3
pathspec ==0.9.0
pdoc ==10.0.1
pexpect ==4.8.0
pickleshare ==0.7.5
prometheus-client ==0.13.1
prometheus-flask-exporter ==0.18.7
prompt-toolkit ==3.0.28
protobuf ==3.19.4
psutil ==5.9.0
ptyprocess ==0.7.0
pure-eval ==0.2.2
pyasn1 ==0.4.8
pyasn1-modules ==0.2.8
pyparsing ==3.0.7
python-dateutil ==2.8.2
pytz ==2021.3
pyzmq ==22.3.0
querystring-parser ==1.2.4
regex ==2022.3.2
requests ==2.27.1
requests-oauthlib ==1.3.1
rsa ==4.8
scipy ==1.8.0
seaborn ==0.11.2
six ==1.15.0
smmap ==5.0.0
sqlparse ==0.4.2
stack-data ==0.2.0
tabulate ==0.8.9
tensorboard ==2.8.0
tensorboard-data-server ==0.6.1
tensorboard-plugin-wit ==1.8.1
tensorflow ==2.5.0
tensorflow-estimator ==2.5.0
termcolor ==1.1.0
toml ==0.10.2
tornado ==6.1
traitlets ==5.1.1
typed-ast ==1.5.2
typing-extensions ==3.7.4.3
urllib3 ==1.26.8
wcwidth ==0.2.5
webcolors ==1.11.1
websocket-client ==1.3.1
wrapt ==1.12.1
zipp ==3.7.0
