mmt

Official Implementation of "Multitrack Music Transformer" (ICASSP 2023)

https://github.com/salu133445/mmt


Keywords

machine-learning music music-generation music-information-retrieval python


Repository

Basic Info

Host: GitHub
Owner: salu133445
License: mit
Language: Python
Default Branch: main
Homepage: https://salu133445.github.io/mmt/
Size: 410 MB

Statistics

Stars: 150
Watchers: 4
Forks: 27
Open Issues: 2
Releases: 0

Created almost 4 years ago · Last pushed about 2 years ago

Multitrack Music Transformer

This repository contains the official implementation of "Multitrack Music Transformer" (ICASSP 2023).

Multitrack Music Transformer
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley and Taylor Berg-Kirkpatrick
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
[homepage] [paper] [code] [reviews]

Content

Prerequisites
Preprocessing
- Preprocessed Datasets
- Preprocessing Scripts
Training
- Pretrained Models
- Training Scripts
Generation (Inference)
Evaluation
Acknowledgment
Citation

Prerequisites

We recommend using Conda. You can create the environment with the following command.

conda env create -f environment.yml

Preprocessing

Preprocessed Datasets

The preprocessed datasets can be found here.

Extract the files to data/{DATASET_KEY}/processed/json and data/{DATASET_KEY}/processed/notes, where DATASET_KEY is sod, lmd, lmd_full or snd.
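
As a quick sanity check after extraction, the expected layout can be verified with a short script. This is a minimal sketch: the helper name `missing_processed_dirs` is hypothetical and not part of the repository; only the directory pattern and the dataset keys come from the instructions above.

```python
from pathlib import Path

# Dataset keys listed in the instructions above.
DATASET_KEYS = ("sod", "lmd", "lmd_full", "snd")


def missing_processed_dirs(root, dataset_key):
    """Return the expected processed directories that do not exist yet.

    Hypothetical helper for illustration; not part of the repository.
    """
    if dataset_key not in DATASET_KEYS:
        raise ValueError(f"unknown dataset key: {dataset_key!r}")
    base = Path(root) / dataset_key / "processed"
    expected = [base / "json", base / "notes"]
    return [path for path in expected if not path.is_dir()]
```

Calling `missing_processed_dirs("data", "sod")` from the repository root should return an empty list once both directories are in place.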

Preprocessing Scripts

You can skip this section if you download the preprocessed datasets.

Step 1 -- Download the datasets

Please download the Symbolic Orchestral Database (SOD). You can download it from the command line as follows.

wget https://qsdfo.github.io/LOP/database/SOD.zip

We also support the following two datasets:

Lakh MIDI Dataset (LMD):

wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz

SymphonyNet Dataset:

gdown "https://drive.google.com/u/0/uc?id=1j9Pvtzaq8k_QIPs8e2ikvCR-BusPluTb&export=download"

Step 2 -- Prepare the name list

Get a list of filenames for each dataset.

find data/sod/SOD -type f \( -name '*.mid' -o -name '*.xml' \) | cut -c 14- > data/sod/original-names.txt

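Note: Change the number in the cut command for different datasets. It should be the length of the directory prefix plus one: here the prefix data/sod/SOD/ is 13 characters long, so cut -c 14- keeps everything after it.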
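
If counting prefix characters is inconvenient, the same list can be produced in Python by taking paths relative to the dataset root. This is a minimal sketch, not a script shipped with the repository: the function names `list_names` and `write_names` are hypothetical, and the result is sorted, whereas `find` returns files in directory order.

```python
from pathlib import Path


def list_names(root, suffixes=(".mid", ".xml")):
    """Return sorted paths, relative to `root`, of files with the given suffixes."""
    root = Path(root)
    names = [
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix in suffixes
    ]
    return sorted(names)


def write_names(root, output_path):
    """Write one relative filename per line and return how many were written."""
    names = list_names(root)
    Path(output_path).write_text("".join(f"{name}\n" for name in names))
    return len(names)
```

For the SOD layout used above, this would be called as `write_names("data/sod/SOD", "data/sod/original-names.txt")`.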

Step 3 -- Convert the data

Convert the MIDI and MusicXML files into MusPy files for processing.

python convert_sod.py

Note: You may enable multiprocessing with the -j option, for example, python convert_sod.py -j 10 for 10 parallel jobs.

Step 4 -- Extract the note list

Extract a list of notes from the MusPy JSON files.

python extract.py -d sod
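
To give a sense of what a note list is, the sketch below flattens a MusPy-style JSON document into sorted tuples. It assumes the usual MusPy layout (a top-level `tracks` list whose entries carry `program` and a `notes` list with `time`, `pitch` and `duration`). The tuple layout and the name `extract_notes` are illustrative assumptions and may differ from what extract.py actually writes.

```python
import json


def extract_notes(music):
    """Flatten a MusPy-style dict into sorted (time, pitch, duration, program) tuples.

    Hypothetical helper; the output format is an assumption for illustration.
    """
    notes = []
    for track in music.get("tracks", []):
        program = track.get("program", 0)
        for note in track.get("notes", []):
            notes.append((note["time"], note["pitch"], note["duration"], program))
    return sorted(notes)


# A tiny hand-written example in MusPy-style JSON (not taken from any dataset).
example = json.loads("""
{
  "resolution": 24,
  "tracks": [
    {"program": 0, "is_drum": false,
     "notes": [{"time": 24, "pitch": 64, "duration": 12, "velocity": 64},
               {"time": 0, "pitch": 60, "duration": 24, "velocity": 64}]},
    {"program": 40, "is_drum": false,
     "notes": [{"time": 0, "pitch": 72, "duration": 48, "velocity": 64}]}
  ]
}
""")

print(extract_notes(example))
```

Sorting the tuples orders notes by onset time first, which interleaves the tracks into a single sequence.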

Step 5 -- Split training/validation/test sets

Split the processed data into training, validation and test sets.

python split.py -d sod
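
The idea behind a reproducible split can be sketched as follows. The ratios, the seed, and the function name `split_names` are assumptions made for illustration; the README does not state which values split.py uses.

```python
import random


def split_names(names, valid_ratio=0.1, test_ratio=0.1, seed=0):
    """Shuffle names with a fixed seed and cut them into train/valid/test lists.

    Hypothetical helper; ratios and seed are illustrative assumptions.
    """
    names = sorted(names)  # sort first so the result does not depend on input order
    rng = random.Random(seed)
    rng.shuffle(names)
    n_valid = int(len(names) * valid_ratio)
    n_test = int(len(names) * test_ratio)
    valid = names[:n_valid]
    test = names[n_valid:n_valid + n_test]
    train = names[n_valid + n_test:]
    return train, valid, test
```

Fixing the seed keeps the three sets identical across runs, so models trained at different times are evaluated on the same held-out files.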

Training

Pretrained Models

The pretrained models can be found here.

Training Scripts

Train a Multitrack Music Transformer model.

Absolute positional embedding (APE):

python mmt/train.py -d sod -o exp/sod/ape -g 0

Relative positional embedding (RPE):

python mmt/train.py -d sod -o exp/sod/rpe --no-abs_pos_emb --rel_pos_emb -g 0

No positional embedding (NPE):

python mmt/train.py -d sod -o exp/sod/npe --no-abs_pos_emb --no-rel_pos_emb -g 0
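
The three commands differ only in two boolean switches. The sketch below shows one way such paired `--flag` / `--no-flag` options can be declared with `argparse.BooleanOptionalAction` (Python 3.9+) and how the combinations map to the three variants. The defaults are assumptions inferred from the commands above, and `build_parser` and `variant_name` are hypothetical names; this is not claimed to be how mmt/train.py is written.

```python
import argparse


def build_parser():
    """Build a parser with the two positional-embedding switches."""
    parser = argparse.ArgumentParser()
    # Assumed defaults: the APE command passes neither flag.
    parser.add_argument(
        "--abs_pos_emb", action=argparse.BooleanOptionalAction, default=True
    )
    parser.add_argument(
        "--rel_pos_emb", action=argparse.BooleanOptionalAction, default=False
    )
    return parser


def variant_name(args):
    """Map the two switches to the variant names used in the README."""
    variants = {
        (True, False): "ape",
        (False, True): "rpe",
        (False, False): "npe",
    }
    # Enabling both switches is not covered by the README.
    return variants.get((args.abs_pos_emb, args.rel_pos_emb), "ape+rpe")


parser = build_parser()
print(variant_name(parser.parse_args(["--no-abs_pos_emb", "--rel_pos_emb"])))
```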

Generation (Inference)

Generate new samples using a trained model.

python mmt/generate.py -d sod -o exp/sod/ape -g 0

Evaluation

Evaluate the trained model using objective evaluation metrics.

python mmt/evaluate.py -d sod -o exp/sod/ape -ns 100 -g 0
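
The README does not name the metrics computed by mmt/evaluate.py. Purely as an illustration of what an objective metric for symbolic music can look like, the sketch below computes pitch-class entropy: the Shannon entropy, in bits, of the 12-bin pitch-class histogram of a list of MIDI pitches. The function name is hypothetical, and this is not claimed to match the repository's implementation.

```python
import math
from collections import Counter


def pitch_class_entropy(pitches):
    """Return the Shannon entropy (bits) of the pitch-class distribution.

    Illustrative metric only; not taken from the repository.
    """
    counts = Counter(pitch % 12 for pitch in pitches)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log2(count / total) for count in counts.values()
    )
```

A sample that uses a single pitch class scores 0 bits, while one that uses all twelve pitch classes equally often reaches the maximum of log2(12) bits.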

Acknowledgment

The code is largely based on the x-transformers library developed by lucidrains.

Citation

Please cite the following paper if you use the code provided in this repository.

Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, and Taylor Berg-Kirkpatrick, "Multitrack Music Transformer," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.

@inproceedings{dong2023mmt,
    author = {Hao-Wen Dong and Ke Chen and Shlomo Dubnov and Julian McAuley and Taylor Berg-Kirkpatrick},
    title = {Multitrack Music Transformer},
    booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    year = 2023,
}

Owner

Name: Hao-Wen (Herman) Dong 董皓文
Login: salu133445
Kind: user
Location: USA/Taiwan
Company: UC San Diego

Website: hermandong.com
Twitter: hermanhwdong
Repositories: 26
Profile: https://github.com/salu133445

Assistant Professor at University of Michigan | PhD from UC San Diego | Human-Centered Generative AI for Content Generation

Citation (CITATION.cff)

cff-version: 1.2.0
message: If you use this software, please cite it as below.
authors:
  - family-names: Dong
    given-names: Hao-Wen
title: MMT
preferred-citation:
  type: article
  authors:
    - family-names: Dong
      given-names: Hao-Wen
    - family-names: Chen
      given-names: Ke
    - family-names: Dubnov
      given-names: Shlomo
    - family-names: McAuley
      given-names: Julian
    - family-names: Berg-Kirkpatrick
      given-names: Taylor
  title: Multitrack Music Transformer
  journal: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  year: 2023
date-released: 2022-04-21
license: MIT
url: https://salu133445.github.io/mmt/
repository-code: https://github.com/salu133445/mmt

GitHub Events

Total

Watch event: 14
Fork event: 4

Last Year

Watch event: 14
Fork event: 4

Issues and Pull Requests

Last synced: over 1 year ago

All Time

Total issues: 2
Total pull requests: 1
Average time to close issues: about 1 hour
Average time to close pull requests: N/A
Total issue authors: 2
Total pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 1
Average time to close issues: about 1 hour
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

HKUST-Audio (1)
asigalov61 (1)

Pull Request Authors

tatsuropfgt (1)

Dependencies

environment.yml conda

black
cudatoolkit 10.2.*
ffmpeg 4.3.1.*
flake8
fluidsynth 2.2.5.*
jupyterlab 3.3.4.*
matplotlib 3.5.1.*
numpy 1.22.3.*
pip 22.0.4.*
pylint
python 3.9.*
pytorch 1.11.0.*
scikit-learn 1.0.2.*
scipy 1.8.0.*
tqdm 4.64.0.*
