mmt
Official Implementation of "Multitrack Music Transformer" (ICASSP 2023)
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: salu133445
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://salu133445.github.io/mmt/
- Size: 410 MB
Statistics
- Stars: 150
- Watchers: 4
- Forks: 27
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
Multitrack Music Transformer
This repository contains the official implementation of "Multitrack Music Transformer" (ICASSP 2023).
Multitrack Music Transformer
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley and Taylor Berg-Kirkpatrick
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
[homepage]
[paper]
[code]
[reviews]
Prerequisites
We recommend using Conda. You can create the environment with the following command.
```sh
conda env create -f environment.yml
```
Preprocessing
Preprocessed Datasets
The preprocessed datasets can be found here.
Extract the files to data/{DATASET_KEY}/processed/json and data/{DATASET_KEY}/processed/notes, where DATASET_KEY is sod, lmd, lmd_full or snd.
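The layout described above can be checked with a short script before training. This is a minimal sketch: the helper names `processed_dirs` and `missing_dirs` are hypothetical and not part of this repository, and only the dataset keys and directory pattern come from the text above.

```python
from pathlib import Path

# Dataset keys named in the README.
DATASET_KEYS = ("sod", "lmd", "lmd_full", "snd")


def processed_dirs(dataset_key, root="data"):
    """Return the (json, notes) directories expected for one dataset key."""
    if dataset_key not in DATASET_KEYS:
        raise ValueError(f"unknown dataset key: {dataset_key!r}")
    base = Path(root) / dataset_key / "processed"
    return base / "json", base / "notes"


def missing_dirs(root="data"):
    """List expected directories that do not exist yet under `root`."""
    return [
        d
        for key in DATASET_KEYS
        for d in processed_dirs(key, root)
        if not d.is_dir()
    ]
```

Calling `missing_dirs()` from the repository root lists any directory that still needs to be populated. Datasets you do not plan to use will also show up as missing.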
Preprocessing Scripts
You can skip this section if you download the preprocessed datasets.
Step 1 -- Download the datasets
Please download the Symbolic orchestral database (SOD). You may download it via command line as follows.
```sh
wget https://qsdfo.github.io/LOP/database/SOD.zip
```
We also support the following two datasets:
```sh
wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz
```
```sh
# Quote the URL so the shell does not treat "&" as a background operator.
gdown "https://drive.google.com/u/0/uc?id=1j9Pvtzaq8k_QIPs8e2ikvCR-BusPluTb&export=download"
```
Step 2 -- Prepare the name list
Get a list of filenames for each dataset.
```sh
find data/sod/SOD -type f \( -name "*.mid" -o -name "*.xml" \) | cut -c 14- > data/sod/original-names.txt
```
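Note: The number in the `cut` command is the length of the directory prefix plus one, so change it for other datasets. Here the prefix `data/sod/SOD/` is 13 characters long, so `cut -c 14-` keeps only the path relative to that directory.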
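If hand-counting the prefix length is inconvenient, the same name list can be produced in Python, where the relative path is computed rather than cut by character offset. This is a sketch, not a script shipped with the repository: `list_names` and `write_names` are hypothetical helpers, and sorting the output is a choice made here for reproducibility (`find` does not sort).

```python
from pathlib import Path

# Extensions matched by the `find` command above.
EXTENSIONS = {".mid", ".xml"}


def list_names(root):
    """Return sorted POSIX-style paths of matching files, relative to `root`."""
    root = Path(root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix in EXTENSIONS
    )


def write_names(root, out_path):
    """Write one relative filename per line and return how many were written."""
    names = list_names(root)
    Path(out_path).write_text("".join(f"{name}\n" for name in names), encoding="utf-8")
    return len(names)
```

For example, `write_names("data/sod/SOD", "data/sod/original-names.txt")` would write the SOD name list without any dataset-specific offset.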
Step 3 -- Convert the data
Convert the MIDI and MusicXML files into MusPy files for processing.
```sh
python convert_sod.py
```
Note: You may enable multiprocessing with the `-j` option, for example, `python convert_sod.py -j 10` for 10 parallel jobs.
Step 4 -- Extract the note list
Extract a list of notes from the MusPy JSON files.
```sh
python extract.py -d sod
```
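To give a rough idea of what extracting a note list involves, here is a minimal sketch that flattens a MusPy-style JSON document into one sorted list of notes. It is not the logic of `extract.py`: the function names, the output tuple layout, and the decision to skip drum tracks are all assumptions made for illustration. The field names (`tracks`, `program`, `is_drum`, `notes`, `time`, `pitch`, `duration`) follow the MusPy JSON format.

```python
import json


def extract_notes(music):
    """Flatten a MusPy-style dict into sorted (time, pitch, duration, program) tuples.

    Drum tracks are skipped here; that is a choice of this sketch.
    """
    notes = []
    for track in music.get("tracks", []):
        if track.get("is_drum", False):
            continue
        program = track.get("program", 0)
        for note in track.get("notes", []):
            notes.append((note["time"], note["pitch"], note["duration"], program))
    return sorted(notes)


def extract_notes_from_file(path):
    """Load a MusPy JSON file and return its flattened note list."""
    with open(path, encoding="utf-8") as f:
        return extract_notes(json.load(f))
```

Sorting the tuples orders notes by onset time first, then by pitch, which gives a deterministic sequence across tracks.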
Step 5 -- Split training/validation/test sets
Split the processed data into training, validation and test sets.
```sh
python split.py -d sod
```
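A reproducible split can be sketched as a seeded shuffle followed by slicing. The ratios (10% validation, 10% test), the seed, and the function name `split_names` below are assumptions for illustration. Check `split.py` for the values actually used.

```python
import random


def split_names(names, ratio_valid=0.1, ratio_test=0.1, seed=0):
    """Shuffle `names` reproducibly and split into train/valid/test lists."""
    if ratio_valid < 0 or ratio_test < 0 or ratio_valid + ratio_test >= 1:
        raise ValueError("ratios must be non-negative and sum to less than 1")
    shuffled = sorted(names)  # sort first so the result ignores input order
    random.Random(seed).shuffle(shuffled)
    n_valid = int(len(shuffled) * ratio_valid)
    n_test = int(len(shuffled) * ratio_test)
    valid = shuffled[:n_valid]
    test = shuffled[n_valid : n_valid + n_test]
    train = shuffled[n_valid + n_test :]
    return {"train": train, "valid": valid, "test": test}
```

Using a private `random.Random(seed)` instance keeps the split independent of any other code that touches the global random state.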
Training
Pretrained Models
The pretrained models can be found here.
Training Scripts
Train a Multitrack Music Transformer model.
- Absolute positional embedding (APE):
python mmt/train.py -d sod -o exp/sod/ape -g 0
- Relative positional embedding (RPE):
python mmt/train.py -d sod -o exp/sod/rpe --no-abs_pos_emb --rel_pos_emb -g 0
- No positional embedding (NPE):
python mmt/train.py -d sod -o exp/sod/npe --no-abs_pos_emb --no-rel_pos_emb -g 0
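The paired flags in these commands (`--rel_pos_emb` / `--no-rel_pos_emb`) match the pattern that `argparse.BooleanOptionalAction` produces in Python 3.9+. The sketch below shows how such flags could be declared and how the three commands map to APE, RPE and NPE. It is an illustration, not the parser in `mmt/train.py`: the long option names `--dataset`, `--out_dir` and `--gpu`, the defaults, and the helper `embedding_mode` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser whose flags mirror the three training commands."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--dataset", required=True)
    parser.add_argument("-o", "--out_dir", required=True)
    parser.add_argument("-g", "--gpu", type=int)
    # BooleanOptionalAction (Python 3.9+) creates both --X and --no-X.
    parser.add_argument(
        "--abs_pos_emb", action=argparse.BooleanOptionalAction, default=True
    )
    parser.add_argument(
        "--rel_pos_emb", action=argparse.BooleanOptionalAction, default=False
    )
    return parser


def embedding_mode(args):
    """Name the positional-embedding setup selected by the parsed flags."""
    if args.abs_pos_emb and args.rel_pos_emb:
        return "APE+RPE"
    if args.abs_pos_emb:
        return "APE"
    if args.rel_pos_emb:
        return "RPE"
    return "NPE"
```

With these assumed defaults, passing no embedding flags selects APE, which is consistent with the first command above needing none.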
Generation (Inference)
Generate new samples using a trained model.
```sh
python mmt/generate.py -d sod -o exp/sod/ape -g 0
```
Evaluation
Evaluate the trained model using objective evaluation metrics.
```sh
python mmt/evaluate.py -d sod -o exp/sod/ape -ns 100 -g 0
```
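This README does not list the metrics that `evaluate.py` computes. As an illustration of what an objective metric for symbolic music can look like, the sketch below computes pitch-class entropy, the Shannon entropy of the 12-bin pitch-class histogram. Treat it as an example of the general idea. Whether and how the evaluation script computes this quantity is an assumption.

```python
import math
from collections import Counter


def pitch_class_entropy(pitches):
    """Shannon entropy (in bits) of the pitch-class histogram of MIDI pitches."""
    if not pitches:
        return float("nan")
    counts = Counter(pitch % 12 for pitch in pitches)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

The value is 0 when every note shares one pitch class and reaches its maximum, log2(12), when all twelve pitch classes are used equally often.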
Acknowledgment
The code is based largely on the x-transformers library developed by lucidrains.
Citation
Please cite the following paper if you use the code provided in this repository.
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, and Taylor Berg-Kirkpatrick, "Multitrack Music Transformer," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.
```bibtex
@inproceedings{dong2023mmt,
    author = {Hao-Wen Dong and Ke Chen and Shlomo Dubnov and Julian McAuley and Taylor Berg-Kirkpatrick},
    title = {Multitrack Music Transformer},
    booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    year = 2023,
}
```
Owner
- Name: Hao-Wen (Herman) Dong 董皓文
- Login: salu133445
- Kind: user
- Location: USA/Taiwan
- Company: UC San Diego
- Website: hermandong.com
- Twitter: hermanhwdong
- Repositories: 26
- Profile: https://github.com/salu133445
Assistant Professor at University of Michigan | PhD from UC San Diego | Human-Centered Generative AI for Content Generation
Citation (CITATION.cff)
cff-version: 1.2.0
message: If you use this software, please cite it as below.
authors:
  - family-names: Dong
    given-names: Hao-Wen
title: MMT
preferred-citation:
  type: article
  authors:
    - family-names: Dong
      given-names: Hao-Wen
    - family-names: Chen
      given-names: Ke
    - family-names: Dubnov
      given-names: Shlomo
    - family-names: McAuley
      given-names: Julian
    - family-names: Berg-Kirkpatrick
      given-names: Taylor
  title: Multitrack Music Transformer
  journal: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  year: 2023
date-released: 2022-04-21
license: MIT
url: https://salu133445.github.io/mmt/
repository-code: https://github.com/salu133445/mmt
GitHub Events
Total
- Watch event: 14
- Fork event: 4
Last Year
- Watch event: 14
- Fork event: 4
Issues and Pull Requests
Last synced: over 1 year ago
All Time
- Total issues: 2
- Total pull requests: 1
- Average time to close issues: about 1 hour
- Average time to close pull requests: N/A
- Total issue authors: 2
- Total pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 1
- Average time to close issues: about 1 hour
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- HKUST-Audio (1)
- asigalov61 (1)
Pull Request Authors
- tatsuropfgt (1)
Dependencies
- black
- cudatoolkit 10.2.*
- ffmpeg 4.3.1.*
- flake8
- fluidsynth 2.2.5.*
- jupyterlab 3.3.4.*
- matplotlib 3.5.1.*
- numpy 1.22.3.*
- pip 22.0.4.*
- pylint
- python 3.9.*
- pytorch 1.11.0.*
- scikit-learn 1.0.2.*
- scipy 1.8.0.*
- tqdm 4.64.0.*