mmt

Official Implementation of "Multitrack Music Transformer" (ICASSP 2023)

https://github.com/salu133445/mmt

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.0%) to scientific vocabulary

Keywords

machine-learning music music-generation music-information-retrieval python

Repository

Basic Info
Statistics
  • Stars: 150
  • Watchers: 4
  • Forks: 27
  • Open Issues: 2
  • Releases: 0
Created almost 4 years ago · Last pushed almost 2 years ago

README.md

Multitrack Music Transformer

This repository contains the official implementation of "Multitrack Music Transformer" (ICASSP 2023).

Multitrack Music Transformer
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley and Taylor Berg-Kirkpatrick
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023
[homepage] [paper] [code] [reviews]

Prerequisites

We recommend using Conda. You can create the environment with the following command.

conda env create -f environment.yml

Preprocessing

Preprocessed Datasets

The preprocessed datasets can be found here.

Extract the files to data/{DATASET_KEY}/processed/json and data/{DATASET_KEY}/processed/notes, where DATASET_KEY is sod, lmd, lmd_full or snd.
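As a quick sanity check after extracting, the expected layout can be verified with a few lines of Python. This is a minimal sketch: the helper names `processed_dirs` and `missing_dirs` are hypothetical and not part of this repository; only the directory pattern and the dataset keys come from the text above.

```python
from pathlib import Path

# Dataset keys listed above.
DATASET_KEYS = ("sod", "lmd", "lmd_full", "snd")


def processed_dirs(dataset_key, root="data"):
    """Return the two directories the preprocessed files are expected in."""
    if dataset_key not in DATASET_KEYS:
        raise ValueError(f"Unknown dataset key: {dataset_key!r}")
    base = Path(root) / dataset_key / "processed"
    return base / "json", base / "notes"


def missing_dirs(dataset_key, root="data"):
    """List the expected directories that do not exist yet."""
    return [d for d in processed_dirs(dataset_key, root) if not d.is_dir()]
```

An empty list from `missing_dirs("sod")` means both `data/sod/processed/json` and `data/sod/processed/notes` are in place.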

Preprocessing Scripts

You can skip this section if you download the preprocessed datasets.

Step 1 -- Download the datasets

Please download the Symbolic Orchestral Database (SOD). You can download it from the command line as follows.

wget https://qsdfo.github.io/LOP/database/SOD.zip

We also support the following two datasets:

  • Lakh MIDI Dataset (LMD):

wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz

  • SymphonyNet Dataset (SND):

gdown "https://drive.google.com/u/0/uc?id=1j9Pvtzaq8k_QIPs8e2ikvCR-BusPluTb&export=download"

Step 2 -- Prepare the name list

Get a list of filenames for each dataset.

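find data/sod/SOD -type f \( -name '*.mid' -o -name '*.xml' \) | cut -c 14- > data/sod/original-names.txt

Note: Change the number in the cut command for different datasets. It is the length of the directory prefix to strip plus one: data/sod/SOD/ has 13 characters, so cut -c 14- keeps everything after it.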
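If you would rather not count prefix characters by hand, the same name list can be produced in Python by taking paths relative to the dataset root. This is a minimal sketch, not a script shipped with the repository; `list_names` and `write_names` are hypothetical helpers, and unlike find the output is sorted.

```python
from pathlib import Path


def list_names(root, suffixes=(".mid", ".xml")):
    """Return sorted paths of matching files, relative to `root`."""
    root = Path(root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file() and path.suffix in suffixes
    )


def write_names(root, out_path):
    """Write one relative filename per line and return the names."""
    names = list_names(root)
    Path(out_path).write_text("".join(f"{name}\n" for name in names))
    return names
```

For SOD, this would be called as `write_names("data/sod/SOD", "data/sod/original-names.txt")`.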

Step 3 -- Convert the data

Convert the MIDI and MusicXML files into MusPy files for processing.

python convert_sod.py

Note: You may enable multiprocessing with the -j option, for example, python convert_sod.py -j 10 for 10 parallel jobs.
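For readers unfamiliar with this pattern, a jobs option is typically wired up roughly as follows. This is an illustrative sketch only and does not reproduce the internals of convert_sod.py: the long option name `--jobs`, the helper `convert_all`, and the use of `multiprocessing.Pool` are all assumptions.

```python
import argparse
import multiprocessing


def parse_args(argv=None):
    """Parse a -j option (the long name --jobs is hypothetical)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-j", "--jobs", type=int, default=1,
                        help="number of parallel jobs")
    return parser.parse_args(argv)


def convert_all(names, convert_one, jobs=1):
    """Apply `convert_one` to every name, in parallel when jobs > 1."""
    if jobs <= 1:
        # Sequential fallback, easier to debug.
        return [convert_one(name) for name in names]
    # `convert_one` must be picklable (a module-level function).
    with multiprocessing.Pool(jobs) as pool:
        return pool.map(convert_one, names)
```

Running with a single job first is a convenient way to surface conversion errors before scaling up.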

Step 4 -- Extract the note list

Extract a list of notes from the MusPy JSON files.

python extract.py -d sod
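To give a rough idea of what extracting a note list involves, the sketch below flattens a MusPy-style JSON dictionary (tracks, each with a program and a list of notes) into sorted tuples. The tuple fields and their ordering are assumptions chosen for illustration; the exact representation written by extract.py is not described here.

```python
import json


def extract_notes(music):
    """Flatten a MusPy-style dict into sorted
    (time, pitch, duration, program) tuples."""
    notes = []
    for track in music.get("tracks", []):
        program = track.get("program", 0)
        for note in track.get("notes", []):
            notes.append(
                (note["time"], note["pitch"], note["duration"], program)
            )
    # Tuples sort by time first, then pitch, duration and program.
    return sorted(notes)


def extract_notes_from_file(path):
    """Load a JSON file and extract its note list."""
    with open(path, encoding="utf-8") as f:
        return extract_notes(json.load(f))
```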

Step 5 -- Split training/validation/test sets

Split the processed data into training, validation and test sets.

python split.py -d sod
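A reproducible split can be sketched as a seeded shuffle followed by slicing. The 10% validation and 10% test ratios, the seed, and the function name `split_names` below are assumptions for illustration; check split.py for the values actually used.

```python
import random


def split_names(names, ratio_valid=0.1, ratio_test=0.1, seed=0):
    """Shuffle names reproducibly and split into train/valid/test lists."""
    names = sorted(names)  # fixed starting order, independent of input order
    random.Random(seed).shuffle(names)
    n_valid = int(len(names) * ratio_valid)
    n_test = int(len(names) * ratio_test)
    valid = names[:n_valid]
    test = names[n_valid:n_valid + n_test]
    train = names[n_valid + n_test:]
    return train, valid, test
```

Sorting before shuffling keeps the split identical across runs even when the name list is read in a different order.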

Training

Pretrained Models

The pretrained models can be found here.

Training Scripts

Train a Multitrack Music Transformer model.

  • Absolute positional embedding (APE):

python mmt/train.py -d sod -o exp/sod/ape -g 0

  • Relative positional embedding (RPE):

python mmt/train.py -d sod -o exp/sod/rpe --no-abs_pos_emb --rel_pos_emb -g 0

  • No positional embedding (NPE):

python mmt/train.py -d sod -o exp/sod/npe --no-abs_pos_emb --no-rel_pos_emb -g 0
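The paired `--rel_pos_emb` / `--no-rel_pos_emb` flags follow the on/off pattern that `argparse.BooleanOptionalAction` (Python 3.9+) provides. The sketch below shows how such flags could be parsed; it is not the code in mmt/train.py, and the defaults (absolute on, relative off) are inferred from the three commands above rather than confirmed.

```python
import argparse


def parse_args(argv=None):
    """Parse the two positional-embedding switches."""
    parser = argparse.ArgumentParser()
    # Each option also accepts a --no- prefixed form to turn it off.
    parser.add_argument("--abs_pos_emb",
                        action=argparse.BooleanOptionalAction, default=True)
    parser.add_argument("--rel_pos_emb",
                        action=argparse.BooleanOptionalAction, default=False)
    return parser.parse_args(argv)


def describe(args):
    """Name the configuration selected by the two switches."""
    if args.abs_pos_emb and args.rel_pos_emb:
        return "APE+RPE"
    if args.abs_pos_emb:
        return "APE"
    if args.rel_pos_emb:
        return "RPE"
    return "NPE"
```

For example, `describe(parse_args(["--no-abs_pos_emb", "--rel_pos_emb"]))` selects the RPE configuration.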

Generation (Inference)

Generate new samples using a trained model.

python mmt/generate.py -d sod -o exp/sod/ape -g 0

Evaluation

Evaluate the trained model using objective evaluation metrics.

python mmt/evaluate.py -d sod -o exp/sod/ape -ns 100 -g 0

Acknowledgment

The code is based largely on the x-transformers library developed by lucidrains.

Citation

Please cite the following paper if you use the code provided in this repository.

Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley, and Taylor Berg-Kirkpatrick, "Multitrack Music Transformer," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.

@inproceedings{dong2023mmt,
    author = {Hao-Wen Dong and Ke Chen and Shlomo Dubnov and Julian McAuley and Taylor Berg-Kirkpatrick},
    title = {Multitrack Music Transformer},
    booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    year = 2023,
}

Owner

  • Name: Hao-Wen (Herman) Dong 董皓文
  • Login: salu133445
  • Kind: user
  • Location: USA/Taiwan
  • Company: UC San Diego

Assistant Professor at University of Michigan | PhD from UC San Diego | Human-Centered Generative AI for Content Generation

Citation (CITATION.cff)

cff-version: 1.2.0
message: If you use this software, please cite it as below.
authors:
  - family-names: Dong
    given-names: Hao-Wen
title: MMT
preferred-citation:
  type: article
  authors:
    - family-names: Dong
      given-names: Hao-Wen
    - family-names: Chen
      given-names: Ke
    - family-names: Dubnov
      given-names: Shlomo
    - family-names: McAuley
      given-names: Julian
    - family-names: Berg-Kirkpatrick
      given-names: Taylor
  title: Multitrack Music Transformer
  journal: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  year: 2023
date-released: 2022-04-21
license: MIT
url: https://salu133445.github.io/mmt/
repository-code: https://github.com/salu133445/mmt

GitHub Events

Total
  • Watch event: 14
  • Fork event: 4
Last Year
  • Watch event: 14
  • Fork event: 4

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 2
  • Total pull requests: 1
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 1
  • Average time to close issues: about 1 hour
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • HKUST-Audio (1)
  • asigalov61 (1)
Pull Request Authors
  • tatsuropfgt (1)

Dependencies

environment.yml conda
  • black
  • cudatoolkit 10.2.*
  • ffmpeg 4.3.1.*
  • flake8
  • fluidsynth 2.2.5.*
  • jupyterlab 3.3.4.*
  • matplotlib 3.5.1.*
  • numpy 1.22.3.*
  • pip 22.0.4.*
  • pylint
  • python 3.9.*
  • pytorch 1.11.0.*
  • scikit-learn 1.0.2.*
  • scipy 1.8.0.*
  • tqdm 4.64.0.*