https://github.com/ccoreilly/streaming-source-separation

Streaming source separation for music and speech files, using the Open-Unmix LSTM architecture.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: researchgate.net
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.2%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: ccoreilly
  • Default Branch: master
  • Homepage:
  • Size: 81.1 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Archived Fork of tommy-fox/streaming-source-separation
Created over 3 years ago · Last pushed almost 4 years ago

https://github.com/ccoreilly/streaming-source-separation/blob/master/

# streaming-source-separation

Overview:
This project uses the Open-Unmix source separation architecture
to play separated audio through the loudspeakers while the input is still being processed.



The original Open-Unmix repository can be found [here](https://github.com/sigsep/open-unmix-pytorch/blob/master/README.md). Open-Unmix uses three bidirectional LSTM layers to estimate a spectral mask for its target source.
The final separation is produced by Wiener filtering the original mixed signal with the estimated spectral mask.

The online, streaming version was built by training unidirectional LSTM models, which do not depend on future audio frames,
and implementing a producer-consumer multithreading system in Python.
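
The producer-consumer arrangement mentioned above can be sketched roughly as follows, using only the Python standard library. This is an illustration under stated assumptions, not the code in `unmix_stream.py`: the names `separate_chunk`, `producer`, `consumer`, `stream`, and `CHUNK_SIZE` are hypothetical, the model call is replaced by a placeholder, and appending to a list stands in for audio playback. It also assumes the producer thread does the separation and the consumer thread does the playback; the repository may divide the work differently.

```python
import queue
import threading

CHUNK_SIZE = 4      # samples per chunk (hypothetical; real chunks would be far larger)
_END = object()     # sentinel telling the consumer that the input is exhausted


def separate_chunk(chunk):
    """Placeholder for the model call; here it only halves each sample."""
    return [0.5 * s for s in chunk]


def producer(samples, chunks):
    """Cut the input into fixed-size chunks, separate each one, and queue the result."""
    for start in range(0, len(samples), CHUNK_SIZE):
        separated = separate_chunk(samples[start:start + CHUNK_SIZE])
        chunks.put(separated)   # blocks while the queue is full
    chunks.put(_END)


def consumer(chunks, played):
    """Take separated chunks in arrival order; extending the list stands in for playback."""
    while True:
        chunk = chunks.get()
        if chunk is _END:
            break
        played.extend(chunk)


def stream(samples, max_pending=2):
    """Run producer and consumer concurrently and return everything that was 'played'."""
    chunks = queue.Queue(maxsize=max_pending)   # bounded, so the producer cannot run far ahead
    played = []
    workers = [
        threading.Thread(target=producer, args=(samples, chunks)),
        threading.Thread(target=consumer, args=(chunks, played)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return played


output = stream([float(i) for i in range(10)])
```

The bounded queue is the design choice worth noting: it lets playback start as soon as the first chunk is ready while keeping the amount of buffered audio small.
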
Included in the 'models' folder are trained models for sung vocals and spoken speech targets.
These were uploaded using Git LFS, so Git LFS may be required to obtain them locally.

The model for sung vocals was trained using the MUSDB dataset,
and the spoken speech model was trained using a subset of 7000 examples
from Mozilla's Common Voice dataset and 7000 samples from the UrbanSound8k dataset of urban noise.

Examples:
Given the provided models, the program can separate sung vocals from a musical mix
or speech from environmental noise.
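
As a rough illustration of how one estimated spectral mask can yield both a target and its complement, the sketch below applies a soft mask to a single frame of magnitude values. It is a simplified stand-in under stated assumptions: the function `apply_mask` and the example values are hypothetical, it works on plain lists rather than real spectrograms, and it does not implement Wiener filtering. Whether `unmix_stream.py` derives its complementary output this way is not stated in this README.

```python
def apply_mask(mixture, mask):
    """Split one frame of magnitudes into (target, residual) using a soft mask in [0, 1]."""
    if len(mixture) != len(mask):
        raise ValueError("mixture and mask must have the same number of bins")
    target = [m * x for m, x in zip(mask, mixture)]
    residual = [(1.0 - m) * x for m, x in zip(mask, mixture)]
    return target, residual


# Hypothetical 4-bin frame and vocal mask, chosen only for illustration.
mixture_frame = [1.0, 0.5, 2.0, 0.0]
vocal_mask = [1.0, 0.0, 0.25, 0.5]

vocals, remainder = apply_mask(mixture_frame, vocal_mask)
```

In this simplified picture the two outputs always sum back to the mixture, bin by bin.
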
When evaluating music files, either sung vocals or the backing instruments may be extracted.

```
python3 unmix_stream.py path_to_music_file.wav acapella
```

```
python3 unmix_stream.py path_to_music_file.wav instrumental
```

```
python3 unmix_stream.py path_to_noisy_speech_file.wav speech
```

References:
Stöter, F.R., Uhlich, S., Liutkus, A., Mitsufuji, Y. (2019). Open-Unmix - A Reference Implementation for Music Source Separation. Journal of Open Source Software, Open Journals, 4(41), 1667.
[Open-Unmix Repository](https://github.com/sigsep/open-unmix-pytorch/blob/master/README.md)

Mozilla (2017). Mozilla Common Voice.
[Common Voice Dataset](https://voice.mozilla.org/en)

Salamon, J., Jacoby, C., & Bello, J. P. (2014, November). A dataset and taxonomy for urban sound research. In Proceedings of the 22nd ACM international conference on Multimedia (pp. 1041-1044). ACM.
[UrbanSound Dataset Paper](https://www.researchgate.net/profile/Justin_Salamon/publication/267269056_A_Dataset_and_Taxonomy_for_Urban_Sound_Research/links/544936af0cf2f63880810a84/A-Dataset-and-Taxonomy-for-Urban-Sound-Research.pdf)

Owner

  • Name: Ciaran O'Reilly
  • Login: ccoreilly
  • Kind: user
  • Location: Berlin
  • Company: @parloa
