https://github.com/ccoreilly/streaming-source-separation

Streaming source separation for music and speech files, using the Open-Unmix LSTM architecture.

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links (links to: researchgate.net)
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity (low similarity, 9.2%, to scientific vocabulary)

Last synced: 9 months ago · JSON representation

Repository


Basic Info

Host: GitHub
Owner: ccoreilly
Default Branch: master
Homepage:
Size: 81.1 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Archived Fork of tommy-fox/streaming-source-separation

Created over 3 years ago · Last pushed almost 4 years ago

https://github.com/ccoreilly/streaming-source-separation/blob/master/

# streaming-source-separation

Overview:

This project uses the Open-Unmix source separation architecture to play separated audio through the loudspeakers while the input is still being processed.

The original Open-Unmix repository can be found [here](https://github.com/sigsep/open-unmix-pytorch/blob/master/README.md).

Open-Unmix uses three bidirectional LSTM layers to estimate a spectral mask for its target source.

The final separation is produced by Wiener filtering the original mixture with the estimated spectral mask.
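
As a rough illustration of the masking step, here is a minimal single-channel sketch using NumPy. It is not the code from this repository or from Open-Unmix, which filters multichannel audio; the function names `wiener_soft_mask` and `apply_mask`, the toy spectrogram values, and the `eps` constant are all assumptions made for the example.

```python
import numpy as np


def wiener_soft_mask(target_mag, residual_mag, eps=1e-10):
    """Build a ratio mask from magnitude estimates of target and residual.

    Each time-frequency bin gets the share of power attributed to the target,
    so the values lie between 0 and 1.
    """
    target_pow = target_mag ** 2
    residual_pow = residual_mag ** 2
    return target_pow / (target_pow + residual_pow + eps)


def apply_mask(mixture_stft, mask):
    """Split a complex mixture STFT into target and residual estimates."""
    target = mask * mixture_stft
    residual = (1.0 - mask) * mixture_stft
    return target, residual


# Toy data standing in for a real STFT (frequency bins x time frames).
rng = np.random.default_rng(0)
mixture_stft = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))

# Hypothetical magnitude estimates, e.g. as a trained model might produce.
target_mag = 0.8 * np.abs(mixture_stft)
residual_mag = 0.2 * np.abs(mixture_stft)

mask = wiener_soft_mask(target_mag, residual_mag)
vocals, accompaniment = apply_mask(mixture_stft, mask)
```

Because the two masks sum to one, the target and residual estimates add back up to the mixture, which is why a single vocal model can yield either the vocals or the backing track.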

A bidirectional LSTM needs the whole input, including future frames, before it can produce output, so the online, streaming version was built by training unidirectional LSTM models

and implementing a producer-consumer multithreading system in Python.
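
The producer-consumer pattern mentioned above can be sketched with Python's standard `queue` and `threading` modules. This is only an illustration under stated assumptions, not the repository's implementation: the chunk size, the `separate` placeholder (standing in for a model call), and the choice of which thread runs the separation are all hypothetical, and the "playback" step simply appends to a list.

```python
import queue
import threading

CHUNK_SIZE = 4   # hypothetical number of samples per chunk
SENTINEL = None  # marks the end of the stream


def producer(samples, chunks):
    """Read the input in fixed-size chunks and hand them to the consumer."""
    for start in range(0, len(samples), CHUNK_SIZE):
        chunks.put(samples[start:start + CHUNK_SIZE])  # blocks if queue is full
    chunks.put(SENTINEL)


def separate(chunk):
    """Placeholder for the separation model: here it just halves each sample."""
    return [0.5 * s for s in chunk]


def consumer(chunks, played):
    """Take chunks as they arrive, process them, and 'play' the result."""
    while True:
        chunk = chunks.get()
        if chunk is SENTINEL:
            break
        played.extend(separate(chunk))


samples = [float(i) for i in range(10)]
chunks = queue.Queue(maxsize=2)  # bounded queue gives simple backpressure
played = []

producer_thread = threading.Thread(target=producer, args=(samples, chunks))
consumer_thread = threading.Thread(target=consumer, args=(chunks, played))
producer_thread.start()
consumer_thread.start()
producer_thread.join()
consumer_thread.join()
```

The bounded queue keeps the producer from running far ahead of the consumer, so output can start before the whole file has been read.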

The 'models' folder contains trained models for sung vocals and spoken speech targets.

They were uploaded with Git LFS, so Git LFS may be required to download them locally.

The model for sung vocals was trained on the MUSDB dataset.

The spoken speech model was trained on a subset of 7000 examples from Mozilla's Common Voice dataset and 7000 samples from the UrbanSound8K dataset of urban noise.

Examples:

With the provided models, the program can separate sung vocals from a musical mix or speech from environmental noise.

When processing music files, either the sung vocals or the backing instruments can be extracted.
```
python3 unmix_stream.py path_to_music_file.wav acapella
```
```
python3 unmix_stream.py path_to_music_file.wav instrumental
```
```
python3 unmix_stream.py path_to_noisy_speech_file.wav speech
```
References:

Stöter, F.-R., Uhlich, S., Liutkus, A., & Mitsufuji, Y. (2019). Open-Unmix - A Reference Implementation for Music Source Separation. Journal of Open Source Software, Open Journals, 4(41), 1667.

[Open-Unmix Repository](https://github.com/sigsep/open-unmix-pytorch/blob/master/README.md)

Mozilla (2017). Mozilla Common Voice.

[Common Voice Dataset](https://voice.mozilla.org/en)

Salamon, J., Jacoby, C., & Bello, J. P. (2014, November). A dataset and taxonomy for urban sound research. In Proceedings of the 22nd ACM international conference on Multimedia (pp. 1041-1044). ACM.

[UrbanSound Dataset Paper](https://www.researchgate.net/profile/Justin_Salamon/publication/267269056_A_Dataset_and_Taxonomy_for_Urban_Sound_Research/links/544936af0cf2f63880810a84/A-Dataset-and-Taxonomy-for-Urban-Sound-Research.pdf)

Owner

Name: Ciaran O'Reilly
Login: ccoreilly
Kind: user
Location: Berlin
Company: @parloa

Website: https://oreilly.cat
Repositories: 51
Profile: https://github.com/ccoreilly
