senseflosser
Repository
Basic Info
- Host: GitHub
- Owner: hyve9
- License: mit
- Language: Python
- Default Branch: main
- Size: 77.8 MB
Statistics
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
Logo generated by ChatGPT (because of course it is).
senseflosser
Takes an autoencoder keras model and adversarially attempts to deteriorate layers and neurons.
"senseflosser" is a homophonic translation of Sinneslöschen, itself a non-idiomatic translation of "sense deletion", from the Polybius mythos.
Usage
```
usage: build_autoencoder.py [-h] --data-dir DATA_DIR [--duration DURATION] [--var-input] [--percentage PERCENTAGE] [--log LOG]

optional arguments:
  -h, --help            show this help message and exit
  --data-dir DATA_DIR   Directory containing the audio files
  --duration DURATION   Duration of audio (in seconds) to use for training
  --var-input           Use variable input size for model
  --percentage PERCENTAGE
                        Percentage of dataset to use
  --log LOG             Logging level (choose from: critical, error, warn, info, debug)
```
```
usage: run_senseflosser.py [-h] [--model MODEL] [--magnitude MAGNITUDE] [--titrate] [--duration DURATION] [--action ACTION] [--input INPUT] [--output-dir OUTPUT_DIR] [--save-model] [--log LOG]

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL         Model file to load
  --magnitude MAGNITUDE
                        Magnitude of noise to introduce
  --titrate             Titrate noise magnitude
  --duration DURATION   Duration of audio in seconds
  --action ACTION       Action to perform (currently fog or lapse)
  --input INPUT         Input file to process
  --output-dir OUTPUT_DIR
                        Output directory (default: ./output)
  --save-model          Save flossed model
  --log LOG             Logging level (choose from: critical, error, warn, info, debug)
```
Note: All tools/programs must be run from the root of this directory.
Setup: Create conda environment
conda env create -f envs/environment.yml -n senseflosser
conda activate senseflosser
Autoencoder
This repo contains two tools (which should probably be separated into different projects). The first simply builds an audio autoencoder, which should reproduce its audio input. To build your own autoencoder, you can run, for example:
python build_autoencoder.py --data-dir data/fma_small --percentage 0.4 --duration 10 --log debug --var-input
This will build an autoencoder trained on 10 second samples grabbed from 40% of your data, and will build the model in such a way that it supports inference (prediction) on variable length input.
The model will be saved under models/<duration>s_audio_autoencoder.h5
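As a rough illustration of the numbers behind the example command, the sketch below computes how many samples a clip of a given duration holds and the path the model would be saved under. It assumes the 22050 Hz sample rate described under Data preprocessing; the helper names `samples_per_clip` and `model_path` are hypothetical and are not part of this repo.

```python
from pathlib import Path

SAMPLE_RATE = 22050  # Hz; assumed from the Data preprocessing section


def samples_per_clip(duration_s: int, sample_rate: int = SAMPLE_RATE) -> int:
    """Number of audio samples in a clip lasting `duration_s` seconds."""
    return duration_s * sample_rate


def model_path(duration_s: int, models_dir: str = "models") -> Path:
    """Path following the models/<duration>s_audio_autoencoder.h5 pattern."""
    return Path(models_dir) / f"{duration_s}s_audio_autoencoder.h5"


if __name__ == "__main__":
    print(samples_per_clip(10))        # 220500
    print(model_path(10).as_posix())   # models/10s_audio_autoencoder.h5
```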
Data preprocessing
The autoencoder is designed to work with 22.05 kHz (22050 Hz), 16-bit, single-channel audio .wav files. Any other file format or encoding will cause the model to fail, although I'm sure it can be designed to be more robust. Submit a PR if you have an idea!
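If you want to check a file before training, something like the following sketch may help. It uses only Python's standard `wave` module to compare a file against the expected format; the function name `wav_format_problems` is hypothetical and is not part of this repo.

```python
import wave

EXPECTED_RATE = 22050    # Hz
EXPECTED_WIDTH = 2       # bytes per sample, i.e. 16-bit
EXPECTED_CHANNELS = 1    # mono


def wav_format_problems(path: str) -> list[str]:
    """Return a list of mismatches between a WAV file and the expected format."""
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {EXPECTED_RATE}"
            )
        if wav.getsampwidth() != EXPECTED_WIDTH:
            problems.append(
                f"sample width is {wav.getsampwidth()} bytes, expected {EXPECTED_WIDTH}"
            )
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(
                f"{wav.getnchannels()} channels, expected {EXPECTED_CHANNELS}"
            )
    return problems
```

An empty list means the file matches. Note that `wave.open` raises `wave.Error` for files that are not PCM WAV at all, which is also a sign the file needs converting.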
If you aren't lucky enough to have data in that format, there are two shell scripts that can do this for you. One will iterate through your data directory, convert all files of the given type (mp3 in the example below) to .wav, reduce them to mono, and resample them to 22050 Hz. To use:
./tools/fma_wav_full_converter.sh data/fma_small/ mp3
If you just want to mix to single channel and resample (you already have wav files):
./tools/fma_wav_resampler.sh data/fma_small/
Senseflosser
Once you've built your autoencoder (or maybe you brought your own), you can experiment with degrading some layers by running senseflosser:
python run_senseflosser.py --input data/fma_small/006/006329.wav --model models/15s_audio_autoencoder.h5 --action lapse --magnitude 0.05 --duration 30
This takes an input audio file, a pre-trained autoencoder, an action (either "fog" or "lapse"; here "lapse"), a magnitude (here 0.05), and a duration for the audio. Note that you can specify a duration longer than the inputs the model was trained on only if the autoencoder was built with the --var-input option.
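This README does not describe what "fog" and "lapse" do internally, so the sketch below is only one plausible reading and not the repo's implementation. It assumes "fog" adds zero-mean Gaussian noise to a layer's weights and "lapse" zeroes weights at random, with `magnitude` controlling the strength of each. The function names and the flat list of weights are illustrative assumptions.

```python
import random


def fog(weights, magnitude, rng):
    """Assumed 'fog': add Gaussian noise with standard deviation `magnitude`."""
    return [w + rng.gauss(0.0, magnitude) for w in weights]


def lapse(weights, magnitude, rng):
    """Assumed 'lapse': zero each weight independently with probability `magnitude`."""
    return [0.0 if rng.random() < magnitude else w for w in weights]


if __name__ == "__main__":
    rng = random.Random(0)            # seeded so runs are repeatable
    layer = [0.5, -1.2, 0.3, 0.8]     # stand-in for one layer's weights
    print(fog(layer, 0.05, rng))
    print(lapse(layer, 0.05, rng))
```

Under these assumptions a magnitude of 0 leaves the weights untouched, and larger values move the model further from what it learned.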
Based on experimentation, good values for magnitude are between 0.02 and 0.5. Values higher than 0.5 severely degrade the audio (which may be what you want!).
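To compare several magnitudes side by side, you could generate one command per value. The sketch below only builds and prints the commands, using the flags documented under Usage. The sweep values, the per-run output directory names, and the helper `build_command` are hypothetical choices for illustration.

```python
import shlex
import sys

# Hypothetical sweep values inside the suggested 0.02 to 0.5 range.
MAGNITUDES = (0.02, 0.05, 0.1, 0.25, 0.5)


def build_command(input_file, model_file, action, magnitude, duration, output_dir):
    """Assemble the argument list for one run_senseflosser.py invocation."""
    return [
        sys.executable, "run_senseflosser.py",
        "--input", input_file,
        "--model", model_file,
        "--action", action,
        "--magnitude", str(magnitude),
        "--duration", str(duration),
        "--output-dir", output_dir,
    ]


if __name__ == "__main__":
    for magnitude in MAGNITUDES:
        cmd = build_command(
            "data/fma_small/006/006329.wav",
            "models/15s_audio_autoencoder.h5",
            "lapse",
            magnitude,
            30,
            f"output/lapse_{magnitude}",  # hypothetical per-run directory
        )
        # Printed rather than executed; pass cmd to subprocess.run to run it.
        print(shlex.join(cmd))
```

Run the printed commands from the root of this directory. Whether the tool creates a missing output directory is not stated here, so you may need to create the directories first.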
If you want to save your degraded model, add the --save-model option.
Web Frontend
You can run this tool as a web application for demonstration purposes. See flask for more information.
Using as a module
You can include Senseflosser as a module for your work.
git submodule add git@github.com:hyve9/senseflosser.git
git submodule update --init
Then in your code:
```
import sys
from pathlib import Path

# Make the submodule importable from the current working directory
senseflosser_dir = (Path.cwd() / 'senseflosser/')
sys.path.append(str(senseflosser_dir))
from senseflosser.utils import run_senseflosser

output_files = run_senseflosser(model_file, magnitude, action, input_file, output_dir, duration, titrate, save_model)
```
Acknowledgments
This project was built as a group project for the Spring '24 Deep Learning for Media class at NYU, and we had a lot of help from both faculty and colleagues in that class. Thanks, guys!
- FMA Dataset: The autoencoders included in this repo were trained on (preprocessed) data from the FMA Dataset, and the two shell scripts mentioned above expect a similar directory structure. The FMA dataset is licensed under the MIT License.
- Lightning AI: The Lightning Studio compute resource for this project was provided by Lightning AI.
Owner
- Login: hyve9
- Kind: user
- Repositories: 1
- Profile: https://github.com/hyve9
Citation (CITATION.cff)
cff-version: 1.2.0
title: Senseflosser
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: A.
    family-names: Brewer
    email: afb8252@nyu.edu
    affiliation: New York University
  - given-names: M.
    family-names: Frick
    email: mf2675@nyu.edu
    affiliation: New York University
  - given-names: M.
    family-names: Falgowski
    email: mlf435@nyu.edu
    affiliation: New York University
repository-code: 'https://github.com/hyve9/senseflosser'
abstract: >-
  Takes an autoencoder keras model and adversarially attempts
  to deteriorate layers and neurons.
keywords:
  - machine learning
  - music
license: MIT
version: 0.0.2
date-released: '2024-04-30'
Dependencies
- absl-py ==2.1.0
- anyio ==4.3.0
- argon2-cffi ==23.1.0
- argon2-cffi-bindings ==21.2.0
- arrow ==1.3.0
- asttokens ==2.4.1
- astunparse ==1.6.3
- async-lru ==2.0.4
- attrs ==23.2.0
- audioread ==3.0.1
- babel ==2.14.0
- beautifulsoup4 ==4.12.3
- black ==24.3.0
- bleach ==6.1.0
- cachetools ==5.3.3
- certifi ==2024.2.2
- cffi ==1.16.0
- chardet ==5.2.0
- charset-normalizer ==3.3.2
- click ==8.1.7
- comm ==0.2.2
- contourpy ==1.2.0
- cycler ==0.12.1
- debugpy ==1.8.1
- decorator ==5.1.1
- defusedxml ==0.7.1
- deprecated ==1.2.14
- distlib ==0.3.8
- dm-tree ==0.1.8
- exceptiongroup ==1.2.0
- executing ==2.0.1
- fastjsonschema ==2.19.1
- filelock ==3.13.3
- flatbuffers ==24.3.7
- fonttools ==4.50.0
- fqdn ==1.5.1
- future ==1.0.0
- gast ==0.4.0
- google-auth ==2.29.0
- google-auth-oauthlib ==0.4.6
- google-pasta ==0.2.0
- grpcio ==1.62.1
- h11 ==0.14.0
- h5py ==3.10.0
- httpcore ==1.0.4
- httpx ==0.27.0
- idna ==3.7
- importlib-metadata ==7.0.2
- importlib-resources ==6.4.0
- iniconfig ==2.0.0
- ipykernel ==6.29.3
- ipython ==8.18.1
- isoduration ==20.11.0
- jams ==0.3.4
- jedi ==0.19.1
- jinja2 ==3.1.3
- joblib ==1.3.2
- json5 ==0.9.24
- jsonpointer ==2.4
- jsonschema ==4.21.1
- jsonschema-specifications ==2023.12.1
- jupyter-client ==8.6.1
- jupyter-core ==5.7.2
- jupyter-events ==0.10.0
- jupyter-lsp ==2.2.4
- jupyter-server ==2.13.0
- jupyter-server-terminals ==0.5.3
- jupyterlab ==4.1.5
- jupyterlab-pygments ==0.3.0
- jupyterlab-server ==2.25.4
- keras ==2.10.0
- keras-preprocessing ==1.1.2
- keras-tcn ==3.5.0
- kiwisolver ==1.4.5
- lazy-loader ==0.4
- libclang ==16.0.6
- librosa ==0.10.1
- llvmlite ==0.42.0
- markdown ==3.5.2
- markdown-it-py ==3.0.0
- markupsafe ==2.1.5
- matplotlib ==3.8.3
- matplotlib-inline ==0.1.6
- mdurl ==0.1.2
- mido ==1.3.2
- mir-eval ==0.7
- mirdata ==0.3.8
- mistune ==3.0.2
- ml-dtypes ==0.3.2
- msgpack ==1.0.8
- mypy-extensions ==1.0.0
- namex ==0.0.7
- nbclient ==0.10.0
- nbconvert ==7.16.3
- nbformat ==5.10.3
- nest-asyncio ==1.6.0
- notebook-shim ==0.2.4
- numba ==0.59.0
- numpy ==1.26.4
- nvidia-cublas-cu12 ==12.3.4.1
- nvidia-cuda-cupti-cu12 ==12.3.101
- nvidia-cuda-nvcc-cu12 ==12.3.107
- nvidia-cuda-nvrtc-cu12 ==12.3.107
- nvidia-cuda-runtime-cu12 ==12.3.101
- nvidia-cudnn-cu12 ==8.9.7.29
- nvidia-cufft-cu12 ==11.0.12.1
- nvidia-curand-cu12 ==10.3.4.107
- nvidia-cusolver-cu12 ==11.5.4.101
- nvidia-cusparse-cu12 ==12.2.0.103
- nvidia-nccl-cu12 ==2.19.3
- nvidia-nvjitlink-cu12 ==12.3.101
- oauthlib ==3.2.2
- opencv-python ==4.9.0.80
- opt-einsum ==3.3.0
- optree ==0.11.0
- overrides ==7.7.0
- packaging ==23.2
- pandas ==2.2.1
- pandocfilters ==1.5.1
- parso ==0.8.3
- pathspec ==0.12.1
- pexpect ==4.9.0
- pillow ==10.2.0
- pip ==24.0
- platformdirs ==4.2.0
- pluggy ==1.4.0
- pooch ==1.8.1
- pretty-midi ==0.2.10
- prometheus-client ==0.20.0
- prompt-toolkit ==3.0.43
- protobuf ==3.19.6
- psutil ==5.9.8
- ptyprocess ==0.7.0
- pure-eval ==0.2.2
- pyasn1 ==0.6.0
- pyasn1-modules ==0.4.0
- pycparser ==2.22
- pygments ==2.17.2
- pyparsing ==3.1.2
- pytest ==8.1.1
- python-dateutil ==2.9.0.post0
- python-json-logger ==2.0.7
- pytz ==2024.1
- pyyaml ==6.0.1
- pyzmq ==25.1.2
- referencing ==0.34.0
- requests ==2.31.0
- requests-oauthlib ==2.0.0
- rfc3339-validator ==0.1.4
- rfc3986-validator ==0.1.1
- rich ==13.7.1
- rpds-py ==0.18.0
- rsa ==4.9
- scikit-learn ==1.4.1.post1
- scipy ==1.12.0
- send2trash ==1.8.2
- six ==1.16.0
- smart-open ==7.0.4
- sniffio ==1.3.1
- sortedcontainers ==2.4.0
- soundfile ==0.12.1
- soupsieve ==2.5
- soxr ==0.3.7
- stack-data ==0.6.3
- tensorboard ==2.10.1
- tensorboard-data-server ==0.6.1
- tensorboard-plugin-wit ==1.8.1
- tensorflow ==2.10.1
- tensorflow-addons ==0.23.0
- tensorflow-estimator ==2.10.0
- tensorflow-hub ==0.16.1
- tensorflow-io-gcs-filesystem ==0.36.0
- tensorrt ==8.6.0
- tensorrt-cu12 ==10.0.1
- tensorrt-cu12-bindings ==10.0.1
- tensorrt-cu12-libs ==10.0.1
- termcolor ==2.4.0
- terminado ==0.18.1
- tf-keras ==2.15.0
- threadpoolctl ==3.4.0
- tinycss2 ==1.2.1
- tomli ==2.0.1
- tornado ==6.4
- tqdm ==4.66.2
- traitlets ==5.14.2
- typeguard ==2.13.3
- types-python-dateutil ==2.9.0.20240316
- typing-extensions ==4.10.0
- tzdata ==2024.1
- uri-template ==1.3.0
- urllib3 ==2.2.1
- virtualenv ==20.25.1
- wcwidth ==0.2.13
- webcolors ==1.13
- webencodings ==0.5.1
- websocket-client ==1.7.0
- werkzeug ==3.0.2
- wrapt ==1.16.0
- zipp ==3.18.0