Spleeter
Spleeter: a fast and efficient music source separation tool with pre-trained models - Published in JOSS (2020)
Panako
Panako: a scalable audio search system - Published in JOSS (2022)
gladia-torchaudio
Data manipulation and transformation for audio signal processing, powered by PyTorch
pyaca
Python scripts accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
triton
Scripps Whale Acoustics Lab and Scripps Acoustic Ecology Lab - Triton with remoras in development
mbari-pbp
Process ocean audio data archives to daily analysis products of hybrid millidecade spectra using PyPAM.
dawdreamer
Digital Audio Workstation with Python; VST instruments/effects, parameter automation, FAUST, JAX, Warp Markers, and JUCE processors
https://github.com/fgnt/nara_wpe
Different implementations of "Weighted Prediction Error" for speech dereverberation
callsync
R package to align recordings, detect, assign, trace and analyse vocalisations
https://github.com/bytedance/salmonn
SALMONN family: A suite of advanced multi-modal LLMs
aifororcas-livesystem
Real-time AI-assisted killer whale notification system (model and moderator portal)
https://github.com/brucewlee/lama-music-genre-dataset
.wav files, training dataset (MFCC), and graph plots (FFTs, MFCCs, waveforms) from Latin America, Asia, the Middle East, and Africa
https://github.com/akiomik/precountify
A tool for adding a pre-count (count-off) click to an audio file
https://github.com/dbraun/td-faust
FAUST (Functional Audio Stream) for TouchDesigner
libaca
C++ code accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
complex-cnn-deeplab-v3-with-stft-for-audio-denoising
Paper name: "Complex Convolution Neural Network model (Complex DeepLab v3) on STFT time-varying frequency components for audio denoising". Creates a Complex DeepLab v3 model for audio denoising using an STFT complex mask. Dataset from: https://datashare.is.ed.ac.uk/handle/10283/2791
https://github.com/csteinmetz1/steerable-nafx
Steerable discovery of neural audio effects
https://github.com/bkraad47/fat_llama
fat_llama is a Python package for upscaling audio files to FLAC or WAV formats using advanced audio processing techniques. It utilizes CUDA-accelerated calculations to enhance audio quality by upsampling and adding missing frequencies through FFT, resulting in richer and more detailed audio.
https://github.com/anira-project/anira-rt-principle-check
Evaluation of real-time violations of different inference engines and validation of the real-time safety of the anira library
find-delay
A Python package to calculate the delay between two arrays or two audio files
https://github.com/aryanvbw/aivoiceclone
Transform Your Voice: Replicate Your Unique Sound in a Pristine Pre-Trained Model and Cultivate Your Custom Voiceprint
diy-smartcube
A proof-of-concept proposal for turning standard Rubik's Cubes into smartcubes by embedding speakers into the cube's center caps.
zff_vad
Unsupervised Voice Activity Detection by Modeling Source and System Information using Zero Frequency Filtering
https://github.com/alexanderlerch/aca-slides
Slides and code for "An Introduction to Audio Content Analysis", also taught at Georgia Tech as MUSI-6201. This introductory course on Music Information Retrieval is based on the textbook "An Introduction to Audio Content Analysis" (Wiley, 2012/2022).
https://github.com/alexisvassquez/ai_spotibot_player
AudioMIX is an open-source, AI-driven music production software designed to empower independent artists and DJs with mood-based audio analysis, LED integration, and creative autonomy. Spotibot was its original name.
https://github.com/akiomik/shiomi
An oscilloscope-like audio waveform GIF animation generator
anira
An architecture for neural network inference in real-time audio applications
speechbrain
A PyTorch-based Speech Toolkit
aca-code
MATLAB scripts accompanying the book "An Introduction to Audio Content Analysis" (www.AudioContentAnalysis.org)
https://github.com/google-ai-edge/mediapipe
Cross-platform, customizable ML solutions for live and streaming media.