https://github.com/bkraad47/fat_llama

fat_llama is a Python package for upscaling audio files to FLAC or WAV formats using advanced audio processing techniques. It utilizes CUDA-accelerated calculations to enhance audio quality by upsampling and adding missing frequencies through FFT, resulting in richer and more detailed audio.

Keywords

audio audio-engineering audio-processing audiophile cuda cufft cupy fft flac hi-res hpc mp3 music nvidia ogg parallel-computing physics upscaling wav

Last synced: 5 months ago

Repository

Basic Info

Host: GitHub
Owner: bkraad47
License: bsd-3-clause
Language: Python
Default Branch: main
Homepage: https://pypi.org/project/fat-llama/
Size: 52.9 MB

Statistics

Stars: 22
Watchers: 3
Forks: 2
Open Issues: 1
Releases: 0

Created over 1 year ago · Last pushed over 1 year ago

Metadata Files

Readme License

README.md

Fat Llama

fat_llama is a Python package for upscaling audio files to FLAC or WAV formats using advanced audio processing techniques. It utilizes CUDA-accelerated calculations to enhance audio quality by upsampling and adding missing frequencies through FFT (Fast Fourier Transform), resulting in richer and more detailed audio.

Features

Upscale MP3 files to high-quality FLAC format.
Iterative soft thresholding (IST) for enhanced audio processing.
Auto-scaling amplitude adjustment and normalization.
Supports GPU-accelerated processing with CuPy.

Requirements

CUDA-capable GPU

(Note: For the CPU version, see https://pypi.org/project/fat-llama-fftw/)

Installation

Install via pip:

pip install fat-llama

Note: This version works with CUDA 12.

CUDA and CuPy must also be properly installed: https://docs.cupy.dev/en/stable/install.html

ffmpeg is also required: https://support.audacityteam.org/basics/installing-ffmpeg

Note: To install on older versions of CUDA and CuPy, download the matching version below and install it locally.

cupy version - https://github.com/bkraad47/fat_llama/tree/v-0.1.3---cupy
cupy-cuda11x version - https://github.com/bkraad47/fat_llama/tree/v-0.1.3---cupy-cuda11x

To install locally:

git clone <target_url>
cd fat_llama
pip install .
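Before installing, it can help to confirm that the two external prerequisites named above are visible from Python. The following is a minimal sketch, not part of fat_llama: the helper name `check_environment` is hypothetical, and it only checks that a `cupy` module can be located and that an `ffmpeg` executable is on the PATH. It does not verify that CuPy can actually reach a working CUDA device.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether CuPy and ffmpeg appear to be available (hypothetical helper)."""
    return {
        # True if an importable module named "cupy" can be located.
        "cupy": importlib.util.find_spec("cupy") is not None,
        # True if an "ffmpeg" executable is found on the PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'missing'}")
```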

Usage

Example Usage

You can run the example provided in example.py:

```
from fat_llama.audio_fattener.feed import upscale

# Example call to the method
upscale(
    input_file_path='input_test.mp3',
    output_file_path='output_test.flac',
    source_format='mp3',
    target_format='flac',
    max_iterations=300,
    threshold_value=0.6,
    target_bitrate_kbps=1400,
    toggle_normalize=True,
    toggle_autoscale=True,
    toggle_adaptive_filter=True
)
```

Function Parameters

input_file_path (str): Path to the input audio file. Mandatory.
output_file_path (str): Path to the output processed audio file. Mandatory.
source_format (str): Format of the input audio file (e.g., 'mp3', 'wav', 'ogg', 'flac').
target_format (str): Format of the output audio file (e.g., 'flac', 'wav'). Default is 'flac'.
max_iterations (int): Maximum number of iterations for IST. Default is 800.
threshold_value (float): Threshold value for IST. Default is 0.6.
target_bitrate_kbps (int): Target bitrate in kbps. Default is 1411.
toggle_normalize (bool): Whether to normalize the audio. Default True.
toggle_autoscale (bool): Whether to autoscale the audio based on the original audio. Default True.
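When processing several files, the keyword arguments can be assembled from the input path. The following is a minimal sketch under stated assumptions: `build_upscale_kwargs` is a hypothetical helper that is not part of fat_llama, the `_upscaled` output naming is invented for illustration, and the accepted formats are simply the examples listed in the parameter descriptions above.

```python
from pathlib import Path

# Formats taken from the examples in the parameter list; assumed, not exhaustive.
SUPPORTED_SOURCES = {"mp3", "wav", "ogg", "flac"}
SUPPORTED_TARGETS = {"flac", "wav"}


def build_upscale_kwargs(input_path, target_format="flac", **overrides):
    """Build keyword arguments for upscale() from an input path (hypothetical helper)."""
    src = Path(input_path)
    source_format = src.suffix.lower().lstrip(".")
    if source_format not in SUPPORTED_SOURCES:
        raise ValueError(f"unsupported source format: {source_format!r}")
    if target_format not in SUPPORTED_TARGETS:
        raise ValueError(f"unsupported target format: {target_format!r}")
    kwargs = {
        "input_file_path": str(src),
        # Output sits next to the input, with an illustrative "_upscaled" suffix.
        "output_file_path": str(src.with_name(f"{src.stem}_upscaled.{target_format}")),
        "source_format": source_format,
        "target_format": target_format,
    }
    # Caller-supplied values such as max_iterations or threshold_value win.
    kwargs.update(overrides)
    return kwargs
```

The resulting dictionary could then be passed on as `upscale(**build_upscale_kwargs("song.mp3", max_iterations=300))`.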

Running the Example

To run the example, execute the following command:

python example.py

This will upscale the MP3 file specified in the example and produce a FLAC file with full processing.

Spectrogram Results

How it Works

Algorithm Explanation

The upscaling process involves several steps:

Reading Audio File: The audio file is read, and the audio samples are extracted along with the sample rate and bitrate.
Calculating Upscale Factor: The upscale factor is calculated to achieve the target bitrate.
Upscaling Channels: The audio channels are upscaled using an interpolation algorithm. Each sample is repeated multiple times to increase the resolution.
Iterative Soft Thresholding (IST): IST is applied to enhance the audio by adding missing frequencies. Each iteration uses FFT to transform the signal into the frequency domain, applies a threshold to keep significant frequencies, and then inverse transforms back to the time domain.
Scaling Amplitude: The amplitude of the upscaled audio is scaled to match the original.
Normalizing Audio: The audio is normalized to the range -1 to 1.
Writing FLAC File: The processed audio is written to a FLAC file.
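The time-domain steps in this list can be sketched with NumPy as a CPU stand-in for CuPy. This is an illustration under stated assumptions, not the package's code: all four function names are hypothetical, and computing the upscale factor as a rounded bitrate ratio is a guess at what "calculated to achieve the target bitrate" means.

```python
import numpy as np


def upscale_factor(source_bitrate_kbps, target_bitrate_kbps):
    """Number of times each sample is repeated (assumed: rounded bitrate ratio, at least 1)."""
    return max(1, round(target_bitrate_kbps / source_bitrate_kbps))


def repeat_upsample(samples, factor):
    """Repeat every sample `factor` times, as described in the channel upscaling step."""
    return np.repeat(samples, factor)


def match_amplitude(upscaled, original):
    """Rescale so the peak of the upscaled signal equals the peak of the original."""
    peak = np.max(np.abs(upscaled))
    if peak == 0:
        return upscaled
    return upscaled * (np.max(np.abs(original)) / peak)


def normalize(samples):
    """Scale the signal into the range -1 to 1."""
    peak = np.max(np.abs(samples))
    return samples if peak == 0 else samples / peak


if __name__ == "__main__":
    original = np.array([0.5, -0.25, 0.125])
    factor = upscale_factor(320, 1411)
    print(factor, normalize(repeat_upsample(original, factor)))
```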

Why FFT and IST?

FFT (Fast Fourier Transform) is used to transform the audio signal into the frequency domain. This allows for the identification and manipulation of specific frequency components. By applying a threshold in the frequency domain, we can keep the significant frequencies, discard noise, and add the retained components back into the upscaled data to add detail.
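A single thresholding pass can be illustrated as follows. This is a minimal NumPy sketch, assuming the threshold is expressed as a fraction of the largest spectral magnitude; the function name `keep_significant_frequencies` is hypothetical and the package may define its threshold differently.

```python
import numpy as np


def keep_significant_frequencies(signal, threshold):
    """Zero every frequency bin whose magnitude is below threshold * max magnitude."""
    spectrum = np.fft.rfft(signal)
    magnitude = np.abs(spectrum)
    mask = magnitude >= threshold * magnitude.max()
    # Keep the significant bins unchanged and transform back to the time domain.
    cleaned = np.fft.irfft(np.where(mask, spectrum, 0), n=len(signal))
    return cleaned, mask


if __name__ == "__main__":
    n = 1024
    t = np.arange(n) / n
    tone = np.sin(2 * np.pi * 50 * t)
    noisy = tone + 0.05 * np.random.default_rng(0).standard_normal(n)
    cleaned, mask = keep_significant_frequencies(noisy, threshold=0.6)
    print("bins kept:", int(mask.sum()))
    print("max error vs. clean tone:", float(np.max(np.abs(cleaned - tone))))
```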

The report titled "Fast Sparse Fourier Transformations for NMR Spectroscopy" by Badruddin Kamal, supervised by Thomas Huber and Alastair Rendall, 2015, provides a comprehensive understanding of sparse representations and their applications in signal processing. IST leverages the concepts from this report to add missing frequencies and enhance the audio quality by making it more detailed and rich. This is particularly useful in upscaling audio where some frequencies might be missing or congested.
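For readers unfamiliar with IST, the sketch below shows the general idea as it is commonly used for sparse reconstruction: alternate between soft thresholding in the frequency domain and restoring the samples that are actually known. It is not taken from fat_llama or from the cited report. The function names, the `known` mask, and the geometric `decay` of the cutoff are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(spectrum, cutoff):
    """Shrink each complex coefficient's magnitude by `cutoff`, keeping its phase."""
    magnitude = np.abs(spectrum)
    scale = np.maximum(magnitude - cutoff, 0.0) / np.maximum(magnitude, 1e-12)
    return spectrum * scale


def ist_reconstruct(observed, known, max_iterations=100, threshold_value=0.6, decay=0.9):
    """Fill in samples where `known` is False, assuming the spectrum is sparse."""
    estimate = np.where(known, observed, 0.0)
    # Start the cutoff at a fraction of the largest initial magnitude (assumed rule).
    cutoff = threshold_value * np.abs(np.fft.fft(estimate)).max()
    for _ in range(max_iterations):
        spectrum = soft_threshold(np.fft.fft(estimate), cutoff)
        candidate = np.fft.ifft(spectrum).real
        # Data consistency: keep measured samples, use the sparse model elsewhere.
        estimate = np.where(known, observed, candidate)
        cutoff *= decay
    return estimate


if __name__ == "__main__":
    n = 1024
    tone = np.sin(2 * np.pi * 50 * np.arange(n) / n)
    known = np.random.default_rng(0).random(n) < 0.5
    observed = np.where(known, tone, 0.0)
    restored = ist_reconstruct(observed, known)
    rms = np.sqrt(np.mean((restored[~known] - tone[~known]) ** 2))
    print("RMS error on missing samples:", float(rms))
```

Lowering the cutoff over the iterations is one common design choice: a fixed cutoff keeps shrinking the retained peaks, so the filled-in samples would settle below their true amplitude.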

Test Audio Source

ericzo - beyond (https://soundcloud.com/ericzomusic/free-electro-trap-anthem-beyond)

Changelog

All notable changes to this project will be documented in this file.

[1.1.0] - 2024-08-01

Changed

Moved adaptive filtering to after normalization and auto-scaling steps.
Reduced step size for LMS adaptive filter for improved stability.
Ensured all processing uses CuPy for GPU acceleration.
Added detailed comments and logging for better traceability.

[1.0.2] - 2024-07-26

Changed

Removed logging from requirements to fix pip bug.

[1.0.1] - 2024-07-26

Changed

Updated analytics.py analysis and spectrogram results.
Updated README.md details.

[1.0.0] - 2024-07-25

Added

Added support for reading 'ogg', 'flac', and 'wav' file formats and calculating their bitrates correctly.

Changed

Renamed upscale_mp3_to_flac method to upscale to support multiple source formats.
Simplified the workflow to focus on 'mp3' to 'flac' conversion with essential steps only.

Removed

Dropped support for 'ape' and 'alac' target formats.

[0.1.8] - 2024-07-24

Added

Introduced toggle flags for normalization, equalization, amplitude scaling, and gain reduction.
Enhanced auto-scaling of amplitude based on the original MP3 file when toggle_scale_amplitude is False.
Logging for each step of the processing to provide better traceability and debugging.

Changed

Default values for parameters are now set at the function call.
Refined the upscaling algorithm to ensure better handling of amplitude and gain.
Renamed the flags for consistency (toggle_wiener_filter, toggle_normalize, toggle_equalize, toggle_scale_amplitude, toggle_gain_reduction).

Fixed

Fixed issues related to numpy and cupy array conversions.
Improved error handling for invalid target bitrate values.
Addressed the issue where the amplitude of the produced signal was significantly weaker than the original.

[0.1.7] - 2024-07-22

Added

Added methods for MP3 to FLAC conversion with optional processing using CuPy for GPU acceleration.
Initial version of upscale_mp3_to_flac method with parameters for iterative soft thresholding (IST), gain reduction, and equalization.

[0.1.0] to [0.1.6] - 2024-07-20

Added

Basic functionality for reading MP3 files and writing FLAC files.
Initial implementation of the new interpolation algorithm and IST for audio processing.

Owner

Name: Badruddin Kamal
Login: bkraad47
Kind: user
Location: Sydney, Australia
Company: @pathao-eng, @bkash, @dugdugi, @FusionProfessionalsAU

Website: http://raadtech.blogspot.com.au/
Repositories: 1
Profile: https://github.com/bkraad47

>_<

GitHub Events

Total

Watch event: 12
Fork event: 3

Last Year

Watch event: 12
Fork event: 3

Issues and Pull Requests

Last synced: 8 months ago

All Time

Total issues: 1
Total pull requests: 18
Average time to close issues: N/A
Average time to close pull requests: 4 minutes
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 11.0
Average comments per pull request: 0.0
Merged pull requests: 18
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 18
Average time to close issues: N/A
Average time to close pull requests: 4 minutes
Issue authors: 1
Pull request authors: 1
Average comments per issue: 11.0
Average comments per pull request: 0.0
Merged pull requests: 18
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

MarcoRavich (1)

Pull Request Authors

bkraad47 (36)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads: 175 last month (pypi)

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 16
Total maintainers: 1

pypi.org: fat-llama

Homepage: https://github.com/bkraad47/fat_llama
Documentation: https://fat-llama.readthedocs.io/
License: BSD-3-Clause
Latest release: 1.1.0 (published over 1 year ago)

Versions: 16
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 175 Last month

Rankings

Dependent packages count: 10.6%

Average: 35.1%

Dependent repos count: 59.7%

Maintainers (1)

bulkguy47

Last synced: 6 months ago

https://github.com/bkraad47/fat_llama

Science Score: 13.0%
