https://github.com/richiehakim/vqt
Variable Q-Transform. Torch backend, runs on GPU, simple implementation.
Repository
Statistics
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 1
- Releases: 4
Metadata Files
README.md
VQT: Variable Q-Transform
Contributions are welcome! Feel free to open an issue or a pull request.
Variable Q-Transform
This is a novel Python implementation of the variable Q-transform, developed
to meet the need for a more accurate and flexible VQT in research. It is
battle-tested and has been used in a number of research projects.
- Accuracy: This implementation computes the spectrogram directly, via a
Hilbert transform at each desired frequency. This results in an exact
computation of the spectrogram and is appropriate for research applications
where accuracy is critical. The implementations in librosa and nnAudio use
recursive downsampling, which can introduce artifacts in the spectrogram
under certain conditions.
- Flexibility: The parameters and codebase are less complex than in other
libraries, and the filter bank is fully customizable and exposed to the user.
Built-in plotting of the filter bank makes tuning the parameters easy and
intuitive. The main class is a PyTorch Module and the gradient function is
maintained, so backpropagation is possible.
- Speed: The backend is written in PyTorch and allows for GPU acceleration.
It is faster than the librosa implementation in most cases. It is typically
somewhat slower (1x-8x) than the nnAudio implementation, although under some
conditions (low hop_length) it is as fast or faster. See the section
'What to improve on?' below for ideas on how to speed it up further.
Installation
Using pip:
pip install vqt
From source:
git clone https://github.com/RichieHakim/vqt.git
cd vqt
pip install -e .
Requirements: torch, numpy, scipy, matplotlib, tqdm
These will be installed automatically if you install from PyPI.
Usage

```python
import torch
import vqt

## X is your own data: an array of shape (n_channels, n_samples)
signal = torch.as_tensor(X)  ## torch Tensor of shape (n_channels, n_samples)

my_vqt = vqt.VQT(
    Fs_sample=1000,  ## In Hz
    Q_lowF=3,  ## In periods per octave
    Q_highF=20,  ## In periods per octave
    F_min=10,  ## In Hz
    F_max=400,  ## In Hz
    n_freq_bins=55,  ## Number of frequency bins
    window_type='hann',
    downsample_factor=8,  ## Reduce the output sample rate
    fft_conv=True,  ## Use FFT convolution for speed
    plot_pref=False,  ## Can show the filter bank
)

spectrograms = my_vqt(signal)
x_axis = my_vqt.get_xAxis(n_samples=signal.shape[1])
frequencies = my_vqt.get_freqs()
```

What is the Variable Q-Transform?
The Variable Q-Transform (VQT) is a time-frequency analysis tool that
generates spectrograms, similar to the Short-time Fourier Transform (STFT). It
can also be defined as a special case of a wavelet transform (complex Morlet),
as well as a generalization of the Constant Q-Transform (CQT). In fact, the VQT
subsumes the CQT and the STFT, since both can be recreated using specific
parameters of the VQT.
In brief, the VQT generates a spectrogram where the frequencies are spaced
logarithmically, and the bandwidth of the filters is tuned using two
parameters: Q_low and Q_high. Here Q describes the number of periods of the
oscillatory wavelet at a particular frequency (aka the 'bandwidth'); 'low'
refers to the lowest frequency bin, and 'high' refers to the highest frequency
bin.
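To make the two ideas above concrete (log-spaced frequencies and a Q that varies from the lowest to the highest bin), here is a minimal sketch in plain Python. The function names are hypothetical and not part of the `vqt` API, and the geometric interpolation of Q between its two end values is an assumption made for illustration; the library may interpolate differently.

```python
def log_spaced_freqs(f_min, f_max, n_bins):
    """Center frequencies spaced evenly on a log axis, from f_min to f_max."""
    step = (f_max / f_min) ** (1.0 / (n_bins - 1))
    return [f_min * step ** k for k in range(n_bins)]


def per_bin_q(q_low, q_high, n_bins):
    """Q for each bin, moving geometrically from q_low to q_high.

    ASSUMPTION: geometric interpolation is used for illustration only.
    """
    step = (q_high / q_low) ** (1.0 / (n_bins - 1))
    return [q_low * step ** k for k in range(n_bins)]


freqs = log_spaced_freqs(f_min=10, f_max=400, n_bins=55)
qs = per_bin_q(q_low=3, q_high=20, n_bins=55)

# Every neighboring pair of bins has the same frequency ratio.
print(round(freqs[1] / freqs[0], 4), round(freqs[-1] / freqs[-2], 4))
# Number of wavelet periods at the lowest and highest bins.
print(round(qs[0], 2), round(qs[-1], 2))
```

With `q_low == q_high`, every bin gets the same Q, which corresponds to the constant-Q case described above.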
Why use the VQT?
It provides enough knobs to tune the time-frequency resolution trade-off to suit your needs. It is especially useful when time resolution is needed at lower frequencies.
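As a rough worked example of that trade-off: a wavelet spanning Q periods of a frequency f lasts about Q / f seconds, and its bandwidth is on the order of f / Q (the exact constant depends on the window, so treat the second number as an approximation). The helper names below are hypothetical and only illustrate the arithmetic.

```python
def wavelet_duration_s(freq_hz, q):
    """Duration of a wavelet spanning q periods of freq_hz, in seconds."""
    return q / freq_hz


def approx_bandwidth_hz(freq_hz, q):
    """Order-of-magnitude bandwidth f / Q; the true value depends on the window."""
    return freq_hz / q


# Compare a constant Q of 20 with a lowered Q of 3 at a 10 Hz bin.
for label, q in [("constant Q = 20", 20), ("Q_low = 3", 3)]:
    duration = wavelet_duration_s(10, q)
    bandwidth = approx_bandwidth_hz(10, q)
    print(f"{label}: ~{duration:.1f} s long, ~{bandwidth:.2f} Hz wide")
```

Lowering Q at the low end shortens the 10 Hz wavelet from about 2 s to about 0.3 s, at the cost of a wider band.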
How exactly does this implementation differ from others?

This function works differently from the VQT in librosa or nnAudio in that
it does not use the recursive downsampling algorithm from this paper.
Instead, it computes the power at each frequency using either direct or
FFT convolution with a filter bank of complex oscillations, followed by a
Hilbert transform. This results in a more accurate computation of the same
spectrogram without any artifacts. The direct computation approach also results
in code that is more flexible and easier to understand, with fewer constraints
on the input parameters than librosa and nnAudio.
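The direct approach can be sketched in a few lines of dependency-free Python. This is an illustration of the general technique, not the library's code: the function names are hypothetical, the Hann window and the normalization are assumptions, and taking the magnitude of the complex output stands in for the envelope step.

```python
import cmath
import math


def complex_filter(freq_hz, q, fs):
    """Hann-windowed complex exponential spanning about q periods of freq_hz."""
    n = max(3, int(round(q * fs / freq_hz)) | 1)  # force an odd length
    half = n // 2
    taps = []
    for i in range(n):
        window = 0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1))
        phase = 2 * math.pi * freq_hz * (i - half) / fs
        taps.append(window * cmath.exp(1j * phase))
    norm = sum(abs(t) for t in taps)  # ASSUMPTION: unit-sum window normalization
    return [t / norm for t in taps]


def band_magnitude(signal, taps):
    """Zero-padded 'same' direct convolution, then magnitude of the complex result."""
    half = len(taps) // 2
    out = []
    for t in range(len(signal)):
        acc = 0j
        for k, tap in enumerate(taps):
            idx = t + half - k  # convolution index into the signal
            if 0 <= idx < len(signal):
                acc += signal[idx] * tap
        out.append(abs(acc))
    return out


fs = 1000
signal = [math.sin(2 * math.pi * 50 * t / fs) for t in range(400)]
mag_50 = band_magnitude(signal, complex_filter(50, q=5, fs=fs))
mag_200 = band_magnitude(signal, complex_filter(200, q=5, fs=fs))

# The 50 Hz filter responds strongly to the 50 Hz tone; the 200 Hz filter barely does.
print(round(mag_50[200], 3), round(mag_200[200], 3))
```

Each row of a spectrogram built this way is one such `band_magnitude` output, one per filter in the bank.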
What to improve on?
Contributions are welcome! Feel free to open an issue or a pull request.
Features:
- Inverse VQT: https://github.com/librosa/librosa/issues/1161#issuecomment-981771860
Speed / Memory usage:
- Lossless approaches:
  - For the `conv1d` approach: I think it would be much faster if we cropped the filters to remove the blank space from the higher frequency filters. This would be pretty easy to implement and could give a >10x speedup.
- Lossy approaches:
  - For the `fft_conv` approach: I believe a large (5-50x) speedup is possible. The lower frequency filters use only a small portion of the spectrum, therefore most of the compute is spent multiplying zeros.
    - Idea 1: Separate out filters in the filter bank whose spectra are all zeros above `n_samples_downsampled`, crop the spectra above that level, then use `ifft` with `n=n_samples_downsampled` to downsample the filter. This would allow for a much faster convolution. For filters that can't be cropped, downsampling would have to be done after the iFFT.
    - Idea 2: Use an efficient sparse or non-uniform FFT, so that only the non-zero frequencies are computed in the `fft`, product, and `ifft`. A PyTorch implementation of the NUFFT exists.
    - Idea 3: Similar to the above, a log-frequency iFFT could be used to allow for only the non-zero segment of the filter's spectrum to be used in the convolution.
    - Idea 4: Try using the overlap-add method.
  - Recursive downsampling: Under many circumstances (like when `Q_high` is not much greater than `Q_low`), recursive downsampling is fine. Implementing it would be nice just for completeness (from this paper).
  - For the `conv1d` approach: Use a strided convolution.
  - For the `fft_conv` approach: Downsample using `n=n_samples_downsampled` in the `ifft` function.
  - Non-trivial ideas that theoretically could speed things up:
    - An FFT implementation that allows for a reduced set of frequencies to be computed.
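Two of the ideas above can be sketched without any dependencies: downsampling by cropping a spectrum before a shorter inverse transform, and evaluating a sliding filter only at every hop. This is a toy illustration, not the library's code. All names are hypothetical, and a naive O(N^2) DFT stands in for a real FFT so the example stays self-contained. The cropping step assumes the spectrum is zero at and above the crop point, which is also the condition Idea 1 places on the filters.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(N^2) DFT; the inverse applies the usual 1/N factor."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def downsample_via_cropped_spectrum(x, factor):
    """Downsample a signal whose spectrum is zero at and above bin len(x) // factor."""
    n = len(x)
    n_ds = n // factor
    cropped = dft(x)[:n_ds]            # keep only the low-frequency bins
    y = dft(cropped, inverse=True)     # shorter inverse transform
    return [v * n_ds / n for v in y]   # compensate for the changed 1/N factor


def strided_correlate(signal, taps, stride=1):
    """Sliding dot product evaluated only at every `stride`-th position."""
    n_out = (len(signal) - len(taps)) // stride + 1
    return [
        sum(signal[i * stride + k] * taps[k] for k in range(len(taps)))
        for i in range(n_out)
    ]


# Complex tone occupying a single low-frequency bin (bin 3 of 64).
n, k0 = 64, 3
tone = [cmath.exp(2j * cmath.pi * k0 * t / n) for t in range(n)]
tone_ds = downsample_via_cropped_spectrum(tone, factor=4)

# Strided evaluation yields the same values as filtering fully, then hopping.
ramp = list(range(20))
taps = [1, 2, 1]
full = strided_correlate(ramp, taps)
hopped = strided_correlate(ramp, taps, stride=4)
print(len(tone_ds), hopped == full[::4])
```

In both cases the saving comes from never computing output samples that would be discarded by the downsampling step.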
Flexibility:
- `librosa` parameter mode: It would be nice to have a mode that allows for the same parameters as `librosa` to be used.
Demo:

```python
import vqt
import numpy as np
import torch
import matplotlib.pyplot as plt
import scipy.datasets

## Load the first 10000 samples of the example ECG recording (sampled at 360 Hz)
data_ecg = torch.as_tensor(scipy.datasets.electrocardiogram()[:10000])
sample_rate = 360

my_vqt = vqt.VQT(
    Fs_sample=sample_rate,
    Q_lowF=2,
    Q_highF=8,
    F_min=1,
    F_max=120,
    n_freq_bins=150,
    win_size=1501,
    window_type='gaussian',
    downsample_factor=8,
    padding='same',
    fft_conv=True,
    take_abs=True,
    plot_pref=False,
)

specs = my_vqt(data_ecg)
xaxis = my_vqt.get_xAxis(n_samples=data_ecg.shape[0])
freqs = my_vqt.get_freqs()

## Top panel: raw ECG trace. Bottom panel: frequency-weighted spectrogram.
fig, axs = plt.subplots(nrows=2, ncols=1, sharex=True)
axs[0].plot(np.arange(data_ecg.shape[0]) / sample_rate, data_ecg)
axs[0].title.set_text('Electrocardiogram')
axs[1].pcolor(
    xaxis / sample_rate,
    np.arange(specs[0].shape[0]),
    specs[0] * (freqs)[:, None],
    vmin=0,
    vmax=30,
    cmap='hot',
)
axs[1].set_yticks(np.arange(specs.numpy()[0].shape[0])[::10], np.round(freqs.numpy()[::10], 1))
axs[1].set_xlim([13, 22])
axs[0].set_ylabel('mV')
axs[1].set_ylabel('frequency (Hz)')
axs[1].set_xlabel('time (s)')
plt.show()
```
Owner
- Name: Richard Hakim
- Login: RichieHakim
- Kind: user
- Location: Boston
- Company: Sabatini Lab, Harvard Medical School
- Website: richhakim.com
- Twitter: RichieHakim
- Repositories: 21
- Profile: https://github.com/RichieHakim
Currently a PhD candidate in Bernardo Sabatini's lab at Harvard, interested in brain-machine interfaces and computation in the brain.
GitHub Events
Total
- Watch event: 3
- Issue comment event: 4
- Push event: 4
- Pull request event: 1
- Fork event: 1
Last Year
- Watch event: 3
- Issue comment event: 4
- Push event: 4
- Pull request event: 1
- Fork event: 1
Packages
- Total packages: 1
- Total downloads: pypi 23 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
pypi.org: vqt
Variable Q-Transform with PyTorch backend
- Homepage: https://github.com/RichieHakim/vqt
- Documentation: https://vqt.readthedocs.io/
- License: LICENSE
- Latest release: 0.1.3 (published about 2 years ago)
Dependencies
- actions/checkout v3 composite
- actions/setup-python v3 composite
- codecov/codecov-action v3 composite
- conda-incubator/setup-miniconda v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v4 composite
- actions/download-artifact v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- pypa/gh-action-pypi-publish release/v1 composite
- sigstore/gh-action-sigstore-python v1.2.3 composite
- hypothesis *
- matplotlib *
- numpy *
- pytest *
- scipy *
- torch *
- tqdm *