librosax.feature.cqt

cqt(y: Array, *, sr: float = 22050, hop_length: int = 512, fmin: float | None = None, n_bins: int = 84, bins_per_octave: int = 12, tuning: float | None = 0.0, filter_scale: float = 1.0, norm: float | None = 1.0, sparsity: float = 0.01, window: str = 'hann', scale: bool = True, pad_mode: str = 'constant', res_type: str | None = None, dtype: dtype = <class 'jax.numpy.complex64'>, n_fft: int | None = None, use_1992_version: bool = True, output_format: str = 'complex', normalization_type: str = 'librosa') → Array

Compute the constant-Q transform following nnAudio’s CQT1992v2 implementation.

The CQT1992v2 algorithm computes the CQT efficiently by convolving the time-domain signal with CQT kernels.
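To make the idea of convolving the signal with CQT kernels concrete, here is a deliberately slow NumPy sketch. It is not the librosax or nnAudio code: the function name toy_cqt, the kernel length ceil(Q * sr / f_k), the zero padding, the division by the kernel length, and the frame count are all assumptions chosen for illustration.

```python
import numpy as np

def toy_cqt(y, sr=22050, hop_length=512, fmin=32.70, n_bins=84,
            bins_per_octave=12, filter_scale=1.0):
    """Didactic time-domain CQT: one windowed complex sinusoid per bin."""
    q = filter_scale / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    # Assumption: each kernel spans roughly Q periods of its center frequency.
    lengths = np.ceil(q * sr / freqs).astype(int)

    pad = int(lengths.max()) // 2 + 1
    padded = np.pad(y, pad)                # zero padding so frames are centered
    n_frames = len(y) // hop_length + 1    # assumed frame count for this sketch
    out = np.zeros((n_bins, n_frames), dtype=np.complex128)

    for k in range(n_bins):
        length = int(lengths[k])
        n = np.arange(length) - length // 2
        # Hann-windowed complex exponential at the bin's center frequency.
        kernel = np.hanning(length) * np.exp(2j * np.pi * freqs[k] * n / sr) / length
        for t in range(n_frames):
            start = t * hop_length + pad - length // 2
            # Correlate the signal segment with the kernel (strided convolution).
            out[k, t] = np.dot(padded[start:start + length], np.conj(kernel))
    return out

# A 440 Hz tone should peak in the bin whose center frequency is closest to A4.
sr = 22050
t = np.arange(sr) / sr
tone = np.cos(2 * np.pi * 440.0 * t)
C = toy_cqt(tone, sr=sr)
peak_bin = int(np.argmax(np.abs(C[:, C.shape[1] // 2])))
```

A production implementation would typically batch all kernels into a single strided convolution rather than looping in Python; the loop here only exposes the arithmetic.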

Note

For JAX JIT compilation, mark every argument except y as static (for example via the static_argnames option of jax.jit): sr, hop_length, fmin, n_bins, bins_per_octave, tuning, filter_scale, norm, sparsity, window, scale, pad_mode, res_type, dtype, n_fft, use_1992_version, output_format, normalization_type.
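The static-argument pattern from the note can be sketched with jax.jit and static_argnames. To stay self-contained, the example jits a hypothetical stand-in function, frame_feature, rather than cqt itself; only y is traced, while the integer and string options are static because they determine shapes and Python-level branching.

```python
from functools import partial

import jax
import jax.numpy as jnp

# frame_feature is a hypothetical stand-in, not part of librosax.
@partial(jax.jit, static_argnames=("hop_length", "output_format"))
def frame_feature(y, *, hop_length=512, output_format="magnitude"):
    # hop_length is static, so n_frames is a plain Python int at trace time.
    n_frames = y.shape[-1] // hop_length
    frames = y[..., : n_frames * hop_length].reshape(
        *y.shape[:-1], n_frames, hop_length
    )
    energy = jnp.sum(frames ** 2, axis=-1)
    # output_format is a string, which JAX cannot trace, so it must be static.
    if output_format == "magnitude":
        return jnp.sqrt(energy)
    return energy

batch = jnp.ones((2, 2048))                 # (B, T) batch of waveforms
out = frame_feature(batch, hop_length=512)  # shape (2, 4)
```

Applied to cqt, the same wrapper would presumably list every keyword named in the note in static_argnames. Changing any static value triggers a recompilation.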

Parameters:

y –
Audio time series. The last axis must be time.
- (T,) - single waveform
- (B, T) - batch of waveforms
sr – Sampling rate
hop_length – Number of samples between successive CQT columns
fmin – Minimum frequency (default: C1 = 32.70 Hz)
n_bins – Number of frequency bins
bins_per_octave – Number of bins per octave
tuning – Tuning offset in fractions of a bin
filter_scale – Filter scale factor (Q = filter_scale / (2^(1/bins_per_octave) - 1))
norm – Normalization type for basis functions (1, 2, or None)
sparsity – Sparsification factor (not implemented)
window – Window function
scale – If True, scale by sqrt(filter_lengths) following librosa normalization
pad_mode – Padding mode
res_type – Resampling type (not used in 1992 version)
dtype – Complex data type
n_fft – FFT size (if None, calculated automatically)
use_1992_version – If True, use CQT1992v2 algorithm (recommended)
output_format – Output format ('complex', 'magnitude', 'phase')
normalization_type – Normalization type ('librosa', 'convolutional', 'wrap')
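As a quick check of how fmin, n_bins, bins_per_octave, tuning, and filter_scale interact, the sketch below evaluates the Q formula given above and a geometric frequency grid. The helper name cqt_grid is hypothetical, and two details are assumptions rather than documented behavior: the tuning offset is applied to fmin following librosa's convention, and the top bin is required to lie below the Nyquist frequency.

```python
import math

def cqt_grid(sr=22050, fmin=32.70, n_bins=84, bins_per_octave=12,
             tuning=0.0, filter_scale=1.0):
    """Return (Q, center frequencies) for a geometric CQT grid."""
    # Q as stated in the filter_scale parameter description.
    q = filter_scale / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    # Assumption: tuning shifts the whole grid by a fraction of a bin.
    base = fmin * 2.0 ** (tuning / bins_per_octave)
    freqs = [base * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]
    # Assumption: bins at or above Nyquist cannot be represented.
    if freqs[-1] >= sr / 2:
        raise ValueError("highest bin is at or above the Nyquist frequency")
    return q, freqs

q, freqs = cqt_grid()
top_frequency = freqs[-1]          # just under 4 kHz with the defaults
octave_ratio = freqs[12] / freqs[0]  # 12 bins per octave doubles the frequency
assert math.isclose(octave_ratio, 2.0)
```

With the default arguments, 84 bins at 12 bins per octave span seven octaves above C1, which stays well below the 11025 Hz Nyquist limit for sr = 22050.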

Returns:

CQT matrix with shape (..., n_bins, N), where N is the number of CQT columns (time frames). The value format depends on output_format.

- (T,) → (n_bins, N)
- (B, T) → (B, n_bins, N)