librosax.feature.spectral_contrast

spectral_contrast(*, y: Array | None = None, sr: float = 22050, S: Array | None = None, n_fft: int = 2048, hop_length: int = 512, win_length: int | None = None, window: str = 'hann', center: bool = True, pad_mode: str = 'constant', freq: Array | None = None, fmin: float = 200.0, n_bands: int = 6, quantile: float = 0.02, linear: bool = False) → Array

Compute spectral contrast.

Each frame of a spectrogram S is divided into sub-bands. For each sub-band, the energy contrast is estimated by comparing the mean energy in the top quantile (peak energy) to that of the bottom quantile (valley energy). High contrast values generally correspond to clear, narrow-band signals, while low contrast values correspond to broad-band noise.
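The peak/valley comparison described above can be sketched for a single sub-band of a single frame. This is a minimal pure-Python illustration, not the library's implementation: the helper name `band_contrast`, the rounding rule for the tail size, and the use of the natural logarithm are all assumptions made for the example.

```python
import math

def band_contrast(band, quantile=0.02, linear=False):
    """Contrast of one sub-band within one frame (illustrative helper only).

    `band` is a sequence of strictly positive magnitudes.
    """
    ordered = sorted(band)
    # Number of bins in each tail; keep at least one so the means are defined.
    k = max(1, round(quantile * len(ordered)))
    valley = sum(ordered[:k]) / k   # mean of the weakest bins
    peak = sum(ordered[-k:]) / k    # mean of the strongest bins
    if linear:
        return peak - valley
    # Natural log is an assumption; the library may use another base or a dB scale.
    return math.log(peak) - math.log(valley)

tonal = [0.1, 0.1, 5.0, 0.1, 0.1, 0.1]     # one narrow-band peak
noisy = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]   # flat, noise-like band

# The tonal band yields the larger contrast in both modes.
print(band_contrast(tonal) > band_contrast(noisy))
print(band_contrast(tonal, linear=True) > band_contrast(noisy, linear=True))
```

In the logarithmic mode the result depends only on the ratio of peak to valley, so it is unaffected by overall gain; the linear mode scales with the signal level.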

Parameters:

y –
Audio time series. The last axis must be time.
- (T,) - single waveform
- (B, T) - batch of waveforms
sr – Audio sampling rate
S – (optional) Pre-computed spectrogram magnitude with shape (..., F, N)
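n_fft – FFT window size, in samples
hop_length – Hop length for the STFT, in samples
win_length – Window length, in samples
window – Name of the window function (for example 'hann')
center – If True, pad the signal at both ends so that each frame is centered on its time position
pad_mode – Padding mode used when center is True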
freq – Center frequencies for spectrogram bins. If None, FFT bin center frequencies are used.
fmin – Frequency cutoff for the first band, [0, fmin]. Subsequent bands cover [fmin, 2*fmin], [2*fmin, 4*fmin], etc.
n_bands – Number of frequency bands
quantile – Quantile for determining peaks and valleys
linear – If True, return the linear difference of magnitudes: peaks - valleys. If False, return the logarithmic difference: log(peaks) - log(valleys).
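As an illustration of how fmin and n_bands interact, the nominal band edges can be enumerated directly from the description of fmin. The helper `octave_band_edges` below is hypothetical and only reproduces the documented layout; how the library assigns individual FFT bins to each band is not shown.

```python
def octave_band_edges(fmin=200.0, n_bands=6):
    """Nominal (low, high) edges in Hz: [0, fmin], then one octave per band.

    Hypothetical helper for illustration; not part of librosax.
    """
    edges = [0.0] + [fmin * 2 ** i for i in range(n_bands + 1)]
    # Pair consecutive edges, giving n_bands + 1 bands in total.
    return list(zip(edges[:-1], edges[1:]))

bands = octave_band_edges(fmin=200.0, n_bands=6)
for low, high in bands:
    print(f"{low:8.1f} Hz  to {high:8.1f} Hz")
```

With the defaults this gives n_bands + 1 = 7 bands, which matches the size of the band axis in the return shape. Note that the nominal upper edge of the last band (12800 Hz here) can exceed the Nyquist frequency sr / 2, in which case the band can only contain bins up to sr / 2.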

Returns:

Spectral contrast with shape (..., n_bands + 1, N).

- (T,) → (n_bands + 1, N)
- (B, T) → (B, n_bands + 1, N)
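The shape mapping can be worked through for concrete inputs. The sketch below assumes that the number of frames N follows the common STFT convention (1 + T // hop_length when center is True); that convention is an assumption here and is not stated on this page. The helper name `contrast_output_shape` is hypothetical.

```python
def contrast_output_shape(input_shape, n_bands=6, n_fft=2048,
                          hop_length=512, center=True):
    """Expected output shape for a waveform of shape (T,) or (B, T).

    Hypothetical helper; the frame-count formula is an assumed STFT convention.
    """
    *batch, n_samples = input_shape
    if center:
        n_frames = 1 + n_samples // hop_length
    else:
        n_frames = 1 + (n_samples - n_fft) // hop_length
    return (*batch, n_bands + 1, n_frames)

# One second of audio at the default sr of 22050 Hz.
print(contrast_output_shape((22050,)))      # single waveform
print(contrast_output_shape((4, 22050)))    # batch of four waveforms
```

Under these assumptions, leading batch dimensions pass through unchanged and only the time axis is replaced by the (band, frame) pair.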