librosax.feature.melspectrogram
- melspectrogram(*, y: jax.Array | None = None, sr: float = 22050, S: jax.Array | None = None, n_fft: int = 2048, hop_length: int = 512, win_length: int | None = None, window: str = 'hann', center: bool = True, pad_mode: str = 'constant', power: float = 2.0, n_mels: int = 128, fmin: float = 0.0, fmax: float | None = None, htk: bool = False, norm: str | float | None = 'slaney', dtype: numpy.dtype = jax.numpy.float32) -> Array
Compute a mel-scaled spectrogram.
If a time-series input y is provided, its magnitude spectrogram S is first computed, and then mapped onto the mel scale by mel_f.dot(S**power).
By default, power=2 operates on a power spectrum.
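The mapping step can be illustrated with a toy example. This is only a sketch: the `mel_f` matrix below is a hand-written stand-in that averages groups of frequency bins, not a real mel filterbank, and the sizes are chosen for readability.

```python
import jax.numpy as jnp

# Toy sizes (assumptions for illustration): 5 frequency bins, 3 frames, 2 bands.
F, N, n_mels = 5, 3, 2

# Stand-in magnitude spectrogram with shape (F, N).
S = jnp.arange(1.0, F * N + 1).reshape(F, N)

# Stand-in filterbank with shape (n_mels, F). Each row averages a group of bins;
# a real mel filterbank would use overlapping triangular filters instead.
mel_f = jnp.array([
    [0.5, 0.5, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1 / 3, 1 / 3, 1 / 3],
])

power = 2.0
# Raise the magnitudes to `power`, then project onto the bands.
mel_S = mel_f.dot(S ** power)  # shape (n_mels, N)
```

Each output row is a weighted sum of `S**power` over the frequency axis, so the frame axis `N` is preserved and the frequency axis `F` is replaced by `n_mels`.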
Note
For JAX JIT compilation, all arguments except y and S should be marked as static: sr, n_fft, hop_length, win_length, window, center, pad_mode, power, n_mels, fmin, fmax, htk, norm, dtype.
- Parameters:
y –
Audio time series. The last axis must be time.
(T,) - single waveform
(B, T) - batch of waveforms
sr – Audio sampling rate
S – (optional) Pre-computed spectrogram magnitude with shape (..., F, N)
n_fft – FFT window size
hop_length – Hop length for STFT
win_length – Window length
window – Window function
center – If True, pad the signal
pad_mode – Padding mode
power – Exponent for the magnitude melspectrogram, e.g. 1 for energy, 2 for power. If 0, return the STFT magnitude directly.
n_mels – Number of mel bands to generate
fmin – Lowest frequency (in Hz)
fmax – Highest frequency (in Hz). If None, use fmax = sr / 2.0
htk – Use HTK formula instead of Slaney
norm – {None, “slaney”, float > 0} If “slaney”, divide the triangular mel weights by the width of the mel band (area normalization). If numeric, use norm as a mel exponent normalization. See librosa.filters.mel for details.
dtype – Data type of the output array
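To make the htk option more concrete, the two Hz-to-mel conversions can be sketched as below. This assumes librosax follows the same conventions as librosa (the reference the norm entry points to); the helper `hz_to_mel` is written here for illustration and is not taken from the librosax source.

```python
import math

def hz_to_mel(f_hz: float, htk: bool = False) -> float:
    """Convert a frequency in Hz to mels (illustrative helper, librosa conventions assumed)."""
    if htk:
        # HTK formula: a single logarithmic curve.
        return 2595.0 * math.log10(1.0 + f_hz / 700.0)

    # Slaney formula: linear below 1 kHz, logarithmic above.
    f_sp = 200.0 / 3.0                # Hz per mel in the linear region
    min_log_hz = 1000.0               # start of the logarithmic region
    min_log_mel = min_log_hz / f_sp   # mel value at 1 kHz (15 mels)
    logstep = math.log(6.4) / 27.0    # step size in the logarithmic region
    if f_hz >= min_log_hz:
        return min_log_mel + math.log(f_hz / min_log_hz) / logstep
    return f_hz / f_sp

slaney_1k = hz_to_mel(1000.0)           # about 15 mels
htk_1k = hz_to_mel(1000.0, htk=True)    # about 1000 mels
```

The two scales use different units, so band edges computed with htk=True and htk=False are not interchangeable, even for the same fmin, fmax, and n_mels.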
- Returns:
Mel spectrogram with shape (..., n_mels, N).
(T,) → (n_mels, N)
(B, T) → (B, n_mels, N)
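The note on JIT compilation above can be illustrated with a small stand-in function. The sketch below does not call librosax; `frame_energy` is a hypothetical helper with the same calling pattern (array input plus keyword-only configuration), used to show why size-like arguments must be static.

```python
import jax
import jax.numpy as jnp

def frame_energy(y, *, n_fft=2048, hop_length=512):
    """Hypothetical stand-in: per-frame energy of y, framed along the last axis."""
    # The frame count determines the output shape, so n_fft and hop_length
    # must be concrete Python ints at trace time, not traced values.
    n_frames = 1 + (y.shape[-1] - n_fft) // hop_length
    idx = jnp.arange(n_fft)[None, :] + hop_length * jnp.arange(n_frames)[:, None]
    frames = y[..., idx]                   # (..., n_frames, n_fft)
    return jnp.sum(frames ** 2, axis=-1)   # (..., n_frames)

# Mark the configuration arguments as static; only y is traced.
jitted = jax.jit(frame_energy, static_argnames=("n_fft", "hop_length"))

y = jnp.ones((2, 4096))                    # batch of two waveforms, shape (B, T)
out = jitted(y, n_fft=1024, hop_length=256)
```

The same pattern should carry over to melspectrogram by passing the names listed in the note to `static_argnames`. Changing any static argument triggers a recompilation, so it helps to keep them fixed across calls.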