librosax.feature.melspectrogram

melspectrogram(*, y: jax.Array | None = None, sr: float = 22050, S: jax.Array | None = None, n_fft: int = 2048, hop_length: int = 512, win_length: int | None = None, window: str = 'hann', center: bool = True, pad_mode: str = 'constant', power: float = 2.0, n_mels: int = 128, fmin: float = 0.0, fmax: float | None = None, htk: bool = False, norm: str | float | None = 'slaney', dtype: numpy.dtype = jnp.float32) -> Array

Compute a mel-scaled spectrogram.

If a time-series input y is provided, its magnitude spectrogram S is computed first and then mapped onto the mel scale by mel_f.dot(S**power), where mel_f is the mel filter bank.

With the default power=2, the mel filter bank is applied to a power spectrum.
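The projection step can be sketched in isolation. This is a minimal illustration, not the library's implementation: mel_f here is a hypothetical stand-in (a uniform averaging matrix rather than real triangular mel filters), S is random data, and the sizes are toy values chosen for readability.

```python
import jax
import jax.numpy as jnp

# Toy sizes: 5 frequency bins, 3 mel bands, 4 frames (real filter banks are much larger).
n_freq, n_mels, n_frames = 5, 3, 4

# Hypothetical stand-in for a mel filter bank; a real one holds triangular filters.
mel_f = jnp.ones((n_mels, n_freq)) / n_freq

# Stand-in magnitude spectrogram with a leading channel axis: (channels, n_freq, t).
S = jnp.abs(jax.random.normal(jax.random.PRNGKey(0), (2, n_freq, n_frames)))

power = 2.0
# Project the power spectrum onto the mel bands; the ellipsis keeps leading axes intact.
mel_spec = jnp.einsum("mf,...ft->...mt", mel_f, S ** power)

print(mel_spec.shape)  # (2, 3, 4)
```

The einsum form is used here only so that the leading channel axis passes through unchanged, matching the (…, n_mels, t) output shape described below.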

Note

For JAX JIT compilation, all arguments except y and S should be marked as static (for example, via the static_argnames argument of jax.jit): sr, n_fft, hop_length, win_length, window, center, pad_mode, power, n_mels, fmin, fmax, htk, norm, dtype.
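The static-argument pattern can be sketched with a small stand-in function. toy_frame_energy below is hypothetical and is not part of librosax; it only mimics the keyword-only style of the signature above to show why shape-determining arguments such as n_fft and hop_length need to be static under jax.jit.

```python
from functools import partial

import jax
import jax.numpy as jnp


# Hypothetical stand-in with a similar keyword-only signature; not librosax code.
@partial(jax.jit, static_argnames=("n_fft", "hop_length", "power"))
def toy_frame_energy(*, y, n_fft=8, hop_length=4, power=2.0):
    # n_fft and hop_length set array shapes, so they must be known at trace time.
    n_frames = 1 + (y.shape[-1] - n_fft) // hop_length
    starts = jnp.arange(n_frames) * hop_length
    idx = starts[:, None] + jnp.arange(n_fft)[None, :]
    frames = y[..., idx]  # (..., n_frames, n_fft)
    return jnp.sum(jnp.abs(frames) ** power, axis=-1)


y = jnp.ones(16)
out = toy_frame_energy(y=y, n_fft=8, hop_length=4)
print(out.shape)  # (3,)
```

Under this pattern, changing a static argument triggers a recompilation, while changing only the contents of y (with the same shape and dtype) reuses the compiled function.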

Parameters:
  • y – Audio time series. Multichannel is supported.

  • sr – Audio sampling rate

  • S – (optional) Pre-computed spectrogram magnitude

  • n_fft – FFT window size

  • hop_length – Hop length for STFT

  • win_length – Window length

  • window – Window function

  • center – If True, pad the signal so that frames are centered

  • pad_mode – Padding mode

  • power – Exponent for the magnitude melspectrogram, e.g., 1 for energy, 2 for power. If 0, return the STFT magnitude directly.

  • n_mels – Number of mel bands to generate

  • fmin – Lowest frequency (in Hz)

  • fmax – Highest frequency (in Hz). If None, use fmax = sr / 2.0

  • htk – Use HTK formula instead of Slaney

  • norm – {None, 'slaney', float > 0}. If 'slaney', divide the triangular mel weights by the width of the mel band (area normalization). If numeric, use norm as a mel exponent normalization. See librosa.filters.mel for details.

  • dtype – Data type of the output array
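To make the htk option more concrete, the two Hz-to-mel conventions can be sketched as follows. This is an illustrative reimplementation of the commonly used formulas (the constants follow the librosa convention), not code taken from librosax; the helper name hz_to_mel is chosen here for illustration.

```python
import jax.numpy as jnp


def hz_to_mel(freq_hz, htk=False):
    """Convert Hz to mels (sketch of the two common conventions)."""
    f = jnp.asarray(freq_hz, dtype=jnp.float32)
    if htk:
        # HTK: a single logarithmic formula.
        return 2595.0 * jnp.log10(1.0 + f / 700.0)
    # Slaney: linear below 1 kHz, logarithmic above.
    f_sp = 200.0 / 3.0               # Hz per mel in the linear region
    min_log_hz = 1000.0              # start of the logarithmic region
    min_log_mel = min_log_hz / f_sp  # 15 mels
    logstep = jnp.log(6.4) / 27.0
    linear = f / f_sp
    # Clamp inside the log so frequencies below 1 kHz do not produce NaN or -inf.
    log_part = min_log_mel + jnp.log(jnp.maximum(f, min_log_hz) / min_log_hz) / logstep
    return jnp.where(f >= min_log_hz, log_part, linear)


freqs = jnp.array([0.0, 500.0, 1000.0, 4000.0])
slaney_mels = hz_to_mel(freqs)
htk_mels = hz_to_mel(freqs, htk=True)
print(slaney_mels)  # Slaney scale
print(htk_mels)     # HTK scale
```

The two scales use different units, so mel values from one convention are not comparable with the other; what matters for the filter bank is where the band edges land in Hz once fmin and fmax are mapped to mels and spaced evenly.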

Returns:

Mel spectrogram [shape=(…, n_mels, t)]

Return type:

jnp.ndarray