librosax.feature.mfcc
- mfcc(*, y: Array | None = None, sr: float = 22050, S: Array | None = None, n_mfcc: int = 20, dct_type: int = 2, norm: str | None = 'ortho', lifter: int = 0, n_fft: int = 2048, hop_length: int = 512, win_length: int | None = None, window: str = 'hann', center: bool = True, pad_mode: str = 'constant', power: float = 2.0, n_mels: int = 128, fmin: float = 0.0, fmax: float | None = None, htk: bool = False, melspectrogram_params: dict | None = None) → Array
Compute Mel-frequency cepstral coefficients (MFCCs).
MFCCs are computed from the log-power mel spectrogram.
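As a rough illustration of that step, the sketch below applies a type-2 orthonormal DCT along the mel axis of a log-power mel spectrogram and keeps the first n_mfcc coefficients. It assumes the conventional MFCC definition; the helper name `mfcc_from_log_mel` and the synthetic input are hypothetical and are not part of librosax, whose implementation may differ in detail.

```python
import jax.numpy as jnp
from jax.scipy.fft import dct


def mfcc_from_log_mel(log_mel, n_mfcc=20):
    """Hypothetical helper: DCT-II (orthonormal) over the mel axis."""
    # log_mel has shape (..., n_mels, t); the transform runs over n_mels.
    coeffs = dct(log_mel, type=2, axis=-2, norm="ortho")
    # Keep only the lowest n_mfcc cepstral coefficients.
    return coeffs[..., :n_mfcc, :]


# Synthetic stand-in for a log-power mel spectrogram: 128 bands, 10 frames.
log_mel = jnp.log(jnp.linspace(1.0, 2.0, 128 * 10).reshape(128, 10))
M = mfcc_from_log_mel(log_mel, n_mfcc=20)
print(M.shape)  # (20, 10)
```

The output keeps the leading (channel) dimensions and the frame axis, and replaces the n_mels axis with n_mfcc, which matches the documented return shape.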
Note
For JAX JIT compilation, all arguments except y and S should be marked as static. This includes all the melspectrogram parameters and the MFCC-specific parameters: sr, n_mfcc, dct_type, norm, lifter, plus all other kwargs.
- Parameters:
y – Audio time series. Multichannel is supported.
sr – Audio sampling rate
S – (optional) log-power mel spectrogram
n_mfcc – Number of MFCCs to return (default: 20)
dct_type – Discrete cosine transform (DCT) type (default: 2)
norm – If “ortho”, use orthonormal DCT basis. Default: “ortho”
lifter – If lifter>0, apply liftering (cepstral filtering) to the MFCCs. If lifter=0, no liftering is applied.
n_fft – FFT window size (used if y is provided)
hop_length – Hop length for STFT (used if y is provided)
win_length – Window length (used if y is provided)
window – Window function (used if y is provided)
center – If True, pad the signal (used if y is provided)
pad_mode – Padding mode (used if y is provided)
power – Exponent for the magnitude melspectrogram (used if y is provided)
n_mels – Number of mel bands (used if y is provided)
fmin – Lowest frequency in Hz (used if y is provided)
fmax – Highest frequency in Hz (used if y is provided)
htk – Use HTK formula for mel scale (used if y is provided)
melspectrogram_params – Additional keyword arguments for melspectrogram (used if y is provided)
- Returns:
MFCC sequence [shape=(…, n_mfcc, t)]
- Return type:
jnp.ndarray
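The note on JIT compilation can be illustrated with a minimal sketch. To stay self-contained it uses a hypothetical stand-in, `toy_mfcc`, rather than librosax itself; the stand-in only mimics the split between array inputs (S) and configuration arguments (n_mfcc, norm), which are passed to `jax.jit` through `static_argnames`.

```python
from functools import partial

import jax
import jax.numpy as jnp
from jax.scipy.fft import dct


# Hypothetical stand-in, not librosax's implementation.
# n_mfcc sets an output shape and norm is a string, so neither can be
# traced; both are declared static and trigger recompilation when changed.
@partial(jax.jit, static_argnames=("n_mfcc", "norm"))
def toy_mfcc(S, n_mfcc=20, norm="ortho"):
    return dct(S, type=2, axis=-2, norm=norm)[..., :n_mfcc, :]


S = jnp.zeros((2, 128, 10))  # two channels, 128 mel bands, 10 frames
out = toy_mfcc(S, n_mfcc=13)
print(out.shape)  # (2, 13, 10)
```

Under the same assumption, wrapping the real function would list every argument other than y and S in `static_argnames`, as the note describes.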