librosax.layers.LogMelFilterBank

class LogMelFilterBank(*args: Any, **kwargs: Any)

A module that converts spectrograms to (log) mel spectrograms.

This module applies a mel filterbank to a spectrogram and optionally converts the result to a log (decibel) scale.
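As a rough illustration of the filterbank step, projecting a spectrogram onto mel bands amounts to a matrix product between the spectrogram and a bank of triangular filters. The NumPy sketch below assumes a power spectrogram of shape (frames, n_fft // 2 + 1) and uses the HTK mel formula for simplicity. The helper names `hz_to_mel`, `mel_to_hz`, and `mel_filterbank` are hypothetical, and the library's own filter construction (for example Slaney-style scaling and normalization, as in librosa) may differ.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel formula (an assumption; librosa defaults to the Slaney variant).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr=22050, n_fft=2048, n_mels=64, fmin=0.0, fmax=None):
    """Return unnormalized triangular filters of shape (n_fft // 2 + 1, n_mels)."""
    fmax = sr / 2 if fmax is None else fmax
    fft_freqs = np.linspace(0.0, sr / 2, n_fft // 2 + 1)
    # n_mels + 2 edge frequencies, evenly spaced on the mel scale.
    edges = mel_to_hz(np.linspace(hz_to_mel(fmin), hz_to_mel(fmax), n_mels + 2))
    lower, center, upper = edges[:-2], edges[1:-1], edges[2:]
    # Rising and falling slopes of each triangle, evaluated at every FFT bin.
    rising = (fft_freqs[:, None] - lower) / (center - lower)
    falling = (upper - fft_freqs[:, None]) / (upper - center)
    return np.maximum(0.0, np.minimum(rising, falling))

# Hypothetical power spectrogram with 10 frames and 1025 frequency bins.
spec = np.abs(np.random.default_rng(0).normal(size=(10, 1025))) ** 2
mel_spec = spec @ mel_filterbank()  # shape (10, 64)
```

The frequency axis is contracted away by the product, so each frame ends up with n_mels values instead of n_fft // 2 + 1.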

Variables:
  • sr – Sample rate of the audio signal. Default is 22_050.

  • n_fft – FFT size. Default is 2048.

  • n_mels – Number of mel bands (filters) in the filterbank. Default is 64.

  • fmin – Minimum frequency for mel filterbank. Default is 0.0.

  • fmax – Maximum frequency for mel filterbank. Default is None, in which case sr // 2 is used.

  • is_log – If True, convert to log scale. Default is True.

  • ref – Reference value for log scaling. Default is 1.0.

  • amin – Minimum value for log scaling. Default is 1e-10.

  • top_db – Maximum dynamic range in dB. Default is 80.0.

  • freeze_parameters – If True, parameters are not updated during training. Default is True.
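The ref, amin, and top_db parameters mirror the familiar librosa `power_to_db` convention. The NumPy sketch below shows how they would interact under that convention, assuming the input is a power (not magnitude) mel spectrogram. The function name `power_to_db_sketch` is hypothetical, and this is an illustration of the convention rather than the module's actual code.

```python
import numpy as np

def power_to_db_sketch(mel_power, ref=1.0, amin=1e-10, top_db=80.0):
    # Floor the input at amin so the logarithm stays finite.
    log_spec = 10.0 * np.log10(np.maximum(amin, mel_power))
    # Express the result relative to the reference power.
    log_spec -= 10.0 * np.log10(np.maximum(amin, ref))
    if top_db is not None:
        # Limit the dynamic range to top_db decibels below the peak.
        log_spec = np.maximum(log_spec, log_spec.max() - top_db)
    return log_spec

mel_power = np.array([[1.0, 1e-3, 1e-12]])
db = power_to_db_sketch(mel_power)  # approximately [[0.0, -30.0, -80.0]]
```

In this example the smallest value is first floored to amin (giving -100 dB) and then raised to -80 dB by the top_db limit.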

__init__(sr: int = 22050, n_fft: int = 2048, n_mels: int = 64, fmin: float = 0.0, fmax: float = None, is_log: bool = True, ref: float = 1.0, amin: float = 1e-10, top_db: float | None = 80.0, freeze_parameters: bool = True)

Methods

__init__([sr, n_fft, n_mels, fmin, fmax, ...])

eval(**attributes)

Sets the Module to evaluation mode.

iter_children()

Iterates over all child Modules of the current Module.

iter_modules()

Recursively iterates over all nested Modules of the current Module, including the current Module.

perturb(name, value[, variable_type])

Adds a zero-value variable ("perturbation") to the intermediate value.

set_attributes(*filters[, raise_if_not_found])

Sets the attributes of nested Modules including the current Module.

sow(variable_type, name, value[, reduce_fn, ...])

Collects intermediate values without the overhead of explicitly passing a container through each Module call.

train(**attributes)

Sets the Module to training mode.