Librosax documentation

Librosax is a JAX implementation of audio processing functions from librosa, TorchLibrosa, and nnAudio.

The source code is available on GitHub. Librosax follows Effort-based Versioning.

Getting started

New to librosax? Check out the installation guide and design philosophy:

Installation - Installation instructions for librosax and JAX
Scope and Design Philosophy - Design philosophy and recommended workflows
Changelog - See what’s new in recent releases

API documentation

Librosax provides audio processing functions and neural network layers:

Librosax - Core functions (STFT, iSTFT, magnitude scaling)
Feature extraction - Spectral features, mel-frequency features, chromagram, and CQT
Neural network layers - Spectrogram and MFCC layers, plus data augmentation
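
To make the terms "STFT" and "magnitude scaling" concrete, here is a minimal sketch written in plain jax.numpy. It does not use librosax's own API: the function name stft_magnitude_db, its parameters, and the framing choices (periodic Hann window, no centering or padding) are assumptions made for illustration only. See the API pages listed above for the actual functions and signatures.

```python
import jax.numpy as jnp


def stft_magnitude_db(signal, n_fft=512, hop_length=128):
    """Hann-windowed STFT power in decibels (illustrative, not librosax API)."""
    # Periodic Hann window, built by hand to keep the sketch dependency-free.
    n = jnp.arange(n_fft)
    window = 0.5 - 0.5 * jnp.cos(2.0 * jnp.pi * n / n_fft)

    # Slice the signal into overlapping frames (no centering or padding).
    n_frames = 1 + (signal.shape[0] - n_fft) // hop_length
    starts = hop_length * jnp.arange(n_frames)
    frames = signal[starts[:, None] + n[None, :]]  # shape: (n_frames, n_fft)

    # One-sided FFT of each windowed frame.
    spectrum = jnp.fft.rfft(frames * window, axis=-1)

    # Magnitude scaling: power to decibels, with a floor to avoid log(0).
    power = jnp.abs(spectrum) ** 2
    return 10.0 * jnp.log10(jnp.maximum(power, 1e-10))


# One second of a 1 kHz sine tone sampled at 16 kHz.
sr = 16000
t = jnp.arange(sr) / sr
tone = jnp.sin(2.0 * jnp.pi * 1000.0 * t)

spec_db = stft_magnitude_db(tone)
peak_bin = int(jnp.argmax(spec_db.mean(axis=0)))
```

With these settings each frequency bin spans 16000 / 512 = 31.25 Hz, so the 1 kHz tone peaks at bin 32, and the result has shape (122, 257): 122 frames by 257 one-sided frequency bins.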

Citation

If you use librosax in your research, please cite:

@software{Braun_librosax_2025,
   author = {Braun, David},
   month = mar,
   title = {{librosax}},
   url = {https://github.com/DBraun/librosax},
   version = {0.0.5},
   year = {2025}
}

Additionally, please consider citing the original libraries that librosax is based on:

librosa - For the design principles and most algorithms:

@inproceedings{mcfee2015librosa,
  title={librosa: Audio and music signal analysis in python},
  author={McFee, Brian and Raffel, Colin and Liang, Dawen and Ellis, Daniel PW and McVicar, Matt and Battenberg, Eric and Nieto, Oriol},
  booktitle={Proceedings of the 14th python in science conference},
  volume={8},
  year={2015}
}

nnAudio - For Constant-Q Transform (CQT) implementations:

@ARTICLE{9174990,
  author={K. W. {Cheuk} and H. {Anderson} and K. {Agres} and D. {Herremans}},
  journal={IEEE Access},
  title={nnAudio: An on-the-Fly GPU Audio to Spectrogram Conversion Toolbox Using 1D Convolutional Neural Networks},
  year={2020},
  volume={8},
  pages={161981-162003},
  doi={10.1109/ACCESS.2020.3019084}
}

TorchLibrosa - For augmentations and neural network layers:

@article{kong2020panns,
  title={{PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition}},
  author={Kong, Qiuqiang and Cao, Yin and Iqbal, Turab and Wang, Yuxuan and Wang, Wenwu and Plumbley, Mark D.},
  journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
  volume={28},
  pages={2880--2894},
  year={2020},
  publisher={{IEEE}}
}
