AudioTree documentation
AudioTree is a JAX library for audio data loading and augmentations. The source code is on GitHub at https://github.com/DBraun/audiotree. AudioTree follows Effort-based Versioning.
Three dependencies must be installed first:
pip install "git+https://github.com/DBraun/argbind.git@improve.subclasses"
pip install "git+https://github.com/DBraun/dm_aux.git@DBraun-patch-2"
pip install "git+https://github.com/boris-kuz/jaxloudnorm.git"
Then AudioTree can be installed with pip:
pip install audiotree
The namesake class AudioTree is a flax.struct.dataclass with properties for audio data, sample rate, on-demand data such as loudness, and optional data such as filepaths, MIDI pitch, velocity, and duration.
An AudioTree can also store arrays for codebooks or latent embeddings.
Although AudioTree is a specific class, we also loosely refer to any combination of dictionaries and lists of AudioTrees as an AudioTree (see the JAX Pytree docs).
For example, in the code below, we consider batch to be an AudioTree.
from jax import numpy as jnp
from audiotree import AudioTree
sample_rate = 44100
data = jnp.zeros((16, 2, 441000)) # dummy placeholder shaped (B, C, T)
audio_tree = AudioTree(data, sample_rate)
batch = {"src": [audio_tree, audio_tree], "target": audio_tree}
The batch above can be used with jax.tree.map to create a new batch. That’s essentially what the Transform classes in transforms do.
They perform GPU-based jax.jit-compatible augmentations on arbitrarily shaped AudioTrees.
When used with ArgBind, they are also highly configurable from the command line, YAML, and Python.
Whether you’re creating a data loader for training, validation, testing, or prompt generation, AudioTree can help.
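The jax.tree.map pattern mentioned above can be sketched without AudioTree itself. The example below is only an illustration under stated assumptions: it uses plain JAX arrays as stand-ins for AudioTrees (so it makes no claim about which AudioTree fields are treated as leaves), and apply_gain is a hypothetical function, not one of the Transform classes in transforms.

```python
import jax
from jax import numpy as jnp

# Stand-in for an AudioTree batch: plain arrays shaped (B, C, T), arranged in
# the same nested dict/list layout as the batch in the earlier example.
audio = jnp.ones((4, 2, 1000))
batch = {"src": [audio, audio], "target": audio}


def apply_gain(x, gain_db=-6.0):
    """Hypothetical augmentation: scale a waveform by a gain in decibels."""
    return x * (10.0 ** (gain_db / 20.0))


@jax.jit
def augment(tree):
    # jax.tree.map calls apply_gain on every leaf array and rebuilds the
    # same dict/list structure around the results.
    return jax.tree.map(apply_gain, tree)


quieter = augment(batch)
```

The result has the same nesting as the input, which is why one function can serve batches of any shape. With real AudioTrees, the library's own Transform classes are the intended route.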
Content
Introduction
AudioTree API
Acknowledgments
AudioTree is inspired by AudioTools. Thank you!
Citation
@software{Braun_AudioTree_2025,
author = {Braun, David},
month = aug,
title = {{AudioTree}},
url = {https://github.com/DBraun/audiotree},
version = {0.2.0},
year = {2025}
}