AudioTree documentation
AudioTree is a JAX library for audio data loading and augmentations. The source code is available at https://github.com/DBraun/audiotree.
There are three requirements:
pip install "git+https://github.com/DBraun/argbind.git@improve.subclasses"
pip install "git+https://github.com/DBraun/dm_aux.git@DBraun-patch-2"
pip install "git+https://github.com/DBraun/jaxloudnorm.git@feature/speed-optimize"
Then AudioTree can be installed with pip:
pip install audiotree
The namesake class AudioTree is a flax.struct.dataclass with properties for audio data, sample rate, on-demand data such as loudness, and optional metadata such as MIDI pitch, velocity, and duration.
Although AudioTree is a specific class, we loosely refer to any combination of dictionaries and lists of AudioTrees as an AudioTree as well (see the Pytree section of the JAX docs).
For example, in the code below, we consider batch to be an AudioTree.
from jax import numpy as jnp
from audiotree import AudioTree
sample_rate = 44100
data = jnp.zeros((16, 2, 441000)) # dummy placeholder shaped (B, C, T)
audio_tree = AudioTree(data, sample_rate)
batch = {"src": [audio_tree, audio_tree], "target": audio_tree}
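To make the pytree idea concrete, here is a minimal sketch of how JAX sees a nested container like the one above. It uses plain arrays as stand-ins for AudioTree instances (an assumption made so the snippet runs without AudioTree installed); the names toy_batch, leaves, and structure are illustrative only.

```python
import jax
from jax import numpy as jnp

# Plain arrays stand in for AudioTree instances (illustrative assumption).
toy_batch = {
    "src": [jnp.zeros((2, 1, 100)), jnp.zeros((2, 1, 200))],
    "target": jnp.zeros((2, 1, 300)),
}

# JAX flattens dictionaries and lists down to their array leaves.
# Dictionary keys are visited in sorted order: "src" first, then "target".
leaves = jax.tree.leaves(toy_batch)
structure = jax.tree.structure(toy_batch)

for leaf in leaves:
    print(leaf.shape)
print(structure)
```

The container layout (structure) and the arrays (leaves) are tracked separately, which is what lets a single function be applied to every array while the nesting is preserved.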
The batch above can be used with jax.tree.map to create a new batch. That’s essentially what the Transform classes in transforms do.
They perform GPU-based, jax.jit-compatible augmentations on arbitrarily shaped AudioTrees.
When used with ArgBind, they are also highly configurable from the command line and YAML.
Whether you’re creating a data loader for training, validation, testing, or prompt generation, AudioTree can help.
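The jax.tree.map pattern described above can be sketched as follows. This is not the implementation of the Transform classes; ToyAudioTree is a hypothetical stand-in for AudioTree (registered by hand as a pytree so the snippet is self-contained), and halve_gain is a made-up augmentation.

```python
from dataclasses import dataclass

import jax
from jax import numpy as jnp


@jax.tree_util.register_pytree_node_class
@dataclass
class ToyAudioTree:
    """Hypothetical stand-in for AudioTree, not the real class."""

    audio_data: jnp.ndarray  # pytree leaf, shaped (B, C, T)
    sample_rate: int         # static metadata, not a leaf

    def tree_flatten(self):
        # Children are traced/mapped; aux data is carried along unchanged.
        return (self.audio_data,), self.sample_rate

    @classmethod
    def tree_unflatten(cls, aux_data, children):
        return cls(children[0], aux_data)


def halve_gain(x):
    """Made-up augmentation: scale every sample by 0.5."""
    return 0.5 * x


tree = ToyAudioTree(jnp.ones((4, 2, 1000)), 44100)
batch = {"src": [tree, tree], "target": tree}

# Apply the function to every array leaf; the nesting is preserved.
new_batch = jax.tree.map(halve_gain, batch)

# The same mapping also works under jax.jit, since batch is a valid pytree.
jitted_batch = jax.jit(lambda b: jax.tree.map(halve_gain, b))(batch)
```

Because the sample rate is stored as static auxiliary data in this sketch, the mapped function only ever sees the audio arrays, and new_batch has the same dictionary and list layout as batch.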
Acknowledgments
AudioTree is inspired by AudioTools. Thank you!
Citation
@software{Braun_AudioTree_2024,
  author = {Braun, David},
  month = aug,
  title = {{AudioTree}},
  url = {https://github.com/DBraun/audiotree},
  version = {0.1.2},
  year = {2024}
}