dasp-pytorch - Efficient Integration of Differentiable Audio Processing in Neural Networks with PyTorch

Introduction to dasp-pytorch

dasp-pytorch is a Python library for building differentiable audio signal processors, such as audio effects, inside PyTorch computational graphs. It covers a range of common audio effects and processing techniques. The library is open source under the Apache 2.0 license, which permits both academic and commercial use.

Key Features

Comprehensive Audio Effects: The library includes reverberation, distortion, dynamic range processing, equalization, and stereo processing.
Processing Capabilities: Because every processor is differentiable, the library supports applications such as virtual analog modeling, blind parameter estimation, automated DSP (digital signal processing), and audio style transfer.
Efficient Performance: Processors operate on batches and run on both CPU and GPU, so they can sit inside a training loop without becoming a bottleneck.
Ease of Use: Every processor has a purely functional interface, which keeps code portable across projects. Adding an effect to a computational graph amounts to calling the function on the input tensor together with the sample rate and the effect's parameters.
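To make the batching and functional-interface points concrete, here is a minimal sketch written in plain PyTorch. It does not call dasp-pytorch: `apply_gain_db` is a hypothetical stand-in processor, and the tensor layout (batch, channels, samples) with one parameter value per batch item is an assumption chosen for illustration.

```python
import torch


def apply_gain_db(x: torch.Tensor, gain_db: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in processor: scale each batch item by its own gain.

    x:       (batch, channels, samples)
    gain_db: (batch,) gain in decibels, one value per batch item
    """
    gain_lin = 10.0 ** (gain_db / 20.0)   # dB -> linear amplitude
    return x * gain_lin.view(-1, 1, 1)    # broadcast over channels and samples


# Run on the GPU when one is available, otherwise on the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# Batch of 4 stereo clips, 1 second at 16 kHz (random noise as placeholder audio).
x = torch.randn(4, 2, 16000, device=device)
gain_db = torch.tensor([-6.0, 0.0, 3.0, 6.0], device=device, requires_grad=True)

# A stateless function call; the output stays in the autograd graph.
y = apply_gain_db(x, gain_db)

# Any loss on the output yields one gradient value per batch item.
loss = y.pow(2).mean()
loss.backward()
print(y.shape, gain_db.grad.shape)
```

Because the processor is a pure function of its inputs, the same call works unchanged for a batch of one or a batch of hundreds, on either device.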

Getting Started

Install dasp-pytorch with pip:

pip install dasp-pytorch

Alternatively, clone the repository and install it locally in editable mode:

git clone https://github.com/csteinmetz1/dasp-pytorch
cd dasp-pytorch
pip install -e .

Example Usage

The following quickstart recovers the drive value of a distortion effect by gradient descent. It applies distortion with a known drive to produce a target signal, then optimizes an estimate of the drive until the processed output matches that target:

import torch
import torchaudio
import dasp_pytorch

# Load audio
x, sr = torchaudio.load("audio/short_riff.wav")

# add a batch dimension: (batch_size, n_channels, n_samples)
x = x.unsqueeze(0)

# apply some distortion with 16 dB drive
drive = torch.tensor([16.0])
y = dasp_pytorch.functional.distortion(x, sr, drive)

# create the parameter to estimate, starting from 0 dB
drive_hat = torch.nn.Parameter(torch.tensor(0.0))
optimizer = torch.optim.Adam([drive_hat], lr=0.01)

# optimize the parameter
n_iters = 2500
for n in range(n_iters):
    # apply distortion with the current estimate
    y_hat = dasp_pytorch.functional.distortion(x, sr, drive_hat)

    # measure the distance between the estimate and the target
    loss = torch.nn.functional.mse_loss(y_hat, y)

    # take a gradient step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step: {n+1}/{n_iters}, loss: {loss.item():.3e}, drive: {drive_hat.item():.3f}")
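The quickstart needs an audio file and the library itself. For readers who want to see the same idea in isolation, the sketch below repeats the experiment with nothing but PyTorch. Everything specific to it is an assumption made for illustration: `soft_clip` is a hypothetical tanh-based stand-in and not the library's distortion, the input is a synthetic sine wave, and the learning rate and iteration count were picked so that this toy problem converges quickly.

```python
import math

import torch


def soft_clip(x: torch.Tensor, drive_db: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in distortion: pre-gain in dB followed by tanh."""
    return torch.tanh(x * 10.0 ** (drive_db / 20.0))


# Synthetic input: 220 Hz sine with peak 0.5, shaped (batch, channels, samples).
sr = 16000
t = torch.arange(4096) / sr
x = (0.5 * torch.sin(2 * math.pi * 220.0 * t)).view(1, 1, -1)

# Target signal produced with a known drive of 16 dB.
with torch.no_grad():
    y = soft_clip(x, torch.tensor(16.0))

# Estimate of the drive, starting from 0 dB.
drive_hat = torch.nn.Parameter(torch.tensor(0.0))
optimizer = torch.optim.Adam([drive_hat], lr=0.1)

losses = []
for _ in range(2000):
    # Process the input with the current estimate and compare to the target.
    loss = torch.nn.functional.mse_loss(soft_clip(x, drive_hat), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"estimated drive: {drive_hat.item():.2f} dB, final loss: {losses[-1]:.3e}")
```

The loss here has a single minimum at the true drive, because raising the estimate moves every output sample monotonically toward its target. Real effects and real audio do not always give such a well-behaved loss surface.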

More Examples

Virtual Analog Modeling: Fit differentiable processors so that they emulate the behavior of analog audio hardware.
Automatic Equalization: Estimate equalizer settings automatically instead of tuning them by hand.
Style Transfer: Apply the sonic characteristics of one recording to another.

Audio Processors Offered

dasp-pytorch exposes the following processors as functions in dasp_pytorch.functional:

Gain: gain()
Distortion: distortion()
Parametric Equalizer: parametric_eq()
Dynamic Range Compressor: compressor()
Dynamic Range Expander: expander()
Reverberation: noise_shaped_reverberation()
Stereo Widener: stereo_widener()
Stereo Panner: stereo_panner()
Stereo Bus: stereo_bus()
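Since each processor is a function of an input tensor and its parameters, a signal chain is ordinary function composition, and gradients reach every parameter in the chain. The sketch below illustrates this with pure-PyTorch stand-ins, not with the library's own processors. The names `gain`, `soft_clip`, `constant_power_pan`, and `chain` are hypothetical, as are the argument order and the parameter ranges.

```python
import math

import torch


def gain(x: torch.Tensor, gain_db: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: per-item gain in dB on (batch, channels, samples)."""
    return x * (10.0 ** (gain_db / 20.0)).view(-1, 1, 1)


def soft_clip(x: torch.Tensor, drive_db: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: drive the signal into a tanh nonlinearity."""
    return torch.tanh(gain(x, drive_db))


def constant_power_pan(x: torch.Tensor, pan: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in: pan mono (batch, 1, samples) to stereo.

    pan is in [0, 1]: 0 is hard left, 0.5 is center, 1 is hard right.
    """
    theta = pan.view(-1, 1, 1) * (math.pi / 2)
    return torch.cat([x * torch.cos(theta), x * torch.sin(theta)], dim=1)


def chain(x, drive_db, pan, out_gain_db):
    """Distortion -> panner -> output gain, composed as plain function calls."""
    return gain(constant_power_pan(soft_clip(x, drive_db), pan), out_gain_db)


# Three mono clips, each with its own parameter values.
x = torch.randn(3, 1, 8000) * 0.1
params = {
    "drive_db": torch.tensor([0.0, 6.0, 12.0], requires_grad=True),
    "pan": torch.tensor([0.0, 0.5, 1.0], requires_grad=True),
    "out_gain_db": torch.tensor([0.0, -3.0, -6.0], requires_grad=True),
}

y = chain(x, **params)

# One backward pass populates gradients for every parameter in the chain.
y.pow(2).mean().backward()
for name, p in params.items():
    print(name, p.grad)
```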

Conclusion

dasp-pytorch gives researchers and practitioners a set of differentiable audio processors that fit directly into PyTorch workflows. Its range of effects and its functional interfaces make it useful for both research and practical applications in sound engineering and production.