Introducing VampNet
VampNet is a generative music modeling system built around the Descript Audio Codec. The project provides pretrained models, training and fine-tuning scripts, and an interactive interface, all in a single repository.
Explore VampNet with unloop
For a hands-on feel for what VampNet can do, try unloop, a co-creative looper built on VampNet. unloop is available on GitHub and offers a practical example of VampNet's applications in a creative setting.
Setting Up the Environment
To get started with VampNet, you need Python 3.9; this requirement stems from a known issue with the madmom dependency. The recommended approach is to create a dedicated Python environment, for example with Conda:
conda create -n vampnet python=3.9
conda activate vampnet
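You can verify that the new environment picked up the expected interpreter:
python --version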
Once the environment is ready, install VampNet by cloning the repository and installing it as an editable package:
git clone https://github.com/hugofloresgarcia/vampnet.git
pip install -e ./vampnet
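As a quick smoke test that the install worked (assuming the package imports under the name vampnet, matching the repository name):
python -c "import vampnet"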
Leveraging Argbind
VampNet uses Argbind to manage its command-line interfaces and configuration files. The configs are stored in the conf/ folder, where they can be inspected and customized.
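In practice, this means the commands below accept an --args.load flag pointing at a YAML config, and any individual value can be overridden on the command line with Argbind's Class.attribute syntax. For example, using the training script covered below (the --AudioDataset.duration flag here is hypothetical, shown only to illustrate the syntax):
# load a full config, then override a single value from the CLI
# (--AudioDataset.duration is a hypothetical flag illustrating Argbind's syntax)
python scripts/exp/train.py --args.load conf/vampnet.yml --AudioDataset.duration 10.0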
Accessing Pretrained Models
VampNet provides pretrained models, licensed under CC BY-NC-SA 4.0. The license applies both to the original models and to any models developed further from these pretrained weights. Download the models via the link in the project repository, then extract them into the models/ folder.
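For example, if the download arrives as a zip archive (the filename here is hypothetical):
# 'models.zip' is a placeholder for the actual downloaded archive
unzip models.zip -d models/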
Utilizing the Gradio Interface
For an interactive experience, VampNet offers a Gradio-powered interface that makes it easy to experiment with the models:
python app.py --args.load conf/interface.yml --Interface.device cuda
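If no CUDA GPU is available, the same device flag can presumably be pointed at the CPU instead (this assumes the underlying models support CPU inference, which will be slow):
# 'cpu' as a device value is an assumption; expect slow inference without a GPU
python app.py --args.load conf/interface.yml --Interface.device cpu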
Training and Fine-tuning Models
VampNet ships with the scripts and configurations needed for training and fine-tuning models. To train a model, run:
python scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints
For those with access to multiple GPUs, VampNet supports multi-GPU training via torchrun:
torchrun --nproc_per_node gpu scripts/exp/train.py --args.load conf/vampnet.yml --save_path path/to/ckpt
Configuration options, such as dataset paths and training parameters, can be customized in the conf/vampnet.yml file.
To fine-tune an existing model on your own audio, VampNet can generate fine-tuning configurations with the following script:
python scripts/exp/fine_tune.py "/path/to/audio1.mp3 /path/to/audio2/ /path/to/audio3.wav" <fine_tune_name>
This generates separate configuration files for fine-tuning both the coarse and fine models, plus an interface configuration for the refined model, so multiple levels of detail can be adjusted before launching the final interface, as shown below.
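Once fine-tuning completes, the refined model can be launched by pointing app.py at the generated interface config (the conf/generated/<fine_tune_name>/ layout below follows the upstream project's documented workflow; verify it against the files the script actually writes):
python app.py --args.load conf/generated/<fine_tune_name>/interface.yml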
Debugging for Optimal Performance
To troubleshoot training, run it on a single GPU with zero worker processes under the Python debugger, which makes it much easier to set breakpoints and inspect failures:
CUDA_VISIBLE_DEVICES=0 python -m pdb scripts/exp/train.py --args.load conf/vampnet.yml --save_path /path/to/checkpoints --num_workers 0
In summary, VampNet is a robust platform for generating, training, and fine-tuning music models, giving users the tools and resources needed to explore AI-driven music creation. Whether you're a seasoned developer or a curious music enthusiast, VampNet opens up a wide range of creative possibilities.