Minimal Implementation of Scalable Rectified Flow Transformers
minRF is a minimal yet scalable implementation of rectified flow transformers. It pairs a simple rectified flow training objective with a transformer backbone, and the codebase is kept deliberately small so that newcomers can read, run, and modify it end to end.
Overview
minRF implements rectified flow models, adopting the rectified flow training formulation popularized by SD3 and pairing it with a LLaMA-style DiT architecture. The result is a scalable, minimalistic implementation: the code keeps the pieces needed for serious training while hiding incidental complexity, staying short, modular, and easy to run.
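To make the objective concrete, here is a minimal sketch of rectified flow training, assuming the convention that t = 0 is pure noise and t = 1 is data; the function and argument names are illustrative, not minRF's actual API:

import torch

def rf_loss(model, x1, cond):
    # Rectified flow: draw a point on the straight line between noise x0
    # and data x1, and regress the model onto the line's constant velocity.
    b = x1.size(0)
    t = torch.rand(b, device=x1.device)      # per-sample timestep in [0, 1]
    x0 = torch.randn_like(x1)                # Gaussian noise endpoint
    tb = t.view(b, 1, 1, 1)                  # broadcast over (C, H, W)
    xt = (1 - tb) * x0 + tb * x1             # interpolated sample on the path
    v_target = x1 - x0                       # constant velocity of the path
    v_pred = model(xt, t, cond)              # model predicts the velocity
    return ((v_pred - v_target) ** 2).mean()

Note that SD3 additionally biases the timestep distribution (for example via logit-normal sampling) rather than drawing t uniformly.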
Getting Started: Simple Rectified Flow
If you are new to rectified flows, minRF offers a straightforward entry point. Install PyTorch, torchvision, and Pillow, then train a model on MNIST, a classic first dataset for generative modeling experiments.
To run the model:
pip install torch torchvision pillow
python rf.py
To train on CIFAR-10 instead of MNIST, add a single flag:
python rf.py --cifar
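Once trained, sampling is just numerical integration of the learned velocity field from noise to data. Below is a minimal Euler sampler sketch consistent with the convention above; again the names are illustrative, not minRF's exact interface:

import torch

@torch.no_grad()
def sample(model, cond, shape, steps=50, device="cpu"):
    # Integrate dx/dt = v(x, t) from t = 0 (noise) to t = 1 (data).
    x = torch.randn(shape, device=device)
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((shape[0],), i * dt, device=device)
        x = x + model(x, t, cond) * dt       # one Euler step along the flow
    return x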
Advanced Implementation: Massive Rectified Flow with muP Support
For experienced practitioners looking to scale up, minRF also supports ImageNet training; ImageNet is, after all, the new MNIST. The advanced setup builds on several companion codebases and trains on the Imagenet.int8 dataset, a preprocessed version of ImageNet stored as VAE latents quantized to int8 for fast loading.
Set up the environment and download the required resources:
cd advanced
pip install hf_transfer
bash download.sh
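download.sh fetches the Imagenet.int8 latents. Conceptually, training code must dequantize the int8 latents back to floating point before feeding them to the model. Here is a hedged sketch of that step, where the file layout and the scale constant are assumptions rather than the dataset's documented schema:

import torch
from torch.utils.data import Dataset

class Int8LatentDataset(Dataset):
    # Hypothetical wrapper: assumes a tensor file of int8 VAE latents plus
    # integer class labels; the real on-disk format may differ.
    def __init__(self, latent_path, label_path, scale=0.13025):
        self.latents = torch.load(latent_path)   # int8, shape (N, C, H, W)
        self.labels = torch.load(label_path)     # int64, shape (N,)
        self.scale = scale                       # dequantization scale (assumed)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        x = self.latents[i].to(torch.float32) * self.scale
        return x, self.labels[i]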
To train the ImageNet model from scratch:
bash run.sh
A key feature of the advanced implementation is muP (Maximal Update Parametrization) support. A learning rate grid search is run once on a narrow model; because muP keeps optimal hyperparameters stable across widths, the result transfers zero-shot to much larger rectified flow models, removing the need to re-tune the learning rate as you scale up.
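In broad strokes, muP rescales initialization and per-layer learning rates so that the optimum found on a narrow model stays valid as width grows. The sketch below shows the flavor of the idea for Adam-style optimizers, where hidden weight learning rates shrink with width; it is a simplified toy illustration, not minRF's actual muP implementation:

import torch
import torch.nn as nn

def mup_param_groups(model: nn.Module, base_lr: float, base_width: int, width: int):
    # Simplified muP rule for Adam: matrix-like (hidden) weights get their
    # learning rate scaled by base_width / width; vector parameters keep the
    # base lr. Real muP also treats embedding and output layers specially.
    hidden = [p for p in model.parameters() if p.ndim >= 2]
    rest = [p for p in model.parameters() if p.ndim < 2]
    return [
        {"params": hidden, "lr": base_lr * base_width / width},
        {"params": rest, "lr": base_lr},
    ]

# Usage: torch.optim.Adam(mup_param_groups(model, base_lr=3e-4, base_width=256, width=1024))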
Integration of Techniques
The advanced codebase consolidates techniques refined across earlier projects, including min-max-IN-dit, min-max-gpt, and ez-muP, into a single training setup that serves a wide range of model scales.
Conclusion and Contribution
minRF provides a minimal yet capable framework for scalable rectified flow transformers, pairing a small, readable codebase with recent research. Use it as a starting point for experimental or applied work, and consider contributing improvements back to the broader community.
If you use this codebase, please cite the repository:
@misc{ryu2024minrf,
  author    = {Simo Ryu},
  title     = {minRF: Minimal Implementation of Scalable Rectified Flow Transformers},
  year      = {2024},
  publisher = {GitHub},
  url       = {https://github.com/cloneofsimo/minRF},
}