LiBai Project Introduction
Overview
LiBai is an open-source toolbox for large-scale model training, built on OneFlow. It is designed for efficient, scalable training and provides features for both computer vision (CV) and natural language processing (NLP) tasks. The current version of LiBai supports OneFlow 0.7.0.
Key Highlights
Parallel Training Components
LiBai stands out for its comprehensive support for parallel training. It offers Data Parallelism, Tensor Parallelism, and Pipeline Parallelism, and its extensible design allows new forms of parallelism to be incorporated, making it a versatile tool for researchers and developers.
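These strategies can be combined in a single job. As a minimal conceptual sketch (plain Python, not LiBai's actual API; the parameter names here simply mirror common 3D-parallelism convention), the total number of devices required is the product of the three parallel degrees:

```python
# Illustrative sketch: how 3D-parallel degrees compose into a device count.
# Each pipeline stage holds tensor-parallel shards of the model, and the
# resulting model replica is duplicated data_parallel_size times.

def world_size(data_parallel_size: int,
               tensor_parallel_size: int,
               pipeline_parallel_size: int) -> int:
    """Number of devices needed for the given parallel configuration."""
    return data_parallel_size * tensor_parallel_size * pipeline_parallel_size

# Example: 2-way data, 4-way tensor, 2-way pipeline parallelism.
n_devices = world_size(2, 4, 2)
print(n_devices)  # 16
```

In practice, a framework validates that the configured degrees divide the available device count evenly before launching.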
Diverse Training Techniques
The toolbox ships with numerous training techniques that work out of the box: Distributed Training, which leverages multiple computing nodes; Mixed Precision Training, which uses reduced-precision computation for higher throughput; Activation Checkpointing and Recomputation, which trade extra compute for lower memory consumption; Gradient Accumulation, which simulates large batch sizes under limited memory; and the Zero Redundancy Optimizer (ZeRO), which partitions optimizer state to reduce memory usage when training large models.
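To make one of these concrete, gradient accumulation can be sketched in a few lines of framework-free Python (a toy illustration, not LiBai's implementation): gradients from several micro-batches are averaged, and the optimizer steps only once per accumulation window, giving the effect of a larger batch.

```python
# Toy gradient accumulation: step every `accum_steps` micro-batches.

def train_step(params, micro_batches, grad_fn, lr=0.1, accum_steps=4):
    """Update `params` using gradients averaged over accum_steps micro-batches.

    grad_fn(params, batch) returns one gradient per parameter.
    """
    accum = [0.0] * len(params)
    for i, batch in enumerate(micro_batches, start=1):
        grads = grad_fn(params, batch)
        # Divide each micro-batch gradient by accum_steps so the
        # accumulated value is the average over the window.
        accum = [a + g / accum_steps for a, g in zip(accum, grads)]
        if i % accum_steps == 0:
            params = [p - lr * a for p, a in zip(params, accum)]
            accum = [0.0] * len(params)
    return params

# Usage: a constant gradient of 1.0 over 4 micro-batches averages to 1.0,
# so a single optimizer step moves the parameter by -lr.
new_params = train_step([0.0], range(4), lambda p, b: [1.0])
print(new_params)  # [-0.1]
```

Real frameworks accumulate gradients in the parameters' grad buffers rather than a separate list, but the scheduling logic is the same.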
Support for CV and NLP Tasks
LiBai makes data preparation convenient by providing predefined processing pipelines for popular datasets in both the computer vision and natural language processing domains, including CIFAR and ImageNet for vision tasks and the BERT dataset for NLP tasks, simplifying the workflow of data scientists and engineers.
User-Friendly Design
The modular design of LiBai emphasizes ease of use. Features such as the LazyConfig system offer a flexible configuration syntax free of rigid predefined structures. This is complemented by a user-friendly trainer and engine, which support both reproducing existing research and building new projects on top of LiBai's infrastructure.
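The core idea behind a lazy configuration system can be illustrated with a small mock (a conceptual toy modeled on the detectron2-style LazyCall pattern, not LiBai's actual class): a config records a callable and its arguments without instantiating anything, so users can edit it in plain Python before building the object.

```python
# Toy LazyCall: record a constructor and kwargs now, instantiate later.

class LazyCall:
    def __init__(self, target):
        self.target = target   # the callable to instantiate later
        self.kwargs = {}

    def __call__(self, **kwargs):
        self.kwargs = dict(kwargs)
        return self

    def build(self):
        """Actually construct the target with the recorded arguments."""
        return self.target(**self.kwargs)

class Linear:
    def __init__(self, in_features, out_features):
        self.in_features = in_features
        self.out_features = out_features

# Declare the config without creating the layer yet.
cfg = LazyCall(Linear)(in_features=768, out_features=3072)
# Configs stay editable as ordinary Python objects until build time.
cfg.kwargs["out_features"] = 1024
layer = cfg.build()
print(layer.out_features)  # 1024
```

Because nothing is constructed until `build()`, configs compose and override cleanly, which is what makes this style of configuration flexible.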
High Efficiency
Efficiency is a core focus of LiBai: built on OneFlow and combined with the parallelism and memory-saving techniques above, model training is optimized for both speed and resource utilization.
Installation and Usage
Users can quickly get started with LiBai by following the installation instructions provided. A dedicated guide on getting started is also available for basic usage scenarios.
Documentation
Comprehensive documentation is available, providing full API references and tutorials for users at all levels. This can be accessed through LiBai's documentation portal.
Recent Updates
The latest release, Beta 0.3.0, came out on March 11, 2024. This update adds support for mock transformers and model evaluation tools such as lm-evaluation-harness, along with general user-experience improvements. Newly supported models include BLOOM, ChatGLM, Couplets, DALLE2, Llama2, MAE, and Stable_Diffusion, each available under various parallel training configurations.
Community and Contributions
The LiBai project welcomes contributions from the community. Interested individuals can refer to the project's CONTRIBUTING guide for more information on how to get involved.
Licensing and Citation
LiBai is released under the Apache 2.0 license. Researchers utilizing LiBai in their work are encouraged to cite the project using the provided BibTeX entry.
Join the community and explore the full potential of large-scale model training with LiBai!