PyTorch Learning Rate Finder
Overview
The PyTorch Learning Rate Finder is a tool for determining an effective learning rate before training neural networks in PyTorch. It implements the learning rate range test described by Leslie N. Smith in his influential paper, Cyclical Learning Rates for Training Neural Networks, and also incorporates the modified version of the test adopted by the fastai library.
A fundamental part of training neural networks involves choosing the right learning rate, which can drastically affect the performance and speed of training. This tool assists by running a pre-training test where the learning rate is gradually increased, helping practitioners identify regions where the model begins to learn effectively.
What is a Learning Rate Range Test?
The learning rate range test helps identify a suitable learning rate by sweeping through a range of rates in a single pre-training run. The test starts with a learning rate low enough for the model to train stably, then increases it linearly or exponentially after each iteration until it becomes so large that the loss diverges. The region just before divergence often contains a good learning rate for training.
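To make the sweep concrete, here is a minimal sketch of the two schedules; the variable names and values are illustrative and are not the library's internals.

# Sweep the learning rate from start_lr to end_lr over num_iter iterations.
start_lr, end_lr, num_iter = 1e-7, 10, 100

# Exponential: equal multiplicative steps, so small rates are sampled densely.
exp_lrs = [start_lr * (end_lr / start_lr) ** (i / (num_iter - 1)) for i in range(num_iter)]

# Linear: equal additive steps, as in Smith's original formulation.
lin_lrs = [start_lr + (end_lr - start_lr) * i / (num_iter - 1) for i in range(num_iter)]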
In a typical plot from such a test, the loss first decreases and then blows up; picking a learning rate roughly halfway down the descending part of the curve often works well. For cyclical learning rates, the plot can also help determine the lower and upper bounds between which learning is effective.
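As a rough illustration of reading the curve programmatically, the sketch below returns the learning rate at the steepest downward slope of an exponentially smoothed loss curve. This is one common heuristic (the halfway-down rule above is another); the helper name steepest_lr and the smoothing constant are hypothetical, not part of the library.

import numpy as np

# Hypothetical helper: pick the rate where the smoothed loss falls fastest.
def steepest_lr(lrs, losses, beta=0.98):
    smoothed, avg = [], 0.0
    for i, loss in enumerate(losses):
        avg = beta * avg + (1 - beta) * loss
        smoothed.append(avg / (1 - beta ** (i + 1)))  # bias-corrected EMA
    grads = np.gradient(np.asarray(smoothed))
    return lrs[int(np.argmin(grads))]  # most negative slope = fastest descent

Called as steepest_lr(recorded_lrs, recorded_losses) on values recorded during a sweep, it gives a starting point that should still be sanity-checked against the plot.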
Key Features
- Installable Package: The library can be installed via pip for Python 3.5 and above (see the command after this list).
- Integration with Mixed Precision Training: For those interested in optimizing performance using mixed precision, the library supports both NVIDIA's apex and PyTorch's built-in torch.amp.
- Training Mode Variations: Users can choose between exponential and linear increment modes, as implemented in the methods inspired by fastai and Leslie Smith, respectively.
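The PyPI package name matches the module imported in the examples below; assuming a standard Python environment, installation is:

pip install torch-lr-finder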
Implementation and Usage
The PyTorch Learning Rate Finder simplifies integration into existing training setups. The library is used through an instance of LRFinder, which manages the learning rate tests. For example, an exponential learning rate test can be conducted as follows:
from torch import nn, optim
from torch_lr_finder import LRFinder

model = ...  # your PyTorch model
criterion = nn.CrossEntropyLoss()
# Start from a very low learning rate; the range test will raise it toward end_lr.
optimizer = optim.Adam(model.parameters(), lr=1e-7, weight_decay=1e-2)

lr_finder = LRFinder(model, optimizer, criterion, device="cuda")
lr_finder.range_test(trainloader, end_lr=100, num_iter=100)  # trainloader: your training DataLoader
lr_finder.plot()   # inspect the loss vs. learning rate curve
lr_finder.reset()  # restore the model and optimizer to their initial state
This snippet initializes your model and optimizer, sets up the LRFinder, runs the range test, and plots the results so you can identify a suitable learning rate; reset() then restores the model and optimizer to their initial states.
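The test can also be run with the linear schedule from Smith's paper, typically over a narrower range and, if desired, evaluated against a validation set for a smoother curve. The step_mode, val_loader, and log_lr keyword arguments below are taken from the project's README; treat them as assumptions to verify against the installed version.

# Linear (Leslie Smith-style) variant; evaluating on val_loader smooths the
# curve at the cost of extra compute per iteration.
lr_finder.range_test(trainloader, val_loader=val_loader, end_lr=1, num_iter=100, step_mode="linear")
lr_finder.plot(log_lr=False)  # a linear x-axis suits the linear schedule
lr_finder.reset()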
Additional Capabilities
- Gradient Accumulation: When hardware constraints rule out large batch sizes, users can accumulate gradients over several smaller batches to simulate a larger one (see the sketch after this list).
- Mixed Precision Training: On supported NVIDIA hardware, the library can run the test in mixed precision to accelerate it without significant accuracy loss, using either apex or PyTorch's built-in torch.amp.
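The sketch below combines both capabilities, reusing model, optimizer, criterion, and trainloader from the earlier example. The accumulation_steps argument to range_test and the amp_backend, amp_config, and grad_scaler constructor arguments are taken from the project's README for recent versions; treat the exact names as assumptions and check them against your installed release.

import torch
from torch_lr_finder import LRFinder

# Gradient accumulation: simulate a large batch by accumulating gradients
# over several real batches (accumulation_steps is assumed from the README).
desired_bs, real_bs = 32, 4
accumulation_steps = desired_bs // real_bs  # 8 small batches per optimizer step
lr_finder = LRFinder(model, optimizer, criterion, device="cuda")
lr_finder.range_test(trainloader, end_lr=10, num_iter=100,
                     accumulation_steps=accumulation_steps)
lr_finder.reset()

# Mixed precision via PyTorch's built-in AMP (amp_backend, amp_config, and
# grad_scaler are likewise assumed from the README and may vary by version).
amp_config = {"device_type": "cuda", "dtype": torch.float16}
grad_scaler = torch.cuda.amp.GradScaler()
lr_finder = LRFinder(model, optimizer, criterion, device="cuda",
                     amp_backend="torch", amp_config=amp_config,
                     grad_scaler=grad_scaler)
lr_finder.range_test(trainloader, end_lr=10, num_iter=100)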
Contribution
Community contributions to the PyTorch Learning Rate Finder are welcome. Interested developers should check out the project's contribution guidelines to get started.
This project provides a robust solution to one of the most common problems in training neural networks: discovering an effective learning rate, which is crucial for model performance and training efficiency.