Introduction to PyTorch Benchmarks
The PyTorch Benchmarks project is an open-source collection designed to evaluate the performance of the PyTorch framework. This set of benchmarks helps developers and researchers understand how well PyTorch performs across various machine learning models.
What Does It Include?
The project repository contains a directory, torchbenchmark/models, which hosts copies of well-known or representative workloads. These workloads have been adapted to:
- Expose a standardized API: every model presents the same interface, so benchmark drivers can run any of them in the same way.
- Optionally support backends: models can be run through backends such as torchinductor or torchscript.
- Include miniature data sets: small data sets for training and testing ship with the models, along with scripts to install necessary dependencies.
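To make the idea of a standardized API concrete, the following is a minimal sketch of how a driver can time any workload that follows a shared interface. ToyModel, its invoke method, and time_workload are hypothetical names chosen for illustration; they are not the classes or functions that TorchBench itself defines.

```python
import time


class ToyModel:
    """Hypothetical stand-in for a benchmark workload; not a TorchBench class."""

    def __init__(self, test: str, device: str, batch_size: int = 4):
        if test not in ("train", "eval"):
            raise ValueError(f"unknown test mode: {test}")
        self.test = test
        self.device = device
        self.batch_size = batch_size
        # A tiny in-memory "data set" so the workload needs no downloads.
        self.example_inputs = [float(i) for i in range(batch_size)]
        self.weight = 0.5

    def train(self) -> None:
        # One illustrative gradient step per input on a squared-output loss.
        for x in self.example_inputs:
            grad = 2.0 * (self.weight * x) * x
            self.weight -= 0.01 * grad

    def eval(self) -> list:
        return [self.weight * x for x in self.example_inputs]

    def invoke(self):
        # Dispatch on the configured mode so a driver needs one entry point.
        return self.train() if self.test == "train" else self.eval()


def time_workload(model_cls, test: str, device: str, repeats: int = 3) -> float:
    """Return the best wall-clock time in seconds over `repeats` runs."""
    model = model_cls(test=test, device=device)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        model.invoke()
        best = min(best, time.perf_counter() - start)
    return best
```

Because the driver only relies on the constructor arguments and invoke, any class with the same shape could be passed to time_workload, which is the property a shared interface is meant to provide.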
Installation
The benchmark suite is mostly self-contained, but PyTorch itself must be installed separately. Keeping the two apart lets you run the same benchmarks against different PyTorch versions.
Using Pre-built Packages
The benchmark supports Python versions 3.8 and above, with 3.11 being the recommended version. While Conda is optional, it is suggested for managing dependencies.
To set up Python 3.11 in a Conda environment:
# Using your current conda environment:
conda install -y python=3.11
# Or, creating a new conda environment:
conda create -n torchbenchmark python=3.11
conda activate torchbenchmark
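Once the environment is active, a quick check can confirm that the interpreter falls inside the supported range. This is a small sketch based only on the version numbers stated above; check_python is a hypothetical helper, not part of the benchmark suite.

```python
import sys

MIN_SUPPORTED = (3, 8)   # lowest version the text lists as supported
RECOMMENDED = (3, 11)    # version the text recommends


def check_python(version=None) -> str:
    """Classify a version as 'unsupported', 'supported', or 'recommended'.

    `version` is a (major, minor, ...) tuple; it defaults to the running
    interpreter's version.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor < MIN_SUPPORTED:
        return "unsupported"
    if major_minor == RECOMMENDED:
        return "recommended"
    return "supported"
```

Calling check_python() with no argument inspects the running interpreter, so it can be run directly inside the newly created environment.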
For those utilizing NVIDIA GPUs, CUDA versions 11.8 and 12.1 are supported, with CUDA 12.1 as the default:
conda install -y -c pytorch magma-cuda121
Next, install PyTorch and its associated libraries, torchvision and torchaudio, using either Conda or Pip. Do not mix the two package managers for these libraries:
# Using Conda
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch-nightly -c nvidia
# Using Pip
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
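The Pip command above embeds the CUDA version as the tag cu121. The sketch below shows how that tag and the index URL could be derived for either supported CUDA version. It assumes the CUDA 11.8 URL follows the same pattern as the 12.1 URL shown above; cuda_tag and nightly_pip_command are hypothetical helpers.

```python
SUPPORTED_CUDA = ("11.8", "12.1")  # versions named in the text above


def cuda_tag(cuda_version: str) -> str:
    """Turn a version such as '12.1' into a tag such as 'cu121'."""
    if cuda_version not in SUPPORTED_CUDA:
        raise ValueError(f"unsupported CUDA version: {cuda_version}")
    return "cu" + cuda_version.replace(".", "")


def nightly_pip_command(cuda_version: str = "12.1") -> list:
    """Build the argument list for the nightly Pip install.

    Assumption: every supported CUDA version uses the same URL layout as
    the cu121 example in the text.
    """
    index_url = f"https://download.pytorch.org/whl/nightly/{cuda_tag(cuda_version)}"
    return [
        "pip3", "install", "--pre",
        "torch", "torchvision", "torchaudio",
        "--index-url", index_url,
    ]
```

Returning an argument list rather than a single string keeps the command safe to pass to subprocess.run without shell quoting.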
You will need to clone the benchmark repository as it is intended to be installed from the source tree:
git clone https://github.com/pytorch/benchmark
cd benchmark
python3 install.py
Using Torchbench as a Library
For those interested in using torchbench as a library:
python3 install.py
pip install git+https://github.com/pytorch/benchmark.git
Or for editable installations:
python3 install.py
pip install -e .
Running Model Benchmarks
The PyTorch Benchmarks project provides several ways to execute model benchmarks:
- test.py: A basic wrapper script for checking that models execute correctly.
- test_bench.py: A pytest-benchmark script that collects benchmark statistics and supports filtering.
- userbenchmark: Facilitates customized benchmark development.
Commands in test.py allow for executing specific models, such as running the BERT model in training mode on a CPU:
python3 test.py -k "test_BERT_pytorch_train_cpu"
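The test name in the command above appears to follow the pattern test_<model>_<mode>_<device>. The sketch below builds such a filter and the matching command line. The pattern is inferred from this single example, so treat it as an assumption; build_test_filter and build_test_command are hypothetical helpers.

```python
MODES = ("train", "eval")    # modes mentioned in this document
DEVICES = ("cpu", "cuda")    # devices mentioned in this document


def build_test_filter(model: str, mode: str, device: str) -> str:
    """Compose a -k filter string, e.g. 'test_BERT_pytorch_train_cpu'."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    if device not in DEVICES:
        raise ValueError(f"device must be one of {DEVICES}")
    return f"test_{model}_{mode}_{device}"


def build_test_command(model: str, mode: str, device: str) -> list:
    """Return the argument list for running one model test via test.py."""
    return ["python3", "test.py", "-k", build_test_filter(model, mode, device)]
```

For example, build_test_command("BERT_pytorch", "train", "cpu") reproduces the arguments of the command shown above.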
Creating Customized Benchmarks
The userbenchmark feature supports creating your own benchmarks with TorchBench models, which can be run using the run_benchmark.py driver.
Debugging and Profiling
For simple debugging or profiling tasks, use run.py:
python3 run.py <model> [-d {cpu,cuda}] [-t {eval,train}] [--profile]
This command-line interface is meant for trial runs: -d selects the device (cpu or cuda), -t selects evaluation or training mode, and --profile turns on profiling.
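When run.py is called from another script, it helps to assemble and validate the arguments in one place. The sketch below builds the argument list from the usage line above without executing anything. build_run_command is a hypothetical helper, and the model name used in the usage note is assumed from the earlier test.py example.

```python
import shlex

DEVICES = ("cpu", "cuda")   # choices listed for -d in the usage line
TESTS = ("eval", "train")   # choices listed for -t in the usage line


def build_run_command(model, device=None, test=None, profile=False) -> list:
    """Return the argument list for run.py; optional flags are omitted when unset."""
    cmd = ["python3", "run.py", model]
    if device is not None:
        if device not in DEVICES:
            raise ValueError(f"device must be one of {DEVICES}")
        cmd += ["-d", device]
    if test is not None:
        if test not in TESTS:
            raise ValueError(f"test must be one of {TESTS}")
        cmd += ["-t", test]
    if profile:
        cmd.append("--profile")
    return cmd


def as_shell_string(cmd: list) -> str:
    """Render the argument list as a copy-pasteable shell command."""
    return shlex.join(cmd)
```

For instance, as_shell_string(build_run_command("BERT_pytorch", device="cpu", test="eval")) yields a command line that can be pasted into a shell at the repository root, and the same list can be handed to subprocess.run.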
Continual Improvement and Expansion
The project undergoes nightly CI runs on the latest PyTorch builds and integrates performance data into Meta's internal systems. Contributions are welcomed, particularly in adding new models and improving benchmarking support across different machine types.
Adding New Models
For those interested in expanding the benchmark suite, refer to the documentation on adding models within the repository.
By understanding PyTorch Benchmarks, developers can gain insights into the performance of different PyTorch versions and setups, ultimately optimizing their machine learning workflows.