Introduction to Augmentor
Augmentor is a Python library for image augmentation, aimed primarily at enlarging and diversifying datasets for machine learning. It is a standalone tool that gives users control over augmentation techniques relevant to real-world data. Augmentor takes a stochastic (randomly determined) approach built on pipelines: sequences of modular operations that are chained together to augment a dataset in a meaningful way.
Installation
Augmentor can be installed with pip. To install the library, run the following on the command line:
pip install Augmentor
For those who prefer building from source or need to upgrade from a previous version, instructions are available in the official documentation. Augmentor also supports multiple Python versions, ranging from 2.7 to 3.9.
Documentation
Comprehensive documentation for Augmentor can be accessed via Read the Docs: https://augmentor.readthedocs.io. This resource offers detailed usage guidelines and feature explanations to help users maximize Augmentor's capabilities.
Quick Start Guide and Usage
The primary goal of Augmentor is to automate image augmentation so that datasets can be expanded, especially for neural networks and deep learning applications. Augmentor is built around the concept of an augmentation pipeline, in which users define a sequence of operations, such as rotations or zooms, to be applied to images. Operations are added one at a time, and once the pipeline is configured, executing it produces a transformed dataset.
Creating a Pipeline
To start, users create a Pipeline object and point it at the folder containing their images:
import Augmentor
p = Augmentor.Pipeline("/path/to/images")
Subsequent operations, like rotating or zooming the images, can be added to this pipeline:
p.rotate(probability=0.7, max_left_rotation=10, max_right_rotation=10)
p.zoom(probability=0.5, min_factor=1.1, max_factor=1.5)
Every operation takes a probability parameter, which sets how likely the operation is to be applied to each image as it passes through the pipeline. After setting up the pipeline, users can generate augmented images:
p.sample(10000)
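This call generates the requested number of augmented images and stores them in the pipeline's output directory (by default, a subdirectory named output inside the source folder). Alternatively, p.process() passes each image through the pipeline exactly once, which is convenient for tasks such as resizing a dataset.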
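The difference between sampling and processing, and the role of the probability parameter, can be illustrated with a minimal pure-Python sketch. This is not Augmentor's implementation: the class TinyPipeline, its add method, and the use of strings as stand-ins for images are all hypothetical, and the sketch assumes that each operation is drawn independently and that sampling picks source images at random with replacement.

```python
import random


class TinyPipeline:
    """Hypothetical stand-in for a stochastic pipeline (not Augmentor's code)."""

    def __init__(self, images, seed=None):
        self.images = list(images)
        self.operations = []  # list of (function, probability) pairs
        self.rng = random.Random(seed)

    def add(self, operation, probability):
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must be between 0 and 1")
        self.operations.append((operation, probability))

    def _augment(self, image):
        # Walk the operations in order; each one fires independently.
        for operation, probability in self.operations:
            if self.rng.random() < probability:
                image = operation(image)
        return image

    def sample(self, n):
        # Draw n source images at random (with replacement) and augment each.
        return [self._augment(self.rng.choice(self.images)) for _ in range(n)]

    def process(self):
        # Pass every source image through the pipeline exactly once.
        return [self._augment(image) for image in self.images]


# Strings stand in for images; each operation just tags the name.
pipeline = TinyPipeline(["a", "b", "c"], seed=0)
pipeline.add(lambda name: name + "+rot", probability=0.7)
pipeline.add(lambda name: name + "+zoom", probability=0.5)

samples = pipeline.sample(10)   # 10 outputs, sources chosen at random
once = pipeline.process()       # exactly 3 outputs, one per source image
```

In this sketch, sample can return more outputs than there are source images, while process always returns one output per source image, in order.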
Advanced Features and Multi-threading
Augmentor supports multi-threading to speed up the image generation process. Users have the option to disable it if necessary:
p.sample(100, multi_threaded=False)
Augmentor can handle ground truth images and augment them in sync with the original data using the ground_truth() method. It also supports more complex scenarios, in which multiple images and masks are augmented together, through the DataPipeline class.
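What "synchronized" means here can be shown with a small sketch: a single random draw decides the transformation, and the same transformation is then applied to both the image and its mask. The code below is an illustration in plain Python, not Augmentor's implementation; the function names and the use of nested lists as images are assumptions made for the example.

```python
import random


def flip_horizontal(grid):
    """Return a new grid with every row reversed."""
    return [list(reversed(row)) for row in grid]


def augment_pair(image, mask, probability, rng):
    """Flip image and mask together, or leave both untouched."""
    # One random draw decides the outcome for both grids, so they stay aligned.
    if rng.random() < probability:
        return flip_horizontal(image), flip_horizontal(mask)
    return image, mask


image = [[1, 2, 3], [4, 5, 6]]
mask = [[0, 0, 1], [0, 1, 1]]

rng = random.Random(42)
pairs = [augment_pair(image, mask, 0.5, rng) for _ in range(20)]
```

Drawing separately for the image and the mask would let one be flipped without the other, which would leave the mask pointing at the wrong pixels.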
Integration with Machine Learning Frameworks
Augmentor integrates with popular machine learning libraries such as Keras and PyTorch. For Keras, it offers a generator that supplies batches of augmented images and their labels while a model is training:
g = p.keras_generator(batch_size=128)
images, labels = next(g)
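The idea behind such a generator can be sketched in plain Python: an endless loop picks a random batch, augments each image on the fly, and yields the images together with their labels. This is only a conceptual sketch and not how keras_generator is implemented; the function batch_generator, its parameters, and the integer stand-ins for images are hypothetical.

```python
import random


def batch_generator(images, labels, augment, batch_size, rng):
    """Yield (batch_images, batch_labels) forever, augmenting on the fly."""
    indices = range(len(images))
    while True:
        # Pick batch_size source indices at random (with replacement).
        chosen = [rng.choice(indices) for _ in range(batch_size)]
        batch_images = [augment(images[i]) for i in chosen]
        batch_labels = [labels[i] for i in chosen]
        yield batch_images, batch_labels


# Integers stand in for images; the "augmentation" simply adds 1.
source_images = [10, 20, 30]
source_labels = [0, 1, 0]
generator = batch_generator(
    source_images, source_labels, lambda x: x + 1, batch_size=4, rng=random.Random(7)
)
batch_images, batch_labels = next(generator)
```

Because the generator never terminates, the consumer decides how many batches to draw, which matches the next(g) usage pattern above.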
For PyTorch, the pipeline can be used as one of the transforms in a torchvision composition:
import torchvision
transforms = torchvision.transforms.Compose([
    p.torch_transform(),
    torchvision.transforms.ToTensor(),
])
Unique Augmentation Techniques
Augmentor implements advanced augmentation methods, like elastic distortions, perspective transforms, and random erasing, which help simulate real-world variations without altering image sizes or introducing unwanted padding.
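As an illustration of a size-preserving technique, the following sketch implements a simple form of random erasing on a grayscale image stored as nested lists: a randomly placed rectangle is overwritten with noise, and the output has the same dimensions as the input. It is a simplified stand-in and not Augmentor's implementation; the function name, the rectangle_area parameter (treated here as a fraction of the image area), and the way the rectangle's sides are derived are assumptions made for the example.

```python
import random


def random_erase(image, rectangle_area, rng):
    """Overwrite a randomly placed rectangle with noise; return a new grid."""
    if not 0.0 < rectangle_area <= 1.0:
        raise ValueError("rectangle_area must be in (0, 1]")
    height, width = len(image), len(image[0])
    # Scale both sides by sqrt(area fraction), keeping at least one pixel.
    rect_h = max(1, int(height * rectangle_area ** 0.5))
    rect_w = max(1, int(width * rectangle_area ** 0.5))
    # Choose the top-left corner so the rectangle stays inside the image.
    top = rng.randint(0, height - rect_h)
    left = rng.randint(0, width - rect_w)
    erased = [row[:] for row in image]  # copy so the input is not modified
    for y in range(top, top + rect_h):
        for x in range(left, left + rect_w):
            erased[y][x] = rng.randint(0, 255)
    return erased


blank = [[0] * 8 for _ in range(8)]
erased = random_erase(blank, rectangle_area=0.25, rng=random.Random(1))
```

The output grid is still 8 by 8, and only pixels inside the chosen 4 by 4 rectangle can differ from the input.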
Chaining Operations
The pipeline can combine various operations, producing diverse and rich datasets from a single image. This feature offers scalability and flexibility, crucial for developing robust machine learning models.
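The variety produced by chaining comes from the fact that each operation may or may not fire for a given image. Assuming the operations are drawn independently (an assumption of this sketch, as is the helper function itself), the probability of every combination can be computed directly:

```python
from itertools import product


def combination_probabilities(operations):
    """Map each subset of applied operations to its probability.

    `operations` maps an operation name to its probability; draws are
    assumed to be independent of one another.
    """
    names = list(operations)
    result = {}
    for applied in product([False, True], repeat=len(names)):
        probability = 1.0
        for name, is_applied in zip(names, applied):
            p = operations[name]
            probability *= p if is_applied else (1.0 - p)
        chosen = tuple(n for n, is_applied in zip(names, applied) if is_applied)
        result[chosen] = probability
    return result


table = combination_probabilities({"rotate": 0.7, "zoom": 0.5})
```

With probabilities of 0.7 for rotation and 0.5 for zoom, about 15% of outputs pass through unchanged, 35% are only rotated, 15% are only zoomed, and 35% receive both. Since the parameters of each operation (such as the rotation angle) are also random within their ranges, the outputs within each group still differ from one another.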
Tutorials and Examples
To aid understanding, Augmentor includes tutorial notebooks demonstrating its functionality and integration with deep learning frameworks, focusing on tasks such as per-class augmentation strategies and Keras generator functionality.
Licensing and Contributions
Augmentor is open-source and provided under the MIT License. Users and developers are encouraged to contribute and can find additional information about testing and CI processes on the project's official repository.
Overall, Augmentor offers a pipeline-based, probabilistic approach to image augmentation that helps both beginners and experienced developers build larger and more varied machine learning datasets.