What is PyLabel?
PyLabel is a Python package for preparing image datasets for computer vision frameworks and models such as PyTorch and YOLOv5. It translates bounding box annotations between formats, for example from COCO to YOLO, and includes an AI-assisted labeling tool that runs inside a Jupyter notebook.
Key Features
Translate
PyLabel converts dataset annotations from one format to another in a single line of code. For example, with importer imported from the pylabel package and path_to_annotations pointing to a COCO annotation file:
importer.ImportCoco(path_to_annotations).export.ExportToYoloV5()
This is particularly useful for data scientists and developers who work with several machine learning frameworks, each of which expects its own annotation format.
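To make the translation step concrete, here is a minimal sketch of the coordinate arithmetic behind a COCO-to-YOLO conversion for a single box. It is not PyLabel's implementation. The function name `coco_to_yolo` and the sample numbers are hypothetical, and it assumes the usual conventions: COCO boxes are `[x_min, y_min, width, height]` in pixels, while YOLO boxes are `(x_center, y_center, width, height)` normalized by the image size.

```python
def coco_to_yolo(bbox, img_width, img_height):
    """Convert one COCO box [x_min, y_min, width, height] (pixels)
    to a YOLO box (x_center, y_center, width, height) scaled to 0..1."""
    x_min, y_min, box_w, box_h = bbox
    # YOLO stores the box center, so shift the corner by half the box size.
    x_center = (x_min + box_w / 2) / img_width
    y_center = (y_min + box_h / 2) / img_height
    return (x_center, y_center, box_w / img_width, box_h / img_height)


# Hypothetical example: a 200x100 box at (100, 50) in a 400x200 image.
yolo_box = coco_to_yolo([100, 50, 200, 100], img_width=400, img_height=200)
print(yolo_box)  # (0.5, 0.5, 0.5, 0.5)
```

A full converter also has to remap category IDs and write one label file per image, which is the bookkeeping a library such as PyLabel takes care of.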
Analyze
PyLabel stores annotations in a pandas DataFrame, so users can analyze their image datasets with standard DataFrame operations. This helps with understanding how the data is structured, checking the class distribution, and assessing data quality before training or tuning a model.
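As a rough illustration of the kind of class-distribution check described above, the sketch below counts boxes and images per class. It uses plain Python as a stand-in for DataFrame operations so that it runs without extra dependencies. The rows and the field names `img_filename` and `cat_name` are assumptions made for this example.

```python
from collections import Counter

# Hypothetical annotation rows, one per bounding box.
# The field names are illustrative assumptions.
rows = [
    {"img_filename": "a.jpg", "cat_name": "car"},
    {"img_filename": "a.jpg", "cat_name": "car"},
    {"img_filename": "a.jpg", "cat_name": "person"},
    {"img_filename": "b.jpg", "cat_name": "car"},
    {"img_filename": "c.jpg", "cat_name": "car"},
]

# Number of bounding boxes per class.
class_counts = Counter(row["cat_name"] for row in rows)

# Number of distinct images that contain each class.
unique_pairs = {(row["img_filename"], row["cat_name"]) for row in rows}
images_per_class = Counter(cat for _, cat in unique_pairs)

print(class_counts["car"], class_counts["person"])          # 4 1
print(images_per_class["car"], images_per_class["person"])  # 3 1
```

On a DataFrame, the same summaries would typically come from a value count or a group-by on the class column.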
Split
PyLabel can split a dataset into train, test, and validation sets using stratification, which keeps the class distribution consistent across the subsets. A consistent distribution means each subset is representative of the whole dataset, which makes training and evaluation results more reliable.
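The idea of stratification can be sketched in a few lines. The example below is a simplified illustration, not PyLabel's algorithm. It assumes each image has exactly one class, whereas real object detection images often contain several, and the function name `stratified_split` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train=0.6, val=0.2, seed=0):
    """Split item ids into train/val/test, one class at a time.

    labels maps an item id to a single class name. Each class is shuffled
    and sliced separately, so every subset keeps roughly the same class mix.
    """
    by_class = defaultdict(list)
    for item_id, cls in labels.items():
        by_class[cls].append(item_id)

    rng = random.Random(seed)  # fixed seed makes the split reproducible
    splits = {"train": [], "val": [], "test": []}
    for cls in sorted(by_class):
        ids = sorted(by_class[cls])
        rng.shuffle(ids)
        n_train = round(len(ids) * train)
        n_val = round(len(ids) * val)
        splits["train"] += ids[:n_train]
        splits["val"] += ids[n_train:n_train + n_val]
        splits["test"] += ids[n_train + n_val:]  # remainder goes to test
    return splits


# Hypothetical dataset: 10 "cat" images and 5 "dog" images.
labels = {f"cat_{i}.jpg": "cat" for i in range(10)}
labels.update({f"dog_{i}.jpg": "dog" for i in range(5)})

splits = stratified_split(labels)
for name, ids in splits.items():
    print(name, len(ids))  # train 9, val 3, test 3
```

Every subset here keeps the 2:1 ratio of cats to dogs. A purely random split of 15 images could easily leave a subset with very few dogs or none.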
Label
PyLabel includes an image labeling tool that runs directly in a Jupyter notebook. The tool supports both manual and automated labeling; automated labeling uses pre-trained models to propose annotations and speed up the process.
Visualize
Viewing annotations on top of the images is a quick way to check that they are correct. PyLabel makes it easy to overlay bounding boxes on images from your dataset, so annotations can be validated and refined.
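Drawing a box on an image requires pixel coordinates. As a hedged sketch of that preparatory step, and not PyLabel's own code, the hypothetical helper below converts a normalized box in the YOLO convention `(x_center, y_center, width, height)` into the pixel corners that most drawing routines expect.

```python
def yolo_to_pixel_corners(box, img_width, img_height):
    """Convert a normalized (x_center, y_center, width, height) box
    to integer pixel corners (x_min, y_min, x_max, y_max)."""
    x_c, y_c, w, h = box
    x_min = round((x_c - w / 2) * img_width)
    y_min = round((y_c - h / 2) * img_height)
    x_max = round((x_c + w / 2) * img_width)
    y_max = round((y_c + h / 2) * img_height)
    # Clamp so a box that overshoots the border stays inside the image.
    return (max(x_min, 0), max(y_min, 0),
            min(x_max, img_width), min(y_max, img_height))


# Hypothetical example: a centered box covering half of a 400x200 image.
corners = yolo_to_pixel_corners((0.5, 0.5, 0.5, 0.5), 400, 200)
print(corners)  # (100, 50, 300, 150)
```

The resulting corners can be passed to the rectangle-drawing function of whichever imaging library is in use.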
Tutorial Notebooks
To see PyLabel in action, users can explore a collection of sample Jupyter notebooks which demonstrate various conversion processes and tools, such as:
- Convert COCO to YOLO
- Convert COCO to VOC
- Convert VOC to COCO
- Convert YOLO to COCO
- Convert YOLO to VOC
- Import a YOLO YAML File
- Splitting Image Datasets into Train, Test, Val
- Labeling Tool Demo with AI Assisted Labeling
More detailed documentation is available in the PyLabel documentation.
About PyLabel
PyLabel was created as a capstone project for the Master of Information and Data Science (MIDS) program at the UC Berkeley School of Information. The project was developed by Jeremy Fraenkel, Alex Heaton, and Derek Topper. Users are encouraged to give feedback, suggest improvements, or ask questions by creating an issue on the project's GitHub page. The team is committed to making PyLabel as useful and efficient as possible for all users.