Open3D-ML - Integration of Machine Learning with 3D Data Processing

Introduction to Open3D-ML

Open3D-ML is an extension of the Open3D library for 3D machine learning tasks. It builds on the core Open3D library and adds tools for processing 3D data in tasks such as semantic point cloud segmentation. It also provides pretrained models and pipelines for both training models and applying them in practice.

Installation

For Users

Open3D-ML has been part of the Open3D Python distribution since version 0.11. It is compatible with specific versions of popular machine learning frameworks: PyTorch (2.0.*) and TensorFlow (2.13.*), along with optional CUDA support (10.1 and 11.*) on Linux platforms.

To begin using Open3D-ML, users should ensure they have the latest version of pip and then install Open3D with the following commands:

pip install --upgrade pip
pip install open3d

To install a compatible ML framework, users can use the provided requirements files:

pip install -r requirements-tensorflow.txt  # For TensorFlow
pip install -r requirements-torch.txt       # For PyTorch
pip install -r requirements-torch-cuda.txt  # For PyTorch with CUDA on Linux

Users can test the installation with these commands:

python -c "import open3d.ml.torch as ml3d"  # For PyTorch
python -c "import open3d.ml.tf as ml3d"     # For TensorFlow
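The two import checks above can also be scripted. The following is a minimal sketch and not part of Open3D-ML; the helper name `probe_backends` is hypothetical. It tries to import each backend module and records whether the import succeeded, so a missing or mismatched framework shows up as a short message rather than a traceback.

```python
import importlib


def probe_backends(module_names=("open3d.ml.torch", "open3d.ml.tf")):
    """Try to import each module and return {name: (ok, detail)}.

    `probe_backends` is a hypothetical helper for illustration only.
    """
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = (True, "import succeeded")
        except Exception as exc:  # ImportError, or an error raised during import
            results[name] = (False, f"{type(exc).__name__}: {exc}")
    return results
```

Calling `probe_backends()` with no arguments checks both backends; only the one matching the installed framework is expected to succeed.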

For custom versions of frameworks or CUDA, users may consider building Open3D from source.

Getting Started

Reading a Dataset

Open3D-ML supports reading various common datasets. As an example, users can read and visualize the SemanticKITTI dataset. Here's a quick overview of how to get started:

import open3d.ml.torch as ml3d  # or import open3d.ml.tf as ml3d

dataset = ml3d.datasets.SemanticKITTI(dataset_path='/path/to/SemanticKITTI/')
all_split = dataset.get_split('all')

print(all_split.get_attr(0))  # Attribute of the first datum
print(all_split.get_data(0)['point'].shape)  # Shape of the first point cloud

vis = ml3d.vis.Visualizer()
vis.visualize_dataset(dataset, 'all', indices=range(100))  # Visualize first 100 frames
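To get a quick feel for a split before visualizing it, one can loop over a few frames. The sketch below relies only on the `get_attr` and `get_data` calls shown above, plus `len()` on the split. The helper `summarize_split` and the assumption that each attribute dictionary carries a `'name'` key are illustrative and not taken from the Open3D-ML documentation. A small stand-in split is included so that the snippet runs without the dataset on disk.

```python
def summarize_split(split, max_items=5):
    """Return (name, number_of_points) for the first few frames of a split.

    Works with any object exposing __len__, get_attr(i) and get_data(i),
    where get_data(i)['point'] is a sequence of points (e.g. an N x 3 array).
    """
    summary = []
    for i in range(min(len(split), max_items)):
        attr = split.get_attr(i)
        points = split.get_data(i)["point"]
        # Fall back to the index if no 'name' attribute is present (assumption).
        summary.append((attr.get("name", str(i)), len(points)))
    return summary


class ToySplit:
    """Hypothetical in-memory stand-in for a dataset split (not an Open3D-ML class)."""

    def __init__(self, clouds):
        self._clouds = clouds

    def __len__(self):
        return len(self._clouds)

    def get_attr(self, idx):
        return {"name": f"frame_{idx:06d}"}

    def get_data(self, idx):
        return {"point": self._clouds[idx]}


toy = ToySplit([[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [(0.0, 1.0, 2.0)]])
print(summarize_split(toy))
```

With a real dataset, the corresponding call would be `summarize_split(dataset.get_split('all'))`, assuming the split object supports `len()`.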

Model Configuration Loading

Configurations for models and datasets are stored in YAML files, which makes them easy to modify and reuse. Here's an example of loading a configuration file:

import open3d.ml as _ml3d
import open3d.ml.torch as ml3d  # or: import open3d.ml.tf as ml3d

framework = "torch"  # or "tf"; must match the import above
cfg_file = "ml3d/configs/randlanet_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

# Fetch the classes by the names given in the configuration
Pipeline = _ml3d.utils.get_module("pipeline", cfg.pipeline.name, framework)
Model = _ml3d.utils.get_module("model", cfg.model.name, framework)
Dataset = _ml3d.utils.get_module("dataset", cfg.dataset.name)

# Construct the instances from the arguments in the configuration
cfg.dataset['dataset_path'] = "/path/to/your/dataset"
dataset = Dataset(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
model = Model(**cfg.model)
pipeline = Pipeline(model, dataset, **cfg.pipeline)
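The pattern above, looking up classes by the names stored in the configuration and passing the remaining entries as keyword arguments, can be illustrated without Open3D-ML installed. The following is a simplified sketch: the `REGISTRY` dictionary, the `get_class` helper, and the `ToyDataset` class are hypothetical stand-ins, not the library's `get_module` implementation.

```python
class ToyDataset:
    """Hypothetical dataset class; accepts the path first, then options."""

    def __init__(self, dataset_path, name=None, cache_dir="./cache", **kwargs):
        self.dataset_path = dataset_path
        self.name = name
        self.cache_dir = cache_dir
        self.extra = kwargs


# Hypothetical registry mapping (kind, name) to a class.
REGISTRY = {("dataset", "ToyDataset"): ToyDataset}


def get_class(kind, name):
    """Look up a class by kind and name; raise a clear error if unknown."""
    try:
        return REGISTRY[(kind, name)]
    except KeyError:
        raise ValueError(f"unknown {kind}: {name!r}") from None


# A plain dict standing in for the 'dataset' section of a YAML config.
dataset_cfg = {"name": "ToyDataset", "cache_dir": "./logs/cache", "use_cache": False}
dataset_cfg["dataset_path"] = "/path/to/your/dataset"

Dataset = get_class("dataset", dataset_cfg["name"])
# pop() removes 'dataset_path' so it is not passed twice via **dataset_cfg.
dataset = Dataset(dataset_cfg.pop("dataset_path", None), **dataset_cfg)

print(dataset.dataset_path, dataset.cache_dir, dataset.extra)
```

The `pop` call matters: without it, `dataset_path` would be supplied both positionally and as a keyword argument, and Python would raise a `TypeError`.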

Semantic Segmentation and 3D Object Detection

Pretrained Model Inference and Training

Open3D-ML supports semantic segmentation and 3D object detection tasks. Users can run inference with pretrained models or train models from scratch, and a number of pretrained models and scripts are available for both. Here's how to set up a segmentation pipeline on a dataset for use with a pretrained model:

import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d

cfg_file = "ml3d/configs/randlanet_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.RandLANet(**cfg.model)
cfg.dataset['dataset_path'] = "/path/to/your/dataset"
dataset = ml3d.datasets.SemanticKITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)

pipeline = ml3d.pipelines.SemanticSegmentation(
    model, dataset=dataset, device="gpu", **cfg.pipeline
)

# Note: the steps for downloading the model weights, loading the checkpoint, running inference, and evaluating performance are omitted for brevity.

Users can also run similar operations for 3D object detection tasks.
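Evaluation for semantic segmentation is commonly reported as per-class intersection over union (IoU) and its mean. The snippet below is a generic, dependency-free sketch of that metric, offered as an assumption about what evaluating performance typically involves. It is not Open3D-ML's evaluation code, and the function names `per_class_iou` and `mean_iou` are hypothetical.

```python
def per_class_iou(pred_labels, true_labels, num_classes):
    """Return a list of per-class IoU values (None for classes absent from both)."""
    if len(pred_labels) != len(true_labels):
        raise ValueError("prediction and ground truth must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    true_count = [0] * num_classes
    for p, t in zip(pred_labels, true_labels):
        pred_count[p] += 1
        true_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + true_count[c] - intersection[c]
        ious.append(intersection[c] / union if union > 0 else None)
    return ious


def mean_iou(ious):
    """Average the IoU over classes that actually occur."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


# Toy labels for six points and three classes.
pred = [0, 0, 1, 1, 2, 2]
true = [0, 1, 1, 1, 2, 0]
ious = per_class_iou(pred, true, num_classes=3)
print(ious, mean_iou(ious))
```

In practice the predicted labels would come from the pipeline's inference output and the ground truth from the dataset; the lists here are toy values chosen only to exercise the function.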

Repository Structure

The Open3D-ML repository is organized into folders for documentation, example scripts, and the core library code (ml3d), which is integrated into Open3D's ML namespace.

Tasks and Algorithms

Open3D-ML covers tasks such as semantic segmentation and object detection. For these tasks, it provides performance tables and model weights for various datasets and models, so that users can compare models and evaluate their performance.

With this introduction, users should be equipped to explore Open3D-ML and apply its utilities and supported frameworks to their own 3D machine learning tasks.