Open3D-PointNet2-Semantic3D: An Overview
Introduction
The Open3D-PointNet2-Semantic3D project demonstrates semantic segmentation of 3D point cloud data using Open3D and PointNet++. It focuses on using Open3D within deep learning workflows and provides a baseline implementation for semantic segmentation on the Semantic3D dataset. The project also participates in the semantic-8 test benchmark, which showcases its performance.
Open3D is a versatile, open-source library for developing applications that work with 3D data. It offers both C++ and Python APIs: the frontend provides robust data structures and algorithms, while the highly optimized, parallelized backend makes it an efficient tool for handling large 3D datasets.
In this particular project, Open3D is pivotal for various tasks:
- Loading, writing, and visualizing point cloud data.
- Pre-processing steps like voxel-based downsampling.
- Quick nearest neighbor searches for label interpolation.
- Other essential geometry operations.
This project builds on the earlier work of Mathieu Orhan and Guillaume Dekeyser, and ultimately on the original PointNet2 by Charles R. Qi. Thanks to their foundational work, this repository extends those implementations to broader applications.
Project Workflow
Data Preparation
- Download the Dataset: Begin by obtaining the Semantic3D dataset. Once downloaded, extract the contents into the expected directory with a few simple shell commands.
- Convert Data Format: Run a Python script (`preprocess.py`) to convert the raw `.txt` files into more manageable `.pcd` files. Open3D processes `.pcd` files efficiently, allowing smoother operations afterward.
- Downsample the Data: Using another script (`downsample.py`), downsample the dataset. This phase outputs a refined set of data with unlabeled points (label 0) excluded, which reduces computational overhead.
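The idea behind the voxel downsampling step (provided in Open3D and used by the project's `downsample.py`) can be sketched in plain Python. The function below is a hypothetical, stdlib-only illustration, not the project's actual script: points falling into the same voxel are averaged, each voxel keeps its majority label, and unlabeled points (label 0) are dropped.

```python
from collections import defaultdict

def voxel_downsample(points, labels, voxel_size):
    """Average all points that fall into the same voxel; keep the most
    common label per voxel; drop unlabeled points (label 0)."""
    voxels = defaultdict(list)  # voxel grid index -> list of (point, label)
    for p, l in zip(points, labels):
        if l == 0:  # exclude unlabeled points, as the downsampling step does
            continue
        key = tuple(int(c // voxel_size) for c in p)
        voxels[key].append((p, l))
    out_points, out_labels = [], []
    for bucket in voxels.values():
        n = len(bucket)
        # Centroid of the points in this voxel
        out_points.append(tuple(sum(p[i] for p, _ in bucket) / n for i in range(3)))
        ls = [l for _, l in bucket]
        out_labels.append(max(set(ls), key=ls.count))  # majority label
    return out_points, out_labels

# Tiny example: two nearby points share a voxel; the unlabeled point is dropped.
pts = [(0.01, 0.0, 0.0), (0.02, 0.0, 0.0), (1.5, 0.0, 0.0)]
lbl = [1, 1, 0]
down_pts, down_lbl = voxel_downsample(pts, lbl, voxel_size=0.1)
```

In practice Open3D's own voxel downsampling should be used; this sketch only shows why the step shrinks the dataset while preserving labels.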
Model Training and Operations
- Compile TensorFlow Operations: The project requires compiling custom TensorFlow operations that depend on CUDA and CMake. Start by ensuring TensorFlow is correctly installed and the necessary virtual environments are activated, then build the TensorFlow operations and verify them through the provided scripts.
- Training the Model: Execute the training process through a predefined script (`train.py`). By default, the script splits the data into training and validation sets; optional flags alter the dataset split used during training sessions.
- Prediction: After training, select a checkpoint and invoke the prediction script (`predict.py`) to segment the available dataset. Since PointNet2 handles point clouds in fixed-size batches, multiple sampling trials improve point coverage for each scene.
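The reason for multiple sampling trials at prediction time can be illustrated with a short, stdlib-only sketch (the function and parameter names here are illustrative, not from `predict.py`): each trial draws a fixed-size batch of point indices, and more trials leave fewer points of the scene unvisited.

```python
import random

def sample_batches(num_points, batch_size, num_trials, seed=0):
    """Draw fixed-size random batches of point indices and report which
    points were covered by at least one batch."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    covered = set()
    batches = []
    for _ in range(num_trials):
        batch = rng.sample(range(num_points), batch_size)
        batches.append(batch)
        covered.update(batch)
    return batches, covered

# More trials -> a larger fraction of the scene's points gets a prediction.
batches, covered = sample_batches(num_points=1000, batch_size=128, num_trials=20)
coverage = len(covered) / 1000
```

With 20 trials of 128 points over a 1000-point scene, the expected coverage is already above 90%; any remaining gaps are filled by the interpolation step described below.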
Post-Processing and Submission
- Interpolate Results: After making sparse predictions, employ Open3D's K-nearest-neighbor (K-NN) hybrid search to interpolate them into dense predictions over the full-resolution point clouds.
- Prepare for Submission: For participants in the Semantic3D benchmark, a helper tool renames and organizes the output files to meet the submission criteria.
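The interpolation step amounts to a k-nearest-neighbor majority vote: each point of the full-resolution cloud takes the most common label among its k closest sparsely predicted points. The project uses Open3D's KDTree hybrid search to do this efficiently; the brute-force, stdlib-only version below (with hypothetical names) is only to illustrate the idea.

```python
from collections import Counter

def knn_interpolate_labels(sparse_points, sparse_labels, dense_points, k=3):
    """Assign each dense point the majority label among its k nearest
    sparse points (brute force; the project uses a KDTree instead)."""
    def dist2(a, b):
        # Squared Euclidean distance (no sqrt needed for ranking)
        return sum((x - y) ** 2 for x, y in zip(a, b))
    dense_labels = []
    for q in dense_points:
        nearest = sorted(range(len(sparse_points)),
                         key=lambda i: dist2(sparse_points[i], q))[:k]
        votes = Counter(sparse_labels[i] for i in nearest)
        dense_labels.append(votes.most_common(1)[0][0])
    return dense_labels

# Sparse predictions on 4 points; interpolate onto 2 dense query points.
sp = [(0, 0, 0), (0.1, 0, 0), (5, 5, 5), (5.1, 5, 5)]
sl = [2, 2, 7, 7]
dense = knn_interpolate_labels(sp, sl, [(0.05, 0, 0), (5.05, 5, 5)], k=2)
```

A KDTree replaces the O(n) scan per query with a logarithmic search, which matters when interpolating onto millions of points.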
Project Structure and Directories
- `dataset/semantic_raw`: Stores raw Semantic3D data in both `.txt` and `.labels` formats, along with the generated `.pcd` files.
- `dataset/semantic_downsampled`: Contains downsampled `.pcd` and `.labels` data.
- `result/sparse`: Outputs from the prediction stage; consists of sparse `.pcd` and `.labels` files.
- `result/dense`: Contains dense prediction `.labels` files, further visualized by label type in a separate folder (`dense_label_colorized`).
The Open3D-PointNet2-Semantic3D project illustrates a complete workflow for processing and segmenting 3D point cloud data, combining Open3D's functionality with PointNet++'s network architecture, and serves as a useful reference for research and development in 3D data processing.