Ultrasound Nerve Segmentation with TensorFlow
Overview
The ultrasound nerve segmentation project utilizes deep learning to address the challenge of segmenting nerves in ultrasound images. The project is based on the Kaggle competition for ultrasound nerve segmentation, and it employs the Keras library to develop a deep neural network designed to improve the accuracy of segmenting these complex medical images.
Data Handling
The data used in this project consists of ultrasound images processed by a Python script named `data.py`. This script converts the images into the `.npy` format, a binary file format optimized for fast retrieval and efficient processing in Python. The original images are resized to 64 by 80 pixels to standardize input sizes for the neural network, but they receive no additional pre-processing despite their noisy nature.
Model Architecture
The neural network model follows the architecture of a convolutional auto-encoder, with a twist borrowed from the U-Net design. The U-Net model is renowned for biomedical image segmentation and is characterized by its "skip connections": essential features extracted during the encoding phase are passed directly to the decoding layers, facilitating better reconstruction of the segmented output, which in this case is the mask overlay representing the nerve segments in the images. Each output mask is scaled to the 0-1 interval by a sigmoid activation function.
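The two ideas above can be illustrated without any deep-learning framework: a skip connection simply concatenates encoder feature maps with the upsampled decoder maps along the channel axis, and the sigmoid squashes each output pixel into [0, 1]. A NumPy sketch with illustrative shapes:

```python
import numpy as np

# Encoder feature map saved before pooling: (channels, height, width)
encoder_feat = np.random.rand(32, 16, 20)
# Decoder feature map after upsampling back to the same spatial size
decoder_feat = np.random.rand(32, 16, 20)

# The U-Net skip connection: concatenate along the channel axis, so the
# decoder sees both coarse context and fine encoder detail.
merged = np.concatenate([encoder_feat, decoder_feat], axis=0)  # (64, 16, 20)

# The final layer maps each pixel through a sigmoid, producing a soft
# nerve mask with values in (0, 1).
logits = np.random.randn(1, 64, 80)
mask = 1.0 / (1.0 + np.exp(-logits))
```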
Training
The model is trained for 20 epochs, with each epoch taking about 30 seconds on a Titan X GPU and consuming approximately 800 MB of memory. After training, a Dice coefficient of about 0.68 is achieved, reflecting the model's ability to predict nerve regions. However, this yields a leaderboard score of only 0.57, indicating potential overfitting. The loss function is the negative Dice coefficient, a custom function tailored to this task that directly measures pixel overlap between predicted and actual segmentations. The Adam optimizer is used during training with a learning rate of 1e-5, and model weights are saved in HDF5 format.
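The Dice coefficient measures the overlap between predicted and ground-truth masks, and negating it turns it into a minimizable loss. A NumPy sketch of the smoothed form commonly used with this loss (the smoothing constant of 1.0 is an assumption; it guards against division by zero on empty masks):

```python
import numpy as np

def dice_coef(y_true, y_pred, smooth=1.0):
    # 2*|A ∩ B| / (|A| + |B|), smoothed so empty masks do not divide by zero.
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

def dice_loss(y_true, y_pred):
    # Minimizing the negative Dice coefficient maximizes overlap.
    return -dice_coef(y_true, y_pred)

truth = np.array([[1, 1, 0, 0]], dtype=float)
pred = np.array([[1, 0, 0, 0]], dtype=float)
print(dice_coef(truth, pred))  # (2*1 + 1) / (2 + 1 + 1) = 0.75
```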
Practical Usage
To utilize this project effectively, certain dependencies need to be in place, namely OpenCV (cv2), Theano or TensorFlow, and Keras versions >= 1.0. The project supports Python versions from 2.7 to 3.5.
Data Preparation: Extract the raw images into a `raw` directory, split into `train` and `test` subdirectories, so that the `data.py` script can find them and convert them into `.npy` files.
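Assuming the Kaggle archives have been unpacked, the expected layout looks roughly like this (file names are illustrative):

```
raw/
├── train/
│   ├── 1_1.tif
│   ├── 1_1_mask.tif
│   └── ...
└── test/
    ├── 1.tif
    └── ...
```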
Model Definition: The model is defined in the `train.py` file through the `get_unet()` function, where users can tweak the architecture, optimizer, and loss function to suit their needs.
Training and Testing: Initiate the training process by running `python train.py`. Modify `train_predict()` if needed, for example to change the number of epochs or the batch size. The resulting test masks are saved in `imgs_mask_test.npy`.
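The saved predictions are soft probabilities in [0, 1]; a common post-processing step is to threshold them at 0.5 to obtain hard binary masks (a sketch; the array here stands in for `np.load("imgs_mask_test.npy")`, and the 0.5 threshold is an assumption):

```python
import numpy as np

# Stand-in for the saved predictions: (n_images, 1, 64, 80) probabilities.
probs = np.random.rand(5, 1, 64, 80)

# Hard 0/1 nerve masks via a fixed threshold.
binary_masks = (probs > 0.5).astype(np.uint8)
print(binary_masks.shape)
```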
Submission Creation: Generate the final submission file for evaluation with `python submission.py`, which produces a `submission.csv` based on the test masks.
About Keras
Keras serves as a foundational library for the project, offering a minimalist and highly modular framework for building neural networks. It supports both convolutional and recurrent networks, and is capable of functioning with either TensorFlow or Theano as its backend. Keras' simplicity aids rapid development and fast experimentation within the neural network domain.