InsightFace-REST: An Overview
InsightFace-REST is a sophisticated yet accessible tool that exposes face detection and recognition through a RESTful API. The project builds on FastAPI for serving and NVIDIA TensorRT for efficient inference. Inspired mainly by the InsightFace API code, InsightFace-REST is designed to simplify deployment and improve scalability for developers working on face recognition applications.
Key Features
- Seamless Deployment on GPU Systems: InsightFace-REST is crafted for NVIDIA GPU-enabled environments, utilizing Docker and NVIDIA-docker2 for streamlined deployment.
- Automatic Model Acquisition: On startup, the system can automatically fetch the required models from Google Drive, ensuring an up-to-date working environment.
- Efficient Performance: TensorRT integration delivers up to a 3x speedup over MXNet inference, with FP16 precision and batch processing for both face detection and recognition.
- Diverse Model Support: The tool supports both older models, such as the RetinaFace detectors and MXNet-based ArcFace, and newer ones, such as SCRFD and PyTorch-based models like 'glintr100' and 'w600k_r50'.
- Batch Inference and CPU Support: Batch processing is available for the SCRFD model family, and inference can also run on CPU via ONNX Runtime (a minimal sketch follows this list), broadening its usability.
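Where CPU inference is of interest, the same kind of ONNX models the service consumes can be exercised directly with ONNX Runtime. The sketch below is a minimal illustration only; the model filename and the 640x640 input resolution are assumptions, and real preprocessing (resizing, normalization) is omitted.

```python
# Minimal sketch: CPU inference with ONNX Runtime on an exported SCRFD-style detector.
# "scrfd_10g_bnkps.onnx" is a hypothetical local path; the 640x640 input size is an assumption.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("scrfd_10g_bnkps.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Dummy batch of one 640x640 image as float32; real preprocessing is omitted here.
blob = np.zeros((1, 3, 640, 640), dtype=np.float32)
outputs = session.run(None, {input_name: blob})
print([o.shape for o in outputs])
```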
Supported Models
InsightFace-REST supports a variety of models for both detection and recognition tasks:
Detection Models
These models include:
- RetinaFace detector variants, with automatic model downloading.
- SCRFD models, which offer batch inference and optimized processing times.
Recognition Models
A wide range of recognition models is supported, including:
- Various ArcFace models that balance speed and recognition accuracy.
- Newer models from the InsightFace repository, which are designed for efficient recognition.
Other Models
In addition to detection and recognition, InsightFace-REST supports models for auxiliary tasks such as gender and age prediction and mask detection.
Prerequisites
To run InsightFace-REST, users need:
- Docker and Nvidia-container-toolkit
- Compatible NVIDIA GPU drivers (version 470.x.x and above)
Running and Deployment
Deploying InsightFace-REST is straightforward with Docker:
- Clone the repository.
- Execute deploy_trt.sh to set up the environment; settings can be modified as needed.
- Access the API documentation and interface at http://localhost:18081.
For systems with multiple GPUs, parallel-processing settings can be adjusted for optimal performance. The service can also run without a GPU using the deploy_cpu.sh script, which relies on ONNX Runtime for inference.
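As a quick sanity check after deployment, you can confirm the service is reachable. The sketch below assumes the default port 18081 mentioned above and relies on FastAPI serving its OpenAPI schema at /openapi.json, which it does out of the box.

```python
# Minimal sketch: verify the service is up after the deployment script finishes.
import requests

resp = requests.get("http://localhost:18081/openapi.json", timeout=5)
resp.raise_for_status()
paths = sorted(resp.json().get("paths", {}))
print("Available endpoints:", paths)
```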
Using the API
InsightFace-REST offers a practical interface for API interactions, demonstrated through scripts such as demo_client.py in the repository, which showcases typical usage scenarios.
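For illustration, a hypothetical client call might look like the sketch below. The /extract endpoint name and the payload layout are assumptions modeled loosely on the style of demo_client.py; verify the exact schema against the Swagger UI at http://localhost:18081.

```python
# Minimal sketch of a client request; endpoint and payload shape are assumptions.
import base64
import requests

with open("face.jpg", "rb") as f:  # "face.jpg" is a hypothetical local image
    b64_image = base64.b64encode(f.read()).decode("ascii")

payload = {"images": {"data": [b64_image]}}
resp = requests.post("http://localhost:18081/extract", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```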
Future Work and Known Issues
Ongoing enhancements include:
- Developing examples for face indexing and searching with Milvus.
- Integrating Triton Inference Server for improved backend execution.
However, note some known issues, such as discrepancies in gender-age predictions when using the glintr100 model.
Updates and Improvements
InsightFace-REST continuously evolves, with significant updates enhancing functionality and performance, such as:
- Improved support for the SCRFD detection models.
- New additions such as a msgpack serializer to optimize data transfer (illustrated in the sketch after this list).
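To illustrate why a binary serializer helps, the sketch below compares the wire size of a 512-dimensional embedding encoded as JSON versus msgpack; the dimensionality and the use of the msgpack Python package are assumptions for demonstration. Binary packing avoids rendering every float as decimal text, which matters when many embeddings are returned per request.

```python
# Minimal sketch: JSON vs msgpack size for one face embedding (illustrative only).
import json
import random

import msgpack  # pip install msgpack

embedding = [random.random() for _ in range(512)]  # assumed 512-d embedding
as_json = json.dumps(embedding).encode("utf-8")
as_msgpack = msgpack.packb(embedding)
print(f"JSON: {len(as_json)} bytes, msgpack: {len(as_msgpack)} bytes")
```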
InsightFace-REST represents a powerful tool for developers and researchers seeking to harness the potential of modern face recognition technology in a convenient and efficient package.