InsightFace-REST: An Overview
InsightFace-REST is a sophisticated yet accessible tool that exposes face detection and recognition through a RESTful API. The project builds on FastAPI for serving and NVIDIA TensorRT for efficient inference. Inspired mainly by the InsightFace API code, InsightFace-REST is designed to simplify deployment and improve scalability for developers working on face recognition applications.
Key Features
- Seamless Deployment on GPU Systems: InsightFace-REST is crafted for NVIDIA GPU-enabled environments, utilizing Docker and NVIDIA-docker2 for streamlined deployment.
- Automatic Model Acquisition: On startup, the system can automatically fetch the required models from Google Drive, ensuring an up-to-date working environment.
- Efficient Performance: TensorRT integration delivers up to a 3x speedup over MXNet inference, with FP16 precision and batch processing for both face detection and recognition.
- Diverse Model Support: The tool supports both older models, such as the RetinaFace detectors and MXNet-based ArcFace, and newer ones, such as SCRFD and PyTorch-based models like 'glintr100' and 'w600k_r50'.
- Batch Inference and CPU Support: Batch processing is available for the SCRFD model family, and inference can also run on CPU via ONNX Runtime (a minimal sketch follows this list), broadening its usability.
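Where CPU inference is of interest, the same kind of ONNX models the service consumes can be exercised directly with ONNX Runtime. The sketch below is a minimal illustration only; the model filename and the 640x640 input resolution are assumptions, and real preprocessing (resizing, normalization) is omitted.

```python
# Minimal sketch: CPU inference with ONNX Runtime on an exported SCRFD-style detector.
# "scrfd_10g_bnkps.onnx" is a hypothetical local path; the 640x640 input size is an assumption.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("scrfd_10g_bnkps.onnx",
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Dummy batch of one 640x640 image as float32; real preprocessing is omitted here.
blob = np.zeros((1, 3, 640, 640), dtype=np.float32)
outputs = session.run(None, {input_name: blob})
print([o.shape for o in outputs])
```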
Supported Models
InsightFace-REST supports a variety of models for both detection and recognition tasks:
Detection Models
These models include:
- RetinaFace detector variants, with automatic model downloading.
- SCRFD models, which offer batch inference and optimized processing times.
Recognition Models
A wide range of recognition models is supported, including:
- Various ArcFace models that balance speed and recognition accuracy.
- Newer models from the InsightFace repository, which are designed for efficient recognition.
Other Models
In addition to detection and recognition, InsightFace-REST supports models for auxiliary tasks such as gender and age prediction and mask detection.
Prerequisites
To run InsightFace-REST, users need:
- Docker and Nvidia-container-toolkit
- Compatible NVIDIA GPU drivers (version 470.x.x and above)
Running and Deployment
Deploying InsightFace-REST is straightforward with Docker:
- Clone the repository.
- Execute deploy_trt.sh to set up the environment; settings can be modified as needed.
- Access the API documentation and interface at http://localhost:18081.
For systems with multiple GPUs, parallel-processing settings can be adjusted for optimal performance. The service can also run without a GPU using the deploy_cpu.sh script, which relies on ONNX Runtime for inference.
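As a quick sanity check after deployment, you can confirm the service is reachable. The sketch below assumes the default port 18081 mentioned above and relies on FastAPI serving its OpenAPI schema at /openapi.json, which it does out of the box.

```python
# Minimal sketch: verify the service is up after the deployment script finishes.
import requests

resp = requests.get("http://localhost:18081/openapi.json", timeout=5)
resp.raise_for_status()
paths = sorted(resp.json().get("paths", {}))
print("Available endpoints:", paths)
```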
Using the API
InsightFace-REST offers a practical interface for API interactions, demonstrated through scripts such as demo_client.py in the repository, which showcases typical usage scenarios.
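For illustration, a hypothetical client call might look like the sketch below. The /extract endpoint name and the payload layout are assumptions modeled loosely on the style of demo_client.py; verify the exact schema against the Swagger UI at http://localhost:18081.

```python
# Minimal sketch of a client request; endpoint and payload shape are assumptions.
import base64
import requests

with open("face.jpg", "rb") as f:  # "face.jpg" is a hypothetical local image
    b64_image = base64.b64encode(f.read()).decode("ascii")

payload = {"images": {"data": [b64_image]}}
resp = requests.post("http://localhost:18081/extract", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json())
```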
Future Work and Known Issues
Ongoing enhancements include:
- Developing examples for face indexing and searching with Milvus.
- Integrating Triton Inference Server for improved backend execution.
However, note some known issues, such as discrepancies in gender-age predictions when using the glintr100 model.
Updates and Improvements
InsightFace-REST continuously evolves, with significant updates enhancing functionality and performance, such as:
- Improved support for the SCRFD detection models.
- New additions such as a msgpack serializer to optimize data transfer (illustrated in the sketch after this list).
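To illustrate why a binary serializer helps, the sketch below compares the wire size of a 512-dimensional embedding encoded as JSON versus msgpack; the dimensionality and the use of the msgpack Python package are assumptions for demonstration. Binary packing avoids rendering every float as decimal text, which matters when many embeddings are returned per request.

```python
# Minimal sketch: JSON vs msgpack size for one face embedding (illustrative only).
import json
import random

import msgpack  # pip install msgpack

embedding = [random.random() for _ in range(512)]  # assumed 512-d embedding
as_json = json.dumps(embedding).encode("utf-8")
as_msgpack = msgpack.packb(embedding)
print(f"JSON: {len(as_json)} bytes, msgpack: {len(as_msgpack)} bytes")
```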
InsightFace-REST represents a powerful tool for developers and researchers seeking to harness the potential of modern face recognition technology in a convenient and efficient package.