Introduction to rknn-cpp-Multithreading
The rknn-cpp-Multithreading project is a C++ implementation, initially adapted from the rknpu2 repository. For those who prefer to deploy rapidly with Python, the rknn-multi-threaded project provides a quick deployment solution. This project uses a thread pool to run RKNN models asynchronously, which raises utilization of the RK3588/RK3588s NPU (Neural Processing Unit) and thereby improves the inference frame rate. In addition, the Yolov5s model uses the ReLU activation function to increase inference speed.
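To make the thread-pool idea concrete, here is a minimal sketch of a fixed-size pool that accepts tasks and returns futures. It is an illustration only, not the project's code: the class name `SimpleThreadPool` and its `submit` method are hypothetical, and a plain lambda stands in for an RKNN inference call.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size thread pool (not the project's actual class).
class SimpleThreadPool {
public:
    explicit SimpleThreadPool(std::size_t workerCount) {
        assert(workerCount > 0);
        for (std::size_t i = 0; i < workerCount; ++i) {
            workers_.emplace_back([this] { workerLoop(); });
        }
    }

    // Drain the remaining tasks, then stop and join every worker.
    ~SimpleThreadPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto& worker : workers_) worker.join();
    }

    // Queue a callable and return a future for its result.
    template <typename F>
    auto submit(F&& f) -> std::future<decltype(f())> {
        using R = decltype(f());
        auto task = std::make_shared<std::packaged_task<R()>>(std::forward<F>(f));
        std::future<R> result = task->get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            tasks_.emplace([task] { (*task)(); });
        }
        cv_.notify_one();
        return result;
    }

private:
    // Each worker sleeps until a task arrives or the pool shuts down.
    void workerLoop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
                if (stopping_ && tasks_.empty()) return;
                job = std::move(tasks_.front());
                tasks_.pop();
            }
            job();  // run outside the lock so other workers can proceed
        }
    }

    std::vector<std::thread> workers_;
    std::queue<std::function<void()>> tasks_;
    std::mutex mutex_;
    std::condition_variable cv_;
    bool stopping_ = false;
};
```

A caller would construct the pool once, call `submit` for each frame, and later call `get()` on the returned futures, so that several inferences can be in flight while the next frame is being read.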
Updates
The project has undergone several improvements:
- A fix was implemented for the CMake error related to finding pthread.
- A new `nosigmoid` branch was introduced, leveraging models from the rknn_model_zoo to achieve maximum performance boosts.
- The RK3588 NPU SDK was updated to the official mainline version 1.5.0. Meanwhile, yolov5s-silu continues to use the old model from version 1.4.0. yolov5s-relu has been updated to version 1.5.0, phasing out the `nosigmoid` branch.
- A new v1.5.0 branch was added (backward compatible with v1.4.0), and the main branch has been updated to v1.5.2. The project structure was modified by encapsulating the RKNN model thread pool into a class (`include/rknnPool.hpp`).
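As a rough illustration of wrapping a model pool in a class, the sketch below keeps several model instances, hands inputs to them round-robin, and returns results in submission order. Everything here is hypothetical (`ModelPool`, `put`, `get`, `DoublingModel`) and is not taken from `include/rknnPool.hpp`. For brevity it launches work with `std::async` rather than a real thread pool, and it assumes `put` and `get` are called from a single thread.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <utility>
#include <vector>

// Stand-in for a real model class; any type with a compatible infer() works.
struct DoublingModel {
    int infer(int x) { return 2 * x; }
};

// Hypothetical pool of model instances with ordered result retrieval.
template <typename Model, typename In, typename Out>
class ModelPool {
public:
    explicit ModelPool(std::size_t instances) {
        assert(instances > 0);
        for (std::size_t i = 0; i < instances; ++i) {
            slots_.push_back(std::make_shared<Slot>());
        }
    }

    // Start inference for one input on the next instance (round-robin).
    void put(In input) {
        std::shared_ptr<Slot> slot = slots_[next_];
        next_ = (next_ + 1) % slots_.size();
        pending_.push(std::async(
            std::launch::async,
            [slot, in = std::move(input)]() mutable -> Out {
                // Serialize access so one instance never runs two inputs at once.
                std::lock_guard<std::mutex> lock(slot->busy);
                return slot->model.infer(in);
            }));
    }

    // Wait for the oldest pending result; returns false if nothing is pending.
    bool get(Out& out) {
        if (pending_.empty()) return false;
        out = pending_.front().get();
        pending_.pop();
        return true;
    }

private:
    struct Slot {
        Model model;
        std::mutex busy;
    };

    std::vector<std::shared_ptr<Slot>> slots_;
    std::queue<std::future<Out>> pending_;
    std::size_t next_ = 0;
};
```

Because results are read from a FIFO queue of futures, frames come back in the order they were submitted even when a later frame finishes first, which matters when the output is displayed as video.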
Usage Instructions
Demonstration
To run the demo:
- Ensure OpenCV is installed on your system.
- Download the test video from the Releases section and place it in the project's root directory. Then execute `build-linux_RK3588.sh`.
- You can switch to the root user and run `performance.sh` to set a fixed frequency for improved performance and stability.
- Once the compilation is completed, navigate to the `install` directory and execute the command `./rknn_yolov5_demo <model_path> <video_path/camera_index>` to run the demo.
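The second argument can be either a video path or a camera index. One simple way a program could tell the two apart is to treat an all-digit argument as a camera index and anything else as a path. The sketch below shows that rule. Whether the demo uses this exact logic is an assumption, and `VideoSource` and `parseVideoSource` are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical description of where frames come from.
struct VideoSource {
    bool isCamera;
    int cameraIndex;   // meaningful only when isCamera is true
    std::string path;  // meaningful only when isCamera is false
};

// Treat an argument made only of digits as a camera index, otherwise as a path.
inline VideoSource parseVideoSource(const std::string& arg) {
    assert(!arg.empty());
    bool allDigits = true;
    for (unsigned char c : arg) {
        if (!std::isdigit(c)) {
            allDigits = false;
            break;
        }
    }
    if (allDigits) {
        return {true, std::stoi(arg), ""};
    }
    return {false, -1, arg};
}
```

With this rule, an argument of `0` selects camera 0, while an argument such as `./test.mp4` (a placeholder file name) is treated as a file path.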
Deploying Applications
Refer to the `rkYolov5s` class in `include/rkYolov5s.hpp` for constructing RKNN model classes.
Multithreaded Model Frame Rate Testing
Frame rates were measured after running the `performance.sh` script, which fixes the CPU/NPU frequencies so that results are consistent and measurement error is minimized. The tested model is listed in the table below. Test videos can be found on Bilibili.
Model \ Threads | 1 | 2 | 3 | 4 | 5 | 6 | 9 | 12 |
---|---|---|---|---|---|---|---|---|
Yolov5s - relu | 41.6044 | 71.6037 | 98.6057 | 98.0068 | 104.6001 | 114.7454 | 129.5693 | 140.8788 |
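
To read the table in terms of scaling, the frame rates can be converted to a speedup over the single-thread result and to a per-thread efficiency. The sketch below does that arithmetic with the values copied from the row above. The helper names (`Sample`, `reluSamples`, `speedup`, `efficiency`) are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One measurement: thread count and the frame rate reported in the table.
struct Sample {
    int threads;
    double fps;
};

// Values copied from the "Yolov5s - relu" row of the table.
inline std::vector<Sample> reluSamples() {
    return {{1, 41.6044},  {2, 71.6037},  {3, 98.6057},  {4, 98.0068},
            {5, 104.6001}, {6, 114.7454}, {9, 129.5693}, {12, 140.8788}};
}

// Speedup relative to a baseline measurement (here, the 1-thread result).
inline double speedup(const Sample& s, const Sample& baseline) {
    assert(baseline.fps > 0.0);
    return s.fps / baseline.fps;
}

// Speedup divided by the increase in thread count; 1.0 would be perfect scaling.
inline double efficiency(const Sample& s, const Sample& baseline) {
    assert(s.threads > 0);
    return speedup(s, baseline) * baseline.threads / s.threads;
}
```

With these numbers, 2 threads give roughly a 1.72x speedup, 3 threads roughly 2.37x, and 12 threads roughly 3.39x, so most of the gain comes from the first three threads and the efficiency per added thread drops steadily after that.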
Additional Notes
Currently, exception handling is not fully developed. The project only supports operation on RK3588/RK3588s.
Acknowledgements
This project is grateful for the contributions from the following repositories and organizations: