Robust Video Matting (RVM)
Robust Video Matting (RVM) is a model for high-quality video matting of human subjects. Unlike matting models that treat each video frame as an independent image, RVM uses a recurrent neural network (RNN) to carry memory of past frames through the sequence, which improves matting accuracy and temporal consistency.
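The idea of carrying memory across frames can be illustrated with a toy sketch. The code below is not RVM's network: `recurrent_step`, the blending weight `keep`, and the sample values are all hypothetical, and a simple exponential blend stands in for the learned recurrent state. It only shows why a state threaded from frame to frame tends to give a steadier alpha value than independent per-frame estimates.

```python
from typing import List, Optional, Tuple


def recurrent_step(observation: float, state: Optional[float],
                   keep: float = 0.7) -> Tuple[float, float]:
    """Blend the current per-frame estimate with the state from earlier frames.

    Hypothetical stand-in for a learned recurrent update: `keep` is the
    share of the previous state that is retained.
    """
    if state is None:  # first frame: nothing to remember yet
        new_state = observation
    else:
        new_state = keep * state + (1.0 - keep) * observation
    return new_state, new_state  # (output for this frame, state for the next)


def run_recurrent(observations: List[float]) -> List[float]:
    """Process frames in order, threading the state through every call."""
    state: Optional[float] = None
    outputs = []
    for obs in observations:
        alpha, state = recurrent_step(obs, state)
        outputs.append(alpha)
    return outputs


def flicker(values: List[float]) -> float:
    """Total absolute change between consecutive frames."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


# Made-up noisy alpha estimates for one pixel over six frames.
noisy_alpha = [0.9, 0.7, 0.95, 0.65, 0.9, 0.7]

independent = list(noisy_alpha)           # per-frame processing: no memory
with_memory = run_recurrent(noisy_alpha)  # recurrent processing: state carried over

print(f"flicker without memory: {flicker(independent):.3f}")
print(f"flicker with memory:    {flicker(with_memory):.3f}")
```

In this toy setting the sequence produced with memory changes far less from frame to frame than the independent estimates, which is the kind of temporal consistency a recurrent architecture aims for.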
Key Features
- Real-Time Matting: RVM runs in real time, processing 4K video at 76 frames per second (FPS) and HD video at 104 FPS on an Nvidia GTX 1080 Ti GPU.
- No Additional Inputs Required: The model doesn't need extra information, such as trimaps, to perform matting, simplifying the workflow.
- High-Resolution Capability: The model is designed to handle high-resolution video, including HD and 4K footage, which makes it suitable for professional applications.
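As a rough sanity check on the reported frame rates, the per-frame time budget and pixel throughput follow from simple arithmetic. The sketch assumes that 4K means 3840×2160 and HD means 1920×1080; the text does not state exact dimensions, and the helper names are illustrative.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def pixels_per_second(width: int, height: int, fps: float) -> float:
    """Pixels processed per second at a given resolution and frame rate."""
    return width * height * fps


# Assumed dimensions (not stated in the text): 4K = 3840x2160, HD = 1920x1080.
REPORTED = {
    "4K": (3840, 2160, 76),
    "HD": (1920, 1080, 104),
}

for name, (w, h, fps) in REPORTED.items():
    budget = frame_budget_ms(fps)
    throughput = pixels_per_second(w, h, fps) / 1e6
    print(f"{name}: {budget:.2f} ms per frame, {throughput:.0f} megapixels/s")
```

Under these assumptions, the model has roughly 13 ms to process each 4K frame and just under 10 ms for each HD frame.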
Recent Updates
- In November 2021, a bug in the training script was fixed.
- In September 2021, the code was re-released as open source under the GPL-3.0 license.
- In August 2021, both the source code and pretrained models were made publicly available.
- The project's research paper was accepted by the WACV 2022 conference in July 2021.
Demonstrations and Resources
Several demonstrations of RVM are available:
- Showreel Video: A showcase of the model's performance is available on YouTube and Bilibili.
- Webcam Demo: Users can try live video matting on a webcam feed directly in their browsers.
- Google Colab Demo: Users can test the model on their own videos using free GPU resources on Google Colab.
Download Options
For those interested in integrating RVM into their projects, a variety of models are offered across different frameworks:
- MobileNetV3 Models: Recommended for most use cases due to their balance between performance and resource usage.
- ResNet50 Models: Slightly larger with minor performance gains.
The models are distributed in the following formats:
- PyTorch: Official weights are available, making it easy to integrate into PyTorch-based projects.
- TorchHub: No manual download is needed, which makes this the simplest way to use the model.
- TorchScript, ONNX, TensorFlow, TensorFlow.js, and CoreML: Exported models are provided for deployment on other platforms, from mobile devices to web applications.
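Loading a model through TorchHub might look like the sketch below. The repository identifier `PeterL1n/RobustVideoMatting` and the entry-point names `mobilenetv3` and `resnet50` are assumptions based on the upstream project and do not appear in this text, so they should be checked against the repository before use. `load_rvm` is a hypothetical helper; `torch.hub.load` is PyTorch's standard hub loader. PyTorch is imported lazily so the helper can be defined, and its argument check exercised, without PyTorch installed.

```python
# Repository and entry-point names are assumed from the upstream project;
# verify them against the repository before relying on this sketch.
RVM_REPO = "PeterL1n/RobustVideoMatting"
RVM_VARIANTS = ("mobilenetv3", "resnet50")


def load_rvm(variant: str = "mobilenetv3"):
    """Fetch an RVM model via TorchHub (hypothetical convenience wrapper).

    Requires PyTorch and, on first use, network access to download the
    repository code and pretrained weights.
    """
    if variant not in RVM_VARIANTS:
        raise ValueError(
            f"unknown variant {variant!r}; expected one of {RVM_VARIANTS}"
        )

    import torch  # imported here so the check above works without PyTorch

    model = torch.hub.load(RVM_REPO, variant)
    return model.eval()  # switch to inference mode
```

With these assumptions, `load_rvm()` would return the MobileNetV3 variant and `load_rvm("resnet50")` the larger ResNet50 variant.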
RVM's documentation describes the inference process for each framework, so the model can be used in settings ranging from professional studio pipelines to exploratory personal projects.