RepViT
RepViT-SAM targets the computational cost of running the Segment Anything Model (SAM) on mobile devices by replacing SAM's heavyweight image encoder with a RepViT model, which speeds up segmentation on devices such as iPhones. With this change, RepViT-SAM retains strong zero-shot transfer performance while running inference up to ten times faster. The RepViT family brings design ideas from lightweight ViTs into a CNN architecture, reaching over 80% top-1 accuracy on ImageNet while maintaining low latency on mobile hardware.
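The encoder swap described above can be pictured with a toy sketch. Everything in it is an assumption made for illustration: `heavy_encoder`, `light_encoder`, `mask_decoder`, and `Segmenter` are hypothetical stand-ins, not the RepViT-SAM or SAM API, and the "encoders" are trivial arithmetic rather than neural networks. It only shows the structural idea that the rest of the pipeline can stay unchanged as long as the replacement encoder produces an embedding of the same shape.

```python
from dataclasses import dataclass
from typing import Callable, List

# All names here are hypothetical and for illustration only.
Image = List[List[float]]   # H x W grayscale image
Embedding = List[float]     # one feature value per image row


def heavy_encoder(image: Image) -> Embedding:
    """Stand-in for a large image encoder: averages every pixel in each row."""
    return [sum(row) / len(row) for row in image]


def light_encoder(image: Image) -> Embedding:
    """Stand-in for a lightweight encoder: averages every other pixel,
    doing less work but returning an embedding of the same length."""
    return [sum(row[::2]) / len(row[::2]) for row in image]


def mask_decoder(embedding: Embedding, threshold: float) -> List[bool]:
    """Stand-in for the mask decoder: thresholds each feature value."""
    return [value > threshold for value in embedding]


@dataclass
class Segmenter:
    """Pipeline whose image encoder is a pluggable component."""
    image_encoder: Callable[[Image], Embedding]

    def segment(self, image: Image, threshold: float = 0.5) -> List[bool]:
        return mask_decoder(self.image_encoder(image), threshold)


image = [[0.0, 0.0, 0.0, 0.0],
         [1.0, 1.0, 1.0, 1.0]]

baseline = Segmenter(image_encoder=heavy_encoder)
fast = Segmenter(image_encoder=light_encoder)  # only the encoder changes

baseline_mask = baseline.segment(image)
fast_mask = fast.segment(image)
```

On this toy input both pipelines produce the same mask; in a real system the quality and latency of the swapped-in encoder would have to be measured on the target device.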