Introduction to EdgeSAM
EdgeSAM is an advancement in machine learning designed specifically for edge devices. It builds upon the Segment Anything Model (SAM) to offer a solution that is significantly faster and more efficient without sacrificing performance: it runs 40 times faster than the original SAM and 14 times faster than MobileSAM on edge devices, while improving mean Intersection over Union (mIoU) scores on datasets such as COCO and LVIS by notable margins.
Innovating Performance with EdgeSAM
EdgeSAM operates at over 30 frames per second (FPS) on devices like the iPhone 14, making it the first SAM variant to reach real-time performance on mobile hardware. The key technique behind EdgeSAM is distilling the SAM image encoder from a ViT-based architecture into a CNN-based one better suited to edge-device constraints. This adaptation, together with involving the prompt encoder and mask decoder in the distillation process, preserves accurate interaction between user prompts and output masks.
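To make the distillation idea concrete, here is a minimal sketch of a feature-level distillation objective between a teacher (ViT) and student (CNN) encoder. The `encoder_distill_loss` function and the random stand-in features are illustrative assumptions, not EdgeSAM's actual training code, which additionally loops prompts through the mask decoder:

```python
import numpy as np

def encoder_distill_loss(teacher_feat: np.ndarray, student_feat: np.ndarray) -> float:
    """Mean-squared error between teacher (ViT) and student (CNN) feature maps.

    Both tensors are assumed to share the same (C, H, W) shape, e.g. the
    256x64x64 image embedding that SAM's mask decoder consumes.
    """
    assert teacher_feat.shape == student_feat.shape
    return float(np.mean((teacher_feat - student_feat) ** 2))

# Toy example: random arrays stand in for real encoder outputs.
rng = np.random.default_rng(0)
t = rng.standard_normal((256, 64, 64))
s = rng.standard_normal((256, 64, 64))
loss = encoder_distill_loss(t, s)
```

Minimizing such a loss pushes the compact student encoder to reproduce the teacher's embeddings, so the downstream decoder behaves as if it were still attached to the original ViT.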
Achievements in Speed and Accuracy
The goal of EdgeSAM is to distill the extensive capabilities of the original SAM into a model that is far faster and viable for edge deployment. In terms of speed and resource efficiency, it uses far fewer computations (measured in FLOPs) while maintaining a compact model size: it requires only about 22.1 GFLOPs compared with SAM's 2734.8 GFLOPs, and occupies far less memory at just 9.6 M parameters.
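The figures above imply a compute reduction of roughly two orders of magnitude; a quick check using only the numbers quoted in this section:

```python
sam_gflops = 2734.8    # original SAM
edgesam_gflops = 22.1  # EdgeSAM

ratio = sam_gflops / edgesam_gflops
print(f"EdgeSAM uses roughly {ratio:.0f}x fewer FLOPs than SAM")  # ~124x
```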
Getting Started with EdgeSAM
To utilize EdgeSAM, users can access the project through several platforms. There's a user-friendly iOS app called CutCha that demonstrates the power of EdgeSAM in real-time image processing. For developers, the EdgeSAM repository provides the tools and instructions necessary for installation and deployment. The software is designed to integrate seamlessly into Python environments, allowing for easy model prediction from input images.
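As a concrete glimpse of the prediction pipeline, SAM-family predictors first rescale the input image so its longest side matches the encoder's fixed resolution (1024 px in SAM). Below is a self-contained sketch of that resizing rule; the function name is illustrative, and the EdgeSAM repository provides its own equivalent utilities:

```python
def get_preprocess_shape(height: int, width: int, long_side: int = 1024) -> tuple[int, int]:
    """Return (new_height, new_width) that scales the longest image side to
    `long_side` while preserving aspect ratio, as SAM-style encoders expect."""
    scale = long_side / max(height, width)
    return int(round(height * scale)), int(round(width * scale))

# A 1200x800 photo would be rescaled to 1024x683 before encoding.
print(get_preprocess_shape(1200, 800))
```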
Testing and Deployment
EdgeSAM can be tested through a web demo that can be set up on a local instance or accessed via Hugging Face Space. This demo provides an interactive environment to experience EdgeSAM's capabilities firsthand. ONNX export is also supported, which can further reduce inference latency.
Export and Checkpoints
For those interested in deploying EdgeSAM in various environments, support is available for exporting the models to formats like CoreML and ONNX, suitable for integration into iOS applications or other systems. Detailed instructions and the necessary checkpoints for different model sizes and their respective performance benchmarks are available.
Appreciating Contributions
The development of EdgeSAM has received support from industry partnerships and builds on several foundational projects, including SAM, MobileSAM, and FastSAM. This collaborative spirit underpins the ongoing advancements in edge computing models.
EdgeSAM represents a leap forward in making advanced machine learning models more accessible and practical for a broader range of devices, reflecting a strong commitment to enhancing both speed and functionality.