Meet YOLO26: next-gen vision AI.

MobileSAM lightweight image segmentation model logo

Link to this sectionMobile Segment Anything (MobileSAM)#

MobileSAM is a compact, efficient image segmentation model purpose-built for mobile and edge devices. Designed to bring the power of Meta's Segment Anything Model (SAM) to environments with limited compute, MobileSAM delivers near-instant segmentation while maintaining compatibility with the original SAM pipeline. Whether you're developing real-time applications or lightweight deployments, MobileSAM provides impressive segmentation results with a fraction of the size and speed requirements of its predecessors.



Watch: How to Run Inference with MobileSAM using Ultralytics | Step-by-Step Guide 🎉

MobileSAM has been adopted in a variety of projects, including Grounding-SAM, AnyLabeling, and Segment Anything in 3D.

MobileSAM was trained on a single GPU using a 100k image dataset (1% of the original images) in less than a day. The training code will be released in the future.

Link to this sectionAvailable Models, Supported Tasks, and Operating Modes#

The table below outlines the available MobileSAM model, its pretrained weights, supported tasks, and compatibility with different operating modes such as Inference, Validation, Training, and Export. Supported modes are indicated by ✅ and unsupported modes by ❌.

Model TypePretrained WeightsTasks SupportedInferenceValidationTrainingExport
MobileSAMmobile_sam.ptInstance Segmentation

Link to this sectionMobileSAM Comparison vs YOLO#

The following comparison highlights the differences between Meta's SAM variants, MobileSAM, and Ultralytics segmentation models including YOLO26n-seg:

ModelSize
(MB)
Parameters
(M)
Speed (CPU)
(ms/im)
Meta SAM-b37593.741703
Meta SAM2-b16280.828867
Meta SAM2-t78.138.923430
MobileSAM40.710.123802
FastSAM-s with YOLOv8 backbone23.911.858.0
Ultralytics YOLOv8n-seg7.1 (11.0x smaller)3.4 (11.4x less)24.8 (945x faster)
Ultralytics YOLO11n-seg6.2 (12.6x smaller)2.9 (13.4x less)24.3 (964x faster)
Ultralytics YOLO26n-seg6.7 (11.7x smaller)2.7 (14.4x less)25.2 (930x faster)

This comparison demonstrates the substantial differences in model size and speed between SAM variants and YOLO segmentation models. While SAM models offer unique automatic segmentation capabilities, YOLO models—especially YOLOv8n-seg, YOLO11n-seg and YOLO26n-seg—are significantly smaller, faster, and more computationally efficient.

SAM speeds measured with PyTorch, YOLO speeds measured with ONNX Runtime. Tests run on a 2025 Apple M4 Air with 16GB of RAM using torch==2.10.0, ultralytics==8.4.31, and onnxruntime==1.24.4. To reproduce these results:

Example
from ultralytics import ASSETS, SAM, YOLO, FastSAM

# Profile SAM2-t, SAM2-b, SAM-b, MobileSAM
for file in ["sam_b.pt", "sam2_b.pt", "sam2_t.pt", "mobile_sam.pt"]:
    model = SAM(file)
    model.info()
    model(ASSETS)

# Profile FastSAM-s
model = FastSAM("FastSAM-s.pt")
model.info()
model(ASSETS)

# Profile YOLO models (ONNX)
for file_name in ["yolov8n-seg.pt", "yolo11n-seg.pt", "yolo26n-seg.pt"]:
    model = YOLO(file_name)
    model.info()
    onnx_path = model.export(format="onnx", dynamic=True)
    model = YOLO(onnx_path)
    model(ASSETS)

Link to this sectionAdapting from SAM to MobileSAM#

MobileSAM retains the same pipeline as the original SAM, including pre-processing, post-processing, and all interfaces. This means you can transition from SAM to MobileSAM with minimal changes to your workflow.

The key difference is the image encoder: MobileSAM replaces the original ViT-H encoder (637M parameters) with a much smaller Tiny-ViT encoder (5M parameters). On a single GPU, MobileSAM processes an image in about 12ms (8ms for the encoder, 4ms for the mask decoder).

Link to this sectionViT-Based Image Encoder Comparison#

Image EncoderOriginal SAMMobileSAM
Parameters637M5M
Speed452ms8ms

Link to this sectionPrompt-Guided Mask Decoder#

Mask DecoderOriginal SAMMobileSAM
Parameters3.876M3.876M
Speed4ms4ms

Link to this sectionWhole Pipeline Comparison#

Whole Pipeline (Enc+Dec)Original SAMMobileSAM
Parameters641M9.66M
Speed456ms12ms

The performance of MobileSAM and the original SAM is illustrated below using both point and box prompts.

Image with Point as Prompt

Image with Box as Prompt

MobileSAM is approximately 7 times smaller and 5 times faster than FastSAM. For further details, visit the MobileSAM project page.

Link to this sectionTesting MobileSAM in Ultralytics#

Just like the original SAM, Ultralytics provides a simple interface for testing MobileSAM, supporting both Point and Box prompts.

Link to this sectionModel Download#

Download the MobileSAM pretrained weights from Ultralytics assets.

Link to this sectionPoint Prompt#

Example
from ultralytics import SAM

# Load the model
model = SAM("mobile_sam.pt")

# Predict a segment based on a single point prompt
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])

# Predict multiple segments based on multiple points prompt
model.predict("ultralytics/assets/zidane.jpg", points=[[400, 370], [900, 370]], labels=[1, 1])

# Predict a segment based on multiple points prompt per object
model.predict("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 1]])

# Predict a segment using both positive and negative prompts.
model.predict("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 0]])

Link to this sectionBox Prompt#

Example
from ultralytics import SAM

# Load the model
model = SAM("mobile_sam.pt")

# Predict a segment based on a single box prompt
model.predict("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])

# Predict multiple segments based on multiple box prompts
model.predict("ultralytics/assets/zidane.jpg", bboxes=[[439, 437, 524, 709], [114, 196, 313, 708]])

Both MobileSAM and SAM share the same API. For more usage details, see the SAM documentation.

Link to this sectionAutomatically Build Segmentation Datasets Using a Detection Model#

To automatically annotate your dataset with the Ultralytics framework, use the auto_annotate function as shown below:

Example
from ultralytics.data.annotator import auto_annotate

auto_annotate(data="path/to/images", det_model="yolo26x.pt", sam_model="mobile_sam.pt")
ArgumentTypeDefaultDescription
datastrrequiredPath to directory containing target images for annotation or segmentation.
det_modelstr'yolo26x.pt'YOLO detection model path for initial object detection.
sam_modelstr'sam_b.pt'SAM model path for segmentation (supports SAM, SAM 2, MobileSAM, and SAM 3 weights).
devicestr''Computation device (e.g., 'cuda:0', 'cpu', or '' for automatic device detection).
conffloat0.25YOLO detection confidence threshold for filtering weak detections.
ioufloat0.45IoU threshold for Non-Maximum Suppression to filter overlapping boxes.
imgszint640Input size for resizing images (must be multiple of 32).
max_detint300Maximum number of detections per image for memory efficiency.
classeslist[int]NoneList of class indices to detect (e.g., [0, 1] for person & bicycle).
output_dirstrNoneSave directory for annotations (default: <data>_auto_annotate_labels sibling).

Link to this sectionCitations and Acknowledgments#

If MobileSAM is helpful in your research or development, please consider citing the following paper:

Quote
@article{mobile_sam,
  title={Faster Segment Anything: Towards Lightweight SAM for Mobile Applications},
  author={Zhang, Chaoning and Han, Dongshen and Qiao, Yu and Kim, Jung Uk and Bae, Sung Ho and Lee, Seungkyu and Hong, Choong Seon},
  journal={arXiv preprint arXiv:2306.14289},
  year={2023}
}

Read the full MobileSAM paper on arXiv.

Link to this sectionFAQ#

Link to this sectionWhat Is MobileSAM and How Does It Differ from the Original SAM Model?#

MobileSAM is a lightweight, fast image segmentation model optimized for mobile and edge applications. It maintains the same pipeline as the original SAM but replaces the large ViT-H encoder (637M parameters) with a compact Tiny-ViT encoder (5M parameters). This results in MobileSAM being about 5 times smaller and 7 times faster than the original SAM, operating at roughly 12ms per image versus SAM's 456ms. Explore more about MobileSAM's implementation on the MobileSAM GitHub repository.

Link to this sectionHow Can I Test MobileSAM Using Ultralytics?#

Testing MobileSAM in Ultralytics is straightforward. You can use Point and Box prompts to predict segments. For example, using a Point prompt:

from ultralytics import SAM

# Load the model
model = SAM("mobile_sam.pt")

# Predict a segment based on a point prompt
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])

For more details, see the Testing MobileSAM in Ultralytics section.

Link to this sectionWhy Should I Use MobileSAM for My Mobile Application?#

MobileSAM is ideal for mobile and edge applications due to its lightweight design and rapid inference speed. Compared to the original SAM, MobileSAM is about 5 times smaller and 7 times faster, making it suitable for real-time segmentation on devices with limited computational resources. Its efficiency enables mobile devices to perform real-time image segmentation without significant latency. Additionally, MobileSAM supports Inference mode optimized for mobile performance.

Link to this sectionHow Was MobileSAM Trained, and Is the Training Code Available?#

MobileSAM was trained on a single GPU with a 100k image dataset (1% of the original images) in under a day. While the training code will be released in the future, you can currently access pretrained weights and implementation details from the MobileSAM GitHub repository.

Link to this sectionWhat Are the Primary Use Cases for MobileSAM?#

MobileSAM is designed for fast, efficient image segmentation in mobile and edge environments. Primary use cases include:

  • Real-time object detection and segmentation for mobile apps
  • Low-latency image processing on devices with limited compute
  • Integration in AI-powered mobile applications for augmented reality (AR), analytics, and more

For more details on use cases and performance, see Adapting from SAM to MobileSAM and the Ultralytics blog on MobileSAM applications.

Comments