Segment Anything Model (SAM)
Welcome to the frontier of image segmentation with the Segment Anything Model, or SAM. This model introduced promptable image segmentation with real-time mask computation and set a new standard in the field.
Introduction to SAM: The Segment Anything Model
The Segment Anything Model, or SAM, is a state-of-the-art image segmentation model that supports promptable segmentation, which makes it highly versatile for image analysis tasks. SAM is the core of the Segment Anything initiative, a project that introduces a new model, task, and dataset for image segmentation.
SAM's design lets it adapt to new image distributions and tasks without prior knowledge, a capability known as zero-shot transfer. Trained on the large SA-1B dataset, which contains more than 1 billion masks spread over 11 million carefully curated images, SAM has shown impressive zero-shot performance, in many cases surpassing earlier fully supervised results.
SA-1B example images. Dataset images overlaid with masks from the newly introduced SA-1B dataset. SA-1B contains 11M diverse, high-resolution, licensed, and privacy-protecting images and 1.1B high-quality segmentation masks. The masks were annotated fully automatically by SAM and, as verified by human ratings and numerous experiments, are of high quality and diversity. Images are grouped by the number of masks per image for visualization (there are about 100 masks per image on average).
Key Features of the Segment Anything Model (SAM)
- Promptable Segmentation Task: SAM was designed with a promptable segmentation task in mind, so it can generate valid segmentation masks from any prompt, such as spatial or text clues that identify an object.
- Advanced Architecture: The Segment Anything Model uses a powerful image encoder, a prompt encoder, and a lightweight mask decoder. This architecture enables flexible prompting, real-time mask computation, and ambiguity awareness in segmentation tasks.
- The SA-1B Dataset: Introduced by the Segment Anything project, the SA-1B dataset features over 1 billion masks on 11 million images. As the largest segmentation dataset to date, it provides SAM with a diverse, large-scale source of training data.
- Zero-Shot Performance: SAM shows strong zero-shot performance across a range of segmentation tasks, which makes it a ready-to-use tool for diverse applications with minimal need for prompt engineering.
For more details on the Segment Anything Model and the SA-1B dataset, see the Segment Anything website.
Available Models, Supported Tasks, and Operating Modes
This table lists the available models with their specific pre-trained weights, the tasks they support, and their compatibility with the different operating modes: Inference, Validation, Training, and Export.
Model Type | Pre-trained Weights | Tasks Supported | Inference | Validation | Training | Export |
---|---|---|---|---|---|---|
SAM base | sam_b.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM large | sam_l.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM: Versatility and Power in Image Segmentation
The Segment Anything Model can be used for many downstream tasks that go beyond its training data, including edge detection, object proposal generation, instance segmentation, and preliminary text-to-mask prediction. With prompt engineering, SAM can adapt quickly to new tasks and data distributions in a zero-shot manner, which makes it a versatile and powerful tool for image segmentation.
SAM prediction example
Segment with prompts
Segment an image with the given prompts.
from ultralytics import SAM
# Load a model
model = SAM("sam_b.pt")
# Display model information (optional)
model.info()
# Run inference with a bounding-box prompt (x1, y1, x2, y2 in pixels)
results = model("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])
# Run inference with a single point (label 1 = foreground)
results = model("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
# Run inference with multiple points, one object per point
results = model("ultralytics/assets/zidane.jpg", points=[[400, 370], [900, 370]], labels=[1, 1])
# Run inference with multiple points for a single object
results = model("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 1]])
# Run inference with a negative point (label 0 = background)
results = model("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 0]])
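The nesting of `points` and `labels` in the calls above decides how points are grouped into objects. The following is a minimal pure-Python sketch of that convention as the examples read: a single point, a flat list (one object per point), or a nested list (several points per object). The helper name `group_point_prompts` is hypothetical and is not part of the Ultralytics API; it only illustrates the grouping.

```python
def _depth(obj):
    """Return list-nesting depth: 1 for [x, y], 2 for [[x, y], ...], and so on."""
    depth = 0
    while isinstance(obj, (list, tuple)):
        if not obj:
            raise ValueError("empty prompt")
        obj = obj[0]
        depth += 1
    return depth


def _flatten(obj):
    """Yield the scalar items of an arbitrarily nested list."""
    if isinstance(obj, (list, tuple)):
        for item in obj:
            yield from _flatten(item)
    else:
        yield obj


def group_point_prompts(points, labels=None):
    """Group point prompts per object (hypothetical helper, illustration only).

    Returns (points_per_object, labels_per_object), both nested as
    [object][point]. Missing labels default to 1 (foreground).
    """
    depth = _depth(points)
    if depth == 1:  # a single point [x, y]
        grouped = [[list(points)]]
    elif depth == 2:  # [[x, y], ...]: one object per point
        grouped = [[list(p)] for p in points]
    elif depth == 3:  # [[[x, y], ...], ...]: several points per object
        grouped = [[list(p) for p in obj] for obj in points]
    else:
        raise ValueError("points must be nested 1 to 3 levels deep")

    n_points = sum(len(obj) for obj in grouped)
    flat = [1] * n_points if labels is None else list(_flatten(labels))
    if len(flat) != n_points:
        raise ValueError("need exactly one label per point")
    if any(label not in (0, 1) for label in flat):
        raise ValueError("labels must be 1 (foreground) or 0 (background)")

    it = iter(flat)
    grouped_labels = [[next(it) for _ in obj] for obj in grouped]
    return grouped, grouped_labels


# Two points for one object, the second marked as background.
print(group_point_prompts([[[400, 370], [900, 370]]], [[1, 0]]))
```

Running a prompt through a check like this before calling the model makes a mismatch between the number of points and labels visible early.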
Segment everything
Segment the whole image.
- The logic here is to segment the whole image if you don't pass any prompts (bboxes/points/masks).
SAMPredictor example
This way you can set the image once and run prompt inference as many times as you like without re-running the image encoder each time.
import cv2
from ultralytics.models.sam import Predictor as SAMPredictor
# Create SAMPredictor
overrides = dict(conf=0.25, task="segment", mode="predict", imgsz=1024, model="mobile_sam.pt")
predictor = SAMPredictor(overrides=overrides)
# Set image: pass either a file path or an image array
predictor.set_image("ultralytics/assets/zidane.jpg")  # set with image file
predictor.set_image(cv2.imread("ultralytics/assets/zidane.jpg"))  # set with np.ndarray
# Run inference with bboxes prompt
results = predictor(bboxes=[439, 437, 524, 709])
# Run inference with single point prompt
results = predictor(points=[900, 370], labels=[1])
# Run inference with multiple points prompt, one object per point
results = predictor(points=[[400, 370], [900, 370]], labels=[1, 1])
# Run inference with negative points prompt (label 0 = background)
results = predictor(points=[[[400, 370], [900, 370]]], labels=[[1, 0]])
# Reset image
predictor.reset_image()
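The benefit of setting the image once can be illustrated with a toy model of the pattern. The sketch below is not the Ultralytics implementation: `EncodeOncePredictor`, `toy_encoder`, and `toy_decoder` are hypothetical stand-ins that only show the expensive encoding step running once while several prompts reuse the cached features.

```python
class EncodeOncePredictor:
    """Toy illustration of the set-image-once pattern (hypothetical class)."""

    def __init__(self, encoder, decoder):
        self.encoder = encoder  # expensive: image -> features
        self.decoder = decoder  # cheap: (features, prompt) -> mask
        self.features = None

    def set_image(self, image):
        # Run the expensive encoder once and cache its output.
        self.features = self.encoder(image)

    def __call__(self, **prompt):
        if self.features is None:
            raise RuntimeError("call set_image() before prompting")
        # Every prompt reuses the cached features.
        return self.decoder(self.features, prompt)

    def reset_image(self):
        self.features = None


encoder_calls = []


def toy_encoder(image):
    """Stand-in for the image encoder; records each invocation."""
    encoder_calls.append(image)
    return {"embedding_of": image}


def toy_decoder(features, prompt):
    """Stand-in for the mask decoder; returns a description, not a real mask."""
    return (features["embedding_of"], sorted(prompt))


predictor = EncodeOncePredictor(toy_encoder, toy_decoder)
predictor.set_image("zidane.jpg")
masks = [
    predictor(bboxes=[439, 437, 524, 709]),
    predictor(points=[900, 370], labels=[1]),
    predictor(points=[[400, 370], [900, 370]], labels=[1, 1]),
]
print(f"encoder runs: {len(encoder_calls)}, prompts served: {len(masks)}")
```

Three prompts are served from a single encoder run, which is the saving the SAMPredictor workflow is meant to provide.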
Segment everything with additional args.
from ultralytics.models.sam import Predictor as SAMPredictor
# Create SAMPredictor
overrides = dict(conf=0.25, task="segment", mode="predict", imgsz=1024, model="mobile_sam.pt")
predictor = SAMPredictor(overrides=overrides)
# Segment with additional args
results = predictor(source="ultralytics/assets/zidane.jpg", crop_n_layers=1, points_stride=64)
Note
All the results returned in the examples above are Results objects, which give easy access to the predicted masks and the source image.
- For more additional arguments for Segment everything, see the Predictor/generate reference.
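To get a feel for what `points_stride` and `crop_n_layers` control, here is a rough sketch under stated assumptions: that `points_stride` is the number of grid points sampled along each image side (like `points_per_side` in the original SAM automatic mask generator), that crop layer i splits the image into 2^i by 2^i crops, and that the per-layer grid shrinks by a downscale factor that defaults to 1. Both function names are hypothetical, and the count is only an upper bound on candidate point prompts before any filtering.

```python
def build_point_grid(n_per_side):
    """Evenly spaced points in [0, 1] x [0, 1], centered in their grid cells."""
    coords = [(i + 0.5) / n_per_side for i in range(n_per_side)]
    return [(x, y) for y in coords for x in coords]


def count_point_prompts(points_stride, crop_n_layers, crop_downscale_factor=1):
    """Count candidate point prompts over the full image and all crop layers.

    Assumption: layer 0 is the full image and layer i has (2**i)**2 crops,
    each sampled with its own point grid.
    """
    total = 0
    for layer in range(crop_n_layers + 1):
        n_side = int(points_stride / (crop_downscale_factor**layer))
        n_crops = (2**layer) ** 2
        total += n_crops * n_side**2
    return total


grid = build_point_grid(4)
print(len(grid), grid[0], grid[-1])
print(count_point_prompts(points_stride=64, crop_n_layers=1))
```

Under these assumptions, a denser grid and extra crop layers increase the number of prompts, and therefore the runtime, quickly, so it is worth starting with small values.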
SAM comparison vs YOLOv8
Here we compare Meta's smallest SAM model, SAM-b, with Ultralytics' smallest segmentation model, YOLOv8n-seg:
Model | Size (MB) | Parameters (M) | Speed (CPU) (ms/im) |
---|---|---|---|
Meta SAM-b | 358 | 94.7 | 51096 |
MobileSAM | 40.7 | 10.1 | 46122 |
FastSAM-s with YOLOv8 backbone | 23.7 | 11.8 | 115 |
Ultralytics YOLOv8n-seg | 6.7 (53.4x smaller) | 3.4 (27.9x less) | 59 (866x faster) |
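This comparison shows order-of-magnitude differences in model size and speed between the models. SAM offers unique capabilities for automatic segmentation, but it is not a direct competitor to the YOLOv8 segmentation models, which are smaller, faster, and more efficient.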
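The multipliers quoted for YOLOv8n-seg follow directly from the table. As a quick check, this short sketch recomputes them from the table values alone, with SAM-b as the reference; the dictionary keys are illustrative names.

```python
# Values copied from the comparison table above.
models = {
    "Meta SAM-b": {"size_mb": 358, "params_m": 94.7, "cpu_ms": 51096},
    "MobileSAM": {"size_mb": 40.7, "params_m": 10.1, "cpu_ms": 46122},
    "FastSAM-s": {"size_mb": 23.7, "params_m": 11.8, "cpu_ms": 115},
    "YOLOv8n-seg": {"size_mb": 6.7, "params_m": 3.4, "cpu_ms": 59},
}

reference = models["Meta SAM-b"]
target = models["YOLOv8n-seg"]

ratios = {
    "size": round(reference["size_mb"] / target["size_mb"], 1),
    "params": round(reference["params_m"] / target["params_m"], 1),
    "speed": round(reference["cpu_ms"] / target["cpu_ms"]),
}
print(ratios)
```

The same dictionary can be reused to compare any other pair of rows, for example MobileSAM against FastSAM-s.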
Tests were run on a 2023 Apple M2 Macbook with 16GB of RAM. To reproduce this test:
Example
from ultralytics import ASSETS, SAM, YOLO, FastSAM
# Profile SAM-b, MobileSAM
for file in ["sam_b.pt", "mobile_sam.pt"]:
    model = SAM(file)
    model.info()
    model(ASSETS)
# Profile FastSAM-s
model = FastSAM("FastSAM-s.pt")
model.info()
model(ASSETS)
# Profile YOLOv8n-seg
model = YOLO("yolov8n-seg.pt")
model.info()
model(ASSETS)
Auto-Annotation: A Quick Path to Segmentation Datasets
Auto-annotation is a key feature of SAM. It lets users generate a segmentation dataset using a pre-trained detection model, which enables fast and accurate annotation of a large number of images without time-consuming manual labeling.
Generate Your Segmentation Dataset Using a Detection Model
To auto-annotate your dataset with the Ultralytics framework, use the auto_annotate function, which accepts the following arguments:
Argument | Type | Default | Description |
---|---|---|---|
data | str | required | Path to directory containing target images/videos for annotation or segmentation. |
det_model | str | "yolo11x.pt" | YOLO detection model path for initial object detection. |
sam_model | str | "sam2_b.pt" | SAM2 model path for segmentation (supports t/s/b/l variants and SAM2.1 models). |
device | str | "" | Computation device (e.g., 'cuda:0', 'cpu', or '' for automatic device detection). |
conf | float | 0.25 | YOLO detection confidence threshold for filtering weak detections. |
iou | float | 0.45 | IoU threshold for Non-Maximum Suppression to filter overlapping boxes. |
imgsz | int | 640 | Input size for resizing images (must be multiple of 32). |
max_det | int | 300 | Maximum number of detections per image for memory efficiency. |
classes | list[int] | None | List of class indices to detect (e.g., [0, 1] for person & bicycle). |
output_dir | str | None | Save directory for annotations (defaults to './labels' relative to data path). |
The auto_annotate function takes the path to your images, with optional arguments that specify the pre-trained detection model and SAM segmentation model, the device to run the models on, and the output directory for saving the annotated results.
Auto-annotation with pre-trained models can dramatically reduce the time and effort needed to create high-quality segmentation datasets. It is especially useful for researchers and developers working with large image collections, because it lets them focus on model development and evaluation rather than manual annotation.
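As an illustration of what the saved annotations look like, the sketch below builds one label line in the YOLO segmentation format, assuming each line holds a class index followed by polygon coordinates normalized to the image width and height. The helper `to_yolo_seg_line` is hypothetical, and the six-decimal formatting and clamping are choices made for this sketch, not something auto_annotate is known to do.

```python
def to_yolo_seg_line(class_id, polygon, width, height):
    """Format one object as '<class> x1 y1 x2 y2 ...' with coordinates in [0, 1].

    polygon is a list of (x, y) pixel coordinates outlining the mask.
    """
    if len(polygon) < 3:
        raise ValueError("a polygon needs at least three points")
    coords = []
    for x, y in polygon:
        # Normalize by image size and clamp to the valid [0, 1] range.
        nx = min(max(x / width, 0.0), 1.0)
        ny = min(max(y / height, 0.0), 1.0)
        coords.append(f"{nx:.6f}")
        coords.append(f"{ny:.6f}")
    return f"{class_id} " + " ".join(coords)


# A triangle on a 640x480 image, labeled as class 0.
line = to_yolo_seg_line(0, [(64, 48), (320, 48), (320, 240)], width=640, height=480)
print(line)
```

One such line per detected object, written to a text file named after the image, is enough for a segmentation dataset in this format.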
Citations and Acknowledgements
If you find SAM useful in your research or development work, please consider citing the Segment Anything paper:
@misc{kirillov2023segment,
title={Segment Anything},
author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},
year={2023},
eprint={2304.02643},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
We would like to thank Meta AI for creating and maintaining this valuable resource for the computer vision community.
FAQ
What is the Segment Anything Model (SAM) by Ultralytics?
The Segment Anything Model (SAM) by Ultralytics is an image segmentation model designed for promptable segmentation tasks. It combines an image encoder and a prompt encoder with a lightweight mask decoder to generate high-quality segmentation masks from a variety of prompts, such as spatial or text clues. Trained on the large SA-1B dataset, SAM performs well in zero-shot settings and adapts to new image distributions and tasks without prior knowledge. See the introduction section for more detail.
How can I use the Segment Anything Model (SAM) for image segmentation?
You can use the Segment Anything Model (SAM) for image segmentation by running inference with various prompts, such as bounding boxes or points. Here is an example using Python:
from ultralytics import SAM
# Load a model
model = SAM("sam_b.pt")
# Segment with bounding box prompt
model("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])
# Segment with points prompt
model("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])
# Segment with multiple points prompt
model("ultralytics/assets/zidane.jpg", points=[[400, 370], [900, 370]], labels=[1, 1])
# Segment with multiple points prompt per object
model("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 1]])
# Segment with negative points prompt.
model("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 0]])
Alternatively, you can run inference with SAM from the command line interface (CLI):
yolo predict model=sam_b.pt source=path/to/image.jpg
For more detailed usage instructions, see the Segmentation section.
How does SAM compare to YOLOv8 in terms of performance?
Compared to YOLOv8, SAM models such as SAM-b and FastSAM-s are larger and slower, but they offer unique capabilities for automatic segmentation. For example, Ultralytics YOLOv8n-seg is 53.4 times smaller and 866 times faster than SAM-b. SAM's zero-shot performance, however, makes it highly flexible on diverse tasks it was not trained for. See the SAM versus YOLOv8 comparison section for the full numbers.
How can I auto-annotate my dataset using SAM?
Ultralytics' SAM offers an auto-annotation feature that generates segmentation datasets using a pre-trained detection model. Here is an example in Python:
from ultralytics.data.annotator import auto_annotate
auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model="sam_b.pt")
This function takes the path to your images, along with optional arguments for the pre-trained detection model and SAM segmentation model, the device, and the output directory. For a complete guide, see the auto-annotation section.
What datasets are used to train the Segment Anything Model (SAM)?
SAM is trained on the extensive SA-1B dataset, which comprises over 1 billion masks across 11 million images. SA-1B is the largest segmentation dataset to date, and its high-quality, diverse training data underpins SAM's strong zero-shot performance across a range of segmentation tasks. For more details, see the Dataset section.