SAM 2.1
The more accurate SAM 2.1 models are now supported. Give them a try!
SAM 2: Segment Anything Model 2
SAM 2, the successor to Meta's Segment Anything Model (SAM), is a state-of-the-art tool for comprehensive object segmentation in both images and videos. Its unified, promptable model architecture supports real-time processing and zero-shot generalization, which makes it well suited to complex visual data.
Key Features
Watch: How to Run Inference with Meta's SAM2 using Ultralytics | Step-by-Step Guide
Unified Model Architecture
SAM 2 combines image and video segmentation in a single model. This unification simplifies deployment and gives consistent performance across media types. Through a flexible prompt-based interface, users specify objects of interest with points, bounding boxes, or masks.
Real-Time Performance
The model achieves real-time inference speeds, processing approximately 44 frames per second. This makes SAM 2 suitable for applications that need immediate feedback, such as video editing and augmented reality.
Zero-Shot Generalization
SAM 2 can segment objects it has never encountered before, demonstrating strong zero-shot generalization. This is particularly useful in diverse or evolving visual domains where predefined categories may not cover all possible objects.
Interactive Refinement
Users can iteratively refine the segmentation results by providing additional prompts, which gives precise control over the output. This interactivity is essential for fine-tuning results in applications such as video annotation and medical imaging.
Advanced Handling of Visual Challenges
SAM 2 includes mechanisms for common video segmentation challenges such as object occlusion and reappearance. A memory mechanism tracks objects across frames and preserves continuity even when objects are temporarily hidden or leave and re-enter the scene.
For a deeper understanding of SAM 2's architecture and capabilities, see the SAM 2 research paper.
Performance and Technical Details
SAM 2 sets a new benchmark in the field, outperforming previous models on various metrics:
Metric | SAM 2 | Previous SOTA |
---|---|---|
Interactive Video Segmentation | Best | - |
Human Interactions Required | 3x fewer | Baseline |
Image Segmentation Accuracy | Improved | SAM |
Inference Speed | 6x faster | SAM |
Model Architecture
Core Components
- Image and Video Encoder: Uses a transformer-based architecture to extract high-level features from both images and video frames. This component is responsible for understanding the visual content at each timestep.
- Prompt Encoder: Processes user-provided prompts (points, boxes, masks) to guide the segmentation task. This allows SAM 2 to adapt to user input and target specific objects in a scene.
- Memory Mechanism: Comprises a memory encoder, a memory bank, and a memory attention module. Together these components store and reuse information from past frames, enabling consistent object tracking over time.
- Mask Decoder: Generates the final segmentation masks from the encoded image features and prompts. In video, it also uses the memory context to keep tracking accurate across frames.
Memory Mechanism and Occlusion Handling
The memory mechanism lets SAM 2 handle temporal dependencies and occlusions in video data. As objects move and interact, SAM 2 records their features in the memory bank. When an object becomes occluded, the model relies on this memory to predict its position and appearance when it reappears. The occlusion head specifically handles scenarios in which objects are not visible, predicting the likelihood that an object is occluded.
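The store-then-attend idea described above can be illustrated with a toy sketch. This is not SAM 2's implementation: the real model uses learned encoders and attention over feature maps. The class name `ToyMemoryBank`, the fixed `capacity`, and the use of plain feature vectors are all assumptions made for illustration.

```python
import math
from collections import deque


class ToyMemoryBank:
    """Toy FIFO memory of per-frame feature vectors (illustrative only)."""

    def __init__(self, capacity=4):
        # Oldest entries are dropped automatically once capacity is reached.
        self.frames = deque(maxlen=capacity)

    def add(self, feature):
        """Store the feature vector of a processed frame."""
        self.frames.append(list(feature))

    def attend(self, query):
        """Return a softmax-weighted average of stored features.

        Weights come from scaled dot-product similarity between the
        current frame's query vector and each stored feature.
        """
        if not self.frames:
            return list(query)  # nothing remembered yet
        scale = math.sqrt(len(query))
        scores = [sum(q * k for q, k in zip(query, f)) / scale for f in self.frames]
        top = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        return [
            sum(w * f[i] for w, f in zip(weights, self.frames)) / total
            for i in range(len(query))
        ]
```

When an object is hidden in the current frame, the attended vector is still dominated by the remembered frames that resemble the query, which is the intuition behind recovering the object after occlusion.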
Multi-Mask Ambiguity Resolution
In ambiguous situations (for example, overlapping objects), SAM 2 can generate multiple mask predictions. This feature is crucial for accurately representing complex scenes in which a single mask may not capture the scene's nuances.
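One common way to consume several candidate masks is to keep the one the model rates highest. The sketch below assumes each candidate comes with a quality score (for example, a predicted IoU); the function name `pick_best_mask` and the `(mask, score)` pair layout are hypothetical and not part of any library API.

```python
def pick_best_mask(candidates):
    """Return the (mask, score) pair with the highest score.

    `candidates` is a list of (mask, score) pairs, where `mask` can be any
    object (here, a nested list of 0/1) and `score` is a float.
    """
    if not candidates:
        raise ValueError("at least one candidate mask is required")
    return max(candidates, key=lambda pair: pair[1])


# Three hypothetical candidates for the same ambiguous click.
whole_object = [[1, 1], [1, 1]]
object_part = [[1, 0], [0, 0]]
sub_part = [[0, 0], [0, 1]]
best_mask, best_score = pick_best_mask(
    [(whole_object, 0.71), (object_part, 0.93), (sub_part, 0.42)]
)
```

Keeping all candidates around, rather than only the winner, lets a follow-up prompt switch to a different interpretation.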
SA-V Dataset
The SA-V dataset, developed for training SAM 2, is one of the largest and most diverse video segmentation datasets available. It includes:
- 51,000+ Videos: Captured across 47 countries, providing a wide range of real-world scenarios.
- 600,000+ Mask Annotations: Detailed spatio-temporal mask annotations, referred to as "masklets", covering whole objects and parts.
- Dataset Scale: Contains 4.5 times more videos and 53 times more annotations than the previous largest datasets, offering unprecedented diversity and complexity.
Benchmarks
Video Object Segmentation
SAM 2 has demonstrated superior performance on major video segmentation benchmarks:
Dataset | J&F | J | F |
---|---|---|---|
DAVIS 2017 | 82.5 | 79.8 | 85.2 |
YouTube-VOS | 81.2 | 78.9 | 83.5 |
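In the DAVIS convention, J is region similarity (mask intersection-over-union), F is contour accuracy, and J&F is their mean. The sketch below computes J on toy binary masks and checks that the J&F column above is the average of the other two. The function names are illustrative, and F is taken as given because computing it requires boundary matching.

```python
def region_similarity(pred, gt):
    """J: intersection-over-union of two binary masks (nested lists of 0/1)."""
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            intersection += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else intersection / union


def j_and_f(j, f):
    """J&F: the mean of region similarity and contour accuracy."""
    return (j + f) / 2


toy_j = region_similarity([[1, 1], [0, 0]], [[1, 0], [0, 0]])  # 1 shared pixel of 2
davis = round(j_and_f(79.8, 85.2), 1)
youtube_vos = round(j_and_f(78.9, 83.5), 1)
```

Both table rows are consistent with this definition: the rounded means come out to 82.5 and 81.2.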
Interactive Segmentation
In interactive segmentation tasks, SAM 2 shows significant efficiency and accuracy:
Dataset | NoC@90 | AUC |
---|---|---|
DAVIS Interactive | 1.54 | 0.872 |
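Assuming NoC@90 here follows the usual interactive-segmentation meaning (the average number of clicks needed before IoU first reaches 0.90), it can be computed from per-click IoU traces as sketched below. The cap of 20 clicks for runs that never reach the threshold is a common convention and an assumption here, as are the function names and the sample traces.

```python
def clicks_to_reach(iou_per_click, threshold=0.90, max_clicks=20):
    """Return the 1-based click index at which IoU first reaches the threshold.

    Runs that never reach the threshold are counted as `max_clicks`.
    """
    for click, iou in enumerate(iou_per_click[:max_clicks], start=1):
        if iou >= threshold:
            return click
    return max_clicks


def mean_noc(runs, threshold=0.90, max_clicks=20):
    """Average clicks-to-threshold over several interactive runs."""
    if not runs:
        raise ValueError("at least one run is required")
    return sum(clicks_to_reach(r, threshold, max_clicks) for r in runs) / len(runs)


# Hypothetical IoU-after-each-click traces for three objects.
example_runs = [[0.80, 0.92], [0.95], [0.50, 0.70, 0.91]]
example_noc = mean_noc(example_runs)
```

A lower value means fewer corrections from the user, so the metric rewards models that get close to the target mask on the first click.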
Installation
To install SAM 2, use the following command. All SAM 2 models download automatically on first use.
pip install ultralytics
SAM 2: Versatility in Image and Video Segmentation
The following table details the available SAM 2 models, their pre-trained weights, supported tasks, and compatibility with different operating modes such as Inference, Validation, Training, and Export.
Model Type | Pre-trained Weights | Tasks Supported | Inference | Validation | Training | Export |
---|---|---|---|---|---|---|
SAM 2 tiny | sam2_t.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM 2 small | sam2_s.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM 2 base | sam2_b.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM 2 large | sam2_l.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM 2.1 tiny | sam2.1_t.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM 2.1 small | sam2.1_s.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM 2.1 base | sam2.1_b.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM 2.1 large | sam2.1_l.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
SAM 2 Prediction Examples
SAM 2 can be used across a broad range of tasks, including real-time video editing, medical imaging, and autonomous systems. Its ability to segment both static and dynamic visual data makes it a versatile tool for researchers and developers.
Segment with Prompts
Use prompts to segment specific objects in images or videos.
from ultralytics import SAM
# Load a model
model = SAM("sam2.1_b.pt")
# Display model information (optional)
model.info()
# Run inference with bboxes prompt
results = model("path/to/image.jpg", bboxes=[100, 100, 200, 200])
# Run inference with single point
results = model(points=[900, 370], labels=[1])
# Run inference with multiple points
results = model(points=[[400, 370], [900, 370]], labels=[1, 1])
# Run inference with multiple points prompt per object
results = model(points=[[[400, 370], [900, 370]]], labels=[[1, 1]])
# Run inference with negative points prompt
results = model(points=[[[400, 370], [900, 370]]], labels=[[1, 0]])
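The calls above use three nesting levels for `points`: a single `[x, y]`, a flat list of points, and a list of points per object. The helper below is a hypothetical pre-check, not part of Ultralytics, that converts all three into the per-object form and verifies that `labels` lines up. It assumes, following the comments in the example, that a flat list of points means one object per point.

```python
def normalize_point_prompts(points, labels):
    """Return (points, labels) as objects -> points -> [x, y] and objects -> labels."""

    def depth(value):
        # Nesting depth: [x, y] -> 1, [[x, y]] -> 2, [[[x, y]]] -> 3.
        levels = 0
        while isinstance(value, (list, tuple)):
            if not value:
                raise ValueError("empty list in points")
            value = value[0]
            levels += 1
        return levels

    if isinstance(labels, int):
        labels = [labels]  # a bare label for a single point
    levels = depth(points)
    if levels == 1:  # one point, one object
        points, labels = [[list(points)]], [list(labels)]
    elif levels == 2:  # several points, each treated as its own object
        points = [[list(p)] for p in points]
        labels = [[label] for label in labels]
    elif levels != 3:
        raise ValueError("points must be nested 1, 2, or 3 levels deep")

    if len(points) != len(labels):
        raise ValueError("need one label group per object")
    for obj_points, obj_labels in zip(points, labels):
        if len(obj_points) != len(obj_labels):
            raise ValueError("need one label per point")
        if any(len(p) != 2 for p in obj_points):
            raise ValueError("each point must be [x, y]")
        if any(label not in (0, 1) for label in obj_labels):
            raise ValueError("labels must be 1 (positive) or 0 (negative)")
    return [[list(p) for p in obj] for obj in points], [list(l) for l in labels]
```

Running the prompts through a check like this before inference makes shape mistakes, such as passing `labels=[1, 0]` with a per-object `points` list, fail early with a readable message.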
Segment Everything
Segment the entire image or video content without specific prompts.
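No code accompanies this description on the page. A minimal sketch, assuming the same `SAM` class and weights file used in the prompt example, and assuming that calling the model with only a source runs prompt-free segmentation. `segment_everything` is a hypothetical wrapper name, and the source path in the usage note is a placeholder.

```python
def segment_everything(source, weights="sam2.1_b.pt"):
    """Run SAM 2 on `source` (image or video path) without any prompts."""
    # Imported lazily so this helper can be defined without ultralytics installed.
    from ultralytics import SAM

    model = SAM(weights)
    # No bboxes, points, or masks are passed, so nothing restricts the segmentation.
    return model(source)
```

A call such as `segment_everything("path/to/video.mp4")` would then return the per-frame results.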
Video Segmentation and Object Tracking
Segment Video
Segment entire video content with specific prompts and track objects.
from ultralytics.models.sam import SAM2VideoPredictor
# Create SAM2VideoPredictor
overrides = dict(conf=0.25, task="segment", mode="predict", imgsz=1024, model="sam2_b.pt")
predictor = SAM2VideoPredictor(overrides=overrides)
# Run inference with single point
results = predictor(source="test.mp4", points=[920, 470], labels=1)
# Run inference with multiple points
results = predictor(source="test.mp4", points=[[920, 470], [909, 138]], labels=[1, 1])
# Run inference with multiple points prompt per object
results = predictor(source="test.mp4", points=[[[920, 470], [909, 138]]], labels=[[1, 1]])
# Run inference with negative points prompt
results = predictor(source="test.mp4", points=[[[920, 470], [909, 138]]], labels=[[1, 0]])
- When no prompts (bboxes/points/masks) are provided, SAM 2 segments the entire content of the image or video.
SAM 2 Comparison vs YOLOv8
Here we compare Meta's smallest SAM 2 model, SAM2-t, with Ultralytics' smallest segmentation model, YOLOv8n-seg:
Model | Size (MB) | Parameters (M) | Speed (CPU) (ms/im) |
---|---|---|---|
Meta SAM-b | 375 | 93.7 | 161440 |
Meta SAM2-b | 162 | 80.8 | 121923 |
Meta SAM2-t | 78.1 | 38.9 | 85155 |
MobileSAM | 40.7 | 10.1 | 98543 |
FastSAM-s with YOLOv8 backbone | 23.7 | 11.8 | 140 |
Ultralytics YOLOv8n-seg | 6.7 (11.7x smaller) | 3.4 (11.4x less) | 79.5 (1071x faster) |
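The multipliers in the last row match the ratios between the SAM2-t row and the YOLOv8n-seg row. That baseline is inferred from the numbers, since the table does not state it. A short check, using only values copied from the table:

```python
# Values copied from the comparison table: (size MB, parameters M, CPU ms/im).
sam2_t = (78.1, 38.9, 85155)
yolov8n_seg = (6.7, 3.4, 79.5)


def ratios(baseline, candidate):
    """Return how many times larger each baseline value is than the candidate's."""
    return tuple(b / c for b, c in zip(baseline, candidate))


size_x, params_x, speed_x = ratios(sam2_t, yolov8n_seg)
print(f"{size_x:.1f}x smaller, {params_x:.1f}x fewer parameters, {speed_x:.0f}x faster")
```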
This comparison shows the order-of-magnitude differences in model size and speed between the models. SAM offers unique capabilities for automatic segmentation, but it is not a direct competitor to YOLOv8 segment models, which are smaller, faster, and more efficient.
Tests were run on a 2023 Apple M2 Macbook with 16GB of RAM using torch==2.3.1 and ultralytics==8.3.82. To reproduce this test:
Example
from ultralytics import ASSETS, SAM, YOLO, FastSAM
# Profile SAM2-t, SAM2-b, SAM-b, MobileSAM
for file in ["sam_b.pt", "sam2_b.pt", "sam2_t.pt", "mobile_sam.pt"]:
    model = SAM(file)
    model.info()
    model(ASSETS)
# Profile FastSAM-s
model = FastSAM("FastSAM-s.pt")
model.info()
model(ASSETS)
# Profile YOLOv8n-seg
model = YOLO("yolov8n-seg.pt")
model.info()
model(ASSETS)
Auto-Annotation: Efficient Dataset Creation
Auto-annotation is a powerful feature of SAM 2 that lets users generate segmentation datasets quickly and accurately by leveraging pre-trained models. It is particularly useful for building large, high-quality datasets without extensive manual labeling.
How to Auto-Annotate with SAM 2
Watch: Auto Annotation with Meta's Segment Anything 2 Model using Ultralytics | Data Labeling
To auto-annotate your dataset with SAM 2, use the auto_annotate function from ultralytics.data.annotator with the arguments below:
Argument | Type | Default | Description |
---|---|---|---|
data | str | required | Path to directory containing target images/videos for annotation or segmentation. |
det_model | str | "yolo11x.pt" | YOLO detection model path for initial object detection. |
sam_model | str | "sam2_b.pt" | SAM2 model path for segmentation (supports t/s/b/l variants and SAM2.1 models). |
device | str | "" | Computation device (e.g., 'cuda:0', 'cpu', or '' for automatic device detection). |
conf | float | 0.25 | YOLO detection confidence threshold for filtering weak detections. |
iou | float | 0.45 | IoU threshold for Non-Maximum Suppression to filter overlapping boxes. |
imgsz | int | 640 | Input size for resizing images (must be multiple of 32). |
max_det | int | 300 | Maximum number of detections per image for memory efficiency. |
classes | list[int] | None | List of class indices to detect (e.g., [0, 1] for person & bicycle). |
output_dir | str | None | Save directory for annotations (defaults to './labels' relative to data path). |
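The arguments in the table can be assembled into a call along the following lines. This is a sketch rather than the page's own example: the defaults are copied from the table, `build_auto_annotate_kwargs` and `run_auto_annotate` are hypothetical helper names, and the call assumes `auto_annotate` in `ultralytics.data.annotator` accepts exactly these keyword arguments in the installed version.

```python
def build_auto_annotate_kwargs(data, **overrides):
    """Merge user overrides into the defaults listed in the argument table."""
    defaults = {
        "det_model": "yolo11x.pt",
        "sam_model": "sam2_b.pt",
        "device": "",
        "conf": 0.25,
        "iou": 0.45,
        "imgsz": 640,
        "max_det": 300,
        "classes": None,
        "output_dir": None,
    }
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise TypeError(f"unknown auto_annotate arguments: {sorted(unknown)}")
    merged = {"data": data, **defaults, **overrides}
    if merged["imgsz"] % 32:
        raise ValueError("imgsz must be a multiple of 32")
    return merged


def run_auto_annotate(data, **overrides):
    """Validate the arguments, then call Ultralytics' auto_annotate."""
    # Imported lazily so the validation helper works without ultralytics installed.
    from ultralytics.data.annotator import auto_annotate

    auto_annotate(**build_auto_annotate_kwargs(data, **overrides))
```

For instance, `run_auto_annotate("path/to/images", sam_model="sam2.1_b.pt", classes=[0])` would detect only class 0 with the YOLO model and segment those detections with SAM 2.1 base.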
This function enables the rapid creation of high-quality segmentation datasets, ideal for researchers and developers who want to accelerate their projects.
Limitations
Despite its strengths, SAM 2 has certain limitations:
- Tracking Stability: SAM 2 may lose track of objects during extended sequences or significant viewpoint changes.
- Object Confusion: The model can sometimes confuse similar-looking objects, particularly in crowded scenes.
- Efficiency with Multiple Objects: Segmentation efficiency decreases when processing multiple objects simultaneously because there is no inter-object communication.
- Detail Accuracy: Fine details may be missed, especially with fast-moving objects. Additional prompts can partially address this issue, but temporal smoothness is not guaranteed.
Citations and Acknowledgements
If SAM 2 is an important part of your research or development work, please cite it using the following reference:
@article{ravi2024sam2,
title={SAM 2: Segment Anything in Images and Videos},
author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
journal={arXiv preprint},
year={2024}
}
We extend our gratitude to Meta AI for their contributions to the AI community with this groundbreaking model and dataset.
FAQ
What is SAM 2 and how does it improve upon the original Segment Anything Model (SAM)?
SAM 2, the successor to Meta's Segment Anything Model (SAM), is a state-of-the-art tool for comprehensive object segmentation in both images and videos. Its unified, promptable model architecture supports real-time processing and zero-shot generalization, which makes it well suited to complex visual data. SAM 2 improves on the original SAM in several ways:
- Unified Model Architecture: Combines image and video segmentation in a single model.
- Real-Time Performance: Processes approximately 44 frames per second, making it suitable for applications that need immediate feedback.
- Zero-Shot Generalization: Segments objects it has never encountered before, which is useful in diverse visual domains.
- Interactive Refinement: Lets users iteratively refine segmentation results by providing additional prompts.
- Advanced Handling of Visual Challenges: Manages common video segmentation challenges such as object occlusion and reappearance.
For more details on SAM 2's architecture and capabilities, see the SAM 2 research paper.
How can I use SAM 2 for real-time video segmentation?
SAM 2 can be used for real-time video segmentation by leveraging its promptable interface and real-time inference capabilities. Here is a basic example:
Segment with Prompts
Use prompts to segment specific objects in images or videos.
from ultralytics import SAM
# Load a model
model = SAM("sam2_b.pt")
# Display model information (optional)
model.info()
# Segment with bounding box prompt
results = model("path/to/image.jpg", bboxes=[100, 100, 200, 200])
# Segment with point prompt
results = model("path/to/image.jpg", points=[150, 150], labels=[1])
For more comprehensive usage, see the earlier section on using SAM 2 for image and video segmentation.
What datasets are used to train SAM 2, and how do they enhance its performance?
SAM 2 is trained on the SA-V dataset, one of the largest and most diverse video segmentation datasets available. The SA-V dataset includes:
- 51,000+ Videos: Captured across 47 countries, providing a wide range of real-world scenarios.
- 600,000+ Mask Annotations: Detailed spatio-temporal mask annotations, referred to as "masklets", covering whole objects and parts.
- Dataset Scale: Contains 4.5 times more videos and 53 times more annotations than the previous largest datasets, offering unprecedented diversity and complexity.
This extensive dataset allows SAM 2 to achieve superior performance on major video segmentation benchmarks and strengthens its zero-shot generalization. For more information, see the SA-V Dataset section.
How does SAM 2 handle occlusions and object reappearances in video segmentation?
SAM 2 includes a sophisticated memory mechanism to manage temporal dependencies and occlusions in video data. The memory mechanism consists of:
- Memory Encoder and Memory Bank: Store features from past frames.
- Memory Attention Module: Uses the stored information to maintain consistent object tracking over time.
- Occlusion Head: Specifically handles scenarios in which objects are not visible, predicting the likelihood that an object is occluded.
This mechanism preserves continuity even when objects are temporarily hidden or leave and re-enter the scene. For more details, see the Memory Mechanism and Occlusion Handling section.
How does SAM 2 compare to other segmentation models like YOLOv8?
SAM 2 and Ultralytics YOLOv8 serve different purposes and excel in different areas. SAM 2 is designed for comprehensive object segmentation with advanced features such as zero-shot generalization and real-time performance, whereas YOLOv8 is optimized for speed and efficiency in object detection and segmentation tasks. Here is a comparison:
Model | Size (MB) | Parameters (M) | Speed (CPU) (ms/im) |
---|---|---|---|
Meta SAM-b | 375 | 93.7 | 161440 |
Meta SAM2-b | 162 | 80.8 | 121923 |
Meta SAM2-t | 78.1 | 38.9 | 85155 |
MobileSAM | 40.7 | 10.1 | 98543 |
FastSAM-s with YOLOv8 backbone | 23.7 | 11.8 | 140 |
Ultralytics YOLOv8n-seg | 6.7 (11.7x smaller) | 3.4 (11.4x less) | 79.5 (1071x faster) |
For more details, see the SAM 2 Comparison vs YOLOv8 section.