YOLO-World Model
The YOLO-World Model introduces an advanced, real-time approach to open-vocabulary detection built on Ultralytics YOLOv8. It can detect any object in an image from a descriptive text prompt. By substantially lowering computational demands while preserving competitive performance, YOLO-World serves as a versatile tool for many vision-based applications.
Watch: YOLO-World training workflow on a custom dataset
Overview
YOLO-World addresses the challenges faced by traditional open-vocabulary detection models, which often rely on cumbersome Transformer models that require extensive computational resources. Their dependence on predefined object categories also limits their usefulness in dynamic scenarios. YOLO-World extends the YOLOv8 framework with open-vocabulary detection, using vision-language modeling and pre-training on large datasets to identify a broad range of objects efficiently in zero-shot scenarios.
Key Features
- Real-time solution: Drawing on the computational speed of CNNs, YOLO-World delivers fast open-vocabulary detection for industries that need immediate results.
- Efficiency and performance: YOLO-World cuts computational and resource requirements without sacrificing performance, offering a robust alternative to models such as SAM at a fraction of the computational cost and enabling real-time applications.
- Inference with offline vocabulary: YOLO-World introduces a "prompt-then-detect" strategy that uses an offline vocabulary to improve efficiency further. Custom prompts, such as captions or categories, are computed in advance, then encoded and stored as offline vocabulary embeddings, which streamlines the detection process.
- Powered by YOLOv8: Built on Ultralytics YOLOv8, YOLO-World uses recent advances in real-time object detection to provide open-vocabulary detection with high accuracy and speed.
- Benchmark excellence: On standard benchmarks, YOLO-World outperforms existing open-vocabulary detectors, including the MDETR and GLIP series, in speed and efficiency, demonstrating YOLOv8's capability on a single NVIDIA V100 GPU.
- Versatile applications: YOLO-World's approach opens new possibilities for many vision tasks, delivering speedups of orders of magnitude over existing methods.
Available Models, Supported Tasks, and Operating Modes
This section details the available models with their specific pre-trained weights, the tasks they support, and their compatibility with operating modes such as Inference, Validation, Training, and Export (✅ marks a supported mode, ❌ an unsupported one).
Note
All YOLOv8-World weights have been migrated directly from the official YOLO-World repository, highlighting the authors' excellent contributions.
Model Type | Pre-trained Weights | Tasks Supported | Inference | Validation | Training | Export |
---|---|---|---|---|---|---|
YOLOv8s-world | yolov8s-world.pt | Object Detection | ✅ | ✅ | ✅ | ❌ |
YOLOv8s-worldv2 | yolov8s-worldv2.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
YOLOv8m-world | yolov8m-world.pt | Object Detection | ✅ | ✅ | ✅ | ❌ |
YOLOv8m-worldv2 | yolov8m-worldv2.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
YOLOv8l-world | yolov8l-world.pt | Object Detection | ✅ | ✅ | ✅ | ❌ |
YOLOv8l-worldv2 | yolov8l-worldv2.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
YOLOv8x-world | yolov8x-world.pt | Object Detection | ✅ | ✅ | ✅ | ❌ |
YOLOv8x-worldv2 | yolov8x-worldv2.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
Zero-shot Transfer on COCO Dataset
Model Type | mAP | mAP50 | mAP75 |
---|---|---|---|
yolov8s-world | 37.4 | 52.0 | 40.6 |
yolov8s-worldv2 | 37.7 | 52.2 | 41.0 |
yolov8m-world | 42.0 | 57.0 | 45.6 |
yolov8m-worldv2 | 43.0 | 58.4 | 46.8 |
yolov8l-world | 45.7 | 61.3 | 49.8 |
yolov8l-worldv2 | 45.8 | 61.3 | 49.8 |
yolov8x-world | 47.0 | 63.0 | 51.2 |
yolov8x-worldv2 | 47.1 | 62.8 | 51.4 |
Usage Examples
The YOLO-World models are easy to integrate into Python applications. Ultralytics provides a user-friendly Python API and CLI commands to streamline development.
Train Usage
Tip
We strongly recommend the yolov8-worldv2 models for custom training, because they support deterministic training and are easy to export to other formats such as onnx/tensorrt.
Training a detector is straightforward with the train method:
Example
PyTorch pretrained *.pt models as well as configuration *.yaml files can be passed to the YOLOWorld() class to create a model instance in Python:
from ultralytics import YOLOWorld
# Load a pretrained YOLOv8s-worldv2 model
model = YOLOWorld("yolov8s-worldv2.pt")
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
# Run inference with the trained YOLOv8s-worldv2 model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
Predict Usage
Object detection is straightforward with the predict method:
Example
from ultralytics import YOLOWorld
# Initialize a YOLO-World model
model = YOLOWorld("yolov8s-world.pt") # or select yolov8m/l-world.pt for different sizes
# Execute inference with the YOLOv8s-world model on the specified image
results = model.predict("path/to/image.jpg")
# Show results
results[0].show()
This snippet shows how simple it is to load a pre-trained model and run a prediction on an image.
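To go one step beyond displaying the image, the sketch below turns a prediction into a list of label and confidence pairs. It assumes the usual Ultralytics `Results` attributes `names`, `boxes.cls`, and `boxes.conf`. The helper names `summarize_detections` and `predict_and_summarize` and the 0.25 threshold are illustrative choices, not part of the library.

```python
def summarize_detections(names, class_ids, confidences, min_conf=0.25):
    """Return (label, confidence) pairs at or above min_conf, highest confidence first."""
    pairs = [
        (names[int(class_id)], float(conf))
        for class_id, conf in zip(class_ids, confidences)
        if float(conf) >= min_conf
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def predict_and_summarize(image_path, weights="yolov8s-world.pt"):
    """Run YOLO-World on one image and summarize its detections."""
    # Imported lazily so the pure helper above works without the package installed
    from ultralytics import YOLOWorld

    model = YOLOWorld(weights)
    result = model.predict(image_path)[0]  # one Results object per input image
    return summarize_detections(
        result.names,  # dict mapping class index to label
        result.boxes.cls.tolist(),
        result.boxes.conf.tolist(),
    )
```

Keeping the filtering logic separate from the model call makes it easy to check on plain Python lists before running it on real predictions.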
Val Usage
Model validation on a dataset is streamlined as follows:
Example
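A minimal validation sketch, assuming the standard Ultralytics `val` method and the `coco8.yaml` example dataset used in the training example. The wrapper name `validate_yolo_world` is hypothetical, and the return values assume the usual detection metrics object exposing `box.map` and `box.map50`.

```python
def validate_yolo_world(weights="yolov8s-world.pt", data="coco8.yaml"):
    """Validate a YOLO-World checkpoint and return (mAP50-95, mAP50)."""
    # Imported lazily so the function can be defined without the package installed
    from ultralytics import YOLO

    model = YOLO(weights)
    metrics = model.val(data=data)  # evaluates on the dataset's validation split
    return metrics.box.map, metrics.box.map50
```

Calling `validate_yolo_world()` with no arguments evaluates the small YOLOv8s-world checkpoint on COCO8, which is useful as a quick smoke test rather than a benchmark.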
Track Usage
Object tracking with a YOLO-World model on videos or images is streamlined as follows:
Example
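A minimal tracking sketch, assuming the standard Ultralytics `track` method. The wrapper name `track_with_yolo_world` is hypothetical, and the source path is supplied by the caller.

```python
def track_with_yolo_world(source, weights="yolov8s-world.pt"):
    """Track objects in a video or image source with a YOLO-World model."""
    # Imported lazily so the function can be defined without the package installed
    from ultralytics import YOLO

    model = YOLO(weights)
    # Returns a list of Results; when tracks exist, their IDs are in result.boxes.id
    return model.track(source=source)
```

For example, `track_with_yolo_world("path/to/video.mp4")` would process the video frame by frame and return one result per frame.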
Note
The YOLO-World models provided by Ultralytics come preconfigured with the COCO dataset categories as part of their offline vocabulary, which makes them efficient to apply immediately. This integration lets the YOLOv8-World models directly recognize and predict the 80 standard categories defined in the COCO dataset without additional setup or customization.
Set Prompts
The YOLO-World framework allows classes to be specified dynamically through custom prompts, so users can tailor the model to their needs without retraining. This is particularly useful for adapting the model to new domains or tasks that were not part of the original training data. By setting custom prompts, users steer the model toward the objects of interest, improving the relevance and accuracy of the detection results.
For example, if your application only needs to detect "person" and "bus" objects, you can specify these classes directly:
Example
from ultralytics import YOLO
# Initialize a YOLO-World model
model = YOLO("yolov8s-world.pt") # or choose yolov8m/l-world.pt
# Define custom classes
model.set_classes(["person", "bus"])
# Execute prediction for specified categories on an image
results = model.predict("path/to/image.jpg")
# Show results
results[0].show()
You can also save a model after setting custom classes. Doing so creates a version of the YOLO-World model specialized for your use case. The process embeds the custom class definitions directly into the model file, so the model is ready to use with the specified classes without further adjustment. Follow these steps to save and load your custom YOLOv8 model:
Example
First, load a YOLO-World model, set custom classes for it, and save it:
from ultralytics import YOLO
# Initialize a YOLO-World model
model = YOLO("yolov8s-world.pt") # or select yolov8m/l-world.pt
# Define custom classes
model.set_classes(["person", "bus"])
# Save the model with the defined offline vocabulary
model.save("custom_yolov8s.pt")
After saving, the custom_yolov8s.pt model behaves like any other pre-trained YOLOv8 model, with one key difference: it is optimized to detect only the classes you defined. This customization can improve detection performance and efficiency for your specific application scenarios.
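The loading step can be sketched as follows, assuming the file name `custom_yolov8s.pt` from the save example and the standard `YOLO` loader. The wrapper name `detect_with_saved_vocabulary` is hypothetical.

```python
def detect_with_saved_vocabulary(image_path, weights="custom_yolov8s.pt"):
    """Load a YOLO-World model saved with custom classes and run prediction."""
    # Imported lazily so the function can be defined without the package installed
    from ultralytics import YOLO

    # The custom vocabulary is stored in the file, so set_classes is not called again
    model = YOLO(weights)
    return model.predict(image_path)
```

The returned list can be handled like any other prediction output, for instance with `results[0].show()`.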
Benefits of Saving with Custom Vocabulary
- Efficiency: Streamlines the detection process by focusing on relevant objects, reducing computational overhead and speeding up inference.
- Flexibility: Allows the model to be adapted to new or niche detection tasks without extensive retraining or data collection.
- Simplicity: Simplifies deployment by removing the need to specify custom classes repeatedly at runtime, so the model is directly usable with its embedded vocabulary.
- Performance: Improves detection accuracy for the specified classes by focusing the model's attention and resources on recognizing the defined objects.
This approach provides a powerful means of customizing state-of-the-art object detection models for specific tasks, making advanced AI more accessible and applicable to a broader range of practical applications.
Reproduce Official Results from Scratch (Experimental)
Prepare Datasets
- Train data
Dataset | Type | Samples | Boxes | Annotation Files |
---|---|---|---|---|
Objects365v1 | Detection | 609k | 9621k | objects365_train.json |
GQA | Grounding | 621k | 3681k | final_mixed_train_no_coco.json |
Flickr30k | Grounding | 149k | 641k | final_flickr_separateGT_train.json |
- Val data
Dataset | Type | Annotation Files |
---|---|---|
LVIS minival | Detection | minival.txt |
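Before starting a long training run, it can help to confirm that the grounding annotation files are where the training script expects them. The sketch below is a plain-Python check. The relative paths mirror those used in this page's training example, while the helper name `missing_annotation_files` and the idea of a single dataset root are assumptions for illustration.

```python
from pathlib import Path

# Relative locations taken from the training example; adjust to your own layout
GROUNDING_ANNOTATIONS = (
    "flickr30k/final_flickr_separateGT_train.json",
    "GQA/final_mixed_train_no_coco.json",
)


def missing_annotation_files(root, relative_paths=GROUNDING_ANNOTATIONS):
    """Return the relative paths that do not exist as files under root."""
    root = Path(root)
    return [path for path in relative_paths if not (root / path).is_file()]
```

An empty list from `missing_annotation_files("../datasets")` means both grounding annotation files were found.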
Launch Training from Scratch
Note
WorldTrainerFromScratch is highly customized to allow training YOLO-World models on both detection datasets and grounding datasets simultaneously. For more details, see ultralytics/models/yolo/world/train_world.py.
Example
from ultralytics import YOLOWorld
from ultralytics.models.yolo.world.train_world import WorldTrainerFromScratch
data = dict(
    train=dict(
        yolo_data=["Objects365.yaml"],
        grounding_data=[
            dict(
                img_path="../datasets/flickr30k/images",
                json_file="../datasets/flickr30k/final_flickr_separateGT_train.json",
            ),
            dict(
                img_path="../datasets/GQA/images",
                json_file="../datasets/GQA/final_mixed_train_no_coco.json",
            ),
        ],
    ),
    val=dict(yolo_data=["lvis.yaml"]),
)
model = YOLOWorld("yolov8s-worldv2.yaml")
model.train(data=data, batch=128, epochs=100, trainer=WorldTrainerFromScratch)
Citations and Acknowledgements
We extend our gratitude to the Tencent AILab Computer Vision Center for their pioneering work in real-time open-vocabulary object detection with YOLO-World.
The original YOLO-World paper is available on arXiv, and the project's source code and additional resources can be accessed via their GitHub repository. We appreciate their commitment to advancing the field and sharing their valuable insights with the community.
FAQ
What is the YOLO-World model?
The YOLO-World model is an advanced, real-time object detection approach based on the Ultralytics YOLOv8 framework. It excels at open-vocabulary detection tasks by identifying objects in an image from descriptive text. Using vision-language modeling and pre-training on large datasets, YOLO-World achieves high efficiency and performance with significantly reduced computational demands, making it well suited to real-time applications across industries.
How does YOLO-World handle inference with custom prompts?
YOLO-World supports a "prompt-then-detect" strategy, which uses an offline vocabulary to improve efficiency. Custom prompts such as captions or specific object categories are pre-encoded and stored as offline vocabulary embeddings. This approach streamlines the detection process without retraining. You can set these prompts dynamically within the model to tailor it to specific detection tasks, as shown below:
from ultralytics import YOLOWorld
# Initialize a YOLO-World model
model = YOLOWorld("yolov8s-world.pt")
# Define custom classes
model.set_classes(["person", "bus"])
# Execute prediction on an image
results = model.predict("path/to/image.jpg")
# Show results
results[0].show()
Why should I choose YOLO-World over traditional open-vocabulary detection models?
YOLO-World offers several advantages over traditional open-vocabulary detection models:
- Real-time performance: It uses the computational speed of CNNs to deliver fast, efficient detection.
- Efficiency and low resource requirements: YOLO-World maintains high performance while significantly reducing computational and resource demands.
- Customizable prompts: The model supports dynamic prompt setting, so users can specify custom detection classes without retraining.
- Benchmark excellence: It outperforms other open-vocabulary detectors such as MDETR and GLIP in both speed and efficiency on standard benchmarks.
How do I train a YOLO-World model on my dataset?
Training a YOLO-World model on your dataset is straightforward with the provided Python API or CLI commands. Here is how to start training with Python:
from ultralytics import YOLOWorld
# Load a pretrained YOLOv8s-worldv2 model
model = YOLOWorld("yolov8s-worldv2.pt")
# Train the model on the COCO8 dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
The same training run can also be launched from the CLI.
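As a sketch of the CLI route, the snippet below assembles the equivalent `yolo train` command from Python, using the same model, dataset, and settings as the training call above. The `key=value` argument style follows the Ultralytics CLI. The helper name `build_yolo_train_command` is hypothetical, and the command is only printed here rather than executed.

```python
import shlex


def build_yolo_train_command(model="yolov8s-worldv2.pt", data="coco8.yaml", epochs=100, imgsz=640):
    """Return the argument list for a `yolo train` call mirroring model.train(...)."""
    return [
        "yolo",
        "train",
        f"model={model}",
        f"data={data}",
        f"epochs={epochs}",
        f"imgsz={imgsz}",
    ]


command = build_yolo_train_command()
print(shlex.join(command))
# To launch training from Python instead of a shell:
#   import subprocess; subprocess.run(command, check=True)
```

Running the printed line in a shell where the `ultralytics` package is installed starts the same training run as the Python API call.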
What are the available pre-trained YOLO-World models and their supported tasks?
Ultralytics offers multiple pre-trained YOLO-World models supporting various tasks and operating modes:
Model Type | Pre-trained Weights | Tasks Supported | Inference | Validation | Training | Export |
---|---|---|---|---|---|---|
YOLOv8s-world | yolov8s-world.pt | Object Detection | ✅ | ✅ | ✅ | ❌ |
YOLOv8s-worldv2 | yolov8s-worldv2.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
YOLOv8m-world | yolov8m-world.pt | Object Detection | ✅ | ✅ | ✅ | ❌ |
YOLOv8m-worldv2 | yolov8m-worldv2.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
YOLOv8l-world | yolov8l-world.pt | Object Detection | ✅ | ✅ | ✅ | ❌ |
YOLOv8l-worldv2 | yolov8l-worldv2.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
YOLOv8x-world | yolov8x-world.pt | Object Detection | ✅ | ✅ | ✅ | ❌ |
YOLOv8x-worldv2 | yolov8x-worldv2.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
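How do I reproduce the official results of YOLO-World from scratch?
To reproduce the official results from scratch, you need to prepare the datasets and launch training with the provided code. The training procedure involves creating a data dictionary and passing a custom trainer to the train method: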
from ultralytics import YOLOWorld
from ultralytics.models.yolo.world.train_world import WorldTrainerFromScratch
data = {
    "train": {
        "yolo_data": ["Objects365.yaml"],
        "grounding_data": [
            {
                "img_path": "../datasets/flickr30k/images",
                "json_file": "../datasets/flickr30k/final_flickr_separateGT_train.json",
            },
            {
                "img_path": "../datasets/GQA/images",
                "json_file": "../datasets/GQA/final_mixed_train_no_coco.json",
            },
        ],
    },
    "val": {"yolo_data": ["lvis.yaml"]},
}
model = YOLOWorld("yolov8s-worldv2.yaml")
model.train(data=data, batch=128, epochs=100, trainer=WorldTrainerFromScratch)