YOLOv9: A Leap Forward in Object Detection Technology
YOLOv9 marks a significant advance in real-time object detection, introducing Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). The model sets new benchmarks on the MS COCO dataset, with notable improvements in efficiency, accuracy, and adaptability. Although the YOLOv9 project is developed by a separate open-source team, it builds on the robust codebase provided by Ultralytics YOLOv5, reflecting the collaborative spirit of the AI research community.
Introduction to YOLOv9
In the quest for optimal real-time object detection, YOLOv9 stands out with its innovative approach to overcoming information loss challenges inherent in deep neural networks. By integrating PGI and the versatile GELAN architecture, YOLOv9 not only enhances the model's learning capacity but also ensures the retention of crucial information throughout the detection process, thereby achieving exceptional accuracy and performance.
YOLOv9's Core Innovations
YOLOv9's advances are rooted in addressing the challenges posed by information loss in deep neural networks. The Information Bottleneck Principle and the use of Reversible Functions are central to its design, and they are what allow YOLOv9 to maintain high efficiency and accuracy.
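Information Bottleneck Principle
The Information Bottleneck Principle reveals a fundamental challenge in deep learning: as data passes through successive layers of a network, the potential for information loss increases. This phenomenon is represented mathematically as:

$$
I(X, X) \geq I(X, f_\theta(X)) \geq I(X, g_\phi(f_\theta(X)))
$$

where $I$ denotes mutual information, and $f$ and $g$ are transformation functions with parameters $\theta$ and $\phi$, respectively. YOLOv9 counters this challenge by implementing Programmable Gradient Information (PGI). PGI helps preserve essential data across the network's depth, which yields more reliable gradients and, as a result, better model convergence and performance.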
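The inequality behind the information bottleneck, I(X, X) >= I(X, f(X)) >= I(X, g(f(X))), can be checked numerically on a toy example. The sketch below is only an illustration: the functions `f` and `g` are hypothetical lossy "layers" invented for this example, not anything taken from YOLOv9, and mutual information is estimated from a small discrete sample.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits, estimated from paired samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y) written in terms of raw counts
        mi += p_xy * math.log2(count * n / (px[x] * py[y]))
    return mi


def f(x):
    """Hypothetical lossy layer: merges neighbouring values (8 -> 4 outputs)."""
    return x // 2


def g(z):
    """Second hypothetical lossy layer (4 -> 2 outputs)."""
    return z // 2


# X takes 8 equally likely values, i.e. it carries 3 bits of information.
X = list(range(8))
fX = [f(x) for x in X]
gfX = [g(z) for z in fX]

i_xx = mutual_information(X, X)    # information X carries about itself
i_xf = mutual_information(X, fX)   # what survives one lossy layer
i_xg = mutual_information(X, gfX)  # what survives two lossy layers

print(i_xx, i_xf, i_xg)
```

Each lossy step halves the number of distinguishable values, so one bit about the input is lost per layer in this toy setting. Real networks lose information less cleanly, but the ordering of the three quantities is the same.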
Reversible Functions
The concept of Reversible Functions is another cornerstone of YOLOv9's design. A function is deemed reversible if it can be inverted without any loss of information, as expressed by:

$$
X = v_\zeta(r_\psi(X))
$$

where $\psi$ and $\zeta$ are the parameters of the reversible function $r$ and its inverse $v$, respectively. This property is crucial for deep learning architectures because it lets the network retain a complete information flow, which enables more accurate updates to the model's parameters. YOLOv9 incorporates reversible functions within its architecture to mitigate the risk of information degradation, especially in deeper layers, so that data critical for object detection is preserved.
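To make the reversibility property X = v(r(X)) concrete, here is a minimal sketch using additive coupling, a well-known way to build an exactly invertible transformation. This is an assumption chosen for illustration; the text does not state which reversible construction YOLOv9 uses, and `shift_fn` is a hypothetical stand-in for a learned sub-network.

```python
def r(x1, x2, fn):
    """Forward map: keep x1 unchanged, shift x2 by a function of x1."""
    return x1, x2 + fn(x1)


def v(y1, y2, fn):
    """Inverse map: subtract the same shift, recovering the original input."""
    return y1, y2 - fn(y1)


def shift_fn(x):
    """Hypothetical stand-in for a learned sub-network (any function works)."""
    return 3 * x * x + 1


x = (2, 5)
y = r(*x, shift_fn)      # transformed representation
x_rec = v(*y, shift_fn)  # exact reconstruction of the input

print(y, x_rec)
```

Note that `shift_fn` itself does not need to be invertible. The inverse only has to recompute the same shift from the untouched half of the input, which is why no information is lost.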
軜éã¢ãã«ãžã®åœ±é¿
æ å ±æ倱ãžã®å¯ŸåŠã¯ããã©ã¡ãŒã¿åãäžååã§ãã£ãŒããã©ã¯ãŒãåŠçäžã«éèŠãªæ å ±ã倱ããã¡ãªè»œéã¢ãã«ã«ãšã£ãŠç¹ã«éèŠã§ããYOLOv9ã®ã¢ãŒããã¯ãã£ãŒã¯ãPGIãšå¯éé¢æ°ã®äœ¿çšã«ãããç°¡çŽ åãããã¢ãã«ã§ãã£ãŠããæ£ç¢ºãªç©äœæ€åºã«å¿ èŠãªå¿ é æ å ±ãä¿æãããå¹æçã«å©çšãããããšãä¿èšŒããŸãã
Programmable Gradient Information (PGI)
PGI is a novel concept introduced in YOLOv9 to combat the information bottleneck problem by preserving essential data across deep network layers. This allows reliable gradients to be generated, which makes model updates more accurate and improves overall detection performance.
Generalized Efficient Layer Aggregation Network (GELAN)
GELAN is a strategic architectural advance that enables YOLOv9 to achieve superior parameter utilization and computational efficiency. Its design allows flexible integration of various computational blocks, so YOLOv9 can adapt to a wide range of applications without sacrificing speed or accuracy.
YOLOv9 Benchmarks
Benchmarking YOLOv9 with Ultralytics means evaluating how your trained and validated model performs in real-world scenarios. The process includes:
- Performance Evaluation: Assessing the model's speed and accuracy.
- Export Formats: Testing the model across different export formats to confirm that it meets the necessary standards and performs well in various environments.
- Framework Support: Using the comprehensive framework within Ultralytics YOLOv8 to facilitate these assessments and ensure consistent, reliable results.
Benchmarking lets you confirm that your model not only performs well in controlled testing environments but also maintains high performance in practical, real-world applications.
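The speed half of a performance evaluation can be sketched with a small timing harness. This is a generic illustration using only the Python standard library, not the Ultralytics benchmarking tool; `measure_latency` and `dummy_predict` are hypothetical names, and the dummy function stands in for a real model call.

```python
import statistics
import time


def measure_latency(predict, sample, warmup=3, runs=20):
    """Return per-call latency statistics in milliseconds for predict(sample)."""
    for _ in range(warmup):
        predict(sample)  # warm-up calls are executed but not timed
    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": runs,
        "median_ms": statistics.median(timings_ms),
        "mean_ms": statistics.fmean(timings_ms),
        "max_ms": max(timings_ms),
    }


def dummy_predict(sample):
    """Hypothetical stand-in for a model call; does a little arithmetic."""
    return sum(value * value for value in sample)


stats = measure_latency(dummy_predict, list(range(1000)))
print(stats)
```

In practice `predict` could be any callable, for example a loaded model invoked on an image path as in the usage example on this page. Reporting the median alongside the mean keeps a few slow outlier calls from dominating the result.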
Performance on MS COCO Dataset
The performance of YOLOv9 on the COCO dataset exemplifies its significant advancements in real-time object detection, setting new benchmarks across various model sizes. Table 1 presents a comprehensive comparison of state-of-the-art real-time object detectors, illustrating YOLOv9's superior efficiency and accuracy.
Table 1. Comparison of State-of-the-Art Real-Time Object Detectors
Performance
Model | size (pixels) | mAPval 50-95 | mAPval 50 | params (M) | FLOPs (B) |
---|---|---|---|---|---|
YOLOv9t | 640 | 38.3 | 53.1 | 2.0 | 7.7 |
YOLOv9s | 640 | 46.8 | 63.4 | 7.2 | 26.7 |
YOLOv9m | 640 | 51.4 | 68.1 | 20.1 | 76.8 |
YOLOv9c | 640 | 53.0 | 70.2 | 25.5 | 102.8 |
YOLOv9e | 640 | 55.6 | 72.8 | 58.1 | 192.5 |

Model | size (pixels) | mAPbox 50-95 | mAPmask 50-95 | params (M) | FLOPs (B) |
---|---|---|---|---|---|
YOLOv9c-seg | 640 | 52.4 | 42.2 | 27.9 | 159.4 |
YOLOv9e-seg | 640 | 55.1 | 44.3 | 60.5 | 248.4 |
The iterations of YOLOv9, ranging from the tiny t variant to the extensive e model, demonstrate improvements not only in accuracy (mAP metrics) but also in efficiency, with a reduced number of parameters and lower computational needs (FLOPs). This table underscores YOLOv9's ability to deliver high precision while maintaining or reducing computational overhead compared to prior versions and competing models.
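One practical way to read the detection rows of Table 1 is to pick the most accurate variant that fits a given parameter or FLOPs budget. The sketch below copies the detection figures from the table into a small lookup; the `Variant` type and the `best_under_budget` helper are hypothetical names introduced for this illustration.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    size_px: int
    map_50_95: float
    map_50: float
    params_m: float
    flops_b: float


# Detection rows copied from Table 1.
DETECTION = {
    "YOLOv9t": Variant(640, 38.3, 53.1, 2.0, 7.7),
    "YOLOv9s": Variant(640, 46.8, 63.4, 7.2, 26.7),
    "YOLOv9m": Variant(640, 51.4, 68.1, 20.1, 76.8),
    "YOLOv9c": Variant(640, 53.0, 70.2, 25.5, 102.8),
    "YOLOv9e": Variant(640, 55.6, 72.8, 58.1, 192.5),
}


def best_under_budget(
    max_params_m: Optional[float] = None,
    max_flops_b: Optional[float] = None,
) -> Optional[str]:
    """Name of the highest-mAP variant within the budget, or None if none fit."""
    fits = [
        name
        for name, v in DETECTION.items()
        if (max_params_m is None or v.params_m <= max_params_m)
        and (max_flops_b is None or v.flops_b <= max_flops_b)
    ]
    if not fits:
        return None
    return max(fits, key=lambda name: DETECTION[name].map_50_95)


choice = best_under_budget(max_params_m=25.0)
print(choice)
```

Because accuracy rises with model size in this table, the helper simply returns the largest variant that still fits. Keeping the selection explicit makes it easy to add further constraints, such as a latency measurement, later on.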
In comparison, YOLOv9 exhibits remarkable gains:
- Lightweight Models: YOLOv9s surpasses YOLO MS-S in parameter efficiency and computational load while improving AP by 0.4 to 0.6%.
- Medium to Large Models: YOLOv9m and YOLOv9e show notable advances in balancing model complexity against detection performance, offering significant reductions in parameters and computation alongside improved accuracy.
The YOLOv9c model in particular highlights the effectiveness of the architecture's optimizations. It operates with 42% fewer parameters and 21% less computation than YOLOv7 AF while achieving comparable accuracy, demonstrating YOLOv9's significant efficiency gains. Furthermore, the YOLOv9e model sets a new standard for large models, with 15% fewer parameters and 25% less computation than YOLOv8x, together with a 1.7% improvement in AP.
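The percentage claims above follow the usual relative-reduction arithmetic, reduction = 1 - new / old. As a rough sanity check, the sketch below inverts that formula to see what baseline sizes the quoted percentages imply for YOLOv9c. The helper names are hypothetical, and the implied baselines are only approximations: the percentages are rounded, and the table values may not be exactly the ones used in the original comparison.

```python
def relative_reduction(new, old):
    """Fractional reduction of `new` relative to `old` (0.42 means 42% fewer)."""
    return 1.0 - new / old


def implied_baseline(new, reduction):
    """Baseline value implied by `new` and a stated fractional reduction."""
    return new / (1.0 - reduction)


# YOLOv9c figures from Table 1: 25.5 M parameters, 102.8 B FLOPs.
baseline_params_m = implied_baseline(25.5, 0.42)  # roughly 44.0 M
baseline_flops_b = implied_baseline(102.8, 0.21)  # roughly 130.1 B

print(round(baseline_params_m, 1), round(baseline_flops_b, 1))
```

These implied values are derived from the text, not reported measurements, so they should be read as a consistency check on the percentages and not as published figures for the baseline model.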
These results showcase YOLOv9's strategic advancements in model design, emphasizing its enhanced efficiency without compromising on the precision essential for real-time object detection tasks. The model not only pushes the boundaries of performance metrics but also emphasizes the importance of computational efficiency, making it a pivotal development in the field of computer vision.
Conclusion
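YOLOv9 represents a pivotal development in real-time object detection, offering significant improvements in efficiency, accuracy, and adaptability. By addressing critical challenges through innovative solutions such as PGI and GELAN, YOLOv9 sets a new precedent for future research and applications in the field. As the AI community continues to evolve, YOLOv9 stands as a testament to the power of collaboration and innovation in driving technological progress.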
䜿çšäŸ
ãã®äŸã§ã¯ãç°¡åãªYOLOv9ã®ãã¬ãŒãã³ã°ãšæšè«ã®äŸãæäŸããŸãããããã®ã¢ãŒããä»ã®ã¢ãŒãã«é¢ããå®å šãªããã¥ã¡ã³ãã¯ãPredict,Train,ValandExportdocs ããŒãžãåç §ããŠãã ããã
äŸ
PyTorch pretrained *.pt
ã¢ãã«ããã³æ§æ *.yaml
ãã¡ã€ã«ã«æž¡ãããšãã§ããã YOLO()
ã¯ã©ã¹ã䜿çšããŠãpython ã«ã¢ãã«ã®ã€ã³ã¹ã¿ã³ã¹ãäœæããŸãïŒ
```python
from ultralytics import YOLO

# Build a YOLOv9c model from scratch
model = YOLO("yolov9c.yaml")

# Build a YOLOv9c model from pretrained weights
model = YOLO("yolov9c.pt")

# Display model information (optional)
model.info()

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

# Run inference with the YOLOv9c model on the 'bus.jpg' image
results = model("path/to/bus.jpg")
```
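The models can also be run directly from the CLI, for example with `yolo train model=yolov9c.pt data=coco8.yaml epochs=100 imgsz=640`.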
Supported Tasks and Modes
The YOLOv9 series offers a range of models, each optimized for high-performance object detection. These models cater to varying computational needs and accuracy requirements, which makes them suitable for a wide array of applications.
Model | Filenames | Tasks | Inference | Validation | Training | Export |
---|---|---|---|---|---|---|
YOLOv9 | yolov9t.pt yolov9s.pt yolov9m.pt yolov9c.pt yolov9e.pt | Object Detection | ✅ | ✅ | ✅ | ✅ |
YOLOv9-seg | yolov9c-seg.pt yolov9e-seg.pt | Instance Segmentation | ✅ | ✅ | ✅ | ✅ |
This table gives an overview of the YOLOv9 model variants, showing the tasks each one handles and its compatibility with the Inference, Validation, Training, and Export modes. This comprehensive support lets users take full advantage of YOLOv9 models across a broad range of object detection scenarios.
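The variant table can also be kept as a small lookup so that scripts can resolve which task a given weights file belongs to. This is a hedged sketch: the task keys `"detect"` and `"segment"`, the helper `task_for`, and the `.pt` suffix on the t, s, and m filenames (assumed by analogy with the other entries) are choices made for this illustration.

```python
# Modes listed as supported for every variant in the table.
MODES = ("inference", "validation", "training", "export")

# Weights filenames grouped by task (keys are illustrative labels).
VARIANTS = {
    "detect": [
        "yolov9t.pt",
        "yolov9s.pt",
        "yolov9m.pt",
        "yolov9c.pt",
        "yolov9e.pt",
    ],
    "segment": ["yolov9c-seg.pt", "yolov9e-seg.pt"],
}


def task_for(filename):
    """Return the task label for a known weights filename."""
    for task, filenames in VARIANTS.items():
        if filename in filenames:
            return task
    raise ValueError(f"unknown YOLOv9 weights file: {filename!r}")


print(task_for("yolov9c.pt"), task_for("yolov9e-seg.pt"))
```

Raising an error for unknown filenames, instead of guessing from the name, keeps a typo such as a missing `-seg` suffix from silently selecting the wrong task.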
泚
Training YOLOv9 models requires more resources and takes longer than training the equivalently sized YOLOv8 model.
Citations and Acknowledgments
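We would like to acknowledge the YOLOv9 authors for their significant contributions to the field of real-time object detection.
The original YOLOv9 paper can be found on arXiv. The authors have made their work publicly available, and the codebase can be accessed on GitHub. We appreciate their efforts in advancing the field and making their work accessible to the broader community.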
FAQ
What innovations does YOLOv9 introduce for real-time object detection?
YOLOv9 introduces Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). These innovations address information loss in deep neural networks and deliver high efficiency, accuracy, and adaptability. PGI preserves essential data across network layers, while GELAN optimizes parameter utilization and computational efficiency. For details, see the section on YOLOv9's core innovations, which set new benchmarks on the MS COCO dataset.
How does YOLOv9 perform on the MS COCO dataset compared to other models?
YOLOv9 outperforms state-of-the-art real-time object detectors by achieving higher accuracy and efficiency. On the COCO dataset, YOLOv9 models show superior mAP scores across various sizes while maintaining or reducing computational overhead. For instance, YOLOv9c achieves comparable accuracy with 42% fewer parameters and 21% less computation than YOLOv7 AF. For detailed metrics, see the performance comparison section.
How can I train a YOLOv9 model using Python and CLI?
You can train a YOLOv9 model using both Python and CLI commands. In Python, instantiate a model with the YOLO class and call its train method:
```python
from ultralytics import YOLO

# Build a YOLOv9c model from pretrained weights and train
model = YOLO("yolov9c.pt")
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)
```
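For CLI training, run a command such as `yolo train model=yolov9c.pt data=coco8.yaml epochs=100 imgsz=640`.
See the usage examples section for more on training and inference.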
What are the advantages of using Ultralytics YOLOv9 for lightweight models?
YOLOv9 is designed to mitigate information loss, which is particularly important for lightweight models that are prone to losing significant information. By integrating Programmable Gradient Information (PGI) and reversible functions, YOLOv9 retains essential data and improves the model's accuracy and efficiency. This makes it well suited to applications that need compact, high-performance models. For more details, see the section on YOLOv9's impact on lightweight models.
What tasks and modes does YOLOv9 support?
YOLOv9 supports various tasks including object detection and instance segmentation. It is compatible with multiple operational modes such as inference, validation, training, and export. This versatility makes YOLOv9 adaptable to diverse real-time computer vision applications. Refer to the supported tasks and modes section for more information.