COCO-Pose Dataset
The COCO-Pose dataset is a specialized version of the COCO (Common Objects in Context) dataset, designed for pose estimation tasks. It uses the COCO Keypoints 2017 images and labels to train models such as YOLO for pose estimation.
COCO-Pose Pretrained Models
Model | size (pixels) | mAPpose 50-95 | mAPpose 50 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs (B) |
---|---|---|---|---|---|---|---|
YOLO11n-pose | 640 | 50.0 | 81.0 | 52.4 ± 0.5 | 1.7 ± 0.0 | 2.9 | 7.6 |
YOLO11s-pose | 640 | 58.9 | 86.3 | 90.5 ± 0.6 | 2.6 ± 0.0 | 9.9 | 23.2 |
YOLO11m-pose | 640 | 64.9 | 89.4 | 187.3 ± 0.8 | 4.9 ± 0.1 | 20.9 | 71.7 |
YOLO11l-pose | 640 | 66.1 | 89.9 | 247.7 ± 1.1 | 6.4 ± 0.1 | 26.2 | 90.7 |
YOLO11x-pose | 640 | 69.5 | 91.1 | 488.0 ± 13.9 | 12.1 ± 0.2 | 58.8 | 203.3 |
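The table can be read as an accuracy/latency trade-off. As a minimal sketch, the snippet below picks the most accurate model whose mean T4 TensorRT latency fits a given budget, using the figures from the table. The helper name `pick_model` and the latency budgets are illustrative assumptions and are not part of the Ultralytics package.

```python
# Figures copied from the table above:
# (model, mAPpose 50-95, mean T4 TensorRT10 latency in ms).
MODELS = [
    ("YOLO11n-pose", 50.0, 1.7),
    ("YOLO11s-pose", 58.9, 2.6),
    ("YOLO11m-pose", 64.9, 4.9),
    ("YOLO11l-pose", 66.1, 6.4),
    ("YOLO11x-pose", 69.5, 12.1),
]


def pick_model(max_latency_ms):
    """Return the most accurate model within the latency budget, or None if none fits.

    `pick_model` is a hypothetical helper written for this example.
    """
    candidates = [m for m in MODELS if m[2] <= max_latency_ms]
    if not candidates:
        return None
    return max(candidates, key=lambda m: m[1])[0]


if __name__ == "__main__":
    print(pick_model(5.0))  # most accurate model at or under 5 ms per image
```

Measured latency depends on hardware and export settings, so the table values are best treated as a starting point for your own benchmarks.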
Key Features
- COCO-Pose builds on the COCO Keypoints 2017 dataset, which contains 200K images labeled with keypoints for pose estimation tasks.
- The dataset supports 17 keypoints for human figures, enabling detailed pose estimation.
- Like COCO, it provides standardized evaluation metrics, including Object Keypoint Similarity (OKS) for pose estimation tasks, making it suitable for comparing model performance.
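To make the OKS metric concrete, here is a minimal sketch of how it can be computed for one predicted person against one ground-truth person. It follows the COCO keypoint evaluation formula as commonly implemented in pycocotools, where each keypoint contributes exp(-d² / (2 · s² · k²)) with k = 2σ. The function name `oks` and its argument layout are assumptions made for this example, and returning 0.0 when no keypoint is labeled is a simplification.

```python
import math

# Per-keypoint sigma constants used by the COCO keypoint evaluation, in COCO order:
# nose, eyes, ears, shoulders, elbows, wrists, hips, knees, ankles.
COCO_SIGMAS = [0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
               0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089]


def oks(pred, gt, area):
    """Object Keypoint Similarity between one prediction and one ground-truth person.

    pred: list of 17 (x, y) predicted keypoints in pixels.
    gt:   list of 17 (x, y, v) ground-truth keypoints; v > 0 means the keypoint is labeled.
    area: ground-truth object area in square pixels (the scale term s^2).
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy, v), sigma in zip(pred, gt, COCO_SIGMAS):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2  # squared pixel distance
        k2 = (2 * sigma) ** 2                 # per-keypoint tolerance
        total += math.exp(-d2 / (2 * area * k2))
        labeled += 1
    # Simplification for this sketch: no labeled keypoints yields 0.0.
    return total / labeled if labeled else 0.0
```

OKS plays the role that IoU plays in object detection: mAPpose 50 counts a prediction as correct at an OKS threshold of 0.50, while mAPpose 50-95 averages over thresholds from 0.50 to 0.95 in steps of 0.05.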
Dataset Structure
The COCO-Pose dataset is split into three subsets:
- Train2017: This subset contains 56599 images from the COCO dataset, annotated for training pose estimation models.
- Val2017: This subset has 2346 images used for validation purposes during model training.
- Test2017: This subset consists of images used for testing and benchmarking trained models. Ground-truth annotations for this subset are not publicly available; results are submitted to the COCO evaluation server for performance evaluation.
Applications
The COCO-Pose dataset is used to train and evaluate deep learning models for keypoint detection and pose estimation, such as OpenPose. Its large number of annotated images and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners focused on pose estimation.
Dataset YAML
A YAML (YAML Ain't Markup Language) file defines the dataset configuration. It contains the dataset's paths, classes, and other relevant information. For the COCO-Pose dataset, the coco-pose.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml.
ultralytics/cfg/datasets/coco-pose.yaml
# Ultralytics YOLO 🚀, AGPL-3.0 license
# COCO 2017 Keypoints dataset https://cocodataset.org by Microsoft
# Documentation: https://docs.ultralytics.com/datasets/pose/coco/
# Example usage: yolo train data=coco-pose.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco-pose  ← downloads here (20.1 GB)
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco-pose # dataset root dir
train: train2017.txt # train images (relative to 'path') 56599 images
val: val2017.txt # val images (relative to 'path') 2346 images
test: test-dev2017.txt # 20288 of 40670 images, submit to https://codalab.lisn.upsaclay.fr/competitions/7403
# Keypoints
kpt_shape: [17, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
# Classes
names:
  0: person
# Download script/URL (optional)
download: |
  from ultralytics.utils.downloads import download
  from pathlib import Path
  # Download labels
  dir = Path(yaml['path'])  # dataset root dir
  url = 'https://github.com/ultralytics/assets/releases/download/v0.0.0/'
  urls = [url + 'coco2017labels-pose.zip']  # labels
  download(urls, dir=dir.parent)
  # Download data
  urls = ['http://images.cocodataset.org/zips/train2017.zip',  # 19G, 118k images
          'http://images.cocodataset.org/zips/val2017.zip',  # 1G, 5k images
          'http://images.cocodataset.org/zips/test2017.zip']  # 7G, 41k images (optional)
  download(urls, dir=dir / 'images', threads=3)
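The `kpt_shape` and `flip_idx` entries in the configuration are easier to follow with a small example. The sketch below assumes the standard COCO keypoint order and normalized (x, y, visible) triples. It shows why a horizontal flip needs `flip_idx`: mirroring the image turns the left eye into what looks like a right eye, so left and right slots must be swapped as well. The function name `flip_keypoints_horizontally` is hypothetical and is not the Ultralytics augmentation code.

```python
# COCO keypoint order (17 keypoints), matching kpt_shape: [17, 3].
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]
# Copied from coco-pose.yaml: the index each slot takes its value from after a flip.
FLIP_IDX = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]


def flip_keypoints_horizontally(keypoints):
    """Mirror normalized (x, y, visible) keypoints left-to-right.

    Two things change: x becomes 1 - x, and left/right keypoints swap slots,
    so slot 1 ("left_eye") holds the point that looks like a left eye
    in the mirrored image.
    """
    mirrored = [(1.0 - x, y, v) for x, y, v in keypoints]
    return [mirrored[i] for i in FLIP_IDX]
```

Because `flip_idx` only swaps pairs, applying it twice restores the original order. Custom pose datasets with a different keypoint layout need their own `flip_idx`.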
䜿çšæ¹æ³
COCO-PoseããŒã¿ã»ããã§YOLO11n-poseã¢ãã«ãç»åãµã€ãº640ã§100ãšããã¯åŠç¿ãããã«ã¯ã以äžã®ã³ãŒãã¹ããããã䜿çšããŸããå©çšå¯èœãªåŒæ°ã®å æ¬çãªãªã¹ãã«ã€ããŠã¯ãã¢ãã«ã®ãã¬ãŒãã³ã°ããŒãžãåç §ããŠãã ããã
åè»ã®äŸ
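The training run described above can be sketched as follows. The Python path uses the Ultralytics `YOLO` class and its `train` method. The wrapper names `train` and `cli_command` are illustrative helpers introduced for this example, and the `ultralytics` import is deferred so the command-building helper works without the package installed.

```python
# Training settings stated in the text: COCO-Pose config, 100 epochs, image size 640.
TRAIN_ARGS = {"data": "coco-pose.yaml", "epochs": 100, "imgsz": 640}
WEIGHTS = "yolo11n-pose.pt"  # pretrained YOLO11n-pose checkpoint


def cli_command(weights=WEIGHTS, **overrides):
    """Build the equivalent `yolo` CLI command as a string (hypothetical helper)."""
    args = {**TRAIN_ARGS, "model": weights, **overrides}
    return "yolo pose train " + " ".join(f"{k}={v}" for k, v in args.items())


def train(weights=WEIGHTS, **overrides):
    """Train with the Python API (requires `pip install ultralytics`)."""
    from ultralytics import YOLO  # imported lazily so cli_command() works without it

    model = YOLO(weights)  # load pretrained weights (recommended for training)
    return model.train(**{**TRAIN_ARGS, **overrides})


if __name__ == "__main__":
    print(cli_command())
    # train()  # uncomment to start training; the first run downloads the dataset (about 20 GB)
```

Passing keyword overrides, for example `train(epochs=10)`, is a convenient way to run a short smoke test before committing to the full schedule.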
Sample Images and Annotations
The COCO-Pose dataset contains a diverse set of images of human figures annotated with keypoints. Here is an example of images from the dataset with their corresponding annotations:
- Mosaiced image: This image shows a training batch composed of mosaiced dataset images. Mosaicing is a training-time technique that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps the model generalize to different object sizes, aspect ratios, and contexts.
The example illustrates the variety and complexity of the images in the COCO-Pose dataset and the benefit of using mosaicing during training.
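The core idea of mosaicing can be shown with a deliberately simplified sketch: four equally sized images are tiled into a 2x2 canvas, and each image's keypoints are shifted by the offset of its tile. This is an assumption-laden illustration, not the Ultralytics augmentation. Real training pipelines typically also randomize placement, scaling, and cropping, and `mosaic_2x2` is a hypothetical name.

```python
def mosaic_2x2(images, keypoints):
    """Tile four equally sized images into one 2x2 canvas and shift their keypoints.

    images:    four images, each a list of rows of pixel values (same height and width).
    keypoints: four lists of (x, y) pixel coordinates, one list per image.
    Returns (canvas, shifted_keypoints).
    """
    if len(images) != 4 or len(keypoints) != 4:
        raise ValueError("expected exactly four images and four keypoint lists")
    h, w = len(images[0]), len(images[0][0])
    # Concatenate rows side by side, then stack the two halves vertically.
    top = [images[0][r] + images[1][r] for r in range(h)]
    bottom = [images[2][r] + images[3][r] for r in range(h)]
    canvas = top + bottom
    offsets = [(0, 0), (w, 0), (0, h), (w, h)]  # (dx, dy) of each tile on the canvas
    shifted = [(x + dx, y + dy)
               for kpts, (dx, dy) in zip(keypoints, offsets)
               for x, y in kpts]
    return canvas, shifted
```

The important detail is that annotations move together with their pixels: any geometric augmentation applied to an image must be applied to its keypoints as well.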
Citations and Acknowledgments
If you use the COCO-Pose dataset in your research or development work, please cite the following paper:
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO-Pose dataset and its creators, visit the COCO dataset website.
FAQ
What is the COCO-Pose dataset and how is it used with Ultralytics YOLO for pose estimation?
The COCO-Pose dataset is a specialized version of the COCO (Common Objects in Context) dataset designed for pose estimation tasks. It builds on the COCO Keypoints 2017 images and annotations, allowing models such as Ultralytics YOLO to be trained for detailed pose estimation. For instance, you can train a YOLO11n-pose model on COCO-Pose by loading a pretrained model and training it with the YAML configuration. For training examples, see the Training documentation.
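How can I train a YOLO11 model on the COCO-Pose dataset?
You can train a YOLO11 model on the COCO-Pose dataset with either Python or CLI commands. For example, to train a YOLO11n-pose model for 100 epochs with an image size of 640, load the pretrained yolo11n-pose.pt weights and train with data=coco-pose.yaml, epochs=100, and imgsz=640. From the CLI, the equivalent command is: yolo pose train data=coco-pose.yaml model=yolo11n-pose.pt epochs=100 imgsz=640
For more details on the training process and available arguments, see the Training page.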
What metrics does the COCO-Pose dataset provide for evaluating model performance?
Like the original COCO dataset, COCO-Pose provides several standardized evaluation metrics for pose estimation tasks. The key metric is Object Keypoint Similarity (OKS), which scores predicted keypoints against ground-truth annotations. These metrics allow thorough performance comparisons between models. For example, the COCO-Pose pretrained models such as YOLO11n-pose and YOLO11s-pose have their mAPpose 50-95 and mAPpose 50 scores listed in the documentation.
How is the COCO-Pose dataset structured and split?
The COCO-Pose dataset is split into three subsets:
- Train2017: Contains 56599 COCO images, annotated for training pose estimation models.
- Val2017: 2346 images for validation purposes during model training.
- Test2017: Images used for testing and benchmarking trained models. Ground-truth annotations for this subset are not publicly available; results are submitted to the COCO evaluation server for performance evaluation.
These subsets help organize the training, validation, and testing phases. For complete configuration details, see the coco-pose.yaml file on GitHub.
What are the key features and applications of the COCO-Pose dataset?
The COCO-Pose dataset extends the COCO Keypoints 2017 annotations to include 17 keypoints for human figures, enabling detailed pose estimation. Standardized evaluation metrics (e.g., OKS) make it easy to compare models. Applications of COCO-Pose span domains that need detailed human pose estimation, such as sports analytics, healthcare, and human-computer interaction. In practice, using the pretrained models provided in the documentation (e.g., YOLO11n-pose) can significantly streamline the process (see Key Features).
If you use the COCO-Pose dataset in your research or development work, please cite the paper using the BibTeX entry in the Citations and Acknowledgments section.