COCO-Pose Dataset

The COCO-Pose dataset adapts COCO (Common Objects in Context) for pose estimation. It comprises 58,945 images from COCO Keypoints 2017, with 156,165 people annotated using a 17-keypoint schema. It is the standard set for training and benchmarking keypoint models such as Ultralytics YOLO26, and the 8-image COCO8-Pose subset mirrors its format for quick sanity checks.

Figure: COCO pose estimation with human body keypoints

COCO-Pose Pretrained Models

Model	Size (pixels)	mAP pose 50-95 (e2e)	mAP pose 50 (e2e)	Speed CPU ONNX (ms)	Speed T4 TensorRT10 (ms)	Params (M)	FLOPs (B)
YOLO26n-pose	640	57.2	83.3	40.3 ± 0.5	1.8 ± 0.0	2.9	7.5
YOLO26s-pose	640	63.0	86.6	85.3 ± 0.9	2.7 ± 0.0	10.4	23.9
YOLO26m-pose	640	68.8	89.6	218.0 ± 1.5	5.0 ± 0.1	21.5	73.1
YOLO26l-pose	640	70.4	90.5	275.4 ± 2.4	6.5 ± 0.1	25.9	91.3
YOLO26x-pose	640	71.6	91.6	565.4 ± 3.0	12.2 ± 0.2	57.6	201.7

Key Features

COCO-Pose builds on the COCO Keypoints 2017 challenge, which labels 1,710,498 individual keypoints across 156,165 annotated person instances.
Each person annotation uses 17 keypoints (nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles), stored as (x, y, visibility) triplets.
Like COCO, it provides standardized evaluation metrics for pose estimation, including Object Keypoint Similarity (OKS), which makes it suitable for comparing model performance.
Download size: about 20.2 GB on first use (train2017.zip + val2017.zip + labels). The 7 GB test2017.zip is not fetched automatically, because those images ship without ground truth and are needed only for test-dev2017 submissions.
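
To illustrate the (x, y, visibility) triplets, here is a minimal sketch that parses one pose label line. It assumes the usual Ultralytics pose label layout (class index, normalized box center and size, then one triplet per keypoint) and the COCO visibility convention (0 = not labeled, 1 = labeled but occluded, 2 = labeled and visible). `parse_pose_label` is a hypothetical helper and the sample line is synthetic, not taken from the dataset.

```python
def parse_pose_label(line, num_kpts=17):
    """Split one label line into (class, box, keypoints).

    Assumed layout: class cx cy w h, then num_kpts * (x, y, visibility).
    """
    values = line.split()
    expected = 5 + num_kpts * 3
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")

    cls = int(values[0])
    box = tuple(float(v) for v in values[1:5])  # normalized cx, cy, w, h
    flat = [float(v) for v in values[5:]]
    keypoints = [
        (flat[i], flat[i + 1], int(flat[i + 2]))  # (x, y, visibility)
        for i in range(0, len(flat), 3)
    ]
    return cls, box, keypoints


# Synthetic example: a visible nose, 15 occluded points, one unlabeled ankle.
triplets = ["0.5 0.25 2"] + ["0.5 0.5 1"] * 15 + ["0 0 0"]
sample = "0 0.5 0.5 0.4 0.8 " + " ".join(triplets)

cls, box, keypoints = parse_pose_label(sample)
labeled = [kp for kp in keypoints if kp[2] > 0]
print(cls, box, len(keypoints), len(labeled))
```

Keeping the visibility flag as an integer makes it easy to filter out unlabeled keypoints before computing losses or metrics.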

Dataset Structure

For training and validation, COCO-Pose includes only the COCO 2017 images that contain people with keypoint annotations, so its labeled splits are smaller than the full COCO dataset. Its YAML defines three subsets:

Train2017: 56,599 images from the COCO dataset, annotated for training pose estimation models.
Val2017: 2,346 images used for validation during model training.
Test-dev2017: a 20,288-image subset of the full 40,670-image test2017 set, with ground truth kept private. The dataset YAML links this split to the COCO test-dev keypoints evaluation server.
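
The split sizes can be cross-checked against the totals quoted elsewhere on this page. The short sketch below uses only numbers stated here; reading 1,710,498 as the count of keypoints with visibility greater than zero is an assumption, so the resulting fraction is indicative only.

```python
# Split sizes stated in the dataset description.
train_images = 56_599
val_images = 2_346
labeled_images = train_images + val_images
assert labeled_images == 58_945  # matches the headline image count

# Keypoint totals from the Key Features list.
persons = 156_165
kpts_per_person = 17
labeled_keypoints = 1_710_498

# Every person annotation has 17 keypoint slots, but not all are labeled.
keypoint_slots = persons * kpts_per_person
labeled_fraction = labeled_keypoints / keypoint_slots  # assumption: visibility > 0

# Share of test2017 that forms the test-dev2017 split.
test_dev_fraction = 20_288 / 40_670

print(f"labeled images: {labeled_images}")
print(f"keypoint slots: {keypoint_slots}")
print(f"labeled keypoint fraction: {labeled_fraction:.3f}")
print(f"test-dev share of test2017: {test_dev_fraction:.3f}")
```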

Training at this scale is where Ultralytics Platform helps most: it manages compute for you, so you can launch and monitor runs without provisioning your own GPUs.

Applications

The COCO-Pose dataset is specifically used for training and evaluating deep learning models on keypoint detection and pose estimation. The dataset's large number of annotated images and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners working on human pose.

Dataset YAML

A YAML file defines the dataset configuration, including the dataset paths, classes, and other relevant information. For the COCO-Pose dataset, the coco-pose.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml.

ultralytics/cfg/datasets/coco-pose.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# COCO 2017 Keypoints dataset https://cocodataset.org by Microsoft
# Documentation: https://docs.ultralytics.com/datasets/pose/coco
# Example usage: yolo train data=coco-pose.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco-pose ← downloads here (20.2 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: coco-pose # dataset root dir
train: train2017.txt # train images (relative to 'path') 56599 images
val: val2017.txt # val images (relative to 'path') 2346 images
test: test-dev2017.txt # 20288 of 40670 images, submit to https://codalab.lisn.upsaclay.fr/competitions/7403

# Keypoints
kpt_shape: [17, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]

# Classes
names:
  0: person

# Keypoint names per class
kpt_names:
  0:
    - nose
    - left_eye
    - right_eye
    - left_ear
    - right_ear
    - left_shoulder
    - right_shoulder
    - left_elbow
    - right_elbow
    - left_wrist
    - right_wrist
    - left_hip
    - right_hip
    - left_knee
    - right_knee
    - left_ankle
    - right_ankle

# Download script/URL (optional)
download: |
  from pathlib import Path

  from ultralytics.utils import ASSETS_URL
  from ultralytics.utils.downloads import download

  # Download labels
  dir = Path(yaml["path"])  # dataset root dir

  urls = [f"{ASSETS_URL}/coco2017labels-pose.zip"]
  download(urls, dir=dir.parent)

  # Download data (test2017.zip excluded: ground truth is withheld, only used for the CodaLab test-dev split)
  urls = [
      "http://images.cocodataset.org/zips/train2017.zip",  # 19G, 118k images
      "http://images.cocodataset.org/zips/val2017.zip",  # 1G, 5k images
  ]
  download(urls, dir=dir / "images", threads=3)
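
The flip_idx entry in the YAML above tells augmentation code how to reorder keypoints when an image is mirrored horizontally, so that a point labeled left_shoulder before the flip is labeled right_shoulder afterward. A minimal sketch of that idea follows, assuming normalized x coordinates in [0, 1]; `hflip_keypoints` is a hypothetical helper and not the Ultralytics implementation.

```python
# Index mapping copied from coco-pose.yaml: nose stays, left/right pairs swap.
FLIP_IDX = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]


def hflip_keypoints(keypoints, flip_idx=FLIP_IDX):
    """Mirror (x, y, visibility) keypoints horizontally and swap left/right.

    Unlabeled keypoints (visibility 0) are left untouched so their
    placeholder coordinates do not turn into spurious positions.
    """
    mirrored = [
        (1.0 - x, y, v) if v > 0 else (x, y, v)
        for x, y, v in keypoints
    ]
    # Output slot i takes the mirrored point from its left/right counterpart.
    return [mirrored[i] for i in flip_idx]


# Synthetic pose: every point at the image center except the two eyes.
pose = [(0.5, 0.5, 2)] * 17
pose[1] = (0.75, 0.4, 2)  # left_eye
pose[2] = (0.5, 0.4, 2)   # right_eye

flipped = hflip_keypoints(pose)
print(flipped[1], flipped[2])
```

Because the mapping only swaps pairs, applying the flip twice returns the original keypoints, which is a convenient property to check in a unit test.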

Usage

To train a YOLO26n-pose model on the COCO-Pose dataset for 100 epochs at an image size of 640, you can use the following code snippet. For a full list of available arguments, see the model Training page.

Train Example

from ultralytics import YOLO

# Load a model
model = YOLO("yolo26n-pose.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="coco-pose.yaml", epochs=100, imgsz=640)
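
After training, the same weights can be scored on the val2017 split. The sketch below is hedged: it assumes the `metrics.pose` and `metrics.box` attributes exposed by current Ultralytics releases for pose models, and `validate_pose_model` is a hypothetical wrapper. Calling it triggers the dataset download on first use.

```python
def validate_pose_model(weights="yolo26n-pose.pt", data="coco-pose.yaml"):
    """Validate a pose model and return its headline mAP values."""
    # Imported here so that defining the function does not require
    # ultralytics to be installed or trigger any download.
    from ultralytics import YOLO

    model = YOLO(weights)
    metrics = model.val(data=data)  # downloads about 20.2 GB on first use

    # Attribute names below are assumed from the Ultralytics pose task docs.
    return {
        "pose_map50_95": metrics.pose.map,
        "pose_map50": metrics.pose.map50,
        "box_map50_95": metrics.box.map,
    }


# Example call (not executed here because it downloads the dataset):
# scores = validate_pose_model()
# print(scores)
```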

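Sample Images and Annotations

The COCO-Pose dataset contains a diverse set of images of people annotated with keypoints. Below is an example of dataset images together with their annotations.

Figure: Mosaiced training batch from the COCO pose estimation dataset

Mosaiced image: This image shows a training batch made up of mosaiced dataset images. Mosaicing is a training-time technique that combines several images into one, increasing the variety of objects and scenes within each training batch. This improves the model's ability to generalize across object sizes, aspect ratios, and contexts.

The example illustrates the variety and complexity of the images in the COCO-Pose dataset and the benefit of using mosaicing during training.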
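
To make the idea concrete, here is a deliberately simplified sketch of how normalized keypoints move when four images are tiled into a fixed 2x2 grid. Real mosaic augmentation typically randomizes the mosaic center and crops the result; this sketch does neither, and `to_mosaic_coords` is a hypothetical helper rather than any library's API.

```python
def to_mosaic_coords(keypoints, tile_index):
    """Map normalized keypoints from one tile into 2x2 mosaic coordinates.

    tile_index: 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right.
    Each tile occupies half the mosaic width and half its height.
    """
    if tile_index not in (0, 1, 2, 3):
        raise ValueError("tile_index must be 0, 1, 2, or 3")

    col, row = tile_index % 2, tile_index // 2
    return [
        # Shift by the tile's grid position, then shrink to half scale.
        ((x + col) / 2, (y + row) / 2, v) if v > 0 else (x, y, v)
        for x, y, v in keypoints
    ]


# A point at the center of the bottom-right tile lands in the mosaic's
# bottom-right quadrant; unlabeled points are passed through unchanged.
tile_kpts = [(0.5, 0.5, 2), (0.0, 0.0, 0)]
mosaic_kpts = to_mosaic_coords(tile_kpts, tile_index=3)
print(mosaic_kpts)
```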

Citations and Acknowledgments

If you use the COCO-Pose dataset in your research or development work, please cite the following paper.

Citation

@misc{lin2015microsoft,
      title={Microsoft COCO: Common Objects in Context},
      author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
      year={2015},
      eprint={1405.0312},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

We thank the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO-Pose dataset and its creators, visit the COCO dataset website.

Frequently Asked Questions (FAQ)

What is the COCO-Pose dataset, and how is it used for pose estimation with Ultralytics YOLO?

COCO-Pose provides the COCO Keypoints 2017 images and annotations converted to the YOLO keypoint format, using a 17-keypoint schema across 58,945 images. Point an Ultralytics YOLO pose model at it with data=coco-pose.yaml; the Training page documents every argument you can tune from there.

How do I train a YOLO26 model on the COCO-Pose dataset?

Load yolo26n-pose.pt and call model.train(data="coco-pose.yaml", epochs=100, imgsz=640). See the Train Example above for the complete snippet and the Training page for a comprehensive list of arguments.

What metrics does the COCO-Pose dataset provide for evaluating model performance?

Like the original COCO dataset, COCO-Pose provides several standardized evaluation metrics for pose estimation. The key metric is Object Keypoint Similarity (OKS), which measures how closely predicted keypoints match the ground-truth annotations. These metrics enable thorough performance comparisons between models. For example, the COCO-Pose pretrained models such as YOLO26n-pose and YOLO26s-pose are documented with specific performance figures such as mAP pose 50-95 and mAP pose 50.
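
For reference, OKS for one person is the average over labeled keypoints of exp(-d_i^2 / (2 s^2 k_i^2)), where d_i is the pixel distance between predicted and ground-truth keypoint i, s^2 is the object area, and k_i = 2 * sigma_i is a per-keypoint constant. The sketch below implements that formula in plain Python. The sigma values are quoted from memory of the COCO evaluation code and should be checked against pycocotools before being relied on; `oks` is a hypothetical helper.

```python
import math

# Per-keypoint sigmas in COCO keypoint order (assumed values, verify upstream).
COCO_SIGMAS = [
    0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
    0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089,
]


def oks(pred, gt, area, sigmas=COCO_SIGMAS):
    """Object Keypoint Similarity for a single person.

    pred: list of (x, y) predictions in pixels.
    gt:   list of (x, y, visibility) ground-truth keypoints in pixels.
    area: ground-truth object area in square pixels (s^2 in the formula).
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy, v), sigma in zip(pred, gt, sigmas):
        if v <= 0:  # unlabeled keypoints do not contribute
            continue
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k ** 2))
        labeled += 1
    return total / labeled if labeled else 0.0


# Synthetic check: a perfect prediction scores 1, a shifted one scores less.
gt_pose = [(100.0 + i, 200.0 + i, 2) for i in range(17)]
perfect = [(x, y) for x, y, _ in gt_pose]
shifted = [(x + 5.0, y) for x, y, _ in gt_pose]
print(oks(perfect, gt_pose, area=10_000.0), oks(shifted, gt_pose, area=10_000.0))
```

The mAP pose 50-95 figure then averages average precision over OKS thresholds from 0.50 to 0.95 in steps of 0.05, while mAP pose 50 uses the single 0.50 threshold.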

How is the COCO-Pose dataset structured and split?

COCO-Pose ships two labeled splits: 56,599 train2017 images and 2,346 val2017 images. A third split, test-dev2017 (20,288 of the full 40,670 test2017 images), keeps its ground truth private; the dataset YAML links it to the COCO test-dev keypoints evaluation server. See the Dataset Structure section, or the coco-pose.yaml file on GitHub for the exact split paths.

What are the key features and applications of the COCO-Pose dataset?

COCO-Pose uses 17 human body keypoints and inherits COCO's standardized metrics, including Object Keypoint Similarity (OKS), for comparing models. This combination suits human pose applications such as sports analytics, healthcare, and human-computer interaction. Pretrained YOLO26-pose weights are listed under COCO-Pose Pretrained Models.

For more on keypoint models, see the Pose Estimation task documentation.

Contributors: glenn-jocher (15), raimbekovm (3), RizwanMunawar (3), jk4e (2), Y-T-G (1), ambitious-octopus (1), MatthewNoyce (1), lunarifish (1)

Created 2023-11-12, updated 3 days ago