COCO-Seg Dataset

Name: COCO Segmentation Dataset
Creator: COCO Consortium
License: https://cocodataset.org/#termsofuse
Keywords: COCO-Seg, dataset, YOLO models, instance segmentation, object detection, COCO dataset, YOLO26, computer vision, Ultralytics, machine learning

The COCO-Seg dataset provides COCO (Common Objects in Context) instance segmentation masks in the Ultralytics YOLO label format: 118,287 training and 5,000 validation images with polygon masks across 80 object categories. It uses COCO's original images and native segmentation annotations, converted for YOLO training, which makes it a widely used resource for researchers and developers working on instance segmentation tasks.

COCO-Seg Pretrained Models

Model	Size (pixels)	mAP box 50-95 (e2e)	mAP mask 50-95 (e2e)	Speed CPU ONNX (ms)	Speed T4 TensorRT10 (ms)	Params (M)	FLOPs (B)
YOLO26n-seg	640	39.6	33.9	53.3 ± 0.5	2.1 ± 0.0	2.7	9.1
YOLO26s-seg	640	47.3	40.0	118.4 ± 0.9	3.3 ± 0.0	10.4	34.2
YOLO26m-seg	640	52.5	44.1	328.2 ± 2.4	6.7 ± 0.1	23.6	121.5
YOLO26l-seg	640	54.4	45.5	387.0 ± 3.7	8.0 ± 0.1	28.0	139.8
YOLO26x-seg	640	56.5	47.0	787.0 ± 6.8	16.4 ± 0.1	62.8	313.5

Key Features

COCO-Seg provides instance segmentation masks for 123,287 labeled COCO train2017/val2017 images (118,287 train + 5,000 val), out of COCO's broader ~330K-image release.
The dataset consists of the same 80 object categories found in the original COCO dataset.
Annotations provide instance segmentation masks in the YOLO polygon label format.
COCO-Seg is evaluated with the standardized mean Average Precision (mAP) and mean Average Recall (mAR) metrics, so instance segmentation results are directly comparable across models.
Download size: ~20.3 GB on first use (train2017.zip + val2017.zip + labels). The 7 GB test2017.zip is not fetched automatically, since those images have withheld ground truth and are only needed for a test-dev2017 submission.
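As an illustration of the YOLO polygon label format mentioned above, the following minimal sketch parses one label line. It assumes each line holds a class index followed by normalized (0 to 1) x y vertex pairs; the function names and the sample line are hypothetical and not part of the Ultralytics API.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def parse_seg_label(line: str) -> Tuple[int, List[Point]]:
    """Parse one label line, assumed to be: class_id x1 y1 x2 y2 ... xn yn."""
    parts = line.split()
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # A polygon needs at least three vertices, given as complete x y pairs
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("expected at least three complete x y pairs")
    return class_id, list(zip(coords[0::2], coords[1::2]))


def to_pixels(polygon: List[Point], width: int, height: int) -> List[Point]:
    """Scale normalized (0-1) vertices to pixel coordinates."""
    return [(x * width, y * height) for x, y in polygon]


# Hypothetical label line: class 0 (person) with a triangular mask
class_id, polygon = parse_seg_label("0 0.25 0.25 0.75 0.25 0.5 0.75")
pixels = to_pixels(polygon, width=640, height=480)
print(class_id, pixels)
```

Keeping vertices normalized means the same label file stays valid when the image is resized, which is why the conversion to pixels is a separate step here.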

Dataset Structure

The COCO-Seg dataset is partitioned into three subsets:

Train2017: 118,287 images for training instance segmentation models.
Val2017: 5,000 images used for validation during model development.
Test-dev2017: 20,288 of the 40,670 test2017 images, used for benchmarking. Ground-truth annotations for this subset are not publicly available, so predictions must be submitted to the COCO evaluation server for scoring.

For smaller experimentation needs, see the COCO128-Seg (128 images) and COCO8-Seg (8 images) subsets.
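After a download, one quick sanity check is to compare the number of entries in each image list file with the split sizes above. The sketch below assumes the list files (train2017.txt, val2017.txt, test-dev2017.txt) contain one image path per line; the helper name and the demo file are hypothetical.

```python
import tempfile
from pathlib import Path

# Split sizes as stated in this document
EXPECTED = {
    "train2017.txt": 118287,
    "val2017.txt": 5000,
    "test-dev2017.txt": 20288,
}


def count_images(list_file: Path) -> int:
    """Count non-empty lines, assuming one image path per line."""
    with list_file.open() as f:
        return sum(1 for line in f if line.strip())


# Demonstration on a throwaway file; the image paths are placeholders
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "val_demo.txt"
    demo.write_text("./images/val2017/a.jpg\n./images/val2017/b.jpg\n\n")
    demo_count = count_images(demo)

print(demo_count)
```

On a real download, comparing `count_images(root / name)` against `EXPECTED[name]` for each list file would flag a truncated or incomplete extraction.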

Applications

COCO-Seg is widely used for training and evaluating deep learning models on instance segmentation, such as the YOLO models. The large number of annotated images, the diversity of object categories, and the standardized evaluation metrics make it an indispensable resource for computer vision researchers and practitioners. Full COCO-Seg annotations can also be browsed and managed on Ultralytics Platform.

Dataset YAML

A YAML file defines the dataset configuration: the dataset's paths, its class names, and an optional download script. For the COCO-Seg dataset, the coco.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml.

ultralytics/cfg/datasets/coco.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# COCO 2017 dataset https://cocodataset.org by Microsoft
# Documentation: https://docs.ultralytics.com/datasets/detect/coco
# Example usage: yolo train data=coco.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco ← downloads here (20.3 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: coco # dataset root dir
train: train2017.txt # train images (relative to 'path') 118287 images
val: val2017.txt # val images (relative to 'path') 5000 images
test: test-dev2017.txt # 20288 of 40670 images, submit via https://cocodataset.org/#detection-eval

# Classes
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  5: bus
  6: train
  7: truck
  8: boat
  9: traffic light
  10: fire hydrant
  11: stop sign
  12: parking meter
  13: bench
  14: bird
  15: cat
  16: dog
  17: horse
  18: sheep
  19: cow
  20: elephant
  21: bear
  22: zebra
  23: giraffe
  24: backpack
  25: umbrella
  26: handbag
  27: tie
  28: suitcase
  29: frisbee
  30: skis
  31: snowboard
  32: sports ball
  33: kite
  34: baseball bat
  35: baseball glove
  36: skateboard
  37: surfboard
  38: tennis racket
  39: bottle
  40: wine glass
  41: cup
  42: fork
  43: knife
  44: spoon
  45: bowl
  46: banana
  47: apple
  48: sandwich
  49: orange
  50: broccoli
  51: carrot
  52: hot dog
  53: pizza
  54: donut
  55: cake
  56: chair
  57: couch
  58: potted plant
  59: bed
  60: dining table
  61: toilet
  62: tv
  63: laptop
  64: mouse
  65: remote
  66: keyboard
  67: cell phone
  68: microwave
  69: oven
  70: toaster
  71: sink
  72: refrigerator
  73: book
  74: clock
  75: vase
  76: scissors
  77: teddy bear
  78: hair drier
  79: toothbrush

# Download script/URL (optional)
download: |
  from pathlib import Path

  from ultralytics.utils import ASSETS_URL
  from ultralytics.utils.downloads import download

  # Download labels
  segments = True  # segment or box labels
  dir = Path(yaml["path"])  # dataset root dir
  urls = [ASSETS_URL + ("/coco2017labels-segments.zip" if segments else "/coco2017labels.zip")]  # labels
  download(urls, dir=dir.parent)

  # Download data (test2017.zip excluded: ground truth is withheld, only used for the eval-server test-dev split)
  urls = [
      "http://images.cocodataset.org/zips/train2017.zip",  # 19G, 118k images
      "http://images.cocodataset.org/zips/val2017.zip",  # 1G, 5k images
  ]
  download(urls, dir=dir / "images", threads=3)
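
The download script above places images under an images directory inside the dataset root. A minimal sketch of how an image path could be paired with its label file follows, assuming the common YOLO layout in which the last images path component is swapped for labels and the extension becomes .txt. The helper name is hypothetical and this is not the library's own code.

```python
from pathlib import PurePosixPath


def label_path_for(image_path: str) -> str:
    """Map an image path to its assumed label path (images -> labels, .txt)."""
    parts = list(PurePosixPath(image_path).parts)
    # Swap the last "images" component; raises ValueError if there is none
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return str(PurePosixPath(*parts).with_suffix(".txt"))


# Example file name used only for illustration
example = label_path_for("coco/images/train2017/000000000009.jpg")
print(example)
```

Because only the last matching component is replaced, a dataset root that itself contains a directory named images is left intact.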

Usage

To train a YOLO26n-seg model on the COCO-Seg dataset for 100 epochs with an image size of 640, you can use the following code snippet. For a comprehensive list of available arguments, refer to the model Training page.

Train Example

from ultralytics import YOLO

# Load a model
model = YOLO("yolo26n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="coco.yaml", epochs=100, imgsz=640)

Sample Images and Annotations

COCO-Seg contains the same diverse images, object categories, and complex scenes as COCO, with instance segmentation masks provided in the YOLO label format. Here are some examples of images from the dataset, along with their corresponding instance segmentation masks:

[Image: COCO segmentation dataset mosaic training batch]

Mosaiced Image: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This aids the model's ability to generalize to different object sizes, aspect ratios, and contexts.
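To make the idea concrete, here is a deliberately simplified sketch of how polygon labels could be remapped when four equally sized images are tiled into a fixed 2x2 grid. Real mosaic augmentation typically uses a random mosaic center plus scaling and cropping, none of which is modeled here; the function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def remap_to_mosaic(polygon: List[Point], row: int, col: int) -> List[Point]:
    """Map normalized vertices from one tile of a fixed 2x2 grid to the canvas.

    row and col are 0 or 1; each tile covers half the canvas in each dimension.
    """
    return [((col + x) / 2, (row + y) / 2) for x, y in polygon]


# A triangle spanning a whole tile, placed in the bottom-right quadrant
tile_polygon = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
bottom_right = remap_to_mosaic(tile_polygon, row=1, col=1)
print(bottom_right)
```

Each object ends up at half its original scale, which is one way a mosaic exposes the model to smaller instances in new contexts.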

Citations and Acknowledgments

If you use the COCO-Seg dataset in your research or development work, please cite the original COCO paper and acknowledge the extension to COCO-Seg:

Quote

@misc{lin2015microsoft,
      title={Microsoft COCO: Common Objects in Context},
      author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
      year={2015},
      eprint={1405.0312},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

We extend our thanks to the COCO Consortium for creating and maintaining this invaluable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the COCO dataset website.

FAQ

What is the COCO-Seg dataset and how does it differ from the original COCO dataset?

COCO-Seg is the Ultralytics YOLO-format packaging of COCO's (Common Objects in Context) native instance segmentation masks for the same 118,287 train2017 and 5,000 val2017 images. The original COCO annotations already include these polygon masks for all 80 object categories; COCO-Seg converts them to the YOLO label format used for instance segmentation training.

How can I train a YOLO26 model using the COCO-Seg dataset?

To train a YOLO26n-seg model on the COCO-Seg dataset for 100 epochs with an image size of 640, you can use the following code snippet. For a detailed list of available training arguments, refer to the model Training page.

Train Example

from ultralytics import YOLO

# Load a model
model = YOLO("yolo26n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="coco.yaml", epochs=100, imgsz=640)

What are the key features of the COCO-Seg dataset?

The COCO-Seg dataset includes several key features:

Provides instance segmentation masks for 123,287 labeled COCO train2017/val2017 images (118,287 train + 5,000 val).
Annotates the same 80 object categories found in the original COCO.
Provides instance segmentation masks in the YOLO polygon label format.
Uses standardized evaluation metrics such as mean Average Precision (mAP) and mean Average Recall (mAR) for instance segmentation tasks.
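As a rough illustration of what underlies the mask metrics, the sketch below computes the Intersection over Union (IoU) of two binary masks and lists the IoU thresholds at which a prediction would count as a match. It assumes the COCO-style 50-95 convention of thresholds from 0.50 to 0.95 in steps of 0.05, and it does not compute mAP or mAR themselves, which also require ranking predictions by confidence. All names and the toy masks are hypothetical.

```python
from typing import List

Mask = List[List[int]]  # rows of 0/1 pixels


def mask_iou(a: Mask, b: Mask) -> float:
    """Intersection over Union of two same-sized binary masks."""
    inter = union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if pa and pb else 0
            union += 1 if pa or pb else 0
    return inter / union if union else 0.0


# Assumed thresholds for the 50-95 metrics: 0.50, 0.55, ..., 0.95
THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]

# Toy masks: the prediction covers 4 of the 6 ground-truth pixels
pred = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
truth = [[1, 1, 1], [1, 1, 1], [0, 0, 0]]

iou = mask_iou(pred, truth)
matched_at = [t for t in THRESHOLDS if iou >= t]
print(round(iou, 3), matched_at)
```

Averaging over many thresholds rewards masks that follow object boundaries closely, not just masks that roughly overlap the object.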

What pretrained models are available for COCO-Seg, and what are their performance metrics?

The COCO-Seg dataset supports multiple pretrained YOLO26 segmentation models with varying performance metrics. Here's a summary of the available models and their key metrics:

Model	Size (pixels)	mAP box 50-95 (e2e)	mAP mask 50-95 (e2e)	Speed CPU ONNX (ms)	Speed T4 TensorRT10 (ms)	Params (M)	FLOPs (B)
YOLO26n-seg	640	39.6	33.9	53.3 ± 0.5	2.1 ± 0.0	2.7	9.1
YOLO26s-seg	640	47.3	40.0	118.4 ± 0.9	3.3 ± 0.0	10.4	34.2
YOLO26m-seg	640	52.5	44.1	328.2 ± 2.4	6.7 ± 0.1	23.6	121.5
YOLO26l-seg	640	54.4	45.5	387.0 ± 3.7	8.0 ± 0.1	28.0	139.8
YOLO26x-seg	640	56.5	47.0	787.0 ± 6.8	16.4 ± 0.1	62.8	313.5

These models range from the lightweight YOLO26n-seg to the more powerful YOLO26x-seg, offering different trade-offs between speed and accuracy to suit various application requirements. For more information on model selection, visit the Ultralytics models page.
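One simple way to reason about this trade-off is to pick the most accurate model that fits a latency budget. The sketch below does so using the mask mAP and T4 TensorRT10 latency figures from the table above; the selection rule and the names are illustrative assumptions, not an Ultralytics recommendation.

```python
from typing import NamedTuple, Optional


class SegModel(NamedTuple):
    name: str
    mask_map: float  # mAP mask 50-95 (e2e), from the table
    t4_ms: float  # mean T4 TensorRT10 latency in ms, from the table


MODELS = [
    SegModel("YOLO26n-seg", 33.9, 2.1),
    SegModel("YOLO26s-seg", 40.0, 3.3),
    SegModel("YOLO26m-seg", 44.1, 6.7),
    SegModel("YOLO26l-seg", 45.5, 8.0),
    SegModel("YOLO26x-seg", 47.0, 16.4),
]


def best_under_budget(budget_ms: float) -> Optional[SegModel]:
    """Return the highest mask-mAP model within the latency budget, if any."""
    candidates = [m for m in MODELS if m.t4_ms <= budget_ms]
    return max(candidates, key=lambda m: m.mask_map) if candidates else None


choice = best_under_budget(7.0)
print(choice.name if choice else "no model fits")
```

Measured latency depends on hardware and export settings, so the table values are best treated as a starting point and re-measured on the target device.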

How is the COCO-Seg dataset structured and what subsets does it contain?

The COCO-Seg dataset is partitioned into three subsets for specific training and evaluation needs:

Train2017: Contains 118,287 images used primarily for training instance segmentation models.
Val2017: Comprises 5,000 images utilized for validation during the training process.
Test-dev2017: Contains 20,288 of the 40,670 test2017 images, reserved for testing and benchmarking trained models. Ground-truth annotations for this subset are not publicly available, so predictions must be submitted to the COCO evaluation server for scoring.

For smaller experimentation needs, you might also consider the COCO128-Seg dataset (128 images) or the COCO8-Seg dataset, a compact version containing just 8 images from the COCO train2017 set.

Contributors

glenn-jocher (17), raimbekovm (4), jk4e (3), RizwanMunawar (2), Y-T-G (1), ambitious-octopus (1), MatthewNoyce (1), lunarifish (1)

Created Nov 12, 2023. Updated 5 days ago.