COCO8-Seg Dataset

Introduction

Ultralytics COCO8-Seg is a small but versatile instance segmentation dataset composed of the first 8 images of the COCO train 2017 set: 4 for training and 4 for validation. It is well suited to testing and debugging segmentation models, or to experimenting with new detection approaches. With 8 images, it is small enough to be easily manageable, yet diverse enough to test training pipelines for errors and to act as a sanity check before training on larger datasets.

This dataset is intended for use with Ultralytics HUB and YOLO11.

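Dataset YAML

A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains the dataset's paths, classes, and other relevant information. For the COCO8-Seg dataset, the coco8-seg.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml.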

ultralytics/cfg/datasets/coco8-seg.yaml

# Ultralytics YOLO 🚀, AGPL-3.0 license
# COCO8-seg dataset (first 8 images from COCO train2017) by Ultralytics
# Documentation: https://docs.ultralytics.com/datasets/segment/coco8-seg/
# Example usage: yolo train data=coco8-seg.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── coco8-seg  ← downloads here (1 MB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8-seg # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)

# Classes
names:
  0: person
  1: bicycle
  2: car
  3: motorcycle
  4: airplane
  5: bus
  6: train
  7: truck
  8: boat
  9: traffic light
  10: fire hydrant
  11: stop sign
  12: parking meter
  13: bench
  14: bird
  15: cat
  16: dog
  17: horse
  18: sheep
  19: cow
  20: elephant
  21: bear
  22: zebra
  23: giraffe
  24: backpack
  25: umbrella
  26: handbag
  27: tie
  28: suitcase
  29: frisbee
  30: skis
  31: snowboard
  32: sports ball
  33: kite
  34: baseball bat
  35: baseball glove
  36: skateboard
  37: surfboard
  38: tennis racket
  39: bottle
  40: wine glass
  41: cup
  42: fork
  43: knife
  44: spoon
  45: bowl
  46: banana
  47: apple
  48: sandwich
  49: orange
  50: broccoli
  51: carrot
  52: hot dog
  53: pizza
  54: donut
  55: cake
  56: chair
  57: couch
  58: potted plant
  59: bed
  60: dining table
  61: toilet
  62: tv
  63: laptop
  64: mouse
  65: remote
  66: keyboard
  67: cell phone
  68: microwave
  69: oven
  70: toaster
  71: sink
  72: refrigerator
  73: book
  74: clock
  75: vase
  76: scissors
  77: teddy bear
  78: hair drier
  79: toothbrush

# Download script/URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/coco8-seg.zip
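
The comments in the YAML note that train and val are given relative to path. A minimal sketch of how those keys combine is shown here, using only the Python standard library. The config dictionary is a hand-copied subset of the file above and resolve_splits is a hypothetical helper for illustration; it is not how Ultralytics itself resolves dataset paths, which may also depend on its own settings.

```python
from pathlib import Path

# Hand-copied subset of coco8-seg.yaml (assumption: only the path keys matter here).
config = {
    "path": "../datasets/coco8-seg",  # dataset root dir
    "train": "images/train",          # relative to 'path'
    "val": "images/val",              # relative to 'path'
    "test": None,                     # optional, left empty in the YAML
}


def resolve_splits(cfg):
    """Join each non-empty split entry onto the dataset root (hypothetical helper)."""
    root = Path(cfg["path"])
    return {
        split: root / cfg[split]
        for split in ("train", "val", "test")
        if cfg.get(split)
    }


splits = resolve_splits(config)
for name, directory in splits.items():
    print(f"{name}: {directory.as_posix()}")
```

The empty test entry is skipped, so only the train and val directories are reported.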

Usage

To train a YOLO11n-seg model on the COCO8-Seg dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model Training page.

Train Example

Python

from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="coco8-seg.yaml", epochs=100, imgsz=640)

CLI

# Start training from a pretrained *.pt model
yolo segment train data=coco8-seg.yaml model=yolo11n-seg.pt epochs=100 imgsz=640
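
The Python call and the CLI command carry the same training arguments, written as keyword arguments in one case and key=value pairs in the other. The sketch below illustrates that correspondence by rendering keyword arguments into the command string shown above. build_yolo_command is a hypothetical helper written for this page, not part of the ultralytics package. Note that in Python the model is passed to YOLO(...) rather than to train(), whereas the CLI takes it as model=.

```python
import shlex


def build_yolo_command(task, mode, **overrides):
    """Render keyword arguments as a `yolo TASK MODE key=value ...` string (hypothetical helper)."""
    parts = ["yolo", task, mode]
    parts += [f"{key}={value}" for key, value in overrides.items()]
    # Quote each part so values containing spaces would survive a shell.
    return " ".join(shlex.quote(part) for part in parts)


command = build_yolo_command(
    "segment",
    "train",
    data="coco8-seg.yaml",
    model="yolo11n-seg.pt",
    epochs=100,
    imgsz=640,
)
print(command)
```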

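Sample Images and Annotations

Here are some examples of images from the COCO8-Seg dataset, along with their corresponding annotations:

Dataset sample image

  • Mosaiced Image: This image shows a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.

The example showcases the variety and complexity of the images in the COCO8-Seg dataset and the benefits of using mosaicing during the training process.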
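
The core idea of mosaicing, combining several images into one composite, can be sketched in a few lines. This is a deliberately simplified illustration and not the Ultralytics augmentation: it tiles four equally sized images into a fixed 2x2 grid, whereas a real mosaic typically also randomizes placement and scale and must remap the annotations. Images are represented as plain lists of pixel rows, and mosaic_2x2 and solid are hypothetical helper names.

```python
def mosaic_2x2(top_left, top_right, bottom_left, bottom_right):
    """Tile four equally sized images (lists of pixel rows) into one 2x2 composite."""
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


def solid(value, height=2, width=2):
    """Build a tiny single-channel image filled with one pixel value."""
    return [[value] * width for _ in range(height)]


# Four 2x2 "images", each filled with a different value so the tiles are easy to see.
composite = mosaic_2x2(solid(1), solid(2), solid(3), solid(4))
for row in composite:
    print(row)
```

Each quadrant of the 4x4 result comes from a different source image, which is why a single mosaiced sample exposes the model to objects and scenes from several images at once.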

Citations and Acknowledgments

If you use the COCO dataset in your research or development work, please cite the following paper:

@misc{lin2015microsoft,
      title={Microsoft COCO: Common Objects in Context},
      author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
      year={2015},
      eprint={1405.0312},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the COCO dataset website.

FAQ

What is the COCO8-Seg dataset, and how is it used in Ultralytics YOLO11?

The COCO8-Seg dataset is a compact instance segmentation dataset by Ultralytics, consisting of the first 8 images from the COCO train 2017 set: 4 images for training and 4 for validation. It is tailored for testing and debugging segmentation models or experimenting with new detection methods. It is particularly useful with Ultralytics YOLO11 and HUB for rapid iteration and pipeline error-checking before scaling to larger datasets. For detailed usage, refer to the model Training page.

How can I train a YOLO11n-seg model using the COCO8-Seg dataset?

To train a YOLO11n-seg model on the COCO8-Seg dataset for 100 epochs with an image size of 640, you can use Python or CLI commands. Here's a quick example:

Train Example

Python

from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-seg.pt")  # Load a pretrained model (recommended for training)

# Train the model
results = model.train(data="coco8-seg.yaml", epochs=100, imgsz=640)

CLI

# Start training from a pretrained *.pt model
yolo segment train data=coco8-seg.yaml model=yolo11n-seg.pt epochs=100 imgsz=640

For a thorough explanation of the available arguments and configuration options, see the Training documentation.

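Why is the COCO8-Seg dataset important for model development and debugging?

The COCO8-Seg dataset is small, easy to manage, and still diverse. Because it consists of only 8 images, it offers a quick way to test and debug segmentation models or new detection approaches without the overhead of larger datasets. This makes it an efficient tool for sanity checks and for identifying pipeline errors before committing to extensive training on large datasets. See the dataset format documentation for more information.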

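Where can I find the YAML configuration file for the COCO8-Seg dataset?

The YAML configuration file for the COCO8-Seg dataset is available in the Ultralytics repository. You can access the file directly at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml. The YAML file contains essential information such as the dataset paths, classes, and configuration settings required for model training and validation.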

What are the benefits of using mosaicing during training with the COCO8-Seg dataset?

Using mosaicing during training increases the variety of objects and scenes in each training batch. The technique combines multiple images into a single composite image, which improves the model's ability to generalize to different object sizes, aspect ratios, and contexts within the scene. Mosaicing helps improve a model's robustness and accuracy, especially when working with small datasets like COCO8-Seg. For an example of mosaiced images, see the Sample Images and Annotations section.


📅 Created 11 months ago ✏️ Updated 12 days ago
