Semantic Segmentation Datasets Overview

Semantic segmentation assigns one class label to every pixel in an image. Unlike instance segmentation, semantic segmentation does not separate individual objects of the same class. The training target is a dense class map where each pixel stores a class ID.

This guide explains the dataset format used by Ultralytics YOLO semantic segmentation models and lists the built-in dataset configurations available for training and validation.

Supported Dataset Formats

Two label formats are supported. The dataset loader picks PNG masks when the dataset YAML defines a masks_dir key, or when a masks/ folder already exists next to your images at the dataset root; otherwise it falls back to YOLO polygon labels.
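
The selection rule above can be sketched in a few lines. This is an illustrative reading of the rule, not the library's loader code; the function name select_label_format and its return strings are hypothetical.

```python
from pathlib import Path


def select_label_format(dataset_root, data_cfg):
    """Return "png_masks" or "yolo_polygons" following the rule described above.

    dataset_root: path to the dataset root directory.
    data_cfg: the parsed dataset YAML as a dict.
    """
    # An explicit masks_dir key in the dataset YAML selects PNG masks.
    if data_cfg.get("masks_dir") is not None:
        return "png_masks"
    # A masks/ folder at the dataset root also selects PNG masks.
    if (Path(dataset_root) / "masks").is_dir():
        return "png_masks"
    # Otherwise the YOLO polygon .txt labels are used.
    return "yolo_polygons"
```

The second check is the one that tends to surprise: a leftover masks/ folder is enough to change the mode even when the YAML never mentions it.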

PNG mask format

Semantic segmentation datasets use one image file and one mask file per sample. The mask is a single-channel image, usually PNG, where each pixel value is the class index for the corresponding image pixel.

Pixel values 0, 1, 2, ... represent class IDs from the dataset names mapping.
Pixel value 255 is treated as the ignore label and is excluded from loss and metric computation.
Mask files should use the same stem as their matching image file, for example frankfurt_000000_000294.png.
Masks are resolved as .png by default. If no .png mask is found, other supported image extensions are also accepted. Use lossless formats such as .png or .tiff, since lossy compression (e.g. .jpg) corrupts the class ID pixel values.
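
A quick sanity check for the pixel-value rules above might look like the following. It is a minimal sketch that works on a flattened sequence of pixel values so it has no dependencies; the helper name unexpected_mask_values is hypothetical, and in practice you would likely feed it the unique values of a mask loaded with an image library.

```python
def unexpected_mask_values(pixel_values, num_classes, ignore_label=255):
    """Return the sorted pixel values that are neither a class ID nor the ignore label."""
    allowed = set(range(num_classes)) | {ignore_label}
    return sorted(set(pixel_values) - allowed)


# A flattened toy mask for a 3-class dataset: 0..2 are classes, 255 is ignored.
toy_mask = [0, 0, 1, 2, 255, 255, 7, 128]
print(unexpected_mask_values(toy_mask, num_classes=3))  # [7, 128]
```

Stray values like these are a typical symptom of a mask that was saved with lossy compression or that still uses source dataset IDs.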

The default layout keeps images and masks in parallel folders. To locate a mask, the loader replaces the images component of the image path with the masks_dir value from the dataset YAML.

dataset/
├── images/
│   ├── train/
│   └── val/
└── masks/
    ├── train/
    └── val/

For example, an image at images/train/aachen_000000_000019.png is paired with a mask at masks/train/aachen_000000_000019.png when masks_dir: masks.
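
The pairing rule can be sketched as a path rewrite. This is an approximation under stated assumptions, not the loader's actual code: the helper name mask_path_for is hypothetical, it rewrites the last images component of the path, and it only produces the default .png candidate without the extension fallback.

```python
from pathlib import Path


def mask_path_for(image_path, masks_dir="masks", images_dir="images"):
    """Map an image path to its expected .png mask path."""
    parts = list(Path(image_path).parts)
    # Find the last "images" path component (raises ValueError if there is none).
    idx = len(parts) - 1 - parts[::-1].index(images_dir)
    # Swap it for the masks_dir value from the dataset YAML.
    parts[idx] = masks_dir
    # Keep the file stem and use the default .png extension.
    return Path(*parts).with_suffix(".png")


print(mask_path_for("images/train/aachen_000000_000019.png").as_posix())
# masks/train/aachen_000000_000019.png
```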

YOLO polygon label format

If your dataset already has Ultralytics YOLO polygon labels (one .txt per image with <class-index> <x1> <y1> <x2> <y2> ... rows), you can train semantic segmentation directly from them — no PNG mask conversion needed. See the instance segmentation dataset format for the row-level layout.
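
To make the row layout concrete, here is a small parser sketch. It assumes coordinates are normalized to the range 0 to 1, as in the Ultralytics YOLO segmentation label format, and scales them to pixels; the function name parse_polygon_row is hypothetical.

```python
def parse_polygon_row(row, img_w, img_h):
    """Parse '<class-index> <x1> <y1> <x2> <y2> ...' into (class_id, points in pixels)."""
    fields = row.split()
    coords = [float(v) for v in fields[1:]]
    # A polygon needs at least three points and an even number of coordinates.
    if len(coords) < 6 or len(coords) % 2:
        raise ValueError("a polygon row needs at least 3 (x, y) pairs")
    class_id = int(fields[0])
    # Assumption: coordinates are normalized, so scale by the image size.
    points = [(x * img_w, y * img_h) for x, y in zip(coords[0::2], coords[1::2])]
    return class_id, points


print(parse_polygon_row("1 0.0 0.0 0.5 0.0 0.5 0.5", img_w=100, img_h=200))
# (1, [(0.0, 0.0), (50.0, 0.0), (50.0, 100.0)])
```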

This path is selected automatically when the dataset YAML omits masks_dir and no masks/ folder exists next to your images at the dataset root. Remove or rename any leftover masks/ folder; otherwise the loader switches to PNG-mask mode and looks for masks there. In polygon mode the loader behaves as follows:

Polygons are converted to a per-image semantic mask at load time, sorted by area so smaller objects override larger ones in overlap regions.
Multi-class (N > 1 in names): an extra background class is appended after your declared classes for pixels not covered by any polygon. The model is built with N + 1 output channels and the last channel is background.
Single-class (N == 1 in names): still trained as 1 class. The mask is binary, with your declared class shown as 1 and pixels not covered by any polygon as 0. No extra background class is added to names.
Pixels added by augmentation padding (e.g. random crop) still use 255 as the ignore label.
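
The conversion rules above can be illustrated with a dependency-free rasterizer. This is a simplified sketch, not the loader's implementation: it tests pixel centers with an even-odd rule instead of using an optimized fill routine, and the function names are hypothetical.

```python
def polygon_area(points):
    """Shoelace formula for the absolute area of a simple polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def point_in_polygon(px, py, points):
    """Even-odd ray casting test for a single point."""
    inside = False
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def polygons_to_mask(polygons, width, height, num_classes):
    """polygons: list of (class_id, [(x, y), ...]) in pixel coordinates."""
    if num_classes > 1:
        # Multi-class: background is appended after the declared classes.
        background, offset = num_classes, 0
    else:
        # Single-class: declared class becomes 1, uncovered pixels stay 0.
        background, offset = 0, 1
    mask = [[background] * width for _ in range(height)]
    # Paint the largest polygon first so smaller ones overwrite it in overlaps.
    ordered = sorted(polygons, key=lambda item: polygon_area(item[1]), reverse=True)
    for class_id, points in ordered:
        for y in range(height):
            for x in range(width):
                if point_in_polygon(x + 0.5, y + 0.5, points):  # sample pixel centers
                    mask[y][x] = class_id + offset
    return mask


big = (0, [(0, 0), (4, 0), (4, 4), (0, 4)])
small = (1, [(1, 1), (3, 1), (3, 3), (1, 3)])
mask = polygons_to_mask([small, big], width=5, height=5, num_classes=2)
print(mask[0][0], mask[1][1], mask[4][4])  # 0 1 2
```

In the example, the small class-1 square survives inside the large class-0 square regardless of input order, and the uncovered border column and row receive the appended background ID 2.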

Use this path when your data is already labeled as instance polygons and you want a semantic segmentation model from the same files.

Dataset YAML format

Semantic segmentation datasets are configured with YAML files. The main fields are:

| Key | Description |
| --- | --- |
| `path` | Dataset root directory. |
| `train` | Training image path relative to `path`, or an absolute path. |
| `val` | Validation image path relative to `path`, or an absolute path. |
| `test` | Optional test image path. |
| `masks_dir` | Directory name used for semantic masks. Omit this key (with no `masks/` folder at the dataset root) to switch to the YOLO polygon label format. |
| `names` | Class ID to class name mapping. |
| `label_mapping` | Optional mapping from source dataset IDs to training IDs or `ignore_label`. |

ultralytics/cfg/datasets/cityscapes8.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Cityscapes semantic segmentation dataset (19 classes)
# Documentation: https://docs.ultralytics.com/datasets/semantic/cityscapes8
# Example usage: yolo semantic train data=cityscapes8.yaml model=yolo26n-sem.pt
# parent
# ├── ultralytics
# └── datasets
#     └── cityscapes8 ← downloads here (small subset)
#         ├── images
#         └── masks

# Dataset root directory
path: cityscapes8 # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images

masks_dir: masks # semantic mask directory

# Cityscapes 19-class labels
names:
  0: road
  1: sidewalk
  2: building
  3: wall
  4: fence
  5: pole
  6: traffic light
  7: traffic sign
  8: vegetation
  9: terrain
  10: sky
  11: person
  12: rider
  13: car
  14: truck
  15: bus
  16: train
  17: motorcycle
  18: bicycle

# Map source label IDs to train IDs; ignore_label is converted to 255.
label_mapping:
  -1: ignore_label
  0: ignore_label
  1: ignore_label
  2: ignore_label
  3: ignore_label
  4: ignore_label
  5: ignore_label
  6: ignore_label
  7: 0
  8: 1
  9: ignore_label
  10: ignore_label
  11: 2
  12: 3
  13: 4
  14: ignore_label
  15: ignore_label
  16: ignore_label
  17: 5
  18: ignore_label
  19: 6
  20: 7
  21: 8
  22: 9
  23: 10
  24: 11
  25: 12
  26: 13
  27: 14
  28: 15
  29: ignore_label
  30: ignore_label
  31: 16
  32: 17
  33: 18

# Download URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/cityscapes8.zip

Use label_mapping when the source mask IDs do not already match contiguous training class IDs. Cityscapes and ADE20K include mappings that convert original label IDs into YOLO semantic segmentation train IDs and ignore unused labels.
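
The effect of label_mapping on mask pixels can be sketched as a lookup. This is a minimal illustration on a flat list of pixel values, not the library's code; the helper names are hypothetical, and passing unmapped source IDs through unchanged is an assumption made for the sketch.

```python
IGNORE_LABEL = 255


def build_lookup(label_mapping, ignore_label=IGNORE_LABEL):
    """Turn a label_mapping dict into a source-ID -> train-ID dict."""
    lookup = {}
    for source_id, target in label_mapping.items():
        # The string "ignore_label" is converted to the numeric ignore value.
        lookup[source_id] = ignore_label if target == "ignore_label" else int(target)
    return lookup


def remap_pixels(pixel_values, lookup):
    """Apply the lookup to every pixel value."""
    # Assumption: source IDs without an entry pass through unchanged.
    return [lookup.get(value, value) for value in pixel_values]


# A few entries taken from the Cityscapes mapping shown above.
mapping = {0: "ignore_label", 7: 0, 8: 1, 11: 2, 26: 13}
print(remap_pixels([7, 8, 0, 26, 11], build_lookup(mapping)))  # [0, 1, 255, 13, 2]
```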

Usage

Train a YOLO26 semantic segmentation model with Python. The dataset YAML header also lists a CLI form: yolo semantic train data=cityscapes8.yaml model=yolo26n-sem.pt

from ultralytics import YOLO

# Load a pretrained semantic segmentation model
model = YOLO("yolo26n-sem.pt")

# Train on the Cityscapes8 semantic segmentation dataset
results = model.train(data="cityscapes8.yaml", epochs=100, imgsz=1024)

Supported Datasets

Ultralytics provides semantic segmentation dataset YAML files for these datasets. See the semantic segmentation task page for the full pretrained-model benchmark table.

Cityscapes: Urban street-scene semantic segmentation dataset with 19 train classes.
Cityscapes8: An 8-image Cityscapes subset for quick tests and CI checks.
ADE20K: Scene parsing dataset with 150 semantic classes.

Adding Your Own Dataset

Option A: PNG masks

Save your images under split folders such as images/train and images/val.
Save one single-channel mask per image under the mirrored mask folders, such as masks/train and masks/val.
Ensure mask pixel values are class IDs. Use 255 for pixels that should be ignored.
Create a dataset YAML with path, train, val, masks_dir, and names.
Add label_mapping only when your mask IDs need conversion to contiguous train IDs.

path: path/to/my-semantic-dataset
train: images/train
val: images/val
masks_dir: masks

names:
    0: background
    1: road
    2: building
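
Before training on a layout like this, it can help to confirm that every image has a mask. The check below is a sketch under assumptions: the helper name images_without_masks and the suffix list are hypothetical, and it only looks for the default .png mask, which is stricter than the loader's extension fallback.

```python
from pathlib import Path

# Assumed set of image extensions, used only for this sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff"}


def images_without_masks(dataset_root, split="train", masks_dir="masks"):
    """Return the names of images in a split that have no same-stem .png mask."""
    root = Path(dataset_root)
    missing = []
    for image in sorted((root / "images" / split).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip non-image files
        mask = root / masks_dir / split / f"{image.stem}.png"
        if not mask.is_file():
            missing.append(image.name)
    return missing
```

Calling images_without_masks("path/to/my-semantic-dataset", split="val") would list the validation images that still need a mask.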

Option B: Polygon labels

Lay out images and .txt polygon files exactly as for instance segmentation.
Create a dataset YAML with path, train, val, and names — omit masks_dir.
Make sure no masks/ folder exists next to your images at the dataset root — its presence alone switches the loader to PNG-mask mode even without masks_dir in the YAML.
Do not add a "background" entry to names. For multi-class datasets the loader appends one automatically; for single-class datasets training stays at 1 class — your declared class becomes 1 in the mask and uncovered pixels become 0.

path: path/to/my-polygon-dataset
train: images/train
val: images/val

names:
    0: person
    1: car
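
The class-count rule for polygon labels can be summarized in a short sketch. It assumes names uses contiguous keys starting at 0; the helper name effective_names and the literal label "background" are hypothetical, since the text above only says that a background class is appended.

```python
def effective_names(names):
    """Return the class table a polygon-label run would train with."""
    names = dict(names)
    if len(names) > 1:
        # Multi-class: one background class is appended after the declared classes.
        names[len(names)] = "background"  # label text is an assumption
    # Single-class: the table is left unchanged and training stays at 1 class.
    return names


print(effective_names({0: "person", 1: "car"}))
# {0: 'person', 1: 'car', 2: 'background'}
```

For the two-class YAML above this yields three entries, which matches the N + 1 output channels described for multi-class datasets.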

Ultralytics Platform provides a polygon annotation tool for the semantic task, plus SAM-assisted Smart annotation — annotate directly in the browser and export or train on the resulting polygon-labeled dataset without setting up this layout by hand.

FAQ

What is the difference between semantic segmentation masks and instance segmentation labels?

Semantic segmentation masks are dense pixel maps. Each pixel stores a class ID, and there is one mask image per training image. Instance segmentation labels in Ultralytics YOLO use text files with polygon coordinates, one row per object instance.

What pixel value is ignored during training?

Pixel value 255 is used as the ignore label. These pixels are skipped during loss and metric computation, which is useful for void regions, unlabeled pixels, or classes outside the training label set.

Do mask file names need to match image file names?

Yes. Each semantic mask should have the same file stem as the corresponding image. The dataset loader replaces the images directory component with masks_dir and searches for a matching .png mask. If none is found, it falls back to other supported image extensions (.jpg, .tiff, etc.). The fallback does not check whether the format is lossless, so stick to lossless formats such as .png or .tiff.

Can I use original dataset label IDs directly?

Yes, if they already match the class IDs in your names mapping. If the source dataset uses non-contiguous IDs or includes labels that should be ignored, add a label_mapping section to convert source pixel values to training IDs.

Can I use my instance segmentation dataset to train semantic segmentation?

Yes. Instance segmentation datasets use Ultralytics YOLO polygon labels (one .txt per image with <class-index> <x1> <y1> <x2> <y2> ... rows), and the same files can be reused for semantic segmentation — just omit masks_dir from the dataset YAML, and make sure no masks/ folder exists next to your images at the dataset root (its presence alone triggers PNG-mask mode even without masks_dir set). The loader then converts polygons to per-image masks on the fly. For multi-class datasets (N > 1) an extra background class is appended, and the model is built with N + 1 output channels. For single-class datasets (N == 1) training stays at 1 class — the mask shows your declared class as 1 and uncovered pixels as 0.

Which datasets ship with Ultralytics for semantic segmentation?

Ultralytics includes ready-to-use dataset YAML files for Cityscapes (19 urban-scene classes), the lightweight Cityscapes8 subset for pipeline testing, and ADE20K (150 scene-parsing classes). Each page documents the exact class list, download steps, and a verified training example.

Contributors: raimbekovm, miles-deans-ultralytics, glenn-jocher, Laughing-q

Created 2 months ago. Updated 6 days ago.