How to Train YOLO on COCO JSON Without Conversion

Why Train Directly on COCO JSON

Annotations in COCO JSON format can be used directly for Ultralytics YOLO training without converting to .txt files first. This is done by subclassing YOLODataset to parse COCO JSON on the fly and wiring it into the training pipeline through a custom trainer.

This approach keeps the COCO JSON as the single source of truth: no convert_coco() call, no directory restructuring, and no intermediate label files. YOLO26 and all other Ultralytics YOLO detection models are supported. Segmentation and pose models require additional label fields (see the FAQ).

Looking for a one-time conversion?

See the COCO to YOLO Conversion guide for the standard convert_coco() workflow.

Architecture Overview

Two classes are required:

COCODataset: reads the COCO JSON and converts bounding boxes to YOLO format in memory during training
COCOTrainer: overrides build_dataset() to use COCODataset instead of the default YOLODataset

The implementation follows the same pattern as the built-in GroundingDataset, which also reads JSON annotations directly. Three methods are overridden: get_img_files(), cache_labels(), and get_labels().

Building the COCO JSON Dataset Class

The COCODataset class inherits from YOLODataset and overrides the label-loading logic. Instead of reading .txt files from a labels directory, it opens the COCO JSON file, iterates over the annotations grouped by image, and converts each bounding box from the COCO pixel format [x_min, y_min, width, height] to the YOLO normalized center format [x_center, y_center, width, height]. Crowd annotations (iscrowd: 1) and zero-area boxes are skipped automatically.
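
As a minimal sketch of the box conversion described above, the hypothetical helper coco_to_yolo_box() below (not part of Ultralytics) performs the same arithmetic in plain Python. The 640x480 image size and the box values are made-up illustration data.

```python
def coco_to_yolo_box(bbox, img_w, img_h):
    """Convert COCO [x_min, y_min, width, height] in pixels to YOLO normalized [cx, cy, w, h]."""
    x_min, y_min, w, h = bbox
    cx = (x_min + w / 2) / img_w  # shift from the left edge to the center, then normalize by image width
    cy = (y_min + h / 2) / img_h  # shift from the top edge to the center, then normalize by image height
    return [cx, cy, w / img_w, h / img_h]


# A 320x240 box whose top-left corner is at (160, 120) in a 640x480 image
box = coco_to_yolo_box([160, 120, 320, 240], img_w=640, img_h=480)
print(box)  # [0.5, 0.5, 0.5, 0.5]
```

The box sits in the middle of the image and covers half of each dimension, so all four normalized values come out as 0.5.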

The get_img_files() method returns an empty list because image paths are resolved from the JSON file_name field inside cache_labels(). Category IDs are sorted and remapped to zero-indexed class indices, so both 1-based (standard COCO) and non-contiguous ID schemes work correctly.

import json
from collections import defaultdict
from pathlib import Path

import numpy as np

from ultralytics.data.dataset import DATASET_CACHE_VERSION, YOLODataset
from ultralytics.data.utils import get_hash, load_dataset_cache_file, save_dataset_cache_file
from ultralytics.utils import TQDM

class COCODataset(YOLODataset):
    """Dataset that reads COCO JSON annotations directly without conversion to .txt files."""

    def __init__(self, *args, json_file="", **kwargs):
        self.json_file = json_file
        super().__init__(*args, data={"channels": 3}, **kwargs)

    def get_img_files(self, img_path):
        """Image paths are resolved from the JSON file, not from scanning a directory."""
        return []

    def cache_labels(self, path=Path("./labels.cache")):
        """Parse COCO JSON and convert annotations to YOLO format. Results are saved to a .cache file."""
        x = {"labels": []}
        with open(self.json_file) as f:
            coco = json.load(f)

        # Sort categories by ID and map to 0-indexed classes
        categories = {cat["id"]: i for i, cat in enumerate(sorted(coco["categories"], key=lambda c: c["id"]))}

        img_to_anns = defaultdict(list)
        for ann in coco["annotations"]:
            img_to_anns[ann["image_id"]].append(ann)

        for img_info in TQDM(coco["images"], desc="reading annotations"):
            h, w = img_info["height"], img_info["width"]
            im_file = Path(self.img_path) / img_info["file_name"]
            if not im_file.exists():
                continue

            self.im_files.append(str(im_file))
            bboxes = []
            for ann in img_to_anns.get(img_info["id"], []):
                if ann.get("iscrowd", False):
                    continue
                # COCO: [x, y, w, h] top-left in pixels -> YOLO: [cx, cy, w, h] center normalized
                box = np.array(ann["bbox"], dtype=np.float32)
                box[:2] += box[2:] / 2  # top-left to center
                box[[0, 2]] /= w  # normalize x
                box[[1, 3]] /= h  # normalize y
                if box[2] <= 0 or box[3] <= 0:
                    continue
                cls = categories[ann["category_id"]]
                bboxes.append([cls, *box.tolist()])

            lb = np.array(bboxes, dtype=np.float32) if bboxes else np.zeros((0, 5), dtype=np.float32)
            x["labels"].append(
                {
                    "im_file": str(im_file),
                    "shape": (h, w),
                    "cls": lb[:, 0:1],
                    "bboxes": lb[:, 1:],
                    "segments": [],
                    "normalized": True,
                    "bbox_format": "xywh",
                }
            )
        x["hash"] = get_hash([self.json_file, str(self.img_path)])
        save_dataset_cache_file(self.prefix, path, x, DATASET_CACHE_VERSION)
        return x

    def get_labels(self):
        """Load labels from .cache file if available, otherwise parse JSON and create the cache."""
        cache_path = Path(self.json_file).with_suffix(".cache")
        try:
            cache = load_dataset_cache_file(cache_path)
            assert cache["version"] == DATASET_CACHE_VERSION
            assert cache["hash"] == get_hash([self.json_file, str(self.img_path)])
            self.im_files = [lb["im_file"] for lb in cache["labels"]]
        except (FileNotFoundError, AssertionError, AttributeError, KeyError, ModuleNotFoundError):
            cache = self.cache_labels(cache_path)
        cache.pop("hash", None)
        cache.pop("version", None)
        return cache["labels"]

The parsed labels are saved to a .cache file next to the JSON (for example, instances_train.cache). On subsequent training runs, the cache is loaded directly and the JSON parsing step is skipped. If the JSON file changes, the hash check fails and the cache is rebuilt automatically.

Wiring the Dataset into the Training Pipeline
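
The cache location follows from Path(json_file).with_suffix(".cache") in get_labels(). The sketch below shows that derivation and a way to force a rebuild by deleting the file. The helpers cache_path_for() and clear_cache() are hypothetical names introduced for illustration, not Ultralytics functions.

```python
from pathlib import Path


def cache_path_for(json_file):
    """Return the .cache path that sits next to a COCO JSON file (same stem, new suffix)."""
    return Path(json_file).with_suffix(".cache")


def clear_cache(json_file):
    """Delete the cache file if it exists so the next run re-parses the JSON.

    Returns True if a file was removed, False if there was nothing to delete.
    """
    path = cache_path_for(json_file)
    if path.exists():
        path.unlink()
        return True
    return False


print(cache_path_for("annotations/instances_train.json").name)  # instances_train.cache
```

Deleting the cache by hand is only needed when you want a rebuild regardless of the hash check, for example when you are unsure whether a change to the dataset was detected.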

The only change needed in the trainer is overriding build_dataset(). The default DetectionTrainer builds a YOLODataset that scans for .txt label files. By replacing it with COCODataset, the trainer reads from the COCO JSON instead.

The JSON file path is read from the custom train_json / val_json fields in the data configuration (see the dataset.yaml configuration section). During training, mode="train" resolves to train_json; during validation, mode="val" resolves to val_json. If val_json is not set, train_json is used by default.

from ultralytics.models.yolo.detect import DetectionTrainer
from ultralytics.utils import colorstr

class COCOTrainer(DetectionTrainer):
    """Trainer that uses COCODataset for direct COCO JSON training."""

    def build_dataset(self, img_path, mode="train", batch=None):
        json_file = self.data["train_json"] if mode == "train" else self.data.get("val_json", self.data["train_json"])
        return COCODataset(
            img_path=img_path,
            json_file=json_file,
            imgsz=self.args.imgsz,
            batch_size=batch,
            augment=mode == "train",
            hyp=self.args,
            rect=self.args.rect or mode == "val",
            cache=self.args.cache or None,
            single_cls=self.args.single_cls or False,
            stride=int(self.model.stride.max()) if hasattr(self, "model") and self.model else 32,
            pad=0.0 if mode == "train" else 0.5,
            prefix=colorstr(f"{mode}: "),
            task=self.args.task,
            classes=self.args.classes,
            fraction=self.args.fraction if mode == "train" else 1.0,
        )
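
The annotation-file lookup on the first line of build_dataset() can be isolated and checked on its own. The sketch below uses a hypothetical function resolve_json() and a plain dictionary standing in for self.data; the file paths are placeholders.

```python
def resolve_json(data, mode):
    """Pick the annotation file for a mode; validation falls back to train_json when val_json is absent."""
    if mode == "train":
        return data["train_json"]
    return data.get("val_json", data["train_json"])


# Configuration with both files set
cfg = {"train_json": "annotations/instances_train.json", "val_json": "annotations/instances_val.json"}
print(resolve_json(cfg, "val"))  # annotations/instances_val.json

# Configuration without val_json: validation reuses the training annotations
cfg_train_only = {"train_json": "annotations/instances_train.json"}
print(resolve_json(cfg_train_only, "val"))  # annotations/instances_train.json
```

Note that the fallback only applies when the val_json key is missing entirely. A missing train_json raises a KeyError, which surfaces a misconfigured dataset.yaml early.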

dataset.yaml Configuration for COCO JSON

The dataset.yaml uses the standard path, train, and val fields to locate the image directories. Two additional fields, train_json and val_json, point to the COCO annotation files that COCOTrainer reads. The nc and names fields define the number of classes and their names, and must match the ID-sorted order of the categories array in the JSON.

path: /path/to/images # root directory with train/ and val/ subfolders
train: train
val: val

# COCO JSON annotation files
train_json: /path/to/annotations/instances_train.json
val_json: /path/to/annotations/instances_val.json

nc: 80
names:
    0: person
    1: bicycle
    # ... remaining class names

Expected directory structure:

my_dataset/
  images/
    train/
      img_001.jpg
      ...
    val/
      img_100.jpg
      ...
  annotations/
    instances_train.json
    instances_val.json
  dataset.yaml
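
Because cache_labels() silently skips any image whose file is not found on disk, it can help to check the layout before training. The following is a minimal sketch using only the standard library; find_missing_images() is a hypothetical helper, and the paths mirror the example layout above.

```python
import json
from pathlib import Path


def find_missing_images(json_file, img_dir):
    """Return the file_name entries of a COCO JSON that do not exist under img_dir."""
    with open(json_file) as f:
        coco = json.load(f)
    img_dir = Path(img_dir)
    return [img["file_name"] for img in coco["images"] if not (img_dir / img["file_name"]).exists()]


if __name__ == "__main__":
    # Paths follow the example layout; adjust them to your dataset.
    json_file = Path("my_dataset/annotations/instances_train.json")
    if json_file.exists():
        missing = find_missing_images(json_file, "my_dataset/images/train")
        print(f"{len(missing)} images listed in the JSON are missing on disk")
```

An empty result means every file_name in the JSON resolves to a file under the image directory, so no images will be dropped during label parsing.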

Running Training on COCO JSON

With the dataset class, trainer class, and YAML configuration in place, training runs through the standard model.train() call. The only difference from a normal training run is the trainer=COCOTrainer argument, which tells Ultralytics to use the custom dataset loader instead of the default.

from ultralytics import YOLO

model = YOLO("yolo26n.pt")
model.train(data="dataset.yaml", epochs=100, imgsz=640, trainer=COCOTrainer)

The full training pipeline works as expected, including validation, checkpoint saving, and metric logging.

Full Implementation

For convenience, the full implementation is given below as a single copy-paste script. It includes the custom dataset, the custom trainer, and the training call. Save it next to your dataset.yaml file and run it directly.

import json
from collections import defaultdict
from pathlib import Path

import numpy as np

from ultralytics import YOLO
from ultralytics.data.dataset import DATASET_CACHE_VERSION, YOLODataset
from ultralytics.data.utils import get_hash, load_dataset_cache_file, save_dataset_cache_file
from ultralytics.models.yolo.detect import DetectionTrainer
from ultralytics.utils import TQDM, colorstr

class COCODataset(YOLODataset):
    """Dataset that reads COCO JSON annotations directly without conversion to .txt files."""

    def __init__(self, *args, json_file="", **kwargs):
        self.json_file = json_file
        super().__init__(*args, data={"channels": 3}, **kwargs)

    def get_img_files(self, img_path):
        return []

    def cache_labels(self, path=Path("./labels.cache")):
        x = {"labels": []}
        with open(self.json_file) as f:
            coco = json.load(f)

        categories = {cat["id"]: i for i, cat in enumerate(sorted(coco["categories"], key=lambda c: c["id"]))}

        img_to_anns = defaultdict(list)
        for ann in coco["annotations"]:
            img_to_anns[ann["image_id"]].append(ann)

        for img_info in TQDM(coco["images"], desc="reading annotations"):
            h, w = img_info["height"], img_info["width"]
            im_file = Path(self.img_path) / img_info["file_name"]
            if not im_file.exists():
                continue

            self.im_files.append(str(im_file))
            bboxes = []
            for ann in img_to_anns.get(img_info["id"], []):
                if ann.get("iscrowd", False):
                    continue
                box = np.array(ann["bbox"], dtype=np.float32)
                box[:2] += box[2:] / 2
                box[[0, 2]] /= w
                box[[1, 3]] /= h
                if box[2] <= 0 or box[3] <= 0:
                    continue
                cls = categories[ann["category_id"]]
                bboxes.append([cls, *box.tolist()])

            lb = np.array(bboxes, dtype=np.float32) if bboxes else np.zeros((0, 5), dtype=np.float32)
            x["labels"].append(
                {
                    "im_file": str(im_file),
                    "shape": (h, w),
                    "cls": lb[:, 0:1],
                    "bboxes": lb[:, 1:],
                    "segments": [],
                    "normalized": True,
                    "bbox_format": "xywh",
                }
            )
        x["hash"] = get_hash([self.json_file, str(self.img_path)])
        save_dataset_cache_file(self.prefix, path, x, DATASET_CACHE_VERSION)
        return x

    def get_labels(self):
        cache_path = Path(self.json_file).with_suffix(".cache")
        try:
            cache = load_dataset_cache_file(cache_path)
            assert cache["version"] == DATASET_CACHE_VERSION
            assert cache["hash"] == get_hash([self.json_file, str(self.img_path)])
            self.im_files = [lb["im_file"] for lb in cache["labels"]]
        except (FileNotFoundError, AssertionError, AttributeError, KeyError, ModuleNotFoundError):
            cache = self.cache_labels(cache_path)
        cache.pop("hash", None)
        cache.pop("version", None)
        return cache["labels"]

class COCOTrainer(DetectionTrainer):
    """Trainer that uses COCODataset for direct COCO JSON training."""

    def build_dataset(self, img_path, mode="train", batch=None):
        json_file = self.data["train_json"] if mode == "train" else self.data.get("val_json", self.data["train_json"])
        return COCODataset(
            img_path=img_path,
            json_file=json_file,
            imgsz=self.args.imgsz,
            batch_size=batch,
            augment=mode == "train",
            hyp=self.args,
            rect=self.args.rect or mode == "val",
            cache=self.args.cache or None,
            single_cls=self.args.single_cls or False,
            stride=int(self.model.stride.max()) if hasattr(self, "model") and self.model else 32,
            pad=0.0 if mode == "train" else 0.5,
            prefix=colorstr(f"{mode}: "),
            task=self.args.task,
            classes=self.args.classes,
            fraction=self.args.fraction if mode == "train" else 1.0,
        )

model = YOLO("yolo26n.pt")
model.train(data="dataset.yaml", epochs=100, imgsz=640, trainer=COCOTrainer)

For hyperparameter recommendations, see the Tips for Model Training guide.

FAQ

How does this differ from convert_coco()?

convert_coco() writes .txt label files to disk as a one-time conversion. This approach parses the JSON when training starts (or loads the cached result on later runs) and converts the annotations in memory. Use convert_coco() when you prefer persistent YOLO-format labels; use this approach when you want to keep the COCO JSON as the single source of truth without generating per-image label files.

Can YOLO train on COCO JSON without custom code?

Not with the current Ultralytics pipeline, which expects YOLO .txt labels by default. This guide provides the minimum custom code required: one dataset class and one trainer class. Once they are defined, training needs only a standard model.train() call.

Does this support segmentation and pose estimation?

This guide covers object detection. To add instance segmentation support, include the segmentation polygon data from COCO annotations in the segments field of each label dictionary. For pose estimation, include keypoints. The GroundingDataset source code provides a reference implementation for handling segments.
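
As a rough starting point for the segmentation case, the sketch below normalizes COCO polygon lists into (x, y) point lists. The function polygons_to_segments() is hypothetical, and the exact structure the segments field expects (for example, NumPy arrays rather than lists of tuples) is an assumption that should be checked against the GroundingDataset source before use.

```python
def polygons_to_segments(segmentation, img_w, img_h):
    """Convert COCO polygons [x1, y1, x2, y2, ...] in pixels to lists of normalized (x, y) points."""
    segments = []
    if not isinstance(segmentation, list):
        return segments  # RLE masks are stored as dicts and are not handled in this sketch
    for poly in segmentation:
        if len(poly) < 6:
            continue  # fewer than three points cannot form a polygon
        points = [(poly[i] / img_w, poly[i + 1] / img_h) for i in range(0, len(poly) - 1, 2)]
        segments.append(points)
    return segments


# One triangle in a 640x480 image
print(polygons_to_segments([[0, 0, 320, 0, 320, 240]], 640, 480))
# [[(0.0, 0.0), (0.5, 0.0), (0.5, 0.5)]]
```

Normalizing x by the image width and y by the image height mirrors what cache_labels() does for bounding boxes, so boxes and polygons stay in the same coordinate system.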

Do augmentations work with this custom dataset?

Yes. COCODataset extends YOLODataset, so all built-in augmentations (mosaic, mixup, copy-paste, and others) work without modification.

How are category IDs mapped to class indices?

Categories are sorted by id and mapped to sequential indices starting from 0. This handles 1-based IDs (standard COCO), 0-based IDs, and non-contiguous IDs. The names dictionary in dataset.yaml should list the classes in the same ascending-id order, so that index 0 is the category with the smallest id.
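
The mapping can be reproduced outside the dataset class to generate a matching names dictionary. This is a small sketch with a hypothetical helper build_class_map() and made-up categories; it applies the same sort-then-enumerate step as cache_labels().

```python
def build_class_map(categories):
    """Map COCO category IDs to 0-based class indices, ordered by ascending ID."""
    ordered = sorted(categories, key=lambda c: c["id"])
    id_to_index = {cat["id"]: i for i, cat in enumerate(ordered)}
    names = {i: cat["name"] for i, cat in enumerate(ordered)}
    return id_to_index, names


# Non-contiguous, unsorted IDs as they might appear in a custom COCO export
cats = [{"id": 7, "name": "truck"}, {"id": 1, "name": "person"}, {"id": 3, "name": "car"}]
id_to_index, names = build_class_map(cats)
print(id_to_index)  # {1: 0, 3: 1, 7: 2}
print(names)  # {0: 'person', 1: 'car', 2: 'truck'}
```

The names output can be copied into dataset.yaml, with nc set to the number of entries.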

Is there a performance overhead compared to pre-converted labels?

The COCO JSON is parsed once, on the first training run. The parsed labels are saved to a .cache file, so subsequent runs load them immediately without re-parsing. Because annotations are held in memory, training speed is the same as standard YOLO training. If the JSON file changes, the cache is rebuilt automatically.

Contributors

raimbekovm (2), glenn-jocher (1)

Created 2 months ago. Updated 2 months ago.