VisDrone Dataset

Q: What are the main subsets of the VisDrone dataset and their applications?

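The VisDrone dataset is divided into five main subsets, each tailored to a specific computer vision task: object detection in images, object detection in videos, single-object tracking, multi-object tracking, and crowd counting. These subsets are widely used to train and evaluate deep learning models for drone-based applications such as surveillance, traffic monitoring, and public safety.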

Q: Where can I find the configuration file for the VisDrone dataset in Ultralytics?

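The configuration file for the VisDrone dataset, VisDrone.yaml, is maintained in the Ultralytics repository at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VisDrone.yaml.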

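The VisDrone dataset is a large-scale benchmark created by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. It contains carefully annotated ground-truth data for a variety of computer vision tasks related to drone-based image and video analysis.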

Watch: How to Train Ultralytics YOLO Models on the VisDrone Dataset for Drone Image Analysis

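VisDrone consists of 288 video clips with 261,908 frames and 10,209 static images, captured by various drone-mounted cameras. The dataset covers a wide range of aspects, including location (14 different cities across China), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). It was collected using various drone platforms under different scenarios and under varying weather and lighting conditions. The frames are manually annotated with more than 2.6 million bounding boxes of targets such as pedestrians, cars, bicycles, and tricycles. Attributes such as scene visibility, object class, and occlusion are also provided for better data utilization.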

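Dataset Structure

The VisDrone dataset is organized into five main subsets, each focusing on a specific task:

Task 1: Object detection in images
Task 2: Object detection in videos
Task 3: Single-object tracking
Task 4: Multi-object tracking
Task 5: Crowd counting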

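Applications

The VisDrone dataset is widely used for training and evaluating deep learning models in drone-based computer vision tasks such as object detection, object tracking, and crowd counting. Its diverse sensor data, object annotations, and attributes make it a valuable resource for researchers and practitioners in drone-based computer vision.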

Dataset YAML

A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains the dataset's paths, classes, and other relevant information. For the VisDrone dataset, the VisDrone.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VisDrone.yaml.
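
The train, val, and test entries in VisDrone.yaml are given relative to its path entry. The following is a minimal sketch of how those entries resolve to directories. The `config` dict simply mirrors values from the file, and `resolve_splits` and `label_dir_for` are hypothetical helpers written for this illustration, not part of the Ultralytics API. The labels location assumes the layout produced by the file's own conversion script, which writes a labels directory next to images.

```python
from pathlib import Path

# Values copied from VisDrone.yaml; the dict itself is only an illustration.
config = {
    "path": "../datasets/VisDrone",
    "train": "VisDrone2019-DET-train/images",
    "val": "VisDrone2019-DET-val/images",
    "test": "VisDrone2019-DET-test-dev/images",
}


def resolve_splits(cfg):
    """Join each split entry onto the dataset root (hypothetical helper)."""
    root = Path(cfg["path"])
    return {split: root / cfg[split] for split in ("train", "val", "test") if split in cfg}


def label_dir_for(image_dir):
    """Return the sibling 'labels' directory of an 'images' directory (assumed layout)."""
    return image_dir.parent / "labels"


splits = resolve_splits(config)
for name, image_dir in splits.items():
    print(f"{name}: images={image_dir.as_posix()} labels={label_dir_for(image_dir).as_posix()}")
```

Keeping every split relative to a single root means the dataset can be moved by changing only the path entry.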

ultralytics/cfg/datasets/VisDrone.yaml

# Ultralytics YOLO 🚀, AGPL-3.0 license
# VisDrone2019-DET dataset https://github.com/VisDrone/VisDrone-Dataset by Tianjin University
# Documentation: https://docs.ultralytics.com/datasets/detect/visdrone/
# Example usage: yolo train data=VisDrone.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── VisDrone  ← downloads here (2.3 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/VisDrone # dataset root dir
train: VisDrone2019-DET-train/images # train images (relative to 'path')  6471 images
val: VisDrone2019-DET-val/images # val images (relative to 'path')  548 images
test: VisDrone2019-DET-test-dev/images # test images (optional)  1610 images

# Classes
names:
  0: pedestrian
  1: people
  2: bicycle
  3: car
  4: van
  5: truck
  6: tricycle
  7: awning-tricycle
  8: bus
  9: motor

# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  import os
  from pathlib import Path

  from ultralytics.utils.downloads import download

  def visdrone2yolo(dir):
      from PIL import Image
      from tqdm import tqdm

      def convert_box(size, box):
          # Convert VisDrone box to YOLO xywh box
          dw = 1. / size[0]
          dh = 1. / size[1]
          return (box[0] + box[2] / 2) * dw, (box[1] + box[3] / 2) * dh, box[2] * dw, box[3] * dh

      (dir / 'labels').mkdir(parents=True, exist_ok=True)  # make labels directory
      pbar = tqdm((dir / 'annotations').glob('*.txt'), desc=f'Converting {dir}')
      for f in pbar:
          img_size = Image.open((dir / 'images' / f.name).with_suffix('.jpg')).size
          lines = []
          with open(f, 'r') as file:  # read annotation.txt
              for row in [x.split(',') for x in file.read().strip().splitlines()]:
                  if row[4] == '0':  # VisDrone 'ignored regions' class 0
                      continue
                  cls = int(row[5]) - 1
                  box = convert_box(img_size, tuple(map(int, row[:4])))
                  lines.append(f"{cls} {' '.join(f'{x:.6f}' for x in box)}\n")
                  with open(str(f).replace(f'{os.sep}annotations{os.sep}', f'{os.sep}labels{os.sep}'), 'w') as fl:
                      fl.writelines(lines)  # write label.txt


  # Download
  dir = Path(yaml['path'])  # dataset root dir
  urls = ['https://github.com/ultralytics/assets/releases/download/v0.0.0/VisDrone2019-DET-train.zip',
          'https://github.com/ultralytics/assets/releases/download/v0.0.0/VisDrone2019-DET-val.zip',
          'https://github.com/ultralytics/assets/releases/download/v0.0.0/VisDrone2019-DET-test-dev.zip',
          'https://github.com/ultralytics/assets/releases/download/v0.0.0/VisDrone2019-DET-test-challenge.zip']
  download(urls, dir=dir, curl=True, threads=4)

  # Convert
  for d in 'VisDrone2019-DET-train', 'VisDrone2019-DET-val', 'VisDrone2019-DET-test-dev':
      visdrone2yolo(dir / d)  # convert VisDrone annotations to YOLO labels
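
To make the conversion step in the download script easier to follow, here is a small standalone sketch that applies the same arithmetic to a single annotation row. `convert_box` repeats the logic from the script above, while `row_to_yolo` and the sample rows are hypothetical and exist only for illustration. As in the script, the first four fields are read as left, top, width, and height in pixels, the fifth as the flag that marks ignored regions, and the sixth as the 1-based category; any remaining fields are ignored.

```python
def convert_box(size, box):
    """Convert a VisDrone (left, top, width, height) pixel box to a normalized YOLO (xc, yc, w, h) box."""
    dw = 1.0 / size[0]  # size is (image_width, image_height)
    dh = 1.0 / size[1]
    return (box[0] + box[2] / 2) * dw, (box[1] + box[3] / 2) * dh, box[2] * dw, box[3] * dh


def row_to_yolo(row, img_size):
    """Turn one comma-separated annotation row into a YOLO label line, or None if it is skipped."""
    fields = row.strip().split(",")
    if fields[4] == "0":  # same check as the script: skip ignored regions
        return None
    cls = int(fields[5]) - 1  # shift 1-based VisDrone categories to 0-based YOLO class IDs
    box = convert_box(img_size, tuple(map(int, fields[:4])))
    return f"{cls} " + " ".join(f"{x:.6f}" for x in box)


# Hypothetical rows on a 1000x500 image
print(row_to_yolo("300,100,200,50,1,4,0,0", (1000, 500)))  # 3 0.400000 0.250000 0.200000 0.100000
print(row_to_yolo("0,0,50,50,0,0,0,0", (1000, 500)))  # None
```

In the first row, the box centre is (300 + 200 / 2) / 1000 = 0.4 horizontally and (100 + 50 / 2) / 500 = 0.25 vertically, and category 4 becomes class 3, which is car in the names list above.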

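Usage

To train a YOLO11n model on the VisDrone dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model Training page.

Train Example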

Python

from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="VisDrone.yaml", epochs=100, imgsz=640)

CLI

# Start training from a pretrained *.pt model
yolo detect train data=VisDrone.yaml model=yolo11n.pt epochs=100 imgsz=640

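Sample Data and Annotations

The VisDrone dataset contains a diverse set of images and videos captured by drone-mounted cameras. Here are some examples of data from the dataset, along with their corresponding annotations:

Dataset sample image

Task 1: Object detection in images. This image shows an example of object detection in images, where objects are annotated with bounding boxes. The dataset provides a wide variety of images taken from different locations, environments, and densities to facilitate the development of models for this task.

This example showcases the variety and complexity of the data in the VisDrone dataset and highlights the importance of high-quality sensor data for drone-based computer vision tasks.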
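
When inspecting annotations like these, it can help to map a YOLO-format label line back to pixel coordinates before drawing it on an image. The following is a minimal sketch under stated assumptions: `yolo_to_pixels` is a hypothetical helper, not an Ultralytics function, the sample label line is made up, and `NAMES` is copied from the class list in VisDrone.yaml.

```python
# Class names copied from VisDrone.yaml
NAMES = {
    0: "pedestrian", 1: "people", 2: "bicycle", 3: "car", 4: "van",
    5: "truck", 6: "tricycle", 7: "awning-tricycle", 8: "bus", 9: "motor",
}


def yolo_to_pixels(label_line, img_size):
    """Convert 'cls xc yc w h' (normalized) to (class_name, (left, top, right, bottom)) in pixels."""
    cls, xc, yc, w, h = label_line.split()
    width, height = img_size
    # Scale the normalized centre and size back to pixel units
    xc, yc = float(xc) * width, float(yc) * height
    w, h = float(w) * width, float(h) * height
    left, top = xc - w / 2, yc - h / 2
    return NAMES[int(cls)], (round(left), round(top), round(left + w), round(top + h))


# Hypothetical label line for a 1000x500 image
name, corners = yolo_to_pixels("3 0.400000 0.250000 0.200000 0.100000", (1000, 500))
print(name, corners)  # car (300, 100, 500, 150)
```

The corner tuple can then be passed to whichever drawing routine is in use, for example a rectangle call in an imaging library.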

Citations and Acknowledgments

If you use the VisDrone dataset in your research or development work, please cite the following paper:

BibTeX

@ARTICLE{9573394,
  author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Detection and Tracking Meet Drones Challenge},
  year={2021},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TPAMI.2021.3119563}}

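We would like to thank the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China, for creating and maintaining the VisDrone dataset as a valuable resource for the drone-based computer vision research community. For more information about the VisDrone dataset and its creators, visit the VisDrone Dataset GitHub repository (https://github.com/VisDrone/VisDrone-Dataset).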

FAQ