VisDrone Dataset
The VisDrone Dataset is a large-scale benchmark created by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. It contains carefully annotated ground truth data for various computer vision tasks related to drone-based image and video analysis.
Watch: How to Train Ultralytics YOLO Models on the VisDrone Dataset for Drone Image Analysis
VisDrone is composed of 288 video clips with 261,908 frames and 10,209 static images, captured by various drone-mounted cameras. It covers a wide range of aspects, including location (14 different cities across China), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). The dataset was collected using various drone platforms, under different scenarios and weather and lighting conditions. These frames are manually annotated with over 2.6 million bounding boxes of targets such as pedestrians, cars, bicycles, and tricycles. Attributes like scene visibility, object class, and occlusion are also provided for better data utilization.
Dataset Structure
The VisDrone dataset is organized into five main subsets, each focusing on a specific task:
- Task 1: Object detection in images
- Task 2: Object detection in videos
- Task 3: Single-object tracking
- Task 4: Multi-object tracking
- Task 5: Crowd counting
Applications
The VisDrone dataset is widely used for training and evaluating deep learning models in drone-based computer vision tasks such as object detection, object tracking, and crowd counting. The dataset's diverse set of sensor data, object annotations, and attributes make it a valuable resource for researchers and practitioners in the field of drone-based computer vision.
Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration; it contains the dataset's paths, classes, and other relevant information. For the VisDrone dataset, the VisDrone.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VisDrone.yaml.
ultralytics/cfg/datasets/VisDrone.yaml
# Ultralytics YOLO 🚀, AGPL-3.0 license
# VisDrone2019-DET dataset https://github.com/VisDrone/VisDrone-Dataset by Tianjin University
# Documentation: https://docs.ultralytics.com/datasets/detect/visdrone/
# Example usage: yolo train data=VisDrone.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── VisDrone ← downloads here (2.3 GB)
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/VisDrone # dataset root dir
train: VisDrone2019-DET-train/images # train images (relative to 'path') 6471 images
val: VisDrone2019-DET-val/images # val images (relative to 'path') 548 images
test: VisDrone2019-DET-test-dev/images # test images (optional) 1610 images
# Classes
names:
  0: pedestrian
  1: people
  2: bicycle
  3: car
  4: van
  5: truck
  6: tricycle
  7: awning-tricycle
  8: bus
  9: motor
# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  import os
  from pathlib import Path

  from ultralytics.utils.downloads import download

  def visdrone2yolo(dir):
      from PIL import Image
      from tqdm import tqdm

      def convert_box(size, box):
          # Convert VisDrone box to YOLO xywh box
          dw = 1. / size[0]
          dh = 1. / size[1]
          return (box[0] + box[2] / 2) * dw, (box[1] + box[3] / 2) * dh, box[2] * dw, box[3] * dh

      (dir / 'labels').mkdir(parents=True, exist_ok=True)  # make labels directory
      pbar = tqdm((dir / 'annotations').glob('*.txt'), desc=f'Converting {dir}')
      for f in pbar:
          img_size = Image.open((dir / 'images' / f.name).with_suffix('.jpg')).size
          lines = []
          with open(f, 'r') as file:  # read annotation.txt
              for row in [x.split(',') for x in file.read().strip().splitlines()]:
                  if row[4] == '0':  # VisDrone 'ignored regions' class 0
                      continue
                  cls = int(row[5]) - 1
                  box = convert_box(img_size, tuple(map(int, row[:4])))
                  lines.append(f"{cls} {' '.join(f'{x:.6f}' for x in box)}\n")
              with open(str(f).replace(f'{os.sep}annotations{os.sep}', f'{os.sep}labels{os.sep}'), 'w') as fl:
                  fl.writelines(lines)  # write label.txt

  # Download
  dir = Path(yaml['path'])  # dataset root dir
  urls = ['https://github.com/ultralytics/assets/releases/download/v0.0.0/VisDrone2019-DET-train.zip',
          'https://github.com/ultralytics/assets/releases/download/v0.0.0/VisDrone2019-DET-val.zip',
          'https://github.com/ultralytics/assets/releases/download/v0.0.0/VisDrone2019-DET-test-dev.zip',
          'https://github.com/ultralytics/assets/releases/download/v0.0.0/VisDrone2019-DET-test-challenge.zip']
  download(urls, dir=dir, curl=True, threads=4)

  # Convert
  for d in 'VisDrone2019-DET-train', 'VisDrone2019-DET-val', 'VisDrone2019-DET-test-dev':
      visdrone2yolo(dir / d)  # convert VisDrone annotations to YOLO labels
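At the heart of the download script is `convert_box`, which maps VisDrone's `(bbox_left, bbox_top, bbox_width, bbox_height)` pixel annotations to YOLO's normalized `(x_center, y_center, width, height)` format. A standalone sketch of the same arithmetic, with illustrative numbers:

```python
def convert_box(size, box):
    """Convert a VisDrone pixel box to a normalized YOLO xywh box.

    size: (image_width, image_height) in pixels
    box:  (bbox_left, bbox_top, bbox_width, bbox_height) in pixels
    """
    dw = 1.0 / size[0]  # horizontal normalization factor
    dh = 1.0 / size[1]  # vertical normalization factor
    return (
        (box[0] + box[2] / 2) * dw,  # x_center, normalized
        (box[1] + box[3] / 2) * dh,  # y_center, normalized
        box[2] * dw,  # width, normalized
        box[3] * dh,  # height, normalized
    )


# Example: a 100x40 px box with top-left corner at (50, 20) in a 200x100 px image
print(convert_box((200, 100), (50, 20, 100, 40)))  # normalized (x, y, w, h)
```

Note that the script also drops VisDrone's "ignored regions" (rows whose score field is 0) and shifts the category IDs down by one, so that pedestrian becomes class 0.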
Usage
To train a YOLO11n model on the VisDrone dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model Training page.
Train Example
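A minimal sketch of the Python route, assuming the `ultralytics` package is installed; it mirrors the `yolo train data=VisDrone.yaml` example usage in the YAML header, and the download script in VisDrone.yaml fetches and converts the dataset on first run:

```python
from ultralytics import YOLO

# Load a COCO-pretrained YOLO11n model (weights are fetched on first use)
model = YOLO("yolo11n.pt")

# Train on VisDrone for 100 epochs at an image size of 640;
# the dataset is downloaded and converted automatically via VisDrone.yaml
results = model.train(data="VisDrone.yaml", epochs=100, imgsz=640)
```

The equivalent CLI call is `yolo train data=VisDrone.yaml model=yolo11n.pt epochs=100 imgsz=640`.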
Sample Data and Annotations
The VisDrone dataset contains a diverse set of images and videos captured by drone-mounted cameras. Here are some examples of data from the dataset, along with their corresponding annotations:
- Task 1: Object detection in images - This image demonstrates an example of object detection in images, where objects are annotated with bounding boxes. The dataset provides a wide variety of images taken from different locations, environments, and densities to facilitate the development of models for this task.
This example showcases the variety and complexity of the data in the VisDrone dataset and highlights the importance of high-quality sensor data for drone-based computer vision tasks.
Citations and Acknowledgments
If you use the VisDrone dataset in your research or development work, please cite the following paper:
@ARTICLE{9573394,
  author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Detection and Tracking Meet Drones Challenge},
  year={2021},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TPAMI.2021.3119563}
}
We would like to acknowledge the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China, for creating and maintaining the VisDrone dataset as a valuable resource for the drone-based computer vision research community. For more information about the VisDrone dataset and its creators, visit the VisDrone Dataset GitHub repository.
FAQ
What is the VisDrone Dataset, and what are its key features?
The VisDrone Dataset is a large-scale benchmark created by the AISKYEYE team at Tianjin University, China. It is designed for various computer vision tasks related to drone-based image and video analysis. Key features include:
- Composition: 288 video clips with 261,908 frames and 10,209 static images.
- Annotations: Over 2.6 million bounding boxes for objects like pedestrians, cars, bicycles, and tricycles.
- Diversity: Collected across 14 cities, in urban and rural settings, under different weather and lighting conditions.
- Tasks: Split into five main tasks—object detection in images and videos, single-object and multi-object tracking, and crowd counting.
How can I use the VisDrone Dataset to train a YOLO11 model with Ultralytics?
To train a YOLO11 model on the VisDrone dataset for 100 epochs with an image size of 640, you can follow these steps:
Train Example
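The steps can also be run entirely from the command line; a sketch following the example-usage comment in VisDrone.yaml (assumes the `ultralytics` package is installed):

```shell
# Train a COCO-pretrained YOLO11n model on VisDrone (100 epochs, image size 640)
yolo train data=VisDrone.yaml model=yolo11n.pt epochs=100 imgsz=640
```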
For additional configuration options, please refer to the model Training page.
What are the main subsets of the VisDrone dataset, and what are their applications?
The VisDrone dataset is divided into five main subsets, each tailored for a specific computer vision task:
- Task 1: Object detection in images.
- Task 2: Object detection in videos.
- Task 3: Single-object tracking.
- Task 4: Multi-object tracking.
- Task 5: Crowd counting.
These subsets are widely used for training and evaluating deep learning models in drone-based applications such as surveillance, traffic monitoring, and public safety.
Where can I find the configuration file for the VisDrone dataset in Ultralytics?
The configuration file for the VisDrone dataset, VisDrone.yaml, can be found in the Ultralytics repository at the following link: VisDrone.yaml.
How do I cite the VisDrone dataset if I use it in my research?
If you use the VisDrone dataset in your research or development work, please cite the following paper:
@ARTICLE{9573394,
  author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title={Detection and Tracking Meet Drones Challenge},
  year={2021},
  volume={},
  number={},
  pages={1-1},
  doi={10.1109/TPAMI.2021.3119563}
}