xView Dataset

Q: What is the xView dataset and how does it benefit computer vision research?

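The xView dataset is one of the largest publicly available collections of high-resolution overhead imagery, containing over 1 million object instances across 60 classes. It is designed to advance several areas of computer vision research: reducing the minimum resolution needed for detection, improving learning efficiency, enabling discovery of more object classes, and advancing fine-grained object detection.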

Q: How can I use Ultralytics YOLO to train a model on the xView dataset?

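To train a model on the xView dataset with Ultralytics YOLO, load a pretrained model and train it with data=xView.yaml, as in the training example later on this page. For detailed arguments and settings, refer to the model Training page.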

Q: How do I cite the xView dataset in my research?

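If you use the xView dataset in your research, please cite the xView paper (Lam et al., 2018); the BibTeX entry appears in the citation section of this page. For more information about the xView dataset, visit the official xView dataset website.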

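The xView dataset is one of the largest publicly available datasets of overhead imagery, containing images of complex scenes around the world annotated with bounding boxes. The goal of the xView dataset is to accelerate progress in four computer vision frontiers:

Reduce the minimum resolution for detection.
Improve learning efficiency.
Enable discovery of more object classes.
Improve detection of fine-grained classes.

xView builds on the success of challenges such as Common Objects in Context (COCO). It aims to use computer vision to analyze the growing volume of imagery available from space, in order to understand the visual world in new ways and address a range of important applications.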

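Key Features

xView contains over 1 million object instances across 60 classes.
The dataset has a resolution of 0.3 meters, providing higher-resolution imagery than most public satellite imagery datasets.
xView features a diverse collection of small, rare, fine-grained, and multi-type objects with bounding box annotations.
It comes with a pre-trained baseline model using the TensorFlow Object Detection API, along with an example for PyTorch.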

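Dataset Structure

The xView dataset is composed of satellite images collected from WorldView-3 satellites at a ground sample distance of 0.3 m. It contains over 1 million objects across 60 classes in more than 1,400 km² of imagery. The dataset is particularly valuable for remote sensing applications and environmental monitoring.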
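To make the ground sample distance concrete, the sketch below converts between pixels and ground distance at 0.3 m per pixel. The helper names, the 640-pixel tile edge, and the 4.5 m car length are illustrative assumptions, not part of the dataset tooling.

```python
GSD_M = 0.3  # ground sample distance stated above: metres per pixel


def pixels_to_metres(pixels: float, gsd_m: float = GSD_M) -> float:
    """Ground distance covered by a run of pixels at the given GSD."""
    return pixels * gsd_m


def metres_to_pixels(metres: float, gsd_m: float = GSD_M) -> float:
    """Approximate number of pixels spanned by an object of the given length."""
    return metres / gsd_m


# A hypothetical 640-pixel tile edge and an assumed 4.5 m long car
print(f"640 px tile edge covers {pixels_to_metres(640):.1f} m")
print(f"4.5 m car spans about {metres_to_pixels(4.5):.1f} px")
```

At this resolution a typical car occupies only a handful of pixels along its length, which is why small-object detection is a central challenge for this dataset.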

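Applications

The xView dataset is widely used to train and evaluate deep learning models for object detection in overhead imagery. Its diverse object classes and high-resolution imagery make it a valuable resource for researchers and practitioners in computer vision, especially for satellite imagery analysis. Applications include:

Military and defense reconnaissance
Urban planning and development
Environmental monitoring
Disaster response and assessment
Infrastructure mapping and management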

Dataset YAML

A YAML (Yet Another Markup Language) file defines the dataset configuration. It contains the dataset's paths, classes, and other relevant information. For the xView dataset, the xView.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/xView.yaml.
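As a rough illustration of how the top-level keys relate, the sketch below mirrors them in a plain Python dict and resolves the split files against the dataset root. The dict and the `resolve_split` helper are hypothetical and written only for this example; they are not how Ultralytics itself loads the file.

```python
from pathlib import Path

# Hypothetical in-memory equivalent of the YAML's top-level keys.
# Only three of the 60 class names are shown for brevity.
config = {
    "path": "../datasets/xView",
    "train": "images/autosplit_train.txt",
    "val": "images/autosplit_val.txt",
    "names": {0: "Fixed-wing Aircraft", 1: "Small Aircraft", 2: "Cargo Plane"},
}


def resolve_split(cfg: dict, split: str) -> Path:
    """Join the dataset root with a split entry, which is relative to 'path'."""
    return Path(cfg["path"]) / cfg[split]


for split in ("train", "val"):
    print(split, "->", resolve_split(config, split).as_posix())
print("class 1 is", config["names"][1])
```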

ultralytics/cfg/datasets/xView.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# DIUx xView 2018 Challenge https://challenge.xviewdataset.org by U.S. National Geospatial-Intelligence Agency (NGA)
# --------  DOWNLOAD DATA MANUALLY and jar xf val_images.zip to 'datasets/xView' before running train command!  --------
# Documentation: https://docs.ultralytics.com/datasets/detect/xview/
# Example usage: yolo train data=xView.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── xView  ← downloads here (20.7 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/xView # dataset root dir
train: images/autosplit_train.txt # train images (relative to 'path') 90% of 847 train images
val: images/autosplit_val.txt # val images (relative to 'path') 10% of 847 train images

# Classes
names:
  0: Fixed-wing Aircraft
  1: Small Aircraft
  2: Cargo Plane
  3: Helicopter
  4: Passenger Vehicle
  5: Small Car
  6: Bus
  7: Pickup Truck
  8: Utility Truck
  9: Truck
  10: Cargo Truck
  11: Truck w/Box
  12: Truck Tractor
  13: Trailer
  14: Truck w/Flatbed
  15: Truck w/Liquid
  16: Crane Truck
  17: Railway Vehicle
  18: Passenger Car
  19: Cargo Car
  20: Flat Car
  21: Tank car
  22: Locomotive
  23: Maritime Vessel
  24: Motorboat
  25: Sailboat
  26: Tugboat
  27: Barge
  28: Fishing Vessel
  29: Ferry
  30: Yacht
  31: Container Ship
  32: Oil Tanker
  33: Engineering Vehicle
  34: Tower crane
  35: Container Crane
  36: Reach Stacker
  37: Straddle Carrier
  38: Mobile Crane
  39: Dump Truck
  40: Haul Truck
  41: Scraper/Tractor
  42: Front loader/Bulldozer
  43: Excavator
  44: Cement Mixer
  45: Ground Grader
  46: Hut/Tent
  47: Shed
  48: Building
  49: Aircraft Hangar
  50: Damaged Building
  51: Facility
  52: Construction Site
  53: Vehicle Lot
  54: Helipad
  55: Storage Tank
  56: Shipping container lot
  57: Shipping Container
  58: Pylon
  59: Tower

# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  import json
  import os
  from pathlib import Path

  import numpy as np
  from PIL import Image
  from tqdm import tqdm

  from ultralytics.data.utils import autosplit
  from ultralytics.utils.ops import xyxy2xywhn


  def convert_labels(fname=Path("xView/xView_train.geojson")):
      """Converts xView geoJSON labels to YOLO format, mapping classes to indices 0-59 and saving as text files."""
      path = fname.parent
      with open(fname, encoding="utf-8") as f:
          print(f"Loading {fname}...")
          data = json.load(f)

      # Make dirs
      labels = Path(path / "labels" / "train")
      os.system(f"rm -rf {labels}")
      labels.mkdir(parents=True, exist_ok=True)

      # xView classes 11-94 to 0-59
      xview_class2index = [-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 1, 2, -1, 3, -1, 4, 5, 6, 7, 8, -1, 9, 10, 11,
                           12, 13, 14, 15, -1, -1, 16, 17, 18, 19, 20, 21, 22, -1, 23, 24, 25, -1, 26, 27, -1, 28, -1,
                           29, 30, 31, 32, 33, 34, 35, 36, 37, -1, 38, 39, 40, 41, 42, 43, 44, 45, -1, -1, -1, -1, 46,
                           47, 48, 49, -1, 50, 51, -1, 52, -1, -1, -1, 53, 54, -1, 55, -1, -1, 56, -1, 57, -1, 58, 59]

      shapes = {}
      for feature in tqdm(data["features"], desc=f"Converting {fname}"):
          p = feature["properties"]
          if p["bounds_imcoords"]:
              id = p["image_id"]
              file = path / "train_images" / id
              if file.exists():  # 1395.tif missing
                  try:
                      box = np.array([int(num) for num in p["bounds_imcoords"].split(",")])
                      assert box.shape[0] == 4, f"incorrect box shape {box.shape[0]}"
                      cls = p["type_id"]
                      cls = xview_class2index[int(cls)]  # xView class to 0-59
                      assert 59 >= cls >= 0, f"incorrect class index {cls}"

                      # Write YOLO label
                      if id not in shapes:
                          shapes[id] = Image.open(file).size
                      box = xyxy2xywhn(box[None].astype(np.float64), w=shapes[id][0], h=shapes[id][1], clip=True)
                      with open((labels / id).with_suffix(".txt"), "a", encoding="utf-8") as f:
                          f.write(f"{cls} {' '.join(f'{x:.6f}' for x in box[0])}\n")  # write label.txt
                  except Exception as e:
                      print(f"WARNING: skipping one label for {file}: {e}")


  # Download manually from https://challenge.xviewdataset.org
  dir = Path(yaml["path"])  # dataset root dir
  # urls = [
  #     "https://d307kc0mrhucc3.cloudfront.net/train_labels.zip",  # train labels
  #     "https://d307kc0mrhucc3.cloudfront.net/train_images.zip",  # 15G, 847 train images
  #     "https://d307kc0mrhucc3.cloudfront.net/val_images.zip",  # 5G, 282 val images (no labels)
  # ]
  # download(urls, dir=dir)

  # Convert labels
  convert_labels(dir / "xView_train.geojson")

  # Move images
  images = Path(dir / "images")
  images.mkdir(parents=True, exist_ok=True)
  Path(dir / "train_images").rename(dir / "images" / "train")
  Path(dir / "val_images").rename(dir / "images" / "val")

  # Split
  autosplit(dir / "images" / "train")
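
The label conversion in the download script turns each pixel-space bounds_imcoords string into a normalized YOLO row. The sketch below reproduces that arithmetic in plain Python for a single box. `xyxy_to_xywhn` and `yolo_label_line` are hypothetical helpers written for illustration, and their clipping is a simplification that only approximates what the library's `xyxy2xywhn` does with `clip=True`.

```python
def xyxy_to_xywhn(box, img_w, img_h):
    """Convert a pixel (x_min, y_min, x_max, y_max) box to normalized (x_center, y_center, width, height)."""
    x1, y1, x2, y2 = box
    # Clip to the image bounds so boxes that spill over an edge stay within 0-1
    x1, x2 = max(0, min(x1, img_w)), max(0, min(x2, img_w))
    y1, y2 = max(0, min(y1, img_h)), max(0, min(y2, img_h))
    return (
        (x1 + x2) / 2 / img_w,  # x center
        (y1 + y2) / 2 / img_h,  # y center
        (x2 - x1) / img_w,  # width
        (y2 - y1) / img_h,  # height
    )


def yolo_label_line(cls, bounds_imcoords, img_w, img_h):
    """Format one label row from a class index and a comma-separated bounds string."""
    box = [int(num) for num in bounds_imcoords.split(",")]
    values = xyxy_to_xywhn(box, img_w, img_h)
    return f"{cls} " + " ".join(f"{v:.6f}" for v in values)


# Made-up example: class index 5 in an assumed 1000 x 800 pixel image
print(yolo_label_line(5, "100,200,300,400", img_w=1000, img_h=800))
```

Each row written this way holds a class index followed by four values in the 0-1 range, so the labels stay valid if the images are later resized.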

Usage

To train a model on the xView dataset for 100 epochs with an image size of 640, you can use the following code snippets. For the full list of available arguments, refer to the model Training page.

Train Example

Python

from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="xView.yaml", epochs=100, imgsz=640)

CLI

# Start training from a pretrained *.pt model
yolo detect train data=xView.yaml model=yolo11n.pt epochs=100 imgsz=640

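Sample Data and Annotations

The xView dataset contains high-resolution satellite images with a diverse set of objects annotated with bounding boxes. Here is an example of data from the dataset, along with its annotations:

Dataset sample image

Overhead imagery: This image shows an example of object detection in overhead imagery, where objects are annotated with bounding boxes. The dataset provides high-resolution satellite images to facilitate the development of models for this task.

This example showcases the variety and complexity of the data in the xView dataset and highlights the importance of high-quality satellite imagery for object detection tasks.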
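
To inspect annotations like these after conversion, a YOLO-format label row can be mapped back to a pixel-space box. This is a minimal sketch: `parse_yolo_label` is a hypothetical helper, and the label row and the 1000 x 800 image size are made-up values for illustration.

```python
def parse_yolo_label(line: str, img_w: int, img_h: int):
    """Return (class_index, (x_min, y_min, x_max, y_max)) in pixels for one YOLO label row."""
    parts = line.split()
    cls = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    # Undo the normalization: scale centers and sizes back to pixel coordinates
    box = (
        round((xc - w / 2) * img_w),
        round((yc - h / 2) * img_h),
        round((xc + w / 2) * img_w),
        round((yc + h / 2) * img_h),
    )
    return cls, box


# Three of the 60 class names from xView.yaml, enough for this example
NAMES = {4: "Passenger Vehicle", 5: "Small Car", 6: "Bus"}

cls, box = parse_yolo_label("5 0.200000 0.375000 0.200000 0.250000", img_w=1000, img_h=800)
print(NAMES[cls], box)
```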

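If you work with satellite imagery, you may also want to explore these related datasets:

DOTA-v2: A dataset for oriented object detection in aerial images
VisDrone: A dataset for object detection and tracking in drone-captured imagery
Argoverse: A dataset for autonomous driving with 3D tracking annotations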

Citations and Acknowledgments

If you use the xView dataset in your research or development work, please cite the following paper:

BibTeX

@misc{lam2018xview,
      title={xView: Objects in Context in Overhead Imagery},
      author={Darius Lam and Richard Kuzma and Kevin McGee and Samuel Dooley and Michael Laielli and Matthew Klaric and Yaroslav Bulatov and Brendan McCord},
      year={2018},
      eprint={1802.07856},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

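We would like to thank the Defense Innovation Unit (DIU) and the creators of the xView dataset for their valuable contribution to the computer vision research community. For more information about the xView dataset and its creators, visit the xView dataset website.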

FAQ