Global Wheat Head Dataset

Q: What are the key features of the Global Wheat Head Dataset?

Key features of the Global Wheat Head Dataset include: These features facilitate the development of robust models capable of generalization across multiple regions.

The Global Wheat Head Dataset is a collection of images designed to support the development of accurate wheat head detection models for applications in wheat phenotyping and crop management. Wheat heads, also known as spikes, are the grain-bearing parts of the wheat plant. Accurate estimation of wheat head density and size is essential for assessing crop health, maturity, and yield potential. The dataset, created by a collaboration of nine research institutes from seven countries, covers multiple growing regions to ensure models generalize well across different environments.

Key Features

The dataset contains over 3,000 training images from Europe (France, UK, Switzerland) and North America (Canada).
It includes approximately 1,000 test images from Australia, Japan, and China.
Images are outdoor field images, capturing the natural variability in wheat head appearances.
Annotations include wheat head bounding boxes to support object detection tasks.

Dataset Structure

The Global Wheat Head Dataset is organized into two main subsets:

Training Set: This subset contains over 3,000 images from Europe and North America. The images are labeled with wheat head bounding boxes, providing ground truth for training object detection models.
Test Set: This subset consists of approximately 1,000 images from Australia, Japan, and China. These images are used for evaluating the performance of trained models on unseen genotypes, environments, and observational conditions.

Applications

The Global Wheat Head Dataset is widely used for training and evaluating deep learning models in wheat head detection tasks. The dataset's diverse set of images, capturing a wide range of appearances, environments, and conditions, make it a valuable resource for researchers and practitioners in the field of plant phenotyping and crop management.

Dataset YAML

A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. For the case of the Global Wheat Head Dataset, the GlobalWheat2020.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/GlobalWheat2020.yaml.

ultralytics/cfg/datasets/GlobalWheat2020.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Global Wheat 2020 dataset https://www.global-wheat.com/ by University of Saskatchewan
# Documentation: https://docs.ultralytics.com/datasets/detect/globalwheat2020/
# Example usage: yolo train data=GlobalWheat2020.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── GlobalWheat2020 ← downloads here (7.0 GB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: GlobalWheat2020 # dataset root dir
train: # train images (relative to 'path') 3422 images
  - images/arvalis_1
  - images/arvalis_2
  - images/arvalis_3
  - images/ethz_1
  - images/rres_1
  - images/inrae_1
  - images/usask_1
val: # val images (relative to 'path') 748 images (WARNING: train set contains ethz_1)
  - images/ethz_1
test: # test images (optional) 1276 images
  - images/utokyo_1
  - images/utokyo_2
  - images/nau_1
  - images/uq_1

# Classes
names:
  0: wheat_head

# Download script/URL (optional) ---------------------------------------------------------------------------------------
download: |
  from pathlib import Path

  from ultralytics.utils.downloads import download

  # Download
  dir = Path(yaml["path"])  # dataset root dir
  urls = [
      "https://zenodo.org/record/4298502/files/global-wheat-codalab-official.zip",
      "https://github.com/ultralytics/assets/releases/download/v0.0.0/GlobalWheat2020_labels.zip",
  ]
  download(urls, dir=dir)

  # Make Directories
  for p in "annotations", "images", "labels":
      (dir / p).mkdir(parents=True, exist_ok=True)

  # Move
  for p in (
      "arvalis_1",
      "arvalis_2",
      "arvalis_3",
      "ethz_1",
      "rres_1",
      "inrae_1",
      "usask_1",
      "utokyo_1",
      "utokyo_2",
      "nau_1",
      "uq_1",
  ):
      (dir / "global-wheat-codalab-official" / p).rename(dir / "images" / p)  # move to /images
      f = (dir / "global-wheat-codalab-official" / p).with_suffix(".json")  # json file
      if f.exists():
          f.rename((dir / "annotations" / p).with_suffix(".json"))  # move to /annotations

Usage

To train a YOLO26n model on the Global Wheat Head Dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model Training page.

Train Example

PythonCLI

from ultralytics import YOLO

# Load a model
model = YOLO("yolo26n.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="GlobalWheat2020.yaml", epochs=100, imgsz=640)

# Start training from a pretrained *.pt model
yolo detect train data=GlobalWheat2020.yaml model=yolo26n.pt epochs=100 imgsz=640

Sample Data and Annotations

The Global Wheat Head Dataset contains a diverse set of outdoor field images, capturing the natural variability in wheat head appearances, environments, and conditions. Here are some examples of data from the dataset, along with their corresponding annotations:

Global Wheat dataset sample showing wheat head detection

Wheat Head Detection: This image demonstrates an example of wheat head detection, where wheat heads are annotated with bounding boxes. The dataset provides a variety of images to facilitate the development of models for this task.

The example showcases the variety and complexity of the data in the Global Wheat Head Dataset and highlights the importance of accurate wheat head detection for applications in wheat phenotyping and crop management.

Citations and Acknowledgments

If you use the Global Wheat Head Dataset in your research or development work, please cite the following paper:

BibTeX

@article{david2020global,
         title={Global Wheat Head Detection (GWHD) Dataset: A Large and Diverse Dataset of High-Resolution RGB-Labelled Images to Develop and Benchmark Wheat Head Detection Methods},
         author={David, Etienne and Madec, Simon and Sadeghi-Tehran, Pouria and Aasen, Helge and Zheng, Bangyou and Liu, Shouyang and Kirchgessner, Norbert and Ishikawa, Goro and Nagasawa, Koichi and Badhon, Minhajul and others},
         journal={arXiv preprint arXiv:2005.02162},
         year={2020}
}

We would like to acknowledge the researchers and institutions that contributed to the creation and maintenance of the Global Wheat Head Dataset as a valuable resource for the plant phenotyping and crop management research community. For more information about the dataset and its creators, visit the Global Wheat Head Dataset website.

FAQ

What is the Global Wheat Head Dataset used for?

The Global Wheat Head Dataset is primarily used for developing and training deep learning models aimed at wheat head detection. This is crucial for applications in wheat phenotyping and crop management, allowing for more accurate estimations of wheat head density, size, and overall crop yield potential. Accurate detection methods help in assessing crop health and maturity, essential for efficient crop management.

How do I train a YOLO26n model on the Global Wheat Head Dataset?

To train a YOLO26n model on the Global Wheat Head Dataset, you can use the following code snippets. Make sure you have the GlobalWheat2020.yaml configuration file specifying dataset paths and classes:

Train Example

PythonCLI

from ultralytics import YOLO

# Load a pretrained model (recommended for training)
model = YOLO("yolo26n.pt")

# Train the model
results = model.train(data="GlobalWheat2020.yaml", epochs=100, imgsz=640)

# Start training from a pretrained *.pt model
yolo detect train data=GlobalWheat2020.yaml model=yolo26n.pt epochs=100 imgsz=640

For a comprehensive list of available arguments, refer to the model Training page.

What are the key features of the Global Wheat Head Dataset?

Key features of the Global Wheat Head Dataset include:

Over 3,000 training images from Europe (France, UK, Switzerland) and North America (Canada).
Approximately 1,000 test images from Australia, Japan, and China.
High variability in wheat head appearances due to different growing environments.
Detailed annotations with wheat head bounding boxes to aid object detection models.

These features facilitate the development of robust models capable of generalization across multiple regions.

Where can I find the configuration YAML file for the Global Wheat Head Dataset?

The configuration YAML file for the Global Wheat Head Dataset, named GlobalWheat2020.yaml, is available on GitHub. You can access it at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/GlobalWheat2020.yaml. This file contains necessary information about dataset paths, classes, and other configuration details needed for model training in Ultralytics YOLO.

Why is wheat head detection important in crop management?

Wheat head detection is critical in crop management because it enables accurate estimation of wheat head density and size, which are essential for evaluating crop health, maturity, and yield potential. By leveraging deep learning models trained on datasets like the Global Wheat Head Dataset, farmers and researchers can better monitor and manage crops, leading to improved productivity and optimized resource use in agricultural practices. This technological advancement supports sustainable agriculture and food security initiatives.

For more information on applications of AI in agriculture, visit AI in Agriculture.

📅 Created 2 years ago ✏️ Updated 3 months ago