Hand Keypoints Dataset

Introduction

The hand-keypoints dataset contains 26,768 images of hands annotated with keypoints, making it suitable for training pose estimation models such as Ultralytics YOLO. The annotations were generated with the Google MediaPipe library for accuracy and consistency, and the dataset is distributed in a format compatible with Ultralytics YOLO11.

Hand Landmarks

KeyPoints

The dataset includes keypoints for hand detection. The keypoints are annotated as follows:

Wrist
Thumb (4 points)
Index finger (4 points)
Middle finger (4 points)
Ring finger (4 points)
Little finger (4 points)

Each hand has a total of 21 keypoints.
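
The count above can be checked with a short sketch that builds one name per keypoint. The names and the index order used here (wrist first, then four points per finger from thumb to little finger) are assumptions for illustration; the dataset's actual index order is not stated in this section and should be verified before use.

```python
# Hypothetical naming scheme: both the names and the index order are
# assumptions for illustration, not taken from the dataset itself.
FINGERS = ["thumb", "index", "middle", "ring", "little"]
POINTS_PER_FINGER = 4


def build_keypoint_names():
    """Return one name per keypoint: the wrist plus four points per finger."""
    names = ["wrist"]
    for finger in FINGERS:
        for joint in range(1, POINTS_PER_FINGER + 1):
            names.append(f"{finger}_{joint}")
    return names


KEYPOINT_NAMES = build_keypoint_names()
print(len(KEYPOINT_NAMES))  # 21
```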

Key Features

Large Dataset: 26,768 images with hand keypoint annotations.
YOLO11 Compatibility: Ready for use with YOLO11 models.
21 Keypoints: Detailed hand pose representation.

Dataset Structure

The hand keypoint dataset is split into two subsets:

Train: This subset contains 18,776 images from the hand keypoints dataset, annotated for training pose estimation models.
Val: This subset contains 7,992 images that can be used for validation purposes during model training.
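
As a quick sanity check, the two subset sizes add up to the 26,768 images reported for the whole dataset, which works out to roughly a 70/30 split. The helper name below is hypothetical and only illustrates the arithmetic.

```python
# Subset sizes as stated in this section.
TRAIN_IMAGES = 18_776
VAL_IMAGES = 7_992


def split_summary(train, val):
    """Return the total image count and the fraction held by each subset."""
    total = train + val
    return {
        "total": total,
        "train_fraction": train / total,
        "val_fraction": val / total,
    }


summary = split_summary(TRAIN_IMAGES, VAL_IMAGES)
print(summary["total"])  # 26768
print(f"train: {summary['train_fraction']:.1%}, val: {summary['val_fraction']:.1%}")
```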

Applications

Hand keypoints can be used for gesture recognition, AR/VR controls, robotic manipulation, and hand movement analysis in healthcare. They can also be applied in animation for motion capture and biometric authentication systems for security. The detailed tracking of finger positions enables precise interaction with virtual objects and touchless control interfaces.
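
To make the gesture recognition use case concrete, here is a minimal sketch of a pinch detector that works on predicted keypoints. The keypoint indices (wrist 0, thumb tip 4, index fingertip 8, middle finger base 9), the function name, and the threshold are all assumptions for illustration; check the indices against the dataset's real keypoint order before relying on them.

```python
import math

# Assumed indices (wrist first, then four points per finger).
# These are NOT confirmed by the dataset documentation.
WRIST = 0
THUMB_TIP = 4
INDEX_TIP = 8
MIDDLE_BASE = 9


def is_pinching(keypoints, threshold=0.25):
    """Return True when the thumb tip and index fingertip are close together.

    keypoints: sequence of 21 (x, y) pairs in any consistent unit.
    The tip distance is divided by a rough hand size (wrist to middle
    finger base) so the check does not depend on image resolution.
    """
    hand_size = math.dist(keypoints[WRIST], keypoints[MIDDLE_BASE])
    if hand_size == 0:
        return False  # degenerate hand, nothing sensible to report
    tip_distance = math.dist(keypoints[THUMB_TIP], keypoints[INDEX_TIP])
    return tip_distance / hand_size < threshold


# Synthetic example, not taken from the dataset.
hand = [(0.0, 0.0)] * 21
hand[MIDDLE_BASE] = (0.0, 100.0)
hand[THUMB_TIP] = (50.0, 50.0)
hand[INDEX_TIP] = (55.0, 52.0)
print(is_pinching(hand))
```

Normalizing by hand size is one design choice among several; a fixed pixel threshold would be simpler but would break when the hand moves closer to or farther from the camera.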

Dataset YAML

A YAML file defines the dataset configuration: the dataset's paths, classes, keypoint layout, and download URL. For the Hand Keypoints dataset, the hand-keypoints.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/hand-keypoints.yaml.

ultralytics/cfg/datasets/hand-keypoints.yaml

# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license

# Hand Keypoints dataset by Ultralytics
# Documentation: https://docs.ultralytics.com/datasets/pose/hand-keypoints/
# Example usage: yolo train data=hand-keypoints.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── hand-keypoints  ← downloads here (369 MB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/hand-keypoints # dataset root dir
train: train # train images (relative to 'path') 18776 images
val: val # val images (relative to 'path') 7992 images

# Keypoints
kpt_shape: [21, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx:
  [0, 1, 2, 4, 3, 10, 11, 12, 13, 14, 5, 6, 7, 8, 9, 15, 16, 17, 18, 19, 20]

# Classes
names:
  0: hand

# Download script/URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/hand-keypoints.zip
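
The kpt_shape and flip_idx entries work together: flip_idx tells a horizontal-flip augmentation which keypoint index each position should take its value from after mirroring. The sketch below copies the two values from the file above, checks that flip_idx is a valid permutation for 21 keypoints, and shows how such a mapping is typically applied. The function names are hypothetical, and this is an illustration rather than the library's own implementation.

```python
# Values copied from hand-keypoints.yaml above.
KPT_SHAPE = [21, 3]
FLIP_IDX = [0, 1, 2, 4, 3, 10, 11, 12, 13, 14, 5, 6, 7, 8, 9, 15, 16, 17, 18, 19, 20]


def check_flip_idx(kpt_shape, flip_idx):
    """Raise ValueError if flip_idx is not a self-inverse permutation."""
    num_keypoints = kpt_shape[0]
    if sorted(flip_idx) != list(range(num_keypoints)):
        raise ValueError("flip_idx must contain each keypoint index exactly once")
    # Flipping twice should restore the original keypoint order.
    if [flip_idx[i] for i in flip_idx] != list(range(num_keypoints)):
        raise ValueError("flip_idx must map back to the identity when applied twice")


def hflip_keypoints(keypoints, flip_idx):
    """Mirror normalized (x, y, visible) keypoints and reorder them.

    Assumes x is normalized to [0, 1], so mirroring is x -> 1 - x.
    """
    mirrored = [(1.0 - x, y, v) for x, y, v in keypoints]
    return [mirrored[i] for i in flip_idx]


check_flip_idx(KPT_SHAPE, FLIP_IDX)

# Synthetic keypoints, not taken from the dataset.
sample = [(i / 32, 0.5, 2) for i in range(KPT_SHAPE[0])]
flipped = hflip_keypoints(sample, FLIP_IDX)
restored = hflip_keypoints(flipped, FLIP_IDX)
print(restored == sample)
```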

Usage

To train a YOLO11n-pose model on the Hand Keypoints dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model Training page.

Train Example

Python

from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-pose.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="hand-keypoints.yaml", epochs=100, imgsz=640)

CLI

# Start training from a pretrained *.pt model
yolo pose train data=hand-keypoints.yaml model=yolo11n-pose.pt epochs=100 imgsz=640

Sample Images and Annotations

The Hand keypoints dataset contains a diverse set of images with human hands annotated with keypoints. Here are some examples of images from the dataset, along with their corresponding annotations:

Dataset sample image

Mosaiced Image: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.

The example showcases the variety and complexity of the images in the Hand Keypoints dataset and the benefits of using mosaicing during the training process.
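
For readers who want to inspect the annotations directly, the following sketch parses one label line, assuming the Ultralytics YOLO pose label layout: a class index, a normalized bounding box (center x, center y, width, height), then one group of values per keypoint. With kpt_shape set to [21, 3], that gives 5 + 63 = 68 values per line. The function name and the sample line are hypothetical and synthetic, not taken from the dataset.

```python
def parse_pose_label(line, num_keypoints=21, dims=3):
    """Split one pose label line into class id, box, and keypoints.

    Assumed layout: class cx cy w h, followed by num_keypoints groups
    of `dims` values (x, y, visible when dims == 3).
    """
    values = line.split()
    expected = 5 + num_keypoints * dims
    if len(values) != expected:
        raise ValueError(f"expected {expected} values, got {len(values)}")
    class_id = int(values[0])
    box = tuple(float(v) for v in values[1:5])
    flat = [float(v) for v in values[5:]]
    keypoints = [tuple(flat[i:i + dims]) for i in range(0, len(flat), dims)]
    return class_id, box, keypoints


# Synthetic label line for illustration only.
sample_line = "0 0.5 0.5 0.4 0.6 " + " ".join("0.5 0.5 2" for _ in range(21))
class_id, box, keypoints = parse_pose_label(sample_line)
print(class_id, box, len(keypoints))
```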

Citations and Acknowledgments

If you use the hand-keypoints dataset in your research or development work, please acknowledge the following sources:

Credits

We would like to thank the following sources for providing the images used in this dataset:

The images were collected and used under the respective licenses provided by each platform and are distributed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

We would also like to acknowledge the creator of this dataset, Rion Dsilva, for his great contribution to Vision AI research.

FAQ

How do I train a YOLO11 model on the Hand Keypoints dataset?

To train a YOLO11 model on the Hand Keypoints dataset, you can use either Python or the command line interface (CLI). Here's an example for training a YOLO11n-pose model for 100 epochs with an image size of 640:

Example

Python

from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n-pose.pt")  # load a pretrained model (recommended for training)

# Train the model
results = model.train(data="hand-keypoints.yaml", epochs=100, imgsz=640)

CLI

# Start training from a pretrained *.pt model
yolo pose train data=hand-keypoints.yaml model=yolo11n-pose.pt epochs=100 imgsz=640

For a comprehensive list of available arguments, refer to the model Training page.

What are the key features of the Hand Keypoints dataset?

The Hand Keypoints dataset is designed for advanced pose estimation tasks and includes several key features:

Large Dataset: Contains 26,768 images with hand keypoint annotations.
YOLO11 Compatibility: Ready for use with YOLO11 models.
21 Keypoints: Detailed hand pose representation, including wrist and finger joints.

For more details, you can explore the Hand Keypoints Dataset section.

What applications can benefit from using the Hand Keypoints dataset?

The Hand Keypoints dataset can be applied in various fields, including:

Gesture Recognition: Enhancing human-computer interaction.
AR/VR Controls: Improving user experience in augmented and virtual reality.
Robotic Manipulation: Enabling precise control of robotic hands.
Healthcare: Analyzing hand movements for medical diagnostics.
Animation: Capturing motion for realistic animations.
Biometric Authentication: Enhancing security systems.

For more information, refer to the Applications section.

How is the Hand Keypoints dataset structured?

The Hand Keypoints dataset is divided into two subsets:

Train: Contains 18,776 images for training pose estimation models.
Val: Contains 7,992 images for validation purposes during model training.

This structure ensures a comprehensive training and validation process. For more details, see the Dataset Structure section.

How do I use the dataset YAML file for training?

The dataset configuration is defined in a YAML file, which includes the dataset's paths, classes, and keypoint settings. The hand-keypoints.yaml file is maintained at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/hand-keypoints.yaml.

To use this YAML file for training, specify it in your training script or CLI command as shown in the training example above. For more details, refer to the Dataset YAML section.
