
Data Preparation

Data preparation is the foundation of successful computer vision models. Ultralytics Platform provides comprehensive tools for managing your training data, from upload through annotation to analysis.

Overview

The Data section of Ultralytics Platform helps you:

  • Upload images, videos, and ZIP archives
  • Annotate with manual tools and AI-assisted labeling
  • Analyze your data with statistics and visualizations
  • Export in standard formats for local training

Workflow

graph LR
    A[📤 Upload] --> B[🏷️ Annotate]
    B --> C[📊 Analyze]
    C --> D[🚀 Train]

    style A fill:#4CAF50,color:#fff
    style B fill:#2196F3,color:#fff
    style C fill:#FF9800,color:#fff
    style D fill:#9C27B0,color:#fff

| Stage | Description |
| --- | --- |
| Upload | Import images, videos, or ZIP archives with automatic processing |
| Annotate | Label data with bounding boxes, polygons, keypoints, or classifications |
| Analyze | View class distributions, spatial heatmaps, and dimension statistics |
| Export | Download in NDJSON format for offline use |

Supported Tasks

Ultralytics Platform supports all five YOLO task types:

| Task | Description | Annotation Tool |
| --- | --- | --- |
| Detect | Object detection with bounding boxes | Rectangle tool |
| Segment | Instance segmentation with pixel masks | Polygon tool |
| Pose | Keypoint estimation (17-point COCO format) | Keypoint tool |
| OBB | Oriented bounding boxes for rotated objects | Oriented box tool |
| Classify | Image-level classification | Class selector |

Key Features

Smart Storage

Ultralytics Platform stores uploaded data efficiently:

  • Deduplication: Identical images are stored only once
  • Integrity: Checksums detect corrupted or altered files
  • Efficiency: Optimized storage and fast processing
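
The deduplication and integrity bullets above can be illustrated with a content-addressed store. The following is a minimal sketch, not the Platform's actual implementation: the choice of SHA-256 and the names `content_key`, `deduplicate`, and `verify` are assumptions made for this example.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Return a SHA-256 hex digest that identifies a file by its bytes."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(files: dict[str, bytes]) -> tuple[dict[str, bytes], dict[str, str]]:
    """Store each distinct payload once and map every filename to its key.

    Hypothetical helper: returns (blobs keyed by digest, filename -> digest).
    """
    blobs: dict[str, bytes] = {}
    index: dict[str, str] = {}
    for name, data in files.items():
        key = content_key(data)
        blobs.setdefault(key, data)  # identical bytes are kept only once
        index[name] = key
    return blobs, index


def verify(blobs: dict[str, bytes]) -> bool:
    """Recompute every digest and confirm it still matches its key."""
    return all(content_key(data) == key for key, data in blobs.items())
```

Two uploads with identical bytes resolve to the same key, so only one copy is kept, and recomputing the digest later reveals any change to the stored bytes.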

Dataset URIs

Reference datasets using the ul:// URI format:

yolo train data=ul://username/datasets/my-dataset

This lets you train on Platform datasets from any machine where your API key is configured.
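
To make the shape of the URI concrete, here is a small sketch that splits a `ul://` reference into its parts. It assumes the three-part layout seen in the example above (username, resource type, resource name). `PlatformRef` and `parse_ul_uri` are hypothetical names for illustration and are not part of the Ultralytics package.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class PlatformRef(NamedTuple):
    """Parts of a ul:// reference (hypothetical structure for illustration)."""

    username: str
    resource: str  # for example "datasets"
    name: str


def parse_ul_uri(uri: str) -> PlatformRef:
    """Split ul://username/resource/name into its three components."""
    parts = urlsplit(uri)
    if parts.scheme != "ul":
        raise ValueError(f"expected a ul:// URI, got {uri!r}")
    segments = [segment for segment in parts.path.split("/") if segment]
    if not parts.netloc or len(segments) != 2:
        raise ValueError(f"expected ul://username/resource/name, got {uri!r}")
    return PlatformRef(parts.netloc, segments[0], segments[1])
```

Calling `parse_ul_uri("ul://username/datasets/my-dataset")` yields the username, the resource type `datasets`, and the dataset name `my-dataset`. Any other scheme or path depth raises `ValueError`.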

Statistics and Visualization

Every dataset includes automatic statistics:

  • Class Distribution: Bar chart of label counts per class
  • Location Heatmap: Spatial distribution of annotations
  • Dimension Analysis: Image width vs height distribution
  • Split Breakdown: Train/validation/test sample counts

Related pages:

  • Datasets: Upload and manage your training data
  • Annotation: Label data with manual and AI-assisted tools
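
As a rough local counterpart to the class distribution statistic in this section, the sketch below counts labels per class from YOLO-format detection label text, where each line starts with an integer class ID. It is an illustration only and says nothing about how the Platform computes its charts. `class_distribution` and the sample class names are assumptions.

```python
from collections import Counter


def class_distribution(label_texts: list[str], names: dict[int, str]) -> Counter:
    """Count annotations per class name across YOLO-format label files.

    Each non-empty line is expected to begin with an integer class ID,
    followed by normalized coordinates that are ignored here.
    """
    counts: Counter = Counter()
    for text in label_texts:
        for line in text.splitlines():
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            counts[names[int(fields[0])]] += 1
    return counts


# Hypothetical sample: two label files and a two-class name map.
labels = [
    "0 0.5 0.5 0.2 0.2\n1 0.1 0.1 0.05 0.05\n",
    "0 0.3 0.3 0.1 0.1\n",
]
distribution = class_distribution(labels, {0: "person", 1: "car"})
```

The resulting counter can be passed to any plotting library to produce a bar chart of label counts per class.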

FAQ

What file formats are supported for upload?

Ultralytics Platform supports:

  • Images: JPG, PNG, WebP, TIFF, BMP, and other common formats
  • Videos: MP4, AVI, MOV (frames are extracted automatically)
  • Archives: ZIP files containing images with optional YOLO-format labels
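
Before uploading, it can help to sort local files into these three groups. The sketch below does so by file extension. It covers only the formats named above, and treating `.jpeg` and `.tif` as aliases is an assumption. `upload_kind` is a hypothetical helper, and a result of `"unknown"` does not mean the Platform would reject the file, since other common image formats are also accepted.

```python
from pathlib import Path

# Extensions taken from the list above; the .jpeg and .tif aliases are assumptions.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".webp", ".tif", ".tiff", ".bmp"}
VIDEO_EXTS = {".mp4", ".avi", ".mov"}
ARCHIVE_EXTS = {".zip"}


def upload_kind(filename: str) -> str:
    """Classify a file as "image", "video", "archive", or "unknown" by extension."""
    suffix = Path(filename).suffix.lower()
    if suffix in IMAGE_EXTS:
        return "image"
    if suffix in VIDEO_EXTS:
        return "video"
    if suffix in ARCHIVE_EXTS:
        return "archive"
    return "unknown"
```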

What is the maximum dataset size?

Storage limits depend on your plan:

| Plan | Storage Limit |
| --- | --- |
| Free | 100 GB |
| Pro | 500 GB |
| Enterprise | Custom |

Can I use my Platform datasets for local training?

Yes! Use the dataset URI format to train locally:

export ULTRALYTICS_API_KEY="your_key"
yolo train data=ul://username/datasets/my-dataset epochs=100

Or export your dataset in NDJSON format for fully offline training.
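
NDJSON stores one JSON object per line, so an exported file can be read with the standard library alone. The following sketch shows a generic reader. It makes no assumption about the fields the Platform writes, and `read_ndjson` is a hypothetical helper name.

```python
import json
from typing import Iterable


def read_ndjson(lines: Iterable[str]) -> list:
    """Parse newline-delimited JSON, skipping blank lines.

    Raises ValueError with the 1-based line number if a line is not valid JSON.
    """
    records = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {number}: {exc}") from exc
    return records
```

Because an open text file iterates line by line, the function can be called as `read_ndjson(f)` inside a `with open(path, encoding="utf-8") as f:` block, where `path` points at your exported file.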



glenn-jocher
