
Model Training

Ultralytics Platform provides comprehensive tools for training YOLO models, from organizing experiments to running cloud training jobs with real-time metrics streaming.



Watch: Get Started with Ultralytics Platform - Train

Overview

The Training section helps you:

  • Organize models into projects for easier management
  • Train on cloud GPUs with a single click
  • Monitor real-time metrics during training
  • Compare model performance across experiments
  • Export to 17+ deployment formats (see supported formats)

Ultralytics Platform Train Overview

Workflow

```mermaid
graph LR
    A[📁 Project] --> B[⚙️ Configure]
    B --> C[🚀 Train]
    C --> D[📈 Monitor]
    D --> E[📦 Export]

    style A fill:#4CAF50,color:#fff
    style B fill:#2196F3,color:#fff
    style C fill:#FF9800,color:#fff
    style D fill:#9C27B0,color:#fff
    style E fill:#00BCD4,color:#fff
```
| Stage | Description |
|-------|-------------|
| Project | Create a workspace to organize related models |
| Configure | Select dataset, base model, and training parameters |
| Train | Run on cloud GPUs or your local hardware |
| Monitor | View real-time loss curves and metrics |
| Export | Convert to 17+ deployment formats (details) |

Training Options

Ultralytics Platform supports multiple training approaches:

| Method | Description | Best For |
|--------|-------------|----------|
| Cloud Training | Train on Ultralytics Cloud GPUs | No local GPU, scalability |
| Local Training | Train locally, stream metrics to the platform | Existing hardware, privacy |
| Colab Training | Use Google Colab with platform integration | Free GPU access |

GPU Options

Available GPUs for cloud training on Ultralytics Cloud:

| GPU | Generation | VRAM | Cost/Hour | Best For |
|-----|------------|------|-----------|----------|
| RTX 2000 Ada | Ada | 16 GB | $0.24 | Small datasets, testing |
| RTX A4500 | Ampere | 20 GB | $0.25 | Small-medium datasets |
| RTX 4000 Ada | Ada | 20 GB | $0.26 | Medium datasets |
| RTX A5000 | Ampere | 24 GB | $0.27 | Medium datasets |
| L4 | Ada | 24 GB | $0.39 | Inference optimized |
| A40 | Ampere | 48 GB | $0.40 | Larger batch sizes |
| RTX 3090 | Ampere | 24 GB | $0.46 | General training |
| RTX A6000 | Ampere | 48 GB | $0.49 | Large models |
| RTX PRO 4500 | Blackwell | 32 GB | $0.54 | Great price/performance |
| RTX 4090 | Ada | 24 GB | $0.59 | Best price/performance |
| RTX 6000 Ada | Ada | 48 GB | $0.77 | Large batch training |
| L40S | Ada | 48 GB | $0.86 | Large batch training |
| RTX 5090 | Blackwell | 32 GB | $0.89 | Latest consumer generation |
| L40 | Ada | 48 GB | $0.99 | Large models |
| A100 PCIe | Ampere | 80 GB | $1.39 | Production training |
| A100 SXM | Ampere | 80 GB | $1.49 | Production training |
| RTX PRO 6000 | Blackwell | 96 GB | $1.69 | Recommended default |
| H100 PCIe | Hopper | 80 GB | $2.39 | High-performance training |
| H100 SXM | Hopper | 80 GB | $2.69 | Fastest training |
| H100 NVL | Hopper | 94 GB | $3.07 | Maximum performance |
| H200 NVL | Hopper | 143 GB | $3.39 | Maximum memory (Pro+) |
| H200 SXM | Hopper | 141 GB | $3.59 | Maximum performance (Pro+) |
| B200 | Blackwell | 180 GB | $4.99 | Largest models (Pro+) |
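As a sketch of how you might pick from this table programmatically, here is an illustrative helper that selects the cheapest GPU meeting a VRAM requirement. The data is a small excerpt of the table above, and the selection logic is only an example, not a platform API:

```python
# Illustrative GPU picker: cheapest option with enough VRAM.
# GPUS is a small excerpt of the pricing table above (name, VRAM GB, $/h);
# this helper is an example sketch, not part of the Ultralytics API.
GPUS = [
    ("RTX 4090", 24, 0.59),
    ("A100 PCIe", 80, 1.39),
    ("RTX PRO 6000", 96, 1.69),
    ("H100 SXM", 80, 2.69),
    ("B200", 180, 4.99),
]


def cheapest_gpu(min_vram_gb: int):
    """Return (name, vram_gb, rate) of the cheapest GPU with enough VRAM."""
    eligible = [g for g in GPUS if g[1] >= min_vram_gb]
    return min(eligible, key=lambda g: g[2]) if eligible else None
```

For example, `cheapest_gpu(80)` would prefer the A100 PCIe over the H100 SXM on price alone, even though the H100 trains faster; in practice, you would weigh cost per hour against total training time.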

GPU Tier Access

H200 and B200 GPUs require a Pro or Enterprise plan. All other GPUs are available on all plans including Free.

Signup Credits

New accounts receive signup credits for training. Check Billing for details.

Real-Time Metrics

During training, view live metrics across three subtabs:

```mermaid
graph LR
    A[Charts] --> B[Loss Curves]
    A --> C[Performance Metrics]
    D[Console] --> E[Live Logs]
    D --> F[Error Detection]
    G[System] --> H[GPU Utilization]
    G --> I[Memory & Temp]

    style A fill:#2196F3,color:#fff
    style D fill:#FF9800,color:#fff
    style G fill:#9C27B0,color:#fff
```
| Subtab | Metrics |
|--------|---------|
| Charts | Box/class/DFL loss, mAP50, mAP50-95, precision, recall |
| Console | Live training logs with ANSI color and error detection |
| System | GPU utilization, memory, temperature, CPU, disk |

Automatic Checkpoints

The Platform automatically saves checkpoints at every epoch. The best model (highest mAP) and final model are always preserved.

Quick Start

Get started with cloud training in under a minute:

  1. Create a project in the sidebar
  2. Click New Model
  3. Select a model, dataset, and GPU
  4. Click Start Training
Alternatively, start training from the CLI or the Python API:

```bash
export ULTRALYTICS_API_KEY="YOUR_API_KEY"
yolo train model=yolo26n.pt data=ul://username/datasets/my-dataset \
  epochs=100 project=username/my-project name=exp1
```

```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")
model.train(
    data="ul://username/datasets/my-dataset",
    epochs=100,
    project="username/my-project",
    name="exp1",
)
```

FAQ

How long does training take?

Training time depends on:

  • Dataset size (number of images)
  • Model size (n, s, m, l, x)
  • Number of epochs
  • GPU type selected

A typical training run with 1000 images, YOLO26n, 100 epochs on RTX PRO 6000 takes about 2-3 hours. Smaller runs (500 images, 50 epochs on RTX 4090) complete in under an hour. See cost examples for detailed estimates.
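As a rough sketch of the cost arithmetic behind these estimates (hourly rates from the GPU table above; the durations are the ballpark figures quoted here, not guarantees):

```python
# Rough training-cost estimate: hours x GPU hourly rate.
# Rates come from the GPU pricing table above; the durations are the
# ballpark figures quoted in this FAQ, not measured values.


def estimate_cost(hours: float, rate_per_hour: float) -> float:
    """Return the estimated cost in USD, rounded to cents."""
    return round(hours * rate_per_hour, 2)


# 1000 images, YOLO26n, 100 epochs on RTX PRO 6000 ($1.69/h), ~2.5 h:
typical = estimate_cost(2.5, 1.69)  # about $4.2

# 500 images, 50 epochs on RTX 4090 ($0.59/h), under an hour:
small = estimate_cost(0.9, 0.59)  # well under $1
```

Actual costs scale with dataset size, model size, epochs, and the GPU's hourly rate; see the platform's cost examples for authoritative figures.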

Can I train multiple models simultaneously?

Yes. Concurrent cloud training limits depend on your plan: Free allows 3, Pro allows 10, and Enterprise is unlimited. For additional parallel training, use remote training from multiple machines.

What happens if training fails?

If training fails:

  1. Checkpoints are saved at each epoch
  2. You can resume from the last checkpoint
  3. Credits are only charged for completed compute time
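As an illustration of step 2, here is a minimal sketch that finds the most recent checkpoint in a local run directory. The helper and directory layout are hypothetical, not a platform API:

```python
# Hypothetical helper: find the newest .pt checkpoint in a run
# directory so training can be resumed from it. Illustrative only.
from pathlib import Path
from typing import Optional


def latest_checkpoint(run_dir: str) -> Optional[Path]:
    """Return the most recently modified .pt checkpoint, or None."""
    ckpts = sorted(Path(run_dir).glob("*.pt"), key=lambda p: p.stat().st_mtime)
    return ckpts[-1] if ckpts else None
```

With the Ultralytics Python API, resuming from a checkpoint is then `YOLO(str(path)).train(resume=True)`.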

How do I choose the right GPU?

| Scenario | Recommended GPU |
|----------|-----------------|
| Most training jobs | RTX PRO 6000 |
| Large datasets or batch sizes | H100 SXM or H200 (Pro+) |
| Budget-conscious | RTX 4090 |


📅 Created 2 months ago ✏️ Updated 9 days ago
Authors: glenn-jocher, RizwanMunawar, sergiuwaxmann
