Cloud Training

Ultralytics Platform Cloud Training offers single-click training on cloud GPUs, making model training accessible without complex setup. Train YOLO models with real-time metrics streaming and automatic checkpoint saving.

Train from UI

Start cloud training directly from the Platform:

  1. Navigate to your project
  2. Click Train Model
  3. Configure training parameters
  4. Click Start Training

Step 1: Select Dataset

Choose a dataset from your uploads:

| Option | Description |
| --- | --- |
| Your Datasets | Datasets you've uploaded |
| Public Datasets | Public datasets from Explore |

Step 2: Configure Model

Select base model and parameters:

| Parameter | Description | Default |
| --- | --- | --- |
| Model | Base architecture (YOLO26n, s, m, l, x) | YOLO26n |
| Epochs | Number of passes over the training dataset | 100 |
| Image Size | Input resolution | 640 |
| Batch Size | Samples per iteration | Auto |

Step 3: Select GPU

Choose your compute resources:

| Tier | GPU | VRAM | Price/Hour | Best For |
| --- | --- | --- | --- | --- |
| Budget | RTX A2000 | 6 GB | $0.12 | Small datasets, testing |
| Budget | RTX 3080 | 10 GB | $0.25 | Medium datasets |
| Budget | RTX 3080 Ti | 12 GB | $0.30 | Medium datasets |
| Budget | A30 | 24 GB | $0.44 | Larger batch sizes |
| Mid | RTX 4090 | 24 GB | $0.60 | Great price/performance |
| Mid | A6000 | 48 GB | $0.90 | Large models |
| Mid | L4 | 24 GB | $0.54 | Inference optimized |
| Mid | L40S | 48 GB | $1.72 | Large batch training |
| Pro | A100 40GB | 40 GB | $2.78 | Production training |
| Pro | A100 80GB | 80 GB | $3.44 | Very large models |
| Pro | H100 | 80 GB | $5.38 | Fastest training |
| Enterprise | H200 | 141 GB | $5.38 | Maximum performance |
| Enterprise | B200 | 192 GB | $10.38 | Largest models |
| Ultralytics | RTX PRO 6000 | 48 GB | $3.68 | Ultralytics infrastructure |

GPU Selection

  • RTX 4090: Best price/performance ratio for most jobs at $0.60/hr
  • A100 80GB: Required for large batch sizes or big models
  • H100/H200: Maximum performance for time-sensitive training
  • B200: NVIDIA Blackwell architecture for cutting-edge workloads
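As a quick sanity check against the pricing table, a helper like this picks the cheapest listed GPU that meets a VRAM requirement. The rates are copied from the table above; the function itself is illustrative, not a Platform API:

```python
# (GPU name, VRAM in GB, $/hour) rows copied from the pricing table above.
GPUS = [
    ("RTX A2000", 6, 0.12),
    ("RTX 3080", 10, 0.25),
    ("RTX 3080 Ti", 12, 0.30),
    ("A30", 24, 0.44),
    ("L4", 24, 0.54),
    ("RTX 4090", 24, 0.60),
    ("A6000", 48, 0.90),
    ("L40S", 48, 1.72),
    ("A100 40GB", 40, 2.78),
    ("A100 80GB", 80, 3.44),
    ("RTX PRO 6000", 48, 3.68),
    ("H100", 80, 5.38),
    ("H200", 141, 5.38),
    ("B200", 192, 10.38),
]

def cheapest_gpu(min_vram_gb: int) -> str:
    """Return the cheapest listed GPU with at least min_vram_gb of VRAM."""
    candidates = [g for g in GPUS if g[1] >= min_vram_gb]
    return min(candidates, key=lambda g: g[2])[0]
```

For example, if an out-of-memory error tells you a job needs more than 40 GB, `cheapest_gpu(80)` points at the A100 80GB rather than the pricier H100.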

Step 4: Start Training

Click Start Training to launch your job. The Platform:

  1. Provisions a GPU instance
  2. Downloads your dataset
  3. Begins training
  4. Streams metrics in real-time

Free Credits

New accounts receive $5 in signup credits ($25 for company emails) - enough for several training runs. Check your balance in Settings > Billing.

Monitor Training

View real-time training progress:

Live Metrics

| Metric | Description |
| --- | --- |
| Loss | Training and validation loss |
| mAP | Mean Average Precision |
| Precision | Fraction of positive predictions that are correct |
| Recall | Fraction of ground-truth objects that are detected |
| GPU Util | GPU utilization percentage |
| Memory | GPU memory usage |

Checkpoints

Checkpoints are saved automatically:

  • Every epoch: Latest weights saved
  • Best model: Highest mAP checkpoint preserved
  • Final model: Weights at training completion
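The selection logic behind this policy can be sketched in a few lines. This is an illustration of which epochs "last" and "best" would point to, not the Platform's actual implementation:

```python
def track_checkpoints(epoch_maps):
    """Given per-epoch mAP values, return which epochs the 'last' and
    'best' checkpoints refer to under the policy described above."""
    best_epoch, best_map = 0, float("-inf")
    for epoch, m in enumerate(epoch_maps):
        if m > best_map:  # preserve the highest-mAP checkpoint
            best_epoch, best_map = epoch, m
    return {"last": len(epoch_maps) - 1, "best": best_epoch}
```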

Stop and Resume

Stop Training

Click Stop Training to pause your job:

  • Current checkpoint is saved
  • GPU instance is released
  • Credits stop being charged

Resume Training

Continue from your last checkpoint:

  1. Navigate to the model
  2. Click Resume Training
  3. Confirm continuation

Resume Limitations

You can only resume training that was explicitly stopped. Failed training jobs may need to restart from scratch.

Remote Training

Train on your own hardware while streaming metrics to the Platform.

Package Version Requirement

Platform integration requires ultralytics>=8.4.0. Earlier versions will NOT work with the Platform.

```bash
pip install "ultralytics>=8.4.0"
```

Setup API Key

  1. Go to Settings > API Keys
  2. Create a new key with training scope
  3. Set the environment variable:

```bash
export ULTRALYTICS_API_KEY="your_api_key"
```
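A missing key typically only surfaces once a run tries to stream metrics, so a small guard at the top of a training script fails fast instead. This helper is a hypothetical convenience, not part of the ultralytics package:

```python
import os

def require_api_key() -> str:
    """Return the Platform API key, raising early if it is not set."""
    key = os.environ.get("ULTRALYTICS_API_KEY", "")
    if not key:
        raise RuntimeError(
            "ULTRALYTICS_API_KEY is not set; run "
            'export ULTRALYTICS_API_KEY="your_api_key" first'
        )
    return key
```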

Train with Streaming

Use the project and name parameters to stream metrics:

CLI:

```bash
yolo train model=yolo26n.pt data=coco.yaml epochs=100 \
  project=username/my-project name=experiment-1
```

Python:

```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")
model.train(
    data="coco.yaml",
    epochs=100,
    project="username/my-project",
    name="experiment-1",
)
```

Using Platform Datasets

Train with datasets stored on the Platform:

```bash
yolo train model=yolo26n.pt data=ul://username/datasets/my-dataset epochs=100
```

The ul:// URI scheme automatically downloads and configures the referenced Platform dataset.
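As a rough sketch of how such a URI decomposes, the parser below splits it into owner, resource kind, and name. The parser and its field names are illustrative assumptions, not the Platform's internals:

```python
from urllib.parse import urlparse

def parse_dataset_uri(uri: str) -> dict:
    """Split a ul://<owner>/datasets/<name> URI into its parts."""
    parsed = urlparse(uri)
    if parsed.scheme != "ul":
        raise ValueError(f"expected a ul:// URI, got {uri!r}")
    kind, _, name = parsed.path.strip("/").partition("/")
    return {"owner": parsed.netloc, "kind": kind, "name": name}
```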

Billing

Training costs are based on GPU usage:

Cost Estimation

Before training starts, the Platform estimates total cost based on:

Estimated Cost = Base Time × Model Multiplier × Dataset Multiplier × GPU Speed Factor × GPU Rate

Factors affecting cost:

| Factor | Impact |
| --- | --- |
| Dataset Size | More images = longer training time |
| Model Size | Larger models (m, l, x) train slower than smaller ones (n, s) |
| Number of Epochs | Direct multiplier on training time |
| Image Size | Larger imgsz increases computation |
| GPU Speed | Faster GPUs reduce training time |
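The estimation formula above translates directly into code. The multiplier values in any call are invented for illustration, not the Platform's actual coefficients:

```python
def estimate_cost(base_hours, model_mult, dataset_mult, gpu_speed_factor, gpu_rate):
    """Estimated Cost = Base Time x Model Multiplier x Dataset Multiplier
    x GPU Speed Factor x GPU Rate (per the formula above)."""
    gpu_hours = base_hours * model_mult * dataset_mult * gpu_speed_factor
    return round(gpu_hours * gpu_rate, 2)
```

With all multipliers at 1.0, one base hour on an RTX 4090 at $0.60/hr comes out to $0.60, matching the first row of the cost examples below.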

Cost Examples

| Scenario | GPU | Time | Cost |
| --- | --- | --- | --- |
| 1000 images, YOLO26n, 100 epochs | RTX 4090 | ~1 hour | ~$0.60 |
| 5000 images, YOLO26m, 100 epochs | A100 80GB | ~4 hours | ~$13.76 |
| 10000 images, YOLO26x, 200 epochs | H100 | ~8 hours | ~$43.04 |

Hold/Settle System

The Platform uses a consumer-protection billing model:

  1. Estimate: Cost calculated before training starts
  2. Hold: Estimated amount + 20% safety margin reserved from balance
  3. Train: Reserved amount shown as "Reserved" in your balance
  4. Settle: After completion, charged only for actual GPU time used
  5. Refund: Any excess automatically returned to your balance
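The hold/settle arithmetic is simple enough to sketch. The 20% margin comes from step 2 above; everything else here is an illustrative simplification:

```python
SAFETY_MARGIN = 0.20  # 20% reserve on top of the estimate (step 2)

def hold_amount(estimate: float) -> float:
    """Amount reserved from your balance before training starts."""
    return round(estimate * (1 + SAFETY_MARGIN), 2)

def settle(hold: float, actual_cost: float) -> tuple[float, float]:
    """Return (charged, refunded) after the job finishes: you only pay
    for actual GPU time, and any excess hold goes back to your balance."""
    charged = round(min(actual_cost, hold), 2)
    return charged, round(hold - charged, 2)
```

So a $10.00 estimate reserves $12.00; if the job actually uses $9.50 of GPU time, you are charged $9.50 and the remaining $2.50 is returned.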

Consumer Protection

You're never charged more than the estimate shown before training. If training completes early or is canceled, you only pay for actual compute time used.

Payment Methods

| Method | Description |
| --- | --- |
| Account Balance | Pre-loaded credits |
| Pay Per Job | Charge at job completion |

Minimum Balance

A minimum balance of $5.00 is required to start epoch-based training.

View Training Costs

After training, view detailed costs in the Billing tab:

  • Per-epoch cost breakdown
  • Total GPU time
  • Download cost report

Training Tips

Choose the Right Model Size

| Model | Parameters | Best For |
| --- | --- | --- |
| YOLO26n | 2.4M | Real-time, edge devices |
| YOLO26s | 9.5M | Balanced speed/accuracy |
| YOLO26m | 20.4M | Higher accuracy |
| YOLO26l | 24.8M | Production accuracy |
| YOLO26x | 55.7M | Maximum accuracy |

Optimize Training Time

  1. Start small: Test with fewer epochs first
  2. Use appropriate GPU: Match GPU to model/batch size
  3. Validate dataset: Ensure quality before training
  4. Monitor early: Stop if metrics plateau

Troubleshooting

| Issue | Solution |
| --- | --- |
| Training stuck at 0% | Check dataset format, retry |
| Out of memory | Reduce batch size or use larger GPU |
| Poor accuracy | Increase epochs, check data quality |
| Training slow | Consider faster GPU |

FAQ

How long does training take?

Training time depends on:

  • Dataset size
  • Model size
  • Number of epochs
  • GPU selected

Typical times (1000 images, 100 epochs):

| Model | RTX 4090 | A100 |
| --- | --- | --- |
| YOLO26n | 30 min | 20 min |
| YOLO26m | 60 min | 40 min |
| YOLO26x | 120 min | 80 min |

Can I train overnight?

Yes, training continues until completion. You'll receive a notification when training finishes. Make sure your account has sufficient balance for epoch-based training.

What happens if I run out of credits?

Training pauses at the end of the current epoch. Your checkpoint is saved, and you can resume after adding credits.

Can I use custom training arguments?

Yes, advanced users can specify additional arguments in the training configuration.

Training Parameters Reference

Core Parameters

| Parameter | Type | Default | Range | Description |
| --- | --- | --- | --- | --- |
| epochs | int | 100 | 1+ | Number of training epochs |
| batch | int | 16 | -1 = auto | Batch size (-1 for auto) |
| imgsz | int | 640 | 32+ | Input image size |
| patience | int | 100 | 0+ | Early stopping patience |
| workers | int | 8 | 0+ | Dataloader workers |
| cache | bool | False | - | Cache images (ram/disk) |
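A configuration can be sanity-checked against the ranges in this table before a job is submitted. The lower bounds below are copied from the table; the helper itself is just an illustration, not a Platform or ultralytics API:

```python
# Lower bounds from the Core Parameters table above.
CORE_PARAM_MINIMUMS = {"epochs": 1, "imgsz": 32, "patience": 0, "workers": 0}

def validate_core_params(params: dict) -> list[str]:
    """Return a list of human-readable problems (empty list = OK)."""
    errors = []
    for name, lo in CORE_PARAM_MINIMUMS.items():
        if name in params and params[name] < lo:
            errors.append(f"{name}={params[name]} must be >= {lo}")
    batch = params.get("batch")
    if batch is not None and batch != -1 and batch < 1:
        errors.append(f"batch={batch} must be -1 (auto) or >= 1")
    return errors
```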

Learning Rate Parameters

| Parameter | Type | Default | Range | Description |
| --- | --- | --- | --- | --- |
| lr0 | float | 0.01 | 0.0-1.0 | Initial learning rate |
| lrf | float | 0.01 | 0.0-1.0 | Final LR factor |
| momentum | float | 0.937 | 0.0-1.0 | SGD momentum |
| weight_decay | float | 0.0005 | 0.0-1.0 | L2 regularization |
| warmup_epochs | float | 3.0 | 0+ | Warmup epochs |
| cos_lr | bool | False | - | Cosine LR scheduler |

Augmentation Parameters

| Parameter | Type | Default | Range | Description |
| --- | --- | --- | --- | --- |
| hsv_h | float | 0.015 | 0.0-1.0 | HSV hue augmentation |
| hsv_s | float | 0.7 | 0.0-1.0 | HSV saturation |
| hsv_v | float | 0.4 | 0.0-1.0 | HSV value |
| degrees | float | 0.0 | - | Rotation degrees |
| translate | float | 0.1 | 0.0-1.0 | Translation fraction |
| scale | float | 0.5 | 0.0-1.0 | Scale factor |
| fliplr | float | 0.5 | 0.0-1.0 | Horizontal flip prob |
| flipud | float | 0.0 | 0.0-1.0 | Vertical flip prob |
| mosaic | float | 1.0 | 0.0-1.0 | Mosaic augmentation |
| mixup | float | 0.0 | 0.0-1.0 | Mixup augmentation |
| copy_paste | float | 0.0 | 0.0-1.0 | Copy-paste (segment) |
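Most of the probability-valued flags here (fliplr, flipud, mosaic, mixup) follow the same pattern: the value is the per-image probability that the augmentation is applied. A toy illustration, with a list standing in for an image row; this is not how the trainer is actually implemented:

```python
import random

def maybe_fliplr(pixels, p=0.5, rng=random):
    """Horizontally flip a 'row of pixels' with probability p,
    mirroring what fliplr=0.5 means in the table above."""
    if rng.random() < p:
        return pixels[::-1]
    return pixels
```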

Optimizer Selection

| Value | Description |
| --- | --- |
| auto | Automatic selection (default) |
| SGD | Stochastic Gradient Descent |
| Adam | Adam optimizer |
| AdamW | Adam with weight decay |

Task-Specific Parameters

Some parameters only apply to specific tasks:

  • Segment: overlap_mask, mask_ratio, copy_paste
  • Pose: pose (loss weight), kobj (keypoint objectness)
  • Classify: dropout, erasing, auto_augment
