Link to this sectionClearML Integration#

Link to this sectionAbout ClearML#

ClearML is an open-source MLOps platform built to streamline machine learning workflows and save engineering time.

🔨 Track every YOLOv5 training run in the experiment manager.
🔧 Version and access your custom training data with the integrated ClearML data versioning tool.
🔦 Remotely train and monitor YOLOv5 runs using the ClearML Agent.
🔬 Find the best mAP with ClearML hyperparameter optimization.
🔭 Turn your trained YOLOv5 model into an API with a few commands using ClearML Serving.

Use as many or as few of these tools as you need — start with the experiment manager alone, or chain everything together into a full pipeline.

ClearML scalars dashboard showing YOLOv5 training metrics

Link to this section🦾 Setting Things Up#

ClearML needs to communicate with a server to track your experiments and data. You have two options:

Sign up for the free ClearML Hosted Service, or
Deploy your own ClearML server — it is open-source, so it remains a viable option even for sensitive data.

Then install the clearml Python package and connect the SDK to your server:

pip install clearml

Generate credentials at Settings → Workspace → Create new credentials (top right of the ClearML UI), then run:

clearml-init

Follow the prompts. That's it — setup is complete.

Link to this section🚀 Training YOLOv5 With ClearML#

To enable experiment tracking, install the ClearML pip package if you haven't already:

pip install clearml

This will enable integration with the YOLOv5 training script. Every training run from now on will be captured and stored by the ClearML experiment manager.

To customize the project and task names, pass --project and --name to train.py. The defaults are YOLOv5 and Training. ClearML uses / as a subproject delimiter, so avoid / in custom project names.

python train.py --img 640 --batch 16 --epochs 3 --data coco8.yaml --weights yolov5s.pt --cache

Or with custom names:

python train.py --project my_project --name my_training --img 640 --batch 16 --epochs 3 --data coco8.yaml --weights yolov5s.pt --cache

Each run captures:

Source code and uncommitted changes
Installed packages
Hyperparameters
Model checkpoints (use --save-period n to save every n epochs)
Console output
Scalars (mAP_0.5, mAP_0.5:0.95, precision, recall, losses, learning rates)
Machine details, runtime, and creation date
Generated plots such as the label correlogram and confusion matrix
Images with bounding boxes per epoch
Mosaic visualizations per epoch
Validation images per epoch

Everything appears in the ClearML UI so you can monitor training in one place. Add custom columns (for example, mAP_0.5) to sort by the best-performing model, or select multiple experiments to compare them side by side.

Keep reading for hyperparameter optimization and remote execution.

Link to this section🔗 Dataset Version Management#

Versioning data separately from code makes it easy to pull the latest version and ensures full reproducibility. This repository accepts a dataset version ID, fetches the data automatically if it is missing, and records the ID as a task parameter so you always know which data was used in which experiment.

ClearML dataset version management interface

Link to this sectionPrepare Your Dataset#

The YOLOv5 repository supports many datasets via YAML configuration files. By default, datasets download to the ../datasets folder relative to the repository root. After downloading coco128, the folder structure looks like:

..
|_ yolov5
|_ datasets
    |_ coco128
        |_ images
        |_ labels
        |_ LICENSE
        |_ README.txt

Any dataset works, provided you preserve this structure.

Next, copy the dataset's YAML file into the dataset root folder — ClearML reads this file to use the dataset correctly. You can write your own YAML by following the example layout, ensuring it defines path, train, test, val, nc, and names.

..
|_ yolov5
|_ datasets
    |_ coco128
        |_ images
        |_ labels
        |_ coco128.yaml  # <---- HERE
        |_ LICENSE
        |_ README.txt

Link to this sectionUpload Your Dataset#

To register the dataset as a versioned ClearML dataset, change into its root folder and run:

cd ../datasets/coco128
clearml-data sync --project YOLOv5 --name coco128 --folder .

clearml-data sync is shorthand for the following sequence, which you can also run explicitly:

# Add --parent <parent_dataset_id> to base this version on a previous one.
# Duplicate files are not re-uploaded.
clearml-data create --name coco128 --project YOLOv5
clearml-data add --files .
clearml-data close

Link to this sectionTrain On A ClearML Dataset#

With the dataset registered, point training at it by ID:

python train.py --img 640 --batch 16 --epochs 3 --data clearml://YOUR_DATASET_ID --weights yolov5s.pt --cache

Link to this section👀 Hyperparameter Optimization#

With experiments and data versioned, you can build on top of them. Because each tracked experiment captures the full environment — code, installed packages, and configuration — runs are fully reproducible. ClearML lets you clone an experiment, change its parameters, and rerun it automatically, which is the foundation of hyperparameter optimization (HPO).

To run HPO locally, use the bundled script. First make sure a training task exists in the experiment manager — the script clones it and varies its hyperparameters.

Fill in the template task ID in utils/loggers/clearml/hpo.py, then run:

# Install Optuna or change the optimizer to RandomSearch.
pip install optuna
python utils/loggers/clearml/hpo.py

Switch task.execute_locally() to task.execute() to push the job to a ClearML queue for a remote agent to pick up.

ClearML HPO dashboard with YOLOv5 metrics

Link to this section🤯 Remote Execution (Advanced)#

Running HPO locally is convenient, but you'll often want experiments on more powerful hardware — an on-prem GPU machine or a cloud instance. That's the role of the ClearML Agent:

Each tracked experiment contains everything needed to reproduce it on another machine (installed packages, uncommitted changes, and configuration). A ClearML agent listens to a queue, picks up incoming tasks, recreates the environment, runs the job, and streams scalars and plots back to the experiment manager.

Turn any machine — a cloud VM, a local GPU box, or a laptop — into a ClearML agent with:

clearml-agent daemon --queue QUEUES_TO_LISTEN_TO [--docker]

Link to this sectionCloning, Editing, And Enqueuing#

With an agent running, you can assign it work directly from the UI:

🪄 Right-click an experiment and clone it.
🎯 Edit its hyperparameters.
⏳ Right-click the cloned task and enqueue it to a target queue.

Enqueue a task from the UI

Link to this sectionExecuting A Task Remotely#

You can also flag a running script for remote execution programmatically by adding task.execute_remotely() after the ClearML logger has been instantiated. Add the highlighted line to train.py:

# ...
# Loggers
data_dict = None
if RANK in {-1, 0}:
    loggers = Loggers(save_dir, weights, opt, hyp, LOGGER)  # loggers instance
    if loggers.clearml:
        loggers.clearml.task.execute_remotely(queue="my_queue")  # <------ ADD THIS LINE
        # data_dict is None unless the user selected a ClearML dataset, in which case ClearML fills it in.
        data_dict = loggers.clearml.data_dict
# ...

After this change, running the training script executes up to that line, packages the code, and ships it to the queue.

Link to this sectionAutoscaling Workers#

ClearML ships with autoscalers that spin up remote machines in AWS, GCP, or Azure when a queue has pending experiments, convert them into ClearML agents, and shut them down when work is finished — so you only pay for compute that is actually running.

Watch the getting-started video below:

Link to this sectionLearn More#

For more information about integrating ClearML with Ultralytics models, check out our ClearML integration guide and explore how you can enhance your MLOps workflow with other experiment tracking tools.

Contributors

RAraimbekovm¹ GLglenn-jocher¹

Created 1 month agoUpdated 3 weeks ago