📚 This guide explains hyperparameter evolution for YOLOv5 🚀. Hyperparameter evolution is a method of Hyperparameter Optimization using a Genetic Algorithm (GA) for optimization. UPDATED 25 September 2022.
Hyperparameters in ML control various aspects of training, and finding optimal values for them can be a challenge. Traditional methods like grid searches can quickly become intractable due to 1) the high dimensional search space 2) unknown correlations among the dimensions, and 3) expensive nature of evaluating the fitness at each point, making GA a suitable candidate for hyperparameter searches.
Before You Start
1. Initialize Hyperparameters
YOLOv5 has about 30 hyperparameters used for various training settings. These are defined in
*.yaml files in the
/data directory. Better initial guesses will produce better final results, so it is important to initialize these values properly before evolving. If in doubt, simply use the default values, which are optimized for YOLOv5 COCO training from scratch.
2. Define Fitness
Fitness is the value we seek to maximize. In YOLOv5 we define a default fitness function as a weighted combination of metrics:
mAP@0.5 contributes 10% of the weight and
mAP@0.5:0.95 contributes the remaining 90%, with Precision
P and Recall
R absent. You may adjust these as you see fit or use the default fitness definition (recommended).
Evolution is performed about a base scenario which we seek to improve upon. The base scenario in this example is finetuning COCO128 for 10 epochs using pretrained YOLOv5s. The base scenario training command is:
# Single-GPU python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --evolve # Multi-GPU for i in 0 1 2 3 4 5 6 7; do sleep $(expr 30 \* $i) && # 30-second delay (optional) echo 'Starting GPU '$i'...' && nohup python train.py --epochs 10 --data coco128.yaml --weights yolov5s.pt --cache --device $i --evolve > evolve_gpu_$i.log & done # Multi-GPU bash-while (not recommended) for i in 0 1 2 3 4 5 6 7; do sleep $(expr 30 \* $i) && # 30-second delay (optional) echo 'Starting GPU '$i'...' && "$(while true; do nohup python train.py... --device $i --evolve 1 > evolve_gpu_$i.log; done)" & done
The default evolution settings will run the base scenario 300 times, i.e. for 300 generations. You can modify generations via the
--evolve argument, i.e.
python train.py --evolve 1000.
The main genetic operators are crossover and mutation. In this work mutation is used, with a 80% probability and a 0.04 variance to create new offspring based on a combination of the best parents from all previous generations. Results are logged to
runs/evolve/exp/evolve.csv, and the highest fitness offspring is saved every generation as
# YOLOv5 Hyperparameter Evolution Results # Best generation: 287 # Last generation: 300 # metrics/precision, metrics/recall, metrics/mAP_0.5, metrics/mAP_0.5:0.95, val/box_loss, val/obj_loss, val/cls_loss # 0.54634, 0.55625, 0.58201, 0.33665, 0.056451, 0.042892, 0.013441 lr0: 0.01 # initial learning rate (SGD=1E-2, Adam=1E-3) lrf: 0.2 # final OneCycleLR learning rate (lr0 * lrf) momentum: 0.937 # SGD momentum/Adam beta1 weight_decay: 0.0005 # optimizer weight decay 5e-4 warmup_epochs: 3.0 # warmup epochs (fractions ok) warmup_momentum: 0.8 # warmup initial momentum warmup_bias_lr: 0.1 # warmup initial bias lr box: 0.05 # box loss gain cls: 0.5 # cls loss gain cls_pw: 1.0 # cls BCELoss positive_weight obj: 1.0 # obj loss gain (scale with pixels) obj_pw: 1.0 # obj BCELoss positive_weight iou_t: 0.20 # IoU training threshold anchor_t: 4.0 # anchor-multiple threshold # anchors: 3 # anchors per output layer (0 to ignore) fl_gamma: 0.0 # focal loss gamma (efficientDet default gamma=1.5) hsv_h: 0.015 # image HSV-Hue augmentation (fraction) hsv_s: 0.7 # image HSV-Saturation augmentation (fraction) hsv_v: 0.4 # image HSV-Value augmentation (fraction) degrees: 0.0 # image rotation (+/- deg) translate: 0.1 # image translation (+/- fraction) scale: 0.5 # image scale (+/- gain) shear: 0.0 # image shear (+/- deg) perspective: 0.0 # image perspective (+/- fraction), range 0-0.001 flipud: 0.0 # image flip up-down (probability) fliplr: 0.5 # image flip left-right (probability) mosaic: 1.0 # image mosaic (probability) mixup: 0.0 # image mixup (probability) copy_paste: 0.0 # segment copy-paste (probability)
We recommend a minimum of 300 generations of evolution for best results. Note that evolution is generally expensive and time consuming, as the base scenario is trained hundreds of times, possibly requiring hundreds or thousands of GPU hours.
evolve.csv is plotted as
utils.plots.plot_evolve() after evolution finishes with one subplot per hyperparameter showing fitness (y axis) vs hyperparameter values (x axis). Yellow indicates higher concentrations. Vertical distributions indicate that a parameter has been disabled and does not mutate. This is user selectable in the
meta dictionary in train.py, and is useful for fixing parameters and preventing them from evolving.
W&B sweep support was added in PR https://github.com/ultralytics/yolov5/pull/3938. To run a sweep the steps are:
- Set the hyperparameter limits in
data/hyps/sweep.yamland define dataset path and search strategy.
wandb sweep utils/logging/wandb/sweep.yaml. You can optionally pass
- The above command will output an id. Start the sweep with
wandb agent sweep_id. This will start W&B sweep over the defined search space in
sweep.yaml. Here's a live example from this script.
- Notebooks with free GPU:
- Google Cloud Deep Learning VM. See GCP Quickstart Guide
- Amazon Deep Learning AMI. See AWS Quickstart Guide
- Docker Image. See Docker Quickstart Guide
If this badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training, validation, inference, export and benchmarks on macOS, Windows, and Ubuntu every 24 hours and on every commit.