Transfer Learning with Frozen Layers
📚 This guide explains how to freeze YOLOv5 🚀 layers when transfer learning. Transfer learning is a useful way to quickly retrain a model on new data without having to retrain the entire network. Instead, some of the initial weights are frozen in place, and only the remaining weights are updated by the optimizer. This requires fewer resources than normal training and allows for faster training times, though it may also reduce final trained accuracy.
Before You Start
Clone this repo and install requirements.txt dependencies, including Python>=3.8 and PyTorch>=1.7.
$ git clone https://github.com/ultralytics/yolov5  # clone repo
$ cd yolov5
$ pip install wandb -qr requirements.txt  # install requirements.txt
All layers that match the freeze list in train.py will be frozen before training starts: gradient computation is disabled for their parameters, so the optimizer never updates them.
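The matching step can be sketched as a substring test against each parameter name. In PyTorch, a parameter is typically frozen by setting its `requires_grad` attribute to `False`. The snippet below is a minimal illustration under assumptions, not the code from train.py: `SimpleNamespace` objects stand in for PyTorch parameters so the example runs without PyTorch, and `apply_freeze` is a hypothetical helper name.

```python
from types import SimpleNamespace


def apply_freeze(named_params, freeze):
    """Disable gradients for every parameter whose name contains a freeze entry."""
    frozen = []
    for name, param in named_params:
        param.requires_grad = True  # default: train every parameter
        if any(pattern in name for pattern in freeze):
            param.requires_grad = False  # excluded from optimizer updates
            frozen.append(name)
    return frozen


# Stand-in parameters (assumption); with a real model these pairs would
# come from model.named_parameters().
params = [(n, SimpleNamespace(requires_grad=True)) for n in (
    "model.0.conv.conv.weight",
    "model.1.conv.weight",
    "model.24.m.0.weight",
)]

frozen = apply_freeze(params, freeze=["model.0.", "model.1."])
print(frozen)  # ['model.0.conv.conv.weight', 'model.1.conv.weight']
```

Any parameter that matches no entry, such as `model.24.m.0.weight` here, keeps `requires_grad = True` and continues to train.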
To see a list of module names:
for k, v in model.named_parameters():
    print(k)

# Output
model.0.conv.conv.weight
model.0.conv.bn.weight
model.0.conv.bn.bias
model.1.conv.weight
model.1.bn.weight
model.1.bn.bias
model.2.cv1.conv.weight
model.2.cv1.bn.weight
...
model.23.m.0.cv2.bn.weight
model.23.m.0.cv2.bn.bias
model.24.m.0.weight
model.24.m.0.bias
model.24.m.1.weight
model.24.m.1.bias
model.24.m.2.weight
model.24.m.2.bias
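Since every name starts with `model.<index>.`, the layer index can be read straight from the name. As a rough sketch, the hypothetical helper `layer_index` below groups a few of the names from the output above by layer; with a real model the list would be built from `model.named_parameters()`.

```python
from collections import Counter


def layer_index(param_name):
    """Return the layer index from a name like 'model.23.m.0.cv2.bn.weight'."""
    return int(param_name.split(".")[1])


# A subset of the names printed above, copied by hand for illustration
names = [
    "model.0.conv.conv.weight",
    "model.0.conv.bn.weight",
    "model.0.conv.bn.bias",
    "model.1.conv.weight",
    "model.1.bn.weight",
    "model.1.bn.bias",
    "model.24.m.0.weight",
    "model.24.m.0.bias",
]

# Count how many parameter tensors belong to each layer index
per_layer = Counter(layer_index(n) for n in names)
print(dict(per_layer))  # {0: 3, 1: 3, 24: 2}
```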
Looking at the model architecture we can see that the model backbone is layers 0-9: https://github.com/ultralytics/yolov5/blob/58f8ba771e3712b525ca93a1ee66bc2b2df2092f/models/yolov5s.yaml#L12-L48
so we define the freeze list to contain all modules with 'model.0.' - 'model.9.' in their names, and then we start training.
freeze = ['model.%s.' % x for x in range(10)] # parameter names to freeze (full or partial)
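The trailing dot in each pattern matters when names are matched by substring: without it, `'model.1'` would also match layers 10 through 19. A small sketch of the difference follows; the parameter names are illustrative and `matches` is a hypothetical helper.

```python
# Illustrative parameter names (assumed, not taken from a real model dump)
names = [
    "model.1.conv.weight",
    "model.10.conv.weight",
    "model.17.cv1.conv.weight",
]


def matches(names, pattern):
    """Return the names that contain `pattern` as a substring."""
    return [n for n in names if pattern in n]


with_dot = matches(names, "model.1.")
without_dot = matches(names, "model.1")

print(with_dot)     # ['model.1.conv.weight']
print(without_dot)  # all three names, including layers 10 and 17
```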
Freeze All Layers
To freeze the full model except for the final output convolution layers in Detect(), we set the freeze list to contain all modules with 'model.0.' - 'model.23.' in their names, and then we start training.
freeze = ['model.%s.' % x for x in range(24)] # parameter names to freeze (full or partial)
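The two freeze lists differ only in how many leading layers they cover. The sketch below compares which layer indices remain trainable in each case. It assumes 25 layers (indices 0 to 24, as suggested by the parameter names listed earlier), and both helper names are hypothetical.

```python
def freeze_first(n):
    """Freeze patterns for layers 0 .. n-1, in the same form as the lists above."""
    return ["model.%s." % i for i in range(n)]


def trainable_layers(freeze, num_layers=25):
    """Layer indices whose 'model.<i>.' prefix matches no freeze pattern."""
    return [
        i for i in range(num_layers)
        if not any(p in "model.%s." % i for p in freeze)
    ]


backbone_frozen = trainable_layers(freeze_first(10))
all_but_detect = trainable_layers(freeze_first(24))

print(backbone_frozen)  # layers 10 through 24 stay trainable
print(all_but_detect)   # [24]
```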
We trained YOLOv5m on VOC on both of the above scenarios, along with a default model (no freezing), starting from the official COCO pretrained --weights yolov5m.pt. The training command for all runs was:
$ python train.py --batch 48 --weights yolov5m.pt --data voc.yaml --epochs 50 --cache --img 512 --hyp hyp.finetune.yaml
The results show that freezing speeds up training, but reduces final accuracy slightly. A full W&B Report of the runs can be found at this link: https://wandb.ai/glenn-jocher/yolov5_tutorial_freeze/reports/Freezing-Layers-in-YOLOv5--VmlldzozMDk3NTg
GPU Utilization Comparison
Interestingly, the more modules are frozen, the less GPU memory is required to train and the lower the GPU utilization. This suggests that larger models, or models trained at a larger --img-size, may benefit from freezing in order to train faster.
- Google Colab and Kaggle notebooks with free GPU
- Google Cloud Deep Learning VM. See GCP Quickstart Guide
- Amazon Deep Learning AMI. See AWS Quickstart Guide
- Docker Image. See Docker Quickstart Guide
If the CI status badge is green, all YOLOv5 GitHub Actions Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 training (train.py), testing (test.py), inference (detect.py) and export (export.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.