Link to this sectionYOLOv5의 모델 프루닝(Pruning) 및 희소성(Sparsity)#

📚 이 가이드는 YOLOv5 🚀 모델에 프루닝을 적용하여 성능을 유지하면서 더욱 효율적인 네트워크를 만드는 방법을 설명합니다.

Link to this section모델 프루닝이란 무엇인가요?#

모델 프루닝은 신경망에서 중요도가 낮은 파라미터(가중치 및 연결)를 제거하여 모델의 크기와 복잡도를 줄이는 기법입니다. 이 과정은 다음과 같은 여러 이점을 가진 더 효율적인 모델을 생성합니다:

자원이 제한된 장치에 쉽게 배포할 수 있도록 모델 크기 감소
정확도에 미치는 영향을 최소화하면서 추론 속도 향상
메모리 사용량 및 에너지 소비 절감
실시간 애플리케이션을 위한 전반적인 효율성 개선

프루닝은 모델 성능에 기여도가 낮은 파라미터를 식별하고 제거하여, 유사한 정확도를 유지하면서도 더 가벼운 모델을 만드는 방식으로 작동합니다.

Link to this section시작하기 전에#

Clone repo and install requirements.txt in a Python>=3.8.0 environment, including PyTorch>=1.8. Models and datasets download automatically from the latest YOLOv5 release.

git clone https://github.com/ultralytics/yolov5 # clone
cd yolov5
pip install -r requirements.txt # install

Link to this section기준 성능 테스트#

프루닝을 수행하기 전에 비교 대상이 될 기준 성능을 설정하십시오. 이 명령어는 이미지 크기 640픽셀에서 COCO val2017 데이터셋으로 YOLOv5x를 테스트합니다. yolov5x.pt는 사용 가능한 가장 크고 정확한 모델입니다. 다른 옵션으로는 yolov5s.pt, yolov5m.pt, yolov5l.pt 또는 커스텀 데이터셋을 학습하여 얻은 체크포인트인 ./weights/best.pt가 있습니다. 사용 가능한 모든 모델에 대한 자세한 내용은 README 테이블을 참조하십시오.

python val.py --weights yolov5x.pt --data coco.yaml --img 640 --half

출력:

val: data=/content/yolov5/data/coco.yaml, weights=['yolov5x.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.65, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=True, dnn=False
YOLOv5 🚀 v6.0-224-g4c40933 torch 1.10.0+cu111 CUDA:0 (Tesla V100-SXM2-16GB, 16160MiB)

Fusing layers...
Model Summary: 444 layers, 86705005 parameters, 0 gradients
val: Scanning '/content/datasets/coco/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupt: 100% 5000/5000 [00:00<?, ?it/s]
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 157/157 [01:12<00:00,  2.16it/s]
                 all       5000      36335      0.732      0.628      0.683      0.496
Speed: 0.1ms pre-process, 5.2ms inference, 1.7ms NMS per image at shape (32, 3, 640, 640)  # <--- base speed

Evaluating pycocotools mAP... saving runs/val/exp-2/yolov5x_predictions.json...
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.507  # <--- base mAP
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.689
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.552
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.345
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.559
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.652
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.381
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.630
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.682
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.526
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.731
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.829
Results saved to runs/val/exp

Link to this sectionYOLOv5x에 프루닝 적용 (30% 희소성)#

We can apply pruning to the model using the torch_utils.prune() command defined in utils/torch_utils.py. To test a pruned model, we update val.py to prune YOLOv5x to 0.3 sparsity (30% of weights set to zero):

YOLOv5 model pruning to 30% sparsity code

30% 프루닝 출력:

val: data=/content/yolov5/data/coco.yaml, weights=['yolov5x.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.65, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_conf=False, save_json=True, project=runs/val, name=exp, exist_ok=False, half=True, dnn=False
YOLOv5 🚀 v6.0-224-g4c40933 torch 1.10.0+cu111 CUDA:0 (Tesla V100-SXM2-16GB, 16160MiB)

Fusing layers...
Model Summary: 444 layers, 86705005 parameters, 0 gradients
Pruning model...  0.3 global sparsity
val: Scanning '/content/datasets/coco/val2017.cache' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupt: 100% 5000/5000 [00:00<?, ?it/s]
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 157/157 [01:11<00:00,  2.19it/s]
                 all       5000      36335      0.724      0.614      0.671      0.478
Speed: 0.1ms pre-process, 5.2ms inference, 1.7ms NMS per image at shape (32, 3, 640, 640)  # <--- prune speed

Evaluating pycocotools mAP... saving runs/val/exp-3/yolov5x_predictions.json...
...
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.489  # <--- prune mAP
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.677
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.537
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.334
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.542
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.635
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.370
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.612
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.664
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.496
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.722
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.803
Results saved to runs/val/exp-3

Link to this section결과 분석#

결과를 통해 다음 사항을 확인할 수 있습니다:

30% 희소성 달성: nn.Conv2d 레이어에 있는 모델 가중치 파라미터의 30%가 이제 0이 되었습니다.
추론 시간 변화 없음: 프루닝에도 불구하고 처리 속도는 본질적으로 동일합니다.
최소한의 성능 영향: mAP가 0.507에서 0.489로 약간 감소했습니다(3.6% 감소).
모델 크기 감소: 프루닝된 모델은 저장에 더 적은 메모리를 요구합니다.

이는 프루닝이 성능에 미치는 영향을 최소화하면서 모델 복잡도를 크게 줄일 수 있음을 보여주며, 자원이 제한된 환경에 배포하기 위한 효과적인 최적화 기법임을 입증합니다.

Link to this section프루닝된 모델 미세 조정(Fine-tuning)#

최상의 결과를 얻으려면 프루닝 후에 정확도를 복구하기 위해 모델을 미세 조정해야 합니다. 이는 다음 단계로 수행할 수 있습니다:

원하는 희소성 수준으로 프루닝 적용
더 낮은 학습률로 몇 에포크(epochs) 동안 프루닝된 모델을 학습
미세 조정된 프루닝 모델을 기준 모델과 비교하여 평가

이 과정은 남아 있는 파라미터들이 제거된 연결을 보완하도록 적응하게 하여, 종종 원래의 정확도를 대부분 또는 전부 복구할 수 있게 합니다.

Link to this section지원되는 환경#

Ultralytics는 프로젝트를 빠르게 시작할 수 있도록 CUDA, CUDNN, Python, PyTorch와 같은 필수 종속성이 미리 설치된 다양한 준비된 환경을 제공합니다.

무료 GPU 노트북:
Google Cloud: GCP 퀵스타트 가이드
Amazon: AWS 퀵스타트 가이드
Azure: AzureML 퀵스타트 가이드
Docker: Docker 퀵스타트 가이드

Link to this section프로젝트 상태#

이 배지는 모든 YOLOv5 GitHub Actions CI(지속적 통합) 테스트가 성공적으로 통과되었음을 나타냅니다. 이러한 CI 테스트는 학습, 검증, 추론, 내보내기 및 벤치마크를 포함한 YOLOv5의 다양한 핵심 기능과 성능을 엄격하게 점검합니다. 테스트는 24시간마다 그리고 새로운 커밋이 있을 때마다 수행되며, macOS, Windows 및 Ubuntu에서 일관되고 안정적인 운영을 보장합니다.

기여자

GLglenn-jocher¹¹ LALaughing-q² RIRizwanMunawar¹

생성됨 2023년 11월 12일업데이트됨 지난달