YOLOv9 vs. EfficientDet: A Detailed Comparison

Choosing an object detection model means balancing accuracy, speed, and computational cost. This page gives a technical comparison of YOLOv9, as available in the Ultralytics framework, and EfficientDet, two significant models in the object detection landscape. It covers their architectural designs, performance benchmarks, and suitable applications to help you make an informed decision.

YOLOv9: State-of-the-Art Accuracy and Efficiency

YOLOv9 was introduced in 2024 by Chien-Yao Wang and Hong-Yuan Mark Liao of the Institute of Information Science, Academia Sinica, Taiwan. It is described in their paper "YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information" and implemented in their GitHub repository. YOLOv9 targets the loss of information as data passes through deep networks, using two architectural elements: Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). Together they let the model maintain high accuracy with fewer parameters, giving a strong balance between performance and efficiency.

Strengths:

  • State-of-the-art Accuracy: YOLOv9e reaches 55.6 mAP on COCO in the table below, and each YOLOv9 variant outperforms the EfficientDet model closest to it in parameter count.
  • Efficient Parameter Utilization: PGI and GELAN architectures enhance feature extraction and reduce information loss, leading to better performance with fewer parameters and FLOPs.
  • Scalability: The YOLOv9 family includes various model sizes (YOLOv9t to YOLOv9e), offering flexibility for different computational capabilities.
  • Ultralytics Ecosystem: While the original research is from Academia Sinica, integration within the Ultralytics framework provides benefits like ease of use, extensive documentation, efficient training processes, readily available pre-trained weights, and strong community support. YOLO models typically exhibit lower memory requirements during training compared to transformer-based models.

Weaknesses:

  • Inference Speed: The largest YOLOv9 variants are slower than the most lightweight EfficientDet models (YOLOv9e runs at 16.77 ms on a T4 with TensorRT versus 3.92 ms for EfficientDet-d0), though they provide much higher accuracy.
  • Novelty: As a newer model, real-world deployment examples might be less numerous than for older, established models like EfficientDet, although adoption within the Ultralytics community is rapid.

Use Cases:

YOLOv9 is particularly well-suited for applications where accuracy and efficiency are both paramount.

EfficientDet: Scalable and Efficient Object Detection

EfficientDet, developed by the Google Brain team (Mingxing Tan, Ruoming Pang, Quoc V. Le) in 2019, focuses on achieving high efficiency and accuracy through architectural innovations like the BiFPN (Bi-directional Feature Pyramid Network) and compound scaling. The model details are available in their paper "EfficientDet: Scalable and Efficient Object Detection" and the official implementation is hosted on GitHub.

Strengths:

  • Scalability: Offers a wide range of models (D0-D7) scaled efficiently using a compound coefficient, allowing adaptation to various resource constraints.
  • Efficiency: BiFPN allows for effective multi-scale feature fusion with fewer parameters compared to traditional FPNs.
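
The weighted fusion at the heart of BiFPN can be illustrated in a few lines. The EfficientDet paper describes "fast normalized fusion": each input feature gets a learnable weight, the weight is kept non-negative with a ReLU, and the weights are divided by their sum plus a small epsilon. The sketch below is a minimal pure-Python illustration, not the official implementation. The function name and the use of flat lists in place of already-resized feature maps are assumptions made for brevity.

```python
def fast_normalized_fusion(features, weights, eps=1e-4):
    """Fuse same-length feature vectors with fast normalized weights.

    features: list of equal-length lists (stand-ins for resized feature maps)
    weights:  one learnable scalar per input feature
    Returns (fused_vector, normalized_weights).
    """
    if len(features) != len(weights):
        raise ValueError("need one weight per input feature")

    # ReLU keeps every weight non-negative, so no softmax is needed.
    clipped = [max(0.0, w) for w in weights]

    # Normalize by the sum of weights; eps avoids division by zero.
    total = sum(clipped) + eps
    norm = [w / total for w in clipped]

    # Weighted sum of the inputs, element by element.
    length = len(features[0])
    fused = [sum(n * f[i] for n, f in zip(norm, features)) for i in range(length)]
    return fused, norm


if __name__ == "__main__":
    top_down = [0.2, 0.4, 0.6]   # hypothetical feature from the coarser level
    same_level = [1.0, 1.0, 1.0]  # hypothetical feature from the same level
    fused, norm = fast_normalized_fusion([top_down, same_level], [0.5, 1.5])
    print("normalized weights:", norm)
    print("fused feature:", fused)
```

Because the normalization is a plain division rather than a softmax, the fusion adds very little cost per node, which is part of why BiFPN stays cheap as it is repeated.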

Weaknesses:

  • Accuracy/Speed Trade-off: While efficient, EfficientDet models can be outperformed in accuracy by comparable YOLOv9 variants (see table below). Larger EfficientDet models are also much slower on GPUs: EfficientDet-d7 takes 128.07 ms on a T4 with TensorRT, compared with 16.77 ms for YOLOv9e.
  • Complexity: The compound scaling and BiFPN architecture, while effective, might be less straightforward to modify or understand compared to the more modular YOLO architectures.
  • Ecosystem: Lacks the integrated ecosystem, extensive tooling, and active maintenance provided by Ultralytics for YOLO models.

Use Cases:

EfficientDet is suitable for applications where a balance between accuracy and computational resources is needed, particularly when deploying across a range of hardware capabilities.

  • Mobile and edge applications where model size is a constraint (though smaller YOLOv9 models like YOLOv9t offer strong competition).
  • General-purpose object detection tasks.

Performance Comparison: YOLOv9 vs. EfficientDet

The table below compares YOLOv9 and EfficientDet models on the COCO validation set. A "-" marks a value that is not reported.

| Model | size (pixels) | mAP val 50-95 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs (B) |
|---|---|---|---|---|---|---|
| YOLOv9t | 640 | 38.3 | - | 2.3 | 2.0 | 7.7 |
| YOLOv9s | 640 | 46.8 | - | 3.54 | 7.1 | 26.4 |
| YOLOv9m | 640 | 51.4 | - | 6.43 | 20.0 | 76.3 |
| YOLOv9c | 640 | 53.0 | - | 7.16 | 25.3 | 102.1 |
| YOLOv9e | 640 | 55.6 | - | 16.77 | 57.3 | 189.0 |
| EfficientDet-d0 | 640 | 34.6 | 10.2 | 3.92 | 3.9 | 2.54 |
| EfficientDet-d1 | 640 | 40.5 | 13.5 | 7.31 | 6.6 | 6.1 |
| EfficientDet-d2 | 640 | 43.0 | 17.7 | 10.92 | 8.1 | 11.0 |
| EfficientDet-d3 | 640 | 47.5 | 28.0 | 19.59 | 12.0 | 24.9 |
| EfficientDet-d4 | 640 | 49.7 | 42.8 | 33.55 | 20.7 | 55.2 |
| EfficientDet-d5 | 640 | 51.5 | 72.5 | 67.86 | 33.7 | 130.0 |
| EfficientDet-d6 | 640 | 52.6 | 92.8 | 89.29 | 51.9 | 226.0 |
| EfficientDet-d7 | 640 | 53.7 | 122.0 | 128.07 | 51.9 | 325.0 |

Analysis: At similar parameter counts, YOLOv9 models reach higher mAP than EfficientDet models. For instance, YOLOv9c achieves 53.0 mAP with 25.3M parameters, surpassing EfficientDet-d6 (52.6 mAP, 51.9M parameters). The picture at matched FLOPs is closer: EfficientDet-d3 reaches 47.5 mAP at 24.9B FLOPs, slightly ahead of YOLOv9s (46.8 mAP at 26.4B FLOPs). YOLOv9 models are also much faster on NVIDIA T4 GPUs with TensorRT. YOLOv9e reaches the highest mAP in the table (55.6) at 16.77 ms, compared with 128.07 ms for EfficientDet-d7 (53.7 mAP). Even the smallest model, YOLOv9t, offers competitive accuracy (38.3 mAP) with only 2.0M parameters and 2.3 ms inference. CPU ONNX speeds are not listed for YOLOv9, so the table does not support a CPU comparison.
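
The comparison in the analysis can be checked directly against the benchmark numbers. The sketch below copies the mAP, T4 TensorRT latency, and parameter columns from the table and, for each EfficientDet model, finds the lowest-parameter YOLOv9 variant that matches or exceeds its mAP. The helper names `cheapest_match` and `compare` are illustrative and do not belong to any library.

```python
# Rows copied from the table: (model, mAP 50-95, T4 TensorRT ms, params in M)
YOLOV9 = [
    ("YOLOv9t", 38.3, 2.3, 2.0),
    ("YOLOv9s", 46.8, 3.54, 7.1),
    ("YOLOv9m", 51.4, 6.43, 20.0),
    ("YOLOv9c", 53.0, 7.16, 25.3),
    ("YOLOv9e", 55.6, 16.77, 57.3),
]
EFFICIENTDET = [
    ("EfficientDet-d0", 34.6, 3.92, 3.9),
    ("EfficientDet-d1", 40.5, 7.31, 6.6),
    ("EfficientDet-d2", 43.0, 10.92, 8.1),
    ("EfficientDet-d3", 47.5, 19.59, 12.0),
    ("EfficientDet-d4", 49.7, 33.55, 20.7),
    ("EfficientDet-d5", 51.5, 67.86, 33.7),
    ("EfficientDet-d6", 52.6, 89.29, 51.9),
    ("EfficientDet-d7", 53.7, 128.07, 51.9),
]


def cheapest_match(target_map, candidates):
    """Return the lowest-parameter candidate with mAP >= target_map, or None."""
    good_enough = [c for c in candidates if c[1] >= target_map]
    return min(good_enough, key=lambda c: c[3]) if good_enough else None


def compare():
    """Pair each EfficientDet model with its cheapest YOLOv9 match.

    Returns a list of (efficientdet_name, yolov9_name, t4_speedup) tuples,
    where t4_speedup is EfficientDet latency divided by YOLOv9 latency.
    """
    rows = []
    for name, map_score, latency_ms, _params in EFFICIENTDET:
        match = cheapest_match(map_score, YOLOV9)
        if match is not None:
            rows.append((name, match[0], latency_ms / match[2]))
    return rows


if __name__ == "__main__":
    for eff_name, yolo_name, speedup in compare():
        print(f"{eff_name:16s} -> {yolo_name:8s} T4 speedup x{speedup:.1f}")
```

On these numbers, every EfficientDet model has a YOLOv9 counterpart with equal or higher mAP and lower T4 latency. The counterpart is not always smaller, though: matching EfficientDet-d3 requires YOLOv9m, which has 20.0M parameters versus 12.0M.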

Conclusion

While EfficientDet was a significant step forward in efficient object detection upon its release, YOLOv9 offers higher accuracy at comparable parameter counts and much lower GPU latency in the benchmarks above. Its PGI and GELAN designs provide a more effective balance of accuracy and computational cost.

For developers and researchers seeking the best combination of accuracy, speed, and ease of use, YOLOv9 is the recommended choice. Its integration within the Ultralytics ecosystem further enhances its appeal, providing streamlined workflows for training, validation, and deployment, along with robust community support.

For those interested in exploring other cutting-edge models, consider checking out comparisons involving YOLOv8, YOLOv10, or transformer-based models like RT-DETR.


