EfficientDet vs. YOLOv9: A Detailed Comparison
Choosing the optimal object detection model is critical for computer vision tasks, balancing accuracy, speed, and computational resources. This page provides a detailed technical comparison between EfficientDet and Ultralytics YOLOv9, two powerful models known for their performance in object detection. We will delve into their architectural designs, performance benchmarks, and suitable applications to assist you in making an informed decision.
EfficientDet Overview
EfficientDet, developed by the Google Brain team, was introduced in 2019. It focuses on achieving high accuracy and efficiency through a scalable architecture. EfficientDet utilizes EfficientNet backbones and a novel Bi-directional Feature Pyramid Network (BiFPN) for effective feature fusion across different scales.
Technical Details:
- Authors: Mingxing Tan, Ruoming Pang, and Quoc V. Le
- Organization: Google
- Date: 2019-11-20
- Arxiv Link: https://arxiv.org/abs/1911.09070
- GitHub Link: https://github.com/google/automl/tree/master/efficientdet
- Docs Link: https://github.com/google/automl/tree/master/efficientdet#readme
Strengths:
- Scalability: Offers a family of models (D0-D7) that scale consistently in terms of accuracy and computational cost.
- Accuracy-Efficiency Balance: Achieves strong accuracy with relatively fewer parameters and FLOPs compared to older models at the time of release.
- BiFPN: The Bi-directional Feature Pyramid Network allows for efficient multi-scale feature fusion.
Weaknesses:
- Inference Speed: While efficient for its accuracy level, larger EfficientDet models can be slower than comparable YOLOv9 models, especially on GPU with TensorRT optimization.
- Ecosystem: Lacks the integrated ecosystem, extensive documentation, and active community support provided by Ultralytics for YOLO models. Training and deployment might require more manual effort.
YOLOv9 Overview
Ultralytics YOLOv9, introduced in 2024 by Chien-Yao Wang and Hong-Yuan Mark Liao, represents a significant advancement in the YOLO series. It addresses information loss in deep networks through innovative architectural elements like Programmable Gradient Information (PGI) and the Generalized Efficient Layer Aggregation Network (GELAN). YOLOv9 aims for state-of-the-art accuracy while maintaining high efficiency.
Technical Details:
- Authors: Chien-Yao Wang and Hong-Yuan Mark Liao
- Organization: Institute of Information Science, Academia Sinica, Taiwan
- Date: 2024-02-21
- Arxiv Link: https://arxiv.org/abs/2402.13616
- GitHub Link: https://github.com/WongKinYiu/yolov9
- Docs Link: https://docs.ultralytics.com/models/yolov9/
Strengths:
- State-of-the-art Accuracy: Achieves superior accuracy, particularly evident in the larger YOLOv9e model.
- Parameter and Computational Efficiency: PGI and GELAN architectures enhance feature extraction and reduce information loss, leading to excellent performance with fewer parameters and FLOPs compared to models with similar accuracy.
- Inference Speed: Highly optimized for speed, especially when using TensorRT, significantly outperforming EfficientDet on GPU.
- Ease of Use: Benefits from the Ultralytics ecosystem, offering a streamlined user experience, simple Python API, extensive documentation, and efficient training processes with readily available pre-trained weights.
- Well-Maintained Ecosystem: Actively developed and supported by Ultralytics, with a strong community, frequent updates, and integration with tools like Ultralytics HUB for MLOps workflows.
- Memory Efficiency: Generally requires less CUDA memory during training and inference compared to many other architectures.
Weaknesses:
- CPU Inference Speed: Specific CPU ONNX speeds are not benchmarked in the provided table, but smaller EfficientDet models show very fast CPU inference.
- Novelty: As a newer model, the community is still growing compared to long-established models, though Ultralytics provides robust support.
Performance Comparison
The table below compares various EfficientDet and YOLOv9 model variants based on performance metrics on the COCO dataset.
Model | size (pixels) |
mAPval 50-95 |
Speed CPU ONNX (ms) |
Speed T4 TensorRT10 (ms) |
params (M) |
FLOPs (B) |
---|---|---|---|---|---|---|
EfficientDet-d0 | 640 | 34.6 | 10.2 | 3.92 | 3.9 | 2.54 |
EfficientDet-d1 | 640 | 40.5 | 13.5 | 7.31 | 6.6 | 6.1 |
EfficientDet-d2 | 640 | 43.0 | 17.7 | 10.92 | 8.1 | 11.0 |
EfficientDet-d3 | 640 | 47.5 | 28.0 | 19.59 | 12.0 | 24.9 |
EfficientDet-d4 | 640 | 49.7 | 42.8 | 33.55 | 20.7 | 55.2 |
EfficientDet-d5 | 640 | 51.5 | 72.5 | 67.86 | 33.7 | 130.0 |
EfficientDet-d6 | 640 | 52.6 | 92.8 | 89.29 | 51.9 | 226.0 |
EfficientDet-d7 | 640 | 53.7 | 122.0 | 128.07 | 51.9 | 325.0 |
YOLOv9t | 640 | 38.3 | - | 2.3 | 2.0 | 7.7 |
YOLOv9s | 640 | 46.8 | - | 3.54 | 7.1 | 26.4 |
YOLOv9m | 640 | 51.4 | - | 6.43 | 20.0 | 76.3 |
YOLOv9c | 640 | 53.0 | - | 7.16 | 25.3 | 102.1 |
YOLOv9e | 640 | 55.6 | - | 16.77 | 57.3 | 189.0 |
Analysis:
- Accuracy (mAP): YOLOv9 models generally achieve higher mAP than EfficientDet models with similar or fewer parameters. YOLOv9e reaches the highest mAP (55.6) among the compared models.
- Speed (TensorRT): YOLOv9 models demonstrate significantly faster inference speeds on NVIDIA T4 GPUs using TensorRT compared to EfficientDet models. For instance, YOLOv9c (53.0 mAP) is much faster (7.16 ms) than EfficientDet-d6 (52.6 mAP, 89.29 ms) and EfficientDet-d7 (53.7 mAP, 128.07 ms). Even the smallest YOLOv9t achieves remarkable speed (2.3 ms).
- Efficiency (Params/FLOPs): YOLOv9 models are highly parameter-efficient. YOLOv9c achieves 53.0 mAP with only 25.3M parameters, whereas EfficientDet-d5 needs 33.7M parameters for 51.5 mAP. YOLOv9t provides a good starting point with just 2.0M parameters.
- CPU Speed: While YOLOv9 CPU speeds aren't listed, EfficientDet shows strong performance, especially the smaller D0 variant (10.2 ms).
Conclusion
Both EfficientDet and YOLOv9 are powerful object detection models. EfficientDet offers a scalable family of models with a good balance of accuracy and efficiency, particularly strong on CPU inference for smaller variants.
However, YOLOv9 stands out as the superior choice for most applications, especially those requiring state-of-the-art accuracy and high-speed GPU inference. Its innovative PGI and GELAN architecture delivers exceptional performance with remarkable parameter efficiency. Furthermore, being part of the Ultralytics ecosystem provides significant advantages, including ease of use, a simple API, comprehensive documentation, efficient training, lower memory requirements, and strong community support.
For developers seeking the best combination of accuracy, speed, and usability within a well-supported framework, YOLOv9 is highly recommended.
Explore other models within the Ultralytics ecosystem, such as the versatile YOLOv8, the highly efficient YOLOv10, or the cutting-edge YOLO11, to find the perfect fit for your specific computer vision task.