EfficientDet vs YOLO11: A Technical Comparison
Choosing the right object detection model involves understanding the trade-offs between different architectures. This page provides a detailed technical comparison between Google's EfficientDet and Ultralytics YOLO11, analyzing their architectures, performance metrics, and ideal use cases to guide your selection process for computer vision tasks.
EfficientDet
Authors: Mingxing Tan, Ruoming Pang, Quoc V. Le
Organization: Google
Date: 2019-11-20
Arxiv Link: https://arxiv.org/abs/1911.09070
GitHub Link: https://github.com/google/automl/tree/master/efficientdet
Docs Link: https://github.com/google/automl/tree/master/efficientdet#readme
EfficientDet is a family of object detection models developed by Google Research, introduced in 2019. It focuses on achieving high efficiency and accuracy through architectural innovations.
Architecture and Key Features
EfficientDet utilizes the EfficientNet backbone, known for its compound scaling method that balances network depth, width, and resolution. A key feature is the BiFPN (Bi-directional Feature Pyramid Network), a weighted feature fusion mechanism designed for efficient multi-scale feature aggregation. This architecture allows EfficientDet models to scale effectively from small, fast variants (e.g., EfficientDet-D0) to larger, more accurate ones (e.g., EfficientDet-D7).
Performance Metrics
EfficientDet models offer a range of performance points, trading off speed for accuracy. As seen in the table below, smaller variants like EfficientDet-D0 are very fast on CPU (10.2 ms ONNX) and have low parameter counts (3.9M), while larger variants like D7 achieve higher mAP (53.7) but with significantly increased latency and computational cost.
Strengths
- Scalability: Offers a wide range of models balancing efficiency and accuracy.
- Accuracy: Larger models achieve competitive mAP scores on benchmarks like COCO.
- Efficient Architecture: BiFPN and EfficientNet contribute to good performance per parameter.
Weaknesses
- Primarily Detection Focused: Less inherently versatile compared to models supporting multiple tasks out-of-the-box.
- Ecosystem: May lack the integrated ecosystem, extensive documentation, and active community support found with Ultralytics models.
- Real-time Speed: While efficient, larger variants may not match the GPU inference speeds of highly optimized models like YOLO11 for real-time applications.
Ideal Use Cases
EfficientDet is suitable for applications where a balance between accuracy and computational efficiency is needed, particularly when deploying across different hardware tiers. Examples include:
- Cloud-based image analysis.
- Applications where moderate latency is acceptable.
- Scenarios requiring scalable model options.
Ultralytics YOLO11
Authors: Glenn Jocher, Jing Qiu
Organization: Ultralytics
Date: 2024-09-27
Arxiv Link: None
GitHub Link: https://github.com/ultralytics/ultralytics
Docs Link: https://docs.ultralytics.com/models/yolo11/
Ultralytics YOLO11 is the latest state-of-the-art model in the YOLO series from Ultralytics, released in September 2024. It builds upon the success of predecessors like YOLOv8, focusing on enhancing speed, accuracy, and overall usability within a comprehensive ecosystem.
Architecture and Key Features
YOLO11 employs a single-stage, anchor-free architecture optimized for real-time performance. It features architectural refinements in the backbone, neck, and head to improve feature extraction and detection speed while maintaining high accuracy. A significant advantage is its Versatility, supporting multiple tasks like object detection, instance segmentation, pose estimation, image classification, and oriented bounding boxes (OBB) within a unified framework.
Performance Metrics
YOLO11 demonstrates an excellent Performance Balance, achieving high mAP scores with remarkable inference speeds, especially on GPUs (e.g., YOLO11n at 1.5 ms on T4 TensorRT10). It often requires lower memory during training and inference compared to more complex architectures. The range of models (n, s, m, l, x) allows users to select the optimal trade-off for their specific needs, from edge devices to cloud servers.
Strengths
- Ease of Use: Offers a streamlined user experience with a simple Python API and extensive documentation.
- Well-Maintained Ecosystem: Benefits from the integrated Ultralytics HUB platform for training and deployment, active development, strong community support, and frequent updates.
- High Speed: Optimized for real-time inference, particularly on GPUs using TensorRT.
- Versatility: Supports a wide array of computer vision tasks beyond just object detection.
- Training Efficiency: Efficient training processes with readily available pre-trained weights and lower memory requirements compared to many alternatives.
Weaknesses
- Resource Needs (Large Models): Larger variants (l, x) require significant computational resources, similar to other high-accuracy models.
- Small Object Detection: Like most one-stage detectors, may face challenges with extremely small objects compared to specialized two-stage detectors in certain niche scenarios.
Ideal Use Cases
YOLO11 excels in applications demanding high speed, accuracy, and ease of development:
- Real-time applications: Robotics, autonomous driving, surveillance, and security systems.
- Edge AI: Deployment on resource-constrained devices like NVIDIA Jetson or Raspberry Pi.
- Multi-task solutions: Projects requiring detection, segmentation, and pose estimation simultaneously.
- Rapid Prototyping: Quick development cycles thanks to the user-friendly ecosystem.
Performance Comparison
The table below provides a quantitative comparison of various EfficientDet and YOLO11 model variants based on performance metrics on the COCO dataset. Values in bold indicate the best performance in that column across all listed models.
Model | size (pixels) |
mAPval 50-95 |
Speed CPU ONNX (ms) |
Speed T4 TensorRT10 (ms) |
params (M) |
FLOPs (B) |
---|---|---|---|---|---|---|
EfficientDet-d0 | 640 | 34.6 | 10.2 | 3.92 | 3.9 | 2.54 |
EfficientDet-d1 | 640 | 40.5 | 13.5 | 7.31 | 6.6 | 6.1 |
EfficientDet-d2 | 640 | 43.0 | 17.7 | 10.92 | 8.1 | 11.0 |
EfficientDet-d3 | 640 | 47.5 | 28.0 | 19.59 | 12.0 | 24.9 |
EfficientDet-d4 | 640 | 49.7 | 42.8 | 33.55 | 20.7 | 55.2 |
EfficientDet-d5 | 640 | 51.5 | 72.5 | 67.86 | 33.7 | 130.0 |
EfficientDet-d6 | 640 | 52.6 | 92.8 | 89.29 | 51.9 | 226.0 |
EfficientDet-d7 | 640 | 53.7 | 122.0 | 128.07 | 51.9 | 325.0 |
YOLO11n | 640 | 39.5 | 56.1 | 1.5 | 2.6 | 6.5 |
YOLO11s | 640 | 47.0 | 90.0 | 2.5 | 9.4 | 21.5 |
YOLO11m | 640 | 51.5 | 183.2 | 4.7 | 20.1 | 68.0 |
YOLO11l | 640 | 53.4 | 238.6 | 6.2 | 25.3 | 86.9 |
YOLO11x | 640 | 54.7 | 462.8 | 11.3 | 56.9 | 194.9 |
Conclusion
Both EfficientDet and YOLO11 are powerful object detection models, but they cater to different priorities. EfficientDet offers strong scalability and accuracy, particularly suitable for applications where inference time is less critical than maximizing mAP per parameter.
However, Ultralytics YOLO11 is generally recommended for most users due to its superior ease of use, versatility across multiple vision tasks, faster real-time inference speeds (especially on GPU), and the robust, well-maintained ecosystem provided by Ultralytics. Its efficient training, readily available pre-trained models, and lower memory footprint make it an excellent choice for both researchers and developers looking for rapid development and deployment across diverse platforms.
For users exploring alternatives, Ultralytics offers comparisons with other models like YOLOv10, YOLOv9, YOLOv8, RT-DETR, and DAMO-YOLO within the Ultralytics documentation.