
YOLOv10 vs. YOLOv9: Advancing Real-Time Object Detection

The year 2024 marked a period of rapid innovation in the object detection landscape, with the release of two significant architectures: YOLOv10 and YOLOv9. While both models aim to push the boundaries of speed and accuracy, they achieve this through fundamentally different architectural philosophies.

YOLOv10 focuses on eliminating the inference latency caused by post-processing through an NMS-free design, whereas YOLOv9 emphasizes information retention in deep networks using Programmable Gradient Information (PGI).

Performance Comparison

The following table provides a detailed look at how these models compare across standard benchmarks. The data highlights the trade-offs between parameter efficiency, inference speed, and detection accuracy (mAP).

| Model | size (pixels) | mAPval 50-95 | Speed CPU ONNX (ms) | Speed T4 TensorRT10 (ms) | params (M) | FLOPs (B) |
|---|---|---|---|---|---|---|
| YOLOv10n | 640 | 39.5 | - | 1.56 | 2.3 | 6.7 |
| YOLOv10s | 640 | 46.7 | - | 2.66 | 7.2 | 21.6 |
| YOLOv10m | 640 | 51.3 | - | 5.48 | 15.4 | 59.1 |
| YOLOv10b | 640 | 52.7 | - | 6.54 | 24.4 | 92.0 |
| YOLOv10l | 640 | 53.3 | - | 8.33 | 29.5 | 120.3 |
| YOLOv10x | 640 | 54.4 | - | 12.2 | 56.9 | 160.4 |
| YOLOv9t | 640 | 38.3 | - | 2.3 | 2.0 | 7.7 |
| YOLOv9s | 640 | 46.8 | - | 3.54 | 7.1 | 26.4 |
| YOLOv9m | 640 | 51.4 | - | 6.43 | 20.0 | 76.3 |
| YOLOv9c | 640 | 53.0 | - | 7.16 | 25.3 | 102.1 |
| YOLOv9e | 640 | 55.6 | - | 16.77 | 57.3 | 189.0 |
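One way to read these trade-offs is accuracy per unit of model size. The snippet below computes mAP per million parameters for a few comparable pairs, using values copied from the table above (a rough heuristic for illustration, not an official metric):

```python
# (mAPval 50-95, params in M) copied from the benchmark table above
models = {
    "YOLOv10m": (51.3, 15.4),
    "YOLOv9m": (51.4, 20.0),
    "YOLOv10x": (54.4, 56.9),
    "YOLOv9e": (55.6, 57.3),
}

for name, (map_val, params) in models.items():
    print(f"{name}: {map_val / params:.2f} mAP per M params")
```

For example, YOLOv10m reaches nearly the same mAP as YOLOv9m with about 23% fewer parameters, while YOLOv9e buys its accuracy lead with the largest parameter budget in the table.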

YOLOv10: The End-to-End Pioneer

YOLOv10, developed by researchers at Tsinghua University, represents a shift toward end-to-end processing. Released on May 23, 2024, by Ao Wang, Hui Chen, and colleagues, it addresses the bottleneck of Non-Maximum Suppression (NMS).

Learn more about YOLOv10

Key Architectural Features

  • NMS-Free Training: By employing consistent dual assignments, YOLOv10 eliminates the need for NMS during inference. This reduces latency and simplifies deployment pipelines, particularly for edge computing applications.
  • Holistic Efficiency Design: The architecture optimizes various components to reduce computational overhead (FLOPs) while maintaining high capability.
  • Improved Latency: As shown in the table, YOLOv10 models generally offer lower inference times compared to their YOLOv9 counterparts for similar accuracy levels.
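To see what the NMS-free design removes, here is a minimal sketch of classic greedy NMS in plain Python (for intuition only; real pipelines use vectorized implementations such as `torchvision.ops.nms`). This is the post-processing step that YOLOv10's consistent dual assignments make unnecessary:

```python
def iou(a, b):
    """Intersection-over-union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: keep the highest-scoring box, drop overlapping boxes, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) < iou_thresh]
    return keep

# Two near-duplicate detections of one object, plus one distinct object:
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # -> [0, 2]: the duplicate (index 1) is suppressed
```

Because this loop's cost depends on how many candidate boxes the model emits, skipping it gives YOLOv10 a more predictable latency budget, which is exactly what the table's T4 TensorRT numbers reflect.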

For technical details, you can consult the YOLOv10 arXiv paper.

YOLOv9: Mastering Information Flow

YOLOv9, released on February 21, 2024, by Chien-Yao Wang and Hong-Yuan Mark Liao from Academia Sinica, focuses on the theoretical issue of information loss in deep neural networks.

Learn more about YOLOv9

Key Architectural Features

  • GELAN Architecture: The Generalized Efficient Layer Aggregation Network combines the strengths of CSPNet and ELAN to maximize parameter utilization.
  • Programmable Gradient Information (PGI): This auxiliary supervision mechanism ensures that deep layers retain critical information for accurate detection, making the model highly effective for tasks requiring high precision.
  • High Accuracy: The YOLOv9e model achieves an impressive mAPval of 55.6%, outperforming many contemporaries in pure detection accuracy.

For a deeper dive, read the YOLOv9 arXiv paper.

Training and Ease of Use

Both models are fully integrated into the Ultralytics ecosystem, providing a unified and seamless experience for developers. Whether you are using YOLOv10 or YOLOv9, the Ultralytics Python API abstracts away the complexity of training pipelines, data augmentation, and logging.

Code Example

Training a model on a custom dataset or a standard benchmark like COCO8 is straightforward. The framework automatically handles the differences in architecture.

from ultralytics import YOLO

# Load a model (Choose YOLOv10 or YOLOv9)
model = YOLO("yolov10n.pt")  # or "yolov9c.pt"

# Train the model on the COCO8 dataset
results = model.train(data="coco8.yaml", epochs=100, imgsz=640)

# Validate the model and capture the metrics
metrics = model.val()

Memory Efficiency

Ultralytics YOLO models are engineered for optimal GPU memory usage. Compared to transformer-based architectures or older detection models, they allow for larger batch sizes on consumer-grade hardware, making state-of-the-art AI accessible to a wider audience.

Ideal Use Cases

Choosing between YOLOv10 and YOLOv9 often depends on the specific constraints of your deployment environment.

When to Choose YOLOv10

  • Low Latency Constraints: If your application runs on mobile devices or embedded systems where every millisecond counts, the NMS-free design of YOLOv10 offers a significant advantage.
  • Simple Deployment: Removing post-processing steps simplifies the export to formats like ONNX or TensorRT, reducing the risk of operator incompatibility.
  • Real-Time Video: Ideal for traffic management or high-speed manufacturing lines where throughput is critical.

When to Choose YOLOv9

  • Maximum Accuracy: For research applications or scenarios where precision is paramount (e.g., medical image analysis), the PGI-enhanced architecture of YOLOv9e delivers superior results.
  • Small Object Detection: The rich feature preservation of GELAN makes YOLOv9 particularly robust for detecting small or occluded objects in aerial imagery.
  • Complex Scenes: In environments with high visual clutter, the programmable gradient information helps the model distinguish relevant features more effectively.

The Future is Here: YOLO26

While YOLOv9 and YOLOv10 are powerful tools, the field of computer vision evolves rapidly. Ultralytics recently released YOLO26, a model that synthesizes the best features of previous generations while introducing groundbreaking optimizations.

Learn more about YOLO26

YOLO26 is the recommended choice for new projects, offering a superior balance of speed, accuracy, and versatility.

Why Upgrade to YOLO26?

  • End-to-End NMS-Free: Like YOLOv10, YOLO26 is natively end-to-end. It eliminates NMS post-processing, ensuring faster inference and simplified deployment pipelines.
  • MuSGD Optimizer: Inspired by innovations in Large Language Model (LLM) training (specifically Moonshot AI's Kimi K2), YOLO26 utilizes a hybrid of SGD and the Muon optimizer. This results in significantly more stable training and faster convergence.
  • DFL Removal: By removing Distribution Focal Loss, YOLO26 streamlines the model architecture, making it friendlier for export and compatible with a wider range of edge/low-power devices.
  • Performance Leap: Optimizations specifically targeting CPU inference deliver speeds up to 43% faster than previous generations, making it a powerhouse for edge AI.
  • Task Versatility: Unlike the detection-focused releases of v9 and v10, YOLO26 includes specialized improvements for all tasks:
    • Segmentation: New semantic segmentation loss and multi-scale proto.
    • Pose: Residual Log-Likelihood Estimation (RLE) for high-accuracy keypoints.
    • OBB: Specialized angle loss to handle boundary issues in Oriented Bounding Box tasks.
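Of these changes, the DFL removal is the easiest to make concrete. Distribution Focal Loss (used in earlier YOLO heads, following the Generalized Focal Loss formulation) predicts each box edge as a discrete distribution over integer bins and applies a weighted cross-entropy to the two bins flanking the continuous target. The sketch below is a plain-Python illustration of that idea, not YOLO26 code:

```python
import math

def dfl(probs, y):
    """Distribution Focal Loss for a single box edge.

    probs: softmax probabilities over integer bins 0..n-1
    y: continuous regression target; floor(y) and floor(y)+1 are its flanking bins
    """
    lo = int(math.floor(y))
    hi = lo + 1
    # Weighted cross-entropy on the two bins nearest the target:
    # weights (hi - y) and (y - lo) interpolate between them
    return -((hi - y) * math.log(probs[lo]) + (y - lo) * math.log(probs[hi]))

# A distribution concentrated on bins 2 and 3, with target y = 2.4:
probs = [0.02, 0.08, 0.55, 0.30, 0.05]
print(round(dfl(probs, 2.4), 4))  # -> 0.8403
```

Dropping this distributional head means the model regresses box coordinates directly, which removes the softmax-and-expectation decoding step that complicates export to some edge runtimes.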

Streamlined Workflow with Ultralytics Platform

Developers can leverage the Ultralytics Platform (formerly HUB) to manage the entire lifecycle of their YOLO26 models. From annotating datasets to training on the cloud and deploying to edge devices, the Platform provides a unified interface that accelerates time-to-market.

Conclusion

Both YOLOv10 and YOLOv9 represent significant milestones in the history of object detection. YOLOv10 proved that NMS-free architectures could achieve state-of-the-art performance, while YOLOv9 demonstrated the importance of gradient information flow in deep networks.

However, for developers seeking the most robust, versatile, and future-proof solution, YOLO26 stands out as the premier choice. By combining an NMS-free design with the revolutionary MuSGD optimizer and extensive task support, YOLO26 offers the best performance balance for modern computer vision applications.

Explore Other Models

  • YOLO11 - The robust predecessor to YOLO26, known for its stability.
  • YOLOv8 - A versatile classic widely used in industry.
  • RT-DETR - A transformer-based real-time detector.
