ΠŸΠ΅Ρ€Π΅ΠΉΡ‚ΠΈ ΠΊ содСрТимому

Optimizing YOLO11 Inferences with Neural Magic's DeepSparse Engine

When deploying object detection models like Ultralytics YOLO11 on various hardware, you can bump into unique issues like optimization. This is where YOLO11's integration with Neural Magic's DeepSparse Engine steps in. It transforms the way YOLO11 models are executed and enables GPU-level performance directly on CPUs.

This guide shows you how to deploy YOLO11 using Neural Magic's DeepSparse, how to run inferences, and also how to benchmark performance to ensure it is optimized.

Neural Magic'DeepSparse

Neural MagicΠžΠ±Π·ΠΎΡ€ DeepSparse

Neural Magic's DeepSparse is an inference run-time designed to optimize the execution of neural networks on CPUs. It applies advanced techniques like sparsity, pruning, and quantization to dramatically reduce computational demands while maintaining accuracy. DeepSparse offers an agile solution for efficient and scalable neural network execution across various devices.

Benefits of Integrating Neural Magic's DeepSparse with YOLO11

ΠŸΡ€Π΅ΠΆΠ΄Π΅ Ρ‡Π΅ΠΌ ΡƒΠ³Π»ΡƒΠ±Π»ΡΡ‚ΡŒΡΡ Π² Ρ€Π°Π·Π²Π΅Ρ€Ρ‚Ρ‹Π²Π°Π½ΠΈΠ΅ YOLOV8 ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΡ DeepSparse, Π΄Π°Π²Π°ΠΉΡ‚Π΅ разбСрСмся Π² прСимущСствах использования DeepSparse. Π’ΠΎΡ‚ Π½Π΅ΠΊΠΎΡ‚ΠΎΡ€Ρ‹Π΅ ΠΊΠ»ΡŽΡ‡Π΅Π²Ρ‹Π΅ прСимущСства:

  • Enhanced Inference Speed: Achieves up to 525 FPS (on YOLO11n), significantly speeding up YOLO11's inference capabilities compared to traditional methods.

ΠŸΠΎΠ²Ρ‹ΡˆΠ΅Π½Π½Π°Ρ ΡΠΊΠΎΡ€ΠΎΡΡ‚ΡŒ ΡƒΠΌΠΎΠ·Π°ΠΊΠ»ΡŽΡ‡Π΅Π½ΠΈΠΉ

  • Optimized Model Efficiency: Uses pruning and quantization to enhance YOLO11's efficiency, reducing model size and computational requirements while maintaining accuracy.

ΠžΠΏΡ‚ΠΈΠΌΠΈΠ·ΠΈΡ€ΠΎΠ²Π°Π½Π½Π°Ρ ΡΡ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½ΠΎΡΡ‚ΡŒ ΠΌΠΎΠ΄Π΅Π»ΠΈ

  • Высокая ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΡŒ Π½Π° стандартных процСссорах: ΠžΠ±Π΅ΡΠΏΠ΅Ρ‡ΠΈΠ²Π°Π΅Ρ‚ ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΡŒ, ΠΏΠΎΠ΄ΠΎΠ±Π½ΡƒΡŽ GPU, Π½Π° Ρ†Π΅Π½Ρ‚Ρ€Π°Π»ΡŒΠ½Ρ‹Ρ… процСссорах, прСдоставляя Π±ΠΎΠ»Π΅Π΅ доступный ΠΈ экономичный Π²Π°Ρ€ΠΈΠ°Π½Ρ‚ для Ρ€Π°Π·Π»ΠΈΡ‡Π½Ρ‹Ρ… ΠΏΡ€ΠΈΠ»ΠΎΠΆΠ΅Π½ΠΈΠΉ.

  • Streamlined Integration and Deployment: Offers user-friendly tools for easy integration of YOLO11 into applications, including image and video annotation features.

  • Support for Various Model Types: Compatible with both standard and sparsity-optimized YOLO11 models, adding deployment flexibility.

  • ЭкономичСски эффСктивноС ΠΈ ΠΌΠ°ΡΡˆΡ‚Π°Π±ΠΈΡ€ΡƒΠ΅ΠΌΠΎΠ΅ Ρ€Π΅ΡˆΠ΅Π½ΠΈΠ΅: Π‘ΠΎΠΊΡ€Π°Ρ‰Π°Π΅Ρ‚ эксплуатационныС расходы ΠΈ ΠΏΡ€Π΅Π΄Π»Π°Π³Π°Π΅Ρ‚ ΠΌΠ°ΡΡˆΡ‚Π°Π±ΠΈΡ€ΡƒΠ΅ΠΌΠΎΠ΅ Ρ€Π°Π·Π²Π΅Ρ€Ρ‚Ρ‹Π²Π°Π½ΠΈΠ΅ ΠΏΠ΅Ρ€Π΅Π΄ΠΎΠ²Ρ‹Ρ… ΠΌΠΎΠ΄Π΅Π»Π΅ΠΉ обнаруТСния ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ΠΎΠ².

Как Ρ€Π°Π±ΠΎΡ‚Π°Π΅Ρ‚ тСхнология Neural Magic'DeepSparse'?

Neural MagicВСхнология Deep Sparse основана Π½Π° эффСктивности чСловСчСского ΠΌΠΎΠ·Π³Π° Π² вычислСниях Π½Π΅ΠΉΡ€ΠΎΠ½Π½Ρ‹Ρ… сСтСй. Он ΠΏΠ΅Ρ€Π΅Π½ΠΈΠΌΠ°Π΅Ρ‚ Π΄Π²Π° ΠΊΠ»ΡŽΡ‡Π΅Π²Ρ‹Ρ… ΠΏΡ€ΠΈΠ½Ρ†ΠΈΠΏΠ° ΠΌΠΎΠ·Π³Π° ΡΠ»Π΅Π΄ΡƒΡŽΡ‰ΠΈΠΌ ΠΎΠ±Ρ€Π°Π·ΠΎΠΌ:

  • Sparsity: The process of sparsification involves pruning redundant information from deep learning networks, leading to smaller and faster models without compromising accuracy. This technique reduces the network's size and computational needs significantly.

  • Locality of Reference: DeepSparse ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠ΅Ρ‚ ΡƒΠ½ΠΈΠΊΠ°Π»ΡŒΠ½Ρ‹ΠΉ ΠΌΠ΅Ρ‚ΠΎΠ΄ выполнСния, разбивая ΡΠ΅Ρ‚ΡŒ Π½Π° Tensor ΠΊΠΎΠ»ΠΎΠ½ΠΊΠΈ. Π­Ρ‚ΠΈ ΠΊΠΎΠ»ΠΎΠ½ΠΊΠΈ Π²Ρ‹ΠΏΠΎΠ»Π½ΡΡŽΡ‚ΡΡ Π² Π³Π»ΡƒΠ±ΠΈΠ½Ρƒ, ΠΏΠΎΠ»Π½ΠΎΡΡ‚ΡŒΡŽ ΠΏΠΎΠΌΠ΅Ρ‰Π°ΡΡΡŒ Π² кэш CPU. Π’Π°ΠΊΠΎΠΉ ΠΏΠΎΠ΄Ρ…ΠΎΠ΄ ΠΈΠΌΠΈΡ‚ΠΈΡ€ΡƒΠ΅Ρ‚ ΡΡ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½ΠΎΡΡ‚ΡŒ Ρ€Π°Π±ΠΎΡ‚Ρ‹ ΠΌΠΎΠ·Π³Π°, минимизируя ΠΏΠ΅Ρ€Π΅ΠΌΠ΅Ρ‰Π΅Π½ΠΈΠ΅ Π΄Π°Π½Π½Ρ‹Ρ… ΠΈ максимально ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΡƒΡ кэш CPU.

Как Ρ€Π°Π±ΠΎΡ‚Π°Π΅Ρ‚ тСхнология Neural Magic'DeepSparse

Π‘ΠΎΠ»Π΅Π΅ ΠΏΠΎΠ΄Ρ€ΠΎΠ±Π½ΠΎ ΠΎ Ρ‚ΠΎΠΌ, ΠΊΠ°ΠΊ Ρ€Π°Π±ΠΎΡ‚Π°Π΅Ρ‚ тСхнология Neural Magic'DeepSparse, Ρ‡ΠΈΡ‚Π°ΠΉ Π² ΠΈΡ… Π±Π»ΠΎΠ³Π΅.

Creating A Sparse Version of YOLO11 Trained on a Custom Dataset

SparseZoo, an open-source model repository by Neural Magic, offers a collection of pre-sparsified YOLO11 model checkpoints. With SparseML, seamlessly integrated with Ultralytics, users can effortlessly fine-tune these sparse checkpoints on their specific datasets using a straightforward command-line interface.

Checkout Neural Magic's SparseML YOLO11 documentation for more details.

ИспользованиС: Π Π°Π·Π²Π΅Ρ€Ρ‚Ρ‹Π²Π°Π½ΠΈΠ΅ YOLOV8 с ΠΏΠΎΠΌΠΎΡ‰ΡŒΡŽ DeepSparse

Deploying YOLO11 with Neural Magic's DeepSparse involves a few straightforward steps. Before diving into the usage instructions, be sure to check out the range of YOLO11 models offered by Ultralytics. This will help you choose the most appropriate model for your project requirements. Here's how you can get started.

Π¨Π°Π³ 1: Установка

Π§Ρ‚ΠΎΠ±Ρ‹ ΡƒΡΡ‚Π°Π½ΠΎΠ²ΠΈΡ‚ΡŒ Π½Π΅ΠΎΠ±Ρ…ΠΎΠ΄ΠΈΠΌΡ‹Π΅ ΠΏΠ°ΠΊΠ΅Ρ‚Ρ‹, Π²Ρ‹ΠΏΠΎΠ»Π½ΠΈ:

Установка

# Install the required packages
pip install deepsparse[yolov8]

Step 2: Exporting YOLO11 to ONNX Format

DeepSparse Engine requires YOLO11 models in ONNX format. Exporting your model to this format is essential for compatibility with DeepSparse. Use the following command to export YOLO11 models:

Экспорт ΠΌΠΎΠ΄Π΅Π»Π΅ΠΉ

# Export YOLO11 model to ONNX format
yolo task=detect mode=export model=yolo11n.pt format=onnx opset=13

Π­Ρ‚Π° ΠΊΠΎΠΌΠ°Π½Π΄Π° сохранит yolo11n.onnx модСль Π½Π° свой диск.

Π¨Π°Π³ 3: Π Π°Π·Π²Π΅Ρ€Ρ‚Ρ‹Π²Π°Π½ΠΈΠ΅ ΠΈ запуск ΡƒΠΌΠΎΠ·Π°ΠΊΠ»ΡŽΡ‡Π΅Π½ΠΈΠΉ

With your YOLO11 model in ONNX format, you can deploy and run inferences using DeepSparse. This can be done easily with their intuitive Python API:

Π Π°Π·Π²Π΅Ρ€Ρ‚Ρ‹Π²Π°Π½ΠΈΠ΅ ΠΈ запуск ΡƒΠΌΠΎΠ·Π°ΠΊΠ»ΡŽΡ‡Π΅Π½ΠΈΠΉ

from deepsparse import Pipeline

# Specify the path to your YOLO11 ONNX model
model_path = "path/to/yolo11n.onnx"

# Set up the DeepSparse Pipeline
yolo_pipeline = Pipeline.create(task="yolov8", model_path=model_path)

# Run the model on your images
images = ["path/to/image.jpg"]
pipeline_outputs = yolo_pipeline(images=images)

Π¨Π°Π³ 4: Π‘Π΅Π½Ρ‡ΠΌΠ°Ρ€ΠΊΠΈΠ½Π³ ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΠΈ

It's important to check that your YOLO11 model is performing optimally on DeepSparse. You can benchmark your model's performance to analyze throughput and latency:

Π‘Π΅Π½Ρ‡ΠΌΠ°Ρ€ΠΊΠΈΠ½Π³

# Benchmark performance
deepsparse.benchmark model_path="path/to/yolo11n.onnx" --scenario=sync --input_shapes="[1,3,640,640]"

Π¨Π°Π³ 5: Π”ΠΎΠΏΠΎΠ»Π½ΠΈΡ‚Π΅Π»ΡŒΠ½Ρ‹Π΅ возмоТности

DeepSparse provides additional features for practical integration of YOLO11 in applications, such as image annotation and dataset evaluation.

Π”ΠΎΠΏΠΎΠ»Π½ΠΈΡ‚Π΅Π»ΡŒΠ½Ρ‹Π΅ Ρ„ΡƒΠ½ΠΊΡ†ΠΈΠΈ

# For image annotation
deepsparse.yolov8.annotate --source "path/to/image.jpg" --model_filepath "path/to/yolo11n.onnx"

# For evaluating model performance on a dataset
deepsparse.yolov8.eval --model_path "path/to/yolo11n.onnx"

Π’Ρ‹ΠΏΠΎΠ»Π½ΠΈΠ² ΠΊΠΎΠΌΠ°Π½Π΄Ρƒ annotate, Ρ‚Ρ‹ ΠΎΠ±Ρ€Π°Π±ΠΎΡ‚Π°Π΅ΡˆΡŒ ΡƒΠΊΠ°Π·Π°Π½Π½ΠΎΠ΅ ΠΈΠ·ΠΎΠ±Ρ€Π°ΠΆΠ΅Π½ΠΈΠ΅, ΠΎΠ±Π½Π°Ρ€ΡƒΠΆΠΈΡˆΡŒ ΠΎΠ±ΡŠΠ΅ΠΊΡ‚Ρ‹ ΠΈ ΡΠΎΡ…Ρ€Π°Π½ΠΈΡˆΡŒ Π°Π½Π½ΠΎΡ‚ΠΈΡ€ΠΎΠ²Π°Π½Π½ΠΎΠ΅ ΠΈΠ·ΠΎΠ±Ρ€Π°ΠΆΠ΅Π½ΠΈΠ΅ с ΠΎΠ³Ρ€Π°Π½ΠΈΡ‡ΠΈΡ‚Π΅Π»ΡŒΠ½Ρ‹ΠΌΠΈ Ρ€Π°ΠΌΠΊΠ°ΠΌΠΈ ΠΈ классификациСй. АннотированноС ΠΈΠ·ΠΎΠ±Ρ€Π°ΠΆΠ΅Π½ΠΈΠ΅ Π±ΡƒΠ΄Π΅Ρ‚ сохранСно Π² ΠΏΠ°ΠΏΠΊΠ΅ annotation-results. Π­Ρ‚ΠΎ ΠΏΠΎΠΌΠΎΠΆΠ΅Ρ‚ ΡΠΎΠ·Π΄Π°Ρ‚ΡŒ Π²ΠΈΠ·ΡƒΠ°Π»ΡŒΠ½ΠΎΠ΅ прСдставлСниС ΠΎ возмоТностях ΠΌΠΎΠ΄Π΅Π»ΠΈ ΠΏΠΎ ΠΎΠ±Π½Π°Ρ€ΡƒΠΆΠ΅Π½ΠΈΡŽ ΠΎΠ±ΡŠΠ΅ΠΊΡ‚ΠΎΠ².

Ѐункция аннотирования ΠΈΠ·ΠΎΠ±Ρ€Π°ΠΆΠ΅Π½ΠΈΠΉ

After running the eval command, you will receive detailed output metrics such as precision, recall, and mAP (mean Average Precision). This provides a comprehensive view of your model's performance on the dataset. This functionality is particularly useful for fine-tuning and optimizing your YOLO11 models for specific use cases, ensuring high accuracy and efficiency.

РСзюмС

This guide explored integrating Ultralytics' YOLO11 with Neural Magic's DeepSparse Engine. It highlighted how this integration enhances YOLO11's performance on CPU platforms, offering GPU-level efficiency and advanced neural network sparsity techniques.

For more detailed information and advanced usage, visit Neural Magic's DeepSparse documentation. Also, check out Neural Magic's documentation on the integration with YOLO11 here and watch a great session on it here.

Additionally, for a broader understanding of various YOLO11 integrations, visit the Ultralytics integration guide page, where you can discover a range of other exciting integration possibilities.

Π’ΠžΠŸΠ ΠžΠ‘Π« И ΠžΠ’Π’Π•Π’Π«

What is Neural Magic's DeepSparse Engine and how does it optimize YOLO11 performance?

Neural Magic's DeepSparse Engine is an inference runtime designed to optimize the execution of neural networks on CPUs through advanced techniques such as sparsity, pruning, and quantization. By integrating DeepSparse with YOLO11, you can achieve GPU-like performance on standard CPUs, significantly enhancing inference speed, model efficiency, and overall performance while maintaining accuracy. For more details, check out the Neural Magic's DeepSparse section.

How can I install the needed packages to deploy YOLO11 using Neural Magic's DeepSparse?

Installing the required packages for deploying YOLO11 with Neural Magic's DeepSparse is straightforward. You can easily install them using the CLI. Here's the command you need to run:

pip install deepsparse[yolov8]

Once installed, follow the steps provided in the Installation section to set up your environment and start using DeepSparse with YOLO11.

How do I convert YOLO11 models to ONNX format for use with DeepSparse?

To convert YOLO11 models to the ONNX format, which is required for compatibility with DeepSparse, you can use the following CLI command:

yolo task=detect mode=export model=yolo11n.pt format=onnx opset=13

This command will export your YOLO11 model (yolo11n.pt) Π² Ρ„ΠΎΡ€ΠΌΠ°Ρ‚ (yolo11n.onnx), ΠΊΠΎΡ‚ΠΎΡ€Ρ‹Π΅ ΠΌΠΎΠ³ΡƒΡ‚ Π±Ρ‹Ρ‚ΡŒ ΠΈΡΠΏΠΎΠ»ΡŒΠ·ΠΎΠ²Π°Π½Ρ‹ Π΄Π²ΠΈΠΆΠΊΠΎΠΌ DeepSparse Engine. Π‘ΠΎΠ»Π΅Π΅ ΠΏΠΎΠ΄Ρ€ΠΎΠ±Π½ΡƒΡŽ ΠΈΠ½Ρ„ΠΎΡ€ΠΌΠ°Ρ†ΠΈΡŽ ΠΎΠ± экспортС ΠΌΠΎΠ΄Π΅Π»Π΅ΠΉ ΠΌΠΎΠΆΠ½ΠΎ Π½Π°ΠΉΡ‚ΠΈ Π² Ρ€Π°Π·Π΄Π΅Π»Π΅ Π Π°Π·Π΄Π΅Π» "Экспорт ΠΌΠΎΠ΄Π΅Π»Π΅ΠΉ.

How do I benchmark YOLO11 performance on the DeepSparse Engine?

Benchmarking YOLO11 performance on DeepSparse helps you analyze throughput and latency to ensure your model is optimized. You can use the following CLI command to run a benchmark:

deepsparse.benchmark model_path="path/to/yolo11n.onnx" --scenario=sync --input_shapes="[1,3,640,640]"

Π­Ρ‚Π° ΠΊΠΎΠΌΠ°Π½Π΄Π° прСдоставит Ρ‚Π΅Π±Π΅ ΠΆΠΈΠ·Π½Π΅Π½Π½ΠΎ Π²Π°ΠΆΠ½Ρ‹Π΅ ΠΏΠΎΠΊΠ°Π·Π°Ρ‚Π΅Π»ΠΈ ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΠΈ. ΠŸΠΎΠ΄Ρ€ΠΎΠ±Π½Π΅Π΅ ΠΎΠ± этом Ρ‡ΠΈΡ‚Π°ΠΉ Π² Ρ€Π°Π·Π΄Π΅Π»Π΅ "Π‘Π΅Π½Ρ‡ΠΌΠ°Ρ€ΠΊΠΈΠ½Π³ ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΠΈ".

Why should I use Neural Magic's DeepSparse with YOLO11 for object detection tasks?

Integrating Neural Magic's DeepSparse with YOLO11 offers several benefits:

  • Enhanced Inference Speed: Achieves up to 525 FPS, significantly speeding up YOLO11's capabilities.
  • ΠžΠΏΡ‚ΠΈΠΌΠΈΠ·ΠΈΡ€ΠΎΠ²Π°Π½Π½Π°Ρ ΡΡ„Ρ„Π΅ΠΊΡ‚ΠΈΠ²Π½ΠΎΡΡ‚ΡŒ ΠΌΠΎΠ΄Π΅Π»ΠΈ: Π˜ΡΠΏΠΎΠ»ΡŒΠ·ΡƒΠΉ Ρ‚Π΅Ρ…Π½ΠΈΠΊΠΈ разрСТСнности, ΠΎΠ±Ρ€Π΅Π·ΠΊΠΈ ΠΈ квантования, Ρ‡Ρ‚ΠΎΠ±Ρ‹ ΡƒΠΌΠ΅Π½ΡŒΡˆΠΈΡ‚ΡŒ Ρ€Π°Π·ΠΌΠ΅Ρ€ ΠΌΠΎΠ΄Π΅Π»ΠΈ ΠΈ Π²Ρ‹Ρ‡ΠΈΡΠ»ΠΈΡ‚Π΅Π»ΡŒΠ½Ρ‹Π΅ потрСбности, сохранив ΠΏΡ€ΠΈ этом Ρ‚ΠΎΡ‡Π½ΠΎΡΡ‚ΡŒ.
  • Высокая ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΡŒ Π½Π° стандартных процСссорах: ΠŸΡ€Π΅Π΄Π»Π°Π³Π°Π΅Ρ‚ GPU-ΠΏΠΎΠ΄ΠΎΠ±Π½ΡƒΡŽ ΠΏΡ€ΠΎΠΈΠ·Π²ΠΎΠ΄ΠΈΡ‚Π΅Π»ΡŒΠ½ΠΎΡΡ‚ΡŒ Π½Π° экономичном CPU ΠΎΠ±ΠΎΡ€ΡƒΠ΄ΠΎΠ²Π°Π½ΠΈΠΈ.
  • ΠžΠΏΡ‚ΠΈΠΌΠΈΠ·ΠΈΡ€ΠΎΠ²Π°Π½Π½Π°Ρ интСграция: Π£Π΄ΠΎΠ±Π½Ρ‹Π΅ инструмСнты для Π»Π΅Π³ΠΊΠΎΠ³ΠΎ развСртывания ΠΈ ΠΈΠ½Ρ‚Π΅Π³Ρ€Π°Ρ†ΠΈΠΈ.
  • Flexibility: Supports both standard and sparsity-optimized YOLO11 models.
  • ЭкономичСски эффСктивный: Π‘ΠΎΠΊΡ€Π°Ρ‰Π°Π΅Ρ‚ эксплуатационныС расходы Π·Π° счСт эффСктивного использования рСсурсов.

For a deeper dive into these advantages, visit the Benefits of Integrating Neural Magic's DeepSparse with YOLO11 section.


πŸ“… Created 9 months ago ✏️ Updated 12 days ago

ΠšΠΎΠΌΠΌΠ΅Π½Ρ‚Π°Ρ€ΠΈΠΈ