Ultralytics YOLO11 on NVIDIA Jetson using DeepStream SDK and TensorRT

Watch: How to Run Multiple Streams with DeepStream SDK on Jetson Nano using Ultralytics YOLO11

This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLO11 on NVIDIA Jetson devices using DeepStream SDK and TensorRT. Here we use TensorRT to maximize the inference performance on the Jetson platform.

DeepStream on NVIDIA Jetson

Note

This guide has been tested with both the Seeed Studio reComputer J4012, which is based on the NVIDIA Jetson Orin NX 16GB running JetPack release JP5.1.3, and the Seeed Studio reComputer J1020 v2, which is based on the NVIDIA Jetson Nano 4GB running JetPack release JP4.6.4. It is expected to work across the entire NVIDIA Jetson hardware lineup, including the latest and legacy devices.

What is NVIDIA DeepStream?

NVIDIA's DeepStream SDK is a complete streaming analytics toolkit based on GStreamer for AI-based multi-sensor processing, video, audio, and image understanding. It's ideal for vision AI developers, software partners, startups, and OEMs building IVA (Intelligent Video Analytics) apps and services. You can now create stream-processing pipelines that incorporate neural networks and other complex processing tasks like tracking, video encoding/decoding, and video rendering. These pipelines enable real-time analytics on video, image, and sensor data. DeepStream's multi-platform support gives you a faster, easier way to develop vision AI applications and services on-premise, at the edge, and in the cloud.

Prerequisites

Before you start following this guide:

Visit our documentation, Quick Start Guide: NVIDIA Jetson with Ultralytics YOLO11, to set up your NVIDIA Jetson device with Ultralytics YOLO11
Install the DeepStream SDK according to your JetPack version
- For JetPack 4.6.4, install DeepStream 6.0.1
- For JetPack 5.1.3, install DeepStream 6.3

Tip

In this guide we used the Debian package method to install the DeepStream SDK on the Jetson device. You can also visit DeepStream SDK on Jetson (Archived) to access legacy versions of DeepStream.

DeepStream Configuration for YOLO11

Here we use the marcoslucianops/DeepStream-Yolo GitHub repository, which includes NVIDIA DeepStream SDK support for YOLO models. We appreciate the efforts of marcoslucianops for his contributions!

Install dependencies
```
pip install cmake
pip install onnxsim
```

Clone the following repository

```
git clone https://github.com/marcoslucianops/DeepStream-Yolo
cd DeepStream-Yolo
```

Download the Ultralytics YOLO11 detection model (.pt) of your choice from the YOLO11 releases. Here we use yolov8s.pt.
```
wget https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8s.pt
```
Note
You can also use a custom-trained YOLO11 model.
Convert the model to ONNX
```
python3 utils/export_yoloV8.py -w yolov8s.pt
```
Pass the arguments below to the command above
For DeepStream 6.0.1, use opset 12 or lower. The default opset is 16.
```
--opset 12
```
To change the inference size (default: 640)
```
-s SIZE
--size SIZE
-s HEIGHT WIDTH
--size HEIGHT WIDTH
```
Example for 1280:
```
-s 1280
or
-s 1280 1280
```
To simplify the ONNX model (DeepStream >= 6.0)
```
--simplify
```
To use a dynamic batch size (DeepStream >= 6.1)
```
--dynamic
```
To use a static batch size (example for batch size = 4)
```
--batch 4
```
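The export options above can be combined in a single command. The following is a minimal sketch that assembles that command in Python; `build_export_command` is a hypothetical helper (it is not part of DeepStream-Yolo), it only uses the flags listed above, and it assumes that `--dynamic` and `--batch` are alternatives rather than options to combine.

```python
def build_export_command(weights, opset=None, size=None, simplify=False,
                         dynamic=False, batch=None):
    """Assemble the argument list for utils/export_yoloV8.py (hypothetical helper)."""
    # Assumption: dynamic and static batch sizes are alternatives, so reject both.
    if dynamic and batch is not None:
        raise ValueError("choose either a dynamic or a static batch size, not both")
    cmd = ["python3", "utils/export_yoloV8.py", "-w", weights]
    if opset is not None:
        cmd += ["--opset", str(opset)]
    if size is not None:
        # size is either a single int or a (height, width) pair
        dims = [size] if isinstance(size, int) else list(size)
        cmd += ["-s"] + [str(d) for d in dims]
    if simplify:
        cmd.append("--simplify")
    if dynamic:
        cmd.append("--dynamic")
    if batch is not None:
        cmd += ["--batch", str(batch)]
    return cmd


# Example for DeepStream 6.0.1: opset 12, 1280 x 1280 input, simplified model
command = build_export_command("yolov8s.pt", opset=12, size=(1280, 1280), simplify=True)
print(" ".join(command))
```

The printed line can be pasted into the shell, or the list can be passed to `subprocess.run` from inside the DeepStream-Yolo folder.
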
Set the CUDA version according to the JetPack version installed

For JetPack 4.6.4:
```
export CUDA_VER=10.2
```
For JetPack 5.1.3:
```
export CUDA_VER=11.4
```
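The version pairings used in this guide can be kept in one small lookup so that the DeepStream release and `CUDA_VER` always match the installed JetPack. This is only a sketch: `JETPACK_SETUP` and `setup_for_jetpack` are hypothetical names, and the table contains only the two JetPack releases covered here.

```python
# Versions stated in this guide; other JetPack releases are deliberately not guessed.
JETPACK_SETUP = {
    "4.6.4": {"deepstream": "6.0.1", "cuda_ver": "10.2"},
    "5.1.3": {"deepstream": "6.3", "cuda_ver": "11.4"},
}


def setup_for_jetpack(jetpack_version):
    """Return the DeepStream and CUDA versions for a JetPack release covered here."""
    try:
        return JETPACK_SETUP[jetpack_version]
    except KeyError:
        raise ValueError(
            "JetPack {} is not covered by this guide".format(jetpack_version)
        ) from None


setup = setup_for_jetpack("5.1.3")
# Line to run in the shell before compiling the library
print("export CUDA_VER={}".format(setup["cuda_ver"]))
```
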

Compile the library

```
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
```

Edit the config_infer_primary_yoloV8.txt file according to your model (for YOLOv8s with 80 classes)
```
[property]
...
onnx-file=yolov8s.onnx
...
num-detected-classes=80
...
```

Edit the deepstream_app_config.txt file

```
...
[primary-gie]
...
config-file=config_infer_primary_yoloV8.txt
```

You can also change the video source in the deepstream_app_config.txt file. Here a default video file is loaded.
```
...
[source0]
...
uri=file:///opt/nvidia/deepstream/deepstream/samples/streams/sample_1080p_h264.mp4
```

Run Inference

```
deepstream-app -c deepstream_app_config.txt
```

Note

It will take a long time to generate the TensorRT engine file before inference starts, so please be patient.

Tip

If you want to convert the model to FP16 precision, simply set model-engine-file=model_b1_gpu0_fp16.engine and network-mode=2 inside config_infer_primary_yoloV8.txt

INT8 Calibration

If you want to use INT8 precision for inference, you need to follow the steps below

Set the OPENCV environment variable
```
export OPENCV=1
```

Compile the library

```
make -C nvdsinfer_custom_impl_Yolo clean && make -C nvdsinfer_custom_impl_Yolo
```

For the COCO dataset, download val2017, extract it, and move it to the DeepStream-Yolo folder

Create a new directory for calibration images
```
mkdir calibration
```
Run the following to select 1000 random images from the COCO dataset for calibration
```
for jpg in $(ls -1 val2017/*.jpg | sort -R | head -1000); do \
    cp ${jpg} calibration/; \
done
```
Note
NVIDIA recommends at least 500 images for good accuracy. In this example, 1000 images are chosen to get better accuracy (more images generally means better accuracy). You can change the number via head -1000; for example, use head -2000 for 2000 images. This process can take a long time.
Create the calibration.txt file with all selected images
```
realpath calibration/*jpg > calibration.txt
```
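For readers who prefer a script over the shell loop, the selection and listing steps above can be sketched in Python. `select_calibration_images` is a hypothetical helper, not part of DeepStream-Yolo; it assumes the images are `.jpg` files directly inside the source folder and adds an optional seed so the random sample can be reproduced.

```python
import random
import shutil
from pathlib import Path


def select_calibration_images(source_dir, target_dir, list_file, count=1000, seed=None):
    """Copy a random sample of .jpg files and write their absolute paths to list_file."""
    images = sorted(Path(source_dir).glob("*.jpg"))
    # Take at most `count` images; a fixed seed makes the sample reproducible.
    sample = random.Random(seed).sample(images, min(count, len(images)))
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = [shutil.copy(str(image), str(target / image.name)) for image in sample]
    # One absolute path per line, sorted like the glob in the realpath command.
    lines = sorted(str(Path(path).resolve()) for path in copied)
    Path(list_file).write_text("".join(line + "\n" for line in lines))
    return lines
```

Calling `select_calibration_images("val2017", "calibration", "calibration.txt")` from the DeepStream-Yolo folder mirrors the `mkdir`, `cp`, and `realpath` commands above.
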
Set environment variables
```
export INT8_CALIB_IMG_PATH=calibration.txt
export INT8_CALIB_BATCH_SIZE=1
```
Note
Higher INT8_CALIB_BATCH_SIZE values will result in more accuracy and faster calibration speed. Set it according to your GPU memory.

Update the config_infer_primary_yoloV8.txt file

From

```
...
model-engine-file=model_b1_gpu0_fp32.engine
#int8-calib-file=calib.table
...
network-mode=0
...
```

To

```
...
model-engine-file=model_b1_gpu0_int8.engine
int8-calib-file=calib.table
...
network-mode=1
...
```
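The precision switch touches the same three keys each time, so it can be scripted. The sketch below rewrites those keys in the text of config_infer_primary_yoloV8.txt; `PRECISION_SETTINGS` and `set_precision` are hypothetical names, and the values are limited to the engine file names and network-mode numbers that appear in this guide (FP32 = 0, INT8 = 1, FP16 = 2).

```python
# network-mode values and engine file names as used in this guide
PRECISION_SETTINGS = {
    "fp32": {"network-mode": "0", "model-engine-file": "model_b1_gpu0_fp32.engine"},
    "int8": {"network-mode": "1", "model-engine-file": "model_b1_gpu0_int8.engine"},
    "fp16": {"network-mode": "2", "model-engine-file": "model_b1_gpu0_fp16.engine"},
}


def set_precision(config_text, precision):
    """Return config_text with the precision-related keys rewritten."""
    settings = PRECISION_SETTINGS[precision]
    result = []
    for line in config_text.splitlines():
        key, sep, value = line.lstrip("# ").partition("=")
        key = key.strip()
        if sep and key == "int8-calib-file":
            # The calibration table is active for INT8 and commented out otherwise.
            prefix = "" if precision == "int8" else "#"
            line = "{}{}={}".format(prefix, key, value)
        elif sep and key in settings and not line.lstrip().startswith("#"):
            line = "{}={}".format(key, settings[key])
        result.append(line)
    return "\n".join(result)


fp32_config = (
    "model-engine-file=model_b1_gpu0_fp32.engine\n"
    "#int8-calib-file=calib.table\n"
    "network-mode=0"
)
print(set_precision(fp32_config, "int8"))
```

Lines that are not key=value pairs, such as section headers, pass through unchanged.
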

Run Inference

```
deepstream-app -c deepstream_app_config.txt
```

Multi-Stream Setup

To set up multiple streams under a single DeepStream application, you can make the following changes to the deepstream_app_config.txt file

Change the rows and columns to build a grid display according to the number of streams you want to have. For example, for 4 streams, we can add 2 rows and 2 columns.
```
[tiled-display]
rows=2
columns=2
```
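For other stream counts, the grid only needs rows x columns to be at least the number of streams. The sketch below picks a near-square layout; `tiled_display_grid` is a hypothetical helper, and the near-square choice is an assumption made for readability of the tiled view, not a DeepStream requirement.

```python
import math


def tiled_display_grid(num_streams):
    """Return (rows, columns) for a near-square grid that fits num_streams tiles."""
    if num_streams < 1:
        raise ValueError("num_streams must be at least 1")
    columns = math.ceil(math.sqrt(num_streams))
    rows = math.ceil(num_streams / columns)
    return rows, columns


rows, columns = tiled_display_grid(4)
# Prints the [tiled-display] values for 4 streams: rows=2, columns=2
print("[tiled-display]\nrows={}\ncolumns={}".format(rows, columns))
```
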

Set num-sources=4 and add the uri of all 4 streams

```
[source0]
enable=1
type=3
uri=<path_to_video>
uri=<path_to_video>
uri=<path_to_video>
uri=<path_to_video>
num-sources=4
```

Run Inference

```
deepstream-app -c deepstream_app_config.txt
```

Benchmark Results

The following table summarizes how YOLOv8s models perform at different TensorRT precision levels with an input size of 640x640 on NVIDIA Jetson Orin NX 16GB.

Model Name	Precision	Inference Time (ms/im)	FPS
YOLOv8s	FP32	15.63	64
	FP16	7.94	126
	INT8	5.53	181
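
The two numeric columns are consistent with each other if FPS is taken as the reciprocal of the per-image inference time (1000 / ms), which is an assumption about how the table was produced. The short check below recomputes FPS from the latencies and also derives the speedup relative to FP32; the numbers are copied from the table and nothing else is measured.

```python
# Inference times (ms per image) and FPS values copied from the table above
benchmarks = {"FP32": (15.63, 64), "FP16": (7.94, 126), "INT8": (5.53, 181)}


def fps_from_latency(ms_per_image):
    """Convert a per-image inference time in milliseconds to frames per second."""
    return 1000.0 / ms_per_image


for precision, (latency_ms, reported_fps) in benchmarks.items():
    computed_fps = round(fps_from_latency(latency_ms))
    speedup = benchmarks["FP32"][0] / latency_ms
    print("{}: {} FPS (table: {}), {:.2f}x vs FP32".format(
        precision, computed_fps, reported_fps, speedup))
```

Under that assumption, FP16 is roughly twice as fast as FP32 and INT8 close to three times as fast.
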

Acknowledgements

This guide was initially created by our friends at Seeed Studio, Lakshantha and Elaine.

FAQ

How do I set up Ultralytics YOLO11 on an NVIDIA Jetson device?

To set up Ultralytics YOLO11 on an NVIDIA Jetson device, you first need to install the DeepStream SDK compatible with your JetPack version. Follow the step-by-step guide in our Quick Start Guide to configure your NVIDIA Jetson for YOLO11 deployment.

What is the benefit of using TensorRT with YOLO11 on NVIDIA Jetson?

Using TensorRT with YOLO11 optimizes the model for inference, significantly reducing latency and improving throughput on NVIDIA Jetson devices. TensorRT provides high-performance, low-latency deep learning inference through layer fusion, precision calibration, and kernel auto-tuning. This leads to faster and more efficient execution, particularly useful for real-time applications like video analytics and autonomous machines.

Can I run Ultralytics YOLO11 with DeepStream SDK across different NVIDIA Jetson hardware?

Yes, the guide for deploying Ultralytics YOLO11 with the DeepStream SDK and TensorRT is compatible across the entire NVIDIA Jetson lineup. This includes devices like the Jetson Orin NX 16GB with JetPack 5.1.3 and the Jetson Nano 4GB with JetPack 4.6.4. Refer to the section DeepStream Configuration for YOLO11 for detailed steps.

How can I convert a YOLO11 model to ONNX for DeepStream?

To convert a YOLO11 model to ONNX format for deployment with DeepStream, use the utils/export_yoloV8.py script from the DeepStream-Yolo repository.

Here's an example command:

```
python3 utils/export_yoloV8.py -w yolov8s.pt --opset 12 --simplify
```

For more details on model conversion, check out our model export section.

What are the performance benchmarks for YOLO on NVIDIA Jetson Orin NX?

The performance of YOLO11 models on NVIDIA Jetson Orin NX 16GB varies based on TensorRT precision levels. For example, YOLOv8s models achieve:

FP32 precision: 15.63 ms/im, 64 FPS
FP16 precision: 7.94 ms/im, 126 FPS
INT8 precision: 5.53 ms/im, 181 FPS

These benchmarks underscore the efficiency and capability of using TensorRT-optimized YOLO11 models on NVIDIA Jetson hardware. For further details, see our Benchmark Results section.

