DeepX Export for Ultralytics YOLO Models
Deploying computer vision models on specialized NPU hardware requires a compatible and optimized model format. Exporting Ultralytics YOLO models to DeepX format enables efficient, INT8-quantized inference on DeepX NPU accelerators. This guide walks you through converting your YOLO models to DeepX format and deploying them on DeepX-powered hardware.
What is DeepX?
DeepX is an AI semiconductor company specializing in Neural Processing Units (NPUs) designed for power-efficient deep learning inference at the edge. DeepX NPUs are engineered for demanding embedded and industrial AI applications, delivering high throughput with minimal power consumption. Their hardware is well suited for deployment scenarios where cloud connectivity is unreliable or undesirable, such as robotics, smart cameras, and industrial automation systems.
DeepX Export Format
The DeepX export produces a compiled .dxnn model binary that is optimized for execution on DeepX NPU hardware. The compilation pipeline uses the dx_com toolkit to perform INT8 quantization and hardware-specific optimization, generating a self-contained model directory ready for deployment.
Key Features of DeepX Models
DeepX models offer several advantages for edge deployment:
- INT8 Quantization: Models are quantized to INT8 precision during export, significantly reducing model size and maximizing NPU throughput. Learn more about model quantization.
- NPU-Optimized: The .dxnn format is specifically compiled for DeepX NPU hardware, leveraging dedicated acceleration units for fast, efficient inference.
- Low Power Consumption: By offloading inference to the NPU, DeepX models consume far less power than equivalent CPU or GPU inference.
- Calibration-Based Accuracy: The export uses EMA-based calibration with real dataset images to minimize accuracy loss during quantization.
- Self-Contained Output: The exported model directory bundles the compiled binary, calibration config, and metadata for straightforward deployment.
Supported Tasks
All standard Ultralytics tasks are supported for DeepX export across YOLO26, YOLO11, and YOLOv8 model families.
| Task | Supported |
|---|---|
| Object Detection | ✅ |
| Segmentation | ✅ |
| Pose Estimation | ✅ |
| OBB Detection | ✅ |
| Classification | ✅ |
Export to DeepX: Converting Your YOLO Model
Export an Ultralytics YOLO model to DeepX format and run inference with the exported model.
DeepX export is only supported on x86-64 Linux machines. ARM64 (aarch64) is not supported for the export step.
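Because the compiler only runs on x86-64 Linux, it can be useful to fail fast before attempting an export. A minimal sketch of such a guard (this helper is illustrative and not part of the Ultralytics API):

```python
import platform


def is_supported_export_host(system: str, machine: str) -> bool:
    """The dx_com compiler runs only on x86-64 Linux hosts."""
    return system == "Linux" and machine == "x86_64"


# Evaluate the current host before calling model.export(format="deepx")
supported = is_supported_export_host(platform.system(), platform.machine())
```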
Installation
To install the required packages, run:
```bash
# Install the required package for YOLO
pip install ultralytics
```

The dx_com compiler package will be automatically installed from the DeepX SDK repository on first export. For detailed instructions and best practices related to the installation process, check our Ultralytics Installation guide. If you encounter any difficulties while installing the required packages for YOLO, consult our Common Issues guide for solutions and tips.
Usage
```python
from ultralytics import YOLO

# Load the YOLO26 model
model = YOLO("yolo26n.pt")

# Export the model to DeepX format (int8=True is enforced automatically)
model.export(format="deepx")  # creates 'yolo26n_deepx_model/'
```

Export Arguments
| Argument | Type | Default | Description |
|---|---|---|---|
| format | str | 'deepx' | Target format for the exported model, defining compatibility with DeepX NPU hardware. |
| imgsz | int or tuple | 640 | Desired image size for the model input. DeepX export requires a square input — pass an integer (e.g., 640) or a tuple where height equals width. |
| int8 | bool | True | Enables INT8 quantization. Required for DeepX export — automatically set to True if not specified. |
| data | str | 'coco128.yaml' | Dataset configuration file used for INT8 calibration. Specifies the calibration image source. |
| device | str | None | Specifies the device for exporting: GPU (device=0) or CPU (device=cpu). |
| optimize | bool | False | Enables a higher compiler optimization level, which reduces inference latency at the cost of longer compilation time. |
Always run DeepX export on an x86-64 Linux host. The dx_com compiler does not support ARM64.
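Since the export requires a square input, a small helper can normalize and validate the imgsz argument up front. A sketch (this helper is illustrative, not part of the Ultralytics API):

```python
def normalize_imgsz(imgsz):
    """Return a (height, width) pair for DeepX export, which requires square inputs.

    Accepts an int (e.g., 640) or an equal-sided tuple/list; anything
    non-square is rejected, mirroring the constraint described above.
    """
    if isinstance(imgsz, int):
        return (imgsz, imgsz)
    h, w = imgsz
    if h != w:
        raise ValueError(f"DeepX export requires a square input, got {h}x{w}")
    return (h, w)
```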
For more details about the export process, visit the Ultralytics documentation page on exporting.
Output Structure
After a successful export, a model directory is created with the following layout:
```
yolo26n_deepx_model/
├── yolo26n.dxnn     # Compiled DeepX model binary (NPU executable)
├── config.json      # Calibration and preprocessing configuration
└── metadata.yaml    # Model metadata (classes, image size, task, etc.)
```

The .dxnn file is the compiled model binary that the dx_engine runtime loads directly on the NPU. The metadata.yaml file contains class names, image size, and other information used by the Ultralytics inference pipeline.
Deploying Exported YOLO DeepX Models
Once you've successfully exported your Ultralytics YOLO model to DeepX format, the next step is deploying these models on DeepX NPU hardware.
Runtime Installation
Inference requires the DeepX NPU driver, the libdxrt runtime, and the dx_engine Python package.
DeepX runtime is only supported on x86-64 Linux machines and ARM64 Debian Trixie machines (Raspberry Pi 5).
```bash
# Install the NPU driver and libdxrt runtime
sudo apt update
wget https://github.com/DEEPX-AI/dx_rt_npu_linux_driver/raw/main/release/2.4.0/dxrt-driver-dkms_2.4.0-2_all.deb
sudo apt install ./dxrt-driver-dkms_2.4.0-2_all.deb
wget https://github.com/DEEPX-AI/dx_rt/raw/main/release/3.3.2/libdxrt_3.3.2_all.deb
sudo apt install ./libdxrt_3.3.2_all.deb

# Build the dx_engine wheel
cd /usr/share/libdxrt/python_package && sudo ./make_whl.sh

# Install the bundled dx_engine Python wheel
pip install dx_engine-*.whl
```

Verify the runtime is installed correctly with dxrt-cli --version. You should see output similar to:
```
DXRT v3.3.2
Minimum Driver Versions
Device Driver: v2.4.0
PCIe Driver: v2.2.0
Firmware: v2.5.2
Minimum Compiler Versions
Compiler: v1.18.1
.dxnn File Format: v6
```

Usage
```python
from ultralytics import YOLO

# Load the exported DeepX model
model = YOLO("yolo26n_deepx_model")

# Run inference
results = model("https://ultralytics.com/images/bus.jpg")

# Process results
for r in results:
    print(f"Detected {len(r.boxes)} objects")
    r.show()
```

Visualizing with dxtron
dxtron is DeepX's graph visualizer for inspecting the compiled .dxnn model.
Install dxtron on x86-64 Linux by downloading the .deb package from the DeepX SDK and installing it via dpkg:
```bash
wget https://sdk.deepx.ai/release/dxtron/v2.0.1/dxtron_2.0.1_amd64.deb
sudo dpkg -i dxtron_2.0.1_amd64.deb
```

Then open your exported model:

```bash
dxtron yolo26n_deepx_model/yolo26n.dxnn
```

dxtron is only available for x86-64 Linux. ARM64/aarch64 and non-Linux platforms are not supported.
Benchmarks
The Ultralytics team benchmarked YOLO26 models, comparing speed and accuracy between PyTorch and DeepX.
| Model | Format | Status | Size (MB) | metrics/mAP50-95(B) | Inference time (ms/im) |
|---|---|---|---|---|---|
| YOLO26n | PyTorch | ✅ | 5.3 | 0.4760 | 315.2 |
| YOLO26n | DeepX | ✅ | 6.6 | 0.4660 | 34.6 |
| YOLO26n-seg | PyTorch | ✅ | 6.5 | 0.4080 | 485.4 |
| YOLO26n-seg | DeepX | ✅ | 7.9 | 0.3920 | 53.8 |
| YOLO26n-pose | PyTorch | ✅ | 7.6 | 0.4230 | 506.3 |
| YOLO26n-pose | DeepX | ✅ | 8.8 | 0.4590 | 37.6 |
| YOLO26n-obb | PyTorch | ✅ | 5.7 | 0.817 | 1094.4 |
| YOLO26n-obb | DeepX | ✅ | 7.3 | 0.783 | 56.4 |
| Model | Format | Status | Size (MB) | acc (top1) | acc (top5) | Inference time (ms/im) |
|---|---|---|---|---|---|---|
| YOLO26n-cls | PyTorch | ✅ | 5.6 | 0.431 | 0.716 | 23.8 |
| YOLO26n-cls | DeepX | ✅ | 5.9 | 0.333 | 0.686 | 2.7 |
Validation for the above benchmarks was performed using coco128 for detection, coco128-seg for segmentation, coco8-pose for pose estimation, imagenet100 for classification, and dota128 for OBB models. Inference times do not include pre/post-processing.
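The latency gains in the tables above can be sanity-checked directly from the reported per-image times, which work out to speedups of roughly 9x to 19x:

```python
# Per-image inference times (ms) from the benchmark table above
latency_ms = {
    "YOLO26n": {"PyTorch": 315.2, "DeepX": 34.6},
    "YOLO26n-seg": {"PyTorch": 485.4, "DeepX": 53.8},
    "YOLO26n-pose": {"PyTorch": 506.3, "DeepX": 37.6},
    "YOLO26n-obb": {"PyTorch": 1094.4, "DeepX": 56.4},
}

for name, t in latency_ms.items():
    speedup = t["PyTorch"] / t["DeepX"]
    print(f"{name}: {speedup:.1f}x faster on DeepX")
```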
To get the best inference throughput from the DX-M1 NPU connected to a Raspberry Pi 5, open the boot configuration file and enable PCIe Gen 3 support.
```bash
sudo nano /boot/firmware/config.txt
```

Add the following lines at the end of the file:

```
dtparam=pciex1
dtparam=pciex1_gen=3
```

Save and exit (Ctrl+X, then Y, then Enter), then reboot:

```bash
sudo reboot
```

Recommended Workflow
- Train your model using Ultralytics Train Mode
- Export to DeepX format using model.export(format="deepx")
- Validate accuracy with yolo val to verify minimal quantization loss
- Predict using yolo predict for qualitative validation
- Deploy the exported _deepx_model/ directory to DeepX NPU hardware using the dx_engine runtime
Real-World Applications
YOLO models deployed on DeepX NPU hardware are well suited for a wide range of edge AI applications:
- Smart Surveillance: Real-time object detection for security and monitoring systems with low power consumption and no cloud dependency.
- Industrial Automation: On-device quality control, defect detection, and process monitoring in factory environments.
- Robotics: Vision-based navigation, obstacle avoidance, and object recognition on autonomous robots and drones.
- Smart Agriculture: Crop health monitoring, pest detection, and yield estimation using computer vision in agriculture.
- Retail Analytics: Customer flow analysis, shelf monitoring, and inventory tracking with real-time edge inference.
Summary
In this guide, you've learned how to export Ultralytics YOLO models to DeepX format and deploy them on DeepX NPU hardware. The export pipeline uses INT8 calibration and the dx_com compiler to produce a hardware-optimized .dxnn binary, while the dx_engine runtime handles inference on the device.
The combination of Ultralytics YOLO and DeepX's NPU technology provides an effective solution for running advanced computer vision workloads on embedded and edge devices — delivering high throughput with low power consumption for real-time applications.
For further details on usage, visit the DeepX official website.
Also, if you'd like to know more about other Ultralytics YOLO integrations, visit our integration guide page. You'll find plenty of useful resources and insights there.
FAQ
How do I export my Ultralytics YOLO model to DeepX format?
You can export your model using the export() method in Python or via the CLI. The export automatically enables INT8 quantization and uses a calibration dataset to minimize accuracy loss. The dx_com compiler package is installed automatically if not already present.
```python
from ultralytics import YOLO

model = YOLO("yolo26n.pt")
model.export(format="deepx")
```

Why does DeepX export require INT8 quantization?
DeepX NPUs are designed to execute INT8 computations at maximum efficiency. The dx_com compiler quantizes the model during export using EMA-based calibration with real dataset images, enabling the NPU to deliver its full performance. INT8 is always enforced for DeepX exports — if you pass int8=False, it will be overridden with a warning.
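The override behavior described above can be pictured roughly as follows (an illustrative sketch of the behavior only, not the actual Ultralytics source):

```python
import warnings


def resolve_int8_flag(int8):
    """DeepX export always quantizes to INT8; an explicit False is overridden with a warning."""
    if int8 is False:
        warnings.warn("DeepX export requires INT8 quantization; overriding int8=False.")
    return True
```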
What platforms are supported for DeepX export?
DeepX model export (compilation) requires an x86-64 Linux host. The export step is not supported on ARM64 (aarch64) and Windows machines. Inference using the exported .dxnn model can be run on any Linux platform (x86-64 and ARM64) supported by the dx_engine runtime.
What is the output of a DeepX export?
The export creates a directory (e.g., yolo26n_deepx_model/) containing:
yolo26n.dxnn— the compiled NPU binaryconfig.json— calibration and preprocessing settingsmetadata.yaml— model metadata including class names and image size
Can I deploy custom-trained models on DeepX hardware?
Yes. Any model trained using Ultralytics Train Mode and exported with format="deepx" can be deployed on DeepX NPU hardware, provided it uses supported layer operations. Export supports detection, segmentation, pose estimation, oriented bounding box (OBB), and classification tasks.
How many calibration images should I use for DeepX export?
The DeepX export pipeline uses every image in the calibration dataset (after fraction filtering) with the EMA calibration method. A few hundred images is usually sufficient for good quantization accuracy. Point the data argument at a smaller dataset (or set fraction below 1.0) if compilation time becomes a concern on large datasets.
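A rough sketch of how a fraction of the calibration images might be selected (illustrative only; the actual Ultralytics sampling logic may differ):

```python
def select_calibration_subset(image_paths, fraction=1.0):
    """Keep roughly the first `fraction` of the calibration images (at least one)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    n = max(1, round(len(image_paths) * fraction))
    return image_paths[:n]
```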
How do I install the DeepX runtime for inference?
The DeepX runtime is not bundled with ultralytics and must be installed separately before running inference. On x86-64 Linux machines and ARM64 Debian Trixie machines (Raspberry Pi 5), install the NPU driver (dxrt-driver-dkms) and runtime (libdxrt) from the DEEPX-AI GitHub releases, then install the bundled dx_engine Python wheel. See the Runtime Installation section above for step-by-step commands.