
How to Export Non-YOLO PyTorch Models with Ultralytics

Deploying PyTorch models to production usually means juggling a different exporter for every target: torch.onnx.export for ONNX, coremltools for Apple devices, onnx2tf for TensorFlow, pnnx for NCNN, and so on. Each tool has its own API, dependency quirks, and output conventions.

Ultralytics ships standalone export utilities that wrap multiple backends behind one consistent interface. You can export any torch.nn.Module, including timm image models, torchvision classifiers and detectors, or your own custom architectures, to ONNX, TorchScript, OpenVINO, CoreML, NCNN, PaddlePaddle, MNN, ExecuTorch, and TensorFlow SavedModel without learning each backend separately.

Why Use Ultralytics for Non-YOLO Export?

  • One API across 10 formats: learn a single calling convention instead of a dozen.
  • Shared utility surface: the export helpers live under ultralytics.utils.export, so once the backend packages are installed you can keep the same calling pattern across formats.
  • Same code path as YOLO exports: the same helpers power every Ultralytics YOLO export.
  • FP16 and INT8 quantization built in for formats that support it (OpenVINO, CoreML, MNN, NCNN).
  • Works on CPU: no GPU required for the export step itself, so you can run it locally on any laptop.

Quick Start

The fastest path is a two-line export to ONNX with no YOLO code and no setup beyond pip install ultralytics onnx timm:

import timm
import torch

from ultralytics.utils.export import torch2onnx

model = timm.create_model("resnet18", pretrained=True).eval()
torch2onnx(model, torch.randn(1, 3, 224, 224), output_file="resnet18.onnx")

Supported Export Formats

The torch2* functions take a standard torch.nn.Module and an example input tensor, while MNN, TF SavedModel, and TF Frozen Graph convert an intermediate ONNX or Keras artifact instead. In either case, no YOLO-specific attributes are required.

| Format          | Function            | Install                           | Output                       |
|-----------------|---------------------|-----------------------------------|------------------------------|
| ONNX            | torch2onnx()        | pip install onnx                  | .onnx file                   |
| TorchScript     | torch2torchscript() | included with PyTorch             | .torchscript file            |
| OpenVINO        | torch2openvino()    | pip install openvino              | _openvino_model/ directory   |
| CoreML          | torch2coreml()      | pip install coremltools           | .mlpackage                   |
| TF SavedModel   | onnx2saved_model()  | see detailed requirements below   | _saved_model/ directory      |
| TF Frozen Graph | keras2pb()          | see detailed requirements below   | .pb file                     |
| NCNN            | torch2ncnn()        | pip install ncnn pnnx             | _ncnn_model/ directory       |
| MNN             | onnx2mnn()          | pip install MNN                   | .mnn file                    |
| PaddlePaddle    | torch2paddle()      | pip install paddlepaddle x2paddle | _paddle_model/ directory     |
| ExecuTorch      | torch2executorch()  | pip install executorch            | _executorch_model/ directory |

ONNX as an intermediate format

MNN, TF SavedModel, and TF Frozen Graph exports go through ONNX as an intermediate step. Export to ONNX first, then convert.

Embedding metadata

Several export functions accept an optional metadata dictionary (e.g., torch2torchscript(..., metadata={"author": "me"})) that embeds custom key-value pairs into the exported artifact where the format supports it.

Step-by-Step Examples

Every example below uses the same setup: a pretrained ResNet-18 from timm in evaluation mode:

import timm
import torch

model = timm.create_model("resnet18", pretrained=True).eval()
im = torch.randn(1, 3, 224, 224)

Always call model.eval() before exporting

Dropout, batch normalization, and similar layers behave differently in training and evaluation mode. Skipping .eval() bakes training-mode behavior into the export and produces incorrect outputs.

Export to ONNX

from ultralytics.utils.export import torch2onnx

torch2onnx(model, im, output_file="resnet18.onnx")

For dynamic batch size, pass a dynamic dictionary:

torch2onnx(model, im, output_file="resnet18_dyn.onnx", dynamic={"images": {0: "batch_size"}})

The default opset is 14 and the default input name is "images". Override with the opset, input_names, or output_names arguments.

Export to TorchScript

No extra dependencies needed. Uses torch.jit.trace under the hood.

from ultralytics.utils.export import torch2torchscript

torch2torchscript(model, im, output_file="resnet18.torchscript")

Export to OpenVINO

from ultralytics.utils.export import torch2openvino

ov_model = torch2openvino(model, im, output_dir="resnet18_openvino_model")

The directory contains a fixed-name model.xml and model.bin pair:

resnet18_openvino_model/
├── model.xml
└── model.bin

Pass dynamic=True for dynamic input shapes, half=True for FP16, or int8=True for INT8 quantization. INT8 additionally requires a calibration_dataset argument.

Requires openvino>=2024.0.0 (or >=2025.2.0 on macOS 15.4+) and torch>=2.1.

Export to CoreML

import coremltools as ct

from ultralytics.utils.export import torch2coreml

inputs = [ct.TensorType("input", shape=(1, 3, 224, 224))]
ct_model = torch2coreml(model, inputs, im, output_file="resnet18.mlpackage")

For classification models, pass a list of class names to classifier_names to add a classification head to the CoreML model.

Requires coremltools>=9.0, torch>=1.11, and numpy<=2.3.5. Not supported on Windows.

BlobWriter not loaded error

coremltools>=9.0 ships wheels for Python 3.10–3.13 on macOS and Linux. On newer Python versions the native C extension fails to load. Use Python 3.10–3.13 for CoreML export.

Export to TensorFlow SavedModel

TF SavedModel export goes through ONNX as an intermediate step:

from ultralytics.utils.export import onnx2saved_model, torch2onnx

torch2onnx(model, im, output_file="resnet18.onnx")
keras_model = onnx2saved_model("resnet18.onnx", output_dir="resnet18_saved_model")

The function returns a Keras model and also generates TFLite files (.tflite) inside the output directory:

resnet18_saved_model/
├── saved_model.pb
├── variables/
├── resnet18_float32.tflite
├── resnet18_float16.tflite
└── resnet18_int8.tflite

Requirements:

  • tensorflow>=2.0.0,<=2.19.0
  • onnx2tf>=1.26.3,<1.29.0
  • tf_keras<=2.19.0
  • sng4onnx>=1.0.1
  • onnx_graphsurgeon>=0.3.26 (install with --extra-index-url https://pypi.ngc.nvidia.com)
  • ai-edge-litert>=1.2.0,<1.4.0 on macOS (ai-edge-litert>=1.2.0 on other platforms)
  • onnxslim>=0.1.71
  • onnx>=1.12.0,<2.0.0
  • protobuf>=5

Export to TensorFlow Frozen Graph

Continuing from the SavedModel export above, convert the returned Keras model to a frozen .pb graph:

from pathlib import Path

from ultralytics.utils.export import keras2pb

keras2pb(keras_model, output_file=Path("resnet18_saved_model/resnet18.pb"))

Export to NCNN

from ultralytics.utils.export import torch2ncnn

torch2ncnn(model, im, output_dir="resnet18_ncnn_model")

The directory contains fixed-name param and bin files along with a Python wrapper:

resnet18_ncnn_model/
├── model.ncnn.param
├── model.ncnn.bin
└── model_ncnn.py

torch2ncnn() checks for ncnn and pnnx on first use.

Export to MNN

MNN export requires an ONNX file as input. Export to ONNX first, then convert:

from ultralytics.utils.export import onnx2mnn, torch2onnx

torch2onnx(model, im, output_file="resnet18.onnx")
onnx2mnn("resnet18.onnx", output_file="resnet18.mnn")

Supports half=True for FP16 and int8=True for INT8 quantization. Requires MNN>=2.9.6 and torch>=1.10.

Export to PaddlePaddle

from ultralytics.utils.export import torch2paddle

torch2paddle(model, im, output_dir="resnet18_paddle_model")

The directory contains the PaddlePaddle model and parameter files:

resnet18_paddle_model/
├── model.pdmodel
└── model.pdiparams

Requires x2paddle and the correct PaddlePaddle distribution for your platform:

  • paddlepaddle-gpu>=3.0.0,<3.3.0 on CUDA
  • paddlepaddle==3.0.0 on ARM64 CPU
  • paddlepaddle>=3.0.0,<3.3.0 on other CPUs

Not supported on NVIDIA Jetson.

Export to ExecuTorch

from ultralytics.utils.export import torch2executorch

torch2executorch(model, im, output_dir="resnet18_executorch_model")

The exported .pte file is saved inside the output directory:

resnet18_executorch_model/
└── model.pte

Requires torch>=2.9.0 and a matching ExecuTorch runtime (pip install executorch). For runtime usage, see the ExecuTorch integration.

Verify Your Exported Model

After exporting, verify numerical parity with the original PyTorch model before shipping. A quick smoke test with ONNXBackend from ultralytics.nn.backends compares outputs and flags tracing or quantization errors early:

import numpy as np
import timm
import torch

from ultralytics.nn.backends import ONNXBackend

model = timm.create_model("resnet18", pretrained=True).eval()
im = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    pytorch_output = model(im).numpy()

onnx_model = ONNXBackend("resnet18.onnx", device=torch.device("cpu"))
onnx_output = onnx_model.forward(im)[0]

diff = np.abs(pytorch_output - onnx_output).max()
print(f"Max difference: {diff:.6f}")  # should be < 1e-5

Expected difference

For FP32 exports, the max absolute difference should be under 1e-5. Larger differences point to unsupported ops, incorrect input shape, or a model not in eval mode. FP16 and INT8 exports have looser tolerances. Validate on real data instead of random tensors.

For other runtimes, the input tensor name may differ. OpenVINO, for example, uses the model's forward-argument name (typically x for generic models), while torch2onnx defaults to "images".

Known Limitations

  • Multi-input support is uneven: torch2onnx and torch2openvino accept a tuple or list of example tensors for models with multiple inputs. torch2torchscript, torch2coreml, torch2ncnn, torch2paddle, and torch2executorch assume a single input tensor.
  • ExecuTorch needs flatc: The ExecuTorch runtime requires the FlatBuffers compiler. Install with brew install flatbuffers on macOS or apt install flatbuffers-compiler on Ubuntu.
  • No inference via Ultralytics: Exported non-YOLO models cannot be loaded back through YOLO() for inference. Use the native runtime for each format (ONNX Runtime, OpenVINO Runtime, etc.).
  • YOLO-only formats: Axelera and Sony IMX500 exports require YOLO-specific model attributes and are not available for generic models.
  • Platform-specific formats: TensorRT requires an NVIDIA GPU. RKNN requires the rknn-toolkit2 SDK (Linux only). Edge TPU requires the edgetpu_compiler binary (Linux only).

FAQ

What models can I export with Ultralytics?

Any torch.nn.Module. This includes models from timm, torchvision, or any custom PyTorch model. The model must be in evaluation mode (model.eval()) before export. ONNX and OpenVINO additionally accept a tuple of example tensors for multi-input models.

Which export formats work without a GPU?

All supported formats (TorchScript, ONNX, OpenVINO, CoreML, TF SavedModel, TF Frozen Graph, NCNN, PaddlePaddle, MNN, ExecuTorch) can export on CPU. No GPU is required for the export process itself. TensorRT is the only format that requires an NVIDIA GPU.

What Ultralytics version do I need?

Use Ultralytics >=8.4.38, which includes the ultralytics.utils.export module and the standardized output_file/output_dir arguments.

Can I export a torchvision model to CoreML for iOS deployment?

Yes. torchvision classifiers, detectors, and segmentation models export to .mlpackage via torch2coreml. For image classification models, pass a list of class names to classifier_names to bake in a classification head. Run the export on macOS or Linux. CoreML is not supported on Windows. See the CoreML integration for iOS deployment details.

Can I quantize my exported model to INT8 or FP16?

Yes, for several formats. Pass half=True for FP16 or int8=True for INT8 when exporting to OpenVINO, CoreML, MNN, or NCNN. INT8 in OpenVINO additionally requires a calibration_dataset argument for post-training quantization. See each format's integration page for quantization trade-offs.



๐Ÿ“… Created 0 days ago โœ๏ธ Updated 0 days ago
raimbekovm