Ultralytics Inference for Rust

Ultralytics Inference is a high-performance YOLO inference library and command-line tool written in Rust. It runs exported ONNX models through ONNX Runtime to deliver fast, memory-safe predictions on images, videos, webcams, and streams, with no Python runtime required at inference time.

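The project ships as a single crate, ultralytics-inference, which you can use in two ways: as a CLI for quick predictions and batch jobs, or as a library embedded directly in your Rust application. It supports every Ultralytics task and a wide range of hardware backends through one unified device interface.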

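Why Rust for inference?

Native speed and a small memory footprint. Compiles to a native binary with no interpreter, a good fit for servers, containers, and edge devices.
Memory safety. Rust's ownership model eliminates whole classes of runtime errors without a garbage collector.
All YOLO tasks. Detection, segmentation, pose estimation, OBB, classification, semantic segmentation, and depth estimation through one API.
Broad hardware support. CPU plus CUDA, TensorRT, CoreML, OpenVINO, DirectML, ROCm, and XNNPACK execution providers, selected at build time.
GPU-side preprocessing. Optional fused CUDA kernels keep letterboxing, normalization, and layout conversion on the device for a zero-copy input path.
Automatic downloads. Known YOLO model names and sample assets download automatically on first use.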

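Looking for the Python package?

This page covers the standalone Rust crate. For the Python workflow (training, validation, export, and prediction), see the main Quickstart and Predict mode. Use the ONNX integration to export any Ultralytics model to ONNX, then run it here.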

Installation

Requires Rust 1.89 or later. The video feature additionally requires FFmpeg 7+ installed on the system.

# Install the command-line tool from crates.io
cargo install ultralytics-inference

# Or with GPU support compiled in
cargo install ultralytics-inference --features cuda,tensorrt

The binary is installed to ~/.cargo/bin/ultralytics-inference on Linux and macOS, or under %USERPROFILE%\.cargo\bin\ on Windows.

CLI quickstart

The CLI provides a predict subcommand. Run with no arguments, it downloads a nano detection model and sample images, runs inference, and saves the annotated results to runs/detect/predict.

# Detect on the built-in samples (downloads model and images)
ultralytics-inference predict

# Detect on your own image
ultralytics-inference predict --model yolo26n.onnx --source image.jpg

# Segmentation (auto-downloads yolo26n-seg.onnx)
ultralytics-inference predict --task segment --source image.jpg

# Pose on a video, shown live in a window
ultralytics-inference predict --task pose --source video.mp4 --show

# Depth estimation (auto-downloads yolo26n-depth.onnx)
ultralytics-inference predict --task depth --source image.jpg

# Depth with the DepthAnything-style palette (disparity is already the default)
ultralytics-inference predict --task depth --source image.jpg --colormap spectral

# Depth normalized by depth value instead (near = low color, far = high color)
ultralytics-inference predict --task depth --source image.jpg --depth-viz metric

# Tune thresholds and filter to specific classes
ultralytics-inference predict --source image.jpg --conf 0.5 --iou 0.45 --classes "0,1,2"

# Run a whole folder on the GPU in half precision
ultralytics-inference predict --source images/ --device cuda:0 --half

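Common flags:

Flag	Default	Description
`--model`, `-m`	`yolo26n.onnx`	Path to an ONNX model; known YOLO names download automatically.
`--task`	`detect`	One of `detect`, `segment`, `pose`, `obb`, `classify`, `semantic`, `depth`.
`--source`, `-s`	samples	Image, directory, glob pattern, video, webcam index, or URL.
`--conf`	`0.25`	Confidence threshold.
`--iou`	`0.7`	IoU threshold for non-maximum suppression (NMS).
`--imgsz`	model metadata	Inference image size.
`--device`	`cpu`	Execution device, for example `cuda:0`, `coreml`, `tensorrt:0`.
`--half`	`false`	FP16 half-precision inference.
`--save`	`true`	Save annotated results to `runs/<task>/predict`.
`--show`	`false`	Display results in a window.
`--classes`	all	Filter detections by class ID, for example `"0,1,2"`.
`--colormap`	`jet`	Depth colormap: `jet`, `inferno`, `spectral`, or `gray` (depth only).
`--depth-viz`	`disparity`	Depth normalization: `disparity` (inverse depth, near = high color) or `metric` (depth only).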
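To make the `--conf`, `--iou`, and `--classes` flags concrete, here is a minimal std-only sketch of the filtering they describe: a confidence cut, an optional class filter, then greedy per-class NMS. The `Detection` struct and `filter_detections` function are hypothetical names for illustration and are not the crate's API; the crate's actual post-processing may differ, for example for models that do not need NMS.

```rust
/// Hypothetical detection record: an axis-aligned box in
/// [x1, y1, x2, y2] pixel coordinates with a score and class ID.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Detection {
    pub xyxy: [f32; 4],
    pub conf: f32,
    pub class_id: usize,
}

/// Intersection over union of two [x1, y1, x2, y2] boxes.
pub fn iou(a: &[f32; 4], b: &[f32; 4]) -> f32 {
    let w = (a[2].min(b[2]) - a[0].max(b[0])).max(0.0);
    let h = (a[3].min(b[3]) - a[1].max(b[1])).max(0.0);
    let inter = w * h;
    let area_a = (a[2] - a[0]) * (a[3] - a[1]);
    let area_b = (b[2] - b[0]) * (b[3] - b[1]);
    let union = area_a + area_b - inter;
    if union > 0.0 { inter / union } else { 0.0 }
}

/// Assumed pipeline: drop low-confidence boxes, apply the optional class
/// filter, then keep the highest-scoring box among same-class overlaps.
pub fn filter_detections(
    mut dets: Vec<Detection>,
    conf: f32,
    iou_thres: f32,
    classes: Option<&[usize]>,
) -> Vec<Detection> {
    dets.retain(|d| d.conf >= conf && classes.map_or(true, |c| c.contains(&d.class_id)));
    // Highest confidence first, so earlier boxes suppress later ones.
    dets.sort_by(|a, b| b.conf.total_cmp(&a.conf));
    let mut kept: Vec<Detection> = Vec::new();
    for d in dets {
        let suppressed = kept
            .iter()
            .any(|k| k.class_id == d.class_id && iou(&k.xyxy, &d.xyxy) > iou_thres);
        if !suppressed {
            kept.push(d);
        }
    }
    kept
}
```

With the table's defaults this would be called as `filter_detections(dets, 0.25, 0.7, None)`. Raising `conf` trims weak boxes before NMS runs, while lowering `iou_thres` makes overlapping boxes of the same class more likely to be merged into one.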

Library quickstart

Load a model and run a prediction. Model metadata such as class names, task type, and image size is read automatically from the ONNX file.

use ultralytics_inference::YOLOModel;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Metadata (classes, task, imgsz) is parsed from the model.
    let mut model = YOLOModel::load("yolo26n.onnx")?;

    let results = model.predict("image.jpg")?;

    for result in &results {
        if let Some(boxes) = &result.boxes {
            for i in 0..boxes.len() {
                let class_id = boxes.cls()[i] as usize;
                let conf = boxes.conf()[i];
                let name = result.names.get(&class_id).map_or("unknown", |s| s.as_str());
                println!("{name} {conf:.2}");
            }
        }
    }

    Ok(())
}

Use InferenceConfig with its builder API to control thresholds, image size, precision, and device:

use ultralytics_inference::{Device, InferenceConfig, YOLOModel};

let config = InferenceConfig::new()
    .with_confidence(0.5)
    .with_iou(0.45)
    .with_imgsz(640, 640)
    .with_device(Device::Cuda(0))
    .with_half(true);

let mut model = YOLOModel::load_with_config("yolo26n.onnx", config)?;
let results = model.predict("image.jpg")?;

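Each task populates different fields of Results. The example below is a complete, runnable program; the model and sample inputs download automatically on first run. Replace predict_default() with predict("image.jpg") to run on your own file.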

use ultralytics_inference::YOLOModel;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut model = YOLOModel::load("yolo26n.onnx")?;
    let results = model.predict_default()?;

    for result in &results {
        if let Some(boxes) = &result.boxes {
            println!("{} detections", boxes.len());
            let xyxy = boxes.xyxy(); // rows of [x1, y1, x2, y2]
            for i in 0..boxes.len() {
                let class_id = boxes.cls()[i] as usize;
                let name = result.names.get(&class_id).map_or("unknown", |s| s.as_str());
                println!("  {name} {:.2} {:?}", boxes.conf()[i], xyxy.row(i).to_vec());
            }
        }
    }

    Ok(())
}
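The loop above prints one line per detection. When a per-class summary is more useful, the same class IDs and name map can be folded into counts. This is a minimal sketch on plain std types: `count_by_class` is a hypothetical helper, and it assumes class IDs arrive as floats (as the `as usize` cast above suggests) and that names are keyed by `usize`.

```rust
use std::collections::{BTreeMap, HashMap};

/// Count detections per class name.
/// IDs missing from `names` are grouped under "unknown", matching the
/// fallback used in the printing loop.
pub fn count_by_class(cls: &[f32], names: &HashMap<usize, String>) -> BTreeMap<String, usize> {
    let mut counts: BTreeMap<String, usize> = BTreeMap::new();
    for &c in cls {
        let name = names.get(&(c as usize)).map_or("unknown", |s| s.as_str());
        *counts.entry(name.to_string()).or_insert(0) += 1;
    }
    counts
}
```

A `BTreeMap` is used so the summary iterates in a stable, alphabetical order, which keeps logs and tests deterministic.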

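Supported tasks

All Ultralytics tasks are supported. When --model is omitted, the nano model for the selected task is downloaded automatically.

Task	`--task`	Output	Default model
Detection	`detect`	Bounding boxes and classes	`yolo26n.onnx`
Instance segmentation	`segment`	Bounding boxes plus instance masks	`yolo26n-seg.onnx`
Pose	`pose`	Bounding boxes plus keypoints	`yolo26n-pose.onnx`
Oriented bounding boxes	`obb`	Rotated bounding boxes	`yolo26n-obb.onnx`
Classification	`classify`	Class probabilities	`yolo26n-cls.onnx`
Semantic segmentation	`semantic`	Per-pixel class map	`yolo26n-sem.onnx`
Depth estimation	`depth`	Per-pixel depth map in meters	`yolo26n-depth.onnx`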
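The depth task returns depth in meters, and the `--depth-viz` flag chooses how that map is normalized before a colormap is applied. The sketch below illustrates the two modes with simple min-max scaling. `DepthViz` and `normalize_depth` are hypothetical names, and min-max scaling is an assumption; the crate may normalize differently (for example with clipping).

```rust
/// Hypothetical mirror of the two `--depth-viz` modes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DepthViz {
    /// Inverse depth: near pixels map to high values.
    Disparity,
    /// Raw depth: near pixels map to low values.
    Metric,
}

/// Min-max normalize a depth map (meters) to [0, 1] for color mapping.
/// Non-positive or non-finite depths are treated as invalid and map to 0.
pub fn normalize_depth(depth_m: &[f32], mode: DepthViz) -> Vec<f32> {
    let value = |d: f32| match mode {
        DepthViz::Disparity => 1.0 / d,
        DepthViz::Metric => d,
    };
    let valid = |d: f32| d.is_finite() && d > 0.0;

    // Find the value range over valid pixels only.
    let (mut lo, mut hi) = (f32::INFINITY, f32::NEG_INFINITY);
    for &d in depth_m.iter().filter(|&&d| valid(d)) {
        let v = value(d);
        lo = lo.min(v);
        hi = hi.max(v);
    }
    let range = hi - lo;

    depth_m
        .iter()
        .map(|&d| {
            if !valid(d) || !(range > 0.0) {
                0.0
            } else {
                (value(d) - lo) / range
            }
        })
        .collect()
}
```

Disparity spreads contrast across nearby objects, since 1/d changes fastest at small depths, which is one plausible reason it is the default for visualization.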

Model compatibility

Any Ultralytics model exported to ONNX can be loaded from a local file. Standard YOLO26, YOLO11, and YOLOv8 model names (sizes n, s, m, l, and x) can be downloaded automatically:

Model family	Auto-downloadable variants
YOLO26	`yolo26{n,s,m,l,x}.onnx`, `-seg`, `-pose`, `-obb`, `-cls`, `-sem`, and `-depth`
YOLO11	`yolo11{n,s,m,l,x}.onnx`, `-seg`, `-pose`, `-obb`, and `-cls`
YOLOv8	`yolov8{n,s,m,l,x}.onnx`, `-seg`, `-pose`, `-obb`, and `-cls`

Semantic segmentation (-sem) and depth estimation (-depth) are available for YOLO26 only.
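The brace shorthand in the table expands to one file name per size and task suffix. As an illustration, the sketch below enumerates those names, assuming the pattern `<family><size><suffix>.onnx` that the default model names (such as `yolo26n-seg.onnx`) follow. `suffixes` and `downloadable_names` are hypothetical helpers, not part of the crate.

```rust
/// Task suffixes with auto-downloadable weights per family,
/// transcribed from the compatibility table.
pub fn suffixes(family: &str) -> &'static [&'static str] {
    match family {
        "yolo26" => &["", "-seg", "-pose", "-obb", "-cls", "-sem", "-depth"],
        "yolo11" | "yolov8" => &["", "-seg", "-pose", "-obb", "-cls"],
        _ => &[],
    }
}

/// Enumerate every name of the form `<family><size><suffix>.onnx`.
pub fn downloadable_names(family: &str) -> Vec<String> {
    let sizes = ["n", "s", "m", "l", "x"];
    let mut names = Vec::new();
    for suffix in suffixes(family) {
        for size in sizes.iter() {
            names.push(format!("{family}{size}{suffix}.onnx"));
        }
    }
    names
}
```

A check such as `downloadable_names("yolo11").contains(&name)` could be used to decide whether a missing file is worth attempting to download or should be reported as a bad path.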

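Input sources

The --source argument (and the Source type in the library) accepts several kinds of input and detects the kind automatically from the string:

Source	Example	Notes
Image	`image.jpg`	A single file.
Directory	`images/`	All images in the folder.
Glob pattern	`images/*.jpg`	Shell-style pattern matching.
Video	`video.mp4`	Requires the `video` feature.
Webcam	`0`	Requires the `video` feature.
Stream	`rtsp://...`	Requires the `video` feature.
URL	`https://example.com/image.jpg`	Remote image download.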
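Detecting the source kind from a string can be sketched with a few ordered checks. The rules below are assumptions chosen to reproduce the table's examples, not the crate's actual logic: `SourceKind` and `classify_source` are hypothetical names, and the list of video extensions is illustrative.

```rust
use std::path::Path;

/// Hypothetical source categories, one per row of the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SourceKind {
    Image,
    Directory,
    Glob,
    Video,
    Webcam,
    Stream,
    Url,
}

/// Guess the source kind from the string alone (assumed heuristics).
pub fn classify_source(s: &str) -> SourceKind {
    let lower = s.to_ascii_lowercase();
    let ext = lower.rsplit('.').next().unwrap_or("");
    let is_video_ext = matches!(ext, "mp4" | "avi" | "mov" | "mkv" | "webm");

    if !s.is_empty() && s.chars().all(|c| c.is_ascii_digit()) {
        // A bare integer is a camera index.
        SourceKind::Webcam
    } else if lower.starts_with("rtsp://") || lower.starts_with("rtmp://") {
        SourceKind::Stream
    } else if lower.starts_with("http://") || lower.starts_with("https://") {
        // Assumption: HTTP(S) is a stream only when the path looks like video.
        if is_video_ext { SourceKind::Stream } else { SourceKind::Url }
    } else if s.contains('*') || s.contains('?') || s.contains('[') {
        SourceKind::Glob
    } else if is_video_ext {
        SourceKind::Video
    } else if s.ends_with('/') || Path::new(s).is_dir() {
        SourceKind::Directory
    } else {
        SourceKind::Image
    }
}
```

The order of the checks matters: the scheme tests run before the extension test so that a remote image is not mistaken for a local file, and the glob test runs before the video test so that a pattern like `clips/*.mp4` is expanded rather than opened directly.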

Devices and execution providers

Inference runs on the CPU by default. GPU and accelerator backends are compiled in as Cargo features and selected at runtime with --device (CLI) or Device (library).

Device string	`Device` variant	Build feature	Hardware
`cpu`	`Device::Cpu`	built in	Any CPU
`cuda:0`	`Device::Cuda(0)`	`cuda`	NVIDIA GPU
`tensorrt:0`	`Device::TensorRt(0)`	`tensorrt`	NVIDIA GPU, optimized
`coreml`	`Device::CoreMl`	`coreml`	Apple Silicon / macOS
`openvino`	`Device::OpenVino`	`openvino`	Intel CPU / iGPU
`directml:0`	`Device::DirectMl(0)`	`directml`	Windows GPU
`rocm:0`	`Device::Rocm(0)`	`rocm`	AMD GPU
`xnnpack`	`Device::Xnnpack`	`xnnpack`	Optimized CPU

# Build the CLI with the providers you need
cargo install ultralytics-inference --features cuda,tensorrt
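The mapping from device strings to variants in the table can be pictured as a small parser. The sketch below uses a stand-in enum, `DeviceKind`, rather than the crate's own `Device` type, whose exact definition is not shown on this page. The `usize` index type and the choice to default a missing index to 0 are assumptions.

```rust
/// Hypothetical stand-in for the crate's `Device` enum,
/// limited to the variants listed in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceKind {
    Cpu,
    Cuda(usize),
    TensorRt(usize),
    CoreMl,
    OpenVino,
    DirectMl(usize),
    Rocm(usize),
    Xnnpack,
}

/// Parse strings such as "cpu", "cuda:0", or "tensorrt:1".
pub fn parse_device(s: &str) -> Result<DeviceKind, String> {
    let lower = s.trim().to_ascii_lowercase();
    // Split an optional ":<index>" suffix off the backend name.
    let (name, index) = match lower.split_once(':') {
        Some((name, idx)) => {
            let idx = idx
                .parse::<usize>()
                .map_err(|e| format!("bad device index in {s:?}: {e}"))?;
            (name, Some(idx))
        }
        None => (lower.as_str(), None),
    };
    match (name, index) {
        // Backends without an index reject one if given.
        ("cpu", None) => Ok(DeviceKind::Cpu),
        ("coreml", None) => Ok(DeviceKind::CoreMl),
        ("openvino", None) => Ok(DeviceKind::OpenVino),
        ("xnnpack", None) => Ok(DeviceKind::Xnnpack),
        // Indexed backends default to device 0 (assumption).
        ("cuda", i) => Ok(DeviceKind::Cuda(i.unwrap_or(0))),
        ("tensorrt", i) => Ok(DeviceKind::TensorRt(i.unwrap_or(0))),
        ("directml", i) => Ok(DeviceKind::DirectMl(i.unwrap_or(0))),
        ("rocm", i) => Ok(DeviceKind::Rocm(i.unwrap_or(0))),
        _ => Err(format!("unknown device string {s:?}")),
    }
}
```

Returning a `Result` rather than silently falling back to the CPU surfaces typos such as `cuda:O` immediately, which matters because a silent CPU fallback would only show up as unexpectedly slow inference.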

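GPU acceleration and CUDA preprocessing

On NVIDIA hardware, the cuda feature enables the CUDA execution provider, and tensorrt adds the TensorRT provider for further optimization. For the lowest possible latency, the cuda-preprocess feature moves preprocessing onto the GPU.

cuda-preprocess runs letterbox resizing, normalization, and HWC-to-CHW layout conversion as a single fused CUDA kernel, then feeds the result to the model as a zero-copy device tensor. This removes the per-image CPU preprocessing cost and the host-to-device memory copy, which matters most for high-throughput batch processing and real-time streams.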
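For reference, the three steps that the fused kernel combines (letterbox resize, normalization, HWC-to-CHW conversion) can be written as one CPU loop. This is a minimal sketch and not the crate's kernel: `letterbox_chw` is a hypothetical function, and nearest-neighbor sampling, gray padding of 114, scaling to [0, 1], and a square target are all assumptions.

```rust
/// Letterbox a tightly packed RGB8 HWC image into a `size` x `size`
/// CHW f32 tensor with values in [0, 1].
pub fn letterbox_chw(src: &[u8], w: usize, h: usize, size: usize) -> Vec<f32> {
    assert!(w > 0 && h > 0 && size > 0);
    assert_eq!(src.len(), w * h * 3, "expected tightly packed RGB8");

    // Uniform scale that fits the whole image inside the square.
    let scale = (size as f32 / w as f32).min(size as f32 / h as f32);
    let new_w = ((w as f32 * scale).round() as usize).clamp(1, size);
    let new_h = ((h as f32 * scale).round() as usize).clamp(1, size);

    // Center the resized image; the remainder is padding.
    let pad_x = (size - new_w) / 2;
    let pad_y = (size - new_h) / 2;

    let plane = size * size;
    let mut out = vec![114.0 / 255.0; 3 * plane];
    for y in 0..new_h {
        let sy = (y * h / new_h).min(h - 1); // nearest source row
        for x in 0..new_w {
            let sx = (x * w / new_w).min(w - 1); // nearest source column
            let s = (sy * w + sx) * 3; // HWC offset in the source
            let d = (y + pad_y) * size + (x + pad_x); // offset within a plane
            for c in 0..3 {
                // Normalize and scatter each channel into its own plane (CHW).
                out[c * plane + d] = src[s + c] as f32 / 255.0;
            }
        }
    }
    out
}
```

Doing all three steps in one pass means each output element is written exactly once, which is the same property that lets a fused GPU kernel avoid intermediate buffers.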

# Build with fused GPU preprocessing (implies cuda + tensorrt)
cargo build --release --features cuda-preprocess

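The fast path is used automatically, with no API changes, when all of the following hold: the feature is compiled in, the device is CUDA or TensorRT, the task is detection, segmentation, pose estimation, OBB, semantic segmentation, or depth estimation, and the model takes FP32 input. It is enabled by default and can be turned off per model: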

use ultralytics_inference::{Device, InferenceConfig};

let config = InferenceConfig::new()
    .with_device(Device::TensorRt(0))
    .with_cuda_preprocess(false); // force CPU preprocessing
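The eligibility conditions listed above amount to a single boolean test. The sketch below spells that test out with hypothetical types (`Task`, `FastPathInputs`, `uses_gpu_preprocess`); it restates the documented conditions and makes no claim about how the crate checks them internally.

```rust
/// Hypothetical task enum covering the seven documented tasks.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Task {
    Detect,
    Segment,
    Pose,
    Obb,
    Classify,
    Semantic,
    Depth,
}

/// Hypothetical bundle of the facts the decision depends on.
#[derive(Debug, Clone, Copy)]
pub struct FastPathInputs {
    /// Whether the build includes the `cuda-preprocess` feature
    /// (in real code this could come from `cfg!(feature = "...")`).
    pub feature_compiled: bool,
    /// The per-model switch, on by default.
    pub enabled_in_config: bool,
    pub device_is_cuda_or_tensorrt: bool,
    pub task: Task,
    pub model_input_is_fp32: bool,
}

/// True when every documented condition for GPU preprocessing holds.
pub fn uses_gpu_preprocess(i: &FastPathInputs) -> bool {
    i.feature_compiled
        && i.enabled_in_config
        && i.device_is_cuda_or_tensorrt
        // Classification is the one task absent from the documented list.
        && i.task != Task::Classify
        && i.model_input_is_fp32
}
```

When any condition fails, preprocessing would simply stay on the CPU, so disabling the switch is a safe way to rule the fast path out while debugging.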

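Match your CUDA toolkit

cuda-preprocess needs a matching CUDA toolkit at build time and NVRTC at runtime for the fused preprocessing kernel. For version requirements and troubleshooting, see the CUDA and TensorRT acceleration guide.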

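Cargo features

Features are enabled at build time. The default set covers annotation and live display.

Feature	Default	Purpose
`annotate`	yes	Draws bounding boxes, masks, keypoints, and labels; required for `--save`.
`visualize`	yes	Live window display for `--show`.
`video`	no	Reads and writes video files (requires FFmpeg 7+).
`cuda`	no	NVIDIA CUDA execution provider.
`tensorrt`	no	NVIDIA TensorRT execution provider.
`cuda-preprocess`	no	Fused GPU preprocessing with zero-copy input (implies cuda and tensorrt).
`coreml`	no	Apple CoreML execution provider.
`openvino`	no	Intel OpenVINO execution provider.
`rocm`	no	AMD ROCm execution provider.
`directml`	no	Windows DirectML execution provider.

Convenience groups bundle related providers: nvidia (cuda, tensorrt), amd (rocm, migraphx), intel (openvino, onednn), mobile (nnapi, coreml, qnn), and all (annotate, visualize, video). Other providers such as nnapi, qnn, xnnpack, and webgpu are also available.

Enable features when installing the CLI or adding the library: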

cargo install ultralytics-inference --features video
cargo install ultralytics-inference --features cuda,tensorrt

[dependencies]
ultralytics-inference = { version = "0.0.29", features = ["video"] }

Output and saving

By default, predictions are annotated and saved to an auto-incrementing run directory:

runs/
└── detect/
    └── predict/          # then predict2, predict3, ...
        └── image.jpg     # annotated result

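The subfolder matches the task name (for example runs/segment/ or runs/pose/). For video sources, annotated output is written to a video file; pass --save-frames to write individual frames instead. For the semantic task, --save-json writes per-pixel class-map PNGs under a results/ subfolder. For the depth task, annotated images are written side by side, with the original next to the colorized depth map. Saving annotated images and videos requires the annotate feature; exporting semantic class-map PNGs does not. Video input and output require the video feature.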
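The auto-incrementing directory scheme (predict, then predict2, predict3, and so on) can be sketched as a search for the first unused name. `next_run_dir` is a hypothetical helper and not part of the crate. The existence check is passed in as a closure, an illustrative choice that lets the logic be exercised without touching the disk.

```rust
use std::path::{Path, PathBuf};

/// Pick the first unused run directory under `<root>/<task>/`:
/// `predict`, then `predict2`, `predict3`, ...
pub fn next_run_dir(root: &Path, task: &str, exists: impl Fn(&Path) -> bool) -> PathBuf {
    let base = root.join(task);
    let first = base.join("predict");
    if !exists(first.as_path()) {
        return first;
    }
    // The first run has no numeric suffix, so numbering starts at 2.
    (2u32..)
        .map(|n| base.join(format!("predict{n}")))
        .find(|p| !exists(p.as_path()))
        .expect("ran out of run directory names")
}
```

Against the real filesystem this would be called as `next_run_dir(Path::new("runs"), "detect", |p| p.exists())`. Note that checking and then creating a directory is not atomic, so two processes started at the same moment could pick the same name.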

FAQ

Do I need Python installed?

No. The crate runs exported ONNX models directly through ONNX Runtime. You only need Python if you want to train or export a model with the Ultralytics package beforehand.

Which models can I run?

Any Ultralytics YOLO model exported to ONNX, including YOLO26, YOLO11, and YOLOv8. Known model names download automatically; you can also point --model at any local .onnx file.

How do I get a model file?

Export one from the Python package, for example with the ONNX integration, or let the CLI download the standard nano model for the selected task on first run.

Is video supported?

Yes, provided the video feature is enabled and FFmpeg 7+ is installed on the system. This covers video files, webcams, and RTSP/RTMP/HTTP streams.

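What do the `annotate` and `visualize` features do?

Both are enabled by default. annotate draws bounding boxes, masks, keypoints, and class labels on images and is required for --save to write annotated results. visualize opens a live window for --show. For a smaller, headless build that only returns results programmatically, disable both with cargo build --no-default-features, then re-add individual features as needed.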

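Where is the full API reference?

This page is an overview. The complete, per-type API reference for every public struct, method, and configuration option is published on docs.rs, generated directly from the source.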

Contributors: onuralpszr, glenn-jocher