Simple Utilities

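The ultralytics package comes with a large number of utilities that can support, enhance, and speed up your workflows. Many more are available, but the ones below will be useful for most developers. They are also a good reference point when learning to program.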

Watch: Ultralytics Utilities | Auto Annotation, Explorer API and Dataset Conversion

Data

Auto Labeling / Annotations

Dataset annotation is a very resource-intensive and time-consuming process. If you have a YOLO object detection model trained on a reasonable amount of data, you can use it together with SAM to auto-annotate additional data in segmentation format.

from ultralytics.data.annotator import auto_annotate

auto_annotate(  # (1)!
    data="path/to/new/data",
    det_model="yolo11n.pt",
    sam_model="mobile_sam.pt",
    device="cuda",
    output_dir="path/to/save_labels",
)

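(1) Nothing is returned from this function.
See the reference section for annotator.auto_annotate for more insight into how the function operates.
Use it in combination with the segments2boxes function to generate object detection bounding boxes as well.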
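
For orientation, each line of a YOLO segmentation label file holds a class index followed by polygon coordinates normalized by the image size. The sketch below formats one such line. `to_yolo_seg_line` is a hypothetical helper written for illustration (it is not part of the ultralytics API), and the sample polygon and image size are made up.

```python
def to_yolo_seg_line(class_id, polygon_px, img_w, img_h):
    """Format one object as a YOLO segmentation label line (hypothetical helper)."""
    coords = []
    for x, y in polygon_px:
        coords.append(f"{x / img_w:.6f}")  # normalize x by image width
        coords.append(f"{y / img_h:.6f}")  # normalize y by image height
    return f"{class_id} " + " ".join(coords)


line = to_yolo_seg_line(0, [(100, 50), (200, 50), (200, 150)], img_w=400, img_h=200)
print(line)  # 0 0.250000 0.250000 0.500000 0.250000 0.500000 0.750000
```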

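Convert Segmentation Masks into YOLO Format

Use this to convert a dataset of segmentation mask images to the YOLO segmentation format. The function takes the directory containing the binary-format mask images and converts them into YOLO segmentation format.

The converted masks are saved in the specified output directory.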

from ultralytics.data.converter import convert_segment_masks_to_yolo_seg

# `classes` is the total number of classes in the dataset; the COCO dataset has 80
convert_segment_masks_to_yolo_seg(masks_dir="path/to/masks_dir", output_dir="path/to/output_dir", classes=80)

Convert COCO into YOLO Format

Use this to convert COCO JSON annotations into the proper YOLO format. For object detection (bounding box) datasets, use_segments and use_keypoints should both be False.

from ultralytics.data.converter import convert_coco

convert_coco(  # (1)!
    "../datasets/coco/annotations/",
    use_segments=False,
    use_keypoints=False,
    cls91to80=True,
)

(1) Nothing is returned from this function.

For additional information about the convert_coco function, visit the reference page.
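
To make the box conversion concrete: COCO stores a box as [x_min, y_min, width, height] in pixels, while a YOLO detection label stores a normalized (x_center, y_center, width, height). The sketch below shows only that arithmetic. `coco_bbox_to_yolo` is a hypothetical helper, not the converter's actual code, and the sample values are made up.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to a
    normalized YOLO (x_center, y_center, width, height) tuple."""
    x_min, y_min, w, h = bbox
    return (
        (x_min + w / 2) / img_w,  # box center x, normalized
        (y_min + h / 2) / img_h,  # box center y, normalized
        w / img_w,
        h / img_h,
    )


yolo_box = coco_bbox_to_yolo([100, 50, 200, 100], img_w=400, img_h=200)
print(yolo_box)  # (0.5, 0.5, 0.5, 0.5)
```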

Get Bounding Box Dimensions

from ultralytics.utils.plotting import Annotator
from ultralytics import YOLO
import cv2

model = YOLO('yolo11n.pt')  # load a pretrained or fine-tuned model

# Process the image
source = cv2.imread('path/to/image.jpg')
results = model(source)

# Extract results
annotator = Annotator(source, example=model.names)

for box in results[0].boxes.xyxy.cpu():
    width, height, area = annotator.get_bbox_dimension(box)
    print("Bounding Box Width {}, Height {}, Area {}".format(
        width.item(), height.item(), area.item()))

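Convert Bounding Boxes to Segments

With existing x y w h bounding box data, convert to segments using the yolo_bbox2segment function. The image and annotation files need to be organized like this: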

data
|__ images
    ├─ 001.jpg
    ├─ 002.jpg
    ├─ ..
    └─ NNN.jpg
|__ labels
    ├─ 001.txt
    ├─ 002.txt
    ├─ ..
    └─ NNN.txt

from ultralytics.data.converter import yolo_bbox2segment

yolo_bbox2segment(  # (1)!
    im_dir="path/to/images",
    save_dir=None,  # saved to "labels-segment" in images directory
    sam_model="sam_b.pt",
)

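(1) Nothing is returned from this function.

Visit the yolo_bbox2segment reference page for more information about the function.

Convert Segments to Bounding Boxes

If you have a dataset that uses the segmentation dataset format, you can easily convert these into upright (or horizontal) bounding boxes (x y w h format) with this function.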

import numpy as np

from ultralytics.utils.ops import segments2boxes

segments = np.array(
    [
        [805, 392, 797, 400, ..., 808, 714, 808, 392],
        [115, 398, 113, 400, ..., 150, 400, 149, 298],
        [267, 412, 265, 413, ..., 300, 413, 299, 412],
    ]
)

segments2boxes([s.reshape(-1, 2) for s in segments])
# >>> array([[ 741.66, 631.12, 133.31, 479.25],
#           [ 146.81, 649.69, 185.62, 502.88],
#           [ 281.81, 636.19, 118.12, 448.88]],
#           dtype=float32) # xywh bounding boxes

To understand how this function works, visit the reference page.
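
As a rough illustration of the idea, the enclosing box of a polygon can be computed from the minimum and maximum of its points. The NumPy sketch below is an assumption about the general technique, not the library's implementation. `segment_to_xywh` and the triangle are hypothetical.

```python
import numpy as np


def segment_to_xywh(segment):
    """Return the (x_center, y_center, width, height) box enclosing an (N, 2) polygon."""
    pts = np.asarray(segment, dtype=np.float32).reshape(-1, 2)
    x_min, y_min = pts.min(axis=0)
    x_max, y_max = pts.max(axis=0)
    return np.array(
        [(x_min + x_max) / 2, (y_min + y_max) / 2, x_max - x_min, y_max - y_min],
        dtype=np.float32,
    )


triangle = [[10, 20], [50, 20], [30, 80]]
box = segment_to_xywh(triangle)
print(box)  # [30. 50. 40. 60.]
```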

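Utilities

Image Compression

Compresses a single image file to a reduced size while preserving its aspect ratio and quality. If the input image is smaller than the maximum dimension, it will not be resized.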

from pathlib import Path

from ultralytics.data.utils import compress_one_image

for f in Path("path/to/dataset").rglob("*.jpg"):
    compress_one_image(f)  # (1)!

(1) Nothing is returned from this function.
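
The resizing rule described above (shrink only when the image exceeds the maximum dimension, and keep the aspect ratio) can be sketched without any image library. `fit_within` and the `max_dim` value used here are assumptions for illustration; they are not taken from the library.

```python
def fit_within(width, height, max_dim):
    """Return (width, height) scaled so the longer side is at most max_dim."""
    ratio = max_dim / max(width, height)
    if ratio >= 1.0:  # already small enough: leave the size untouched
        return width, height
    return round(width * ratio), round(height * ratio)


print(fit_within(4000, 3000, max_dim=1920))  # (1920, 1440)
print(fit_within(800, 600, max_dim=1920))  # (800, 600)
```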

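Auto-split Dataset

Automatically split a dataset into train/val/test splits and save the resulting splits into autosplit_*.txt files. This function uses random sampling, which is not applied when using the fraction argument for training.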

from ultralytics.data.utils import autosplit

autosplit(  # (1)!
    path="path/to/images",
    weights=(0.9, 0.1, 0.0),  # (train, validation, test) fractional splits
    annotated_only=False,  # split only images with annotation file when True
)

(1) Nothing is returned from this function.

See the reference page for additional details on this function.
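
A minimal sketch of weighted random splitting, assuming each file is assigned independently to a split with the given probabilities. `assign_splits` is a hypothetical helper that returns lists in memory rather than writing autosplit_*.txt files, and the `seed` parameter is added here only to make the example repeatable.

```python
import random


def assign_splits(names, weights=(0.9, 0.1, 0.0), seed=0):
    """Randomly assign each name to train/val/test according to weights."""
    rng = random.Random(seed)
    labels = ("train", "val", "test")
    picks = rng.choices(labels, weights=weights, k=len(names))
    splits = {label: [] for label in labels}
    for name, label in zip(names, picks):
        splits[label].append(name)
    return splits


splits = assign_splits([f"{i:03d}.jpg" for i in range(100)])
print({label: len(files) for label, files in splits.items()})
```

Because the assignment is random, the split sizes only approximate the requested fractions, especially on small datasets.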

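Segment-polygon to Binary Mask

Convert a single polygon (as a list) to a binary mask of the specified image size. The polygon takes the form [N, 2], where N is the number of (x, y) points defining the polygon contour.

Warning

N must always be even.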

import numpy as np

from ultralytics.data.utils import polygon2mask

imgsz = (1080, 810)
polygon = np.array([805, 392, 797, 400, ..., 808, 714, 808, 392])  # (238, 2)

mask = polygon2mask(
    imgsz,  # tuple
    [polygon],  # input as list
    color=255,  # 8-bit binary
    downsample_ratio=1,
)
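
To show what rasterizing a polygon means, here is a pure-NumPy sketch using the even-odd (ray casting) rule. It is an illustration under stated assumptions, not the library's implementation, so pixels on the polygon boundary may be treated differently. `polygon_to_mask` and the small square are hypothetical.

```python
import numpy as np


def polygon_to_mask(imgsz, polygon, color=1):
    """Rasterize one polygon into an (H, W) uint8 mask using the even-odd rule."""
    h, w = imgsz
    # Flat [x0, y0, x1, y1, ...] input needs an even number of values.
    pts = np.asarray(polygon, dtype=np.float64).reshape(-1, 2)
    ys, xs = np.mgrid[0:h, 0:w]
    inside = np.zeros((h, w), dtype=bool)
    x0, y0 = pts[-1]
    for x1, y1 in pts:
        crosses = (y0 > ys) != (y1 > ys)  # does this edge span the pixel's row?
        with np.errstate(divide="ignore", invalid="ignore"):
            x_at = (x1 - x0) * (ys - y0) / (y1 - y0) + x0  # edge x at that row
            inside ^= crosses & (xs < x_at)  # toggle on each crossing to the right
        x0, y0 = x1, y1
    return inside.astype(np.uint8) * np.uint8(color)


square = [2, 2, 6, 2, 6, 6, 2, 6]  # flat x, y pairs
mask = polygon_to_mask((8, 8), square, color=255)
print(mask.shape, int(mask.max()))  # (8, 8) 255
```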

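Bounding Boxes

Bounding Box (Horizontal) Instances

To manage bounding box data, the Bboxes class helps convert between box coordinate formats, scale box dimensions, calculate areas, include offsets, and more.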

import numpy as np

from ultralytics.utils.instance import Bboxes

boxes = Bboxes(
    bboxes=np.array(
        [
            [22.878, 231.27, 804.98, 756.83],
            [48.552, 398.56, 245.35, 902.71],
            [669.47, 392.19, 809.72, 877.04],
            [221.52, 405.8, 344.98, 857.54],
            [0, 550.53, 63.01, 873.44],
            [0.0584, 254.46, 32.561, 324.87],
        ]
    ),
    format="xyxy",
)

boxes.areas()
# >>> array([ 4.1104e+05,       99216,       68000,       55772,       20347,      2288.5])

boxes.convert("xywh")
print(boxes.bboxes)
# >>> array(
#     [[ 413.93, 494.05,  782.1, 525.56],
#      [ 146.95, 650.63,  196.8, 504.15],
#      [  739.6, 634.62, 140.25, 484.85],
#      [ 283.25, 631.67, 123.46, 451.74],
#      [ 31.505, 711.99,  63.01, 322.91],
#      [  16.31, 289.67, 32.503,  70.41]]
# )

See the Bboxes reference section for more attributes and methods available.
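
For reference, the area of a box in (x1, y1, x2, y2) format is simply width times height. The standalone NumPy sketch below reproduces that arithmetic for the first box in the example above. `xyxy_areas` is a hypothetical helper, not a method of the Bboxes class.

```python
import numpy as np


def xyxy_areas(boxes):
    """Area of each (x1, y1, x2, y2) box."""
    boxes = np.asarray(boxes, dtype=np.float64)
    return (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])


areas = xyxy_areas([[22.878, 231.27, 804.98, 756.83], [0, 0, 10, 5]])
print(areas)
```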

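Tip

These operations are available through the Bboxes class, but if you prefer to work with the functions directly, see the next subsections for how to import them independently.

Scaling Boxes

When scaling an image up or down, the corresponding bounding box coordinates can be scaled to match using ultralytics.utils.ops.scale_boxes.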

import cv2 as cv
import numpy as np

from ultralytics.utils.ops import scale_boxes

image = cv.imread("ultralytics/assets/bus.jpg")
h, w, c = image.shape
resized = cv.resize(image, None, fx=1.2, fy=1.2)  # dsize=None: output size is derived from fx, fy
new_h, new_w, _ = resized.shape

xyxy_boxes = np.array(
    [
        [22.878, 231.27, 804.98, 756.83],
        [48.552, 398.56, 245.35, 902.71],
        [669.47, 392.19, 809.72, 877.04],
        [221.52, 405.8, 344.98, 857.54],
        [0, 550.53, 63.01, 873.44],
        [0.0584, 254.46, 32.561, 324.87],
    ]
)

new_boxes = scale_boxes(
    img1_shape=(h, w),  # original image dimensions
    boxes=xyxy_boxes,  # boxes from original image
    img0_shape=(new_h, new_w),  # resized image dimensions (scale to)
    ratio_pad=None,
    padding=False,
    xywh=False,
)

print(new_boxes)  # (1)!
# >>> array(
#     [[  27.454,  277.52,  965.98,   908.2],
#     [   58.262,  478.27,  294.42,  1083.3],
#     [   803.36,  470.63,  971.66,  1052.4],
#     [   265.82,  486.96,  413.98,    1029],
#     [        0,  660.64,  75.612,  1048.1],
#     [   0.0701,  305.35,  39.073,  389.84]]
# )

(1) Bounding boxes scaled for the new image size.
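
The scaling itself amounts to multiplying coordinates by the ratio between the new and old image sizes, then clipping to the new bounds. The sketch below shows that arithmetic in plain NumPy. `scale_xyxy` is a hypothetical helper that ignores padding, and the image shapes are assumed values chosen to give a 1.2x resize.

```python
import numpy as np


def scale_xyxy(boxes, from_shape, to_shape):
    """Rescale (x1, y1, x2, y2) boxes from an image of from_shape (h, w) to to_shape (h, w)."""
    boxes = np.asarray(boxes, dtype=np.float64).copy()
    gain_y = to_shape[0] / from_shape[0]
    gain_x = to_shape[1] / from_shape[1]
    boxes[:, [0, 2]] *= gain_x  # x coordinates
    boxes[:, [1, 3]] *= gain_y  # y coordinates
    # Keep boxes inside the target image.
    boxes[:, [0, 2]] = boxes[:, [0, 2]].clip(0, to_shape[1])
    boxes[:, [1, 3]] = boxes[:, [1, 3]].clip(0, to_shape[0])
    return boxes


scaled = scale_xyxy(
    [[22.878, 231.27, 804.98, 756.83]],
    from_shape=(1080, 810),  # assumed original (h, w)
    to_shape=(1296, 972),  # assumed resized (h, w), i.e. 1.2x
)
print(scaled.round(3))
```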

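Bounding Box Format Conversions

XYXY → XYWH

Convert bounding box coordinates from (x1, y1, x2, y2) format to (x, y, width, height) format, where (x1, y1) is the top-left corner, (x2, y2) is the bottom-right corner, and (x, y) is the box center.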

import numpy as np

from ultralytics.utils.ops import xyxy2xywh

xyxy_boxes = np.array(
    [
        [22.878, 231.27, 804.98, 756.83],
        [48.552, 398.56, 245.35, 902.71],
        [669.47, 392.19, 809.72, 877.04],
        [221.52, 405.8, 344.98, 857.54],
        [0, 550.53, 63.01, 873.44],
        [0.0584, 254.46, 32.561, 324.87],
    ]
)
xywh = xyxy2xywh(xyxy_boxes)

print(xywh)
# >>> array(
#     [[ 413.93,  494.05,   782.1, 525.56],
#     [  146.95,  650.63,   196.8, 504.15],
#     [   739.6,  634.62,  140.25, 484.85],
#     [  283.25,  631.67,  123.46, 451.74],
#     [  31.505,  711.99,   63.01, 322.91],
#     [   16.31,  289.67,  32.503,  70.41]]
# )
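
The conversion can be checked by hand: the center is the midpoint of the two corners, and the width and height are their differences. Here is a NumPy sketch of both directions. `xyxy_to_xywh` and `xywh_to_xyxy` are hypothetical names used for illustration, not the library functions.

```python
import numpy as np


def xyxy_to_xywh(boxes):
    """(x1, y1, x2, y2) -> (x_center, y_center, width, height)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    out = np.empty_like(boxes)
    out[..., 0] = (boxes[..., 0] + boxes[..., 2]) / 2
    out[..., 1] = (boxes[..., 1] + boxes[..., 3]) / 2
    out[..., 2] = boxes[..., 2] - boxes[..., 0]
    out[..., 3] = boxes[..., 3] - boxes[..., 1]
    return out


def xywh_to_xyxy(boxes):
    """(x_center, y_center, width, height) -> (x1, y1, x2, y2)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    out = np.empty_like(boxes)
    out[..., 0] = boxes[..., 0] - boxes[..., 2] / 2
    out[..., 1] = boxes[..., 1] - boxes[..., 3] / 2
    out[..., 2] = boxes[..., 0] + boxes[..., 2] / 2
    out[..., 3] = boxes[..., 1] + boxes[..., 3] / 2
    return out


original = np.array([[22.878, 231.27, 804.98, 756.83]])
converted = xyxy_to_xywh(original)
print(converted.round(2))
```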

All Bounding Box Conversions

from ultralytics.utils.ops import (
    ltwh2xywh,
    ltwh2xyxy,
    xywh2ltwh,  # xywh → top-left corner, w, h
    xywh2xyxy,
    xywhn2xyxy,  # normalized → pixel
    xyxy2ltwh,  # xyxy → top-left corner, w, h
    xyxy2xywhn,  # pixel → normalized
)

for func in (ltwh2xywh, ltwh2xyxy, xywh2ltwh, xywh2xyxy, xywhn2xyxy, xyxy2ltwh, xyxy2xywhn):
    help(func)  # help() prints the docstring itself and returns None

See the docstring for each function, or visit the ultralytics.utils.ops reference page, to read more about each function.
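
The pixel and normalized variants differ only by a division or multiplication by the image width and height. The sketch below is a rough illustration of that relationship. `xyxy_to_xywhn` and `xywhn_to_xyxy` are hypothetical stand-ins, and their argument order is an assumption rather than the library's signature.

```python
import numpy as np


def xyxy_to_xywhn(boxes, img_w, img_h):
    """Pixel (x1, y1, x2, y2) -> normalized (x_center, y_center, width, height)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    out = np.empty_like(boxes)
    out[..., 0] = (boxes[..., 0] + boxes[..., 2]) / 2 / img_w
    out[..., 1] = (boxes[..., 1] + boxes[..., 3]) / 2 / img_h
    out[..., 2] = (boxes[..., 2] - boxes[..., 0]) / img_w
    out[..., 3] = (boxes[..., 3] - boxes[..., 1]) / img_h
    return out


def xywhn_to_xyxy(boxes, img_w, img_h):
    """Normalized (x_center, y_center, width, height) -> pixel (x1, y1, x2, y2)."""
    boxes = np.asarray(boxes, dtype=np.float64)
    out = np.empty_like(boxes)
    out[..., 0] = img_w * (boxes[..., 0] - boxes[..., 2] / 2)
    out[..., 1] = img_h * (boxes[..., 1] - boxes[..., 3] / 2)
    out[..., 2] = img_w * (boxes[..., 0] + boxes[..., 2] / 2)
    out[..., 3] = img_h * (boxes[..., 1] + boxes[..., 3] / 2)
    return out


pixel_boxes = np.array([[100.0, 50.0, 300.0, 150.0]])
normalized = xyxy_to_xywhn(pixel_boxes, img_w=400, img_h=200)
print(normalized)  # [[0.5 0.5 0.5 0.5]]
```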

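Plotting

Drawing Annotations

Ultralytics includes an Annotator class that can be used to annotate any kind of data. It is easiest to use with object detection bounding boxes, pose keypoints, and oriented bounding boxes.

Horizontal Bounding Boxes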

import cv2 as cv
import numpy as np

from ultralytics.utils.plotting import Annotator, colors

names = {  # (1)!
    0: "person",
    5: "bus",
    11: "stop sign",
}

image = cv.imread("ultralytics/assets/bus.jpg")
ann = Annotator(
    image,
    line_width=None,  # default auto-size
    font_size=None,  # default auto-size
    font="Arial.ttf",  # must be ImageFont compatible
    pil=False,  # use PIL, otherwise uses OpenCV
)

xyxy_boxes = np.array(
    [
        [5, 22.878, 231.27, 804.98, 756.83],  # class-idx x1 y1 x2 y2
        [0, 48.552, 398.56, 245.35, 902.71],
        [0, 669.47, 392.19, 809.72, 877.04],
        [0, 221.52, 405.8, 344.98, 857.54],
        [0, 0, 550.53, 63.01, 873.44],
        [11, 0.0584, 254.46, 32.561, 324.87],
    ]
)

for nb, box in enumerate(xyxy_boxes):
    c_idx, *box = box
    label = f"{str(nb).zfill(2)}:{names.get(int(c_idx))}"
    ann.box_label(box, label, color=colors(c_idx, bgr=True))

image_with_bboxes = ann.result()

(1) Names can be taken from model.names when working with detection results.

Oriented Bounding Boxes (OBB)

import cv2 as cv
import numpy as np

from ultralytics.utils.plotting import Annotator, colors

obb_names = {10: "small vehicle"}
obb_image = cv.imread("datasets/dota8/images/train/P1142__1024__0___824.jpg")
obb_boxes = np.array(
    [
        [0, 635, 560, 919, 719, 1087, 420, 803, 261],  # class-idx x1 y1 x2 y2 x3 y3 x4 y4
        [0, 331, 19, 493, 260, 776, 70, 613, -171],
        [9, 869, 161, 886, 147, 851, 101, 833, 115],
    ]
)
ann = Annotator(
    obb_image,
    line_width=None,  # default auto-size
    font_size=None,  # default auto-size
    font="Arial.ttf",  # must be ImageFont compatible
    pil=False,  # use PIL, otherwise uses OpenCV
)
for obb in obb_boxes:
    c_idx, *obb = obb
    obb = np.array(obb).reshape(-1, 4, 2).squeeze()
    label = f"{obb_names.get(int(c_idx))}"
    ann.box_label(
        obb,
        label,
        color=colors(c_idx, True),
        rotated=True,
    )

image_with_obb = ann.result()

Bounding Boxes Circle Annotation (Circle Label)

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator

model = YOLO("yolo11s.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
writer = cv2.VideoWriter("Ultralytics circle annotation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))

while True:
    ret, im0 = cap.read()
    if not ret:
        break

    annotator = Annotator(im0)
    results = model.predict(im0)
    boxes = results[0].boxes.xyxy.cpu()
    clss = results[0].boxes.cls.cpu().tolist()

    for box, cls in zip(boxes, clss):
        annotator.circle_label(box, label=names[int(cls)])

    writer.write(im0)
    cv2.imshow("Ultralytics circle annotation", im0)

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

writer.release()
cap.release()
cv2.destroyAllWindows()

Bounding Boxes Text Annotation (Text Label)

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator

model = YOLO("yolo11s.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
writer = cv2.VideoWriter("Ultralytics text annotation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))

while True:
    ret, im0 = cap.read()
    if not ret:
        break

    annotator = Annotator(im0)
    results = model.predict(im0)
    boxes = results[0].boxes.xyxy.cpu()
    clss = results[0].boxes.cls.cpu().tolist()

    for box, cls in zip(boxes, clss):
        annotator.text_label(box, label=names[int(cls)])

    writer.write(im0)
    cv2.imshow("Ultralytics text annotation", im0)

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

writer.release()
cap.release()
cv2.destroyAllWindows()

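See the Annotator reference page for additional insight.

Miscellaneous

Code Profiling

Check how long code takes to run or process, using Profile either in a with statement or as a decorator.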

from ultralytics.utils.ops import Profile

with Profile(device="cuda:0") as dt:
    pass  # operation to measure

print(dt)
# >>> "Elapsed time is 9.5367431640625e-07 s"
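
One way to build a timer that works both in a with statement and as a decorator is to subclass `contextlib.ContextDecorator` from the standard library. The sketch below is a CPU-only illustration under that assumption. `Timer` is a hypothetical class, and it does not handle GPU synchronization.

```python
import contextlib
import time


class Timer(contextlib.ContextDecorator):
    """Accumulate wall-clock time; usable as a context manager or a decorator."""

    def __init__(self):
        self.elapsed = 0.0

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed += time.perf_counter() - self._start
        return False  # never swallow exceptions

    def __str__(self):
        return f"Elapsed time is {self.elapsed} s"


timer = Timer()

with timer:  # used as a context manager
    sum(range(100_000))


@timer  # used as a decorator: every call adds to timer.elapsed
def work():
    return sum(range(100_000))


result = work()
print(timer)
```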

Ultralytics Supported Formats

Want or need to use the image or video formats supported by Ultralytics programmatically? Use these constants if you need them.

from ultralytics.data.utils import IMG_FORMATS, VID_FORMATS

print(IMG_FORMATS)
# {'tiff', 'pfm', 'bmp', 'mpo', 'dng', 'jpeg', 'png', 'webp', 'tif', 'jpg'}

print(VID_FORMATS)
# {'avi', 'mpg', 'wmv', 'mpeg', 'm4v', 'mov', 'mp4', 'asf', 'mkv', 'ts', 'gif', 'webm'}
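
A typical use for such a set of extensions is filtering a file listing. The sketch below uses its own `IMAGE_SUFFIXES` set, an illustrative subset copied from the output above, so that it runs without the package installed. `is_image` and the file names are hypothetical.

```python
from pathlib import Path

# Illustrative subset of the image extensions printed above.
IMAGE_SUFFIXES = {"bmp", "jpeg", "jpg", "png", "tif", "tiff", "webp"}


def is_image(path):
    """True if the file extension (case-insensitive) is a known image format."""
    return Path(path).suffix[1:].lower() in IMAGE_SUFFIXES


files = ["a.JPG", "b.png", "clip.mp4", "notes.txt"]
images = [f for f in files if is_image(f)]
print(images)  # ['a.JPG', 'b.png']
```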

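Make Divisible

Calculates the nearest whole number at or above x that is evenly divisible by y; as the examples show, the result rounds up.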

from ultralytics.utils.ops import make_divisible

make_divisible(7, 3)
# >>> 9
make_divisible(7, 2)
# >>> 8
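
The two outputs above (9 and 8 for an input of 7) are consistent with rounding up to the next multiple of the divisor. A minimal sketch of that arithmetic, assuming plain numeric inputs. `round_up_to_multiple` is a hypothetical name, not the library function.

```python
import math


def round_up_to_multiple(x, divisor):
    """Smallest multiple of divisor that is greater than or equal to x."""
    return math.ceil(x / divisor) * divisor


print(round_up_to_multiple(7, 3))  # 9
print(round_up_to_multiple(7, 2))  # 8
```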

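FAQ

What utilities are included in the Ultralytics package to enhance machine learning workflows?

The Ultralytics package includes a variety of utilities designed to streamline and optimize machine learning workflows. Key utilities include auto-annotation for labeling datasets, converting COCO to YOLO format with convert_coco, compressing images, and dataset auto-splitting. These tools aim to reduce manual effort, ensure consistency, and improve data processing efficiency.

How can I use Ultralytics to auto-label my dataset?

If you have a pre-trained Ultralytics YOLO object detection model, you can use it with the SAM model to auto-annotate your dataset in segmentation format. Here is an example: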

from ultralytics.data.annotator import auto_annotate

auto_annotate(
    data="path/to/new/data",
    det_model="yolo11n.pt",
    sam_model="mobile_sam.pt",
    device="cuda",
    output_dir="path/to/save_labels",
)

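For more details, check the auto_annotate reference section.

How do I convert COCO dataset annotations to YOLO format in Ultralytics?

To convert COCO JSON annotations into YOLO format for object detection, you can use the convert_coco utility. Here is a sample code snippet: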

from ultralytics.data.converter import convert_coco

convert_coco(
    "../datasets/coco/annotations/",
    use_segments=False,
    use_keypoints=False,
    cls91to80=True,
)

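For additional information, visit the convert_coco reference page.

What is the purpose of the YOLO Data Explorer in the Ultralytics package?

The YOLO Explorer is a powerful tool introduced in the 8.1.0 update to enhance dataset understanding. It lets you use text queries to find object instances in your dataset, making it easier to analyze and manage your data. The tool provides valuable insight into dataset composition and distribution, which helps improve model training and performance.

How can I convert bounding boxes to segments in Ultralytics?

To convert existing bounding box data (in x y w h format) to segments, you can use the yolo_bbox2segment function. Make sure your files are organized with separate directories for images and labels.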

from ultralytics.data.converter import yolo_bbox2segment

yolo_bbox2segment(
    im_dir="path/to/images",
    save_dir=None,  # saved to "labels-segment" in the images directory
    sam_model="sam_b.pt",
)

For more information, visit the yolo_bbox2segment reference page.
