简单实用工具
ultralytics 软件包提供了多种实用工具,旨在支持、增强并加速你的工作流程。虽然还有更多可用功能,但本指南重点介绍了对开发者最有用的一些工具,可作为你使用 Ultralytics 工具进行编程时的实用参考。
Watch: Ultralytics Utilities | Auto Annotation, Explorer API and Dataset Conversion
数据
自动标注 / 注释
数据集标注 是一个资源密集且耗时的过程。如果你有一个基于适量数据训练过的 Ultralytics YOLO 目标检测模型,你可以将其与 SAM 配合使用,以分割格式自动标注更多数据。
from ultralytics.data.annotator import auto_annotate
auto_annotate(
data="path/to/new/data",
det_model="yolo26n.pt",
sam_model="mobile_sam.pt",
device="cuda",
output_dir="path/to/save_labels",
)该函数不返回任何值。有关详细信息,请参阅:
- 请参阅
annotator.auto_annotate参考章节,以深入了解该函数的运作方式。 - 结合 函数
segments2boxes使用,也可以生成目标检测边界框。
可视化数据集标注
此函数在训练前将 YOLO 标注可视化显示在图像上,有助于识别和纠正任何可能导致错误检测结果的不当标注。它会绘制边界框,并使用类别名称标注对象,同时根据背景亮度调整文本颜色,以提高可读性。
from ultralytics.data.utils import visualize_image_annotations
label_map = { # Define the label map with all annotated class labels.
0: "person",
1: "car",
}
# Visualize
visualize_image_annotations(
"path/to/image.jpg", # Input image path.
"path/to/annotations.txt", # Annotation file path for the image.
label_map,
)将分割掩码转换为 YOLO 格式

使用此功能可将分割掩码图像数据集转换为 Ultralytics YOLO 分割格式。此函数会读取包含二进制格式掩码图像的目录,并将其转换为 YOLO 分割格式。
转换后的掩码将保存在指定的输出目录中。
from ultralytics.data.converter import convert_segment_masks_to_yolo_seg
# The classes here is the total classes in the dataset.
# for COCO dataset we have 80 classes.
convert_segment_masks_to_yolo_seg(masks_dir="path/to/masks_dir", output_dir="path/to/output_dir", classes=80)将 COCO 转换为 YOLO 格式
使用此功能可将 COCO JSON 标注转换为 YOLO 格式。对于目标检测(边界框)数据集,请将 use_segments 和 use_keypoints 均设为 False。
from ultralytics.data.converter import convert_coco
convert_coco(
"coco/annotations/",
use_segments=False,
use_keypoints=False,
cls91to80=True,
)有关 convert_coco 函数的更多信息,请访问参考页面。
获取边界框维度
import cv2
from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator
model = YOLO("yolo26n.pt") # Load pretrain or fine-tune model
# Process the image
source = cv2.imread("path/to/image.jpg")
results = model(source)
# Extract results
annotator = Annotator(source, example=model.names)
for box in results[0].boxes.xyxy.cpu():
width, height, area = annotator.get_bbox_dimension(box)
print(f"Bounding Box Width {width.item()}, Height {height.item()}, Area {area.item()}")将边界框转换为分割
有了现有的 x y w h 边界框数据,你可以使用 yolo_bbox2segment 函数将其转换为分割。请按以下方式组织图像和标注文件:
data
|__ images
├─ 001.jpg
├─ 002.jpg
├─ ..
└─ NNN.jpg
|__ labels
├─ 001.txt
├─ 002.txt
├─ ..
└─ NNN.txtfrom ultralytics.data.converter import yolo_bbox2segment
yolo_bbox2segment(
im_dir="path/to/images",
save_dir=None, # saved to "labels-segment" in images directory
sam_model="sam_b.pt",
)请访问 yolo_bbox2segment 参考页面以获取有关该函数的更多信息。
将分割转换为边界框
如果你有一个使用 分割数据集格式 的数据集,可以使用此函数轻松将其转换为直立(或水平)边界框(x y w h 格式)。
import numpy as np
from ultralytics.utils.ops import segments2boxes
segments = np.array(
[
[805, 392, 797, 400, ..., 808, 714, 808, 392],
[115, 398, 113, 400, ..., 150, 400, 149, 298],
[267, 412, 265, 413, ..., 300, 413, 299, 412],
]
)
segments2boxes([s.reshape(-1, 2) for s in segments])
# >>> array([[ 741.66, 631.12, 133.31, 479.25],
# [ 146.81, 649.69, 185.62, 502.88],
# [ 281.81, 636.19, 118.12, 448.88]],
# dtype=float32) # xywh bounding boxes要了解该函数的工作原理,请访问 参考页面。
实用工具
图像压缩
将单个图像文件压缩为较小尺寸,同时保持其宽高比和质量。如果输入图像小于最大尺寸,则不会调整大小。
from pathlib import Path
from ultralytics.data.utils import compress_one_image
for f in Path("path/to/dataset").rglob("*.jpg"):
compress_one_image(f)自动拆分数据集
自动将数据集拆分为 train/val/test 集合,并将拆分结果保存到 autosplit_*.txt 文件中。此函数使用随机采样,在使用 fraction 参数进行训练时此部分会被排除。
from ultralytics.data.split import autosplit
autosplit(
path="path/to/images",
weights=(0.9, 0.1, 0.0), # (train, validation, test) fractional splits
annotated_only=False, # split only images with annotation file when True
)有关此函数的详细信息,请参阅 参考页面。
将多边形分割转换为二进制掩码
将单个多边形(作为列表)转换为指定图像大小的二进制掩码。多边形应采用 [N, 2] 的形式,其中 N 是定义多边形轮廓的 (x, y) 点数。
N 必须始终 为偶数。
import numpy as np
from ultralytics.data.utils import polygon2mask
imgsz = (1080, 810)
polygon = np.array([805, 392, 797, 400, ..., 808, 714, 808, 392]) # (238, 2)
mask = polygon2mask(
imgsz, # tuple
[polygon], # input as list
color=255, # 8-bit binary
downsample_ratio=1,
)边界框
边界框(水平)实例
为了管理边界框数据,Bboxes 类有助于在各种框坐标格式之间进行转换、缩放框尺寸、计算面积、包含偏移量等。
import numpy as np
from ultralytics.utils.instance import Bboxes
boxes = Bboxes(
bboxes=np.array(
[
[22.878, 231.27, 804.98, 756.83],
[48.552, 398.56, 245.35, 902.71],
[669.47, 392.19, 809.72, 877.04],
[221.52, 405.8, 344.98, 857.54],
[0, 550.53, 63.01, 873.44],
[0.0584, 254.46, 32.561, 324.87],
]
),
format="xyxy",
)
boxes.areas()
# >>> array([ 4.1104e+05, 99216, 68000, 55772, 20347, 2288.5])
boxes.convert("xywh")
print(boxes.bboxes)
# >>> array(
# [[ 413.93, 494.05, 782.1, 525.56],
# [ 146.95, 650.63, 196.8, 504.15],
# [ 739.6, 634.62, 140.25, 484.85],
# [ 283.25, 631.67, 123.46, 451.74],
# [ 31.505, 711.99, 63.01, 322.91],
# [ 16.31, 289.67, 32.503, 70.41]]
# )请参阅 Bboxes 参考章节以了解更多属性和方法。
以下许多功能(以及更多功能)都可以通过 Bboxes 类访问,但如果你更喜欢直接使用函数,请参阅接下来的子章节了解如何独立导入它们。
缩放边界框
当放大或缩小图像时,你可以使用 ultralytics.utils.ops.scale_boxes 相应地缩放对应的边界框坐标以进行匹配。
import cv2 as cv
import numpy as np
from ultralytics.utils.ops import scale_boxes
image = cv.imread("ultralytics/assets/bus.jpg")
h, w, c = image.shape
resized = cv.resize(image, None, (), fx=1.2, fy=1.2)
new_h, new_w, _ = resized.shape
xyxy_boxes = np.array(
[
[22.878, 231.27, 804.98, 756.83],
[48.552, 398.56, 245.35, 902.71],
[669.47, 392.19, 809.72, 877.04],
[221.52, 405.8, 344.98, 857.54],
[0, 550.53, 63.01, 873.44],
[0.0584, 254.46, 32.561, 324.87],
]
)
new_boxes = scale_boxes(
img1_shape=(h, w), # original image dimensions
boxes=xyxy_boxes, # boxes from original image
img0_shape=(new_h, new_w), # resized image dimensions (scale to)
ratio_pad=None,
padding=False,
xywh=False,
)
print(new_boxes)
# >>> array(
# [[ 27.454, 277.52, 965.98, 908.2],
# [ 58.262, 478.27, 294.42, 1083.3],
# [ 803.36, 470.63, 971.66, 1052.4],
# [ 265.82, 486.96, 413.98, 1029],
# [ 0, 660.64, 75.612, 1048.1],
# [ 0.0701, 305.35, 39.073, 389.84]]
# )边界框格式转换
XYXY → XYWH
将边界框坐标从 (x1, y1, x2, y2) 格式转换为 (x, y, width, height) 格式,其中 (x1, y1) 为左上角,(x2, y2) 为右下角。
import numpy as np
from ultralytics.utils.ops import xyxy2xywh
xyxy_boxes = np.array(
[
[22.878, 231.27, 804.98, 756.83],
[48.552, 398.56, 245.35, 902.71],
[669.47, 392.19, 809.72, 877.04],
[221.52, 405.8, 344.98, 857.54],
[0, 550.53, 63.01, 873.44],
[0.0584, 254.46, 32.561, 324.87],
]
)
xywh = xyxy2xywh(xyxy_boxes)
print(xywh)
# >>> array(
# [[ 413.93, 494.05, 782.1, 525.56],
# [ 146.95, 650.63, 196.8, 504.15],
# [ 739.6, 634.62, 140.25, 484.85],
# [ 283.25, 631.67, 123.46, 451.74],
# [ 31.505, 711.99, 63.01, 322.91],
# [ 16.31, 289.67, 32.503, 70.41]]
# )所有边界框转换
from ultralytics.utils.ops import (
ltwh2xywh,
ltwh2xyxy,
xywh2ltwh, # xywh → top-left corner, w, h
xywh2xyxy,
xywhn2xyxy, # normalized → pixel
xyxy2ltwh, # xyxy → top-left corner, w, h
xyxy2xywhn, # pixel → normalized
)
for func in (ltwh2xywh, ltwh2xyxy, xywh2ltwh, xywh2xyxy, xywhn2xyxy, xyxy2ltwh, xyxy2xywhn):
print(help(func)) # print function docstrings请查看每个函数的文档字符串或访问 ultralytics.utils.ops 参考页面以阅读更多信息。
绘图
标注实用工具
Ultralytics 包含一个用于标注各类数据类型的 Annotator 类。它最适合与 目标检测边界框、姿态关键点 和 旋转边界框 配合使用。
框标注
import cv2 as cv
import numpy as np
from ultralytics.utils.plotting import Annotator, colors
names = {
0: "person",
5: "bus",
11: "stop sign",
}
image = cv.imread("ultralytics/assets/bus.jpg")
ann = Annotator(
image,
line_width=None, # default auto-size
font_size=None, # default auto-size
font="Arial.ttf", # must be ImageFont compatible
pil=False, # use PIL, otherwise uses OpenCV
)
xyxy_boxes = np.array(
[
[5, 22.878, 231.27, 804.98, 756.83], # class-idx x1 y1 x2 y2
[0, 48.552, 398.56, 245.35, 902.71],
[0, 669.47, 392.19, 809.72, 877.04],
[0, 221.52, 405.8, 344.98, 857.54],
[0, 0, 550.53, 63.01, 873.44],
[11, 0.0584, 254.46, 32.561, 324.87],
]
)
for nb, box in enumerate(xyxy_boxes):
c_idx, *box = box
label = f"{str(nb).zfill(2)}:{names.get(int(c_idx))}"
ann.box_label(box, label, color=colors(c_idx, bgr=True))
image_with_bboxes = ann.result()Names can be used from model.names when working with detection results.
Also see the Annotator Reference Page for additional insight.
Ultralytics 扫描标注
import cv2
import numpy as np
from ultralytics import YOLO
from ultralytics.solutions.solutions import SolutionAnnotator
from ultralytics.utils.plotting import colors
# User defined video path and model file
cap = cv2.VideoCapture("path/to/video.mp4")
model = YOLO(model="yolo26s-seg.pt") # Model file, e.g., yolo26s.pt or yolo26m-seg.pt
if not cap.isOpened():
print("Error: Could not open video.")
exit()
# Initialize the video writer object.
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
video_writer = cv2.VideoWriter("ultralytics.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
masks = None # Initialize variable to store masks data
f = 0 # Initialize frame count variable for enabling mouse event.
line_x = w # Store width of line.
dragging = False # Initialize bool variable for line dragging.
classes = model.names # Store model classes names for plotting.
window_name = "Ultralytics Sweep Annotator"
def drag_line(event, x, _, flags, param):
"""Mouse callback function to enable dragging a vertical sweep line across the video frame."""
global line_x, dragging
if event == cv2.EVENT_LBUTTONDOWN or (flags & cv2.EVENT_FLAG_LBUTTON):
line_x = max(0, min(x, w))
dragging = True
while cap.isOpened(): # Loop over the video capture object.
ret, im0 = cap.read()
if not ret:
break
f = f + 1 # Increment frame count.
count = 0 # Re-initialize count variable on every frame for precise counts.
results = model.track(im0, persist=True)[0]
if f == 1:
cv2.namedWindow(window_name)
cv2.setMouseCallback(window_name, drag_line)
annotator = SolutionAnnotator(im0)
if results.boxes.is_track:
if results.masks is not None:
masks = [np.array(m, dtype=np.int32) for m in results.masks.xy]
boxes = results.boxes.xyxy.tolist()
track_ids = results.boxes.id.int().cpu().tolist()
clss = results.boxes.cls.cpu().tolist()
for mask, box, cls, t_id in zip(masks or [None] * len(boxes), boxes, clss, track_ids):
color = colors(t_id, True) # Assign different color to each tracked object.
label = f"{classes[cls]}:{t_id}"
if mask is not None and mask.size > 0:
if box[0] > line_x:
count += 1
cv2.polylines(im0, [mask], True, color, 2)
x, y = mask.min(axis=0)
(w_m, _), _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1)
cv2.rectangle(im0, (x, y - 20), (x + w_m, y), color, -1)
cv2.putText(im0, label, (x, y - 5), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (255, 255, 255), 1)
else:
if box[0] > line_x:
count += 1
annotator.box_label(box=box, color=color, label=label)
# Generate draggable sweep line
annotator.sweep_annotator(line_x=line_x, line_y=h, label=f"COUNT:{count}")
cv2.imshow(window_name, im0)
video_writer.write(im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
# Release the resources
cap.release()
video_writer.release()
cv2.destroyAllWindows()Find additional details about the sweep_annotator method in our reference section here.
自适应标签标注
从 Ultralytics v8.3.167 开始,circle_label 和 text_label 已被统一的 adaptive_label 函数取代。你现在可以使用 shape 参数指定标注类型:
- 矩形:
annotator.adaptive_label(box, label=names[int(cls)], color=colors(cls, True), shape="rect") - 圆形:
annotator.adaptive_label(box, label=names[int(cls)], color=colors(cls, True), shape="circle")
Watch: In-Depth Guide to Text & Circle Annotations with Python Live Demos | Ultralytics Annotations 🚀
import cv2
from ultralytics import YOLO
from ultralytics.solutions.solutions import SolutionAnnotator
from ultralytics.utils.plotting import colors
model = YOLO("yolo26s.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video.mp4")
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
writer = cv2.VideoWriter("Ultralytics circle annotation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))
while True:
ret, im0 = cap.read()
if not ret:
break
annotator = SolutionAnnotator(im0)
results = model.predict(im0)[0]
boxes = results.boxes.xyxy.cpu()
clss = results.boxes.cls.cpu().tolist()
for box, cls in zip(boxes, clss):
annotator.adaptive_label(box, label=names[int(cls)], color=colors(cls, True), shape="circle")
writer.write(im0)
cv2.imshow("Ultralytics circle annotation", im0)
if cv2.waitKey(1) & 0xFF == ord("q"):
break
writer.release()
cap.release()
cv2.destroyAllWindows()有关更多见解,请参阅 SolutionAnnotator 参考页面。
杂项
代码分析 (Profiling)
使用 with 或作为装饰器检查代码运行/处理的时长。
from ultralytics.utils.ops import Profile
with Profile(device="cuda:0") as dt:
pass # operation to measure
print(dt)
# >>> "Elapsed time is 9.5367431640625e-07 s"Ultralytics 支持的格式
需要以编程方式使用 Ultralytics 中支持的 图像或视频格式?如果需要,请使用这些常量:
from ultralytics.data.utils import IMG_FORMATS, VID_FORMATS
print(IMG_FORMATS)
# {'avif', 'bmp', 'dng', 'heic', 'heif', 'jp2', 'jpeg', 'jpeg2000', 'jpg', 'mpo', 'png', 'tif', 'tiff', 'webp'}
print(VID_FORMATS)
# {'asf', 'avi', 'gif', 'm4v', 'mkv', 'mov', 'mp4', 'mpeg', 'mpg', 'ts', 'wmv', 'webm'}使可被整除
计算最接近 x 且能被 y 整除的整数。
from ultralytics.utils.ops import make_divisible
make_divisible(7, 3)
# >>> 9
make_divisible(7, 2)
# >>> 8常见问题 (FAQ)
Ultralytics 软件包中包含哪些用于增强机器学习工作流程的实用工具?
The Ultralytics package includes utilities designed to streamline and optimize machine learning workflows. Key utilities include auto-annotation for labeling datasets, converting COCO to YOLO format with convert_coco, compressing images, and dataset auto-splitting. These tools reduce manual effort, ensure consistency, and enhance data processing efficiency.
我该如何使用 Ultralytics 自动标注我的数据集?
如果你有预训练的 Ultralytics YOLO 目标检测模型,可以将其与 SAM 模型配合使用,以分割格式自动标注数据集。示例如下:
from ultralytics.data.annotator import auto_annotate
auto_annotate(
data="path/to/new/data",
det_model="yolo26n.pt",
sam_model="mobile_sam.pt",
device="cuda",
output_dir="path/to/save_labels",
)有关更多详细信息,请查看 auto_annotate 参考章节,或者使用 Ultralytics Platform 作为托管的无代码替代方案,通过 SAM 2.1 或 SAM 3 进行基于点击的掩码标注,或者获取预训练和微调后的 YOLO 模型在检测、分割和 OBB 任务上的预测结果。
如何在 Ultralytics 中将 COCO 数据集标注转换为 YOLO 格式?
要将 COCO JSON 标注转换为目标检测的 YOLO 格式,你可以使用 convert_coco 实用工具。以下是代码示例:
from ultralytics.data.converter import convert_coco
convert_coco(
"coco/annotations/",
use_segments=False,
use_keypoints=False,
cls91to80=True,
)有关更多信息,请访问 convert_coco 参考页面。
如何分析我的数据集组成和分布?
Ultralytics Platform 提供自动数据集分析功能:Charts 选项卡显示拆分分布、顶级类别计数、图像尺寸直方图以及标注位置的 2D 热图,帮助你在训练前发现不平衡和异常值。
如何将边界框转换为 Ultralytics 中的分割?
要将现有的边界框数据(x y w h 格式)转换为分割,你可以使用 yolo_bbox2segment 函数。请确保你的文件已组织好,并且图像和标签存放在不同的目录中。
from ultralytics.data.converter import yolo_bbox2segment
yolo_bbox2segment(
im_dir="path/to/images",
save_dir=None, # saved to "labels-segment" in the images directory
sam_model="sam_b.pt",
)有关更多信息,请访问 yolo_bbox2segment 参考页面。