μ½˜ν…μΈ λ‘œ κ±΄λ„ˆλ›°κΈ°

Simple Utilities


The ultralytics package comes with a myriad of utilities that can support, enhance, and speed up your workflows. There are many more available, but here are some that will be useful for most developers. They're also a great reference point to use when learning to program.



Watch: Ultralytics Utilities | Auto Annotation, Explorer API and Dataset Conversion

Data

μžλ™ 라벨링/주석

데이터 μ„ΈνŠΈ μ–΄λ…Έν…Œμ΄μ…˜μ€ λ¦¬μ†ŒμŠ€μ™€ μ‹œκ°„μ΄ 많이 μ†Œμš”λ˜λŠ” ν”„λ‘œμ„ΈμŠ€μž…λ‹ˆλ‹€. μ μ ˆν•œ μ–‘μ˜ λ°μ΄ν„°λ‘œ ν•™μŠ΅λœ YOLO 개체 감지 λͺ¨λΈμ΄ μžˆλŠ” 경우, 이λ₯Ό μ‚¬μš©ν•˜κ³  SAM λ₯Ό μ‚¬μš©ν•˜μ—¬ μΆ”κ°€ 데이터(μ„ΈλΆ„ν™” ν˜•μ‹)에 μžλ™ 주석을 달 수 μžˆμŠ΅λ‹ˆλ‹€.

from ultralytics.data.annotator import auto_annotate

auto_annotate(
    data="path/to/new/data",
    det_model="yolo11n.pt",
    sam_model="mobile_sam.pt",
    device="cuda",
    output_dir="path/to/save_labels",
)

This function does not return any value. For further details on how the function operates, see the auto_annotate reference section.

Visualize Dataset Annotations

This function visualizes YOLO annotations on an image before training, helping to identify and correct any wrong annotations that could lead to incorrect detection results. It draws bounding boxes, labels objects with class names, and adjusts text color based on the background's luminance for better readability.

from ultralytics.data.utils import visualize_image_annotations

label_map = {  # Define the label map with all annotated class labels.
    0: "person",
    1: "car",
}

# Visualize
visualize_image_annotations(
    "path/to/image.jpg",  # Input image path.
    "path/to/annotations.txt",  # Annotation file path for the image.
    label_map,
)

Convert Segmentation Masks into YOLO Format

μ„ΈλΆ„ν™” 마슀크 μ΄λ―Έμ§€μ˜ 데이터 μ„ΈνŠΈλ₯Ό YOLO μ„ΈλΆ„ν™” ν˜•μ‹μ„ μ‚¬μš©ν•©λ‹ˆλ‹€. 이 ν•¨μˆ˜λŠ” λ°”μ΄λ„ˆλ¦¬ ν˜•μ‹μ˜ 마슀크 이미지가 ν¬ν•¨λœ 디렉토리λ₯Ό 가져와 YOLO μ„Έκ·Έλ¨Όν…Œμ΄μ…˜ ν˜•μ‹μœΌλ‘œ λ³€ν™˜ν•©λ‹ˆλ‹€.

λ³€ν™˜λœ λ§ˆμŠ€ν¬λŠ” μ§€μ •λœ 좜λ ₯ 디렉터리에 μ €μž₯λ©λ‹ˆλ‹€.

from ultralytics.data.converter import convert_segment_masks_to_yolo_seg

# The `classes` argument is the total number of classes in the dataset;
# for the COCO dataset there are 80 classes.
convert_segment_masks_to_yolo_seg(masks_dir="path/to/masks_dir", output_dir="path/to/output_dir", classes=80)
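For context, each input mask image encodes class membership in its pixel values: 0 marks background and pixel value i marks pixels belonging to class index i. A tiny synthetic illustration of that convention (toy data, not part of the API):

import numpy as np

# A 4x4 toy mask (hypothetical): 0 = background, 1 = pixels of class index 1.
# Real mask images in masks_dir follow the same pixel-value convention.
toy_mask = np.array(
    [
        [0, 0, 1, 1],
        [0, 1, 1, 1],
        [0, 0, 1, 0],
        [0, 0, 0, 0],
    ]
)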

Convert COCO into YOLO Format

Use to convert COCO JSON annotations into the proper YOLO format. For object detection (bounding box) datasets, set both use_segments and use_keypoints to False.

from ultralytics.data.converter import convert_coco

convert_coco(  # (1)!
    "../datasets/coco/annotations/",
    use_segments=False,
    use_keypoints=False,
    cls91to80=True,
)
  1. 이 ν•¨μˆ˜μ—μ„œ λ°˜ν™˜λ˜λŠ” 것은 μ—†μŠ΅λ‹ˆλ‹€.

For additional information about the convert_coco function, visit the reference page.

Get Bounding Box Dimensions

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator

model = YOLO("yolo11n.pt")  # Load pretrained or fine-tuned model

# Process the image
source = cv2.imread("path/to/image.jpg")
results = model(source)

# Extract results
annotator = Annotator(source, example=model.names)

for box in results[0].boxes.xyxy.cpu():
    width, height, area = annotator.get_bbox_dimension(box)
    print(f"Bounding Box Width {width.item()}, Height {height.item()}, Area {area.item()}")

Convert Bounding Boxes to Segments

With existing x y w h bounding box data, convert to segments using the yolo_bbox2segment function. The files for images and annotations need to be organized like this:

data
|__ images
    ├─ 001.jpg
    ├─ 002.jpg
    ├─ ..
    └─ NNN.jpg
|__ labels
    ├─ 001.txt
    ├─ 002.txt
    ├─ ..
    └─ NNN.txt

from ultralytics.data.converter import yolo_bbox2segment

yolo_bbox2segment(  # (1)!
    im_dir="path/to/images",
    save_dir=None,  # saved to "labels-segment" in images directory
    sam_model="sam_b.pt",
)
  1. 이 ν•¨μˆ˜μ—μ„œ λ°˜ν™˜λ˜λŠ” 것은 μ—†μŠ΅λ‹ˆλ‹€.

Visit the yolo_bbox2segment reference page for more details regarding the function.

Convert Segments to Bounding Boxes

If you have a dataset that uses the segmentation dataset format, you can easily convert these segments into upright (or horizontal) bounding boxes (x y w h format) with this function.

import numpy as np

from ultralytics.utils.ops import segments2boxes

segments = np.array(
    [
        [805, 392, 797, 400, ..., 808, 714, 808, 392],
        [115, 398, 113, 400, ..., 150, 400, 149, 298],
        [267, 412, 265, 413, ..., 300, 413, 299, 412],
    ]
)

segments2boxes([s.reshape(-1, 2) for s in segments])
# >>> array([[ 741.66, 631.12, 133.31, 479.25],
#           [ 146.81, 649.69, 185.62, 502.88],
#           [ 281.81, 636.19, 118.12, 448.88]],
#           dtype=float32) # xywh bounding boxes

To understand how this function works, visit the reference page.

Utilities

Image Compression

Compresses a single image file to a reduced size while preserving its aspect ratio and quality. If the input image is smaller than the maximum dimension, it will not be resized.

from pathlib import Path

from ultralytics.data.utils import compress_one_image

for f in Path("path/to/dataset").rglob("*.jpg"):
    compress_one_image(f)  # (1)!
  1. 이 ν•¨μˆ˜μ—μ„œ λ°˜ν™˜λ˜λŠ” 것은 μ—†μŠ΅λ‹ˆλ‹€.

데이터 μ„ΈνŠΈ μžλ™ λΆ„ν• 

Automatically split a dataset into train/val/test splits and save the resulting splits into autosplit_*.txt files. This function uses random sampling, which is not included when using the fraction argument for training.

from ultralytics.data.utils import autosplit

autosplit(  # (1)!
    path="path/to/images",
    weights=(0.9, 0.1, 0.0),  # (train, validation, test) fractional splits
    annotated_only=False,  # split only images with annotation file when True
)
  1. 이 ν•¨μˆ˜μ—μ„œ λ°˜ν™˜λ˜λŠ” 것은 μ—†μŠ΅λ‹ˆλ‹€.

For additional details about this function, see the reference page.

μ„Έκ·Έλ¨ΌνŠΈ λ‹€κ°ν˜•μ„ λ°”μ΄λ„ˆλ¦¬ 마슀크둜 λ³€ν™˜ν•˜κΈ°

단일 λ‹€κ°ν˜•(λͺ©λ‘)을 μ§€μ •λœ 이미지 크기의 이진 마슀크둜 λ³€ν™˜ν•©λ‹ˆλ‹€. λ‹€μŒκ³Ό 같은 ν˜•νƒœμ˜ λ‹€κ°ν˜• [N, 2] 와 ν•¨κ»˜ N 의 수둜 (x, y) λ‹€κ°ν˜• μœ€κ³½μ„ μ •μ˜ν•˜λŠ” μ μž…λ‹ˆλ‹€.

Warning

N must always be even.

import numpy as np

from ultralytics.data.utils import polygon2mask

imgsz = (1080, 810)
polygon = np.array([805, 392, 797, 400, ..., 808, 714, 808, 392])  # (238, 2)

mask = polygon2mask(
    imgsz,  # tuple
    [polygon],  # input as list
    color=255,  # 8-bit binary
    downsample_ratio=1,
)
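The returned mask is a NumPy array of the requested image size; a quick sanity check before use (the values in the comment follow from the arguments above, assuming the mask is returned as uint8):

print(mask.shape, mask.dtype, mask.max())  # e.g. (1080, 810) uint8 255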

Bounding Boxes

Bounding Box (horizontal) Instances

To manage bounding box data, the Bboxes class will help with converting between box coordinate formats, scaling box dimensions, calculating areas, including offsets, and more!

import numpy as np

from ultralytics.utils.instance import Bboxes

boxes = Bboxes(
    bboxes=np.array(
        [
            [22.878, 231.27, 804.98, 756.83],
            [48.552, 398.56, 245.35, 902.71],
            [669.47, 392.19, 809.72, 877.04],
            [221.52, 405.8, 344.98, 857.54],
            [0, 550.53, 63.01, 873.44],
            [0.0584, 254.46, 32.561, 324.87],
        ]
    ),
    format="xyxy",
)

boxes.areas()
# >>> array([ 4.1104e+05,       99216,       68000,       55772,       20347,      2288.5])

boxes.convert("xywh")
print(boxes.bboxes)
# >>> array(
#     [[ 413.93, 494.05,  782.1, 525.56],
#      [ 146.95, 650.63,  196.8, 504.15],
#      [  739.6, 634.62, 140.25, 484.85],
#      [ 283.25, 631.67, 123.46, 451.74],
#      [ 31.505, 711.99,  63.01, 322.91],
#      [  16.31, 289.67, 32.503,  70.41]]
# )
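The class also covers the scaling and offsetting mentioned above; a minimal sketch, assuming the mul and add methods (which accept a scalar or a 4-tuple applied per coordinate):

# In-place scaling and offsetting (assumed Bboxes.mul / Bboxes.add signatures)
boxes.mul(scale=(1.2, 1.2, 1.2, 1.2))  # multiply each coordinate by a per-axis factor
boxes.add(offset=(10, 10, 10, 10))  # shift every box by 10 pixels in x and y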

See the Bboxes reference section for more attributes and methods available.

Tip

λ‹€μŒ κΈ°λŠ₯(및 κ·Έ 이상)은 λ‹€μŒμ„ μ‚¬μš©ν•˜μ—¬ μ•‘μ„ΈμŠ€ν•  수 μžˆμŠ΅λ‹ˆλ‹€. Bboxes 클래슀 ν•¨μˆ˜λ₯Ό 직접 μ‚¬μš©ν•˜λŠ” 것을 μ„ ν˜Έν•œλ‹€λ©΄ λ‹€μŒ ν•˜μœ„ μ„Ήμ…˜μ—μ„œ ν•¨μˆ˜λ₯Ό λ…λ¦½μ μœΌλ‘œ κ°€μ Έμ˜€λŠ” 방법을 μ°Έμ‘°ν•˜μ„Έμš”.

μŠ€μΌ€μΌλ§ λ°•μŠ€

When scaling an image up or down, the corresponding bounding box coordinates can be appropriately scaled to match using ultralytics.utils.ops.scale_boxes.

import cv2 as cv
import numpy as np

from ultralytics.utils.ops import scale_boxes

image = cv.imread("ultralytics/assets/bus.jpg")
h, w, c = image.shape
resized = cv.resize(image, None, fx=1.2, fy=1.2)
new_h, new_w, _ = resized.shape

xyxy_boxes = np.array(
    [
        [22.878, 231.27, 804.98, 756.83],
        [48.552, 398.56, 245.35, 902.71],
        [669.47, 392.19, 809.72, 877.04],
        [221.52, 405.8, 344.98, 857.54],
        [0, 550.53, 63.01, 873.44],
        [0.0584, 254.46, 32.561, 324.87],
    ]
)

new_boxes = scale_boxes(
    img1_shape=(h, w),  # original image dimensions
    boxes=xyxy_boxes,  # boxes from original image
    img0_shape=(new_h, new_w),  # resized image dimensions (scale to)
    ratio_pad=None,
    padding=False,
    xywh=False,
)

print(new_boxes)  # (1)!
# >>> array(
#     [[  27.454,  277.52,  965.98,   908.2],
#     [   58.262,  478.27,  294.42,  1083.3],
#     [   803.36,  470.63,  971.66,  1052.4],
#     [   265.82,  486.96,  413.98,    1029],
#     [        0,  660.64,  75.612,  1048.1],
#     [   0.0701,  305.35,  39.073,  389.84]]
# )
  1. Bounding boxes scaled for the new image size

λ°”μš΄λ”© λ°•μŠ€ ν˜•μ‹ λ³€ν™˜

XYXY → XYWH

Convert bounding box coordinates from (x1, y1, x2, y2) format to (x, y, width, height) format, where (x1, y1) is the top-left corner and (x2, y2) is the bottom-right corner.

import numpy as np

from ultralytics.utils.ops import xyxy2xywh

xyxy_boxes = np.array(
    [
        [22.878, 231.27, 804.98, 756.83],
        [48.552, 398.56, 245.35, 902.71],
        [669.47, 392.19, 809.72, 877.04],
        [221.52, 405.8, 344.98, 857.54],
        [0, 550.53, 63.01, 873.44],
        [0.0584, 254.46, 32.561, 324.87],
    ]
)
xywh = xyxy2xywh(xyxy_boxes)

print(xywh)
# >>> array(
#     [[ 413.93,  494.05,   782.1, 525.56],
#     [  146.95,  650.63,   196.8, 504.15],
#     [   739.6,  634.62,  140.25, 484.85],
#     [  283.25,  631.67,  123.46, 451.74],
#     [  31.505,  711.99,   63.01, 322.91],
#     [   16.31,  289.67,  32.503,  70.41]]
# )

All Bounding Box Conversions

from ultralytics.utils.ops import (
    ltwh2xywh,
    ltwh2xyxy,
    xywh2ltwh,  # xywh → top-left corner, w, h
    xywh2xyxy,
    xywhn2xyxy,  # normalized → pixel
    xyxy2ltwh,  # xyxy → top-left corner, w, h
    xyxy2xywhn,  # pixel → normalized
)

for func in (ltwh2xywh, ltwh2xyxy, xywh2ltwh, xywh2xyxy, xywhn2xyxy, xyxy2ltwh, xyxy2xywhn):
    print(help(func))  # print function docstrings
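As a concrete example, converting the first xywh box from the section above back to xyxy (the output follows directly from the arithmetic):

import numpy as np

from ultralytics.utils.ops import xywh2xyxy

xywh = np.array([[413.93, 494.05, 782.1, 525.56]])
print(xywh2xyxy(xywh))
# >>> array([[ 22.88, 231.27, 804.98, 756.83]])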

See the docstring for each function, or visit the ultralytics.utils.ops reference page to read more about each one.

Plotting

Drawing Annotations

Ultralytics μ—λŠ” λͺ¨λ“  μ’…λ₯˜μ˜ 데이터에 주석을 λ‹€λŠ” 데 μ‚¬μš©ν•  수 μžˆλŠ” Annotator ν΄λž˜μŠ€κ°€ ν¬ν•¨λ˜μ–΄ μžˆμŠ΅λ‹ˆλ‹€. 객체 감지 λ°”μš΄λ”© λ°•μŠ€, 포즈 ν‚€ 포인트, λ°©ν–₯μ„± λ°”μš΄λ”© λ°•μŠ€μ— κ°€μž₯ μ‰½κ²Œ μ‚¬μš©ν•  수 μžˆμŠ΅λ‹ˆλ‹€.

Ultralytics Sweep Annotation

Python usage example with YOLO11 🚀

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

# User defined video path and model file
cap = cv2.VideoCapture("path/to/video/file.mp4")
model = YOLO(model="yolo11s-seg.pt")  # Model file i.e. yolo11s.pt or yolo11m-seg.pt

if not cap.isOpened():
    print("Error: Could not open video.")
    exit()

# Initialize the video writer object.
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
video_writer = cv2.VideoWriter("ultralytics.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

masks = None  # Initialize variable to store masks data
f = 0  # Initialize frame count variable for enabling mouse event.
line_x = w  # Store width of line.
dragging = False  # Initialize bool variable for line dragging.
classes = model.names  # Store model classes names for plotting.
window_name = "Ultralytics Sweep Annotator"


def drag_line(event, x, y, flags, param):  # Mouse callback for dragging line.
    global line_x, dragging
    if event == cv2.EVENT_LBUTTONDOWN or (flags & cv2.EVENT_FLAG_LBUTTON):
        line_x = max(0, min(x, w))
        dragging = True


while cap.isOpened():  # Loop over the video capture object.
    ret, im0 = cap.read()
    if not ret:
        break
    f = f + 1  # Increment frame count.
    count = 0  # Re-initialize count variable on every frame for precise counts.
    annotator = Annotator(im0)
    results = model.track(im0, persist=True)  # Track objects using track method.
    if f == 1:
        cv2.namedWindow(window_name)
        cv2.setMouseCallback(window_name, drag_line)

    if results[0].boxes.id is not None:
        if results[0].masks is not None:
            masks = results[0].masks.xy
        track_ids = results[0].boxes.id.int().cpu().tolist()
        clss = results[0].boxes.cls.cpu().tolist()
        boxes = results[0].boxes.xyxy.cpu()

        for mask, box, cls, t_id in zip(masks or [None] * len(boxes), boxes, clss, track_ids):
            color = colors(t_id, True)  # Assign different color to each tracked object.
            if mask is not None and mask.size > 0:
                # If you want to overlay the masks
                # mask[:, 0] = np.clip(mask[:, 0], line_x, w)
                # mask_img = cv2.fillPoly(im0.copy(), [mask.astype(int)], color)
                # cv2.addWeighted(mask_img, 0.5, im0, 0.5, 0, im0)

                if box[0] > line_x:
                    count += 1
                    annotator.seg_bbox(mask=mask, mask_color=color, label=str(classes[cls]))
            else:
                if box[0] > line_x:
                    count += 1
                    annotator.box_label(box=box, color=color, label=str(classes[cls]))

    annotator.sweep_annotator(line_x=line_x, line_y=h, label=f"COUNT:{count}")  # Display the sweep
    cv2.imshow(window_name, im0)
    video_writer.write(im0)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()  # Release the video capture.
video_writer.release()  # Release the video writer.
cv2.destroyAllWindows()  # Destroy all opened windows.

μˆ˜ν‰ λ°”μš΄λ”© λ°•μŠ€

import cv2 as cv
import numpy as np

from ultralytics.utils.plotting import Annotator, colors

names = {  # (1)!
    0: "person",
    5: "bus",
    11: "stop sign",
}

image = cv.imread("ultralytics/assets/bus.jpg")
ann = Annotator(
    image,
    line_width=None,  # default auto-size
    font_size=None,  # default auto-size
    font="Arial.ttf",  # must be ImageFont compatible
    pil=False,  # use PIL, otherwise uses OpenCV
)

xyxy_boxes = np.array(
    [
        [5, 22.878, 231.27, 804.98, 756.83],  # class-idx x1 y1 x2 y2
        [0, 48.552, 398.56, 245.35, 902.71],
        [0, 669.47, 392.19, 809.72, 877.04],
        [0, 221.52, 405.8, 344.98, 857.54],
        [0, 0, 550.53, 63.01, 873.44],
        [11, 0.0584, 254.46, 32.561, 324.87],
    ]
)

for nb, box in enumerate(xyxy_boxes):
    c_idx, *box = box
    label = f"{str(nb).zfill(2)}:{names.get(int(c_idx))}"
    ann.box_label(box, label, color=colors(c_idx, bgr=True))

image_with_bboxes = ann.result()
  1. Names can be used from model.names when working with detection results

Oriented Bounding Boxes (OBB)

import cv2 as cv
import numpy as np

from ultralytics.utils.plotting import Annotator, colors

obb_names = {10: "small vehicle"}
obb_image = cv.imread("datasets/dota8/images/train/P1142__1024__0___824.jpg")
obb_boxes = np.array(
    [
        [0, 635, 560, 919, 719, 1087, 420, 803, 261],  # class-idx x1 y1 x2 y2 x3 y3 x4 y4
        [0, 331, 19, 493, 260, 776, 70, 613, -171],
        [9, 869, 161, 886, 147, 851, 101, 833, 115],
    ]
)
ann = Annotator(
    obb_image,
    line_width=None,  # default auto-size
    font_size=None,  # default auto-size
    font="Arial.ttf",  # must be ImageFont compatible
    pil=False,  # use PIL, otherwise uses OpenCV
)
for obb in obb_boxes:
    c_idx, *obb = obb
    obb = np.array(obb).reshape(-1, 4, 2).squeeze()
    label = f"{obb_names.get(int(c_idx))}"
    ann.box_label(
        obb,
        label,
        color=colors(c_idx, True),
        rotated=True,
    )

image_with_obb = ann.result()

λ°”μš΄λ”© μƒμž 원 주석 원 λ ˆμ΄λΈ”



Watch: ν…μŠ€νŠΈ 및 원 주석에 λŒ€ν•œ 심측 κ°€μ΄λ“œ( Python 라이브 데λͺ¨ 포함) | Ultralytics 주석 πŸš€

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator

model = YOLO("yolo11s.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
writer = cv2.VideoWriter("Ultralytics circle annotation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))

while True:
    ret, im0 = cap.read()
    if not ret:
        break

    annotator = Annotator(im0)
    results = model.predict(im0)
    boxes = results[0].boxes.xyxy.cpu()
    clss = results[0].boxes.cls.cpu().tolist()

    for box, cls in zip(boxes, clss):
        annotator.circle_label(box, label=names[int(cls)])

    writer.write(im0)
    cv2.imshow("Ultralytics circle annotation", im0)

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

writer.release()
cap.release()
cv2.destroyAllWindows()

λ°”μš΄λ”© λ°•μŠ€ ν…μŠ€νŠΈ 주석 ν…μŠ€νŠΈ λ ˆμ΄λΈ”

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator

model = YOLO("yolo11s.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
writer = cv2.VideoWriter("Ultralytics text annotation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))

while True:
    ret, im0 = cap.read()
    if not ret:
        break

    annotator = Annotator(im0)
    results = model.predict(im0)
    boxes = results[0].boxes.xyxy.cpu()
    clss = results[0].boxes.cls.cpu().tolist()

    for box, cls in zip(boxes, clss):
        annotator.text_label(box, label=names[int(cls)])

    writer.write(im0)
    cv2.imshow("Ultralytics text annotation", im0)

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

writer.release()
cap.release()
cv2.destroyAllWindows()

See the Annotator reference page for additional insight.

Miscellaneous

μ½”λ“œ ν”„λ‘œνŒŒμΌλ§

λ‹€μŒμ„ μ‚¬μš©ν•˜μ—¬ μ‹€ν–‰/μ²˜λ¦¬ν•  μ½”λ“œμ˜ 기간을 ν™•μΈν•©λ‹ˆλ‹€. with λ˜λŠ” λ°μ½”λ ˆμ΄ν„°λ‘œ μ‚¬μš©ν•  수 μžˆμŠ΅λ‹ˆλ‹€.

from ultralytics.utils.ops import Profile

with Profile(device="cuda:0") as dt:
    pass  # operation to measure

print(dt)
# >>> "Elapsed time is 9.5367431640625e-07 s"

Ultralytics μ§€μ›λ˜λŠ” ν˜•μ‹

Ultralytics μ—μ„œ μ§€μ›ν•˜λŠ” 이미지 λ˜λŠ” λ™μ˜μƒ ν˜•μ‹μ˜ ν˜•μ‹μ„ ν”„λ‘œκ·Έλž˜λ° λ°©μ‹μœΌλ‘œ μ‚¬μš©ν•˜κ³  μ‹Άκ±°λ‚˜ μ‚¬μš©ν•΄μ•Ό ν•˜λ‚˜μš”? ν•„μš”ν•œ 경우 이 μƒμˆ˜λ₯Ό μ‚¬μš©ν•˜μ„Έμš”.

from ultralytics.data.utils import IMG_FORMATS, VID_FORMATS

print(IMG_FORMATS)
# {'tiff', 'pfm', 'bmp', 'mpo', 'dng', 'jpeg', 'png', 'webp', 'tif', 'jpg'}

print(VID_FORMATS)
# {'avi', 'mpg', 'wmv', 'mpeg', 'm4v', 'mov', 'mp4', 'asf', 'mkv', 'ts', 'gif', 'webm'}
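For example, these sets can be used to filter a directory listing down to loadable files (the dataset path is hypothetical):

from pathlib import Path

from ultralytics.data.utils import IMG_FORMATS

# Collect every supported image under a (hypothetical) dataset directory
images = [f for f in Path("path/to/dataset").rglob("*.*") if f.suffix.lstrip(".").lower() in IMG_FORMATS]
print(len(images))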

Make Divisible

Calculates the nearest whole number to x (rounding up) that is evenly divisible by y.

from ultralytics.utils.ops import make_divisible

make_divisible(7, 3)
# >>> 9
make_divisible(7, 2)
# >>> 8
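A common use is rounding a layer's channel count up to a hardware-friendly multiple (the values here are illustrative):

channels = make_divisible(100, 8)
print(channels)
# >>> 104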

FAQ

What utilities are included in the Ultralytics package to enhance machine learning workflows?

Ultralytics νŒ¨ν‚€μ§€μ—λŠ” λ¨Έμ‹  λŸ¬λ‹ μ›Œν¬ν”Œλ‘œμš°λ₯Ό κ°„μ†Œν™”ν•˜κ³  μ΅œμ ν™”ν•˜λ„λ‘ μ„€κ³„λœ λ‹€μ–‘ν•œ μœ ν‹Έλ¦¬ν‹°κ°€ ν¬ν•¨λ˜μ–΄ μžˆμŠ΅λ‹ˆλ‹€. μ£Όμš” μœ ν‹Έλ¦¬ν‹°μ—λŠ” 데이터 μ„ΈνŠΈμ— 라벨을 λΆ™μ΄λŠ” μžλ™ 주석, convert_cocoλ₯Ό μ‚¬μš©ν•΄ COCOλ₯Ό YOLO ν˜•μ‹μœΌλ‘œ λ³€ν™˜ν•˜λŠ” κΈ°λŠ₯, 이미지 μ••μΆ•, 데이터 μ„ΈνŠΈ μžλ™ λΆ„ν•  등이 μžˆμŠ΅λ‹ˆλ‹€. μ΄λŸ¬ν•œ λ„κ΅¬λŠ” μˆ˜μž‘μ—…μ„ 쀄이고, 일관성을 보μž₯ν•˜λ©°, 데이터 처리 νš¨μœ¨μ„±μ„ ν–₯μƒμ‹œν‚€λŠ” 것을 λͺ©ν‘œλ‘œ ν•©λ‹ˆλ‹€.

How can I auto-label my dataset using Ultralytics?

사전 ν•™μŠ΅λœ Ultralytics YOLO 객체 감지 λͺ¨λΈμ΄ μžˆλŠ” 경우, 이λ₯Ό μ„Έκ·Έλ¨ΌνŠΈ ν˜•μ‹μ˜ SAM λͺ¨λΈκ³Ό ν•¨κ»˜ μ‚¬μš©ν•˜μ—¬ 데이터 μ„ΈνŠΈμ— μ„ΈλΆ„ν™” ν˜•μ‹μœΌλ‘œ μžλ™ 주석을 달 수 μžˆμŠ΅λ‹ˆλ‹€. λ‹€μŒμ€ μ˜ˆμ‹œμž…λ‹ˆλ‹€:

from ultralytics.data.annotator import auto_annotate

auto_annotate(
    data="path/to/new/data",
    det_model="yolo11n.pt",
    sam_model="mobile_sam.pt",
    device="cuda",
    output_dir="path/to/save_labels",
)

μžμ„Έν•œ λ‚΄μš©μ€ μžλ™ 주석 달기 μ°Έμ‘° μ„Ήμ…˜μ„ ν™•μΈν•˜μ„Έμš”.

How do I convert COCO dataset annotations to the YOLO format in Ultralytics?

COCO JSON μ–΄λ…Έν…Œμ΄μ…˜μ„ 개체 감지λ₯Ό μœ„ν•΄ YOLO ν˜•μ‹μœΌλ‘œ λ³€ν™˜ν•˜λ €λ©΄ convert_coco μœ ν‹Έλ¦¬ν‹°λ₯Ό μ‚¬μš©ν•˜μ„Έμš”. λ‹€μŒμ€ μƒ˜ν”Œ μ½”λ“œ μŠ€λ‹ˆνŽ«μž…λ‹ˆλ‹€:

from ultralytics.data.converter import convert_coco

convert_coco(
    "../datasets/coco/annotations/",
    use_segments=False,
    use_keypoints=False,
    cls91to80=True,
)

μžμ„Έν•œ λ‚΄μš©μ€ convert_coco μ°Έμ‘° νŽ˜μ΄μ§€λ₯Ό μ°Έμ‘°ν•˜μ„Έμš”.

Ultralytics νŒ¨ν‚€μ§€μ˜ YOLO 데이터 νƒμƒ‰κΈ°μ˜ μš©λ„λŠ” λ¬΄μ—‡μΈκ°€μš”?

The YOLO Explorer is a powerful tool introduced in the 8.1.0 update to enhance dataset understanding. It allows you to use text queries to find object instances in your dataset, making it easier to analyze and manage your data. This tool provides valuable insights into dataset composition and distribution, helping to improve model training and performance.
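A minimal sketch of how it was used, assuming the 8.1-era Explorer API (the Explorer class with create_embeddings_table and get_similar; this API may not be available in newer releases, so treat the snippet as illustrative only):

from ultralytics import Explorer

# Build an embeddings table for a dataset, then query it by similarity (assumed API)
explorer = Explorer(data="coco8.yaml", model="yolo11n.pt")
explorer.create_embeddings_table()

similar = explorer.get_similar(img="path/to/image.jpg", limit=10)  # hypothetical image path
print(similar)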

Ultralytics μ—μ„œ λ°”μš΄λ”© λ°•μŠ€λ₯Ό μ„Έκ·Έλ¨ΌνŠΈλ‘œ λ³€ν™˜ν•˜λ €λ©΄ μ–΄λ–»κ²Œ ν•΄μ•Ό ν•˜λ‚˜μš”?

To convert existing bounding box data (in x y w h format) to segments, use the yolo_bbox2segment function. Organize your files with separate directories for images and labels.

from ultralytics.data.converter import yolo_bbox2segment

yolo_bbox2segment(
    im_dir="path/to/images",
    save_dir=None,  # saved to "labels-segment" in the images directory
    sam_model="sam_b.pt",
)

μžμ„Έν•œ λ‚΄μš©μ€ yolo_bbox2segment μ°Έμ‘° νŽ˜μ΄μ§€λ₯Ό μ°Έμ‘°ν•˜μ„Έμš”.
