
Simple Utilities


The ultralytics package comes with a myriad of utilities that can support, enhance, and speed up your workflows. Many more are available, but here are some that will be useful for most developers. They are also a great reference point to use when learning to program.



Watch: Ultralytics Utilities | Auto Annotation, Explorer API and Dataset Conversion

데이터

μžλ™ 라벨링/주석

데이터 μ„ΈνŠΈ μ–΄λ…Έν…Œμ΄μ…˜μ€ λ¦¬μ†ŒμŠ€μ™€ μ‹œκ°„μ΄ 많이 μ†Œμš”λ˜λŠ” ν”„λ‘œμ„ΈμŠ€μž…λ‹ˆλ‹€. μ μ ˆν•œ μ–‘μ˜ λ°μ΄ν„°λ‘œ ν•™μŠ΅λœ YOLO 개체 감지 λͺ¨λΈμ΄ μžˆλŠ” 경우, 이λ₯Ό μ‚¬μš©ν•˜κ³  SAM λ₯Ό μ‚¬μš©ν•˜μ—¬ μΆ”κ°€ 데이터(μ„ΈλΆ„ν™” ν˜•μ‹)에 μžλ™ 주석을 달 수 μžˆμŠ΅λ‹ˆλ‹€.

from ultralytics.data.annotator import auto_annotate

auto_annotate(  # (1)!
    data="path/to/new/data",
    det_model="yolo11n.pt",
    sam_model="mobile_sam.pt",
    device="cuda",
    output_dir="path/to/save_labels",
)
  1. Nothing is returned by this function.

  2. See the reference section for annotator.auto_annotate for more insight into how the function operates.

  3. Use it in combination with the function segments2boxes to generate object detection bounding boxes as well.

Convert Segmentation Masks into YOLO Format


Use this to convert a dataset of segmentation mask images to the YOLO segmentation format. The function takes a directory containing mask images in binary format and converts them into YOLO segmentation format.

The converted masks are saved in the specified output directory.

from ultralytics.data.converter import convert_segment_masks_to_yolo_seg

# classes is the total number of classes in the dataset; the COCO dataset has 80 classes
convert_segment_masks_to_yolo_seg(masks_dir="path/to/masks_dir", output_dir="path/to/output_dir", classes=80)
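
To make the target format concrete, here is a minimal sketch of what one YOLO segmentation label line looks like: a class index followed by x y pairs normalized by the image width and height. The helper name polygon_to_yolo_seg_line and the sample triangle are hypothetical and only illustrate the format; this is not the converter's implementation.

```python
def polygon_to_yolo_seg_line(class_idx, points, img_w, img_h):
    """Format pixel (x, y) points as one YOLO segmentation label line."""
    coords = []
    for x, y in points:
        coords.append(f"{x / img_w:.6f}")  # normalize x by image width
        coords.append(f"{y / img_h:.6f}")  # normalize y by image height
    return " ".join([str(class_idx), *coords])


# Hypothetical triangle inside a 200x100 (width x height) image
line = polygon_to_yolo_seg_line(0, [(50, 25), (150, 25), (100, 75)], img_w=200, img_h=100)
print(line)
# 0 0.250000 0.250000 0.750000 0.250000 0.500000 0.750000
```

Each object in an image gets one such line in the label file that shares the image's file name.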

Convert COCO into YOLO Format

Use this to convert COCO JSON annotations into the proper YOLO format. For object detection (bounding box) datasets, use_segments and use_keypoints should both be False.

from ultralytics.data.converter import convert_coco

convert_coco(  # (1)!
    "../datasets/coco/annotations/",
    use_segments=False,
    use_keypoints=False,
    cls91to80=True,
)
  1. Nothing is returned by this function.

For additional information about the convert_coco function, visit the reference page.
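
As a rough illustration of the coordinate change involved, the sketch below converts a single COCO box (x_min, y_min, width, height in pixels) into a YOLO box (center x, center y, width, height, all normalized to 0-1). The helper name coco_box_to_yolo and the sample numbers are hypothetical; convert_coco also handles class remapping and file output, which this sketch leaves out.

```python
def coco_box_to_yolo(box, img_w, img_h):
    """Convert a COCO [x_min, y_min, w, h] pixel box to normalized YOLO (cx, cy, w, h)."""
    x_min, y_min, w, h = box
    cx = (x_min + w / 2) / img_w  # box center x, normalized by image width
    cy = (y_min + h / 2) / img_h  # box center y, normalized by image height
    return cx, cy, w / img_w, h / img_h


# Hypothetical box in a 640x480 image
yolo_box = coco_box_to_yolo([100, 50, 200, 100], img_w=640, img_h=480)
print(yolo_box)
```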

Get Bounding Box Dimensions

from ultralytics.utils.plotting import Annotator
from ultralytics import YOLO
import cv2

model = YOLO('yolo11n.pt')  # Load a pretrained or fine-tuned model

# Process the image
source = cv2.imread('path/to/image.jpg')
results = model(source)

# Extract results
annotator = Annotator(source, example=model.names)

for box in results[0].boxes.xyxy.cpu():
    width, height, area = annotator.get_bbox_dimension(box)
    print("Bounding Box Width {}, Height {}, Area {}".format(
        width.item(), height.item(), area.item()))
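
For readers who only need the arithmetic, the following sketch computes the same three quantities from a plain (x1, y1, x2, y2) box without loading a model. The function name bbox_dimensions is a hypothetical stand-in and the box values are made up for illustration.

```python
def bbox_dimensions(box):
    """Return (width, height, area) of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    width = x2 - x1
    height = y2 - y1
    return width, height, width * height


width, height, area = bbox_dimensions([10.0, 20.0, 110.0, 70.0])
print(f"Bounding Box Width {width}, Height {height}, Area {area}")
# Bounding Box Width 100.0, Height 50.0, Area 5000.0
```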

Convert Bounding Boxes to Segments

With existing x y w h bounding box data, convert to segments using the yolo_bbox2segment function. The files for images and annotations need to be organized like this:

data
|__ images
    ├─ 001.jpg
    ├─ 002.jpg
    ├─ ..
    └─ NNN.jpg
|__ labels
    ├─ 001.txt
    ├─ 002.txt
    ├─ ..
    └─ NNN.txt
from ultralytics.data.converter import yolo_bbox2segment

yolo_bbox2segment(  # (1)!
    im_dir="path/to/images",
    save_dir=None,  # saved to "labels-segment" in images directory
    sam_model="sam_b.pt",
)
  1. Nothing is returned by this function.

Visit the yolo_bbox2segment reference page for more information about the function.
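
Because the conversion depends on the images/labels layout shown above, it can help to check the layout first. The sketch below is one possible pre-flight check, assuming .jpg images and same-named .txt labels; the helper find_images_without_labels is hypothetical and not part of the ultralytics package.

```python
import tempfile
from pathlib import Path


def find_images_without_labels(data_dir):
    """Return image file names under data_dir/images that lack data_dir/labels/<stem>.txt."""
    data_dir = Path(data_dir)
    missing = []
    for img in sorted((data_dir / "images").glob("*.jpg")):
        if not (data_dir / "labels" / f"{img.stem}.txt").exists():
            missing.append(img.name)
    return missing


# Build a tiny throwaway dataset to demonstrate the check
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "labels").mkdir()
    for name in ("001.jpg", "002.jpg"):
        (root / "images" / name).touch()
    (root / "labels" / "001.txt").touch()  # 002.jpg has no label file
    missing = find_images_without_labels(root)

print(missing)
# ['002.jpg']
```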

Convert Segments to Bounding Boxes

If you have a dataset that uses the segmentation dataset format, you can easily convert it into upright (or horizontal) bounding boxes (x y w h format) with this function.

import numpy as np

from ultralytics.utils.ops import segments2boxes

segments = np.array(
    [
        [805, 392, 797, 400, ..., 808, 714, 808, 392],
        [115, 398, 113, 400, ..., 150, 400, 149, 298],
        [267, 412, 265, 413, ..., 300, 413, 299, 412],
    ]
)

segments2boxes([s.reshape(-1, 2) for s in segments])
# >>> array([[ 741.66, 631.12, 133.31, 479.25],
#           [ 146.81, 649.69, 185.62, 502.88],
#           [ 281.81, 636.19, 118.12, 448.88]],
#           dtype=float32) # xywh bounding boxes

이 κΈ°λŠ₯의 μž‘λ™ 방식을 μ΄ν•΄ν•˜λ €λ©΄ μ°Έμ‘° νŽ˜μ΄μ§€λ₯Ό λ°©λ¬Έν•˜μ„Έμš”.

μœ ν‹Έλ¦¬ν‹°

이미지 μ••μΆ•

κ°€λ‘œ μ„Έλ‘œ λΉ„μœ¨κ³Ό ν’ˆμ§ˆμ„ μœ μ§€ν•˜λ©΄μ„œ 단일 이미지 νŒŒμΌμ„ μΆ•μ†Œλœ 크기둜 μ••μΆ•ν•©λ‹ˆλ‹€. μž…λ ₯ 이미지가 μ΅œλŒ€ 크기보닀 μž‘μœΌλ©΄ 크기가 μ‘°μ •λ˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€.

from pathlib import Path

from ultralytics.data.utils import compress_one_image

for f in Path("path/to/dataset").rglob("*.jpg"):
    compress_one_image(f)  # (1)!
  1. Nothing is returned by this function.

Auto-split Dataset

Automatically split a dataset into train/val/test splits and save the resulting splits into autosplit_*.txt files. This function uses random sampling, which is not applied when you use the fraction argument for training.

from ultralytics.data.utils import autosplit

autosplit(  # (1)!
    path="path/to/images",
    weights=(0.9, 0.1, 0.0),  # (train, validation, test) fractional splits
    annotated_only=False,  # split only images with annotation file when True
)
  1. Nothing is returned by this function.

See the reference page for additional details on this function.

Segment-polygon to Binary Mask

Convert a single polygon (as a list) to a binary mask of the specified image size. The polygon has the form [N, 2], where N is the number of (x, y) points defining the polygon contour.

Warning

N must always be even.

import numpy as np

from ultralytics.data.utils import polygon2mask

imgsz = (1080, 810)
polygon = np.array([805, 392, 797, 400, ..., 808, 714, 808, 392])  # (238, 2)

mask = polygon2mask(
    imgsz,  # tuple
    [polygon],  # input as list
    color=255,  # 8-bit binary
    downsample_ratio=1,
)
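
To show what the resulting mask contains, here is a small pure-Python sketch that fills a polygon with an even-odd ray-casting test at each pixel center. It assumes imgsz is (height, width); polygon_to_mask is a hypothetical teaching helper, and the library's own rasterization may treat edge pixels differently.

```python
def polygon_to_mask(imgsz, points, color=1):
    """Rasterize a polygon given as (x, y) points into a nested-list mask."""
    h, w = imgsz  # assumed (height, width)
    mask = [[0] * w for _ in range(h)]
    n = len(points)
    for row in range(h):
        for col in range(w):
            px, py = col + 0.5, row + 0.5  # test the pixel center
            inside = False
            for i in range(n):
                x1, y1 = points[i]
                x2, y2 = points[(i + 1) % n]
                if (y1 > py) != (y2 > py):  # edge crosses the horizontal ray
                    x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                    if px < x_cross:
                        inside = not inside
            if inside:
                mask[row][col] = color
    return mask


# Hypothetical rectangle inside a 4x6 image
mask = polygon_to_mask((4, 6), [(1, 1), (5, 1), (5, 3), (1, 3)], color=255)
for row in mask:
    print(row)
```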

Bounding Boxes

Bounding Box (Horizontal) Instances

To manage bounding box data, the Bboxes class helps convert between box coordinate formats, scale box dimensions, calculate areas, include offsets, and more!

import numpy as np

from ultralytics.utils.instance import Bboxes

boxes = Bboxes(
    bboxes=np.array(
        [
            [22.878, 231.27, 804.98, 756.83],
            [48.552, 398.56, 245.35, 902.71],
            [669.47, 392.19, 809.72, 877.04],
            [221.52, 405.8, 344.98, 857.54],
            [0, 550.53, 63.01, 873.44],
            [0.0584, 254.46, 32.561, 324.87],
        ]
    ),
    format="xyxy",
)

boxes.areas()
# >>> array([ 4.1104e+05,       99216,       68000,       55772,       20347,      2288.5])

boxes.convert("xywh")
print(boxes.bboxes)
# >>> array(
#     [[ 413.93, 494.05,  782.1, 525.56],
#      [ 146.95, 650.63,  196.8, 504.15],
#      [  739.6, 634.62, 140.25, 484.85],
#      [ 283.25, 631.67, 123.46, 451.74],
#      [ 31.505, 711.99,  63.01, 322.91],
#      [  16.31, 289.67, 32.503,  70.41]]
# )

See the Bboxes reference section for more attributes and methods.

Tip

Many of the following functions (and more) can be accessed using the Bboxes class, but if you prefer to work with the functions directly, see the next subsections on how to import them independently.

Scaling Boxes

When scaling an image up or down, the corresponding bounding box coordinates can be scaled to match using ultralytics.utils.ops.scale_boxes.

import cv2 as cv
import numpy as np

from ultralytics.utils.ops import scale_boxes

image = cv.imread("ultralytics/assets/bus.jpg")
h, w, c = image.shape
resized = cv.resize(image, None, fx=1.2, fy=1.2)  # scale both axes by 1.2
new_h, new_w, _ = resized.shape

xyxy_boxes = np.array(
    [
        [22.878, 231.27, 804.98, 756.83],
        [48.552, 398.56, 245.35, 902.71],
        [669.47, 392.19, 809.72, 877.04],
        [221.52, 405.8, 344.98, 857.54],
        [0, 550.53, 63.01, 873.44],
        [0.0584, 254.46, 32.561, 324.87],
    ]
)

new_boxes = scale_boxes(
    img1_shape=(h, w),  # original image dimensions
    boxes=xyxy_boxes,  # boxes from original image
    img0_shape=(new_h, new_w),  # resized image dimensions (scale to)
    ratio_pad=None,
    padding=False,
    xywh=False,
)

print(new_boxes)  # (1)!
# >>> array(
#     [[  27.454,  277.52,  965.98,   908.2],
#     [   58.262,  478.27,  294.42,  1083.3],
#     [   803.36,  470.63,  971.66,  1052.4],
#     [   265.82,  486.96,  413.98,    1029],
#     [        0,  660.64,  75.612,  1048.1],
#     [   0.0701,  305.35,  39.073,  389.84]]
# )
  1. Bounding boxes scaled for the new image size
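
The numbers above can be checked by hand: each coordinate is multiplied by the ratio between the new and old image size. The sketch below does just that for the first box, assuming a hypothetical 1080x810 (height x width) source image resized by 1.2; scale_xyxy is an illustrative helper that ignores the padding and clipping options of the real function.

```python
def scale_xyxy(box, old_shape, new_shape):
    """Scale an (x1, y1, x2, y2) box from old_shape to new_shape, both (height, width)."""
    old_h, old_w = old_shape
    new_h, new_w = new_shape
    sx, sy = new_w / old_w, new_h / old_h  # per-axis scale factors
    x1, y1, x2, y2 = box
    return [x1 * sx, y1 * sy, x2 * sx, y2 * sy]


scaled = scale_xyxy([22.878, 231.27, 804.98, 756.83], old_shape=(1080, 810), new_shape=(1296, 972))
print([round(v, 3) for v in scaled])
```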

Bounding Box Format Conversions

XYXY → XYWH

Convert bounding box coordinates from (x1, y1, x2, y2) format to (x, y, width, height) format, where (x1, y1) is the top-left corner, (x2, y2) is the bottom-right corner, and (x, y) is the box center.

import numpy as np

from ultralytics.utils.ops import xyxy2xywh

xyxy_boxes = np.array(
    [
        [22.878, 231.27, 804.98, 756.83],
        [48.552, 398.56, 245.35, 902.71],
        [669.47, 392.19, 809.72, 877.04],
        [221.52, 405.8, 344.98, 857.54],
        [0, 550.53, 63.01, 873.44],
        [0.0584, 254.46, 32.561, 324.87],
    ]
)
xywh = xyxy2xywh(xyxy_boxes)

print(xywh)
# >>> array(
#     [[ 413.93,  494.05,   782.1, 525.56],
#     [  146.95,  650.63,   196.8, 504.15],
#     [   739.6,  634.62,  140.25, 484.85],
#     [  283.25,  631.67,  123.46, 451.74],
#     [  31.505,  711.99,   63.01, 322.91],
#     [   16.31,  289.67,  32.503,  70.41]]
# )
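
The conversion itself is simple arithmetic, which the following sketch spells out for one box: the center is the midpoint of the two corners and the size is their difference. The name xyxy_to_xywh is a hypothetical pure-Python stand-in used only to show the formula.

```python
def xyxy_to_xywh(box):
    """Convert (x1, y1, x2, y2) corners to (center x, center y, width, height)."""
    x1, y1, x2, y2 = box
    return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]


xywh_box = xyxy_to_xywh([22.878, 231.27, 804.98, 756.83])
print([round(v, 2) for v in xywh_box])
```

The result agrees with the first row of the array printed above once rounded.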

All Bounding Box Conversions

from ultralytics.utils.ops import (
    ltwh2xywh,
    ltwh2xyxy,
    xywh2ltwh,  # xywh → top-left corner, w, h
    xywh2xyxy,
    xywhn2xyxy,  # normalized → pixel
    xyxy2ltwh,  # xyxy → top-left corner, w, h
    xyxy2xywhn,  # pixel → normalized
)

for func in (ltwh2xywh, ltwh2xyxy, xywh2ltwh, xywh2xyxy, xywhn2xyxy, xyxy2ltwh, xyxy2xywhn):
    help(func)  # print function docstrings

각 κΈ°λŠ₯에 λŒ€ν•œ λ¬Έμ„œ λ¬Έμžμ—΄μ„ μ°Έμ‘°ν•˜κ±°λ‚˜ ultralytics.utils.ops μ°Έμ‘° νŽ˜μ΄μ§€ λ₯Ό ν΄λ¦­ν•˜μ—¬ 각 κΈ°λŠ₯에 λŒ€ν•΄ μžμ„Ένžˆ μ•Œμ•„λ³΄μ„Έμš”.

Plotting

Drawing Annotations

Ultralytics includes an Annotator class that can be used to annotate any kind of data. It is easiest to use with object detection bounding boxes, pose key points, and oriented bounding boxes.

Ultralytics Sweep Annotation

Python Examples using YOLO11 🚀

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

# User defined video path and model file
cap = cv2.VideoCapture("Path/to/video/file.mp4")
model = YOLO(model="yolo11s-seg.pt")  # Model file, e.g. yolo11s.pt or yolo11m-seg.pt

if not cap.isOpened():
    print("Error: Could not open video.")
    exit()

# Initialize the video writer object.
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
video_writer = cv2.VideoWriter("ultralytics.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

masks = None  # Initialize variable to store masks data
f = 0  # Initialize frame count variable for enabling mouse event.
line_x = w  # Initial x position of the sweep line (right edge of the frame).
dragging = False  # Initialize bool variable for line dragging.
classes = model.names  # Store model classes names for plotting.
window_name = "Ultralytics Sweep Annotator"


def drag_line(event, x, y, flags, param):  # Mouse callback for dragging line.
    global line_x, dragging
    if event == cv2.EVENT_LBUTTONDOWN or (flags & cv2.EVENT_FLAG_LBUTTON):
        line_x = max(0, min(x, w))
        dragging = True


while cap.isOpened():  # Loop over the video capture object.
    ret, im0 = cap.read()
    if not ret:
        break
    f = f + 1  # Increment frame count.
    count = 0  # Re-initialize count variable on every frame for precise counts.
    annotator = Annotator(im0)
    results = model.track(im0, persist=True)  # Track objects using track method.
    if f == 1:
        cv2.namedWindow(window_name)
        cv2.setMouseCallback(window_name, drag_line)

    if results[0].boxes.id is not None:
        if results[0].masks is not None:
            masks = results[0].masks.xy
        track_ids = results[0].boxes.id.int().cpu().tolist()
        clss = results[0].boxes.cls.cpu().tolist()
        boxes = results[0].boxes.xyxy.cpu()

        for mask, box, cls, t_id in zip(masks or [None] * len(boxes), boxes, clss, track_ids):
            color = colors(t_id, True)  # Assign different color to each tracked object.
            if mask is not None and mask.size > 0:
                # If you want to overlay the masks
                # mask[:, 0] = np.clip(mask[:, 0], line_x, w)
                # mask_img = cv2.fillPoly(im0.copy(), [mask.astype(int)], color)
                # cv2.addWeighted(mask_img, 0.5, im0, 0.5, 0, im0)

                if box[0] > line_x:
                    count += 1
                    annotator.seg_bbox(mask=mask, mask_color=color, label=str(classes[cls]))
            else:
                if box[0] > line_x:
                    count += 1
                    annotator.box_label(box=box, color=color, label=str(classes[cls]))

    annotator.sweep_annotator(line_x=line_x, line_y=h, label=f"COUNT:{count}")  # Display the sweep
    cv2.imshow(window_name, im0)
    video_writer.write(im0)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()  # Release the video capture.
video_writer.release()  # Release the video writer.
cv2.destroyAllWindows()  # Destroy all opened windows.

μˆ˜ν‰ λ°”μš΄λ”© λ°•μŠ€

import cv2 as cv
import numpy as np

from ultralytics.utils.plotting import Annotator, colors

names = {  # (1)!
    0: "person",
    5: "bus",
    11: "stop sign",
}

image = cv.imread("ultralytics/assets/bus.jpg")
ann = Annotator(
    image,
    line_width=None,  # default auto-size
    font_size=None,  # default auto-size
    font="Arial.ttf",  # must be ImageFont compatible
    pil=False,  # use PIL, otherwise uses OpenCV
)

xyxy_boxes = np.array(
    [
        [5, 22.878, 231.27, 804.98, 756.83],  # class-idx x1 y1 x2 y2
        [0, 48.552, 398.56, 245.35, 902.71],
        [0, 669.47, 392.19, 809.72, 877.04],
        [0, 221.52, 405.8, 344.98, 857.54],
        [0, 0, 550.53, 63.01, 873.44],
        [11, 0.0584, 254.46, 32.561, 324.87],
    ]
)

for nb, box in enumerate(xyxy_boxes):
    c_idx, *box = box
    label = f"{str(nb).zfill(2)}:{names.get(int(c_idx))}"
    ann.box_label(box, label, color=colors(c_idx, bgr=True))

image_with_bboxes = ann.result()
  1. Names can be taken from model.names when working with detection results

Oriented Bounding Boxes (OBB)

import cv2 as cv
import numpy as np

from ultralytics.utils.plotting import Annotator, colors

obb_names = {10: "small vehicle"}
obb_image = cv.imread("datasets/dota8/images/train/P1142__1024__0___824.jpg")
obb_boxes = np.array(
    [
        [0, 635, 560, 919, 719, 1087, 420, 803, 261],  # class-idx x1 y1 x2 y2 x3 y3 x4 y4
        [0, 331, 19, 493, 260, 776, 70, 613, -171],
        [9, 869, 161, 886, 147, 851, 101, 833, 115],
    ]
)
ann = Annotator(
    obb_image,
    line_width=None,  # default auto-size
    font_size=None,  # default auto-size
    font="Arial.ttf",  # must be ImageFont compatible
    pil=False,  # use PIL, otherwise uses OpenCV
)
for obb in obb_boxes:
    c_idx, *obb = obb
    obb = np.array(obb).reshape(-1, 4, 2).squeeze()
    label = f"{obb_names.get(int(c_idx))}"
    ann.box_label(
        obb,
        label,
        color=colors(c_idx, True),
        rotated=True,
    )

image_with_obb = ann.result()

Bounding Box Circle Annotation (Circle Label)

Watch: In-Depth Guide to Text & Circle Annotations with Python Live Demos | Ultralytics Annotations 🚀

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator

model = YOLO("yolo11s.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
writer = cv2.VideoWriter("Ultralytics circle annotation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))

while True:
    ret, im0 = cap.read()
    if not ret:
        break

    annotator = Annotator(im0)
    results = model.predict(im0)
    boxes = results[0].boxes.xyxy.cpu()
    clss = results[0].boxes.cls.cpu().tolist()

    for box, cls in zip(boxes, clss):
        annotator.circle_label(box, label=names[int(cls)])

    writer.write(im0)
    cv2.imshow("Ultralytics circle annotation", im0)

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

writer.release()
cap.release()
cv2.destroyAllWindows()

Bounding Box Text Annotation (Text Label)

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator

model = YOLO("yolo11s.pt")
names = model.names
cap = cv2.VideoCapture("path/to/video/file.mp4")

w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))
writer = cv2.VideoWriter("Ultralytics text annotation.avi", cv2.VideoWriter_fourcc(*"MJPG"), fps, (w, h))

while True:
    ret, im0 = cap.read()
    if not ret:
        break

    annotator = Annotator(im0)
    results = model.predict(im0)
    boxes = results[0].boxes.xyxy.cpu()
    clss = results[0].boxes.cls.cpu().tolist()

    for box, cls in zip(boxes, clss):
        annotator.text_label(box, label=names[int(cls)])

    writer.write(im0)
    cv2.imshow("Ultralytics text annotation", im0)

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

writer.release()
cap.release()
cv2.destroyAllWindows()

See the Annotator reference page for additional insight.

Miscellaneous

Code Profiling

Check how long code takes to run/process, either using with or as a decorator.

from ultralytics.utils.ops import Profile

with Profile(device="cuda:0") as dt:
    pass  # operation to measure

print(dt)
# >>> "Elapsed time is 9.5367431640625e-07 s"
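
For readers curious how one object can serve as both a with block and a decorator, the sketch below builds a small timer on contextlib.ContextDecorator from the standard library. SimpleTimer is a hypothetical illustration of the pattern, not the Profile class, and it does no device synchronization.

```python
import time
from contextlib import ContextDecorator


class SimpleTimer(ContextDecorator):
    """Accumulate elapsed wall-clock time across uses."""

    def __init__(self):
        self.elapsed = 0.0

    def __enter__(self):
        self._start = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed += time.perf_counter() - self._start
        return False  # do not swallow exceptions

    def __str__(self):
        return f"Elapsed time is {self.elapsed} s"


# Used as a context manager
with SimpleTimer() as dt:
    sum(range(100_000))
print(dt)

# Used as a decorator
timer = SimpleTimer()


@timer
def work():
    return sum(range(1000))


result = work()
print(timer)
```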

Ultralytics Supported Formats

Want or need to use the image or video formats supported by Ultralytics programmatically? Use these constants if you need them.

from ultralytics.data.utils import IMG_FORMATS, VID_FORMATS

print(IMG_FORMATS)
# {'tiff', 'pfm', 'bmp', 'mpo', 'dng', 'jpeg', 'png', 'webp', 'tif', 'jpg'}

print(VID_FORMATS)
# {'avi', 'mpg', 'wmv', 'mpeg', 'm4v', 'mov', 'mp4', 'asf', 'mkv', 'ts', 'gif', 'webm'}
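
One possible use of these constants is sorting incoming files by type. The sketch below hard-codes copies of the two sets printed above so that it runs without the package installed; in real code you would import them instead. The helper media_kind and the file names are hypothetical.

```python
from pathlib import Path

# Assumed copies of the sets printed above, for illustration only
IMG_FORMATS = {"tiff", "pfm", "bmp", "mpo", "dng", "jpeg", "png", "webp", "tif", "jpg"}
VID_FORMATS = {"avi", "mpg", "wmv", "mpeg", "m4v", "mov", "mp4", "asf", "mkv", "ts", "gif", "webm"}


def media_kind(path):
    """Classify a path as 'image', 'video', or 'unsupported' by its suffix."""
    suffix = Path(path).suffix.lower().lstrip(".")
    if suffix in IMG_FORMATS:
        return "image"
    if suffix in VID_FORMATS:
        return "video"
    return "unsupported"


kinds = [media_kind(p) for p in ("bus.JPG", "clip.mp4", "notes.txt")]
print(kinds)
# ['image', 'video', 'unsupported']
```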

Make Divisible

Calculates the smallest whole number greater than or equal to x that is evenly divisible by y.

from ultralytics.utils.ops import make_divisible

make_divisible(7, 3)
# >>> 9
make_divisible(7, 2)
# >>> 8
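
The two results above are consistent with rounding up to the next multiple, which can be sketched with math.ceil. The function round_up_to_multiple below is a hypothetical stand-in written for illustration under that assumption.

```python
import math


def round_up_to_multiple(x, divisor):
    """Return the smallest multiple of divisor that is >= x."""
    return math.ceil(x / divisor) * divisor


print(round_up_to_multiple(7, 3))  # 9
print(round_up_to_multiple(7, 2))  # 8
print(round_up_to_multiple(8, 2))  # 8
```

A value that is already divisible is returned unchanged, as the last call shows.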

FAQ

What utilities are included in the Ultralytics package to enhance machine learning workflows?

The Ultralytics package includes a variety of utilities designed to streamline and optimize machine learning workflows. Key utilities include auto-annotation for labeling datasets, converting COCO to YOLO format with convert_coco, compressing images, and dataset auto-splitting. These tools aim to reduce manual effort, ensure consistency, and improve data processing efficiency.

How can I use Ultralytics to auto-label my dataset?

If you have a pre-trained Ultralytics YOLO object detection model, you can use it together with the SAM model to auto-annotate your dataset in segmentation format. Here is an example:

from ultralytics.data.annotator import auto_annotate

auto_annotate(
    data="path/to/new/data",
    det_model="yolo11n.pt",
    sam_model="mobile_sam.pt",
    device="cuda",
    output_dir="path/to/save_labels",
)

μžμ„Έν•œ λ‚΄μš©μ€ μžλ™ 주석 달기 μ°Έμ‘° μ„Ήμ…˜μ„ ν™•μΈν•˜μ„Έμš”.

COCO 데이터 μ„ΈνŠΈ 주석을 Ultralytics μ—μ„œ YOLO ν˜•μ‹μœΌλ‘œ λ³€ν™˜ν•˜λ €λ©΄ μ–΄λ–»κ²Œ ν•˜λ‚˜μš”?

COCO JSON μ–΄λ…Έν…Œμ΄μ…˜μ„ 개체 감지λ₯Ό μœ„ν•΄ YOLO ν˜•μ‹μœΌλ‘œ λ³€ν™˜ν•˜λ €λ©΄ convert_coco μœ ν‹Έλ¦¬ν‹°λ₯Ό μ‚¬μš©ν•˜μ„Έμš”. λ‹€μŒμ€ μƒ˜ν”Œ μ½”λ“œ μŠ€λ‹ˆνŽ«μž…λ‹ˆλ‹€:

from ultralytics.data.converter import convert_coco

convert_coco(
    "../datasets/coco/annotations/",
    use_segments=False,
    use_keypoints=False,
    cls91to80=True,
)

μžμ„Έν•œ λ‚΄μš©μ€ convert_coco μ°Έμ‘° νŽ˜μ΄μ§€λ₯Ό μ°Έμ‘°ν•˜μ„Έμš”.

Ultralytics νŒ¨ν‚€μ§€μ˜ YOLO 데이터 νƒμƒ‰κΈ°μ˜ μš©λ„λŠ” λ¬΄μ—‡μΈκ°€μš”?

그리고 YOLO 탐색기 에 μ†Œκ°œλœ κ°•λ ₯ν•œ λ„κ΅¬μž…λ‹ˆλ‹€. 8.1.0 데이터 μ„ΈνŠΈ 이해도λ₯Ό 높이기 μœ„ν•œ μ—…λ°μ΄νŠΈμž…λ‹ˆλ‹€. ν…μŠ€νŠΈ 쿼리λ₯Ό μ‚¬μš©ν•΄ 데이터 μ„ΈνŠΈμ—μ„œ 개체 μΈμŠ€ν„΄μŠ€λ₯Ό 찾을 수 μžˆμœΌλ―€λ‘œ 데이터λ₯Ό 더 μ‰½κ²Œ λΆ„μ„ν•˜κ³  관리할 수 μžˆμŠ΅λ‹ˆλ‹€. 이 λ„κ΅¬λŠ” 데이터 μ„ΈνŠΈ ꡬ성과 뢄포에 λŒ€ν•œ κ·€μ€‘ν•œ μΈμ‚¬μ΄νŠΈλ₯Ό μ œκ³΅ν•˜μ—¬ λͺ¨λΈ ν›ˆλ ¨κ³Ό μ„±λŠ₯을 κ°œμ„ ν•˜λŠ” 데 도움이 λ©λ‹ˆλ‹€.

Ultralytics μ—μ„œ λ°”μš΄λ”© λ°•μŠ€λ₯Ό μ„Έκ·Έλ¨ΌνŠΈλ‘œ λ³€ν™˜ν•˜λ €λ©΄ μ–΄λ–»κ²Œ ν•΄μ•Ό ν•˜λ‚˜μš”?

κΈ°μ‘΄ λ°”μš΄λ”© λ°•μŠ€ 데이터λ₯Ό λ³€ν™˜ν•˜λ €λ©΄( x y w h ν˜•μ‹)을 μ„Έκ·Έλ¨ΌνŠΈμ— μΆ”κ°€ν•˜λ €λ©΄ yolo_bbox2segment κΈ°λŠ₯을 μ‚¬μš©ν•˜μ„Έμš”. 이미지와 라벨을 μœ„ν•œ λ³„λ„μ˜ λ””λ ‰ν„°λ¦¬λ‘œ νŒŒμΌμ„ μ •λ¦¬ν•˜μ„Έμš”.

from ultralytics.data.converter import yolo_bbox2segment

yolo_bbox2segment(
    im_dir="path/to/images",
    save_dir=None,  # saved to "labels-segment" in the images directory
    sam_model="sam_b.pt",
)

μžμ„Έν•œ λ‚΄μš©μ€ yolo_bbox2segment μ°Έμ‘° νŽ˜μ΄μ§€λ₯Ό μ°Έμ‘°ν•˜μ„Έμš”.

Created 9 months ago, updated 11 days ago
