コンテンツへスキップ

Object Cropping using Ultralytics YOLO11

オブジェクトの切り抜きとは?

Object cropping with Ultralytics YOLO11 involves isolating and extracting specific detected objects from an image or video. The YOLO11 model capabilities are utilized to accurately identify and delineate objects, enabling precise cropping for further analysis or manipulation.



見るんだ: Object Cropping using Ultralytics YOLO

オブジェクトクロッピングの利点

  • Focused Analysis: YOLO11 facilitates targeted object cropping, allowing for in-depth examination or processing of individual items within a scene.
  • データ量の削減:関連性のあるオブジェクトのみを抽出することで、オブジェクトの切り出しはデータサイズを最小化するのに役立ち、保存、送信、またはその後の計算タスクを効率化する。
  • Enhanced Precision: YOLO11's object detection accuracy ensures that the cropped objects maintain their spatial relationships, preserving the integrity of the visual information for detailed analysis.

ビジュアル

空港の手荷物
Conveyor Belt at Airport Suitcases Cropping using Ultralytics YOLO11
Suitcases Cropping at airport conveyor belt using Ultralytics YOLO11

Object Cropping using YOLO11 Example

import os

import cv2

from ultralytics import YOLO
from ultralytics.utils.plotting import Annotator, colors

model = YOLO("yolo11n.pt")
names = model.names

cap = cv2.VideoCapture("path/to/video/file.mp4")
assert cap.isOpened(), "Error reading video file"
w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS))

crop_dir_name = "ultralytics_crop"
if not os.path.exists(crop_dir_name):
    os.mkdir(crop_dir_name)

# Video writer
video_writer = cv2.VideoWriter("object_cropping_output.avi", cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

idx = 0
while cap.isOpened():
    success, im0 = cap.read()
    if not success:
        print("Video frame is empty or video processing has been successfully completed.")
        break

    results = model.predict(im0, show=False)
    boxes = results[0].boxes.xyxy.cpu().tolist()
    clss = results[0].boxes.cls.cpu().tolist()
    annotator = Annotator(im0, line_width=2, example=names)

    if boxes is not None:
        for box, cls in zip(boxes, clss):
            idx += 1
            annotator.box_label(box, color=colors(int(cls), True), label=names[int(cls)])

            crop_obj = im0[int(box[1]) : int(box[3]), int(box[0]) : int(box[2])]

            cv2.imwrite(os.path.join(crop_dir_name, str(idx) + ".png"), crop_obj)

    cv2.imshow("ultralytics", im0)
    video_writer.write(im0)

    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
video_writer.release()
cv2.destroyAllWindows()

論争 model.predict

議論タイプデフォルト説明
sourcestr'ultralytics/assets'Specifies the data source for inference. Can be an image path, video file, directory, URL, or device ID for live feeds. Supports a wide range of formats and sources, enabling flexible application across different types of input.
conffloat0.25検出の最小信頼度しきい値を設定します。この閾値以下の信頼度で検出されたオブジェクトは無視されます。この値を調整することで、誤検出を減らすことができます。
ioufloat0.7Intersection Over Union (IoU) threshold for Non-Maximum Suppression (NMS). Lower values result in fewer detections by eliminating overlapping boxes, useful for reducing duplicates.
imgszint or tuple640推論のための画像サイズを定義する。単一の整数値 640 for square resizing or a (height, width) tuple. Proper sizing can improve detection accuracy and processing speed.
halfboolFalseEnables half-precision (FP16) inference, which can speed up model inference on supported GPUs with minimal impact on accuracy.
devicestrNone推論を行うデバイスを指定する(例. cpu, cuda:0 または 0).CPU 、特定のGPU 、またはモデル実行のための他のコンピュート・デバイスを選択することができます。
max_detint300画像あたりの最大検出数。1回の推論でモデルが検出できるオブジェクトの総数を制限し、密集したシーンでの過剰な出力を防ぎます。
vid_strideint1ビデオ入力のフレームストライド。時間的な解像度を犠牲にして処理を高速化するために、ビデオのフレームをスキップできるようにする。1の値はすべてのフレームを処理し、それ以上の値はフレームをスキップする。
stream_bufferboolFalseDetermines whether to queue incoming frames for video streams. If False, old frames get dropped to accomodate new frames (optimized for real-time applications). If `True', queues new frames in a buffer, ensuring no frames get skipped, but will cause latency if inference FPS is lower than stream FPS.
visualizeboolFalse推論中にモデルの特徴を可視化し、モデルが何を「見て」いるのかを知ることができます。デバッグやモデルの解釈に役立ちます。
augmentboolFalse予測に対するテスト時間拡張(TTA)を可能にし、推論速度を犠牲にすることで検出のロバスト性を向上させる可能性がある。
agnostic_nmsboolFalse異なるクラスのオーバーラップしたボックスをマージする、クラスにとらわれない非最大抑制(NMS)を有効にします。クラスの重複が一般的なマルチクラス検出シナリオで有用。
classeslist[int]Noneクラス ID のセットに予測をフィルタリングします。指定されたクラスに属する検出のみが返されます。複数クラスの検出タスクで、関連するオブジェクトに焦点を当てるのに便利です。
retina_masksboolFalseモデルに高解像度のセグメンテーションマスクがあれば、それを使用する。これにより、セグメンテーションタスクのマスク品質が向上し、より細かいディテールが得られます。
embedlist[int]NoneSpecifies the layers from which to extract feature vectors or embeddings. Useful for downstream tasks like clustering or similarity search.
projectstrNoneName of the project directory where prediction outputs are saved if save is enabled.
namestrNoneName of the prediction run. Used for creating a subdirectory within the project folder, where prediction outputs are stored if save is enabled.

よくあるご質問

What is object cropping in Ultralytics YOLO11 and how does it work?

Object cropping using Ultralytics YOLO11 involves isolating and extracting specific objects from an image or video based on YOLO11's detection capabilities. This process allows for focused analysis, reduced data volume, and enhanced precision by leveraging YOLO11 to identify objects with high accuracy and crop them accordingly. For an in-depth tutorial, refer to the object cropping example.

Why should I use Ultralytics YOLO11 for object cropping over other solutions?

Ultralytics YOLO11 stands out due to its precision, speed, and ease of use. It allows detailed and accurate object detection and cropping, essential for focused analysis and applications needing high data integrity. Moreover, YOLO11 integrates seamlessly with tools like OpenVINO and TensorRT for deployments requiring real-time capabilities and optimization on diverse hardware. Explore the benefits in the guide on model export.

オブジェクトの切り抜きでデータセットのデータ量を減らすには?

By using Ultralytics YOLO11 to crop only relevant objects from your images or videos, you can significantly reduce the data size, making it more efficient for storage and processing. This process involves training the model to detect specific objects and then using the results to crop and save these portions only. For more information on exploiting Ultralytics YOLO11's capabilities, visit our quickstart guide.

Can I use Ultralytics YOLO11 for real-time video analysis and object cropping?

Yes, Ultralytics YOLO11 can process real-time video feeds to detect and crop objects dynamically. The model's high-speed inference capabilities make it ideal for real-time applications such as surveillance, sports analysis, and automated inspection systems. Check out the tracking and prediction modes to understand how to implement real-time processing.

What are the hardware requirements for efficiently running YOLO11 for object cropping?

Ultralytics YOLO11 is optimized for both CPU and GPU environments, but to achieve optimal performance, especially for real-time or high-volume inference, a dedicated GPU (e.g., NVIDIA Tesla, RTX series) is recommended. For deployment on lightweight devices, consider using CoreML for iOS or TFLite for Android. More details on supported devices and formats can be found in our model deployment options.

📅 Created 9 months ago ✏️ Updated 22 days ago

コメント