
YOLO-World Model

The YOLO-World Model introduces an advanced, real-time approach to open-vocabulary detection based on Ultralytics YOLOv8. It can detect any object in an image from a descriptive text prompt. By significantly lowering computational demands while preserving competitive performance, YOLO-World emerges as a versatile tool for a wide range of vision-based applications.

Figure: YOLO-World model architecture overview

Overview

YOLO-World tackles the challenges faced by traditional open-vocabulary detection models, which often rely on cumbersome Transformer models that require extensive computational resources. Their dependence on pre-defined object categories also limits their usefulness in dynamic scenarios. YOLO-World equips the YOLOv8 framework with open-vocabulary detection capabilities, using vision-language modeling and pre-training on large datasets to identify a broad range of objects efficiently in zero-shot scenarios.

Key Features

  1. Real-time Solution: Harnessing the computational speed of CNNs, YOLO-World delivers a fast open-vocabulary detection solution for industries that need immediate results.

  2. Efficiency and Performance: YOLO-World cuts computational and resource requirements without sacrificing performance. It offers a robust alternative to models like SAM at a fraction of the computational cost, which makes real-time applications practical.

  3. Inference with Offline Vocabulary: YOLO-World introduces a "prompt-then-detect" strategy that uses an offline vocabulary to improve efficiency further. Custom prompts computed in advance, such as captions or categories, are encoded and stored as offline vocabulary embeddings, which streamlines the detection process.

  4. Powered by YOLOv8: Built on Ultralytics YOLOv8, YOLO-World leverages recent advances in real-time object detection to provide open-vocabulary detection with high accuracy and speed.

  5. Benchmark Excellence: YOLO-World outperforms existing open-vocabulary detectors, including the MDETR and GLIP series, in speed and efficiency on standard benchmarks, showcasing the capability of YOLOv8 on a single NVIDIA V100 GPU.

  6. Versatile Applications: YOLO-World's approach opens up new possibilities for a wide range of vision tasks, with speed improvements of several times over existing methods.
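
The "prompt-then-detect" idea in item 3 can be illustrated with a toy sketch. This is not YOLO-World's implementation: `encode_prompt` is a hypothetical stand-in for a real text encoder, and the region embedding is made up. The sketch only shows that prompts are encoded once, cached, and then reused for every image.

```python
import hashlib
import math


def encode_prompt(text, dim=8):
    """Hypothetical stand-in for a text encoder: one deterministic unit vector per prompt."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    vec = [b / 255.0 - 0.5 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def build_offline_vocabulary(prompts):
    """Encode every prompt once; the cached embeddings are reused for all later images."""
    return {prompt: encode_prompt(prompt) for prompt in prompts}


def classify_region(region_embedding, vocabulary):
    """Return the prompt whose cached embedding has the highest dot product with the region."""
    def score(item):
        return sum(a * b for a, b in zip(region_embedding, item[1]))

    best_prompt, _ = max(vocabulary.items(), key=score)
    return best_prompt


vocabulary = build_offline_vocabulary(["person", "bus", "traffic light"])
# Pretend the image encoder produced an embedding identical to the cached "bus" embedding
region = list(vocabulary["bus"])
label = classify_region(region, vocabulary)
print(label)
```

Because the text encoder runs only when the vocabulary is built, per-image work reduces to comparing region embeddings with the cached vectors.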

Available Models, Supported Tasks, and Operating Modes

This section lists the available models with their pre-trained weights, the tasks they support, and their compatibility with the operating modes Inference, Validation, Training, and Export. Supported modes are marked ✅ and unsupported modes ❌.

| Model Type | Pre-trained Weights | Tasks Supported | Inference | Validation | Training | Export |
|---|---|---|---|---|---|---|
| YOLOv8s-world | yolov8s-world.pt | Object Detection | ✅ | ✅ | ❌ | ❌ |
| YOLOv8s-worldv2 | yolov8s-worldv2.pt | Object Detection | ✅ | ✅ | ❌ | ✅ |
| YOLOv8m-world | yolov8m-world.pt | Object Detection | ✅ | ✅ | ❌ | ❌ |
| YOLOv8m-worldv2 | yolov8m-worldv2.pt | Object Detection | ✅ | ✅ | ❌ | ✅ |
| YOLOv8l-world | yolov8l-world.pt | Object Detection | ✅ | ✅ | ❌ | ❌ |
| YOLOv8l-worldv2 | yolov8l-worldv2.pt | Object Detection | ✅ | ✅ | ❌ | ✅ |
| YOLOv8x-world | yolov8x-world.pt | Object Detection | ✅ | ✅ | ❌ | ❌ |
| YOLOv8x-worldv2 | yolov8x-worldv2.pt | Object Detection | ✅ | ✅ | ❌ | ✅ |

Zero-shot Transfer on COCO Dataset

| Model Type | mAP | mAP50 | mAP75 |
|---|---|---|---|
| yolov8s-world | 37.4 | 52.0 | 40.6 |
| yolov8s-worldv2 | 37.7 | 52.2 | 41.0 |
| yolov8m-world | 42.0 | 57.0 | 45.6 |
| yolov8m-worldv2 | 43.0 | 58.4 | 46.8 |
| yolov8l-world | 45.7 | 61.3 | 49.8 |
| yolov8l-worldv2 | 45.8 | 61.3 | 49.8 |
| yolov8x-world | 47.0 | 63.0 | 51.2 |
| yolov8x-worldv2 | 47.1 | 62.8 | 51.4 |
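
To read the table at a glance, the short sketch below computes the mAP difference between each worldv2 model and its first-version counterpart. The numbers are copied from the mAP column above; the variable names are only illustrative.

```python
# mAP values copied from the zero-shot COCO table: (world, worldv2) per model size
coco_map = {
    "s": (37.4, 37.7),
    "m": (42.0, 43.0),
    "l": (45.7, 45.8),
    "x": (47.0, 47.1),
}

# Difference in mAP points between worldv2 and world, rounded to one decimal place
map_gain = {size: round(v2 - v1, 1) for size, (v1, v2) in coco_map.items()}

for size, gain in map_gain.items():
    print(f"yolov8{size}-worldv2 vs yolov8{size}-world: {gain:+.1f} mAP")
```

The medium model shows the largest difference (1.0 mAP); for the other sizes the two versions are within 0.3 mAP of each other.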

Usage Examples

The YOLO-World models are easy to integrate into your Python applications. Ultralytics provides a user-friendly Python API and CLI commands to streamline development.

Train Usage

Tip

We strongly recommend the yolov8-worldv2 models for custom training, because they support deterministic training and are easy to export to other formats such as onnx/tensorrt.

Training for object detection is straightforward with the train method:

Example

PyTorch pretrained *.pt models as well as configuration *.yaml files can be passed to the YOLOWorld() class to create a model instance in Python:

from ultralytics import YOLOWorld

# Load a pretrained YOLOv8s-worldv2 model
model = YOLOWorld('yolov8s-worldv2.pt')

# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)

# Run inference with the trained YOLOv8s-worldv2 model on the 'bus.jpg' image
results = model('path/to/bus.jpg')

# CLI: build a YOLOv8s-worldv2 model from its YAML configuration and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov8s-worldv2.yaml data=coco8.yaml epochs=100 imgsz=640
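
The tip in this section mentions export to other formats. The sketch below is one possible way to wrap that step, not an official recipe. `supports_export` and `export_custom_model` are hypothetical helper names; the check simply mirrors the Export column of the model table earlier on this page, and `ultralytics` is imported only when the export actually runs.

```python
def supports_export(weights_name):
    """Mirror the Export column of the model table: only worldv2 weights are marked exportable."""
    return "worldv2" in weights_name


def export_custom_model(weights="yolov8s-worldv2.pt", classes=None, fmt="onnx"):
    """Load YOLO-World weights, optionally set a custom vocabulary, and export the model."""
    if not supports_export(weights):
        raise ValueError(f"{weights} is not listed as exportable; use a worldv2 model")

    # Imported lazily so the check above can be used without ultralytics installed
    from ultralytics import YOLOWorld

    model = YOLOWorld(weights)
    if classes:
        model.set_classes(classes)
    return model.export(format=fmt)


# Example call (downloads weights and requires ultralytics):
# exported = export_custom_model("yolov8s-worldv2.pt", classes=["person", "bus"])
```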

Predict Usage

Object detection is straightforward with the predict method:

Example

from ultralytics import YOLOWorld

# Initialize a YOLO-World model
model = YOLOWorld('yolov8s-world.pt')  # or select yolov8m/l-world.pt for different sizes

# Execute inference with the YOLOv8s-world model on the specified image
results = model.predict('path/to/image.jpg')

# Show results
results[0].show()

# CLI: perform object detection using a YOLO-World model
yolo predict model=yolov8s-world.pt source=path/to/image.jpg imgsz=640

This snippet shows how little code is needed to load a pre-trained model and run a prediction on an image.
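
Beyond displaying the image, the detections can be read programmatically. The helper below is a minimal sketch: `summarize_detections` is a hypothetical name, and it assumes the result object exposes `boxes.cls`, `boxes.conf`, `boxes.xyxy` and `names` as Ultralytics result objects do.

```python
def summarize_detections(result, min_conf=0.25):
    """Turn one result into a list of (class_name, confidence, [x1, y1, x2, y2]) tuples."""
    boxes = result.boxes
    rows = []
    for cls_id, conf, xyxy in zip(boxes.cls.tolist(), boxes.conf.tolist(), boxes.xyxy.tolist()):
        if conf >= min_conf:  # drop low-confidence detections
            rows.append((result.names[int(cls_id)], round(conf, 3), [round(v, 1) for v in xyxy]))
    return rows


# Usage, assuming `results` comes from model.predict(...) as in the example above:
# for name, conf, box in summarize_detections(results[0]):
#     print(name, conf, box)
```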

Val Usage

Model validation on a dataset is streamlined as follows:

Example

from ultralytics import YOLO

# Create a YOLO-World model
model = YOLO('yolov8s-world.pt')  # or select yolov8m/l-world.pt for different sizes

# Conduct model validation on the COCO8 example dataset
metrics = model.val(data='coco8.yaml')

# CLI: validate a YOLO-World model on the COCO8 dataset with a specified image size
yolo val model=yolov8s-world.pt data=coco8.yaml imgsz=640
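
The object returned by val can be reduced to the three scores reported in the zero-shot table. This is a sketch under the assumption that the metrics object exposes `box.map`, `box.map50` and `box.map75` as fractions between 0 and 1; `detection_metrics_row` is a hypothetical helper name.

```python
def detection_metrics_row(metrics):
    """Collect mAP, mAP50 and mAP75 from a validation result, expressed in percent."""
    box = metrics.box
    return {
        "mAP": round(float(box.map) * 100, 1),
        "mAP50": round(float(box.map50) * 100, 1),
        "mAP75": round(float(box.map75) * 100, 1),
    }


# Usage, assuming `metrics` comes from model.val(...) as in the example above:
# print(detection_metrics_row(metrics))
```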

Note

The YOLO-World models provided by Ultralytics come pre-configured with the COCO dataset categories as part of their offline vocabulary, so they can be applied immediately. The YOLOv8-World models can therefore recognize and predict the 80 standard categories defined in the COCO dataset without additional setup or customization.

Set prompts

Figure: YOLO-World prompt class names overview

The YOLO-World framework allows classes to be specified dynamically through custom prompts, so users can tailor the model to their needs without retraining. This is particularly useful for adapting the model to new domains or to specific tasks that were not part of the original training data. By setting custom prompts, users steer the model toward the objects of interest, which improves the relevance and accuracy of the detection results.

For instance, if your application only needs to detect 'person' and 'bus' objects, you can specify these classes directly:

Example

from ultralytics import YOLO

# Initialize a YOLO-World model
model = YOLO('yolov8s-world.pt')  # or choose yolov8m/l-world.pt

# Define custom classes
model.set_classes(["person", "bus"])

# Execute prediction for specified categories on an image
results = model.predict('path/to/image.jpg')

# Show results
results[0].show()
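
When prompts come from user input or a configuration file, it can help to clean them before calling set_classes. The helper below is purely illustrative: `normalize_prompts` is a hypothetical name, and lowercasing the prompts is an assumption of this sketch rather than a requirement stated in the source.

```python
def normalize_prompts(prompts):
    """Collapse whitespace, lowercase, and drop empty or duplicate prompts while keeping order."""
    seen = set()
    cleaned = []
    for prompt in prompts:
        text = " ".join(prompt.split()).lower()
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


classes = normalize_prompts([" Person", "bus", "Bus ", "", "traffic  light"])
print(classes)

# The cleaned list can then be passed to the model created in the example above:
# model.set_classes(classes)
```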

You can also save a model after setting custom classes. This creates a version of the YOLO-World model that is specialized for your use case. The custom class definitions are embedded directly in the model file, so the model is ready to use with your classes without further adjustment. Follow these steps to save and load your custom YOLOv8 model:

Example

First load a YOLO-World model, set custom classes for it, and save it:

from ultralytics import YOLO

# Initialize a YOLO-World model
model = YOLO('yolov8s-world.pt')  # or select yolov8m/l-world.pt

# Define custom classes
model.set_classes(["person", "bus"])

# Save the model with the defined offline vocabulary
model.save("custom_yolov8s.pt")

After saving, the custom_yolov8s.pt model behaves like any other pre-trained YOLOv8 model, with one key difference: it now detects only the classes you defined. This customization can significantly improve detection performance and efficiency for your specific application scenarios.

from ultralytics import YOLO

# Load your custom model
model = YOLO('custom_yolov8s.pt')

# Run inference to detect your custom classes
results = model.predict('path/to/image.jpg')

# Show results
results[0].show()
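
Before deploying a saved model, it may be worth confirming that it carries the expected vocabulary. The check below is a small sketch; `vocabulary_matches` is a hypothetical helper, and it accepts class names either as a dict of id to name or as a plain list, since this sketch does not assume which form a given model exposes.

```python
def vocabulary_matches(names, expected):
    """Compare a model's class names (dict of id -> name, or a list) with the expected prompts."""
    actual = list(names.values()) if isinstance(names, dict) else list(names)
    return actual == list(expected)


# Usage, assuming `model` was loaded from custom_yolov8s.pt as above:
# assert vocabulary_matches(model.names, ["person", "bus"])
```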

Benefits of Saving with Custom Vocabulary

  • Efficiency: Streamlines the detection process by focusing on relevant objects, reducing computational overhead and speeding up inference.
  • Flexibility: Makes it easy to adapt the model to new or niche detection tasks without extensive retraining or data collection.
  • Simplicity: Simplifies deployment by removing the need to specify custom classes repeatedly at runtime; the model is directly usable with its embedded vocabulary.
  • Performance: Improves detection accuracy for the specified classes by focusing the model's attention and resources on recognizing the defined objects.

This approach provides a powerful way to customize state-of-the-art object detection models for specific tasks, making advanced AI more accessible and applicable to a broader range of practical applications.

Reproduce official results from scratch (Experimental)

Prepare datasets

  • Train data
| Dataset | Type | Samples | Boxes | Annotation Files |
|---|---|---|---|---|
| Objects365v1 | Detection | 609k | 9621k | objects365_train.json |
| GQA | Grounding | 621k | 3681k | final_mixed_train_no_coco.json |
| Flickr30k | Grounding | 149k | 641k | final_flickr_separateGT_train.json |
  • Val data
| Dataset | Type | Annotation Files |
|---|---|---|
| LVIS minival | Detection | minival.txt |

Launch training from scratch

Note

WorldTrainerFromScratch is highly customized so that yolo-world models can be trained on detection datasets and grounding datasets at the same time. For more details, see ultralytics/models/yolo/world/train_world.py.

Example

from ultralytics.models.yolo.world.train_world import WorldTrainerFromScratch
from ultralytics import YOLOWorld

data = dict(
    train=dict(
        yolo_data=["Objects365.yaml"],
        grounding_data=[
            dict(
                img_path="../datasets/flickr30k/images",
                json_file="../datasets/flickr30k/final_flickr_separateGT_train.json",
            ),
            dict(
                img_path="../datasets/GQA/images",
                json_file="../datasets/GQA/final_mixed_train_no_coco.json",
            ),
        ],
    ),
    val=dict(yolo_data=["lvis.yaml"]),
)
model = YOLOWorld("yolov8s-worldv2.yaml")
model.train(data=data, batch=128, epochs=100, trainer=WorldTrainerFromScratch)
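
Training from scratch takes a long time, so it can be useful to confirm that the grounding paths in the data dict exist before launching. The check below is a sketch and not part of Ultralytics; `missing_grounding_paths` is a hypothetical helper that only inspects the img_path and json_file entries shown above.

```python
from pathlib import Path


def missing_grounding_paths(data):
    """Return the grounding image folders and JSON files from the config that do not exist on disk."""
    missing = []
    for entry in data.get("train", {}).get("grounding_data", []):
        for key in ("img_path", "json_file"):
            path = Path(entry[key])
            if not path.exists():
                missing.append(str(path))
    return missing


# Usage with the `data` dict defined above:
# problems = missing_grounding_paths(data)
# if problems:
#     raise FileNotFoundError(f"Missing grounding data: {problems}")
```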

Citations and Acknowledgements

We extend our gratitude to the Tencent AILab Computer Vision Center for their pioneering work in real-time open-vocabulary object detection with YOLO-World:

@article{cheng2024yolow,
title={YOLO-World: Real-Time Open-Vocabulary Object Detection},
author={Cheng, Tianheng and Song, Lin and Ge, Yixiao and Liu, Wenyu and Wang, Xinggang and Shan, Ying},
journal={arXiv preprint arXiv:2401.17270},
year={2024}
}

For further reading, the original YOLO-World paper is available on arXiv. The project's source code and additional resources can be accessed via their GitHub repository. We appreciate their commitment to advancing the field and sharing their valuable insights with the community.



Created 2024-02-14, Updated 2024-04-02
Authors: Burhan-Q (1), Laughing-q (4), glenn-jocher (1)
