
Neural Magic's DeepSparse

Welcome to software-delivered AI.

This guide explains how to deploy YOLOv5 with Neural Magic's DeepSparse.

DeepSparse is an inference runtime with exceptional performance on CPUs. For example, DeepSparse delivers a 5.8x speedup over an ONNX Runtime baseline for YOLOv5 running on the same machine!

[Figure: YOLOv5 speed improvement]

For the first time, deep learning workloads can meet the performance demands of production without the complexity and cost of hardware accelerators. Put simply, DeepSparse gives you the performance of GPUs with the simplicity of software:

  • Flexible deployment: Run consistently across cloud, data center, and edge on hardware from any provider, including Intel, AMD, and ARM.
  • Infinite scalability: Scale vertically to hundreds of cores, scale out with standard Kubernetes, or abstract the infrastructure away entirely with serverless.
  • Easy integration: Clean APIs for integrating your model into an application and monitoring it in production.

How does DeepSparse achieve GPU-class performance?

DeepSparse takes advantage of model sparsity to gain its speedup.

Sparsification through pruning and quantization is a widely studied technique that can greatly reduce the size and compute needed to run a network while maintaining high accuracy. DeepSparse is sparsity-aware: it skips the zeroed-out parameters, which reduces the amount of compute in a forward pass. Because the sparse computation is then memory bound, DeepSparse executes the network depth-wise, breaking the problem into Tensor Columns, which are vertical stripes of computation that fit in cache.
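As a rough illustration of what "skipping the zeroed-out parameters" means, the sketch below stores only the nonzero weights of a pruned row and counts how many multiplications a dot product then needs. It is plain Python with hypothetical helper names (`compress_row`, `sparse_dot`), and it is not a description of how DeepSparse is implemented internally.

```python
from typing import List, Tuple


def compress_row(weights: List[float]) -> List[Tuple[int, float]]:
    """Keep only (index, value) pairs for the nonzero weights."""
    return [(i, w) for i, w in enumerate(weights) if w != 0.0]


def sparse_dot(row: List[Tuple[int, float]], x: List[float]) -> Tuple[float, int]:
    """Dot product over the stored nonzeros; also returns the multiply count."""
    total = 0.0
    for i, w in row:
        total += w * x[i]
    return total, len(row)


# A pruned weight row: 6 of 8 entries are zero (75% sparsity).
weights = [0.0, 0.5, 0.0, 0.0, -2.0, 0.0, 0.0, 0.0]
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

# Dense reference: multiplies every weight, including the zeros.
dense_result = sum(w * v for w, v in zip(weights, x))

# Sparse version: touches only the two nonzero weights.
sparse_result, multiplies = sparse_dot(compress_row(weights), x)

print(dense_result, sparse_result, multiplies)
```

Both paths give the same result, but the sparse path performs 2 multiplications instead of 8. The fewer arithmetic operations remain, the more the cost is dominated by moving data, which is the memory-bound situation the paragraph above describes.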

[Figure: YOLO model pruning]

Sparse networks with compressed computation, executed depth-wise in cache, allow DeepSparse to deliver GPU-class performance on CPUs!

How do I create a sparse version of YOLOv5 trained on my data?

SparseZoo, Neural Magic's open-source model repository, contains pre-sparsified checkpoints of each YOLOv5 model. Using SparseML, which is integrated with Ultralytics, you can fine-tune a sparse checkpoint onto your data with a single CLI command.

See Neural Magic's YOLOv5 documentation for more details.

DeepSparse Usage

Let's walk through an example of benchmarking and deploying a sparse version of YOLOv5 with DeepSparse.

Install DeepSparse

Run the following to install DeepSparse. We recommend using a Python virtual environment.

pip install "deepsparse[server,yolo,onnxruntime]"

Collect an ONNX File

DeepSparse accepts a model in ONNX format, passed either as:

  • A SparseZoo stub that identifies an ONNX file in SparseZoo
  • A local path to an ONNX model in the filesystem

The examples below use the standard dense and pruned-quantized YOLOv5s checkpoints, identified by the following SparseZoo stubs:

zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/base-none
zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned65_quant-none

Deploy a Model

DeepSparse offers convenient APIs for integrating your model into an application.

To try the deployment examples below, download a sample image and save it as basilica.jpg with the following command:

wget -O basilica.jpg https://raw.githubusercontent.com/neuralmagic/deepsparse/main/src/deepsparse/yolo/sample_images/basilica.jpg

Python API

Pipelines wrap pre-processing and output post-processing around the runtime, providing a clean interface for adding DeepSparse to an application. The DeepSparse-Ultralytics integration includes an out-of-the-box Pipeline that accepts raw images and outputs bounding boxes.

Create a Pipeline and run inference:

from deepsparse import Pipeline

# list of images in local filesystem
images = ["basilica.jpg"]

# create Pipeline
model_stub = "zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned65_quant-none"
yolo_pipeline = Pipeline.create(
    task="yolo",
    model_path=model_stub,
)

# run inference on images, receive bounding boxes + classes
pipeline_outputs = yolo_pipeline(images=images, iou_thres=0.6, conf_thres=0.001)
print(pipeline_outputs)
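Because the example above passes a very low conf_thres (0.001), the output can contain many low-confidence boxes. The sketch below shows one way to keep only the confident detections for a single image. It works on plain lists so it runs without DeepSparse installed; `filter_detections` is a hypothetical helper, the sample values are made up, and the assumption that the pipeline output exposes per-image `boxes`, `scores`, and `labels` lists should be checked against your installed version.

```python
from typing import List, Sequence, Tuple

# One detection: (box coordinates, confidence score, class label)
Detection = Tuple[Sequence[float], float, str]


def filter_detections(
    boxes: Sequence[Sequence[float]],
    scores: Sequence[float],
    labels: Sequence[str],
    min_score: float = 0.25,
) -> List[Detection]:
    """Return the detections of one image whose score is at least min_score."""
    return [
        (box, score, label)
        for box, score, label in zip(boxes, scores, labels)
        if score >= min_score
    ]


# Stand-in values for a single image (not real model output).
boxes = [[10.0, 20.0, 110.0, 220.0], [5.0, 5.0, 50.0, 50.0], [0.0, 0.0, 8.0, 8.0]]
scores = [0.91, 0.40, 0.002]
labels = ["0", "2", "7"]

kept = filter_detections(boxes, scores, labels, min_score=0.25)
for box, score, label in kept:
    print(label, round(score, 2), box)
```

With real output, and under the assumption above, the call would look something like `filter_detections(pipeline_outputs.boxes[0], pipeline_outputs.scores[0], pipeline_outputs.labels[0])`.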

If you are running in the cloud, you may get an error that open-cv cannot find libGL.so.1. Running the following on Ubuntu installs it:

apt-get install libgl1

HTTP ์„œ๋ฒ„

DeepSparse Server runs on top of the popular FastAPI web framework and Uvicorn web server. With a single CLI command, you can set up a model service endpoint with DeepSparse. The Server supports any Pipeline from DeepSparse, including object detection with YOLOv5, so you can send raw images to the endpoint and receive bounding boxes.

Spin up the Server with the pruned-quantized YOLOv5s:

deepsparse.server \
    --task yolo \
    --model_path zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned65_quant-none

An example request, using Python's requests package:

import requests, json

# list of images for inference (local files on client side)
path = ['basilica.jpg']
files = [('request', open(img, 'rb')) for img in path]

# send request over HTTP to /predict/from_files endpoint
url = 'http://0.0.0.0:5543/predict/from_files'
resp = requests.post(url=url, files=files)

# response is returned in JSON
annotations = json.loads(resp.text)  # dictionary of annotation results
bounding_boxes = annotations["boxes"]
labels = annotations["labels"]
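Once the JSON response is parsed, a common next step is summarizing what was detected. The sketch below counts labels per image. It assumes the response has the "boxes" and "labels" keys used above and that each holds one list per submitted image; `count_labels` is a hypothetical helper and the sample payload is hand-written, not real server output.

```python
import json
from collections import Counter
from typing import List


def count_labels(response_text: str) -> List[Counter]:
    """Return one Counter of labels per image in the JSON response."""
    annotations = json.loads(response_text)
    return [Counter(image_labels) for image_labels in annotations["labels"]]


# Hand-written stand-in for resp.text (one image, three detections).
sample_text = json.dumps(
    {
        "boxes": [[[0, 0, 10, 10], [5, 5, 20, 20], [1, 1, 2, 2]]],
        "labels": [["0", "0", "2"]],
    }
)

counts = count_labels(sample_text)
print(counts[0].most_common())
```

In the request example, `count_labels(resp.text)` would be the equivalent call, provided the response shape matches the assumption above.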

Annotate CLI

You can also use the annotate command to have the engine save an annotated photo on disk. Try --source 0 to annotate your live webcam feed!

deepsparse.object_detection.annotate --model_filepath zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned65_quant-none --source basilica.jpg

Running the above command creates an annotation-results folder and saves the annotated image inside.

[Figure: annotated image]

Benchmarking Performance

We will compare DeepSparse's throughput to ONNX Runtime's throughput on YOLOv5, using DeepSparse's benchmarking script.

The benchmarks were run on an AWS c6i.8xlarge instance (16 cores).

Batch 32 Performance Comparison

ONNX Runtime Baseline

At batch 32, ONNX Runtime achieves 42 images/sec with the standard dense YOLOv5:

deepsparse.benchmark zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/base-none -s sync -b 32 -nstreams 1 -e onnxruntime

> Original Model Path: zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/base-none
> Batch Size: 32
> Scenario: sync
> Throughput (items/sec): 41.9025

DeepSparse Dense Performance

While DeepSparse offers its best performance with optimized sparse models, it also performs well with the standard dense YOLOv5.

At batch 32, DeepSparse achieves 70 images/sec with the standard dense YOLOv5, a 1.7x performance improvement over ONNX Runtime (ORT)!

deepsparse.benchmark zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/base-none -s sync -b 32 -nstreams 1

> Original Model Path: zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/base-none
> Batch Size: 32
> Scenario: sync
> Throughput (items/sec): 69.5546

DeepSparse Sparse Performance

When sparsity is applied to the model, DeepSparse's performance gains over ONNX Runtime are even stronger.

At batch 32, DeepSparse achieves 241 images/sec with the pruned-quantized YOLOv5, a 5.8x performance improvement over ONNX Runtime (ORT)!

deepsparse.benchmark zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned65_quant-none -s sync -b 32 -nstreams 1

> Original Model Path: zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned65_quant-none
> Batch Size: 32
> Scenario: sync
> Throughput (items/sec): 241.2452

Batch 1 Performance Comparison

DeepSparse also gains a speedup over ONNX Runtime in the latency-sensitive batch 1 scenario.

ONNX Runtime Baseline

At batch 1, ONNX Runtime achieves 48 images/sec with the standard dense YOLOv5:

deepsparse.benchmark zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/base-none -s sync -b 1 -nstreams 1 -e onnxruntime

> Original Model Path: zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/base-none
> Batch Size: 1
> Scenario: sync
> Throughput (items/sec): 48.0921

DeepSparse Sparse Performance

At batch 1, DeepSparse achieves 135 items/sec with the pruned-quantized YOLOv5s, a 2.8x performance gain over ONNX Runtime!

deepsparse.benchmark zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned65_quant-none -s sync -b 1 -nstreams 1

> Original Model Path: zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned65_quant-none
> Batch Size: 1
> Scenario: sync
> Throughput (items/sec): 134.9468

Because c6i.8xlarge instances support VNNI instructions, DeepSparse's throughput can be pushed further if the weights are pruned in blocks of 4.

At batch 1, DeepSparse achieves 180 items/sec with a 4-block pruned-quantized YOLOv5, a 3.7x performance gain over ONNX Runtime!

deepsparse.benchmark zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned35_quant-none-vnni -s sync -b 1 -nstreams 1

> Original Model Path: zoo:cv/detection/yolov5-s/pytorch/ultralytics/coco/pruned35_quant-none-vnni
> Batch Size: 1
> Scenario: sync
> Throughput (items/sec): 179.7375
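The speedup figures quoted in this section can be reproduced from the throughput lines printed above. The sketch below parses a throughput line and divides each DeepSparse result by the ONNX Runtime baseline at the same batch size. The numbers are copied from the benchmark output in this guide; `parse_throughput` and the run names are illustrative, not part of the DeepSparse CLI.

```python
import re


def parse_throughput(output: str) -> float:
    """Extract the items/sec value from deepsparse.benchmark output text."""
    match = re.search(r"Throughput \(items/sec\):\s*([0-9.]+)", output)
    if match is None:
        raise ValueError("no throughput line found")
    return float(match.group(1))


# ONNX Runtime baselines (dense YOLOv5s), keyed by batch size.
baselines = {
    32: parse_throughput("> Throughput (items/sec): 41.9025"),
    1: parse_throughput("> Throughput (items/sec): 48.0921"),
}

# DeepSparse runs: (batch size, model variant, items/sec).
deepsparse_runs = [
    (32, "base-none", 69.5546),
    (32, "pruned65_quant-none", 241.2452),
    (1, "pruned65_quant-none", 134.9468),
    (1, "pruned35_quant-none-vnni", 179.7375),
]

# Speedup = DeepSparse throughput / ONNX Runtime throughput at the same batch.
speedups = {
    (batch, variant): round(throughput / baselines[batch], 1)
    for batch, variant, throughput in deepsparse_runs
}

for (batch, variant), ratio in speedups.items():
    print(f"batch {batch:>2}  {variant:<26} {ratio}x")
```

Rounded to one decimal place, the ratios are 1.7x and 5.8x at batch 32, and 2.8x and 3.7x at batch 1, matching the figures stated in the text.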

Get Started With DeepSparse

Research or testing? DeepSparse Community is free for research and testing. See the documentation to get started.



Created 2023-11-12, Updated 2023-12-03
Authors: glenn-jocher (3)
