

Mobile Segment Anything (MobileSAM)

The MobileSAM paper is now available on arXiv.

A demonstration of MobileSAM running on a CPU can be accessed at this demo link. Inference takes approximately 3 seconds on a Mac i5 CPU. On the Hugging Face demo, the interface and lower-performance CPUs slow the response, but it continues to work effectively.



Watch: How to Run Inference with MobileSAM using Ultralytics | Step-by-Step Guide

MobileSAM is implemented in various projects, including Grounding-SAM, AnyLabeling, and Segment Anything in 3D.

MobileSAM was trained on a single GPU in less than a day, using a 100k-image dataset (1% of the original images). The code for this training will be made available in the future.

Available Models, Supported Tasks, and Operating Modes

This table presents the available models with their specific pre-trained weights, the tasks they support, and their compatibility with the operating modes Inference, Validation, Training, and Export. Supported modes are marked with ✅ and unsupported modes with ❌.

Model Type   Pre-trained Weights   Tasks Supported         Inference   Validation   Training   Export
MobileSAM    mobile_sam.pt         Instance Segmentation   ✅          ❌           ❌         ❌

Adapting from SAM to MobileSAM

Because MobileSAM retains the same pipeline as the original SAM, it incorporates the original's pre-processing, post-processing, and all other interfaces. Those currently using the original SAM can therefore switch to MobileSAM with minimal effort.

MobileSAM performs comparably to the original SAM and keeps the same pipeline except for a change in the image encoder. Specifically, the original heavyweight ViT-H encoder (632M) is replaced with a smaller Tiny-ViT (5M). On a single GPU, MobileSAM runs at about 12ms per image: 8ms on the image encoder and 4ms on the mask decoder.

The following table compares the ViT-based image encoders:

Image Encoder   Original SAM   MobileSAM
Parameters      611M           5M
Speed           452ms          8ms

Both the original SAM and MobileSAM use the same prompt-guided mask decoder:

Mask Decoder   Original SAM   MobileSAM
Parameters     3.876M         3.876M
Speed          4ms            4ms

Here is the comparison of the whole pipeline:

Whole Pipeline (Enc+Dec)   Original SAM   MobileSAM
Parameters                 615M           9.66M
Speed                      456ms          12ms
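
The whole-pipeline figures imply how much smaller and faster MobileSAM is than the original SAM. The short sketch below only redoes that arithmetic from the numbers quoted in the tables; the variable names are illustrative, and the throughput line assumes images are processed one at a time at the quoted single-GPU latency.

```python
# Whole-pipeline figures copied from the comparison table
# (parameters in millions, latency in milliseconds per image).
ORIGINAL_SAM = {"params_m": 615.0, "latency_ms": 456.0}
MOBILE_SAM = {"params_m": 9.66, "latency_ms": 12.0}


def ratio(baseline: float, candidate: float) -> float:
    """Return how many times larger the baseline figure is than the candidate."""
    return baseline / candidate


size_ratio = ratio(ORIGINAL_SAM["params_m"], MOBILE_SAM["params_m"])
speed_ratio = ratio(ORIGINAL_SAM["latency_ms"], MOBILE_SAM["latency_ms"])
# Images per second when frames are processed sequentially at the quoted latency.
throughput_fps = 1000.0 / MOBILE_SAM["latency_ms"]

print(f"size ratio: {size_ratio:.1f}x")              # size ratio: 63.7x
print(f"speed ratio: {speed_ratio:.1f}x")            # speed ratio: 38.0x
print(f"throughput: {throughput_fps:.1f} images/s")  # throughput: 83.3 images/s
```

These ratios apply to the comparison with the original SAM only; the "5 times smaller and 7 times faster" figure quoted elsewhere on this page refers to FastSAM.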

The performance of MobileSAM and the original SAM is demonstrated using both a point and a box as prompts.

Image with Point as Prompt

Image with Box as Prompt

With its superior performance, MobileSAM is approximately 5 times smaller and 7 times faster than the current FastSAM. More details are available on the MobileSAM project page.

Testing MobileSAM in Ultralytics

Just like the original SAM, Ultralytics offers a straightforward testing method, including modes for both point and box prompts.

Model Download

You can download the model here.

Point Prompt

Example

from ultralytics import SAM

# Load the model
model = SAM("mobile_sam.pt")

# Predict a segment based on a single point prompt
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])

# Predict multiple segments based on multiple points prompt
model.predict("ultralytics/assets/zidane.jpg", points=[[400, 370], [900, 370]], labels=[1, 1])

# Predict a segment based on multiple points prompt per object
model.predict("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 1]])

# Predict a segment using both positive and negative prompts.
model.predict("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 0]])
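
The nesting of `points` and `labels` in these calls decides how many objects are segmented and how many points belong to each. The sketch below restates that convention as the comments in the example describe it. `nesting_depth` and `group_point_prompt` are hypothetical helpers written for illustration; they are not part of the Ultralytics API and do not reproduce its internal logic.

```python
def nesting_depth(value) -> int:
    """Count list/tuple nesting levels, e.g. [900, 370] -> 1 and [[400, 370]] -> 2."""
    depth = 0
    while isinstance(value, (list, tuple)):
        if not value:
            raise ValueError("prompt contains an empty list")
        value = value[0]
        depth += 1
    return depth


def group_point_prompt(points, labels):
    """Group points and labels per object (hypothetical helper, not an Ultralytics API)."""
    depth = nesting_depth(points)
    if depth == 1:  # [x, y] -> one object described by one point
        grouped_points, grouped_labels = [[list(points)]], [list(labels)]
    elif depth == 2:  # [[x, y], ...] -> one object per point
        grouped_points = [[list(p)] for p in points]
        grouped_labels = [[label] for label in labels]
    elif depth == 3:  # [[[x, y], ...], ...] -> several points per object
        grouped_points = [[list(p) for p in obj] for obj in points]
        grouped_labels = [list(obj) for obj in labels]
    else:
        raise ValueError(f"unsupported nesting depth: {depth}")
    if len(grouped_points) != len(grouped_labels):
        raise ValueError("points and labels describe a different number of objects")
    for obj_points, obj_labels in zip(grouped_points, grouped_labels):
        if len(obj_points) != len(obj_labels):
            raise ValueError("each point needs exactly one label")
    return grouped_points, grouped_labels


# The four prompts from the example above, in the same order.
examples = [
    ([900, 370], [1]),
    ([[400, 370], [900, 370]], [1, 1]),
    ([[[400, 370], [900, 370]]], [[1, 1]]),
    ([[[400, 370], [900, 370]]], [[1, 0]]),
]
for points, labels in examples:
    grouped_points, grouped_labels = group_point_prompt(points, labels)
    print(len(grouped_points), "object(s), labels per object:", grouped_labels)
```

A label of 1 marks a positive point (part of the object) and 0 a negative point (to be excluded), so the last prompt asks for one object that contains the first point but not the second.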

Box Prompt

Example

from ultralytics import SAM

# Load the model
model = SAM("mobile_sam.pt")

# Predict a segment based on a box prompt
model.predict("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])
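
Box prompts are passed to `predict` through the `bboxes` argument. The sketch below assumes `bboxes` expects corner coordinates (x1, y1, x2, y2); if your boxes are stored as (x, y, width, height), they need converting first. `xywh_to_xyxy` is a hypothetical helper for that step, not an Ultralytics function.

```python
def xywh_to_xyxy(box):
    """Convert a (x, y, width, height) box to corner format (x1, y1, x2, y2).

    Hypothetical helper: assumes (x, y) is the top-left corner of the box.
    """
    x, y, width, height = box
    if width < 0 or height < 0:
        raise ValueError("width and height must be non-negative")
    return [x, y, x + width, y + height]


# A box with top-left corner (439, 437), 85 pixels wide and 272 pixels high.
corner_box = xywh_to_xyxy([439, 437, 85, 272])
print(corner_box)  # [439, 437, 524, 709]
```

The converted list can then be passed as `bboxes=corner_box` in the `predict` call.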

MobileSAM and SAM are implemented with the same API. For more usage information, see the SAM page.

Citations and Acknowledgements

If you find MobileSAM useful in your research or development work, please consider citing our paper:

@article{mobile_sam,
  title={Faster Segment Anything: Towards Lightweight SAM for Mobile Applications},
  author={Zhang, Chaoning and Han, Dongshen and Qiao, Yu and Kim, Jung Uk and Bae, Sung Ho and Lee, Seungkyu and Hong, Choong Seon},
  journal={arXiv preprint arXiv:2306.14289},
  year={2023}
}

FAQ

What is MobileSAM and how does it differ from the original SAM model?

MobileSAM is a lightweight, fast image segmentation model designed for mobile applications. It retains the same pipeline as the original SAM but replaces the heavyweight ViT-H encoder (632M parameters) with a smaller Tiny-ViT encoder (5M parameters). This change cuts the whole pipeline from 615M to 9.66M parameters and from 456ms to about 12ms per image on a single GPU. MobileSAM is also approximately 5 times smaller and 7 times faster than FastSAM. You can learn more about the MobileSAM implementation in various projects here.

How can I test MobileSAM using Ultralytics?

Testing MobileSAM in Ultralytics is straightforward. You can use point and box prompts to predict segments. Here is an example using a point prompt:

from ultralytics import SAM

# Load the model
model = SAM("mobile_sam.pt")

# Predict a segment based on a point prompt
model.predict("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])

For more details, see the Testing MobileSAM section.

Why should I use MobileSAM for my mobile application?

MobileSAM is well suited to mobile applications because of its lightweight architecture and fast inference speed. Compared with the original SAM, the whole pipeline shrinks from 615M to 9.66M parameters and from 456ms to about 12ms per image on a single GPU. MobileSAM is also approximately 5 times smaller and 7 times faster than FastSAM, which makes it suitable for environments where computational resources are limited. This efficiency lets mobile devices perform real-time image segmentation without significant latency. In addition, MobileSAM inference is optimized for mobile performance.

How was MobileSAM trained, and is the training code available?

MobileSAM was trained on a single GPU in less than a day, using a 100k-image dataset (1% of the original images). The training code will be made available in the future. In the meantime, you can explore other aspects of MobileSAM in the MobileSAM GitHub repository, which includes pre-trained weights and implementation details for various applications.

What are the primary use cases for MobileSAM?

MobileSAM is designed for fast and efficient image segmentation in mobile environments. Primary use cases include:

  • Real-time object detection and segmentation for mobile applications.
  • Low-latency image processing for devices with limited computational resources.
  • Integration in AI-driven mobile apps for tasks such as augmented reality (AR) and real-time analytics.

For more detailed use cases and performance comparisons, see the Adapting from SAM to MobileSAM section.


