
Segment Anything Model (SAM)

Welcome to the frontier of image segmentation with the Segment Anything Model (SAM). This model introduced promptable image segmentation with real-time performance and set a new standard in the field.

Introduction to SAM: The Segment Anything Model

The Segment Anything Model (SAM) is a state-of-the-art image segmentation model that supports promptable segmentation, which makes it highly versatile for image analysis tasks. SAM is the core of the Segment Anything initiative, a project that introduces a new model, a new task, and a new dataset for image segmentation.

SAM's design lets it adapt to new image distributions and tasks without prior knowledge, a capability known as zero-shot transfer. Trained on the SA-1B dataset, which contains more than 1 billion masks spread over 11 million carefully curated images, SAM has shown impressive zero-shot performance that in many cases surpasses earlier fully supervised results.

Figure: SA-1B example images. Dataset images with overlaid masks from the newly introduced SA-1B dataset. SA-1B contains 11 million diverse, high-resolution, licensed, and privacy-protecting images and 1.1 billion high-quality segmentation masks. The masks were annotated fully automatically by SAM and, as verified by human ratings and numerous experiments, are of high quality and diversity. Images are grouped by the number of masks per image for visualization (about 100 masks per image on average).

Key Features of the Segment Anything Model (SAM)

  • Promptable segmentation task: SAM was designed around a promptable segmentation task, so it can generate a valid segmentation mask from any given prompt, such as spatial or text cues that identify an object.
  • Advanced architecture: The Segment Anything Model uses a powerful image encoder, a prompt encoder, and a lightweight mask decoder. This architecture enables flexible prompting, real-time mask computation, and ambiguity awareness in segmentation tasks.
  • SA-1B dataset: Introduced by the Segment Anything project, the SA-1B dataset contains more than 1 billion masks on 11 million images. As the largest segmentation dataset to date, it gives SAM a diverse, large-scale source of training data.
  • Zero-shot performance: SAM shows strong zero-shot performance across a range of segmentation tasks, making it a ready-to-use tool for diverse applications with minimal prompt engineering.

For an in-depth look at the Segment Anything Model and the SA-1B dataset, visit the Segment Anything website and read the research paper "Segment Anything".

Available Models, Supported Tasks, and Operating Modes

This table lists the available models with their pre-trained weights, the tasks they support, and their compatibility with the operating modes Inference, Validation, Training, and Export. Supported modes are marked with ✅ and unsupported modes with ❌.

| Model Type | Pre-trained Weights | Tasks Supported | Inference | Validation | Training | Export |
| --- | --- | --- | --- | --- | --- | --- |
| SAM base | sam_b.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |
| SAM large | sam_l.pt | Instance Segmentation | ✅ | ❌ | ❌ | ❌ |

How to Use SAM: Versatility and Power in Image Segmentation

The Segment Anything Model can be used for many downstream tasks that go beyond its training data, including edge detection, object proposal generation, instance segmentation, and preliminary text-to-mask prediction. With prompt engineering, SAM can adapt quickly to new tasks and data distributions in a zero-shot manner, which makes it a versatile and powerful tool for image segmentation.

SAM prediction example

Segment with prompts

Segment an image with the given prompts.

from ultralytics import SAM

# Load a model
model = SAM("sam_b.pt")

# Display model information (optional)
model.info()

# Run inference with bboxes prompt
results = model("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])

# Run inference with single point
results = model("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])

# Run inference with multiple points (one point per object)
results = model("ultralytics/assets/zidane.jpg", points=[[400, 370], [900, 370]], labels=[1, 1])

# Run inference with multiple points prompt per object
results = model("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 1]])

# Run inference with negative points prompt (label 0 marks a negative point)
results = model("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 0]])
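The point prompts above differ only in how deeply the lists are nested, which is easy to misread. The following is a minimal sketch in plain Python, not part of the Ultralytics API: `describe_point_prompt` is a hypothetical helper that reports which points belong to which object, assuming the nesting conventions used in the examples above.

```python
def nesting_depth(value):
    """Return how many list levels wrap the innermost numbers."""
    depth = 0
    while isinstance(value, (list, tuple)):
        depth += 1
        value = value[0]
    return depth


def describe_point_prompt(points, labels):
    """Hypothetical helper: interpret a points/labels pair.

    Assumed conventions (taken from the examples above):
      depth 1: [x, y]                -> one object, one point
      depth 2: [[x, y], ...]         -> one point per object
      depth 3: [[[x, y], ...], ...]  -> several points per object
    Label 1 marks a positive point, label 0 a negative point.
    """
    depth = nesting_depth(points)
    if depth == 1:
        points, labels = [[points]], [labels]
    elif depth == 2:
        points = [[p] for p in points]
        labels = [[lab] for lab in labels]
    elif depth != 3:
        raise ValueError("points must be nested 1 to 3 levels deep")

    if len(points) != len(labels):
        raise ValueError("points and labels describe different object counts")

    summary = []
    for obj_points, obj_labels in zip(points, labels):
        if len(obj_points) != len(obj_labels):
            raise ValueError("each point needs exactly one label")
        pairs = list(zip(obj_points, obj_labels))
        summary.append({
            "positive": [p for p, lab in pairs if lab == 1],
            "negative": [p for p, lab in pairs if lab == 0],
        })
    return summary


# Two objects, one positive point each
print(describe_point_prompt([[400, 370], [900, 370]], [1, 1]))
# One object described by a positive and a negative point
print(describe_point_prompt([[[400, 370], [900, 370]]], [[1, 0]]))
```

Checking a prompt this way before calling the model can catch a mismatched `labels` shape early.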

Segment everything

Segment the whole image.

from ultralytics import SAM

# Load a model
model = SAM("sam_b.pt")

# Display model information (optional)
model.info()

# Run inference
model("path/to/image.jpg")

Or from the CLI:

# Run inference with a SAM model
yolo predict model=sam_b.pt source=path/to/image.jpg

  • The logic here is that the whole image is segmented if you do not pass any prompts (bboxes/points/masks).

SAMPredictor example

With SAMPredictor you can set the image once and run prompt inference multiple times without running the image encoder again for each prompt.

import cv2

from ultralytics.models.sam import Predictor as SAMPredictor

# Create SAMPredictor
overrides = dict(conf=0.25, task="segment", mode="predict", imgsz=1024, model="mobile_sam.pt")
predictor = SAMPredictor(overrides=overrides)

# Set image (use either form)
predictor.set_image("ultralytics/assets/zidane.jpg")  # set with image file
predictor.set_image(cv2.imread("ultralytics/assets/zidane.jpg"))  # set with np.ndarray

# Run inference with bboxes prompt
results = predictor(bboxes=[439, 437, 524, 709])

# Run inference with single point prompt
results = predictor(points=[900, 370], labels=[1])

# Run inference with multiple points prompt (one point per object)
results = predictor(points=[[400, 370], [900, 370]], labels=[1, 1])

# Run inference with negative points prompt
results = predictor(points=[[[400, 370], [900, 370]]], labels=[[1, 0]])

# Reset image
predictor.reset_image()
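The set-once, prompt-many pattern can be illustrated with a toy sketch. This is not the Ultralytics implementation: `CachedPromptSession` and the stand-in `encode`/`decode` functions are hypothetical, and the sketch only shows why caching the output of an expensive encoder keeps repeated prompts cheap.

```python
class CachedPromptSession:
    """Toy illustration: run an expensive encoder once, reuse it for many prompts."""

    def __init__(self, encode, decode):
        self._encode = encode  # expensive step (stands in for the image encoder)
        self._decode = decode  # cheap step (stands in for prompt encoder + mask decoder)
        self._features = None
        self.encoder_calls = 0

    def set_image(self, image):
        """Encode the image once and cache the result."""
        self._features = self._encode(image)
        self.encoder_calls += 1

    def __call__(self, prompt):
        """Answer a prompt from the cached features only."""
        if self._features is None:
            raise RuntimeError("call set_image() before prompting")
        return self._decode(self._features, prompt)

    def reset_image(self):
        """Drop the cached features."""
        self._features = None


# Stand-in functions; a real encoder would return an embedding tensor.
session = CachedPromptSession(encode=lambda img: sum(img), decode=lambda f, p: f + p)
session.set_image([1, 2, 3])
outputs = [session(p) for p in (10, 20, 30)]
print(outputs, session.encoder_calls)  # three prompts, one encoder call
```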

Segment everything with additional arguments.

from ultralytics.models.sam import Predictor as SAMPredictor

# Create SAMPredictor
overrides = dict(conf=0.25, task="segment", mode="predict", imgsz=1024, model="mobile_sam.pt")
predictor = SAMPredictor(overrides=overrides)

# Segment with additional args
results = predictor(source="ultralytics/assets/zidane.jpg", crop_n_layers=1, points_stride=64)

Note

All `results` returned in the examples above are Results objects, which give easy access to the predicted masks and the source image.

SAM comparison vs YOLOv8

Here we compare Meta's smallest SAM model, SAM-b, with Ultralytics' smallest segmentation model, YOLOv8n-seg:

| Model | Size (MB) | Parameters (M) | Speed (CPU) (ms/im) |
| --- | --- | --- | --- |
| Meta SAM-b | 358 | 94.7 | 51096 |
| MobileSAM | 40.7 | 10.1 | 46122 |
| FastSAM-s with YOLOv8 backbone | 23.7 | 11.8 | 115 |
| Ultralytics YOLOv8n-seg | 6.7 (53.4x smaller) | 3.4 (27.9x less) | 59 (866x faster) |

This comparison shows order-of-magnitude differences in size and speed between the models. SAM offers unique capabilities for automatic segmentation, but it is not a direct competitor to the YOLOv8 segmentation models, which are smaller, faster, and more efficient.
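The multipliers quoted for YOLOv8n-seg are relative to SAM-b. As a quick sanity check, the short sketch below recomputes them from the figures in the comparison table; the dictionary keys are names chosen here for illustration.

```python
# Figures copied from the SAM-b and YOLOv8n-seg rows of the comparison table
sam_b = {"size_mb": 358, "params_m": 94.7, "cpu_ms": 51096}
yolov8n_seg = {"size_mb": 6.7, "params_m": 3.4, "cpu_ms": 59}

# Ratio of SAM-b to YOLOv8n-seg for each metric
ratios = {key: sam_b[key] / yolov8n_seg[key] for key in sam_b}
for key, ratio in ratios.items():
    print(f"{key}: {ratio:.1f}x")
```

The results round to 53.4x for size, 27.9x for parameters, and 866x for CPU speed, matching the table.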

Tests were run on a 2023 Apple M2 MacBook with 16 GB of RAM. To reproduce this test:

Example

from ultralytics import ASSETS, SAM, YOLO, FastSAM

# Profile SAM-b, MobileSAM
for file in ["sam_b.pt", "mobile_sam.pt"]:
    model = SAM(file)
    model.info()
    model(ASSETS)

# Profile FastSAM-s
model = FastSAM("FastSAM-s.pt")
model.info()
model(ASSETS)

# Profile YOLOv8n-seg
model = YOLO("yolov8n-seg.pt")
model.info()
model(ASSETS)

Auto-Annotation: A Quick Path to Segmentation Datasets

Auto-annotation is a key feature of SAM that lets users generate a segmentation dataset using a pre-trained detection model. It enables fast, accurate annotation of large numbers of images and avoids time-consuming manual labeling.

Generate Your Segmentation Dataset Using a Detection Model

To auto-annotate your dataset with the Ultralytics framework, call the auto_annotate function:

Example

from ultralytics.data.annotator import auto_annotate

auto_annotate(data="path/to/images", det_model="yolo11x.pt", sam_model="sam_b.pt")

| Argument | Type | Default | Description |
| --- | --- | --- | --- |
| data | str | required | Path to directory containing target images/videos for annotation or segmentation. |
| det_model | str | "yolo11x.pt" | YOLO detection model path for initial object detection. |
| sam_model | str | "sam2_b.pt" | SAM2 model path for segmentation (supports t/s/b/l variants and SAM2.1) and mobile_sam models. |
| device | str | "" | Computation device (e.g., 'cuda:0', 'cpu', or '' for automatic device detection). |
| conf | float | 0.25 | YOLO detection confidence threshold for filtering weak detections. |
| iou | float | 0.45 | IoU threshold for Non-Maximum Suppression to filter overlapping boxes. |
| imgsz | int | 640 | Input size for resizing images (must be multiple of 32). |
| max_det | int | 300 | Maximum number of detections per image for memory efficiency. |
| classes | list[int] | None | List of class indices to detect (e.g., [0, 1] for person & bicycle). |
| output_dir | str | None | Save directory for annotations (defaults to './labels' relative to data path). |

The auto_annotate function takes the path to your images, plus optional arguments that specify the pre-trained detection and SAM segmentation models, the device to run the models on, and the output directory for the annotated results.

Auto-annotation with pre-trained models can dramatically reduce the time and effort needed to create high-quality segmentation datasets. It is especially useful for researchers and developers who work with large image collections, because it lets them focus on model development and evaluation rather than manual annotation.
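Once annotations have been written, they usually need to be read back for inspection or training. The sketch below assumes the output follows the YOLO segmentation label format (one line per object: a class index followed by normalized x y polygon coordinates); this section does not state the format, so treat it as an assumption. `parse_segment_label` is a hypothetical helper, and the example line is made up.

```python
def parse_segment_label(line, image_width, image_height):
    """Parse one label line into (class_id, [(x_px, y_px), ...]).

    Assumes the YOLO segmentation format 'class x1 y1 x2 y2 ...'
    with coordinates normalized to the 0-1 range.
    """
    fields = line.split()
    # One class index plus at least three x y pairs gives an odd count >= 7
    if len(fields) < 7 or len(fields) % 2 == 0:
        raise ValueError("expected a class index and at least three x y pairs")
    class_id = int(fields[0])
    coords = [float(v) for v in fields[1:]]
    # Scale normalized coordinates back to pixel units
    polygon = [
        (x * image_width, y * image_height)
        for x, y in zip(coords[0::2], coords[1::2])
    ]
    return class_id, polygon


example_line = "0 0.25 0.25 0.75 0.25 0.5 0.75"  # made-up triangle
class_id, polygon = parse_segment_label(example_line, image_width=640, image_height=480)
print(class_id, polygon)
```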

Citations and Acknowledgements

If you find SAM useful in your research or development work, please consider citing the Segment Anything paper:

@misc{kirillov2023segment,
      title={Segment Anything},
      author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},
      year={2023},
      eprint={2304.02643},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

We would like to thank Meta AI for creating and maintaining this valuable resource for the computer vision community.

FAQ

What is the Segment Anything Model (SAM) by Ultralytics?

The Segment Anything Model (SAM) by Ultralytics is an image segmentation model designed for promptable segmentation tasks. It uses an architecture that combines image and prompt encoders with a lightweight mask decoder to generate high-quality segmentation masks from prompts such as spatial or text cues. Trained on the large SA-1B dataset, SAM performs well in zero-shot settings and adapts to new image distributions and tasks without prior knowledge. See the key features listed above for more detail.

How can I use the Segment Anything Model (SAM) for image segmentation?

You can use the Segment Anything Model (SAM) for image segmentation by running inference with prompts such as bounding boxes or points. Here is an example using Python:

from ultralytics import SAM

# Load a model
model = SAM("sam_b.pt")

# Segment with bounding box prompt
model("ultralytics/assets/zidane.jpg", bboxes=[439, 437, 524, 709])

# Segment with points prompt
model("ultralytics/assets/zidane.jpg", points=[900, 370], labels=[1])

# Segment with multiple points prompt (one point per object)
model("ultralytics/assets/zidane.jpg", points=[[400, 370], [900, 370]], labels=[1, 1])

# Segment with multiple points prompt per object
model("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 1]])

# Segment with negative points prompt.
model("ultralytics/assets/zidane.jpg", points=[[[400, 370], [900, 370]]], labels=[[1, 0]])

Alternatively, you can run inference with SAM from the command line interface (CLI):

yolo predict model=sam_b.pt source=path/to/image.jpg

For more detailed usage instructions, see the segmentation section.

How do SAM and YOLOv8 compare in terms of performance?

Compared to YOLOv8, SAM models such as SAM-b and FastSAM-s are larger and slower, but they offer unique capabilities for automatic segmentation. For example, Ultralytics YOLOv8n-seg is 53.4 times smaller and 866 times faster than SAM-b. SAM's zero-shot performance, however, makes it highly flexible on diverse tasks it was not trained for. See the comparison of SAM and YOLOv8 above for details.

How can I auto-annotate my dataset using SAM?

Ultralytics' SAM offers an auto-annotation feature that generates segmentation datasets using a pre-trained detection model. Here is an example in Python:

from ultralytics.data.annotator import auto_annotate

auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model="sam_b.pt")

This function takes the path to your images, along with optional arguments for the pre-trained detection and SAM segmentation models, the device, and the output directory. For a complete guide, see the auto-annotation section above.

What datasets are used to train the Segment Anything Model (SAM)?

SAM is trained on the SA-1B dataset, which comprises more than 1 billion masks across 11 million images. SA-1B is the largest segmentation dataset to date and provides the high-quality, diverse training data behind SAM's zero-shot performance across varied segmentation tasks. For more details, see the dataset description above.
