Fast Segment Anything Model (FastSAM)

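The Fast Segment Anything Model (FastSAM) is a novel, real-time CNN-based solution for the Segment Anything task. This task is designed to segment any object within an image based on various possible user interaction prompts. FastSAM significantly reduces computational demands while maintaining competitive performance, making it a practical choice for a variety of vision tasks.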


Model Architecture

Figure: Fast Segment Anything Model (FastSAM) architecture overview

Overview

FastSAM is designed to address the limitations of the Segment Anything Model (SAM), a heavy Transformer model with substantial computational resource requirements. FastSAM decouples the segment anything task into two sequential stages: all-instance segmentation and prompt-guided selection. The first stage uses YOLOv8-seg to produce the segmentation masks of all instances in the image. The second stage outputs the region of interest corresponding to the prompt.
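To make the two-stage idea concrete, here is a minimal sketch of what prompt-guided selection with a box prompt could look like. It is not FastSAM's actual implementation: the function names `box_iou_with_mask` and `select_by_box` are hypothetical, the masks are hand-written stand-ins for stage-one output, and the box is assumed to be given as in-bounds `(x1, y1, x2, y2)` pixel indices.

```python
def box_iou_with_mask(mask, box):
    """IoU between a binary mask (list of rows of 0/1) and an (x1, y1, x2, y2) box.

    Assumption: the box lies inside the image and x2/y2 are exclusive indices.
    """
    x1, y1, x2, y2 = box
    mask_area = sum(sum(row) for row in mask)
    box_area = max(0, x2 - x1) * max(0, y2 - y1)
    # Count mask pixels that fall inside the box.
    inter = sum(sum(row[x1:x2]) for row in mask[y1:y2])
    union = mask_area + box_area - inter
    return inter / union if union else 0.0


def select_by_box(masks, box):
    """Stage 2 sketch: return the index of the mask that best overlaps the box."""
    scores = [box_iou_with_mask(m, box) for m in masks]
    return max(range(len(masks)), key=scores.__getitem__)


# Stage 1 stand-in: two hand-written 4x4 masks instead of real YOLOv8-seg output.
masks = [
    [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]],
]
best = select_by_box(masks, box=(2, 2, 4, 4))
```

The point of the split is that the expensive step (producing all masks) runs once, while the selection step is a cheap comparison that can be repeated for each new prompt.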

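Key Features

Real-time Solution: By leveraging the computational efficiency of CNNs, FastSAM provides a real-time solution for the segment anything task, making it valuable for industrial applications that require quick results.
Efficiency and Performance: FastSAM offers a significant reduction in computational and resource demands without compromising on performance quality. It achieves performance comparable to SAM with drastically reduced computational resources, enabling real-time applications.
Prompt-guided Segmentation: FastSAM can segment any object within an image guided by various possible user interaction prompts, providing flexibility and adaptability in different scenarios.
Based on YOLOv8-seg: FastSAM is based on YOLOv8-seg, an object detector equipped with an instance segmentation branch. This allows it to effectively produce the segmentation masks of all instances in an image.
Competitive Results on Benchmarks: On the object proposal task on MS COCO, FastSAM achieves high scores at a significantly faster speed than SAM on a single NVIDIA RTX 3090, demonstrating its efficiency and capability.
Practical Applications: The proposed approach provides a new, practical solution for a large number of vision tasks at a speed tens or hundreds of times faster than current methods.
Model Compression Feasibility: FastSAM demonstrates the feasibility of a path that can significantly reduce the computational effort by introducing an artificial prior to the structure, thus opening new possibilities for large model architectures for general vision tasks.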

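Available Models, Supported Tasks, and Operating Modes

This table presents the available models with their specific pre-trained weights, the tasks they support, and their compatibility with different operating modes such as Inference, Validation, Training, and Export. Supported modes are marked with ✅ and unsupported modes with ❌.

| Model Type | Pre-trained Weights | Tasks Supported | Inference | Validation | Training | Export |
| --- | --- | --- | --- | --- | --- | --- |
| FastSAM-s | FastSAM-s.pt | Instance Segmentation | ✅ | ❌ | ❌ | ✅ |
| FastSAM-x | FastSAM-x.pt | Instance Segmentation | ✅ | ❌ | ❌ | ✅ |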

Usage Examples

The FastSAM models are easy to integrate into your Python applications. Ultralytics provides a user-friendly Python API and CLI commands to streamline development.

Predict Usage

To perform object detection on an image, use the predict method:

Example

Python

from ultralytics import FastSAM

# Define an inference source
source = "path/to/bus.jpg"

# Create a FastSAM model
model = FastSAM("FastSAM-s.pt")  # or FastSAM-x.pt

# Run inference on an image
everything_results = model(source, device="cpu", retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)

# Run inference with bboxes prompt
results = model(source, bboxes=[439, 437, 524, 709])

# Run inference with points prompt
results = model(source, points=[[200, 200]], labels=[1])

# Run inference with texts prompt
results = model(source, texts="a photo of a dog")

# Run inference with bboxes and points and texts prompt at the same time
results = model(source, bboxes=[439, 437, 524, 709], points=[[200, 200]], labels=[1], texts="a photo of a dog")

CLI

# Load a FastSAM model and segment everything with it
yolo segment predict model=FastSAM-s.pt source=path/to/bus.jpg imgsz=640

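This snippet demonstrates the simplicity of loading a pre-trained model and running a prediction on an image.

FastSAMPredictor example

With this approach you run inference on the image once to obtain all segment results, then run prompt inference as many times as needed without repeating the full inference.

Prompt inference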

from ultralytics.models.fastsam import FastSAMPredictor

# Create FastSAMPredictor
overrides = dict(conf=0.25, task="segment", mode="predict", model="FastSAM-s.pt", save=False, imgsz=1024)
predictor = FastSAMPredictor(overrides=overrides)

# Segment everything
everything_results = predictor("ultralytics/assets/bus.jpg")

# Prompt inference
bbox_results = predictor.prompt(everything_results, bboxes=[[200, 200, 300, 300]])
point_results = predictor.prompt(everything_results, points=[200, 200])
text_results = predictor.prompt(everything_results, texts="a photo of a dog")

Note

All the returned results in the examples above are Results objects, which give easy access to the predicted masks and the source image.

Val Usage

Validation of the model on a dataset can be done as follows:

Example

Python

from ultralytics import FastSAM

# Create a FastSAM model
model = FastSAM("FastSAM-s.pt")  # or FastSAM-x.pt

# Validate the model
results = model.val(data="coco8-seg.yaml")

CLI

# Load a FastSAM model and validate it on the COCO8-seg example dataset at image size 640
yolo segment val model=FastSAM-s.pt data=coco8-seg.yaml imgsz=640

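Please note that FastSAM only supports detection and segmentation of a single class of object. This means it recognizes and segments all objects as the same class. Therefore, when preparing the dataset, you need to convert all object category IDs to 0.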
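As an illustration of that conversion, the sketch below rewrites the class ID of every label line to 0. It assumes the labels are stored in the Ultralytics YOLO text format, where each line starts with an integer class ID followed by normalized coordinates. The helper name `to_single_class` is hypothetical and not part of any library.

```python
def to_single_class(label_text):
    """Return label text with every class ID replaced by 0.

    Assumption: one object per line, class ID is the first whitespace-separated token.
    Blank lines are dropped.
    """
    converted = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip empty lines
        parts[0] = "0"  # collapse every category into the single class 0
        converted.append(" ".join(parts))
    return "\n".join(converted)


# Example label file content with two different class IDs (3 and 7).
sample = "3 0.1 0.2 0.3 0.4 0.5 0.6\n\n7 0.5 0.5 0.6 0.6 0.7 0.7"
single_class = to_single_class(sample)
```

In practice you would read each label file, pass its contents through a function like this, and write the result back. Keep a backup of the original labels first.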

Track Usage

To perform object tracking on a video, use the track method:

Example

Python

from ultralytics import FastSAM

# Create a FastSAM model
model = FastSAM("FastSAM-s.pt")  # or FastSAM-x.pt

# Track with a FastSAM model on a video
results = model.track(source="path/to/video.mp4", imgsz=640)

CLI

yolo segment track model=FastSAM-s.pt source="path/to/video/file.mp4" imgsz=640

FastSAM official Usage

FastSAM is also available directly from the https://github.com/CASIA-IVA-Lab/FastSAM repository. Here is a brief overview of the typical steps you might take to use FastSAM:

Installation

Clone the FastSAM repository:

git clone https://github.com/CASIA-IVA-Lab/FastSAM.git

Create and activate a Conda environment with Python 3.9:
```
conda create -n FastSAM python=3.9
conda activate FastSAM
```
Navigate to the cloned repository and install the required packages:
```
cd FastSAM
pip install -r requirements.txt
```

Install the CLIP model:

pip install git+https://github.com/ultralytics/CLIP.git

Example Usage

Download a model checkpoint.

Use FastSAM for inference. Example commands:

Segment everything in an image:

python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg

Segment specific objects using a text prompt:

python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --text_prompt "the yellow dog"

Segment objects within a bounding box (provide box coordinates in xywh format):

python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --box_prompt "[570,200,230,400]"
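The official script takes the box in xywh format, while the `bboxes` values in the Ultralytics examples earlier on this page appear to be corner coordinates (x1, y1, x2, y2). If you move a box between the two, a small conversion helps. The sketch below assumes that (x, y) is the top-left corner of the box; the helper name `xywh_to_xyxy` is illustrative only.

```python
def xywh_to_xyxy(box):
    """Convert [x, y, w, h] to [x1, y1, x2, y2].

    Assumption: (x, y) is the top-left corner, w and h are width and height in pixels.
    """
    x, y, w, h = box
    return [x, y, x + w, y + h]


# The box from the command above, converted to corner format.
corners = xywh_to_xyxy([570, 200, 230, 400])
```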

Segment objects near a specific point:

python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --point_prompt "[[520,360],[620,300]]" --point_label "[1,0]"
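In the point prompt above, each point is paired with a label, where 1 marks a foreground point and 0 marks a background point. One simplified way to picture how such labels could drive selection is sketched below. This is an illustration under stated assumptions, not the logic used by `Inference.py`: the function name `select_by_points` is hypothetical, masks are plain nested lists, and points are (x, y) pixel indices inside the image.

```python
def select_by_points(masks, points, labels):
    """Return indices of masks that cover a foreground point and no background point.

    Assumptions: each mask is a list of rows of 0/1, points are (x, y) pairs,
    and labels use 1 for foreground and 0 for background.
    """
    selected = []
    for idx, mask in enumerate(masks):
        hits_fg = any(mask[y][x] for (x, y), lab in zip(points, labels) if lab == 1)
        hits_bg = any(mask[y][x] for (x, y), lab in zip(points, labels) if lab == 0)
        if hits_fg and not hits_bg:
            selected.append(idx)
    return selected


# Two hand-written 4x4 masks standing in for real segmentation output.
masks = [
    [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 1, 1]],
]
# Foreground point on the first object, background point on the second.
chosen = select_by_points(masks, points=[(0, 0), (3, 3)], labels=[1, 0])
```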

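Additionally, you can try FastSAM through a Colab demo or the HuggingFace web demo for a visual experience.

Citations and Acknowledgements

We would like to acknowledge the FastSAM authors for their significant contributions to the field of real-time instance segmentation: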

BibTeX

@misc{zhao2023fast,
      title={Fast Segment Anything},
      author={Xu Zhao and Wenchao Ding and Yongqi An and Yinglong Du and Tao Yu and Min Li and Ming Tang and Jinqiao Wang},
      year={2023},
      eprint={2306.12156},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

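The original FastSAM paper can be found on arXiv. The authors have made their work publicly available, and the codebase can be accessed on GitHub. We appreciate their efforts in advancing the field and making their work accessible to the broader community.

FAQ

What is FastSAM and how does it differ from SAM?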

FastSAM, short for Fast Segment Anything Model, is a real-time convolutional neural network (CNN)-based solution designed to reduce computational demands while maintaining high performance in object segmentation tasks. Unlike the Segment Anything Model (SAM), which uses a heavier Transformer-based architecture, FastSAM leverages Ultralytics YOLOv8-seg for efficient instance segmentation in two stages: all-instance segmentation followed by prompt-guided selection.

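How does FastSAM achieve real-time segmentation performance?

FastSAM achieves real-time segmentation by decoupling the segmentation task into two stages: all-instance segmentation with YOLOv8-seg, followed by prompt-guided selection. By using the computational efficiency of CNNs, FastSAM significantly reduces computational and resource demands while maintaining competitive performance. This two-stage approach lets FastSAM deliver fast, efficient segmentation suitable for applications that require quick results.

What are the practical applications of FastSAM?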

FastSAM is practical for a variety of computer vision tasks that require real-time segmentation performance. Applications include:

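Industrial automation for quality control and assurance
Real-time video analysis for security and surveillance
Autonomous vehicles for object detection and segmentation
Medical imaging for precise and quick segmentation tasks

Its ability to handle various user interaction prompts makes FastSAM adaptable and flexible for diverse scenarios.

How do I use the FastSAM model for inference in Python?

To use FastSAM for inference in Python, you can follow the example below: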

from ultralytics import FastSAM

# Define an inference source
source = "path/to/bus.jpg"

# Create a FastSAM model
model = FastSAM("FastSAM-s.pt")  # or FastSAM-x.pt

# Run inference on an image
everything_results = model(source, device="cpu", retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)

# Run inference with bboxes prompt
results = model(source, bboxes=[439, 437, 524, 709])

# Run inference with points prompt
results = model(source, points=[[200, 200]], labels=[1])

# Run inference with texts prompt
results = model(source, texts="a photo of a dog")

# Run inference with bboxes and points and texts prompt at the same time
results = model(source, bboxes=[439, 437, 524, 709], points=[[200, 200]], labels=[1], texts="a photo of a dog")

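For more details on inference methods, see the Predict Usage section of the documentation.

What types of prompts does FastSAM support for segmentation tasks?

FastSAM supports multiple prompt types for guiding the segmentation tasks:

Everything Prompt: Generates segmentation for all visible objects.
Bounding Box (BBox) Prompt: Segments objects within a specified bounding box.
Text Prompt: Uses a descriptive text to segment objects matching the description.
Point Prompt: Segments objects near specific user-defined points.

This flexibility allows FastSAM to adapt to a wide range of user interaction scenarios, enhancing its utility across different applications. For more information on using these prompts, refer to the Key Features section.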

📅 Created 11 months ago ✏️ Updated 29 days ago
