์ฝ˜ํ…์ธ ๋กœ ๊ฑด๋„ˆ๋›ฐ๊ธฐ

Reference for ultralytics/models/sam/model.py

Note

This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/sam/model.py. If you spot a problem, please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!



ultralytics.models.sam.model.SAM

Bases: Model

SAM (Segment Anything Model) interface class.

SAM is designed for promptable real-time image segmentation. It can be used with a variety of prompts, such as bounding boxes, points, or labels. The model has zero-shot capabilities and is trained on the SA-1B dataset.

Source code in ultralytics/models/sam/model.py
class SAM(Model):
    """
    SAM (Segment Anything Model) interface class.

    SAM is designed for promptable real-time image segmentation. It can be used with a variety of prompts such as
    bounding boxes, points, or labels. The model has capabilities for zero-shot performance and is trained on the SA-1B
    dataset.
    """

    def __init__(self, model="sam_b.pt") -> None:
        """
        Initializes the SAM model with a pre-trained model file.

        Args:
            model (str): Path to the pre-trained SAM model file. File should have a .pt or .pth extension.

        Raises:
            NotImplementedError: If the model file extension is not .pt or .pth.
        """
        if model and Path(model).suffix not in (".pt", ".pth"):
            raise NotImplementedError("SAM prediction requires pre-trained *.pt or *.pth model.")
        super().__init__(model=model, task="segment")

    def _load(self, weights: str, task=None):
        """
        Loads the specified weights into the SAM model.

        Args:
            weights (str): Path to the weights file.
            task (str, optional): Task name. Defaults to None.
        """
        self.model = build_sam(weights)

    def predict(self, source, stream=False, bboxes=None, points=None, labels=None, **kwargs):
        """
        Performs segmentation prediction on the given image or video source.

        Args:
            source (str): Path to the image or video file, or a PIL.Image object, or a numpy.ndarray object.
            stream (bool, optional): If True, enables real-time streaming. Defaults to False.
            bboxes (list, optional): List of bounding box coordinates for prompted segmentation. Defaults to None.
            points (list, optional): List of points for prompted segmentation. Defaults to None.
            labels (list, optional): List of labels for prompted segmentation. Defaults to None.

        Returns:
            (list): The model predictions.
        """
        overrides = dict(conf=0.25, task="segment", mode="predict", imgsz=1024)
        kwargs.update(overrides)
        prompts = dict(bboxes=bboxes, points=points, labels=labels)
        return super().predict(source, stream, prompts=prompts, **kwargs)

    def __call__(self, source=None, stream=False, bboxes=None, points=None, labels=None, **kwargs):
        """
        Alias for the 'predict' method.

        Args:
            source (str): Path to the image or video file, or a PIL.Image object, or a numpy.ndarray object.
            stream (bool, optional): If True, enables real-time streaming. Defaults to False.
            bboxes (list, optional): List of bounding box coordinates for prompted segmentation. Defaults to None.
            points (list, optional): List of points for prompted segmentation. Defaults to None.
            labels (list, optional): List of labels for prompted segmentation. Defaults to None.

        Returns:
            (list): The model predictions.
        """
        return self.predict(source, stream, bboxes, points, labels, **kwargs)

    def info(self, detailed=False, verbose=True):
        """
        Logs information about the SAM model.

        Args:
            detailed (bool, optional): If True, displays detailed information about the model. Defaults to False.
            verbose (bool, optional): If True, displays information on the console. Defaults to True.

        Returns:
            (tuple): A tuple containing the model's information.
        """
        return model_info(self.model, detailed=detailed, verbose=verbose)

    @property
    def task_map(self):
        """
        Provides a mapping from the 'segment' task to its corresponding 'Predictor'.

        Returns:
            (dict): A dictionary mapping the 'segment' task to its corresponding 'Predictor'.
        """
        return {"segment": {"predictor": Predictor}}

task_map property

Provides a mapping from the 'segment' task to its corresponding 'Predictor'.

Returns:

| Type | Description |
|------|-------------|
| dict | A dictionary mapping the 'segment' task to its corresponding 'Predictor'. |
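
A short sketch of inspecting this property, assuming the default sam_b.pt checkpoint; the exact module path of Predictor in the printed output may differ:

from ultralytics import SAM

model = SAM("sam_b.pt")
print(model.task_map)  # e.g. {'segment': {'predictor': <class 'ultralytics.models.sam.predict.Predictor'>}}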

__call__(source=None, stream=False, bboxes=None, points=None, labels=None, **kwargs)

Alias for the 'predict' method.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| source | str | Path to the image or video file, or a PIL.Image object, or a numpy.ndarray object. | None |
| stream | bool | If True, enables real-time streaming. | False |
| bboxes | list | List of bounding box coordinates for prompted segmentation. | None |
| points | list | List of points for prompted segmentation. | None |
| labels | list | List of labels for prompted segmentation. | None |

Returns:

| Type | Description |
|------|-------------|
| list | The model predictions. |

Source code in ultralytics/models/sam/model.py
def __call__(self, source=None, stream=False, bboxes=None, points=None, labels=None, **kwargs):
    """
    Alias for the 'predict' method.

    Args:
        source (str): Path to the image or video file, or a PIL.Image object, or a numpy.ndarray object.
        stream (bool, optional): If True, enables real-time streaming. Defaults to False.
        bboxes (list, optional): List of bounding box coordinates for prompted segmentation. Defaults to None.
        points (list, optional): List of points for prompted segmentation. Defaults to None.
        labels (list, optional): List of labels for prompted segmentation. Defaults to None.

    Returns:
        (list): The model predictions.
    """
    return self.predict(source, stream, bboxes, points, labels, **kwargs)
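
Calling the model object directly forwards to predict; a minimal sketch with a placeholder image path and example point coordinates:

from ultralytics import SAM

model = SAM("sam_b.pt")

# model(...) is equivalent to model.predict(...)
results = model("path/to/image.jpg", points=[900, 370], labels=[1])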

__init__(model='sam_b.pt')

๋ฏธ๋ฆฌ ํ•™์Šต๋œ ๋ชจ๋ธ ํŒŒ์ผ๋กœ SAM ๋ชจ๋ธ์„ ์ดˆ๊ธฐํ™”ํ•ฉ๋‹ˆ๋‹ค.

๋งค๊ฐœ๋ณ€์ˆ˜:

์ด๋ฆ„ ์œ ํ˜• ์„ค๋ช… ๊ธฐ๋ณธ๊ฐ’
model str

์‚ฌ์ „ ํ•™์Šต๋œ SAM ๋ชจ๋ธ ํŒŒ์ผ์˜ ๊ฒฝ๋กœ์ž…๋‹ˆ๋‹ค. ํŒŒ์ผ ํ™•์žฅ์ž๋Š” .pt ๋˜๋Š” .pth์—ฌ์•ผ ํ•ฉ๋‹ˆ๋‹ค.

'sam_b.pt'

์˜ฌ๋ฆฌ๊ธฐ:

์œ ํ˜• ์„ค๋ช…
NotImplementedError

๋ชจ๋ธ ํŒŒ์ผ ํ™•์žฅ์ž๊ฐ€ .pt ๋˜๋Š” .pth๊ฐ€ ์•„๋‹Œ ๊ฒฝ์šฐ.

Source code in ultralytics/models/sam/model.py
def __init__(self, model="sam_b.pt") -> None:
    """
    Initializes the SAM model with a pre-trained model file.

    Args:
        model (str): Path to the pre-trained SAM model file. File should have a .pt or .pth extension.

    Raises:
        NotImplementedError: If the model file extension is not .pt or .pth.
    """
    if model and Path(model).suffix not in (".pt", ".pth"):
        raise NotImplementedError("SAM prediction requires pre-trained *.pt or *.pth model.")
    super().__init__(model=model, task="segment")
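
A small sketch of the extension check performed here; the .onnx filename is only a hypothetical example of an unsupported extension:

from ultralytics import SAM

model = SAM("sam_b.pt")  # .pt and .pth checkpoints are accepted

try:
    SAM("sam_b.onnx")  # any other extension is rejected
except NotImplementedError as err:
    print(err)  # SAM prediction requires pre-trained *.pt or *.pth model.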

info(detailed=False, verbose=True)

SAM ๋ชจ๋ธ์— ๋Œ€ํ•œ ์ •๋ณด๋ฅผ ๊ธฐ๋กํ•ฉ๋‹ˆ๋‹ค.

๋งค๊ฐœ๋ณ€์ˆ˜:

์ด๋ฆ„ ์œ ํ˜• ์„ค๋ช… ๊ธฐ๋ณธ๊ฐ’
detailed bool

True์ธ ๊ฒฝ์šฐ ๋ชจ๋ธ์— ๋Œ€ํ•œ ์ž์„ธํ•œ ์ •๋ณด๋ฅผ ํ‘œ์‹œํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ False์ž…๋‹ˆ๋‹ค.

False
verbose bool

True์ธ ๊ฒฝ์šฐ ์ฝ˜์†”์— ์ •๋ณด๋ฅผ ํ‘œ์‹œํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ True์ž…๋‹ˆ๋‹ค.

True

๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค:

์œ ํ˜• ์„ค๋ช…
tuple

๋ชจ๋ธ์˜ ์ •๋ณด๊ฐ€ ํฌํ•จ๋œ ํŠœํ”Œ์ž…๋‹ˆ๋‹ค.

Source code in ultralytics/models/sam/model.py
def info(self, detailed=False, verbose=True):
    """
    Logs information about the SAM model.

    Args:
        detailed (bool, optional): If True, displays detailed information about the model. Defaults to False.
        verbose (bool, optional): If True, displays information on the console. Defaults to True.

    Returns:
        (tuple): A tuple containing the model's information.
    """
    return model_info(self.model, detailed=detailed, verbose=verbose)
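
A brief usage sketch; the exact contents of the returned tuple come from model_info and are not reproduced here:

from ultralytics import SAM

model = SAM("sam_b.pt")
model.info()                      # prints a model summary to the console
info = model.info(detailed=True)  # more detail, also returned as a tuple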

predict(source, stream=False, bboxes=None, points=None, labels=None, **kwargs)

์ง€์ •๋œ ์ด๋ฏธ์ง€ ๋˜๋Š” ๋น„๋””์˜ค ์†Œ์Šค์— ๋Œ€ํ•ด ์„ธ๊ทธ๋จผํŠธ ์˜ˆ์ธก์„ ์ˆ˜ํ–‰ํ•ฉ๋‹ˆ๋‹ค.

๋งค๊ฐœ๋ณ€์ˆ˜:

์ด๋ฆ„ ์œ ํ˜• ์„ค๋ช… ๊ธฐ๋ณธ๊ฐ’
source str

์ด๋ฏธ์ง€ ๋˜๋Š” ๋น„๋””์˜ค ํŒŒ์ผ์˜ ๊ฒฝ๋กœ, PIL.Image ๊ฐ์ฒด ๋˜๋Š” numpy.ndarray ๊ฐ์ฒด์ž…๋‹ˆ๋‹ค.

ํ•„์ˆ˜
stream bool

True์ด๋ฉด ์‹ค์‹œ๊ฐ„ ์ŠคํŠธ๋ฆฌ๋ฐ์„ ํ™œ์„ฑํ™”ํ•ฉ๋‹ˆ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ False์ž…๋‹ˆ๋‹ค.

False
bboxes list

ํ”„๋กฌํ”„ํŠธ ์„ธ๋ถ„ํ™”๋ฅผ ์œ„ํ•œ ๊ฒฝ๊ณ„ ์ƒ์ž ์ขŒํ‘œ ๋ชฉ๋ก์ž…๋‹ˆ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ ์—†์Œ์ž…๋‹ˆ๋‹ค.

None
points list

ํ”„๋กฌํ”„ํŠธ ์„ธ๋ถ„ํ™”๋ฅผ ์œ„ํ•œ ํฌ์ธํŠธ ๋ชฉ๋ก์ž…๋‹ˆ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ ์—†์Œ์ž…๋‹ˆ๋‹ค.

None
labels list

ํ”„๋กฌํ”„ํŠธ ์„ธ๋ถ„ํ™”๋ฅผ ์œ„ํ•œ ๋ ˆ์ด๋ธ” ๋ชฉ๋ก์ž…๋‹ˆ๋‹ค. ๊ธฐ๋ณธ๊ฐ’์€ ์—†์Œ์ž…๋‹ˆ๋‹ค.

None

๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค:

์œ ํ˜• ์„ค๋ช…
list

๋ชจ๋ธ ์˜ˆ์ธก.

Source code in ultralytics/models/sam/model.py
def predict(self, source, stream=False, bboxes=None, points=None, labels=None, **kwargs):
    """
    Performs segmentation prediction on the given image or video source.

    Args:
        source (str): Path to the image or video file, or a PIL.Image object, or a numpy.ndarray object.
        stream (bool, optional): If True, enables real-time streaming. Defaults to False.
        bboxes (list, optional): List of bounding box coordinates for prompted segmentation. Defaults to None.
        points (list, optional): List of points for prompted segmentation. Defaults to None.
        labels (list, optional): List of labels for prompted segmentation. Defaults to None.

    Returns:
        (list): The model predictions.
    """
    overrides = dict(conf=0.25, task="segment", mode="predict", imgsz=1024)
    kwargs.update(overrides)
    prompts = dict(bboxes=bboxes, points=points, labels=labels)
    return super().predict(source, stream, prompts=prompts, **kwargs)
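
A minimal sketch of prompted prediction with placeholder paths and example pixel coordinates; note that conf, imgsz, task, and mode are fixed by the overrides above:

from ultralytics import SAM

model = SAM("sam_b.pt")

# Box prompt: [x1, y1, x2, y2] in pixel coordinates
results = model.predict("path/to/image.jpg", bboxes=[439, 437, 524, 709])

# Point prompt: [x, y] with label 1 (foreground)
results = model.predict("path/to/image.jpg", points=[900, 370], labels=[1])

for result in results:
    print(result.masks)  # predicted segmentation masks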





์ƒ์„ฑ๋จ 2023-11-12, ์—…๋ฐ์ดํŠธ๋จ 2023-11-25
์ž‘์„ฑ์ž: glenn-jocher (3)