์ฝ˜ํ…์ธ ๋กœ ๊ฑด๋„ˆ๋›ฐ๊ธฐ

์ฐธ์กฐ ultralytics/utils/triton.py

์ฐธ๊ณ 

์ด ํŒŒ์ผ์€ https://github.com/ultralytics/ ultralytics/blob/main/ ultralytics/utils/ triton.py์—์„œ ํ™•์ธํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ๋ฌธ์ œ๋ฅผ ๋ฐœ๊ฒฌํ•˜๋ฉด ํ’€ ๋ฆฌํ€˜์ŠคํŠธ ๐Ÿ› ๏ธ ์— ๊ธฐ์—ฌํ•˜์—ฌ ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๋„๋ก ๋„์™€์ฃผ์„ธ์š”. ๊ฐ์‚ฌํ•ฉ๋‹ˆ๋‹ค ๐Ÿ™!



ultralytics.utils.triton.TritonRemoteModel

์›๊ฒฉ Triton ์ถ”๋ก  ์„œ๋ฒ„ ๋ชจ๋ธ๊ณผ ์ƒํ˜ธ ์ž‘์šฉํ•˜๊ธฐ ์œ„ํ•œ ํด๋ผ์ด์–ธํŠธ์ž…๋‹ˆ๋‹ค.

์†์„ฑ:

์ด๋ฆ„ ์œ ํ˜• ์„ค๋ช…
endpoint str

Triton ์„œ๋ฒ„์— ์žˆ๋Š” ๋ชจ๋ธ์˜ ์ด๋ฆ„์ž…๋‹ˆ๋‹ค.

url str

Triton ์„œ๋ฒ„์˜ URL์ž…๋‹ˆ๋‹ค.

triton_client

Triton ํด๋ผ์ด์–ธํŠธ(HTTP ๋˜๋Š” gRPC).

InferInput

Triton ํด๋ผ์ด์–ธํŠธ์˜ ์ž…๋ ฅ ํด๋ž˜์Šค์ž…๋‹ˆ๋‹ค.

InferRequestedOutput

Triton ํด๋ผ์ด์–ธํŠธ์˜ ์ถœ๋ ฅ ์š”์ฒญ ํด๋ž˜์Šค์ž…๋‹ˆ๋‹ค.

input_formats List[str]

๋ชจ๋ธ ์ž…๋ ฅ์˜ ๋ฐ์ดํ„ฐ ์œ ํ˜•์ž…๋‹ˆ๋‹ค.

np_input_formats List[type]

๋ชจ๋ธ ์ž…๋ ฅ์˜ ์ˆซ์ž ๋ฐ์ดํ„ฐ ์œ ํ˜•์ž…๋‹ˆ๋‹ค.

input_names List[str]

๋ชจ๋ธ ์ž…๋ ฅ์˜ ์ด๋ฆ„์ž…๋‹ˆ๋‹ค.

output_names List[str]

๋ชจ๋ธ ์ถœ๋ ฅ์˜ ์ด๋ฆ„์ž…๋‹ˆ๋‹ค.

์˜ ์†Œ์Šค ์ฝ”๋“œ ultralytics/utils/triton.py
class TritonRemoteModel:
    """
    Client for interacting with a remote Triton Inference Server model.

    Attributes:
        endpoint (str): The name of the model on the Triton server.
        url (str): The URL of the Triton server.
        triton_client: The Triton client (either HTTP or gRPC).
        InferInput: The input class for the Triton client.
        InferRequestedOutput: The output request class for the Triton client.
        input_formats (List[str]): The data types of the model inputs.
        np_input_formats (List[type]): The numpy data types of the model inputs.
        input_names (List[str]): The names of the model inputs.
        output_names (List[str]): The names of the model outputs.
    """

    def __init__(self, url: str, endpoint: str = "", scheme: str = ""):
        """
        Initialize the TritonRemoteModel.

        Arguments may be provided individually or parsed from a collective 'url' argument of the form
            <scheme>://<netloc>/<endpoint>/<task_name>

        Args:
            url (str): The URL of the Triton server.
            endpoint (str): The name of the model on the Triton server.
            scheme (str): The communication scheme ('http' or 'grpc').
        """
        if not endpoint and not scheme:  # Parse all args from URL string
            splits = urlsplit(url)
            endpoint = splits.path.strip("/").split("/")[0]
            scheme = splits.scheme
            url = splits.netloc

        self.endpoint = endpoint
        self.url = url

        # Choose the Triton client based on the communication scheme
        if scheme == "http":
            import tritonclient.http as client  # noqa

            self.triton_client = client.InferenceServerClient(url=self.url, verbose=False, ssl=False)
            config = self.triton_client.get_model_config(endpoint)
        else:
            import tritonclient.grpc as client  # noqa

            self.triton_client = client.InferenceServerClient(url=self.url, verbose=False, ssl=False)
            config = self.triton_client.get_model_config(endpoint, as_json=True)["config"]

        # Sort output names alphabetically, i.e. 'output0', 'output1', etc.
        config["output"] = sorted(config["output"], key=lambda x: x.get("name"))

        # Define model attributes
        type_map = {"TYPE_FP32": np.float32, "TYPE_FP16": np.float16, "TYPE_UINT8": np.uint8}
        self.InferRequestedOutput = client.InferRequestedOutput
        self.InferInput = client.InferInput
        self.input_formats = [x["data_type"] for x in config["input"]]
        self.np_input_formats = [type_map[x] for x in self.input_formats]
        self.input_names = [x["name"] for x in config["input"]]
        self.output_names = [x["name"] for x in config["output"]]

    def __call__(self, *inputs: np.ndarray) -> List[np.ndarray]:
        """
        Call the model with the given inputs.

        Args:
            *inputs (List[np.ndarray]): Input data to the model.

        Returns:
            (List[np.ndarray]): Model outputs.
        """
        infer_inputs = []
        input_format = inputs[0].dtype
        for i, x in enumerate(inputs):
            if x.dtype != self.np_input_formats[i]:
                x = x.astype(self.np_input_formats[i])
            infer_input = self.InferInput(self.input_names[i], [*x.shape], self.input_formats[i].replace("TYPE_", ""))
            infer_input.set_data_from_numpy(x)
            infer_inputs.append(infer_input)

        infer_outputs = [self.InferRequestedOutput(output_name) for output_name in self.output_names]
        outputs = self.triton_client.infer(model_name=self.endpoint, inputs=infer_inputs, outputs=infer_outputs)

        return [outputs.as_numpy(output_name).astype(input_format) for output_name in self.output_names]

__call__(*inputs)

์ฃผ์–ด์ง„ ์ž…๋ ฅ๊ฐ’์œผ๋กœ ๋ชจ๋ธ์„ ํ˜ธ์ถœํ•ฉ๋‹ˆ๋‹ค.

๋งค๊ฐœ๋ณ€์ˆ˜:

์ด๋ฆ„ ์œ ํ˜• ์„ค๋ช… ๊ธฐ๋ณธ๊ฐ’
*inputs List[ndarray]

๋ชจ๋ธ์— ๋ฐ์ดํ„ฐ๋ฅผ ์ž…๋ ฅํ•ฉ๋‹ˆ๋‹ค.

()

๋ฐ˜ํ™˜ํ•ฉ๋‹ˆ๋‹ค:

์œ ํ˜• ์„ค๋ช…
List[ndarray]

๋ชจ๋ธ ์ถœ๋ ฅ.

์˜ ์†Œ์Šค ์ฝ”๋“œ ultralytics/utils/triton.py
def __call__(self, *inputs: np.ndarray) -> List[np.ndarray]:
    """
    Call the model with the given inputs.

    Args:
        *inputs (List[np.ndarray]): Input data to the model.

    Returns:
        (List[np.ndarray]): Model outputs.
    """
    infer_inputs = []
    input_format = inputs[0].dtype
    for i, x in enumerate(inputs):
        if x.dtype != self.np_input_formats[i]:
            x = x.astype(self.np_input_formats[i])
        infer_input = self.InferInput(self.input_names[i], [*x.shape], self.input_formats[i].replace("TYPE_", ""))
        infer_input.set_data_from_numpy(x)
        infer_inputs.append(infer_input)

    infer_outputs = [self.InferRequestedOutput(output_name) for output_name in self.output_names]
    outputs = self.triton_client.infer(model_name=self.endpoint, inputs=infer_inputs, outputs=infer_outputs)

    return [outputs.as_numpy(output_name).astype(input_format) for output_name in self.output_names]

__init__(url, endpoint='', scheme='')

ํŠธ๋ฆฌํ†ค๋ฆฌ๋ชจํŠธ๋ชจ๋ธ์„ ์ดˆ๊ธฐํ™”ํ•ฉ๋‹ˆ๋‹ค.

์ธ์ˆ˜๋Š” ๊ฐœ๋ณ„์ ์œผ๋กœ ์ œ๊ณต๋˜๊ฑฐ๋‚˜ ๋‹ค์Œ๊ณผ ๊ฐ™์€ ํ˜•์‹์˜ ์ง‘ํ•ฉ์  'URL' ์ธ์ˆ˜์—์„œ ๊ตฌ๋ฌธ ๋ถ„์„๋  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. :////

๋งค๊ฐœ๋ณ€์ˆ˜:

์ด๋ฆ„ ์œ ํ˜• ์„ค๋ช… ๊ธฐ๋ณธ๊ฐ’
url str

Triton ์„œ๋ฒ„์˜ URL์ž…๋‹ˆ๋‹ค.

ํ•„์ˆ˜
endpoint str

Triton ์„œ๋ฒ„์— ์žˆ๋Š” ๋ชจ๋ธ์˜ ์ด๋ฆ„์ž…๋‹ˆ๋‹ค.

''
scheme str

ํ†ต์‹  ์ฒด๊ณ„('http' ๋˜๋Š” 'grpc').

''
์˜ ์†Œ์Šค ์ฝ”๋“œ ultralytics/utils/triton.py
def __init__(self, url: str, endpoint: str = "", scheme: str = ""):
    """
    Initialize the TritonRemoteModel.

    Arguments may be provided individually or parsed from a collective 'url' argument of the form
        <scheme>://<netloc>/<endpoint>/<task_name>

    Args:
        url (str): The URL of the Triton server.
        endpoint (str): The name of the model on the Triton server.
        scheme (str): The communication scheme ('http' or 'grpc').
    """
    if not endpoint and not scheme:  # Parse all args from URL string
        splits = urlsplit(url)
        endpoint = splits.path.strip("/").split("/")[0]
        scheme = splits.scheme
        url = splits.netloc

    self.endpoint = endpoint
    self.url = url

    # Choose the Triton client based on the communication scheme
    if scheme == "http":
        import tritonclient.http as client  # noqa

        self.triton_client = client.InferenceServerClient(url=self.url, verbose=False, ssl=False)
        config = self.triton_client.get_model_config(endpoint)
    else:
        import tritonclient.grpc as client  # noqa

        self.triton_client = client.InferenceServerClient(url=self.url, verbose=False, ssl=False)
        config = self.triton_client.get_model_config(endpoint, as_json=True)["config"]

    # Sort output names alphabetically, i.e. 'output0', 'output1', etc.
    config["output"] = sorted(config["output"], key=lambda x: x.get("name"))

    # Define model attributes
    type_map = {"TYPE_FP32": np.float32, "TYPE_FP16": np.float16, "TYPE_UINT8": np.uint8}
    self.InferRequestedOutput = client.InferRequestedOutput
    self.InferInput = client.InferInput
    self.input_formats = [x["data_type"] for x in config["input"]]
    self.np_input_formats = [type_map[x] for x in self.input_formats]
    self.input_names = [x["name"] for x in config["input"]]
    self.output_names = [x["name"] for x in config["output"]]





Created 2023-11-12, Updated 2024-06-02
Authors: glenn-jocher (6), Burhan-Q (1)