Reference for `ultralytics/nn/backends/triton.py`

Improvements

This page is sourced from https://github.com/ultralytics/ultralytics/blob/main/ultralytics/nn/backends/triton.py. Have an improvement or example to add? Open a Pull Request — thank you! 🙏

Summary

TritonBackend

Class `ultralytics.nn.backends.triton.TritonBackend`

TritonBackend()

Bases: BaseBackend

NVIDIA Triton Inference Server backend for remote model serving.

Connects to and runs inference with models hosted on an NVIDIA Triton Inference Server instance via HTTP or gRPC protocols. The model is specified using a triton:// URL scheme.
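To make the URL form concrete, here is a minimal sketch of how a URL shaped like the one in this page's example can be split into its parts with the Python standard library. The helper name `split_model_url` is hypothetical and is not part of Ultralytics; how `TritonRemoteModel` actually parses its argument is not shown on this page.

```python
from urllib.parse import urlsplit


def split_model_url(url: str) -> dict:
    """Split a model URL into scheme, host, port, and model name.

    Hypothetical helper for illustration only; not an Ultralytics API.
    """
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,  # text before "://"
        "host": parts.hostname,  # server host name
        "port": parts.port,  # integer port, or None if absent
        "model_name": parts.path.strip("/"),  # path with slashes trimmed
    }


# URL taken from the load_model docstring example on this page
info = split_model_url("triton://host:8000/model_name")
print(info)
```

The port in the URL identifies the server endpoint, and the trailing path segment names the model hosted there.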

Methods

Name	Description
`forward`	Run inference via the NVIDIA Triton Inference Server.
`load_model`	Connect to a remote model on an NVIDIA Triton Inference Server.

Source code in ultralytics/nn/backends/triton.py

class TritonBackend(BaseBackend):

Method `ultralytics.nn.backends.triton.TritonBackend.forward`

def forward(self, im: torch.Tensor) -> list

Run inference via the NVIDIA Triton Inference Server.

Args

Name	Type	Description	Default
`im`	`torch.Tensor`	Input image tensor in BCHW format, normalized to [0, 1].	required

Returns

Type	Description
`list`	Model predictions as a list of numpy arrays from the Triton server.

Source code in ultralytics/nn/backends/triton.py

def forward(self, im: torch.Tensor) -> list:
    """Run inference via the NVIDIA Triton Inference Server.

    Args:
        im (torch.Tensor): Input image tensor in BCHW format, normalized to [0, 1].

    Returns:
        (list): Model predictions as a list of numpy arrays from the Triton server.
    """
    return self.model(im.cpu().numpy())
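The delegation in `forward` can be exercised without a live server by substituting stand-ins. The sketch below assumes nothing beyond the one-line method body shown above: `FakeTensor`, `FakeRemoteModel`, and `ForwardSketch` are hypothetical classes written for this example, and plain Python lists stand in for numpy arrays.

```python
class FakeTensor:
    """Hypothetical stand-in for torch.Tensor exposing only .cpu() and .numpy()."""

    def __init__(self, data):
        self.data = data

    def cpu(self):
        # A real tensor would be copied to host memory here.
        return self

    def numpy(self):
        # A real tensor would return a numpy.ndarray; a list stands in for it.
        return self.data


class FakeRemoteModel:
    """Hypothetical stand-in for the remote model; records each input it receives."""

    def __init__(self):
        self.calls = []

    def __call__(self, array):
        self.calls.append(array)
        return [array]  # pretend the server returned a single output


class ForwardSketch:
    """Mirrors only the forward() delegation from the source listing."""

    def __init__(self, model):
        self.model = model

    def forward(self, im):
        return self.model(im.cpu().numpy())


remote = FakeRemoteModel()
outputs = ForwardSketch(remote).forward(FakeTensor([[0.0, 0.5, 1.0]]))
print(outputs, len(remote.calls))
```

The point of the pattern is that the backend does no processing of its own: the tensor is moved to the CPU, converted, and handed to the callable model, whose return value is passed back unchanged.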

Method `ultralytics.nn.backends.triton.TritonBackend.load_model`

def load_model(self, weight: str | Path) -> None

Connect to a remote model on an NVIDIA Triton Inference Server.

Args

Name	Type	Description	Default
`weight`	`str | Path`	Triton model URL (e.g., 'triton://host:8000/model_name').	required

Source code in ultralytics/nn/backends/triton.py

def load_model(self, weight: str | Path) -> None:
    """Connect to a remote model on an NVIDIA Triton Inference Server.

    Args:
        weight (str | Path): Triton model URL (e.g., 'triton://host:8000/model_name').
    """
    check_requirements("tritonclient[all]")
    from ultralytics.utils.triton import TritonRemoteModel

    self.model = TritonRemoteModel(weight)

    # Copy metadata from Triton model
    if hasattr(self.model, "metadata"):
        self.apply_metadata(self.model.metadata)
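The `hasattr` guard means metadata is copied only when the remote model object exposes a `metadata` attribute. The following sketch shows that behavior in isolation. Every class here is hypothetical, the `{"task": "detect"}` value is made up for illustration, and the stub `apply_metadata` only records what it was given, which may differ from what `BaseBackend.apply_metadata` really does.

```python
class MetadataSketch:
    """Hypothetical backend that mirrors only the metadata guard from load_model."""

    def __init__(self):
        self.model = None
        self.applied = None  # stays None if no metadata was found

    def apply_metadata(self, metadata):
        # Stub: record the metadata; the real method's behavior is not shown here.
        self.applied = dict(metadata)

    def attach(self, model):
        self.model = model
        # Same guard as in the source: copy metadata only if the model has it.
        if hasattr(self.model, "metadata"):
            self.apply_metadata(self.model.metadata)


class ModelWithMetadata:
    metadata = {"task": "detect"}  # illustrative value, not a documented schema


class ModelWithoutMetadata:
    pass


with_meta = MetadataSketch()
with_meta.attach(ModelWithMetadata())

without_meta = MetadataSketch()
without_meta.attach(ModelWithoutMetadata())

print(with_meta.applied, without_meta.applied)
```

A model object without the attribute is still attached and usable; the backend simply skips the metadata step for it.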

Contributors

glenn-jocher (2), Laughing-q (1)

Created 4 months ago, updated 3 weeks ago
