Reference for ultralytics/nn/backends/triton.py
Improvements
This page is sourced from https://github.com/ultralytics/ultralytics/blob/main/ultralytics/nn/backends/triton.py. Have an improvement or example to add? Open a Pull Request — thank you! 🙏
class ultralytics.nn.backends.triton.TritonBackend
TritonBackend()
Bases: BaseBackend
NVIDIA Triton Inference Server backend for remote model serving.
Connects to and runs inference with models hosted on an NVIDIA Triton Inference Server instance via HTTP or gRPC protocols. The model is specified using a triton:// URL scheme.
Methods
| Name | Description |
|---|---|
forward | Run inference via the NVIDIA Triton Inference Server. |
load_model | Connect to a remote model on an NVIDIA Triton Inference Server. |
method ultralytics.nn.backends.triton.TritonBackend.forward
def forward(self, im: torch.Tensor) -> list
Run inference via the NVIDIA Triton Inference Server.
Args
| Name | Type | Description | Default |
|---|---|---|---|
im | torch.Tensor | Input image tensor in BCHW format, normalized to [0, 1]. | required |
Returns
| Type | Description |
|---|---|
list | Model predictions as a list of numpy arrays from the Triton server. |
Source code in ultralytics/nn/backends/triton.py
```python
def forward(self, im: torch.Tensor) -> list:
    """Run inference via the NVIDIA Triton Inference Server.

    Args:
        im (torch.Tensor): Input image tensor in BCHW format, normalized to [0, 1].

    Returns:
        (list): Model predictions as a list of numpy arrays from the Triton server.
    """
    return self.model(im.cpu().numpy())
```
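The call contract is simple: `forward` moves the input tensor to CPU, converts it to a numpy array, and hands it to the remote model object, which returns a list of numpy output arrays. A minimal sketch of that contract, using a hypothetical `_StubRemoteModel` in place of a live `TritonRemoteModel` connection:

```python
import numpy as np
import torch


class _StubRemoteModel:
    """Hypothetical stand-in for TritonRemoteModel: doubles the input array."""

    def __call__(self, x: np.ndarray) -> list:
        # A real Triton model returns its output tensors; here we just echo a transform.
        return [x * 2.0]


model = _StubRemoteModel()
im = torch.rand(1, 3, 640, 640)  # BCHW input, values in [0, 1]
outputs = model(im.cpu().numpy())  # mirrors forward(): tensor -> CPU -> numpy -> remote call
assert isinstance(outputs, list) and outputs[0].shape == (1, 3, 640, 640)
```

Because the server consumes numpy arrays, any device placement (CUDA, MPS) is flattened to CPU memory before the network round trip.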
method ultralytics.nn.backends.triton.TritonBackend.load_model
def load_model(self, weight: str | Path) -> None
Connect to a remote model on an NVIDIA Triton Inference Server.
Args
| Name | Type | Description | Default |
|---|---|---|---|
weight | str \| Path | Triton model URL (e.g., 'http://localhost:8000/model_name'). | required
Source code in ultralytics/nn/backends/triton.py
```python
def load_model(self, weight: str | Path) -> None:
    """Connect to a remote model on an NVIDIA Triton Inference Server.

    Args:
        weight (str | Path): Triton model URL (e.g., 'http://localhost:8000/model_name').
    """
    check_requirements("tritonclient[all]")
    from ultralytics.utils.triton import TritonRemoteModel

    self.model = TritonRemoteModel(weight)

    # Copy metadata from Triton model
    if hasattr(self.model, "metadata"):
        self.apply_metadata(self.model.metadata)
```
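The model URL encodes three pieces of information: the protocol (HTTP or gRPC), the server endpoint, and the model name. A sketch of how such a URL decomposes, using a hypothetical `parse_triton_url` helper (not part of the ultralytics API) built on the standard library:

```python
from urllib.parse import urlsplit


def parse_triton_url(url: str) -> dict:
    """Hypothetical helper: split a Triton model URL into scheme, endpoint, and model name."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,               # 'http' or 'grpc'
        "endpoint": parts.netloc,             # e.g. 'localhost:8000'
        "model_name": parts.path.strip("/"),  # e.g. 'model_name'
    }


info = parse_triton_url("http://localhost:8000/model_name")
assert info == {"scheme": "http", "endpoint": "localhost:8000", "model_name": "model_name"}
```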