Ultralytics HUB Inference API
After you train a model, you can use the Shared Inference API for free. If you are a Pro user, you can access the Dedicated Inference API. The Ultralytics HUB Inference API allows you to run inference through our REST API without the need to install and set up the Ultralytics YOLO environment locally.
Watch: Ultralytics HUB Inference API Walkthrough
Dedicated Inference API
In response to high demand and widespread interest, we are thrilled to unveil the Ultralytics HUB Dedicated Inference API, offering single-click deployment in a dedicated environment for our Pro users!
Note
We are excited to offer this feature FREE during our public beta as part of the Pro Plan, with paid tiers possible in the future.
- Global Coverage: Deployed across 38 regions worldwide, ensuring low-latency access from any location. See the full list of Google Cloud regions.
- Google Cloud Run-Backed: Backed by Google Cloud Run, providing automatically scalable and highly reliable infrastructure.
- High Speed: Based on Ultralytics testing, sub-100 ms latency is achievable for YOLOv8n inference at 640 resolution from nearby regions.
- Enhanced Security: Provides robust security features to protect your data and ensure compliance with industry standards. Learn more about Google Cloud security.
To use the Ultralytics HUB Dedicated Inference API, click on the Start Endpoint button. Next, use the unique endpoint URL as described in the guides below.
Tip
Choose the region with the lowest latency for the best performance as described in the documentation.
To shut down the dedicated endpoint, click on the Stop Endpoint button.
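As a minimal sketch, the request below points the Python example from the guide further down at a dedicated endpoint instead of the shared one; the endpoint URL here is only a placeholder, so use the unique URL shown for your endpoint in Ultralytics HUB:

import requests

# Placeholder: copy the unique URL shown for your dedicated endpoint in Ultralytics HUB
url = "https://YOUR_DEDICATED_ENDPOINT_URL"

# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}

# Inference arguments (use actual MODEL_ID)
data = {"model": "https://hub.ultralytics.com/models/MODEL_ID", "imgsz": 640, "conf": 0.25, "iou": 0.45}

# Load image and send the request exactly as with the Shared Inference API
with open("path/to/image.jpg", "rb") as image_file:
    files = {"file": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)

print(response.json())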
Shared Inference API
To use the Ultralytics HUB Shared Inference API, follow the guides below.
Free users have the following usage limits:
- 100 calls / hour
- 1000 calls / month
Pro users have the following usage limits:
- 1000 calls / hour
- 10000 calls / month
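If you send many requests in a batch, a simple client-side throttle can help you stay within these limits. This is only an illustrative sketch: the interval below is derived from the free tier's 100 calls / hour limit, and the request itself matches the Python guide in the next section.

import time

import requests

url = "https://predict.ultralytics.com"
headers = {"x-api-key": "API_KEY"}  # use your actual API key
data = {"model": "https://hub.ultralytics.com/models/MODEL_ID", "imgsz": 640, "conf": 0.25, "iou": 0.45}

MIN_INTERVAL = 3600 / 100  # 100 calls / hour means at least 36 seconds between calls

for path in ["image1.jpg", "image2.jpg"]:  # example image paths
    with open(path, "rb") as image_file:
        response = requests.post(url, headers=headers, files={"file": image_file}, data=data)
    print(response.json())
    time.sleep(MIN_INTERVAL)  # space out calls to respect the hourly limit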
Python
To access the Ultralytics HUB Inference API using Python, use the following code:
import requests
# API URL
url = "https://predict.ultralytics.com"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (use actual MODEL_ID)
data = {"model": "https://hub.ultralytics.com/models/MODEL_ID", "imgsz": 640, "conf": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"file": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
Note
Replace MODEL_ID with the desired model ID, API_KEY with your actual API key, and path/to/image.jpg with the path to the image you want to run inference on. If you are using our Dedicated Inference API, replace the url with your unique endpoint URL as well.
cURL
To access the Ultralytics HUB Inference API using cURL, use the following code:
curl -X POST "https://predict.ultralytics.com" \
-H "x-api-key: API_KEY" \
-F "model=https://hub.ultralytics.com/models/MODEL_ID" \
-F "file=@/path/to/image.jpg" \
-F "imgsz=640" \
-F "conf=0.25" \
-F "iou=0.45"
Note
Replace MODEL_ID with the desired model ID, API_KEY with your actual API key, and path/to/image.jpg with the path to the image you want to run inference on. If you are using our Dedicated Inference API, replace the URL with your unique endpoint URL as well.
Arguments
See the table below for a full list of available inference arguments.
| Argument | Default | Type | Description |
| --- | --- | --- | --- |
| file | | file | Image or video file to be used for inference. |
| imgsz | 640 | int | Size of the input image, valid range is 32 - 1280 pixels. |
| conf | 0.25 | float | Confidence threshold for predictions, valid range 0.01 - 1.0. |
| iou | 0.45 | float | Intersection over Union (IoU) threshold, valid range 0.0 - 0.95. |
Response
The Ultralytics HUB Inference API returns a JSON response.
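As a minimal sketch of consuming that response in Python, the helper below assumes the structure shown in the task-specific examples that follow (an images list, each entry carrying results, shape, and speed); the exact per-result fields vary by task type.

def print_predictions(result_json: dict) -> None:
    """Print a summary of an Ultralytics HUB inference response."""
    for image in result_json["images"]:
        print("image shape (h, w):", image["shape"])
        print("speed (ms):", image["speed"])
        for prediction in image["results"]:
            # every task type reports class index, class name, and confidence;
            # task-specific fields (box, segments, keypoints) appear alongside them
            print(prediction["class"], prediction["name"], prediction["confidence"])


# Usage after any of the requests above: print_predictions(response.json())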
Classification
Classification Model
import requests
# API URL
url = "https://predict.ultralytics.com"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (use actual MODEL_ID)
data = {"model": "https://hub.ultralytics.com/models/MODEL_ID", "imgsz": 640, "conf": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"file": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
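The response below is an illustrative sketch only: the values are placeholders, and classification results typically carry just the class index, class name, and confidence, with no box coordinates.

{
  "images": [
    {
      "results": [
        {
          "class": 0,
          "name": "person",
          "confidence": 0.92
        }
      ],
      "shape": [
        750,
        600
      ],
      "speed": {
        "inference": 200.8,
        "postprocess": 0.8,
        "preprocess": 2.8
      }
    }
  ],
  "metadata": ...
}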
Detection
Detection Model
import requests
# API URL
url = "https://predict.ultralytics.com"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (use actual MODEL_ID)
data = {"model": "https://hub.ultralytics.com/models/MODEL_ID", "imgsz": 640, "conf": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"file": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
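The response below is an illustrative sketch only: the values are placeholders, and each detection result carries an axis-aligned box (x1, y1, x2, y2) alongside the class index, class name, and confidence, matching the box format of the segmentation and pose responses further down.

{
  "images": [
    {
      "results": [
        {
          "class": 0,
          "name": "person",
          "confidence": 0.92,
          "box": {
            "x1": 118,
            "x2": 416,
            "y1": 112,
            "y2": 660
          }
        }
      ],
      "shape": [
        750,
        600
      ],
      "speed": {
        "inference": 200.8,
        "postprocess": 0.8,
        "preprocess": 2.8
      }
    }
  ],
  "metadata": ...
}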
OBB
OBB Model
import requests
# API URL
url = "https://predict.ultralytics.com"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (use actual MODEL_ID)
data = {"model": "https://hub.ultralytics.com/models/MODEL_ID", "imgsz": 640, "conf": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"file": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
{
  "images": [
    {
      "results": [
        {
          "class": 0,
          "name": "person",
          "confidence": 0.92,
          "box": {
            "x1": 374.85565,
            "x2": 392.31824,
            "x3": 412.81805,
            "x4": 395.35547,
            "y1": 264.40704,
            "y2": 267.45728,
            "y3": 150.0966,
            "y4": 147.04634
          }
        }
      ],
      "shape": [
        750,
        600
      ],
      "speed": {
        "inference": 200.8,
        "postprocess": 0.8,
        "preprocess": 2.8
      }
    }
  ],
  "metadata": ...
}
Segmentation
Segmentation Model
import requests
# API URL
url = "https://predict.ultralytics.com"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (use actual MODEL_ID)
data = {"model": "https://hub.ultralytics.com/models/MODEL_ID", "imgsz": 640, "conf": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"file": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
{
  "images": [
    {
      "results": [
        {
          "class": 0,
          "name": "person",
          "confidence": 0.92,
          "box": {
            "x1": 118,
            "x2": 416,
            "y1": 112,
            "y2": 660
          },
          "segments": {
            "x": [
              266.015625,
              266.015625,
              258.984375,
              ...
            ],
            "y": [
              110.15625,
              113.67188262939453,
              120.70311737060547,
              ...
            ]
          }
        }
      ],
      "shape": [
        750,
        600
      ],
      "speed": {
        "inference": 200.8,
        "postprocess": 0.8,
        "preprocess": 2.8
      }
    }
  ],
  "metadata": ...
}
Pose
Pose Model
import requests
# API URL
url = "https://predict.ultralytics.com"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (use actual MODEL_ID)
data = {"model": "https://hub.ultralytics.com/models/MODEL_ID", "imgsz": 640, "conf": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
    files = {"file": image_file}
    response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
{
  "images": [
    {
      "results": [
        {
          "class": 0,
          "name": "person",
          "confidence": 0.92,
          "box": {
            "x1": 118,
            "x2": 416,
            "y1": 112,
            "y2": 660
          },
          "keypoints": {
            "visible": [
              0.9909399747848511,
              0.8162999749183655,
              0.9872099757194519,
              ...
            ],
            "x": [
              316.3871765136719,
              315.9374694824219,
              304.878173828125,
              ...
            ],
            "y": [
              156.4207763671875,
              148.05775451660156,
              144.93240356445312,
              ...
            ]
          }
        }
      ],
      "shape": [
        750,
        600
      ],
      "speed": {
        "inference": 200.8,
        "postprocess": 0.8,
        "preprocess": 2.8
      }
    }
  ],
  "metadata": ...
}