Inference

Ultralytics Platform provides an inference API for testing trained models. Use the browser-based Test tab for quick validation or the REST API for programmatic access.

Test Tab

Every model includes a Test tab for browser-based inference:

  1. Navigate to your model
  2. Click the Test tab
  3. Upload an image or use examples
  4. View predictions instantly

Upload Image

Drag and drop or click to upload:

  • Supported formats: JPG, PNG, WebP, GIF
  • Max size: 10MB
  • Auto-inference: Results appear automatically

Example Images

Use built-in example images for quick testing:

| Image | Content |
|---|---|
| bus.jpg | Street scene with vehicles |
| zidane.jpg | Sports scene with people |

View Results

Inference results display:

  • Bounding boxes with class labels
  • Confidence scores for each detection
  • Class colors matching your dataset

Inference Parameters

Adjust detection behavior with parameters:

| Parameter | Range | Default | Description |
|---|---|---|---|
| Confidence | 0.0-1.0 | 0.25 | Minimum confidence threshold |
| IoU | 0.0-1.0 | 0.45 | NMS IoU threshold |
| Image Size | 32-1280 | 640 | Input resize dimension |

Confidence Threshold

Filter predictions by confidence:

  • Higher (0.5+): Fewer, more certain predictions
  • Lower (0.1-0.25): More predictions, some noise
  • Default (0.25): Balanced for most use cases
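The same threshold can also be applied client-side after the fact. A minimal sketch, assuming the prediction dicts follow the response format shown below (the helper name is ours, not part of the API):

```python
def filter_by_confidence(predictions, conf=0.25):
    """Keep only predictions at or above the confidence threshold."""
    return [p for p in predictions if p["confidence"] >= conf]

preds = [
    {"class": "person", "confidence": 0.92},
    {"class": "car", "confidence": 0.18},
]

print(filter_by_confidence(preds, conf=0.25))  # only the person remains
```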

IoU Threshold

Control Non-Maximum Suppression:

  • Higher (0.7+): Keep more overlapping boxes
  • Lower (0.3-0.45): Suppress more overlapping detections
  • Default (0.45): Standard NMS behavior
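IoU itself is straightforward to compute. A minimal sketch for the `x1, y1, x2, y2` box format used in this API's responses (the helper name is ours):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 100, 100), (50, 0, 150, 100)))  # 0.333...: overlap is one third of the union
```

During NMS, a lower-scoring box is discarded when its IoU with a higher-scoring box of the same class exceeds the threshold.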

REST API

Access inference programmatically:

Authentication

Include your API key in requests:

Authorization: Bearer YOUR_API_KEY

Endpoint

POST https://platform.ultralytics.com/api/models/{model_slug}/predict

Request

cURL:

curl -X POST \
  "https://platform.ultralytics.com/api/models/username/project/model/predict" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@image.jpg" \
  -F "conf=0.25" \
  -F "iou=0.45"

Python:

import requests

url = "https://platform.ultralytics.com/api/models/username/project/model/predict"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
data = {"conf": 0.25, "iou": 0.45}

with open("image.jpg", "rb") as f:
    response = requests.post(url, headers=headers, files={"file": f}, data=data)
print(response.json())

Response

{
    "success": true,
    "predictions": [
        {
            "class": "person",
            "confidence": 0.92,
            "box": {
                "x1": 100,
                "y1": 50,
                "x2": 300,
                "y2": 400
            }
        },
        {
            "class": "car",
            "confidence": 0.87,
            "box": {
                "x1": 400,
                "y1": 200,
                "x2": 600,
                "y2": 350
            }
        }
    ],
    "image": {
        "width": 1920,
        "height": 1080
    }
}

Response Fields

| Field | Type | Description |
|---|---|---|
| success | boolean | Request status |
| predictions | array | List of detections |
| predictions[].class | string | Class name |
| predictions[].confidence | float | Detection confidence (0-1) |
| predictions[].box | object | Bounding box coordinates |
| image | object | Original image dimensions |
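These fields are easy to pick apart in client code. A minimal parsing sketch, where the sample payload mirrors the response shown above (the helper name is ours):

```python
def extract_boxes(response):
    """Flatten predictions into (class, confidence, x1, y1, x2, y2) tuples."""
    return [
        (p["class"], p["confidence"],
         p["box"]["x1"], p["box"]["y1"], p["box"]["x2"], p["box"]["y2"])
        for p in response["predictions"]
    ]

sample = {
    "success": True,
    "predictions": [
        {"class": "person", "confidence": 0.92,
         "box": {"x1": 100, "y1": 50, "x2": 300, "y2": 400}},
    ],
    "image": {"width": 1920, "height": 1080},
}

print(extract_boxes(sample))
```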

Task-Specific Responses

Response format varies by task:

Detect:

{
  "class": "person",
  "confidence": 0.92,
  "box": {"x1": 100, "y1": 50, "x2": 300, "y2": 400}
}

Segment:

{
  "class": "person",
  "confidence": 0.92,
  "box": {"x1": 100, "y1": 50, "x2": 300, "y2": 400},
  "segments": [[100, 50], [150, 60], ...]
}

Pose:

{
  "class": "person",
  "confidence": 0.92,
  "box": {"x1": 100, "y1": 50, "x2": 300, "y2": 400},
  "keypoints": [
    {"x": 200, "y": 75, "conf": 0.95},
    ...
  ]
}

Classify:

{
  "predictions": [
    {"class": "cat", "confidence": 0.95},
    {"class": "dog", "confidence": 0.03}
  ]
}

Rate Limits

Shared inference has rate limits:

| Plan | Requests/Minute | Requests/Day |
|---|---|---|
| Free | 10 | 100 |
| Pro | 60 | 10,000 |

For higher limits, deploy a dedicated endpoint.

Error Handling

Common error responses:

| Code | Message | Solution |
|---|---|---|
| 400 | Invalid image | Check file format |
| 401 | Unauthorized | Verify API key |
| 404 | Model not found | Check model slug |
| 429 | Rate limited | Wait or upgrade plan |
| 500 | Server error | Retry request |
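For the retryable codes (429 and 500), exponential backoff is a common client-side pattern. A minimal sketch, where `do_post` would wrap the `requests.post(...)` call from the Request section (the helper names and delay schedule are ours, not part of the API):

```python
import time

RETRYABLE = {429, 500}

def backoff_delays(retries=4, base=1.0):
    """Exponential backoff schedule in seconds: base, 2*base, 4*base, ..."""
    return [base * (2 ** i) for i in range(retries)]

def post_with_retry(do_post, retries=4, base=1.0):
    """Call do_post() until it returns a non-retryable status, backing off between tries."""
    response = do_post()
    for delay in backoff_delays(retries, base):
        if response.status_code not in RETRYABLE:
            break
        time.sleep(delay)
        response = do_post()
    return response
```

A production client would also cap total wait time and honor any `Retry-After` header if the server sends one.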

FAQ

Can I run inference on video?

The API accepts individual frames. For video:

  1. Extract frames locally
  2. Send each frame to the API
  3. Aggregate results

For real-time video, consider deploying a dedicated endpoint.
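Step 3 above (aggregating per-frame results) can be sketched in plain Python. Here each frame's response is assumed to follow the JSON format shown earlier (the helper name is ours):

```python
from collections import Counter

def count_detections(frame_responses, conf=0.25):
    """Tally detected classes across a sequence of per-frame API responses."""
    counts = Counter()
    for resp in frame_responses:
        for p in resp.get("predictions", []):
            if p["confidence"] >= conf:
                counts[p["class"]] += 1
    return counts

frames = [
    {"predictions": [{"class": "person", "confidence": 0.9}]},
    {"predictions": [{"class": "person", "confidence": 0.8},
                     {"class": "car", "confidence": 0.2}]},
]
print(count_detections(frames))  # Counter({'person': 2})
```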

How do I get the annotated image?

The API returns JSON predictions. To visualize:

  1. Use predictions to draw boxes locally
  2. Use the Ultralytics save() method:
from ultralytics import YOLO

model = YOLO("yolo11n.pt")
results = model("image.jpg")
results[0].save("annotated.jpg")

What's the maximum image size?

  • Upload limit: 10MB
  • Recommended: <5MB for fast inference
  • Auto-resize: Images are resized to imgsz parameter

Large images are automatically resized while preserving aspect ratio.

Can I run batch inference?

The current API processes one image per request. For batch:

  1. Send concurrent requests
  2. Use a dedicated endpoint for higher throughput
  3. Consider local inference for large batches
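Concurrent requests (step 1 above) can be issued with the standard library's thread pool. A minimal sketch; `predict_one` is a placeholder for the `requests.post` call from the Request section, and `max_workers` should respect your plan's per-minute rate limit:

```python
from concurrent.futures import ThreadPoolExecutor

def run_batch(paths, predict_one, max_workers=4):
    """Run predict_one(path) over many images concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(predict_one, paths))

# Usage sketch: replace the stub with the real API call.
results = run_batch(["a.jpg", "b.jpg"], lambda p: {"file": p, "predictions": []})
print(results)
```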


glenn-jocher
