# Inference
Ultralytics Platform provides an inference API for testing trained models. Use the browser-based Test tab for quick validation or the REST API for programmatic access.
## Test Tab

Every model includes a Test tab for browser-based inference:

1. Navigate to your model
2. Click the Test tab
3. Upload an image or use the built-in examples
4. View predictions instantly
### Upload Image

Drag and drop, or click to upload:

- Supported formats: JPG, PNG, WebP, GIF
- Max size: 10MB
- Auto-inference: results appear automatically
### Example Images

Use the built-in example images for quick testing:

| Image | Content |
|---|---|
| bus.jpg | Street scene with vehicles |
| zidane.jpg | Sports scene with people |
### View Results
Inference results display:
- Bounding boxes with class labels
- Confidence scores for each detection
- Class colors matching your dataset
## Inference Parameters
Adjust detection behavior with parameters:
| Parameter | Range | Default | Description |
|---|---|---|---|
| Confidence | 0.0-1.0 | 0.25 | Minimum confidence threshold |
| IoU | 0.0-1.0 | 0.45 | NMS IoU threshold |
| Image Size | 32-1280 | 640 | Input resize dimension |
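Via the REST API (covered below), these parameters are sent as form fields on the predict request. A minimal sketch; `conf` and `iou` are the documented field names, while `imgsz` is an assumed field name matching the parameter used elsewhere on the platform:

```python
import requests

url = "https://platform.ultralytics.com/api/models/username/project/model/predict"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

# conf and iou are documented form fields; "imgsz" is an assumed field
# name for the image-size parameter.
with open("image.jpg", "rb") as f:
    response = requests.post(
        url,
        headers=headers,
        files={"file": f},
        data={"conf": 0.5, "iou": 0.45, "imgsz": 640},
    )
print(response.json())
```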
### Confidence Threshold
Filter predictions by confidence:
- Higher (0.5+): Fewer, more certain predictions
- Lower (0.1-0.25): More predictions, some noise
- Default (0.25): Balanced for most use cases
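One practical pattern: request once at a permissive threshold, then filter the returned predictions locally to compare stricter thresholds without extra API calls. A small sketch using the response format documented below (the sample `result` is illustrative):

```python
def filter_predictions(predictions, min_conf):
    """Keep only detections at or above min_conf (client-side filtering)."""
    return [p for p in predictions if p["confidence"] >= min_conf]

# Assume `result` is a parsed JSON response from the predict endpoint,
# requested with a low conf (e.g. 0.1) so little is pre-filtered server-side.
result = {"predictions": [
    {"class": "person", "confidence": 0.92},
    {"class": "car", "confidence": 0.18},
]}

for threshold in (0.1, 0.25, 0.5):
    kept = filter_predictions(result["predictions"], threshold)
    print(f"conf >= {threshold}: {len(kept)} detections")
```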
### IoU Threshold
Control Non-Maximum Suppression:
- Higher (0.7+): Allow overlapping boxes
- Lower (0.3-0.45): Merge nearby detections
- Default (0.45): Standard NMS behavior
## REST API

Access inference programmatically.

### Authentication

Include your API key in the request header:

```
Authorization: Bearer YOUR_API_KEY
```

### Endpoint

```
POST https://platform.ultralytics.com/api/models/{model_slug}/predict
```
### Request

cURL:

```bash
curl -X POST \
  "https://platform.ultralytics.com/api/models/username/project/model/predict" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@image.jpg" \
  -F "conf=0.25" \
  -F "iou=0.45"
```

Python:

```python
import requests

url = "https://platform.ultralytics.com/api/models/username/project/model/predict"
headers = {"Authorization": "Bearer YOUR_API_KEY"}
files = {"file": open("image.jpg", "rb")}
data = {"conf": 0.25, "iou": 0.45}

response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
```
### Response

```json
{
  "success": true,
  "predictions": [
    {
      "class": "person",
      "confidence": 0.92,
      "box": {
        "x1": 100,
        "y1": 50,
        "x2": 300,
        "y2": 400
      }
    },
    {
      "class": "car",
      "confidence": 0.87,
      "box": {
        "x1": 400,
        "y1": 200,
        "x2": 600,
        "y2": 350
      }
    }
  ],
  "image": {
    "width": 1920,
    "height": 1080
  }
}
```
### Response Fields

| Field | Type | Description |
|---|---|---|
| success | boolean | Request status |
| predictions | array | List of detections |
| predictions[].class | string | Class name |
| predictions[].confidence | float | Detection confidence (0-1) |
| predictions[].box | object | Bounding box coordinates |
| image | object | Original image dimensions |
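To make the fields concrete, here is a small sketch that parses a response and draws the boxes with Pillow. The field names match the table above; the drawing style and file names are illustrative:

```python
from PIL import Image, ImageDraw

def draw_predictions(image_path, result, out_path="annotated.jpg"):
    """Draw boxes and labels from a predict-endpoint JSON response."""
    img = Image.open(image_path)
    draw = ImageDraw.Draw(img)
    for pred in result["predictions"]:
        box = pred["box"]
        # Box coordinates are given in the original image's dimensions.
        draw.rectangle(
            [box["x1"], box["y1"], box["x2"], box["y2"]],
            outline="red", width=3,
        )
        label = f'{pred["class"]} {pred["confidence"]:.2f}'
        draw.text((box["x1"], box["y1"] - 12), label, fill="red")
    img.save(out_path)
```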
### Task-Specific Responses

The response format varies by task.

Detect:

```json
{
  "class": "person",
  "confidence": 0.92,
  "box": {"x1": 100, "y1": 50, "x2": 300, "y2": 400}
}
```

Segment (adds polygon points):

```json
{
  "class": "person",
  "confidence": 0.92,
  "box": {"x1": 100, "y1": 50, "x2": 300, "y2": 400},
  "segments": [[100, 50], [150, 60], ...]
}
```

Pose (adds keypoints):

```json
{
  "class": "person",
  "confidence": 0.92,
  "box": {"x1": 100, "y1": 50, "x2": 300, "y2": 400},
  "keypoints": [
    {"x": 200, "y": 75, "conf": 0.95},
    ...
  ]
}
```

Classify (class scores only, no boxes):

```json
{
  "predictions": [
    {"class": "cat", "confidence": 0.95},
    {"class": "dog", "confidence": 0.03}
  ]
}
```
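Because the extra keys vary by task, a client can branch on which keys are present in each prediction. A minimal sketch; the key names follow the examples above:

```python
def describe_prediction(pred):
    """Summarize a single prediction dict based on its task-specific keys."""
    if "keypoints" in pred:
        return f'pose: {pred["class"]} with {len(pred["keypoints"])} keypoints'
    if "segments" in pred:
        return f'segment: {pred["class"]} with {len(pred["segments"])} polygon points'
    if "box" in pred:
        return f'detect: {pred["class"]} at {pred["box"]}'
    return f'classify: {pred["class"]} ({pred["confidence"]:.2f})'
```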
## Rate Limits
Shared inference has rate limits:
| Plan | Requests/Minute | Requests/Day |
|---|---|---|
| Free | 10 | 100 |
| Pro | 60 | 10,000 |
For higher limits, deploy a dedicated endpoint.
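When iterating over many images on the shared endpoint, it helps to pace requests client-side so you stay under the per-minute limit. A simple sketch for the Free tier (10 requests/minute); the helper name is illustrative:

```python
import time

REQUESTS_PER_MINUTE = 10  # Free tier; use 60 for Pro
MIN_INTERVAL = 60.0 / REQUESTS_PER_MINUTE

def paced(iterable):
    """Yield items no faster than the allowed request rate."""
    last = 0.0
    for item in iterable:
        wait = MIN_INTERVAL - (time.monotonic() - last)
        if wait > 0:
            time.sleep(wait)
        last = time.monotonic()
        yield item

# for path in paced(["a.jpg", "b.jpg", "c.jpg"]):
#     ...send a predict request for path...
```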
## Error Handling
Common error responses:
| Code | Message | Solution |
|---|---|---|
| 400 | Invalid image | Check file format |
| 401 | Unauthorized | Verify API key |
| 404 | Model not found | Check model slug |
| 429 | Rate limited | Wait or upgrade plan |
| 500 | Server error | Retry request |
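In practice, 429 and 500 are worth retrying with backoff, while the other 4xx errors indicate a problem with the request itself. A hedged sketch wrapping the earlier requests call; the function name and retry counts are illustrative:

```python
import time
import requests

def predict_with_retry(url, headers, image_path, retries=3):
    """POST an image, retrying on rate limits (429) and server errors (500)."""
    for attempt in range(retries):
        with open(image_path, "rb") as f:
            response = requests.post(url, headers=headers, files={"file": f})
        if response.status_code in (429, 500):
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s
            continue
        response.raise_for_status()  # Surface 400/401/404 immediately
        return response.json()
    response.raise_for_status()  # Retries exhausted; raise the last error
```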
## FAQ

### Can I run inference on video?

The API accepts individual frames. For video:

1. Extract frames locally
2. Send each frame to the API
3. Aggregate the results
For real-time video, consider deploying a dedicated endpoint.
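A minimal sketch of that frame loop with OpenCV, sampling roughly one frame per second to stay within rate limits; the sampling interval and file names are illustrative:

```python
import cv2
import requests

url = "https://platform.ultralytics.com/api/models/username/project/model/predict"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

cap = cv2.VideoCapture("video.mp4")
fps = cap.get(cv2.CAP_PROP_FPS) or 30  # Fall back if FPS is unavailable
frame_idx, results = 0, []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % int(fps) == 0:  # Sample roughly one frame per second
        ok, buf = cv2.imencode(".jpg", frame)
        if ok:
            resp = requests.post(
                url, headers=headers,
                files={"file": ("frame.jpg", buf.tobytes(), "image/jpeg")},
            )
            results.append(resp.json())
    frame_idx += 1
cap.release()
```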
### How do I get the annotated image?

The API returns JSON predictions. To visualize them:

- Use the predictions to draw boxes locally, or
- Run the model locally and use the Ultralytics `plot()` or `save()` methods:

```python
from ultralytics import YOLO

# Run inference locally and save the annotated image
model = YOLO("yolo11n.pt")
results = model("image.jpg")
results[0].save("annotated.jpg")
```
### What's the maximum image size?

- Upload limit: 10MB
- Recommended: <5MB for fast inference
- Auto-resize: images are resized to the `imgsz` parameter

Large images are automatically resized while preserving aspect ratio.
### Can I run batch inference?

The current API processes one image per request. For batches:

- Send concurrent requests (see the sketch below)
- Use a dedicated endpoint for higher throughput
- Consider local inference for large batches
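A sketch of the concurrent-requests approach with a thread pool; keep the worker count well under your per-minute rate limit. File names and the worker count are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor
import requests

url = "https://platform.ultralytics.com/api/models/username/project/model/predict"
headers = {"Authorization": "Bearer YOUR_API_KEY"}

def predict(path):
    """Send one image to the predict endpoint and return the parsed JSON."""
    with open(path, "rb") as f:
        return requests.post(url, headers=headers, files={"file": f}).json()

paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(predict, paths))
```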