Monitoring

Ultralytics Platform provides monitoring for deployed endpoints. Track request metrics, view logs, and analyze performance in real time.

Monitoring Dashboard

Access the global monitoring dashboard from the sidebar:

  1. Click Monitoring in the sidebar
  2. View all deployments at a glance
  3. Click individual endpoints for details

Overview Cards

| Metric | Description |
| --- | --- |
| Total Requests | Requests across all endpoints (24h) |
| Active Deployments | Currently running endpoints |
| Error Rate | Percentage of failed requests |
| Avg Latency | Mean response time |
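
The card figures are plain aggregates over the request stream. As a rough illustration of how they relate, the sketch below derives the request total, error rate, and mean latency from a list of request records. The `Request` record and its field names are hypothetical and are not the Platform's data model.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Request:
    """Hypothetical request record; the field names are assumptions."""
    endpoint: str
    latency_ms: float
    ok: bool


def overview(requests: list[Request]) -> dict[str, float]:
    """Aggregate request records into overview-card style figures."""
    total = len(requests)
    failed = sum(1 for r in requests if not r.ok)
    return {
        "total_requests": total,
        # Error rate is the share of failed requests, as a percentage.
        "error_rate_pct": 100.0 * failed / total if total else 0.0,
        # Avg latency is the arithmetic mean, so outliers pull it upward.
        "avg_latency_ms": mean(r.latency_ms for r in requests) if total else 0.0,
    }
```

Because the average is a mean, a few slow requests can raise it noticeably; the P50 column in the deployments table is less sensitive to such outliers.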

Deployments Table

View all deployments with key metrics:

| Column | Description |
| --- | --- |
| Model | Model name with link |
| Region | Deployed region with flag |
| Status | Running/Stopped indicator |
| Requests | Request count (24h) |
| Latency | P50 response time |
| Errors | Error count (24h) |
| Sparkline | Traffic trend visualization |

Real-Time Updates

The dashboard polls every 30 seconds. Click refresh for immediate updates.
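
If you build your own view on top of monitoring data, the same fixed-interval polling pattern applies. The sketch below is only an illustration: `fetch` is a hypothetical callable that you would supply, since this page does not describe a metrics API.

```python
import time
from typing import Callable, Iterator

POLL_INTERVAL_S = 30  # same interval the dashboard uses


def poll(
    fetch: Callable[[], dict],
    cycles: int,
    interval_s: float = POLL_INTERVAL_S,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield one snapshot per cycle, waiting interval_s between fetches.

    `fetch` is a placeholder for whatever returns your metrics snapshot.
    """
    for i in range(cycles):
        yield fetch()
        if i < cycles - 1:  # no wait after the final snapshot
            sleep(interval_s)
```

Passing `sleep` as a parameter keeps the loop testable without actually waiting 30 seconds.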

Endpoint Metrics

View detailed metrics for individual endpoints:

  1. Navigate to your model's Deploy tab
  2. Click on an endpoint
  3. View the metrics panel

Available Metrics

| Metric | Description | Unit |
| --- | --- | --- |
| Request Count | Total requests over time | count |
| Request Latency | Response time distribution | ms |
| Error Rate | Failed request percentage | % |
| Instance Count | Active container instances | count |
| CPU Utilization | Processor usage | % |
| Memory Usage | RAM consumption | MB |

Time Ranges

Select a time range for metrics:

| Range | Description |
| --- | --- |
| 1h | Last hour |
| 6h | Last 6 hours |
| 24h | Last 24 hours (default) |
| 7d | Last 7 days |

Metric Charts

Interactive charts show:

  • Line graphs for trends over time
  • Hover for exact values
  • Zoom to analyze specific periods

Logs

View request logs for debugging.

Log Entries

Each log entry shows:

| Field | Description |
| --- | --- |
| Timestamp | Request time |
| Severity | INFO, WARNING, or ERROR |
| Message | Log content |
| Request ID | Unique identifier |

Severity Levels

Filter logs by severity:

| Level | Color | Description |
| --- | --- | --- |
| INFO | Blue | Normal requests |
| WARNING | Yellow | Non-critical issues |
| ERROR | Red | Failed requests |

Log Filtering

Filter logs to find issues:

  1. Select severity level
  2. Search by keyword
  3. Filter by time range
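
The three filters combine as a logical AND. The sketch below applies the same logic to log entries held in memory, for example after copying them out of the dashboard. The `LogEntry` class mirrors the fields listed under Log Entries, but its attribute names are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogEntry:
    """Hypothetical in-memory log record; attribute names are assumptions."""
    timestamp: datetime
    severity: str  # "INFO", "WARNING", or "ERROR"
    message: str
    request_id: str


def filter_logs(entries, severity=None, keyword=None, start=None, end=None):
    """Keep entries that match every filter that was provided."""
    result = []
    for e in entries:
        if severity and e.severity != severity:
            continue
        # Keyword search is case-insensitive in this sketch.
        if keyword and keyword.lower() not in e.message.lower():
            continue
        if start and e.timestamp < start:
            continue
        if end and e.timestamp > end:
            continue
        result.append(e)
    return result
```

Starting with `severity="ERROR"` and then narrowing by keyword usually isolates a failing request quickly; the Request ID then identifies that individual request.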

Alerts

Set up alerts for endpoint issues (coming soon):

| Alert Type | Trigger |
| --- | --- |
| High Error Rate | Error rate > threshold |
| High Latency | P95 latency > threshold |
| No Requests | Zero requests for period |
| Scaling | Instances at max capacity |
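
To make the trigger conditions concrete, here is a sketch that evaluates the four rules against a single metrics snapshot. It does not represent the Platform's alerting feature: the `Snapshot` fields and the default thresholds (5% error rate, 1000 ms P95) are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical metrics for one endpoint over one evaluation period."""
    error_rate_pct: float
    p95_latency_ms: float
    requests: int
    instances: int
    max_instances: int


def triggered_alerts(s: Snapshot,
                     max_error_rate_pct: float = 5.0,   # assumed threshold
                     max_p95_ms: float = 1000.0) -> list[str]:  # assumed threshold
    """Return the alert types whose trigger condition holds for the snapshot."""
    alerts = []
    if s.error_rate_pct > max_error_rate_pct:
        alerts.append("High Error Rate")
    if s.p95_latency_ms > max_p95_ms:
        alerts.append("High Latency")
    if s.requests == 0:
        alerts.append("No Requests")
    if s.instances >= s.max_instances:
        alerts.append("Scaling")
    return alerts
```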

Performance Optimization

Use monitoring data to optimize endpoint performance.

High Latency

If latency is too high:

  1. Check the instance count (you may need more instances)
  2. Verify the model size is appropriate
  3. Consider a region closer to your users
  4. Check the size of the images being sent

High Error Rate

If errors are occurring:

  1. Review the error logs for details
  2. Check the request format
  3. Verify the API key is valid
  4. Check rate limits

Scaling Issues

If hitting capacity:

  1. Increase max instances
  2. Set min instances > 0
  3. Consider multiple regions
  4. Optimize request batching

Export Data

Export monitoring data for analysis:

  1. Select time range
  2. Click Export
  3. Download CSV file

Export includes:

  • Timestamp
  • Request count
  • Latency metrics
  • Error counts
  • Instance metrics
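
Once downloaded, the CSV can be summarized with the standard library. The sketch below assumes column names `request_count` and `error_count`; the actual headers in the export are not documented here, so adjust them to match your file.

```python
import csv
import io


def summarize_export(csv_text: str) -> dict:
    """Total requests and errors from exported monitoring rows.

    Column names are assumptions; check them against your exported file.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total = sum(int(r["request_count"]) for r in rows)
    errors = sum(int(r["error_count"]) for r in rows)
    return {
        "rows": len(rows),
        "requests": total,
        "errors": errors,
        "error_rate_pct": 100.0 * errors / total if total else 0.0,
    }
```

For a file on disk, read it first, for example with `pathlib.Path("export.csv").read_text()`, and pass the text in (the filename is a placeholder).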

FAQ

How long is data retained?

| Data Type | Retention |
| --- | --- |
| Metrics | 30 days |
| Logs | 7 days |
| Alerts | 90 days |

Can I set up external monitoring?

Yes, endpoint URLs work with external monitoring tools:

  • Uptime monitoring (Pingdom, UptimeRobot)
  • APM tools (Datadog, New Relic)
  • Custom health checks
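
A custom health check can be as small as a timed HTTP request. The sketch below uses only the standard library. Whether your endpoint answers a plain GET, and on which path, is an assumption to verify against your deployment; inference endpoints may instead require an authenticated POST.

```python
import time
import urllib.request


def check(url: str, timeout_s: float = 5.0, opener=urllib.request.urlopen):
    """Return (healthy, latency_ms) for one GET request to `url`.

    Any 2xx status counts as healthy. Connection failures, timeouts, and
    HTTP error statuses (all raised as OSError subclasses by urlopen) count
    as unhealthy.
    """
    start = time.perf_counter()
    try:
        with opener(url, timeout=timeout_s) as resp:
            healthy = 200 <= resp.status < 300
    except OSError:
        healthy = False
    latency_ms = (time.perf_counter() - start) * 1000.0
    return healthy, latency_ms
```

The latency measured here is taken from the client side, so it includes network time. The `opener` parameter exists so the check can be exercised with a stub instead of a live endpoint.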

How accurate are the latency numbers?

Latency metrics are reported as percentiles:

  • P50: Median response time
  • P95: 95th percentile
  • P99: 99th percentile

These represent server-side processing time, not including network latency to your users.
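
To see what these percentiles mean in practice, the sketch below computes them from raw latency samples using the nearest-rank method. The method the dashboard uses is not documented here, so small differences from the displayed values are possible.

```python
import math


def percentile(values: list[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so p=0 returns the minimum.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
p50 = percentile(latencies_ms, 50)  # 50
p95 = percentile(latencies_ms, 95)  # 100
p99 = percentile(latencies_ms, 99)  # 100
```

With only ten samples, P95 and P99 both land on the slowest request, which is why tail percentiles are most meaningful over larger request volumes.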

Why are my metrics delayed?

Metrics have a ~2 minute delay due to:

  • Metrics aggregation pipeline
  • Aggregation windows
  • Dashboard caching

For real-time debugging, check the logs, which are near-instant.

Can I monitor multiple endpoints together?

Yes, the global monitoring dashboard shows all endpoints. Use the table to compare performance across deployments.
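
For a comparison outside the dashboard, rows from the deployments table can be ranked by a derived figure such as error rate. A minimal sketch follows; the `DeploymentRow` fields mirror the table columns, but the names are assumptions.

```python
from dataclasses import dataclass


@dataclass
class DeploymentRow:
    """Hypothetical row mirroring the deployments table columns."""
    model: str
    region: str
    requests_24h: int
    p50_latency_ms: float
    errors_24h: int


def worst_by_error_rate(rows: list[DeploymentRow], top: int = 3) -> list[DeploymentRow]:
    """Return the `top` deployments with the highest errors-per-request ratio."""
    def rate(r: DeploymentRow) -> float:
        # Idle deployments have no rate to speak of; treat them as 0.
        return r.errors_24h / r.requests_24h if r.requests_24h else 0.0

    return sorted(rows, key=rate, reverse=True)[:top]
```

Ranking by rate rather than raw error count prevents high-traffic endpoints from dominating the list simply because they serve more requests.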



📅 Created 0 days ago ✏️ Updated 0 days ago
glenn-jocher
