Link to this sectionModal Quickstart Guide for Ultralytics#
This guide provides a comprehensive introduction to running Ultralytics YOLO26 on Modal, covering serverless GPU inference and model training.
Link to this sectionWhat is Modal?#
Modal is a serverless cloud computing platform for AI and machine learning workloads. It handles provisioning, scaling, and execution automatically — you write Python code locally and Modal runs it in the cloud with GPU access. This makes it ideal for running deep learning models like YOLO26 without managing infrastructure.
Link to this sectionWhat You Will Learn#
- Setting up Modal and authenticating
- Running YOLO26 inference on Modal
- Using GPUs for faster inference
- Training YOLO26 models on Modal
Link to this sectionPrerequisites#
- A Modal account (sign up for free at modal.com)
- Python 3.9 or later installed on your local machine
Link to this sectionInstallation#
Install the Modal Python package and authenticate:
pip install modalmodal token newThe modal token new command will open a browser window to authenticate your Modal account. After authentication, you can run Modal commands from the terminal.
Link to this sectionRunning YOLO26 Inference#
Create a new Python file called modal_yolo.py with the following code:
"""
Modal + Ultralytics YOLO26 Quickstart
Run: modal run modal_yolo.py.
"""
import modal
app = modal.App("ultralytics-yolo")
image = modal.Image.debian_slim(python_version="3.11").apt_install("libgl1", "libglib2.0-0").pip_install("ultralytics")
@app.function(image=image)
def predict(image_url: str):
"""Run YOLO26 inference on an image URL."""
from ultralytics import YOLO
model = YOLO("yolo26n.pt")
results = model(image_url)
for r in results:
print(f"Detected {len(r.boxes)} objects:")
for box in r.boxes:
print(f" - {model.names[int(box.cls)]}: {float(box.conf):.2f}")
@app.local_entrypoint()
def main():
"""Test inference with sample image."""
predict.remote("https://ultralytics.com/images/bus.jpg")Run the inference:
modal run modal_yolo.pyExpected output:
✓ Initialized. View run at https://modal.com/apps/your-username/main/ap-xxxxxxxx
✓ Created objects.
├── 🔨 Created mount modal_yolo.py
└── 🔨 Created function predict.
Downloading https://github.com/ultralytics/assets/releases/download/v8.4.0/yolo26n.pt to 'yolo26n.pt'...
Downloading https://ultralytics.com/images/bus.jpg to 'bus.jpg'...
image 1/1 /root/bus.jpg: 640x480 4 persons, 1 bus, 377.8ms
Speed: 5.8ms preprocess, 377.8ms inference, 0.3ms postprocess per image at shape (1, 3, 640, 480)
Detected 5 objects:
- bus: 0.92
- person: 0.91
- person: 0.91
- person: 0.87
- person: 0.53
✓ App completed.You can monitor your function execution in the Modal dashboard:
Link to this sectionUsing GPU for Faster Inference#
Add a GPU to your function by specifying the gpu parameter:
@app.function(image=image, gpu="T4") # Options: "T4", "A10G", "A100", "H100"
def predict_gpu(image_url: str):
"""Run YOLO26 inference on GPU."""
from ultralytics import YOLO
model = YOLO("yolo26n.pt")
results = model(image_url)
print(results[0].boxes)| GPU | Memory | Best For |
|---|---|---|
| T4 | 16 GB | Inference, small model training |
| A10G | 24 GB | Medium training jobs |
| A100 | 40 GB | Large-scale training |
| H100 | 80 GB | Maximum performance |
Link to this sectionTraining YOLO26 on Modal#
For training, use a GPU and Modal Volumes for persistent storage. Create a new Python file called train_yolo.py:
import modal
app = modal.App("ultralytics-training")
volume = modal.Volume.from_name("yolo-training-vol", create_if_missing=True)
image = modal.Image.debian_slim(python_version="3.11").apt_install("libgl1", "libglib2.0-0").pip_install("ultralytics")
@app.function(image=image, gpu="T4", timeout=3600, volumes={"/data": volume})
def train():
"""Train YOLO26 model on Modal."""
from ultralytics import YOLO
model = YOLO("yolo26n.pt")
model.train(data="coco8.yaml", epochs=3, imgsz=640, project="/data/runs")
@app.local_entrypoint()
def main():
train.remote()Run training:
modal run train_yolo.pyModal Volumes persist data between function runs. Trained weights are saved to /data/runs/detect/train/weights/.
Congratulations! You have successfully set up Ultralytics YOLO26 on Modal. For further learning:
- Explore the Ultralytics YOLO26 documentation for advanced features
- Learn about training custom models with your own datasets
- Visit the Modal documentation for advanced platform features
Link to this sectionFAQ#
Link to this sectionHow do I choose the right GPU for my YOLO26 workload?#
For inference, an NVIDIA T4 (16 GB) is typically sufficient and cost-effective. For training or larger models like YOLO26x, consider A10G or A100 GPUs.
Link to this sectionHow much does it cost to run YOLO26 on Modal?#
Modal uses pay-per-second pricing. Approximate rates: CPU ~$0.05/hr, T4 ~$0.59/hr, A10G ~$1.10/hr, A100 ~$2.10/hr. Check Modal pricing for current rates.
Link to this sectionCan I use my own custom-trained YOLO model?#
Yes! Load custom models from a Modal Volume:
model = YOLO("/data/my_custom_model.pt")For more information on training custom models, see the training guide.