Medical Pills Dataset
The medical-pills detection dataset is a proof-of-concept (POC) dataset curated to demonstrate the potential of AI in pharmaceutical applications. It contains labeled images for training computer vision models to detect pills.
This dataset is a starting point for automating tasks such as quality control, packaging automation, and sorting in pharmaceutical workflows. Researchers and developers can use it to prototype solutions that improve accuracy and streamline operations.
Dataset Structure
The medical-pills dataset contains 115 images, divided into two subsets (an 80/20 split):
- Training set: Consisting of 92 images, each annotated with the class pill.
- Validation set: Comprising 23 images with corresponding annotations.
Applications
Using computer vision for medical-pills detection enables automation in the pharmaceutical industry, supporting tasks like:
- Pharmaceutical Sorting: Automating the sorting of pills based on size, shape, or color to enhance production efficiency.
- AI Research and Development: Serving as a benchmark for developing and testing computer vision algorithms in pharmaceutical use cases.
- Digital Inventory Systems: Powering smart inventory solutions by integrating automated pill recognition for real-time stock monitoring and replenishment planning.
- Quality Control: Ensuring consistency in pill production by identifying defects, irregularities, or contamination.
- Counterfeit Detection: Helping identify potentially counterfeit medications by analyzing visual characteristics against known standards.
Dataset YAML
A YAML configuration file is provided to define the dataset's structure, including paths and classes. For the medical-pills dataset, the medical-pills.yaml file can be accessed at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/medical-pills.yaml.
ultralytics/cfg/datasets/medical-pills.yaml
# Ultralytics 🚀 AGPL-3.0 License - https://ultralytics.com/license
# Medical-pills dataset by Ultralytics
# Documentation: https://docs.ultralytics.com/datasets/detect/medical-pills/
# Example usage: yolo train data=medical-pills.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── medical-pills ← downloads here (8.19 MB)
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: medical-pills # dataset root dir
train: images/train # train images (relative to 'path') 92 images
val: images/val # val images (relative to 'path') 23 images
# Classes
names:
  0: pill
# Download script/URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/medical-pills.zip
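As a quick sanity check after the download, the split sizes quoted in the YAML comments can be compared against the files on disk. The sketch below is a minimal example under stated assumptions: the helper name `count_images` and the list of accepted extensions are hypothetical, and the dataset root is assumed to follow the `images/train` and `images/val` layout defined above.

```python
from pathlib import Path

# Assumed set of image extensions; adjust if the dataset uses others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(dataset_root, split):
    """Count image files under <dataset_root>/images/<split> (hypothetical helper)."""
    split_dir = Path(dataset_root) / "images" / split
    if not split_dir.is_dir():
        return 0
    return sum(
        1
        for p in split_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


if __name__ == "__main__":
    # Assumed location; the actual download directory depends on your setup.
    root = Path("datasets/medical-pills")
    for split in ("train", "val"):
        print(split, count_images(root, split))
```

If the download matches the YAML comments, the two counts should be 92 and 23.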
Usage
To train a YOLO11n model on the medical-pills dataset for 100 epochs with an image size of 640, use the following examples. For detailed arguments, refer to the model's Training page.
Train Example
Python:
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n.pt")  # load a pretrained model (recommended for training)
# Train the model
results = model.train(data="medical-pills.yaml", epochs=100, imgsz=640)
CLI:
# Start training from a pretrained *.pt model
yolo detect train data=medical-pills.yaml model=yolo11n.pt epochs=100 imgsz=640
Inference Example
Python:
from ultralytics import YOLO
# Load a model
model = YOLO("path/to/best.pt")  # load a fine-tuned model
# Inference using the model
results = model.predict("https://ultralytics.com/assets/medical-pills-sample.jpg")
CLI:
# Start prediction with a fine-tuned *.pt model
yolo detect predict model='path/to/best.pt' imgsz=640 source="https://ultralytics.com/assets/medical-pills-sample.jpg"
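For applications such as inventory counting, the detections returned by prediction usually need to be filtered by confidence before they are counted. The sketch below illustrates this on plain Python values. The helper `count_confident` and the 0.5 threshold are assumptions for illustration and are not part of the Ultralytics API. With Ultralytics installed, the list of scores would typically come from `results[0].boxes.conf.tolist()`.

```python
def count_confident(confidences, threshold=0.5):
    """Return how many detections have confidence >= threshold (hypothetical helper)."""
    return sum(1 for c in confidences if c >= threshold)


# With Ultralytics installed, the scores would typically be obtained like this:
#   results = model.predict("path/to/image.jpg")
#   confidences = results[0].boxes.conf.tolist()
# Placeholder values so the sketch runs on its own
# (made up for illustration, not real model output):
confidences = [0.91, 0.84, 0.42, 0.77]
print(count_confident(confidences))  # prints 3
```

Because the dataset has a single class, pill, the count needs no per-class grouping.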
Sample Images and Annotations
The medical-pills dataset features labeled images showcasing the diversity of pills. Below is an example of a labeled image from the dataset:

- Mosaiced Image: Displayed is a training batch comprising mosaiced dataset images. Mosaicing enhances training diversity by consolidating multiple images into one, improving model generalization.
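To make the mosaic idea concrete, the sketch below shows how YOLO-style normalized boxes could be remapped when four equally sized images are tiled into a fixed 2x2 grid. This is a simplified illustration and not the Ultralytics implementation, which also randomizes the mosaic layout. The function name `remap_box_to_mosaic` is hypothetical.

```python
def remap_box_to_mosaic(box, col, row):
    """Map a normalized (x_center, y_center, width, height) box from one tile
    into the coordinates of a 2x2 mosaic.

    col and row are 0 or 1 and select the tile's position in the grid.
    """
    if col not in (0, 1) or row not in (0, 1):
        raise ValueError("col and row must be 0 or 1 for a 2x2 grid")
    x, y, w, h = box
    # Each tile covers half of the mosaic along both axes, so sizes halve
    # and centers are offset by the tile position before halving.
    return ((x + col) / 2, (y + row) / 2, w / 2, h / 2)


# A box centered in the top-right tile ends up centered in the top-right quadrant.
print(remap_box_to_mosaic((0.5, 0.5, 0.2, 0.2), col=1, row=0))  # (0.75, 0.25, 0.1, 0.1)
```

Each object appears at half its original relative size in the mosaic, so the model sees pills at a smaller scale and in new surroundings.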
Integration with Other Datasets
For more comprehensive pharmaceutical analysis, consider combining the medical-pills dataset with other related datasets like package-seg for packaging identification or medical imaging datasets like brain-tumor to develop end-to-end healthcare AI solutions.
Citations and Acknowledgments
The dataset is available under the AGPL-3.0 License.
If you use the medical-pills dataset in your research or development work, please cite it using the following details:
@dataset{Jocher_Ultralytics_Datasets_2024,
    author = {Jocher, Glenn and Rizwan, Muhammad},
    license = {AGPL-3.0},
    month = {Dec},
    title = {Ultralytics Datasets: Medical-pills Detection Dataset},
    url = {https://docs.ultralytics.com/datasets/detect/medical-pills/},
    version = {1.0.0},
    year = {2024}
}
FAQ
What is the structure of the medical-pills dataset?
The dataset includes 92 images for training and 23 images for validation. Each image is annotated with the class pill, enabling effective training and evaluation of models for pharmaceutical applications.
How can I train a YOLO11 model on the medical-pills dataset?
You can train a YOLO11 model for 100 epochs with an image size of 640 using either Python or the CLI. Refer to the Train Example section for the commands, and check the YOLO11 documentation for more information on model capabilities.
What are the benefits of using the medical-pills dataset in AI projects?
The dataset enables automation in pill detection, contributing to counterfeit prevention, quality assurance, and pharmaceutical process optimization. It also serves as a valuable resource for developing AI solutions that can improve medication safety and supply chain efficiency.
How do I perform inference on the medical-pills dataset?
Inference can be done using Python or CLI methods with a fine-tuned YOLO11 model. Refer to the Inference Example section for code snippets and the Predict mode documentation for additional options.
Where can I find the YAML configuration file for the medical-pills dataset?
The YAML file is available at ultralytics/cfg/datasets/medical-pills.yaml in the Ultralytics repository. It defines the dataset paths, the class names, and the download URL needed to train models on this dataset.