Reference for ultralytics/data/build.py
Note
This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/build.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
ultralytics.data.build.InfiniteDataLoader
Bases: DataLoader
Dataloader that reuses workers.
This dataloader extends the PyTorch DataLoader to provide infinite recycling of workers, which improves efficiency for training loops that need to iterate through the dataset multiple times.
Attributes:
Name | Type | Description |
---|---|---|
batch_sampler | _RepeatSampler | A sampler that repeats indefinitely. |
iterator | Iterator | The iterator from the parent DataLoader. |
Methods:
__del__
Ensure that workers are properly terminated when the dataloader is deleted.
__iter__
__len__
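The worker-reuse idea can be illustrated without PyTorch. The following is a minimal sketch, not the library's implementation: `RepeatingBatches` is a hypothetical stand-in that creates one endless iterator up front and lets every epoch draw `len(self)` batches from it, instead of building a fresh iterator per epoch.

```python
class RepeatingBatches:
    """Toy stand-in for a loader that reuses one iterator across epochs."""

    def __init__(self, batches):
        self.batches = list(batches)
        # Created once; every epoch pulls from this same iterator.
        self.iterator = self._forever()

    def _forever(self):
        # Cycle through the batches without ever raising StopIteration.
        while True:
            yield from self.batches

    def __len__(self):
        return len(self.batches)

    def __iter__(self):
        # One "epoch" is exactly len(self) items taken from the shared iterator.
        for _ in range(len(self)):
            yield next(self.iterator)


loader = RepeatingBatches(["a", "b", "c"])
iterator_before = loader.iterator
epochs = [list(loader) for _ in range(2)]
print(epochs)  # [['a', 'b', 'c'], ['a', 'b', 'c']]
print(loader.iterator is iterator_before)  # True
```

In a real DataLoader the shared iterator is what owns the worker processes, so keeping it alive avoids restarting them at each epoch boundary.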
ultralytics.data.build._RepeatSampler
Sampler that repeats forever.
This sampler wraps another sampler and yields its contents indefinitely, allowing for infinite iteration over a dataset.
Attributes:
Name | Type | Description |
---|---|---|
sampler | sampler | The sampler to repeat. |
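A sampler that repeats forever can be sketched in a few lines of plain Python. This is an illustrative version only; the class name `RepeatSamplerSketch` is hypothetical, and any iterable of indices stands in for a real PyTorch sampler.

```python
from itertools import islice


class RepeatSamplerSketch:
    """Wrap an iterable of indices and yield them in an endless cycle."""

    def __init__(self, sampler):
        self.sampler = sampler

    def __iter__(self):
        # Restart the wrapped sampler each time it is exhausted.
        while True:
            yield from iter(self.sampler)


# The iterator never ends, so take a finite slice for display.
indices = list(islice(RepeatSamplerSketch(range(3)), 7))
print(indices)  # [0, 1, 2, 0, 1, 2, 0]
```

Because iteration never terminates, callers have to bound it themselves, for example by drawing a fixed number of items per epoch.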
ultralytics.data.build.seed_worker
Set dataloader worker seed for reproducibility across worker processes.
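As a rough illustration of per-worker seeding, the sketch below assumes each worker derives its seed from a shared base seed plus its worker id. That derivation, the name `seed_worker_sketch`, and the `base_seed` parameter are all assumptions for this example; the real function may obtain the seed differently and may seed other random number generators as well.

```python
import random


def seed_worker_sketch(worker_id: int, base_seed: int = 0) -> int:
    """Seed Python's RNG deterministically for one worker (hypothetical helper)."""
    worker_seed = (base_seed + worker_id) % 2**32  # keep the seed within 32 bits
    random.seed(worker_seed)
    return worker_seed


# Seeding the same worker twice reproduces the same random draws.
seed_worker_sketch(worker_id=1)
first = [random.random() for _ in range(3)]
seed_worker_sketch(worker_id=1)
second = [random.random() for _ in range(3)]
print(first == second)  # True
```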
ultralytics.data.build.build_yolo_dataset
build_yolo_dataset(cfg, img_path, batch, data, mode="train", rect=False, stride=32, multi_modal=False)
Build and return a YOLO dataset based on configuration parameters.
ultralytics.data.build.build_grounding
Build and return a GroundingDataset based on configuration parameters.
ultralytics.data.build.build_dataloader
Create and return an InfiniteDataLoader or DataLoader for training or validation.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
dataset | Dataset | Dataset to load data from. | required |
batch | int | Batch size for the dataloader. | required |
workers | int | Number of worker processes for loading data. | required |
shuffle | bool | Whether to shuffle the dataset. | True |
rank | int | Process rank in distributed training. -1 for single-GPU training. | -1 |
Returns:
Type | Description |
---|---|
InfiniteDataLoader | A dataloader that can be used for training or validation. |
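The sketch below shows how a builder with these parameters might sanitize its arguments before constructing a loader. The clamping rules are assumptions made for illustration and are not taken from this reference; `plan_loader`, `num_devices`, and `cpu_count` are hypothetical names. Only the meaning of `rank` (-1 for single-GPU training) comes from the table above.

```python
import os


def plan_loader(dataset_len, batch, workers, rank=-1, num_devices=0, cpu_count=None):
    """Return the settings a loader builder might settle on (hypothetical helper)."""
    cpu_count = cpu_count or os.cpu_count() or 1
    # Assumption: a batch cannot usefully exceed the dataset size.
    batch = min(batch, dataset_len)
    # Assumption: split the CPU cores across devices, capped by the request.
    num_workers = min(cpu_count // max(num_devices, 1), workers)
    # rank == -1 means single-GPU training, so no distributed sampler is needed.
    distributed = rank != -1
    return {"batch": batch, "num_workers": num_workers, "distributed": distributed}


plan = plan_loader(dataset_len=10, batch=16, workers=8, num_devices=2, cpu_count=8)
print(plan)  # {'batch': 10, 'num_workers': 4, 'distributed': False}
```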
ultralytics.data.build.check_source
Check the type of input source and return corresponding flag values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | str \| int \| Path \| List \| Tuple \| ndarray \| Image \| Tensor | The input source to check. | required |
Returns:
Type | Description |
---|---|
tuple | A tuple of (source, webcam, screenshot, from_img, in_memory, tensor): source is the processed source; webcam (bool), whether the source is a webcam; screenshot (bool), whether the source is a screenshot; from_img (bool), whether the source is an image or list of images; in_memory (bool), whether the source is an in-memory object; tensor (bool), whether the source is a torch.Tensor. |
Raises:
Type | Description |
---|---|
TypeError | If the source type is unsupported. |
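To make the shape of the return value concrete, here is a heavily simplified sketch that covers only strings, integers, paths, and lists or tuples. The detection rules (numeric strings count as webcams, the string "screen" counts as a screenshot) are assumptions for this example, and `classify_source` is a hypothetical name; the real function handles more source types and may use different rules.

```python
from pathlib import Path


def classify_source(source):
    """Return (source, webcam, screenshot, from_img, in_memory, tensor)."""
    webcam = screenshot = from_img = in_memory = tensor = False
    if isinstance(source, (str, int, Path)):
        source = str(source)
        webcam = source.isnumeric()  # assumption: "0", "1", ... are camera indices
        screenshot = source.lower() == "screen"  # assumption: keyword for screen capture
    elif isinstance(source, (list, tuple)):
        source = list(source)
        from_img = True  # a collection is treated as a list of images
    else:
        raise TypeError(f"Unsupported source type: {type(source).__name__}")
    return source, webcam, screenshot, from_img, in_memory, tensor


print(classify_source(0))  # ('0', True, False, False, False, False)
print(classify_source("screen"))  # ('screen', False, True, False, False, False)
print(classify_source(["a.jpg"]))  # (['a.jpg'], False, False, True, False, False)
```

Callers typically unpack the tuple in one step, for example `source, webcam, screenshot, from_img, in_memory, tensor = classify_source(0)`.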
ultralytics.data.build.load_inference_source
Load an inference source for object detection and apply necessary transformations.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
source | str \| Path \| Tensor \| Image \| ndarray | The input source for inference. | None |
batch | int | Batch size for dataloaders. | 1 |
vid_stride | int | The frame interval for video sources. | 1 |
buffer | bool | Whether stream frames will be buffered. | False |
Returns:
Type | Description |
---|---|
Dataset | A dataset object for the specified input source, with a source_type attribute attached. |
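The effect of vid_stride can be pictured as keeping every n-th frame of a video. The sketch below assumes that reading of "frame interval"; `sample_frames` is a hypothetical helper operating on a plain sequence rather than a real video stream.

```python
def sample_frames(frames, vid_stride=1):
    """Keep every vid_stride-th frame, starting with the first (hypothetical helper)."""
    if vid_stride < 1:
        raise ValueError("vid_stride must be >= 1")
    return list(frames)[::vid_stride]


print(sample_frames(range(10), vid_stride=3))  # [0, 3, 6, 9]
```

With the default of 1 every frame is kept; larger values trade temporal resolution for fewer frames to process.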