Reference for `ultralytics/data/augment.py`

Improvements

This page is sourced from https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/augment.py. Have an improvement or example to add? Open a Pull Request.

Summary

Classes, Methods, Functions

BaseTransform
Compose
BaseMixTransform
Mosaic
MixUp
CutMix
RandomPerspective
RandomHSV
RandomFlip
LetterBox
CopyPaste
Albumentations
Format
LoadVisualPrompt
RandomLoadText
ClassifyLetterBox
CenterCrop
ToTensor

BaseTransform.apply_image
BaseTransform.apply_instances
BaseTransform.apply_semantic
BaseTransform.__call__
Compose.__call__
Compose.append
Compose.insert
Compose.__getitem__
Compose.__setitem__
Compose.tolist
Compose.__repr__
BaseMixTransform.__call__
BaseMixTransform._mix_transform
BaseMixTransform.get_indexes
BaseMixTransform._update_label_text
Mosaic.get_indexes
Mosaic._mix_transform
Mosaic._mosaic3
Mosaic._mosaic4
Mosaic._mosaic9
Mosaic._update_labels
Mosaic._cat_labels
MixUp._mix_transform
CutMix._rand_bbox
CutMix._mix_transform
RandomPerspective.affine_transform
RandomPerspective.apply_bboxes
RandomPerspective.apply_segments
RandomPerspective.apply_keypoints
RandomPerspective.__call__
RandomPerspective.box_candidates
RandomHSV.__call__
RandomFlip.__call__
LetterBox.__call__
LetterBox._update_labels
CopyPaste._mix_transform
CopyPaste.__call__
CopyPaste._transform
Albumentations.__call__
Format.__call__
Format._format_img
Format._format_segments
LoadVisualPrompt.make_mask
LoadVisualPrompt.__call__
LoadVisualPrompt.get_visuals
RandomLoadText.__call__
ClassifyLetterBox.__call__
CenterCrop.__call__
ToTensor.__call__

v8_transforms
classify_transforms
classify_augmentations

class `ultralytics.data.augment.BaseTransform`

BaseTransform(self) -> None

Base class for image transformations in the Ultralytics library.

This class serves as a foundation for implementing various image processing operations, designed to be compatible with both classification and semantic segmentation tasks.

The constructor sets up the base transformation object, which subclasses extend for specific image processing tasks.

Methods

Name	Description
`__call__`	Apply all label transformations to an image, instances, and semantic masks.
`apply_image`	Apply image transformations to labels.
`apply_instances`	Apply transformations to object instances in labels.
`apply_semantic`	Apply semantic segmentation transformations to an image.

Examples

>>> transform = BaseTransform()
>>> labels = {"image": np.array(...), "instances": [...], "semantic": np.array(...)}
>>> transformed_labels = transform(labels)

class `ultralytics.data.augment.Compose`

Methods

Name	Description
`__call__`	Apply a series of transformations to input data.
`__getitem__`	Retrieve a specific transform or a set of transforms using indexing.
`__repr__`	Return a string representation of the Compose object.
`__setitem__`	Set one or more transforms in the composition using indexing.
`append`	Append a new transform to the existing list of transforms.
`insert`	Insert a new transform at a specified index in the existing list of transforms.
`tolist`	Convert the list of transforms to a standard Python list.
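
The container behavior listed in the table above can be illustrated with a small stand-in. This is a minimal sketch, not the library class: `MiniCompose`, `add_one`, and `double` are hypothetical names, and the indexing rule (an int returns one transform, a list returns a new container) is an assumption.

```python
class MiniCompose:
    """Illustrative stand-in for a transform container (not the library class)."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, data):
        # Feed the output of each transform into the next one.
        for t in self.transforms:
            data = t(data)
        return data

    def append(self, transform):
        self.transforms.append(transform)

    def insert(self, index, transform):
        self.transforms.insert(index, transform)

    def __getitem__(self, index):
        # Assumed rule: a list of ints selects a sub-pipeline, an int selects one transform.
        if isinstance(index, list):
            return MiniCompose([self.transforms[i] for i in index])
        return self.transforms[index]

    def tolist(self):
        return self.transforms


def add_one(x):
    return x + 1


def double(x):
    return x * 2


pipeline = MiniCompose([add_one])
pipeline.append(double)      # add_one -> double
pipeline.insert(0, double)   # double -> add_one -> double
result = pipeline(3)         # ((3 * 2) + 1) * 2
```

Transforms run in list order, so inserting at index 0 changes what every later transform receives.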

Args for `Compose.__setitem__`

Name	Type	Description	Default
`index`	`int \| list[int]`	Index or list of indices to set transforms at.	required
`value`	`Any \| list[Any]`	Transform or list of transforms to set at the specified index(es).	required

Args for `Compose.insert`

Name	Type	Description	Default
`index`	`int`	The index at which to insert the new transform.	required
`transform`	`BaseTransform`	The transform object to be inserted.	required

class `ultralytics.data.augment.BaseMixTransform`

Args

Name	Type	Description	Default
`dataset`	`Any`	The dataset object containing images and labels for mixing.	required
`pre_transform`	`Callable \| None`	Optional transform to apply before mixing.	`None`
`p`	`float`	Probability of applying the mix transformation. Should be in the range [0.0, 1.0].	`0.0`

Methods of `BaseMixTransform`

Name	Description
`__call__`	Apply pre-processing transforms and cutmix/mixup/mosaic transforms to labels data.
`_mix_transform`	Apply CutMix, MixUp or Mosaic augmentation to the label dictionary.
`_update_label_text`	Update label text and class IDs for mixed labels in image augmentation.
`get_indexes`	Get a list of shuffled indexes for mosaic augmentation.
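
The role of `p` in a mix transform can be sketched as a probability gate placed in front of the mixing step. The helper below is a hypothetical illustration under stated assumptions: `maybe_mix`, `tag`, and the `mixed_with` key are invented for the example, and only the `mix_labels` key name comes from this page.

```python
import random


def maybe_mix(labels, dataset, p, mix_fn, rng=random):
    """Apply mix_fn with probability p; otherwise return labels unchanged."""
    if rng.uniform(0, 1) > p:
        return labels
    # Pick one other sample at random and attach it for the mixing step.
    other = dataset[rng.randint(0, len(dataset) - 1)]
    labels = dict(labels, mix_labels=[other])
    labels = mix_fn(labels)
    labels.pop("mix_labels", None)  # the helper key is not needed downstream
    return labels


def tag(labels):
    # Toy "mix": record which sample was mixed in.
    labels["mixed_with"] = labels["mix_labels"][0]["img"]
    return labels


dataset = [{"img": "a"}, {"img": "b"}]
out_never = maybe_mix({"img": "x"}, dataset, 0.0, tag)
out_always = maybe_mix({"img": "x"}, dataset, 1.0, tag)
```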

class `ultralytics.data.augment.Mosaic`

Args

Name	Type	Description	Default
`dataset`	`Any`	The dataset on which the mosaic augmentation is applied.	required
`imgsz`	`int`	Height and width of a single image after the mosaic pipeline.	`640`
`p`	`float`	Probability of applying the mosaic augmentation. Must be in the range 0-1.	`1.0`
`n`	`int`	The grid size, either 4 (for 2x2) or 9 (for 3x3).	`4`

Methods of `Mosaic`

Name	Description
`_cat_labels`	Concatenate and process labels for mosaic augmentation.
`_mix_transform`	Apply mosaic augmentation to the input image and labels.
`_mosaic3`	Create a 1x3 image mosaic by combining three images.
`_mosaic4`	Create a 2x2 image mosaic from four input images.
`_mosaic9`	Create a 3x3 image mosaic from the input image and eight additional images.
`_update_labels`	Update label coordinates with padding values.
`get_indexes`	Return a list of random indexes from the dataset for mosaic augmentation.
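
To make the 2x2 layout concrete, the sketch below computes where each of four tiles lands on a canvas of size `2s x 2s` around a mosaic center `(xc, yc)`. It follows a common YOLO-style placement scheme and should be read as an assumption about the general technique, not as the library's code. The function name `mosaic4_placement` is hypothetical.

```python
def mosaic4_placement(i, xc, yc, w, h, s):
    """Return (dst, src, pad) for tile i of a 2x2 mosaic on a (2s x 2s) canvas.

    dst is the xyxy region on the canvas, src the xyxy crop taken from the tile,
    and pad the (padw, padh) offset that moves tile coordinates onto the canvas.
    """
    if i == 0:  # top-left tile: its bottom-right corner touches (xc, yc)
        x1a, y1a, x2a, y2a = max(xc - w, 0), max(yc - h, 0), xc, yc
        x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h
    elif i == 1:  # top-right tile
        x1a, y1a, x2a, y2a = xc, max(yc - h, 0), min(xc + w, 2 * s), yc
        x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
    elif i == 2:  # bottom-left tile
        x1a, y1a, x2a, y2a = max(xc - w, 0), yc, xc, min(2 * s, yc + h)
        x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
    else:  # bottom-right tile
        x1a, y1a, x2a, y2a = xc, yc, min(xc + w, 2 * s), min(2 * s, yc + h)
        x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
    return (x1a, y1a, x2a, y2a), (x1b, y1b, x2b, y2b), (x1a - x1b, y1a - y1b)


# Four 640x640 tiles around the exact canvas center: no cropping is needed.
centered = [mosaic4_placement(i, 640, 640, 640, 640, 640) for i in range(4)]
# An off-center mosaic point crops the top-left tile.
cropped = mosaic4_placement(0, 400, 500, 640, 640, 640)
```

The destination and source regions always have the same size, which is what allows a direct copy from tile to canvas.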

Args for `Mosaic._mosaic9`

Name	Type	Description	Default
`labels`	`dict[str, Any]`	A dictionary containing the input image and its associated labels. It should have the following keys:<br> - 'img' (np.ndarray): The input image.<br> - 'resized_shape' (tuple[int, int]): The shape of the resized image (height, width).<br> - 'mix_labels' (list[dict]): A list of dictionaries containing information for the additional eight images, each with the same structure as the input labels.	required

Returns from `Mosaic._mosaic9`

Type	Description
`dict[str, Any]`	A dictionary containing the mosaic image and updated labels. It includes the following keys:<br> - 'img' (np.ndarray): The final mosaic image.

Args for `Mosaic._update_labels`

Name	Type	Description	Default
`labels`	`dict[str, Any]`	A dictionary containing image and instance information.	required
`padw`	`int`	Padding width to be added to the x-coordinates.	required
`padh`	`int`	Padding height to be added to the y-coordinates.	required
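
As a rough illustration of what adding `padw` and `padh` means, the snippet below shifts pixel-space xyxy boxes by a tile's offset on the canvas. `shift_boxes` is a hypothetical helper, and the plain-tuple box representation is an assumption made to keep the example dependency-free.

```python
def shift_boxes(boxes_xyxy, padw, padh):
    """Shift pixel-space xyxy boxes by the tile's offset on the mosaic canvas."""
    return [(x1 + padw, y1 + padh, x2 + padw, y2 + padh) for x1, y1, x2, y2 in boxes_xyxy]


# A box from the top-right tile of a 2x2 mosaic moves 640 px to the right.
shifted = shift_boxes([(10, 20, 110, 220)], padw=640, padh=0)
```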

class `ultralytics.data.augment.MixUp`

Args

Name	Type	Description	Default
`dataset`	`Any`	The dataset to which MixUp augmentation will be applied.	required
`pre_transform`	`Callable \| None`	Optional transform to apply to images before MixUp.	`None`
`p`	`float`	Probability of applying MixUp augmentation to an image. Must be in the range [0, 1].	`0.0`
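
MixUp, as a general technique, blends two images with a random ratio and keeps the labels of both. The sketch below shows that idea on flat pixel lists. The `mixup` helper is hypothetical, and drawing the ratio from Beta(32, 32) is an assumption about a typical setting, not a documented parameter on this page.

```python
import random


def mixup(img_a, img_b, labels_a, labels_b, ratio=None, rng=random):
    """Blend two equally sized pixel lists and concatenate their labels."""
    if ratio is None:
        # Assumed distribution: Beta(32, 32) concentrates the ratio near 0.5.
        ratio = rng.betavariate(32.0, 32.0)
    blended = [ratio * a + (1.0 - ratio) * b for a, b in zip(img_a, img_b)]
    return blended, labels_a + labels_b


blended, merged = mixup([0, 100], [200, 100], ["cat"], ["dog"], ratio=0.25)
```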

class `ultralytics.data.augment.CutMix`

Args

Name	Type	Description	Default
`dataset`	`Any`	The dataset to which CutMix augmentation will be applied.	required
`pre_transform`	`Callable \| None`	Optional transform to apply before CutMix.	`None`
`p`	`float`	Probability of applying CutMix augmentation.	`0.0`
`beta`	`float`	Beta distribution parameter for sampling the mixing ratio.	`1.0`
`num_areas`	`int`	Number of areas to try to cut and mix.	`3`

Methods of `CutMix`

Name	Description
`_mix_transform`	Apply CutMix augmentation to the input labels.
`_rand_bbox`	Generate random bounding box coordinates for the cut region.

Args for `CutMix._rand_bbox`

Name	Type	Description	Default
`width`	`int`	Width of the image.	required
`height`	`int`	Height of the image.	required
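
A cut region for CutMix is commonly sampled so that its area is roughly `1 - lam` of the image, where `lam` is the mixing ratio drawn from a Beta distribution. The sketch below follows that standard recipe. `rand_bbox` here is a hypothetical standalone function, and whether the library uses exactly this rule is an assumption.

```python
import math
import random


def rand_bbox(width, height, lam, rng=random):
    """Sample an xyxy cut region whose area is at most (1 - lam) of the image."""
    cut_ratio = math.sqrt(1.0 - lam)
    cut_w, cut_h = int(width * cut_ratio), int(height * cut_ratio)
    cx, cy = rng.randrange(width), rng.randrange(height)  # random box center
    # Clip the box to the image so it never extends past the borders.
    x1 = min(max(cx - cut_w // 2, 0), width)
    y1 = min(max(cy - cut_h // 2, 0), height)
    x2 = min(max(cx + cut_w // 2, 0), width)
    y2 = min(max(cy + cut_h // 2, 0), height)
    return x1, y1, x2, y2


rng = random.Random(0)
lam = rng.betavariate(1.0, 1.0)  # beta=1.0 gives a uniform mixing ratio
box = rand_bbox(640, 480, lam, rng)
```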

class `ultralytics.data.augment.RandomPerspective`

Args

Name	Type	Description	Default
`degrees`	`float`	Degree range for random rotations.	`0.0`
`translate`	`float`	Fraction of total width and height for random translation.	`0.1`
`scale`	`float`	Scaling factor interval, e.g., a scale factor of 0.5 allows a resize between 50%-150%.	`0.5`
`shear`	`float`	Shear intensity (angle in degrees).	`0.0`
`perspective`	`float`	Perspective distortion factor.	`0.0`
`border`	`tuple[int, int]`	Tuple specifying mosaic border (top/bottom, left/right).	`(0, 0)`
`pre_transform`	`Callable \| None`	Function/transform to apply to the image before starting the random transformation.	`None`

Attributes of `RandomPerspective`

Name	Type	Description
`degrees`	`float`	Maximum absolute degree range for random rotations.
`translate`	`float`	Maximum translation as a fraction of the image size.
`scale`	`float`	Scaling factor range, e.g., scale=0.1 means 0.9-1.1.
`shear`	`float`	Maximum shear angle in degrees.
`perspective`	`float`	Perspective distortion factor.
`border`	`tuple[int, int]`	Mosaic border size as (top/bottom, left/right).
`pre_transform`	`Callable \| None`	Optional transform to apply before the random perspective.

Methods of `RandomPerspective`

Name	Description
`__call__`	Apply random perspective and affine transformations to an image and its associated labels.
`affine_transform`	Apply a sequence of affine transformations centered around the image center.
`apply_bboxes`	Apply affine transformation to bounding boxes.
`apply_keypoints`	Apply affine transformation to keypoints.
`apply_segments`	Apply affine transformations to segments and generate new bounding boxes.
`box_candidates`	Compute candidate boxes for further processing based on size and aspect ratio criteria.

Args for `RandomPerspective.affine_transform`

Name	Type	Description	Default
`img`	`np.ndarray`	Input image to be transformed.	required
`border`	`tuple[int, int]`	Border dimensions for the transformed image.	required

Returns from `RandomPerspective.affine_transform`

Type	Description
`img (np.ndarray)`	Transformed image.
`M (np.ndarray)`	3x3 transformation matrix.
`s (float)`	Scale factor applied during the transformation.
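
A 3x3 matrix of the kind returned here can be built by composing simple steps: move the image center to the origin, rotate and scale, then move back and translate. The sketch below shows only that subset (no shear or perspective), with hypothetical helper names `matmul3`, `build_affine`, and `apply_point`. The rotation sign convention is an assumption.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def build_affine(width, height, angle_deg=0.0, scale=1.0, tx=0.0, ty=0.0):
    """Rotate and scale about the image center, then translate by (tx, ty) pixels."""
    c = [[1, 0, -width / 2], [0, 1, -height / 2], [0, 0, 1]]  # center -> origin
    a = math.radians(angle_deg)
    cos_a, sin_a = scale * math.cos(a), scale * math.sin(a)
    r = [[cos_a, sin_a, 0], [-sin_a, cos_a, 0], [0, 0, 1]]  # rotation with scale
    t = [[1, 0, width / 2 + tx], [0, 1, height / 2 + ty], [0, 0, 1]]  # back + shift
    return matmul3(t, matmul3(r, c))  # applied right to left: C, then R, then T


def apply_point(m, x, y):
    """Map a point through an affine 3x3 matrix."""
    return (m[0][0] * x + m[0][1] * y + m[0][2], m[1][0] * x + m[1][1] * y + m[1][2])


M = build_affine(640, 480, angle_deg=30, scale=1.2)
center = apply_point(M, 320, 240)  # the image center stays fixed when tx = ty = 0
```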

Args for `RandomPerspective.apply_bboxes`

Name	Type	Description	Default
`bboxes`	`np.ndarray`	Bounding boxes in xyxy format with shape (N, 4), where N is the number of bounding boxes.	required
`M`	`np.ndarray`	Affine transformation matrix with shape (3, 3).	required
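
One common way to push an xyxy box through a 3x3 matrix is to transform its four corners and take the axis-aligned box that encloses them. The sketch below does this for a single box using plain lists. `transform_box` is a hypothetical name, and the approach is an assumption about the general technique rather than a description of the library's code.

```python
def transform_box(box, m):
    """Transform an xyxy box by a 3x3 matrix and return the enclosing xyxy box."""
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y2), (x1, y2), (x2, y1)]
    xs, ys = [], []
    for x, y in corners:
        px = m[0][0] * x + m[0][1] * y + m[0][2]
        py = m[1][0] * x + m[1][1] * y + m[1][2]
        pw = m[2][0] * x + m[2][1] * y + m[2][2]  # equals 1 for a purely affine matrix
        xs.append(px / pw)
        ys.append(py / pw)
    return (min(xs), min(ys), max(xs), max(ys))


translate = [[1, 0, 10], [0, 1, -5], [0, 0, 1]]
rotate90 = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]
moved = transform_box((0, 0, 20, 10), translate)
turned = transform_box((0, 0, 20, 10), rotate90)
```

After a rotation the enclosing box is generally larger than the object, which is one reason transformed boxes are filtered afterwards.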

Args for `RandomPerspective.apply_keypoints`

Name	Type	Description	Default
`keypoints`	`np.ndarray`	Array of keypoints with shape (N, 17, 3), where N is the number of instances, 17 is the number of keypoints per instance, and 3 represents (x, y, visibility).	required
`M`	`np.ndarray`	3x3 affine transformation matrix.	required
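
Keypoints can be transformed like any other points. The extra concern is visibility: a point that leaves the image is usually marked as not visible. The sketch below assumes that rule and a hypothetical helper `transform_keypoints` working on one instance's `(x, y, visibility)` triples.

```python
def transform_keypoints(keypoints, m, width, height):
    """Map (x, y, visibility) keypoints through a 3x3 matrix.

    Assumed rule: a keypoint that lands outside the image gets visibility 0.
    """
    out = []
    for x, y, v in keypoints:
        px = m[0][0] * x + m[0][1] * y + m[0][2]
        py = m[1][0] * x + m[1][1] * y + m[1][2]
        pw = m[2][0] * x + m[2][1] * y + m[2][2]
        nx, ny = px / pw, py / pw
        inside = 0 <= nx <= width and 0 <= ny <= height
        out.append((nx, ny, v if inside else 0))
    return out


shift_right = [[1, 0, 50], [0, 1, 0], [0, 0, 1]]
kpts = transform_keypoints([(10, 10, 2), (90, 10, 2)], shift_right, width=100, height=100)
```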

Args for `RandomPerspective.apply_segments`

Name	Type	Description	Default
`segments`	`np.ndarray`	Input segments with shape (N, M, 2), where N is the number of segments and M is the number of points in each segment.	required
`M`	`np.ndarray`	Affine transformation matrix with shape (3, 3).	required

Returns from `RandomPerspective.apply_segments`

Type	Description
`bboxes (np.ndarray)`	New bounding boxes with shape (N, 4) in xyxy format.
`segments (np.ndarray)`	Transformed and clipped segments with shape (N, M, 2).

Args for `RandomPerspective.box_candidates`

Name	Type	Description	Default
`box1`	`np.ndarray`	Original boxes before augmentation, shape (4, N) where N is the number of boxes. Format is [x1, y1, x2, y2] in absolute coordinates.	required
`box2`	`np.ndarray`	Augmented boxes after transformation, shape (4, N). Format is [x1, y1, x2, y2] in absolute coordinates.	required
`wh_thr`	`int`	Width and height threshold in pixels. Boxes smaller than this in either dimension are rejected.	`2`
`ar_thr`	`int`	Aspect ratio threshold. Boxes with an aspect ratio greater than this value are rejected.	`100`
`area_thr`	`float`	Area ratio threshold. Boxes with an area ratio (new/old) less than this value are rejected.	`0.1`
`eps`	`float`	Small epsilon value to prevent division by zero.	`1e-16`
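
The thresholds in the table above combine into a single keep-or-reject test per box. The sketch below applies them to one box pair using plain tuples. `is_candidate` is a hypothetical name, and the exact comparison operators are an assumption based on the parameter descriptions.

```python
def is_candidate(box1, box2, wh_thr=2, ar_thr=100, area_thr=0.1, eps=1e-16):
    """Return True if the augmented box2 is still a usable version of box1 (both xyxy)."""
    w1, h1 = box1[2] - box1[0], box1[3] - box1[1]
    w2, h2 = box2[2] - box2[0], box2[3] - box2[1]
    aspect = max(w2 / (h2 + eps), h2 / (w2 + eps))
    return (
        w2 > wh_thr  # wide enough
        and h2 > wh_thr  # tall enough
        and (w2 * h2) / (w1 * h1 + eps) > area_thr  # kept enough of its area
        and aspect < ar_thr  # not a degenerate sliver
    )


keep = is_candidate((0, 0, 100, 100), (0, 0, 50, 50))       # area ratio 0.25
too_thin = is_candidate((0, 0, 100, 100), (0, 0, 1, 50))    # width below wh_thr
too_small = is_candidate((0, 0, 100, 100), (0, 0, 20, 20))  # area ratio 0.04
sliver = is_candidate((0, 0, 400, 4), (0, 0, 500, 4))       # aspect ratio 125
```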

class `ultralytics.data.augment.RandomHSV`

Args

Name	Type	Description	Default
`hgain`	`float`	Maximum variation for hue. Should be in the range [0, 1].	`0.5`
`sgain`	`float`	Maximum variation for saturation. Should be in the range [0, 1].	`0.5`
`vgain`	`float`	Maximum variation for value. Should be in the range [0, 1].	`0.5`

Attributes of `RandomHSV`

Name	Type	Description
`hgain`	`float`	Maximum variation for hue. Range is typically [0, 1].
`sgain`	`float`	Maximum variation for saturation. Range is typically [0, 1].
`vgain`	`float`	Maximum variation for value. Range is typically [0, 1].
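
To give a feel for what the three gains control, the sketch below jitters a single RGB pixel in HSV space using only the standard library. It is an illustration under stated assumptions: `jitter_hsv` is a hypothetical function, the hue is shifted additively while saturation and value are scaled, and a real image pipeline would typically work on whole arrays with lookup tables instead.

```python
import colorsys
import random


def jitter_hsv(rgb, hgain=0.5, sgain=0.5, vgain=0.5, rng=random):
    """Randomly perturb one 8-bit RGB pixel in HSV space."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # One random factor in [-gain, gain] per channel.
    dh, ds, dv = (rng.uniform(-1, 1) * gain for gain in (hgain, sgain, vgain))
    h = (h + dh) % 1.0  # hue wraps around the color wheel
    s = min(max(s * (1 + ds), 0.0), 1.0)  # saturation is clipped
    v = min(max(v * (1 + dv), 0.0), 1.0)  # value (brightness) is clipped
    return tuple(round(c * 255) for c in colorsys.hsv_to_rgb(h, s, v))


unchanged = jitter_hsv((200, 100, 50), hgain=0.0, sgain=0.0, vgain=0.0)
jittered = jitter_hsv((200, 100, 50), rng=random.Random(0))
```

Setting all three gains to zero leaves the pixel as it was, which matches the reading of each gain as a maximum variation.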

class `ultralytics.data.augment.RandomFlip`

Args

Name	Type	Description	Default
`p`	`float`	The probability of applying the flip. Must be between 0 and 1.	`0.5`
`direction`	`str`	The direction to apply the flip. Must be 'horizontal' or 'vertical'.	`"horizontal"`
`flip_idx`	`list[int] \| None`	Index mapping for flipping keypoints, if any.	`None`

Attributes of `RandomFlip`

Name	Type	Description
`p`	`float`	Probability of applying the flip. Must be between 0 and 1.
`direction`	`str`	Direction of flip, either 'horizontal' or 'vertical'.
`flip_idx`	`array-like`	Index mapping for flipping keypoints, if applicable.
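
The purpose of `flip_idx` is easiest to see in a small example: after a horizontal flip, a "left" keypoint sits where a "right" one would be, so the keypoints are reordered as well as mirrored. The sketch below assumes normalized coordinates and uses hypothetical helpers `flip_horizontal` and `random_flip`.

```python
import random


def flip_horizontal(boxes_xywhn, keypoints=None, flip_idx=None):
    """Mirror normalized (cx, cy, w, h) boxes and (x, y, v) keypoints left-right."""
    boxes = [(1.0 - cx, cy, w, h) for cx, cy, w, h in boxes_xywhn]
    if keypoints is not None:
        keypoints = [(1.0 - x, y, v) for x, y, v in keypoints]
        if flip_idx is not None:
            # Swap left/right keypoint slots so each index keeps its meaning.
            keypoints = [keypoints[i] for i in flip_idx]
    return boxes, keypoints


def random_flip(boxes_xywhn, keypoints=None, flip_idx=None, p=0.5, rng=random):
    """Apply the horizontal flip with probability p."""
    if rng.random() < p:
        return flip_horizontal(boxes_xywhn, keypoints, flip_idx)
    return boxes_xywhn, keypoints


boxes, kpts = flip_horizontal(
    [(0.25, 0.5, 0.1, 0.2)],
    keypoints=[(0.25, 0.3, 2), (0.5, 0.4, 1)],
    flip_idx=[1, 0],
)
```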

class `ultralytics.data.augment.LetterBox`

Args

Name	Type	Description	Default
`new_shape`	`tuple[int, int]`	Target size (height, width) for the resized image.	`(640, 640)`
`auto`	`bool`	If True, use minimum rectangle to resize. If False, use new_shape directly.	`False`
`scale_fill`	`bool`	If True, stretch the image to new_shape without padding.	`False`
`scaleup`	`bool`	If True, allow scaling up. If False, only scale down.	`True`
`center`	`bool`	If True, center the placed image. If False, place image in top-left corner.	`True`
`stride`	`int`	Stride of the model (e.g., 32 for YOLOv5).	`32`
`padding_value`	`int`	Value for padding the image. Default is 114.	`114`
`interpolation`	`int`	Interpolation method for resizing. Default is cv2.INTER_LINEAR.	`cv2.INTER_LINEAR`

Attributes of `LetterBox`

Name	Type	Description
`new_shape`	`tuple`	Target shape (height, width) for resizing.
`auto`	`bool`	Whether to use minimum rectangle.
`scale_fill`	`bool`	Whether to stretch the image to new_shape.
`scaleup`	`bool`	Whether to allow scaling up. If False, only scale down.
`stride`	`int`	Stride for rounding padding.
`center`	`bool`	Whether to center the image or align to top-left.
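
How the letterbox options interact is mostly arithmetic, so it can be sketched without touching any pixels. The helper below computes the resized shape and the padding on each side. It is a hypothetical function, `letterbox_geometry`, written from the parameter descriptions above, and the rounding details are assumptions.

```python
def letterbox_geometry(shape, new_shape=(640, 640), auto=False, scale_fill=False,
                       scaleup=True, center=True, stride=32):
    """Return ((resized_h, resized_w), (top, bottom, left, right)) for an (h, w) image."""
    h, w = shape
    new_h, new_w = new_shape
    r = min(new_h / h, new_w / w)  # scale that fits the image inside new_shape
    if not scaleup:
        r = min(r, 1.0)  # only shrink, never enlarge
    unpad_w, unpad_h = round(w * r), round(h * r)
    dw, dh = new_w - unpad_w, new_h - unpad_h  # total padding per axis
    if auto:
        dw, dh = dw % stride, dh % stride  # pad only up to a multiple of stride
    elif scale_fill:
        dw, dh = 0, 0  # stretch to new_shape, no padding
        unpad_w, unpad_h = new_w, new_h
    if center:
        dw, dh = dw / 2, dh / 2  # split padding between both sides
    top, left = (round(dh - 0.1), round(dw - 0.1)) if center else (0, 0)
    bottom, right = round(dh + 0.1), round(dw + 0.1)
    return (unpad_h, unpad_w), (top, bottom, left, right)


default = letterbox_geometry((720, 1280))
min_rect = letterbox_geometry((720, 1280), auto=True)
top_left = letterbox_geometry((720, 1280), center=False)
no_upscale = letterbox_geometry((100, 100), scaleup=False)
stretched = letterbox_geometry((720, 1280), scale_fill=True)
```

With `auto=True` the output is no longer exactly `new_shape`. It is the smallest size that fits the resized image and is a multiple of `stride` on each axis.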