Reference for ultralytics/utils/torch_utils.py
Note
This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/utils/torch_utils.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
ultralytics.utils.torch_utils.ModelEMA
Updated Exponential Moving Average (EMA) from https://github.com/rwightman/pytorch-image-models. Keeps a moving average of everything in the model state_dict (parameters and buffers).
For EMA details see https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage
To disable EMA set the enabled attribute to False.
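The averaging described above can be sketched with plain floats standing in for tensors. This is a minimal illustration, not the library's code: the helper names `ema_decay` and `ema_update`, the decay ramp, and the values `decay=0.9999` and `tau=2000.0` are assumptions chosen for the example.

```python
import math


def ema_decay(updates: int, decay: float = 0.9999, tau: float = 2000.0) -> float:
    """Ramp the decay from near 0 toward `decay` (assumed ramp, for illustration)."""
    return decay * (1.0 - math.exp(-updates / tau))


def ema_update(ema: dict, model: dict, updates: int) -> None:
    """Blend every entry of `model` into `ema` in place."""
    d = ema_decay(updates)
    for name, value in model.items():
        ema[name] = d * ema[name] + (1.0 - d) * value


# Plain floats stand in for parameters and buffers of a state_dict.
ema_state = {"w": 0.0}
for step in range(1, 4):
    ema_update(ema_state, {"w": 1.0}, updates=step)
```

With a ramp like this, early updates follow the live model closely, and later updates change the average more slowly as the decay approaches its ceiling.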
update
Update EMA parameters.
update_attr
Updates attributes and saves stripped model with optimizer removed.
ultralytics.utils.torch_utils.EarlyStopping
Early stopping class that stops training when a specified number of epochs have passed without improvement.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
patience | int | Number of epochs to wait after fitness stops improving before stopping. | 50 |
__call__
Check whether to stop training.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
epoch | int | Current epoch of training | required |
fitness | float | Fitness value of current epoch | required |
Returns:
Type | Description |
---|---|
bool | True if training should stop, False otherwise |
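The patience logic can be sketched as follows. `PatienceStopper` is a hypothetical stand-in written for this example, not the library class; it assumes that higher fitness is better and that a tie counts as an improvement.

```python
class PatienceStopper:
    """Hypothetical sketch: stop after `patience` epochs without improvement."""

    def __init__(self, patience: int = 50):
        self.patience = patience
        self.best_fitness = float("-inf")
        self.best_epoch = 0

    def __call__(self, epoch: int, fitness: float) -> bool:
        # Record the epoch whenever fitness matches or beats the best seen so far.
        if fitness >= self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        # Stop once `patience` epochs have passed since the best epoch.
        return (epoch - self.best_epoch) >= self.patience


stopper = PatienceStopper(patience=2)
history = [0.1, 0.3, 0.2, 0.25, 0.1]
decisions = [stopper(epoch, fit) for epoch, fit in enumerate(history)]
```

Here the best fitness occurs at epoch 1, so the first stop signal appears at epoch 3, two epochs later.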
ultralytics.utils.torch_utils.FXModel
Bases: Module
A custom model class for torch.fx compatibility.
This class extends torch.nn.Module and is designed to ensure compatibility with torch.fx for tracing and graph manipulation.
It copies attributes from an existing model and explicitly sets the model attribute to ensure proper copying.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model | Module | The original model to wrap for torch.fx compatibility. | required |
forward
Forward pass through the model.
This method performs the forward pass through the model, handling the dependencies between layers and saving intermediate outputs.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | Tensor | The input tensor to the model. | required |
Returns:
Type | Description |
---|---|
Tensor | The output tensor from the model. |
ultralytics.utils.torch_utils.torch_distributed_zero_first
Ensures all processes in distributed training wait for the local master (rank 0) to complete a task first.
ultralytics.utils.torch_utils.smart_inference_mode
Applies torch.inference_mode() decorator if torch>=1.9.0 else torch.no_grad() decorator.
ultralytics.utils.torch_utils.autocast
Get the appropriate autocast context manager based on PyTorch version and AMP setting.
This function returns a context manager for automatic mixed precision (AMP) training that is compatible with both older and newer versions of PyTorch. It handles the differences in the autocast API between PyTorch versions.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
enabled | bool | Whether to enable automatic mixed precision. | required |
device | str | The device to use for autocast. Defaults to 'cuda'. | 'cuda' |
Returns:
Type | Description |
---|---|
autocast | The appropriate autocast context manager. |
Note
- For PyTorch versions 1.13 and newer, it uses torch.amp.autocast.
- For older versions, it uses torch.cuda.amp.autocast.
ultralytics.utils.torch_utils.get_cpu_info
Return a string with system CPU information, e.g. 'Apple M2'.
ultralytics.utils.torch_utils.get_gpu_info
Return a string with system GPU information, e.g. 'Tesla T4, 15102MiB'.
ultralytics.utils.torch_utils.select_device
Selects the appropriate PyTorch device based on the provided arguments.
The function takes a string specifying the device or a torch.device object and returns a torch.device object representing the selected device. The function also validates the number of available devices and raises an exception if the requested device(s) are not available.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
device | str \| device | Device string or torch.device object. Options are 'None', 'cpu', or 'cuda', or '0' or '0,1,2,3'. Defaults to an empty string, which auto-selects the first available GPU, or CPU if no GPU is available. | '' |
batch | int | Batch size being used in your model. Defaults to 0. | 0 |
newline | bool | If True, adds a newline at the end of the log string. Defaults to False. | False |
verbose | bool | If True, logs the device information. Defaults to True. | True |
Returns:
Type | Description |
---|---|
device | Selected device. |
Raises:
Type | Description |
---|---|
ValueError | If the specified device is not available or if the batch size is not a multiple of the number of devices when using multiple GPUs. |
Note
Sets the 'CUDA_VISIBLE_DEVICES' environment variable for specifying which GPUs to use.
ultralytics.utils.torch_utils.time_sync
ultralytics.utils.torch_utils.fuse_conv_and_bn
Fuse Conv2d() and BatchNorm2d() layers https://tehnokv.com/posts/fusing-batchnorm-and-conv/.
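The algebra behind the fusion can be checked on a single scalar channel. This is a sketch of the standard identity, not the library's tensor code; `fuse_scalar` and `conv_then_bn` are hypothetical helpers, and batch normalization is assumed to be in inference mode with fixed running statistics.

```python
import math


def fuse_scalar(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm statistics into a scalar weight and bias."""
    scale = gamma / math.sqrt(var + eps)
    return w * scale, (b - mean) * scale + beta


def conv_then_bn(x, w, b, gamma, beta, mean, var, eps=1e-5):
    """Reference path: affine 'convolution' followed by batch norm."""
    z = w * x + b
    return gamma * (z - mean) / math.sqrt(var + eps) + beta


params = dict(w=0.5, b=0.1, gamma=1.5, beta=-0.2, mean=0.3, var=4.0)
w_fused, b_fused = fuse_scalar(**params)

x = 2.0
unfused = conv_then_bn(x, **params)
fused = w_fused * x + b_fused  # a single affine step replaces both layers
```

The same per-channel scaling applies to full convolution kernels, which is why a fused layer can replace the pair at inference time.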
ultralytics.utils.torch_utils.fuse_deconv_and_bn
Fuse ConvTranspose2d() and BatchNorm2d() layers.
ultralytics.utils.torch_utils.model_info
Print and return detailed model information layer by layer.
ultralytics.utils.torch_utils.get_num_params
ultralytics.utils.torch_utils.get_num_gradients
Return the total number of parameters with gradients in a YOLO model.
ultralytics.utils.torch_utils.model_info_for_loggers
Return model info dict with useful model information.
ultralytics.utils.torch_utils.get_flops
Return a YOLO model's FLOPs.
ultralytics.utils.torch_utils.get_flops_with_torch_profiler
Compute model FLOPs (thop package alternative, but 2-10x slower unfortunately).
ultralytics.utils.torch_utils.initialize_weights
Initialize model weights to random values.
ultralytics.utils.torch_utils.scale_img
Scales and pads an image tensor, optionally maintaining aspect ratio and padding to gs multiple.
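The size arithmetic implied by "padding to gs multiple" can be sketched without any tensors. The helper `scaled_and_padded_size`, its defaults, and the exact rounding are assumptions for illustration; the library function may compute these values differently.

```python
import math


def scaled_and_padded_size(h, w, ratio=1.0, gs=32):
    """Return the resized (h, w) and the (h, w) after padding up to a multiple of gs."""
    resized = (int(h * ratio), int(w * ratio))
    padded = tuple(math.ceil(x * ratio / gs) * gs for x in (h, w))
    return resized, padded


resized, padded = scaled_and_padded_size(640, 480, ratio=0.5, gs=32)
```

In this example the resized height of 320 is already a multiple of 32, while the resized width of 240 is padded up to 256.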
ultralytics.utils.torch_utils.copy_attr
Copies attributes from object 'b' to object 'a', with options to include/exclude certain attributes.
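A minimal sketch of this kind of attribute copy, using a hypothetical helper `copy_public_attrs`. Skipping names that start with an underscore is an assumption of the sketch, not something the description above states.

```python
def copy_public_attrs(a, b, include=(), exclude=()):
    """Copy instance attributes from b to a, honoring include and exclude filters."""
    for name, value in b.__dict__.items():
        if (include and name not in include) or name.startswith("_") or name in exclude:
            continue  # filtered out, leave `a` untouched for this name
        setattr(a, name, value)


class Source:
    def __init__(self):
        self.names = ["cat", "dog"]
        self.stride = 32
        self._cache = "private"


class Target:
    pass


target = Target()
copy_public_attrs(target, Source(), exclude=("stride",))
```

An empty `include` means "copy everything not excluded", so only `names` reaches `target` here.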
ultralytics.utils.torch_utils.get_latest_opset
Return the second-most recent ONNX opset version supported by this version of PyTorch, adjusted for maturity.
ultralytics.utils.torch_utils.intersect_dicts
Returns a dictionary of intersecting keys with matching shapes, excluding 'exclude' keys, using da values.
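The filtering can be sketched with a stand-in object that only carries a `shape`. `FakeTensor` and `intersect_by_shape` are hypothetical names, and treating `exclude` entries as substrings of the key is an assumption of this sketch.

```python
from collections import namedtuple

# Stand-in for a tensor: only the `shape` attribute matters here.
FakeTensor = namedtuple("FakeTensor", ["shape"])


def intersect_by_shape(da, db, exclude=()):
    """Keep da's entries whose key is in db, is not excluded, and whose shape matches."""
    return {
        k: v
        for k, v in da.items()
        if k in db and all(x not in k for x in exclude) and v.shape == db[k].shape
    }


da = {
    "conv.weight": FakeTensor((16, 3, 3, 3)),
    "head.weight": FakeTensor((80, 16)),
    "bn.bias": FakeTensor((16,)),
}
db = {
    "conv.weight": FakeTensor((16, 3, 3, 3)),
    "head.weight": FakeTensor((10, 16)),
    "bn.bias": FakeTensor((16,)),
}
kept = intersect_by_shape(da, db, exclude=("bn",))
```

Only `conv.weight` survives: `head.weight` differs in shape and `bn.bias` is excluded. A filter like this is useful when loading pretrained weights into a model whose head has a different number of classes.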
ultralytics.utils.torch_utils.is_parallel
ultralytics.utils.torch_utils.de_parallel
De-parallelize a model: returns single-GPU model if model is of type DP or DDP.
ultralytics.utils.torch_utils.one_cycle
Returns a lambda function for sinusoidal ramp from y1 to y2 https://arxiv.org/pdf/1812.01187.pdf.
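One way to write such a ramp is a half-cosine that moves from y1 at step 0 to y2 at the final step. The name `cosine_ramp` and its default arguments are assumptions made for this sketch.

```python
import math


def cosine_ramp(y1=0.0, y2=1.0, steps=100):
    """Return f(x) that rises smoothly from y1 at x=0 to y2 at x=steps."""
    return lambda x: max((1 - math.cos(x * math.pi / steps)) / 2, 0) * (y2 - y1) + y1


# Example: decay a learning-rate factor from 1.0 to 0.01 over 100 epochs.
lr_factor = cosine_ramp(1.0, 0.01, steps=100)
start, middle, end = lr_factor(0), lr_factor(50), lr_factor(100)
```

A function of this form can be passed to a scheduler that expects a per-epoch multiplier, such as a lambda-based learning-rate scheduler.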
ultralytics.utils.torch_utils.init_seeds
Initialize random number generator (RNG) seeds https://pytorch.org/docs/stable/notes/randomness.html.
ultralytics.utils.torch_utils.strip_optimizer
Strip optimizer from 'f' to finalize training, optionally save as 's'.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
f | str | File path to model to strip the optimizer from. Default is 'best.pt'. | 'best.pt' |
s | str | File path to save the model with stripped optimizer to. If not provided, 'f' will be overwritten. | '' |
updates | dict | A dictionary of updates to overlay onto the checkpoint before saving. | None |
Returns:
Type | Description |
---|---|
dict | The combined checkpoint dictionary. |
Note
Use ultralytics.nn.torch_safe_load for missing modules with x = torch_safe_load(f)[0]
ultralytics.utils.torch_utils.convert_optimizer_state_dict_to_fp16
Converts the state_dict of a given optimizer to FP16, focusing on the 'state' key for tensor conversions.
This method aims to reduce storage size without altering 'param_groups' as they contain non-tensor data.
ultralytics.utils.torch_utils.cuda_memory_usage
Monitor and manage CUDA memory usage.
This function checks if CUDA is available and, if so, empties the CUDA cache to free up unused memory. It then yields a dictionary containing memory usage information, which can be updated by the caller. Finally, it updates the dictionary with the amount of memory reserved by CUDA on the specified device.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
device | device | The CUDA device to query memory usage for. Defaults to None. | None |
Yields:
Type | Description |
---|---|
dict | A dictionary with a key 'memory' initialized to 0, which will be updated with the reserved memory. |
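The yield-then-update pattern described above can be sketched with `contextlib.contextmanager`. The name `memory_usage` and the `read_reserved` callable are hypothetical; in real use the callable would be a query such as `torch.cuda.memory_reserved`, which this sketch replaces with a constant so it runs without a GPU.

```python
from contextlib import contextmanager


@contextmanager
def memory_usage(read_reserved):
    """Yield a dict whose 'memory' entry is filled in when the block exits."""
    info = {"memory": 0}
    try:
        yield info
    finally:
        # Measured after the caller's block has finished running.
        info["memory"] = read_reserved()


with memory_usage(lambda: 1024) as mem:
    during = mem["memory"]  # still the initial 0 inside the block
after = mem["memory"]  # updated once the block has exited
```

Because the dictionary is updated on exit, callers should read `mem["memory"]` only after the `with` block ends.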
ultralytics.utils.torch_utils.profile
Ultralytics speed, memory and FLOPs profiler.