Reference for ultralytics/nn/modules/transformer.py
Note
This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/nn/modules/transformer.py. If you spot a problem, please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
ultralytics.nn.modules.transformer.TransformerEncoderLayer
Bases: Module
Defines a single layer of the transformer encoder.
Source code in ultralytics/nn/modules/transformer.py
__init__(c1, cm=2048, num_heads=8, dropout=0.0, act=nn.GELU(), normalize_before=False)
Initialize the TransformerEncoderLayer with specified parameters.
Source code in ultralytics/nn/modules/transformer.py
forward(src, src_mask=None, src_key_padding_mask=None, pos=None)
Propagate the input forward through the encoder module.
Source code in ultralytics/nn/modules/transformer.py
forward_post(src, src_mask=None, src_key_padding_mask=None, pos=None)
Performs forward pass with post-normalization.
Source code in ultralytics/nn/modules/transformer.py
forward_pre(src, src_mask=None, src_key_padding_mask=None, pos=None)
Performs forward pass with pre-normalization.
Source code in ultralytics/nn/modules/transformer.py
with_pos_embed(tensor, pos=None)
staticmethod
Add position embeddings to the tensor if provided.
Source code in ultralytics/nn/modules/transformer.py
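A minimal usage sketch (not part of the official docs); the (batch, sequence, channels) layout is assumed from the batch-first multi-head attention this layer wraps:

```python
import torch
from ultralytics.nn.modules.transformer import TransformerEncoderLayer

layer = TransformerEncoderLayer(c1=256, cm=1024, num_heads=8)
src = torch.rand(2, 100, 256)  # (batch, sequence length, channels c1)
out = layer(src)               # shape is preserved: (2, 100, 256)
```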
ultralytics.nn.modules.transformer.AIFI
Bases: TransformerEncoderLayer
Defines the AIFI transformer layer.
Source code in ultralytics/nn/modules/transformer.py
__init__(c1, cm=2048, num_heads=8, dropout=0, act=nn.GELU(), normalize_before=False)
Initialize the AIFI instance with specified parameters.
build_2d_sincos_position_embedding(w, h, embed_dim=256, temperature=10000.0)
staticmethod
Builds 2D sine-cosine position embedding.
Source code in ultralytics/nn/modules/transformer.py
forward(x)
Forward pass for the AIFI transformer layer.
Source code in ultralytics/nn/modules/transformer.py
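A usage sketch: AIFI flattens a feature map to a sequence, adds the 2D sine-cosine position embedding, runs the encoder layer, and reshapes back. The channel count is assumed divisible by 4, as the position embedding requires:

```python
import torch
from ultralytics.nn.modules.transformer import AIFI

aifi = AIFI(c1=256, cm=1024, num_heads=8)
x = torch.rand(2, 256, 20, 20)  # (batch, channels, height, width)
out = aifi(x)                   # output keeps the input shape: (2, 256, 20, 20)
```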
ultralytics.nn.modules.transformer.TransformerLayer
Bases: Module
Transformer layer (https://arxiv.org/abs/2010.11929) with LayerNorm layers removed for better performance.
Source code in ultralytics/nn/modules/transformer.py
__init__(c, num_heads)
Initializes a self-attention mechanism using linear transformations and multi-head attention.
Source code in ultralytics/nn/modules/transformer.py
forward(x)
Apply a transformer block to the input x and return the output.
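A usage sketch, assuming the sequence-first (sequence, batch, channels) layout of PyTorch's default nn.MultiheadAttention that this layer builds on:

```python
import torch
from ultralytics.nn.modules.transformer import TransformerLayer

layer = TransformerLayer(c=64, num_heads=4)
x = torch.rand(100, 2, 64)  # (sequence length, batch, channels c)
out = layer(x)              # shape is preserved: (100, 2, 64)
```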
ultralytics.nn.modules.transformer.TransformerBlock
Bases: Module
Vision Transformer (https://arxiv.org/abs/2010.11929).
Source code in ultralytics/nn/modules/transformer.py
__init__(c1, c2, num_heads, num_layers)
Initialize a Transformer module with position embedding and specified number of heads and layers.
Source code in ultralytics/nn/modules/transformer.py
forward(x)
Propagate the input forward through the transformer block.
Source code in ultralytics/nn/modules/transformer.py
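A usage sketch (example dimensions are illustrative): when c1 != c2 the block first projects the input with a convolution, so the output has c2 channels while the spatial size is kept:

```python
import torch
from ultralytics.nn.modules.transformer import TransformerBlock

block = TransformerBlock(c1=64, c2=128, num_heads=4, num_layers=2)
x = torch.rand(2, 64, 20, 20)  # (batch, c1, height, width)
out = block(x)                 # channels projected to c2: (2, 128, 20, 20)
```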
ultralytics.nn.modules.transformer.MLPBlock
Bases: Module
Implements a single block of a multi-layer perceptron.
Source code in ultralytics/nn/modules/transformer.py
__init__(embedding_dim, mlp_dim, act=nn.GELU)
Initialize the MLPBlock with specified embedding dimension, MLP dimension, and activation function.
Source code in ultralytics/nn/modules/transformer.py
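A usage sketch: a linear layer up to mlp_dim, the activation, then a linear layer back to embedding_dim, applied to the last dimension, so the input shape is preserved:

```python
import torch
from ultralytics.nn.modules.transformer import MLPBlock

mlp = MLPBlock(embedding_dim=256, mlp_dim=1024)
x = torch.rand(2, 100, 256)  # any shape ending in embedding_dim
out = mlp(x)                 # shape is preserved: (2, 100, 256)
```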
ultralytics.nn.modules.transformer.MLP
Bases: Module
Implements a simple multi-layer perceptron (also called FFN).
Source code in ultralytics/nn/modules/transformer.py
__init__(input_dim, hidden_dim, output_dim, num_layers)
Initialize the MLP with specified input, hidden, output dimensions and number of layers.
Source code in ultralytics/nn/modules/transformer.py
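A usage sketch; only the last dimension changes, from input_dim to output_dim (e.g. predicting 4 box coordinates per query, as the decoder further below does):

```python
import torch
from ultralytics.nn.modules.transformer import MLP

mlp = MLP(input_dim=256, hidden_dim=256, output_dim=4, num_layers=3)
x = torch.rand(2, 100, 256)
out = mlp(x)  # last dim maps input_dim -> output_dim: (2, 100, 4)
```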
ultralytics.nn.modules.transformer.LayerNorm2d
Bases: Module
2D Layer Normalization module inspired by Detectron2 and ConvNeXt implementations.
Original implementations in https://github.com/facebookresearch/detectron2/blob/main/detectron2/layers/batch_norm.py and https://github.com/facebookresearch/ConvNeXt/blob/main/models/convnext.py.
Source code in ultralytics/nn/modules/transformer.py
__init__(num_channels, eps=1e-06)
Initialize LayerNorm2d with the given parameters.
Source code in ultralytics/nn/modules/transformer.py
forward(x)
Perform forward pass for 2D layer normalization.
Source code in ultralytics/nn/modules/transformer.py
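A usage sketch: unlike nn.LayerNorm, this variant normalizes (batch, channels, height, width) feature maps across the channel dimension directly:

```python
import torch
from ultralytics.nn.modules.transformer import LayerNorm2d

norm = LayerNorm2d(num_channels=64)
x = torch.rand(2, 64, 32, 32)  # (batch, channels, height, width)
out = norm(x)                  # per-position channel normalization, same shape
```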
ultralytics.nn.modules.transformer.MSDeformAttn
Bases: Module
Multi-Scale Deformable Attention Module based on Deformable-DETR and PaddleDetection implementations.
https://github.com/fundamentalvision/Deformable-DETR/blob/main/models/ops/modules/ms_deform_attn.py
Source code in ultralytics/nn/modules/transformer.py
__init__(d_model=256, n_levels=4, n_heads=8, n_points=4)
Initialize MSDeformAttn with the given parameters.
Source code in ultralytics/nn/modules/transformer.py
forward(query, refer_bbox, value, value_shapes, value_mask=None)
Perform forward pass for multiscale deformable attention.
https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/transformers/deformable_transformer.py
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| query | Tensor | [bs, query_length, C] | required |
| refer_bbox | Tensor | [bs, query_length, n_levels, 2], range in [0, 1], top-left (0, 0), bottom-right (1, 1), including padding area | required |
| value | Tensor | [bs, value_length, C] | required |
| value_shapes | List | [n_levels, 2], [(H_0, W_0), (H_1, W_1), ..., (H_{L-1}, W_{L-1})] | required |
| value_mask | Tensor | [bs, value_length], True for non-padding elements, False for padding elements | None |

Returns:

| Name | Type | Description |
| --- | --- | --- |
| output | Tensor | [bs, Length_{query}, C] |
Source code in ultralytics/nn/modules/transformer.py
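A usage sketch built from the shapes documented above; the example dimensions are illustrative. Note that value_length must equal the sum of H_i * W_i over the levels, and d_model must be divisible by n_heads:

```python
import torch
from ultralytics.nn.modules.transformer import MSDeformAttn

attn = MSDeformAttn(d_model=256, n_levels=2, n_heads=8, n_points=4)
bs, query_length = 2, 100
value_shapes = [(32, 32), (16, 16)]                 # (H_i, W_i) for each level
value_length = sum(h * w for h, w in value_shapes)  # 1280
query = torch.rand(bs, query_length, 256)
refer_bbox = torch.rand(bs, query_length, 2, 2)     # [bs, query_length, n_levels, 2] in [0, 1]
value = torch.rand(bs, value_length, 256)
output = attn(query, refer_bbox, value, value_shapes)  # [bs, query_length, 256]
```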
ultralytics.nn.modules.transformer.DeformableTransformerDecoderLayer
Bases: Module
Deformable Transformer Decoder Layer inspired by PaddleDetection and Deformable-DETR implementations.
https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/transformers/deformable_transformer.py
https://github.com/fundamentalvision/Deformable-DETR/blob/main/models/deformable_transformer.py
Source code in ultralytics/nn/modules/transformer.py
__init__(d_model=256, n_heads=8, d_ffn=1024, dropout=0.0, act=nn.ReLU(), n_levels=4, n_points=4)
Initialize the DeformableTransformerDecoderLayer with the given parameters.
Source code in ultralytics/nn/modules/transformer.py
forward(embed, refer_bbox, feats, shapes, padding_mask=None, attn_mask=None, query_pos=None)
Perform the forward pass through the entire decoder layer.
Source code in ultralytics/nn/modules/transformer.py
forward_ffn(tgt)
Perform forward pass through the Feed-Forward Network part of the layer.
with_pos_embed(tensor, pos)
staticmethod
Add positional embeddings to the input tensor, if provided.
Source code in ultralytics/nn/modules/transformer.py
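A usage sketch with illustrative dimensions, assuming refer_bbox carries normalized (cx, cy, w, h) boxes as in RT-DETR and feats is the flattened multi-scale feature sequence:

```python
import torch
from ultralytics.nn.modules.transformer import DeformableTransformerDecoderLayer

layer = DeformableTransformerDecoderLayer(d_model=256, n_heads=8, d_ffn=1024, n_levels=2)
bs, num_queries = 2, 100
shapes = [(32, 32), (16, 16)]                        # per-level (H_i, W_i)
feats = torch.rand(bs, sum(h * w for h, w in shapes), 256)
embed = torch.rand(bs, num_queries, 256)
refer_bbox = torch.rand(bs, num_queries, 4)          # normalized (cx, cy, w, h) boxes
out = layer(embed, refer_bbox, feats, shapes)        # (bs, num_queries, d_model)
```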
ultralytics.nn.modules.transformer.DeformableTransformerDecoder
Bases: Module
Implementation of Deformable Transformer Decoder based on PaddleDetection.
https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/transformers/deformable_transformer.py
Source code in ultralytics/nn/modules/transformer.py
__init__(hidden_dim, decoder_layer, num_layers, eval_idx=-1)
Initialize the DeformableTransformerDecoder with the given parameters.
Source code in ultralytics/nn/modules/transformer.py
forward(embed, refer_bbox, feats, shapes, bbox_head, score_head, pos_mlp, attn_mask=None, padding_mask=None)
Perform the forward pass through the entire decoder.
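A sketch of how the decoder ties together the pieces above. The bbox_head, score_head, and pos_mlp values here are illustrative stand-ins modeled on the MLP/Linear heads RT-DETR uses, not part of this module:

```python
import torch
from torch import nn
from ultralytics.nn.modules.transformer import (
    MLP,
    DeformableTransformerDecoder,
    DeformableTransformerDecoderLayer,
)

num_layers, d_model, nc = 2, 256, 80
decoder = DeformableTransformerDecoder(
    hidden_dim=d_model,
    decoder_layer=DeformableTransformerDecoderLayer(d_model=d_model, n_levels=2),
    num_layers=num_layers,
)
bbox_head = nn.ModuleList(MLP(d_model, d_model, 4, 3) for _ in range(num_layers))  # one box head per layer
score_head = nn.ModuleList(nn.Linear(d_model, nc) for _ in range(num_layers))      # one class head per layer
pos_mlp = MLP(4, 2 * d_model, d_model, 2)  # maps reference boxes to query positional embeddings

bs, num_queries = 2, 100
shapes = [(32, 32), (16, 16)]
feats = torch.rand(bs, sum(h * w for h, w in shapes), d_model)
embed = torch.rand(bs, num_queries, d_model)
refer_bbox = torch.rand(bs, num_queries, 4)  # unactivated boxes; the decoder applies sigmoid internally
dec_bboxes, dec_scores = decoder(
    embed, refer_bbox, feats, shapes, bbox_head, score_head, pos_mlp
)  # stacked per-layer refined boxes and class scores
```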