Reference for ultralytics/models/sam/modules/memory_attention.py
Note
This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/models/sam/modules/memory_attention.py. If you spot a problem please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!
ultralytics.models.sam.modules.memory_attention.MemoryAttentionLayer
MemoryAttentionLayer(
d_model: int = 256,
dim_feedforward: int = 2048,
dropout: float = 0.1,
pos_enc_at_attn: bool = False,
pos_enc_at_cross_attn_keys: bool = True,
pos_enc_at_cross_attn_queries: bool = False,
)
Bases: Module
Implements a memory attention layer with self-attention and cross-attention mechanisms for neural networks.
This class combines self-attention, cross-attention, and feedforward components to process input tensors and generate memory-based attention outputs.
Attributes:

| Name | Type | Description |
|---|---|---|
| d_model | int | Dimensionality of the model. |
| dim_feedforward | int | Dimensionality of the feedforward network. |
| dropout_value | float | Dropout rate for regularization. |
| self_attn | RoPEAttention | Self-attention mechanism using RoPE (Rotary Position Embedding). |
| cross_attn_image | RoPEAttention | Cross-attention mechanism for image processing. |
| linear1 | Linear | First linear layer of the feedforward network. |
| linear2 | Linear | Second linear layer of the feedforward network. |
| norm1 | LayerNorm | Layer normalization for self-attention output. |
| norm2 | LayerNorm | Layer normalization for cross-attention output. |
| norm3 | LayerNorm | Layer normalization for feedforward network output. |
| dropout1 | Dropout | Dropout layer after self-attention. |
| dropout2 | Dropout | Dropout layer after cross-attention. |
| dropout3 | Dropout | Dropout layer after feedforward network. |
| activation | ReLU | Activation function for the feedforward network. |
| pos_enc_at_attn | bool | Flag to add positional encoding at attention. |
| pos_enc_at_cross_attn_queries | bool | Flag to add positional encoding to cross-attention queries. |
| pos_enc_at_cross_attn_keys | bool | Flag to add positional encoding to cross-attention keys. |
Methods:

| Name | Description |
|---|---|
| forward | Performs the full memory attention operation on input tensors. |
| _forward_sa | Performs self-attention on input tensor. |
| _forward_ca | Performs cross-attention between target and memory tensors. |
Examples:
>>> layer = MemoryAttentionLayer(d_model=256, dim_feedforward=2048, dropout=0.1)
>>> tgt = torch.randn(1, 100, 256)
>>> memory = torch.randn(1, 100, 64)
>>> pos = torch.randn(1, 100, 256)
>>> query_pos = torch.randn(1, 100, 256)
>>> output = layer(tgt, memory, pos, query_pos)
>>> print(output.shape)
torch.Size([1, 100, 256])
forward
forward(
tgt,
memory,
pos: Optional[Tensor] = None,
query_pos: Optional[Tensor] = None,
num_k_exclude_rope: int = 0,
) -> torch.Tensor
Processes input tensors using self-attention, cross-attention, and MLP for memory-based attention.
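The layer applies a pre-norm residual pattern: self-attention over the target tokens, cross-attention from the target to the memory, then a two-layer feedforward block. The sketch below is a simplified approximation of that flow based on the attributes listed above; it substitutes plain nn.MultiheadAttention for the RoPEAttention modules, assumes the memory already has d_model channels (the real cross-attention handles lower-dimensional memory features), and omits the positional-encoding flags and num_k_exclude_rope.

```python
# Simplified sketch of the MemoryAttentionLayer flow (approximation, not the exact implementation).
import torch
import torch.nn as nn


class SimplifiedMemoryAttentionLayer(nn.Module):
    """Pre-norm residual layer: self-attention -> cross-attention -> feedforward."""

    def __init__(self, d_model: int = 256, dim_feedforward: int = 2048, dropout: float = 0.1):
        super().__init__()
        # nn.MultiheadAttention stands in for the RoPEAttention modules used by the real layer.
        self.self_attn = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)
        self.cross_attn_image = nn.MultiheadAttention(d_model, num_heads=1, batch_first=True)
        self.linear1 = nn.Linear(d_model, dim_feedforward)
        self.linear2 = nn.Linear(dim_feedforward, d_model)
        self.norm1, self.norm2, self.norm3 = (nn.LayerNorm(d_model) for _ in range(3))
        self.dropout1, self.dropout2, self.dropout3 = (nn.Dropout(dropout) for _ in range(3))
        self.activation = nn.ReLU()

    def forward(self, tgt: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # Self-attention on the normalized target tokens, added back as a residual.
        q = self.norm1(tgt)
        tgt = tgt + self.dropout1(self.self_attn(q, q, q, need_weights=False)[0])
        # Cross-attention: target queries attend to the memory tokens (assumed d_model-wide here).
        q = self.norm2(tgt)
        tgt = tgt + self.dropout2(self.cross_attn_image(q, memory, memory, need_weights=False)[0])
        # Position-wise feedforward block.
        q = self.norm3(tgt)
        return tgt + self.dropout3(self.linear2(self.activation(self.linear1(q))))
```

With tgt shaped (1, 100, 256) and a memory of matching width, the output keeps tgt's shape, analogous to the doctest above.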
ultralytics.models.sam.modules.memory_attention.MemoryAttention
MemoryAttention(
d_model: int,
pos_enc_at_input: bool,
layer: nn.Module,
num_layers: int,
batch_first: bool = True,
)
Bases: Module
Memory attention module for processing sequential data with self and cross-attention mechanisms.
This class implements a multi-layer attention mechanism that combines self-attention and cross-attention for processing sequential data, particularly useful in transformer-like architectures.
Attributes:

| Name | Type | Description |
|---|---|---|
| d_model | int | The dimension of the model's hidden state. |
| layers | ModuleList | A list of MemoryAttentionLayer modules. |
| num_layers | int | The number of attention layers. |
| norm | LayerNorm | Layer normalization applied to the output. |
| pos_enc_at_input | bool | Whether to apply positional encoding at the input. |
| batch_first | bool | Whether the input tensors are in batch-first format. |
Methods:

| Name | Description |
|---|---|
| forward | Processes input tensors through the attention layers. |
Examples:
>>> d_model = 256
>>> layer = MemoryAttentionLayer(d_model)
>>> attention = MemoryAttention(d_model, pos_enc_at_input=True, layer=layer, num_layers=3)
>>> curr = torch.randn(10, 32, d_model) # (seq_len, batch_size, d_model)
>>> memory = torch.randn(20, 32, d_model) # (mem_len, batch_size, d_model)
>>> curr_pos = torch.randn(10, 32, d_model)
>>> memory_pos = torch.randn(20, 32, d_model)
>>> output = attention(curr, memory, curr_pos, memory_pos)
>>> print(output.shape)
torch.Size([10, 32, 256])
forward
forward(
curr: torch.Tensor,
memory: torch.Tensor,
curr_pos: Optional[Tensor] = None,
memory_pos: Optional[Tensor] = None,
num_obj_ptr_tokens: int = 0,
)
Processes input tensors through multiple attention layers, applying self and cross-attention mechanisms.
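At a high level, the forward pass optionally adds the positional encoding to the input tokens, transposes layouts when batch_first is set, runs the stacked MemoryAttentionLayer blocks, and applies the final LayerNorm. Below is a hedged sketch of that control flow using the attribute names from the table above; details such as the 0.1 scaling of the input positional encoding and the num_obj_ptr_tokens handling reflect one reading of the source and may not match it exactly.

```python
# Rough sketch of the MemoryAttention forward loop (simplified; consult the linked source for exact behavior).
from typing import Optional

import torch


def memory_attention_forward(
    self,  # a MemoryAttention-like module exposing layers, norm, pos_enc_at_input, batch_first
    curr: torch.Tensor,
    memory: torch.Tensor,
    curr_pos: Optional[torch.Tensor] = None,
    memory_pos: Optional[torch.Tensor] = None,
) -> torch.Tensor:
    output = curr
    # Optionally fold a scaled positional encoding into the input tokens.
    if self.pos_enc_at_input and curr_pos is not None:
        output = output + 0.1 * curr_pos
    # The stacked layers work batch-first; transpose the sequence-first inputs when batch_first is set.
    if self.batch_first:
        output = output.transpose(0, 1)
        memory = memory.transpose(0, 1)
        curr_pos = curr_pos.transpose(0, 1) if curr_pos is not None else None
        memory_pos = memory_pos.transpose(0, 1) if memory_pos is not None else None
    # Each layer performs self-attention, cross-attention to the memory, and a feedforward block.
    for layer in self.layers:
        output = layer(tgt=output, memory=memory, pos=memory_pos, query_pos=curr_pos)
    output = self.norm(output)
    # Transpose back to the caller's original layout.
    return output.transpose(0, 1) if self.batch_first else output
```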