Reference for ultralytics/nn/modules/block.py

Note

This file is available at https://github.com/ultralytics/ultralytics/blob/main/ultralytics/nn/modules/block.py. If you spot a problem, please help fix it by contributing a Pull Request 🛠️. Thank you 🙏!



ultralytics.nn.modules.block.DFL

Bases: Module

Integral module of Distribution Focal Loss (DFL).

Proposed in Generalized Focal Loss https://ieeexplore.ieee.org/document/9792391

Source code in ultralytics/nn/modules/block.py
class DFL(nn.Module):
    """
    Integral module of Distribution Focal Loss (DFL).

    Proposed in Generalized Focal Loss https://ieeexplore.ieee.org/document/9792391
    """

    def __init__(self, c1=16):
        """Initialize a convolutional layer with a given number of input channels."""
        super().__init__()
        self.conv = nn.Conv2d(c1, 1, 1, bias=False).requires_grad_(False)
        x = torch.arange(c1, dtype=torch.float)
        self.conv.weight.data[:] = nn.Parameter(x.view(1, c1, 1, 1))
        self.c1 = c1

    def forward(self, x):
        """Apply softmax over the c1 bins of each box side and return the expected bin index per side."""
        b, _, a = x.shape  # batch, channels, anchors
        return self.conv(x.view(b, 4, self.c1, a).transpose(2, 1).softmax(1)).view(b, 4, a)

__init__(c1=16)

Initialize a convolutional layer with a given number of input channels. The 1x1 convolution is frozen and its weights are set to the bin indices 0 to c1 - 1.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1=16):
    """Initialize a convolutional layer with a given number of input channels."""
    super().__init__()
    self.conv = nn.Conv2d(c1, 1, 1, bias=False).requires_grad_(False)
    x = torch.arange(c1, dtype=torch.float)
    self.conv.weight.data[:] = nn.Parameter(x.view(1, c1, 1, 1))
    self.c1 = c1

forward(x)

Applies a softmax over the c1 bins of each box side and returns the expected bin index per side, mapping a (b, 4 * c1, a) tensor to (b, 4, a).

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Apply softmax over the c1 bins of each box side and return the expected bin index per side."""
    b, _, a = x.shape  # batch, channels, anchors
    return self.conv(x.view(b, 4, self.c1, a).transpose(2, 1).softmax(1)).view(b, 4, a)
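
The decoding step in forward() can be sketched without PyTorch. The snippet below is a minimal illustration, not the library's implementation: the helper name dfl_expectation is hypothetical, and it handles a single box side, whereas the module processes all four sides of every anchor at once. It applies a softmax to the bin logits and returns the expected bin index, which is what a frozen 1x1 convolution with weights 0 to c1 - 1 computes.

```python
import math


def dfl_expectation(logits):
    """Return the softmax-weighted mean bin index for one box side (hypothetical helper)."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    # Weighted sum of bin indices 0..len(logits)-1, mirroring the arange() conv weights.
    return sum(i * e / total for i, e in enumerate(exps))


print(dfl_expectation([0.0] * 16))  # uniform distribution over 16 bins -> 7.5
```

A sharply peaked distribution returns a value close to the index of its peak, so the output behaves like a continuous offset rather than a class label.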



ultralytics.nn.modules.block.Proto

Bases: Module

YOLOv8 mask Proto module for segmentation models.

Source code in ultralytics/nn/modules/block.py
class Proto(nn.Module):
    """YOLOv8 mask Proto module for segmentation models."""

    def __init__(self, c1, c_=256, c2=32):
        """
        Initializes the YOLOv8 mask Proto module with specified number of protos and masks.

        Input arguments are ch_in, number of protos, number of masks.
        """
        super().__init__()
        self.cv1 = Conv(c1, c_, k=3)
        self.upsample = nn.ConvTranspose2d(c_, c_, 2, 2, 0, bias=True)  # nn.Upsample(scale_factor=2, mode='nearest')
        self.cv2 = Conv(c_, c_, k=3)
        self.cv3 = Conv(c_, c2)

    def forward(self, x):
        """Performs a forward pass through layers using an upsampled input image."""
        return self.cv3(self.cv2(self.upsample(self.cv1(x))))

__init__(c1, c_=256, c2=32)

Initializes the YOLOv8 mask Proto module with the specified number of protos and masks.

Input arguments are ch_in, number of protos, number of masks.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c_=256, c2=32):
    """
    Initializes the YOLOv8 mask Proto module with specified number of protos and masks.

    Input arguments are ch_in, number of protos, number of masks.
    """
    super().__init__()
    self.cv1 = Conv(c1, c_, k=3)
    self.upsample = nn.ConvTranspose2d(c_, c_, 2, 2, 0, bias=True)  # nn.Upsample(scale_factor=2, mode='nearest')
    self.cv2 = Conv(c_, c_, k=3)
    self.cv3 = Conv(c_, c2)

forward(x)

Performs a forward pass through the layers, upsampling the input feature map by a factor of 2.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Performs a forward pass through layers using an upsampled input image."""
    return self.cv3(self.cv2(self.upsample(self.cv1(x))))
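
The effect of the transposed convolution on the output size can be checked with a little arithmetic. This is a sketch under stated assumptions: the helper names are hypothetical, and cv1, cv2 and cv3 are assumed to preserve height and width (stride 1 with "same" padding), so only the upsample layer changes the spatial size. The formula is the standard output-size rule for a transposed convolution with dilation 1 and no output padding.

```python
def conv_transpose_out(size, kernel=2, stride=2, padding=0):
    """Output length of a transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * padding + (kernel - 1) + 1


def proto_output_shape(h, w, c2=32):
    """Hypothetical helper: (channels, height, width) produced by Proto for an h x w input."""
    # cv1, cv2 and cv3 are assumed to preserve height and width; only upsample changes them.
    return (c2, conv_transpose_out(h), conv_transpose_out(w))


print(proto_output_shape(80, 80))  # (32, 160, 160)
```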



ultralytics.nn.modules.block.HGStem

Bases: Module

StemBlock of PPHGNetV2 with 5 convolutions and one maxpool2d.

https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/backbones/hgnet_v2.py

Source code in ultralytics/nn/modules/block.py
class HGStem(nn.Module):
    """
    StemBlock of PPHGNetV2 with 5 convolutions and one maxpool2d.

    https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/backbones/hgnet_v2.py
    """

    def __init__(self, c1, cm, c2):
        """Initialize the stem block with input channels c1, middle channels cm, and output channels c2."""
        super().__init__()
        self.stem1 = Conv(c1, cm, 3, 2, act=nn.ReLU())
        self.stem2a = Conv(cm, cm // 2, 2, 1, 0, act=nn.ReLU())
        self.stem2b = Conv(cm // 2, cm, 2, 1, 0, act=nn.ReLU())
        self.stem3 = Conv(cm * 2, cm, 3, 2, act=nn.ReLU())
        self.stem4 = Conv(cm, c2, 1, 1, act=nn.ReLU())
        self.pool = nn.MaxPool2d(kernel_size=2, stride=1, padding=0, ceil_mode=True)

    def forward(self, x):
        """Forward pass of a PPHGNetV2 backbone layer."""
        x = self.stem1(x)
        x = F.pad(x, [0, 1, 0, 1])
        x2 = self.stem2a(x)
        x2 = F.pad(x2, [0, 1, 0, 1])
        x2 = self.stem2b(x2)
        x1 = self.pool(x)
        x = torch.cat([x1, x2], dim=1)
        x = self.stem3(x)
        x = self.stem4(x)
        return x

__init__(c1, cm, c2)

Initializes the stem block with input channels c1, middle channels cm, and output channels c2.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, cm, c2):
    """Initialize the stem block with input channels c1, middle channels cm, and output channels c2."""
    super().__init__()
    self.stem1 = Conv(c1, cm, 3, 2, act=nn.ReLU())
    self.stem2a = Conv(cm, cm // 2, 2, 1, 0, act=nn.ReLU())
    self.stem2b = Conv(cm // 2, cm, 2, 1, 0, act=nn.ReLU())
    self.stem3 = Conv(cm * 2, cm, 3, 2, act=nn.ReLU())
    self.stem4 = Conv(cm, c2, 1, 1, act=nn.ReLU())
    self.pool = nn.MaxPool2d(kernel_size=2, stride=1, padding=0, ceil_mode=True)

forward(x)

Forward pass of a PPHGNetV2 backbone layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass of a PPHGNetV2 backbone layer."""
    x = self.stem1(x)
    x = F.pad(x, [0, 1, 0, 1])
    x2 = self.stem2a(x)
    x2 = F.pad(x2, [0, 1, 0, 1])
    x2 = self.stem2b(x2)
    x1 = self.pool(x)
    x = torch.cat([x1, x2], dim=1)
    x = self.stem3(x)
    x = self.stem4(x)
    return x
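
The padding and pooling steps in forward() are easier to follow when one spatial dimension is traced by hand. The sketch below is illustrative only: the helper names are hypothetical, and the 3x3 convolutions are assumed to use padding 1. With stride 1, the pool's ceil_mode does not change the result, so the same floor-based formula is used for it.

```python
def conv_out(size, k, s, p):
    """Output length of a convolution or pooling window (floor mode)."""
    return (size + 2 * p - k) // s + 1


def hgstem_spatial(size):
    """Hypothetical helper: trace one spatial dimension through HGStem.forward()."""
    size = conv_out(size, 3, 2, 1)  # stem1: 3x3, stride 2 (padding 1 assumed)
    padded = size + 1  # F.pad adds one row/column on the right/bottom
    branch = conv_out(padded, 2, 1, 0)  # stem2a: 2x2, stride 1, no padding
    branch = conv_out(branch + 1, 2, 1, 0)  # pad again, then stem2b
    pooled = conv_out(padded, 2, 1, 0)  # pool: 2x2, stride 1
    assert branch == pooled  # both branches must match before torch.cat
    size = conv_out(branch, 3, 2, 1)  # stem3: 3x3, stride 2
    return conv_out(size, 1, 1, 0)  # stem4: 1x1 keeps the size


print(hgstem_spatial(640))  # 160
```

Under these assumptions the stem reduces each spatial dimension by a factor of 4, and the two branches concatenated before stem3 carry cm channels each, which matches the cm * 2 input of stem3.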



ultralytics.nn.modules.block.HGBlock

Bases: Module

HG_Block of PPHGNetV2 with 2 convolutions and LightConv.

https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/backbones/hgnet_v2.py

Source code in ultralytics/nn/modules/block.py
class HGBlock(nn.Module):
    """
    HG_Block of PPHGNetV2 with 2 convolutions and LightConv.

    https://github.com/PaddlePaddle/PaddleDetection/blob/develop/ppdet/modeling/backbones/hgnet_v2.py
    """

    def __init__(self, c1, cm, c2, k=3, n=6, lightconv=False, shortcut=False, act=nn.ReLU()):
        """Initialize the HGBlock with n Conv or LightConv layers followed by squeeze and excitation convolutions."""
        super().__init__()
        block = LightConv if lightconv else Conv
        self.m = nn.ModuleList(block(c1 if i == 0 else cm, cm, k=k, act=act) for i in range(n))
        self.sc = Conv(c1 + n * cm, c2 // 2, 1, 1, act=act)  # squeeze conv
        self.ec = Conv(c2 // 2, c2, 1, 1, act=act)  # excitation conv
        self.add = shortcut and c1 == c2

    def forward(self, x):
        """Forward pass of a PPHGNetV2 backbone layer."""
        y = [x]
        y.extend(m(y[-1]) for m in self.m)
        y = self.ec(self.sc(torch.cat(y, 1)))
        return y + x if self.add else y

__init__(c1, cm, c2, k=3, n=6, lightconv=False, shortcut=False, act=nn.ReLU())

Initializes the HGBlock with n Conv or LightConv layers followed by squeeze and excitation convolutions.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, cm, c2, k=3, n=6, lightconv=False, shortcut=False, act=nn.ReLU()):
    """Initialize the HGBlock with n Conv or LightConv layers followed by squeeze and excitation convolutions."""
    super().__init__()
    block = LightConv if lightconv else Conv
    self.m = nn.ModuleList(block(c1 if i == 0 else cm, cm, k=k, act=act) for i in range(n))
    self.sc = Conv(c1 + n * cm, c2 // 2, 1, 1, act=act)  # squeeze conv
    self.ec = Conv(c2 // 2, c2, 1, 1, act=act)  # excitation conv
    self.add = shortcut and c1 == c2

forward(x)

Forward pass of a PPHGNetV2 backbone layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass of a PPHGNetV2 backbone layer."""
    y = [x]
    y.extend(m(y[-1]) for m in self.m)
    y = self.ec(self.sc(torch.cat(y, 1)))
    return y + x if self.add else y
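
The channel bookkeeping behind the squeeze convolution can be written out explicitly. This is a minimal sketch with a hypothetical helper name; it only reproduces the arithmetic visible in __init__ and forward(), where the input and the output of each of the n stacked layers are concatenated before the 1x1 squeeze and excitation convolutions.

```python
def hgblock_channels(c1, cm, c2, n=6, shortcut=False):
    """Hypothetical helper: channel counts at each stage of HGBlock.forward()."""
    concat = c1 + n * cm  # input plus the output of each of the n stacked convolutions
    return {
        "concat": concat,
        "squeeze": c2 // 2,
        "excite": c2,
        "residual": shortcut and c1 == c2,
    }


print(hgblock_channels(48, 48, 128))
```

The residual entry shows why shortcut=True has no effect unless c1 equals c2: the addition y + x needs matching channel counts.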



ultralytics.nn.modules.block.SPP

Bases: Module

Spatial Pyramid Pooling (SPP) layer https://arxiv.org/abs/1406.4729.

Source code in ultralytics/nn/modules/block.py
class SPP(nn.Module):
    """Spatial Pyramid Pooling (SPP) layer https://arxiv.org/abs/1406.4729."""

    def __init__(self, c1, c2, k=(5, 9, 13)):
        """Initialize the SPP layer with input/output channels and pooling kernel sizes."""
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
        self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])

    def forward(self, x):
        """Forward pass of the SPP layer, performing spatial pyramid pooling."""
        x = self.cv1(x)
        return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))

__init__(c1, c2, k=(5, 9, 13))

Initialize the SPP layer with input/output channels and pooling kernel sizes.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, k=(5, 9, 13)):
    """Initialize the SPP layer with input/output channels and pooling kernel sizes."""
    super().__init__()
    c_ = c1 // 2  # hidden channels
    self.cv1 = Conv(c1, c_, 1, 1)
    self.cv2 = Conv(c_ * (len(k) + 1), c2, 1, 1)
    self.m = nn.ModuleList([nn.MaxPool2d(kernel_size=x, stride=1, padding=x // 2) for x in k])

forward(x)

Forward pass of the SPP layer, performing spatial pyramid pooling.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass of the SPP layer, performing spatial pyramid pooling."""
    x = self.cv1(x)
    return self.cv2(torch.cat([x] + [m(x) for m in self.m], 1))
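
The concatenation in forward() only works because each pooling layer keeps the spatial size unchanged. The sketch below checks that arithmetic; the helper names are hypothetical, and it models shapes only, not the pooling itself.

```python
def pool_out(size, k):
    """Output length of MaxPool2d(kernel_size=k, stride=1, padding=k // 2)."""
    return size + 2 * (k // 2) - k + 1


def spp_concat_shape(c1, h, w, k=(5, 9, 13)):
    """Hypothetical helper: shape of the tensor that SPP passes to cv2."""
    c_ = c1 // 2  # hidden channels after cv1
    # Every pooled map must keep the input size, otherwise torch.cat would fail.
    assert all(pool_out(h, x) == h and pool_out(w, x) == w for x in k)
    return (c_ * (len(k) + 1), h, w)


print(spp_concat_shape(512, 20, 20))  # (1024, 20, 20)
```

The size is preserved only for odd kernel sizes; with an even k, padding of k // 2 makes the output one element larger than the input.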



ultralytics.nn.modules.block.SPPF

Bases: Module

Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher.

Source code in ultralytics/nn/modules/block.py
class SPPF(nn.Module):
    """Spatial Pyramid Pooling - Fast (SPPF) layer for YOLOv5 by Glenn Jocher."""

    def __init__(self, c1, c2, k=5):
        """
        Initializes the SPPF layer with given input/output channels and kernel size.

        This module is equivalent to SPP(k=(5, 9, 13)).
        """
        super().__init__()
        c_ = c1 // 2  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c_ * 4, c2, 1, 1)
        self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        """Forward pass through the SPPF layer, applying the same max pooling three times in sequence."""
        y = [self.cv1(x)]
        y.extend(self.m(y[-1]) for _ in range(3))
        return self.cv2(torch.cat(y, 1))

__init__(c1, c2, k=5)

Initializes the SPPF layer with given input/output channels and kernel size.

With the default k=5, this module is equivalent to SPP(k=(5, 9, 13)).

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, k=5):
    """
    Initializes the SPPF layer with given input/output channels and kernel size.

    This module is equivalent to SPP(k=(5, 9, 13)).
    """
    super().__init__()
    c_ = c1 // 2  # hidden channels
    self.cv1 = Conv(c1, c_, 1, 1)
    self.cv2 = Conv(c_ * 4, c2, 1, 1)
    self.m = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

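forward(x)

Forward pass through the SPPF layer, applying the same max pooling three times in sequence and concatenating the intermediate results.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through the SPPF layer, applying the same max pooling three times in sequence."""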
    y = [self.cv1(x)]
    y.extend(self.m(y[-1]) for _ in range(3))
    return self.cv2(torch.cat(y, 1))
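
The stated equivalence with SPP(k=(5, 9, 13)) rests on the fact that chaining stride-1 max pools widens the window: two 5-wide pools cover 9 elements and three cover 13. The following one-dimensional sketch demonstrates this in plain Python. The helper max_pool_1d is hypothetical and handles borders by ignoring out-of-range positions, which corresponds to padding with negative infinity.

```python
import random


def max_pool_1d(values, k):
    """Stride-1 max pooling with k // 2 padding; out-of-range positions are ignored."""
    pad = k // 2
    n = len(values)
    return [max(values[max(0, i - pad):min(n, i + pad + 1)]) for i in range(n)]


random.seed(0)
x = [random.random() for _ in range(32)]

once = max_pool_1d(x, 5)
twice = max_pool_1d(once, 5)
thrice = max_pool_1d(twice, 5)

print(twice == max_pool_1d(x, 9))    # True
print(thrice == max_pool_1d(x, 13))  # True
```

Because a 2-D max pool over a square window can be computed as a row-wise pool followed by a column-wise pool, the same argument carries over to feature maps. Reusing one small pooling layer three times is what makes SPPF cheaper than three separate large-kernel pools.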



ultralytics.nn.modules.block.C1

Bases: Module

CSP Bottleneck with 1 convolution.

Source code in ultralytics/nn/modules/block.py
class C1(nn.Module):
    """CSP Bottleneck with 1 convolution."""

    def __init__(self, c1, c2, n=1):
        """Initializes the CSP Bottleneck with configurations for 1 convolution with arguments ch_in, ch_out, number."""
        super().__init__()
        self.cv1 = Conv(c1, c2, 1, 1)
        self.m = nn.Sequential(*(Conv(c2, c2, 3) for _ in range(n)))

    def forward(self, x):
        """Apply the 1x1 convolution, then add the output of the n stacked 3x3 convolutions as a residual."""
        y = self.cv1(x)
        return self.m(y) + y

__init__(c1, c2, n=1)

Initializes the CSP Bottleneck with configurations for 1 convolution with arguments ch_in, ch_out, number.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1):
    """Initializes the CSP Bottleneck with configurations for 1 convolution with arguments ch_in, ch_out, number."""
    super().__init__()
    self.cv1 = Conv(c1, c2, 1, 1)
    self.m = nn.Sequential(*(Conv(c2, c2, 3) for _ in range(n)))

forward(x)

Applies the 1x1 convolution, then adds the output of the n stacked 3x3 convolutions as a residual.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Apply the 1x1 convolution, then add the output of the n stacked 3x3 convolutions as a residual."""
    y = self.cv1(x)
    return self.m(y) + y



ultralytics.nn.modules.block.C2

Bases: Module

CSP Bottleneck with 2 convolutions.

Source code in ultralytics/nn/modules/block.py
class C2(nn.Module):
    """CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        """Initializes the CSP Bottleneck with 2 convolutions module with arguments ch_in, ch_out, number, shortcut,
        groups, expansion.
        """
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv(2 * self.c, c2, 1)  # optional act=FReLU(c2)
        # self.attention = ChannelAttention(2 * self.c)  # or SpatialAttention()
        self.m = nn.Sequential(*(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n)))

    def forward(self, x):
        """Forward pass through the CSP bottleneck with 2 convolutions."""
        a, b = self.cv1(x).chunk(2, 1)
        return self.cv2(torch.cat((self.m(a), b), 1))

__init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)

Initializes the CSP Bottleneck with 2 convolutions module with arguments ch_in, ch_out, number, shortcut, groups, expansion.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
    """Initializes the CSP Bottleneck with 2 convolutions module with arguments ch_in, ch_out, number, shortcut,
    groups, expansion.
    """
    super().__init__()
    self.c = int(c2 * e)  # hidden channels
    self.cv1 = Conv(c1, 2 * self.c, 1, 1)
    self.cv2 = Conv(2 * self.c, c2, 1)  # optional act=FReLU(c2)
    # self.attention = ChannelAttention(2 * self.c)  # or SpatialAttention()
    self.m = nn.Sequential(*(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n)))

forward(x)

Forward pass through the CSP bottleneck with 2 convolutions.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through the CSP bottleneck with 2 convolutions."""
    a, b = self.cv1(x).chunk(2, 1)
    return self.cv2(torch.cat((self.m(a), b), 1))



ultralytics.nn.modules.block.C2f

Bases: Module

Faster Implementation of CSP Bottleneck with 2 convolutions.

Source code in ultralytics/nn/modules/block.py
class C2f(nn.Module):
    """Faster Implementation of CSP Bottleneck with 2 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
        """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
        expansion.
        """
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

    def forward(self, x):
        """Forward pass through C2f layer."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        return self.cv2(torch.cat(y, 1))

__init__(c1, c2, n=1, shortcut=False, g=1, e=0.5)

Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups, expansion.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, shortcut=False, g=1, e=0.5):
    """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
    expansion.
    """
    super().__init__()
    self.c = int(c2 * e)  # hidden channels
    self.cv1 = Conv(c1, 2 * self.c, 1, 1)
    self.cv2 = Conv((2 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
    self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))

forward(x)

Forward pass through C2f layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through C2f layer."""
    y = list(self.cv1(x).chunk(2, 1))
    y.extend(m(y[-1]) for m in self.m)
    return self.cv2(torch.cat(y, 1))

forward_split(x)

Forward pass using split() instead of chunk().

Source code in ultralytics/nn/modules/block.py
def forward_split(self, x):
    """Forward pass using split() instead of chunk()."""
    y = list(self.cv1(x).split((self.c, self.c), 1))
    y.extend(m(y[-1]) for m in self.m)
    return self.cv2(torch.cat(y, 1))
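
The main structural difference between C2f and C2 is how many tensors reach the final convolution: C2f keeps both chunks and the output of every Bottleneck, while C2 concatenates only two tensors. The sketch below reproduces that channel arithmetic from the constructors; the helper names are hypothetical.

```python
def c2f_channels(c2, n=1, e=0.5):
    """Hypothetical helper: channel counts inside C2f for a given output width."""
    c = int(c2 * e)  # hidden channels per chunk
    return {
        "cv1_out": 2 * c,  # split into two chunks of c channels
        "cv2_in": (2 + n) * c,  # two chunks plus one output per Bottleneck
    }


def c2_cv2_in(c2, e=0.5):
    """For comparison: C2 concatenates only two tensors, regardless of n."""
    return 2 * int(c2 * e)


print(c2f_channels(128, n=3))  # {'cv1_out': 128, 'cv2_in': 320}
print(c2_cv2_in(128))          # 128
```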



ultralytics.nn.modules.block.C3

Bases: Module

CSP Bottleneck with 3 convolutions.

Source code in ultralytics/nn/modules/block.py
class C3(nn.Module):
    """CSP Bottleneck with 3 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        """Initialize the CSP Bottleneck with given channels, number, shortcut, groups, and expansion values."""
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = Conv(c1, c_, 1, 1)
        self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, k=((1, 1), (3, 3)), e=1.0) for _ in range(n)))

    def forward(self, x):
        """Forward pass through the CSP bottleneck with 3 convolutions."""
        return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))

__init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)

Initialize the CSP Bottleneck with given channels, number, shortcut, groups, and expansion values.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
    """Initialize the CSP Bottleneck with given channels, number, shortcut, groups, and expansion values."""
    super().__init__()
    c_ = int(c2 * e)  # hidden channels
    self.cv1 = Conv(c1, c_, 1, 1)
    self.cv2 = Conv(c1, c_, 1, 1)
    self.cv3 = Conv(2 * c_, c2, 1)  # optional act=FReLU(c2)
    self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, k=((1, 1), (3, 3)), e=1.0) for _ in range(n)))

forward(x)

Forward pass through the CSP bottleneck with 3 convolutions.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through the CSP bottleneck with 3 convolutions."""
    return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), 1))



ultralytics.nn.modules.block.C3x

Bases: C3

C3 module with cross-convolutions.

Source code in ultralytics/nn/modules/block.py
class C3x(C3):
    """C3 module with cross-convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        """Initialize the C3x module, replacing the C3 bottlenecks with cross-convolution bottlenecks."""
        super().__init__(c1, c2, n, shortcut, g, e)
        self.c_ = int(c2 * e)
        self.m = nn.Sequential(*(Bottleneck(self.c_, self.c_, shortcut, g, k=((1, 3), (3, 1)), e=1) for _ in range(n)))

__init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)

Initializes the C3x module, replacing the C3 bottlenecks with cross-convolution (1x3 followed by 3x1) bottlenecks.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
    """Initialize the C3x module, replacing the C3 bottlenecks with cross-convolution bottlenecks."""
    super().__init__(c1, c2, n, shortcut, g, e)
    self.c_ = int(c2 * e)
    self.m = nn.Sequential(*(Bottleneck(self.c_, self.c_, shortcut, g, k=((1, 3), (3, 1)), e=1) for _ in range(n)))
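
One way to see what the cross-convolution kernels change is to count convolution weights per Bottleneck. The sketch below is a rough comparison under stated assumptions: the helper name is hypothetical, BatchNorm parameters are ignored, and only the second convolution is grouped, as in the Bottleneck constructor.

```python
def bottleneck_conv_weights(c, k, g=1):
    """Hypothetical helper: conv weight count of Bottleneck(c, c, k=k, e=1.0), BatchNorm excluded."""
    (h1, w1), (h2, w2) = k
    cv1 = c * c * h1 * w1  # cv1 is never grouped
    cv2 = c * (c // g) * h2 * w2  # cv2 uses g groups
    return cv1 + cv2


c = 64
print(bottleneck_conv_weights(c, ((1, 1), (3, 3))))  # C3 kernels:  40960
print(bottleneck_conv_weights(c, ((1, 3), (3, 1))))  # C3x kernels: 24576
```

For the same hidden width, the 1x3 and 3x1 pair uses fewer weights than the 1x1 and 3x3 pair while still covering a 3x3 receptive field.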



ultralytics.nn.modules.block.RepC3

Bases: Module

Rep C3.

Source code in ultralytics/nn/modules/block.py
class RepC3(nn.Module):
    """Rep C3."""

    def __init__(self, c1, c2, n=3, e=1.0):
        """Initialize RepC3 with input channels, output channels, number of RepConv blocks, and expansion."""
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c2, 1, 1)
        self.cv2 = Conv(c1, c2, 1, 1)
        self.m = nn.Sequential(*[RepConv(c_, c_) for _ in range(n)])
        self.cv3 = Conv(c_, c2, 1, 1) if c_ != c2 else nn.Identity()

    def forward(self, x):
        """Forward pass of RT-DETR neck layer."""
        return self.cv3(self.m(self.cv1(x)) + self.cv2(x))

__init__(c1, c2, n=3, e=1.0)

Initializes RepC3 with input channels, output channels, number of RepConv blocks, and expansion.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=3, e=1.0):
    """Initialize RepC3 with input channels, output channels, number of RepConv blocks, and expansion."""
    super().__init__()
    c_ = int(c2 * e)  # hidden channels
    self.cv1 = Conv(c1, c2, 1, 1)
    self.cv2 = Conv(c1, c2, 1, 1)
    self.m = nn.Sequential(*[RepConv(c_, c_) for _ in range(n)])
    self.cv3 = Conv(c_, c2, 1, 1) if c_ != c2 else nn.Identity()

forward(x)

Forward pass of RT-DETR neck layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass of RT-DETR neck layer."""
    return self.cv3(self.m(self.cv1(x)) + self.cv2(x))



ultralytics.nn.modules.block.C3TR

Bases: C3

C3 module with TransformerBlock().

Source code in ultralytics/nn/modules/block.py
class C3TR(C3):
    """C3 module with TransformerBlock()."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        """Initialize C3TR module with TransformerBlock()."""
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)
        self.m = TransformerBlock(c_, c_, 4, n)

__init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)

Initializes the C3TR module with TransformerBlock().

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
    """Initialize C3TR module with TransformerBlock()."""
    super().__init__(c1, c2, n, shortcut, g, e)
    c_ = int(c2 * e)
    self.m = TransformerBlock(c_, c_, 4, n)



ultralytics.nn.modules.block.C3Ghost

Bases: C3

C3 module with GhostBottleneck().

Source code in ultralytics/nn/modules/block.py
class C3Ghost(C3):
    """C3 module with GhostBottleneck()."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        """Initialize C3Ghost module with GhostBottleneck()."""
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(GhostBottleneck(c_, c_) for _ in range(n)))

__init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)

Initializes the C3Ghost module with GhostBottleneck().

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
    """Initialize C3Ghost module with GhostBottleneck()."""
    super().__init__(c1, c2, n, shortcut, g, e)
    c_ = int(c2 * e)  # hidden channels
    self.m = nn.Sequential(*(GhostBottleneck(c_, c_) for _ in range(n)))



ultralytics.nn.modules.block.GhostBottleneck

Bases: Module

Ghost Bottleneck https://github.com/huawei-noah/ghostnet.

Source code in ultralytics/nn/modules/block.py
class GhostBottleneck(nn.Module):
    """Ghost Bottleneck https://github.com/huawei-noah/ghostnet."""

    def __init__(self, c1, c2, k=3, s=1):
        """Initializes GhostBottleneck module with arguments ch_in, ch_out, kernel, stride."""
        super().__init__()
        c_ = c2 // 2
        self.conv = nn.Sequential(
            GhostConv(c1, c_, 1, 1),  # pw
            DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(),  # dw
            GhostConv(c_, c2, 1, 1, act=False),  # pw-linear
        )
        self.shortcut = (
            nn.Sequential(DWConv(c1, c1, k, s, act=False), Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity()
        )

    def forward(self, x):
        """Apply the ghost convolution branch and add the shortcut branch to its output."""
        return self.conv(x) + self.shortcut(x)

__init__(c1, c2, k=3, s=1)

Initializes GhostBottleneck module with arguments ch_in, ch_out, kernel, stride.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, k=3, s=1):
    """Initializes GhostBottleneck module with arguments ch_in, ch_out, kernel, stride."""
    super().__init__()
    c_ = c2 // 2
    self.conv = nn.Sequential(
        GhostConv(c1, c_, 1, 1),  # pw
        DWConv(c_, c_, k, s, act=False) if s == 2 else nn.Identity(),  # dw
        GhostConv(c_, c2, 1, 1, act=False),  # pw-linear
    )
    self.shortcut = (
        nn.Sequential(DWConv(c1, c1, k, s, act=False), Conv(c1, c2, 1, 1, act=False)) if s == 2 else nn.Identity()
    )

forward(x)

Applies the ghost convolution branch and adds the shortcut branch to its output.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Apply the ghost convolution branch and add the shortcut branch to its output."""
    return self.conv(x) + self.shortcut(x)



ultralytics.nn.modules.block.Bottleneck

Bases: Module

Standard bottleneck.

Source code in ultralytics/nn/modules/block.py
class Bottleneck(nn.Module):
    """Standard bottleneck."""

    def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):
        """Initializes a bottleneck module with given input/output channels, shortcut option, group, kernels, and
        expansion.
        """
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, k[0], 1)
        self.cv2 = Conv(c_, c2, k[1], 1, g=g)
        self.add = shortcut and c1 == c2

    def forward(self, x):
        """Apply both convolutions, adding the input as a residual when shortcut is enabled and c1 == c2."""
        return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))

__init__(c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5)

Initializes a bottleneck module with given input/output channels, shortcut option, group, kernels, and expansion.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):
    """Initializes a bottleneck module with given input/output channels, shortcut option, group, kernels, and
    expansion.
    """
    super().__init__()
    c_ = int(c2 * e)  # hidden channels
    self.cv1 = Conv(c1, c_, k[0], 1)
    self.cv2 = Conv(c_, c2, k[1], 1, g=g)
    self.add = shortcut and c1 == c2

forward(x)

Applies both convolutions, adding the input as a residual when the shortcut is enabled and c1 == c2.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Apply both convolutions, adding the input as a residual when shortcut is enabled and c1 == c2."""
    return x + self.cv2(self.cv1(x)) if self.add else self.cv2(self.cv1(x))
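
The condition stored in self.add decides whether forward() returns a residual sum or just the convolution output. The sketch below restates that logic together with the hidden width; the helper name bottleneck_plan is hypothetical and the function only mirrors the arithmetic in __init__.

```python
def bottleneck_plan(c1, c2, shortcut=True, e=0.5):
    """Hypothetical helper: hidden width and whether forward() adds the input back."""
    c_ = int(c2 * e)  # hidden channels between cv1 and cv2
    add = shortcut and c1 == c2  # a residual needs matching input/output channels
    return c_, add


print(bottleneck_plan(64, 64))   # (32, True)
print(bottleneck_plan(64, 128))  # (64, False)
```

Note that shortcut=True is silently ignored when c1 differs from c2, since the element-wise addition would not be defined.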



ultralytics.nn.modules.block.BottleneckCSP

Bases: Module

CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks.

Source code in ultralytics/nn/modules/block.py
class BottleneckCSP(nn.Module):
    """CSP Bottleneck https://github.com/WongKinYiu/CrossStagePartialNetworks."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        """Initializes the CSP Bottleneck given arguments for ch_in, ch_out, number, shortcut, groups, expansion."""
        super().__init__()
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, c_, 1, 1)
        self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
        self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
        self.cv4 = Conv(2 * c_, c2, 1, 1)
        self.bn = nn.BatchNorm2d(2 * c_)  # applied to cat(cv2, cv3)
        self.act = nn.SiLU()
        self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))

    def forward(self, x):
        """Forward pass through the CSP bottleneck."""
        y1 = self.cv3(self.m(self.cv1(x)))
        y2 = self.cv2(x)
        return self.cv4(self.act(self.bn(torch.cat((y1, y2), 1))))

__init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)

Initializes the CSP Bottleneck given arguments for ch_in, ch_out, number, shortcut, groups, expansion.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
    """Initializes the CSP Bottleneck given arguments for ch_in, ch_out, number, shortcut, groups, expansion."""
    super().__init__()
    c_ = int(c2 * e)  # hidden channels
    self.cv1 = Conv(c1, c_, 1, 1)
    self.cv2 = nn.Conv2d(c1, c_, 1, 1, bias=False)
    self.cv3 = nn.Conv2d(c_, c_, 1, 1, bias=False)
    self.cv4 = Conv(2 * c_, c2, 1, 1)
    self.bn = nn.BatchNorm2d(2 * c_)  # applied to cat(cv2, cv3)
    self.act = nn.SiLU()
    self.m = nn.Sequential(*(Bottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))

forward(x)

Forward pass through the CSP bottleneck.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through the CSP bottleneck."""
    y1 = self.cv3(self.m(self.cv1(x)))
    y2 = self.cv2(x)
    return self.cv4(self.act(self.bn(torch.cat((y1, y2), 1))))



ultralytics.nn.modules.block.ResNetBlock

Bases: Module

ResNet block with standard convolution layers.

Source code in ultralytics/nn/modules/block.py
class ResNetBlock(nn.Module):
    """ResNet block with standard convolution layers."""

    def __init__(self, c1, c2, s=1, e=4):
        """Initialize the ResNet block with input channels c1, base channels c2, stride s, and expansion e."""
        super().__init__()
        c3 = e * c2
        self.cv1 = Conv(c1, c2, k=1, s=1, act=True)
        self.cv2 = Conv(c2, c2, k=3, s=s, p=1, act=True)
        self.cv3 = Conv(c2, c3, k=1, act=False)
        self.shortcut = nn.Sequential(Conv(c1, c3, k=1, s=s, act=False)) if s != 1 or c1 != c3 else nn.Identity()

    def forward(self, x):
        """Forward pass through the ResNet block."""
        return F.relu(self.cv3(self.cv2(self.cv1(x))) + self.shortcut(x))

__init__(c1, c2, s=1, e=4)

Initializes the ResNet block with input channels c1, base channels c2, stride s, and expansion e.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, s=1, e=4):
    """Initialize the ResNet block with input channels c1, base channels c2, stride s, and expansion e."""
    super().__init__()
    c3 = e * c2
    self.cv1 = Conv(c1, c2, k=1, s=1, act=True)
    self.cv2 = Conv(c2, c2, k=3, s=s, p=1, act=True)
    self.cv3 = Conv(c2, c3, k=1, act=False)
    self.shortcut = nn.Sequential(Conv(c1, c3, k=1, s=s, act=False)) if s != 1 or c1 != c3 else nn.Identity()

forward(x)

Forward pass through the ResNet block.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through the ResNet block."""
    return F.relu(self.cv3(self.cv2(self.cv1(x))) + self.shortcut(x))
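
The shortcut rule in `__init__` can be checked without building any layers. The sketch below restates that condition in plain Python; the helper name `shortcut_kind` is hypothetical and not part of Ultralytics.

```python
def shortcut_kind(c1, c2, s=1, e=4):
    """Return which shortcut ResNetBlock would build, and the block's output channels."""
    c3 = e * c2  # output channels of cv3, as in ResNetBlock.__init__
    # A 1x1 projection is needed whenever the residual sum would otherwise add
    # tensors of different spatial size (s != 1) or channel count (c1 != c3).
    kind = "projection" if s != 1 or c1 != c3 else "identity"
    return kind, c3


print(shortcut_kind(64, 64))        # channel count changes
print(shortcut_kind(256, 64))       # shapes already match
print(shortcut_kind(256, 64, s=2))  # spatial size changes
```

Only the middle case keeps `nn.Identity()`, because the input already has `e * c2` channels and the stride is 1.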



ultralytics.nn.modules.block.ResNetLayer

Bases: Module

ResNet layer with multiple ResNet blocks.

Source code in ultralytics/nn/modules/block.py
class ResNetLayer(nn.Module):
    """ResNet layer with multiple ResNet blocks."""

    def __init__(self, c1, c2, s=1, is_first=False, n=1, e=4):
        """Initializes the ResNetLayer given arguments."""
        super().__init__()
        self.is_first = is_first

        if self.is_first:
            self.layer = nn.Sequential(
                Conv(c1, c2, k=7, s=2, p=3, act=True), nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
            )
        else:
            blocks = [ResNetBlock(c1, c2, s, e=e)]
            blocks.extend([ResNetBlock(e * c2, c2, 1, e=e) for _ in range(n - 1)])
            self.layer = nn.Sequential(*blocks)

    def forward(self, x):
        """Forward pass through the ResNet layer."""
        return self.layer(x)

__init__(c1, c2, s=1, is_first=False, n=1, e=4)

Initializes the ResNetLayer given arguments.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, s=1, is_first=False, n=1, e=4):
    """Initializes the ResNetLayer given arguments."""
    super().__init__()
    self.is_first = is_first

    if self.is_first:
        self.layer = nn.Sequential(
            Conv(c1, c2, k=7, s=2, p=3, act=True), nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        )
    else:
        blocks = [ResNetBlock(c1, c2, s, e=e)]
        blocks.extend([ResNetBlock(e * c2, c2, 1, e=e) for _ in range(n - 1)])
        self.layer = nn.Sequential(*blocks)

forward(x)

Forward pass through the ResNet layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through the ResNet layer."""
    return self.layer(x)
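
When `is_first` is false, the layer chains `n` blocks. The first block may change channels and stride, and the rest keep the shape. The following sketch lists the channel plan implied by the constructor; `resnet_layer_plan` is a hypothetical helper written for illustration only.

```python
def resnet_layer_plan(c1, c2, s=1, n=1, e=4):
    """List (in_channels, out_channels, stride) per ResNetBlock in the non-first branch."""
    plan = [(c1, e * c2, s)]  # first block: ResNetBlock(c1, c2, s, e=e)
    # Remaining blocks: ResNetBlock(e * c2, c2, 1, e=e), which preserve the shape.
    plan += [(e * c2, e * c2, 1) for _ in range(n - 1)]
    return plan


for row in resnet_layer_plan(64, 64, s=1, n=3):
    print(row)
```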



ultralytics.nn.modules.block.MaxSigmoidAttnBlock

Bases: Module

Max Sigmoid attention block.

Source code in ultralytics/nn/modules/block.py
class MaxSigmoidAttnBlock(nn.Module):
    """Max Sigmoid attention block."""

    def __init__(self, c1, c2, nh=1, ec=128, gc=512, scale=False):
        """Initializes MaxSigmoidAttnBlock with specified arguments."""
        super().__init__()
        self.nh = nh
        self.hc = c2 // nh
        self.ec = Conv(c1, ec, k=1, act=False) if c1 != ec else None
        self.gl = nn.Linear(gc, ec)
        self.bias = nn.Parameter(torch.zeros(nh))
        self.proj_conv = Conv(c1, c2, k=3, s=1, act=False)
        self.scale = nn.Parameter(torch.ones(1, nh, 1, 1)) if scale else 1.0

    def forward(self, x, guide):
        """Forward process."""
        bs, _, h, w = x.shape

        guide = self.gl(guide)
        guide = guide.view(bs, -1, self.nh, self.hc)
        embed = self.ec(x) if self.ec is not None else x
        embed = embed.view(bs, self.nh, self.hc, h, w)

        aw = torch.einsum("bmchw,bnmc->bmhwn", embed, guide)
        aw = aw.max(dim=-1)[0]
        aw = aw / (self.hc**0.5)
        aw = aw + self.bias[None, :, None, None]
        aw = aw.sigmoid() * self.scale

        x = self.proj_conv(x)
        x = x.view(bs, self.nh, -1, h, w)
        x = x * aw.unsqueeze(2)
        return x.view(bs, -1, h, w)

__init__(c1, c2, nh=1, ec=128, gc=512, scale=False)

Initializes MaxSigmoidAttnBlock with specified arguments.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, nh=1, ec=128, gc=512, scale=False):
    """Initializes MaxSigmoidAttnBlock with specified arguments."""
    super().__init__()
    self.nh = nh
    self.hc = c2 // nh
    self.ec = Conv(c1, ec, k=1, act=False) if c1 != ec else None
    self.gl = nn.Linear(gc, ec)
    self.bias = nn.Parameter(torch.zeros(nh))
    self.proj_conv = Conv(c1, c2, k=3, s=1, act=False)
    self.scale = nn.Parameter(torch.ones(1, nh, 1, 1)) if scale else 1.0

forward(x, guide)

Forward process.

Source code in ultralytics/nn/modules/block.py
def forward(self, x, guide):
    """Forward process."""
    bs, _, h, w = x.shape

    guide = self.gl(guide)
    guide = guide.view(bs, -1, self.nh, self.hc)
    embed = self.ec(x) if self.ec is not None else x
    embed = embed.view(bs, self.nh, self.hc, h, w)

    aw = torch.einsum("bmchw,bnmc->bmhwn", embed, guide)
    aw = aw.max(dim=-1)[0]
    aw = aw / (self.hc**0.5)
    aw = aw + self.bias[None, :, None, None]
    aw = aw.sigmoid() * self.scale

    x = self.proj_conv(x)
    x = x.view(bs, self.nh, -1, h, w)
    x = x * aw.unsqueeze(2)
    return x.view(bs, -1, h, w)
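
The einsum and max steps are easier to follow with concrete shapes. The sketch below reproduces only the gating arithmetic of `forward` on random tensors; the sizes are illustrative assumptions, and the random `x` stands in for the output of `proj_conv` after the `view`.

```python
import torch

bs, nh, hc, h, w, n_guide = 2, 4, 8, 5, 5, 3  # illustrative sizes

embed = torch.randn(bs, nh, hc, h, w)     # image embedding split into heads
guide = torch.randn(bs, n_guide, nh, hc)  # projected guide embeddings
bias = torch.zeros(nh)                    # per-head bias, zero-initialised as in __init__

# Dot product between every pixel and every guide vector, per head.
aw = torch.einsum("bmchw,bnmc->bmhwn", embed, guide)  # (bs, nh, h, w, n_guide)
aw = aw.max(dim=-1)[0]                                # keep the best-matching guide per pixel
aw = aw / (hc**0.5) + bias[None, :, None, None]
aw = aw.sigmoid()                                     # gate between 0 and 1

x = torch.randn(bs, nh, hc, h, w)                     # stand-in for proj_conv(x).view(...)
out = (x * aw.unsqueeze(2)).view(bs, -1, h, w)        # same gate for all channels of a head
print(aw.shape, out.shape)
```

Note that the `view` into `(bs, nh, hc, h, w)` only works when the embedding has exactly `nh * hc` channels.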



ultralytics.nn.modules.block.C2fAttn

Bases: Module

C2f module with an additional attn module.

Source code in ultralytics/nn/modules/block.py
class C2fAttn(nn.Module):
    """C2f module with an additional attn module."""

    def __init__(self, c1, c2, n=1, ec=128, nh=1, gc=512, shortcut=False, g=1, e=0.5):
        """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
        expansion.
        """
        super().__init__()
        self.c = int(c2 * e)  # hidden channels
        self.cv1 = Conv(c1, 2 * self.c, 1, 1)
        self.cv2 = Conv((3 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
        self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))
        self.attn = MaxSigmoidAttnBlock(self.c, self.c, gc=gc, ec=ec, nh=nh)

    def forward(self, x, guide):
        """Forward pass through C2f layer."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend(m(y[-1]) for m in self.m)
        y.append(self.attn(y[-1], guide))
        return self.cv2(torch.cat(y, 1))

    def forward_split(self, x, guide):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in self.m)
        y.append(self.attn(y[-1], guide))
        return self.cv2(torch.cat(y, 1))

__init__(c1, c2, n=1, ec=128, nh=1, gc=512, shortcut=False, g=1, e=0.5)

Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups, expansion.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, ec=128, nh=1, gc=512, shortcut=False, g=1, e=0.5):
    """Initialize CSP bottleneck layer with two convolutions with arguments ch_in, ch_out, number, shortcut, groups,
    expansion.
    """
    super().__init__()
    self.c = int(c2 * e)  # hidden channels
    self.cv1 = Conv(c1, 2 * self.c, 1, 1)
    self.cv2 = Conv((3 + n) * self.c, c2, 1)  # optional act=FReLU(c2)
    self.m = nn.ModuleList(Bottleneck(self.c, self.c, shortcut, g, k=((3, 3), (3, 3)), e=1.0) for _ in range(n))
    self.attn = MaxSigmoidAttnBlock(self.c, self.c, gc=gc, ec=ec, nh=nh)

forward(x, guide)

Forward pass through C2f layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x, guide):
    """Forward pass through C2f layer."""
    y = list(self.cv1(x).chunk(2, 1))
    y.extend(m(y[-1]) for m in self.m)
    y.append(self.attn(y[-1], guide))
    return self.cv2(torch.cat(y, 1))

forward_split(x, guide)

Forward pass using split() instead of chunk().

Source code in ultralytics/nn/modules/block.py
def forward_split(self, x, guide):
    """Forward pass using split() instead of chunk()."""
    y = list(self.cv1(x).split((self.c, self.c), 1))
    y.extend(m(y[-1]) for m in self.m)
    y.append(self.attn(y[-1], guide))
    return self.cv2(torch.cat(y, 1))
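
The `(3 + n) * self.c` input width of `cv2` follows from counting the tensors collected in `y`. A small arithmetic sketch, using a hypothetical helper name, makes the count explicit:

```python
def c2f_attn_concat_channels(c2, n=1, e=0.5):
    """Return the hidden width c and the channel count of torch.cat(y, 1) in C2fAttn."""
    c = int(c2 * e)  # hidden channels, as in __init__
    # y holds: two chunks of cv1's output, one output per bottleneck, one attention output.
    branches = [c, c] + [c] * n + [c]
    return c, sum(branches)


c, cat_channels = c2f_attn_concat_channels(256, n=2)
print(c, cat_channels, (3 + 2) * c)
```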



ultralytics.nn.modules.block.ImagePoolingAttn

Bases: Module

ImagePoolingAttn: Enhance the text embeddings with image-aware information.

Source code in ultralytics/nn/modules/block.py
class ImagePoolingAttn(nn.Module):
    """ImagePoolingAttn: Enhance the text embeddings with image-aware information."""

    def __init__(self, ec=256, ch=(), ct=512, nh=8, k=3, scale=False):
        """Initializes ImagePoolingAttn with specified arguments."""
        super().__init__()

        nf = len(ch)
        self.query = nn.Sequential(nn.LayerNorm(ct), nn.Linear(ct, ec))
        self.key = nn.Sequential(nn.LayerNorm(ec), nn.Linear(ec, ec))
        self.value = nn.Sequential(nn.LayerNorm(ec), nn.Linear(ec, ec))
        self.proj = nn.Linear(ec, ct)
        self.scale = nn.Parameter(torch.tensor([0.0]), requires_grad=True) if scale else 1.0
        self.projections = nn.ModuleList([nn.Conv2d(in_channels, ec, kernel_size=1) for in_channels in ch])
        self.im_pools = nn.ModuleList([nn.AdaptiveMaxPool2d((k, k)) for _ in range(nf)])
        self.ec = ec
        self.nh = nh
        self.nf = nf
        self.hc = ec // nh
        self.k = k

    def forward(self, x, text):
        """Executes attention mechanism on input tensor x and guide tensor."""
        bs = x[0].shape[0]
        assert len(x) == self.nf
        num_patches = self.k**2
        x = [pool(proj(x)).view(bs, -1, num_patches) for (x, proj, pool) in zip(x, self.projections, self.im_pools)]
        x = torch.cat(x, dim=-1).transpose(1, 2)
        q = self.query(text)
        k = self.key(x)
        v = self.value(x)

        # q = q.reshape(1, text.shape[1], self.nh, self.hc).repeat(bs, 1, 1, 1)
        q = q.reshape(bs, -1, self.nh, self.hc)
        k = k.reshape(bs, -1, self.nh, self.hc)
        v = v.reshape(bs, -1, self.nh, self.hc)

        aw = torch.einsum("bnmc,bkmc->bmnk", q, k)
        aw = aw / (self.hc**0.5)
        aw = F.softmax(aw, dim=-1)

        x = torch.einsum("bmnk,bkmc->bnmc", aw, v)
        x = self.proj(x.reshape(bs, -1, self.ec))
        return x * self.scale + text

__init__(ec=256, ch=(), ct=512, nh=8, k=3, scale=False)

Initializes ImagePoolingAttn with specified arguments.

Source code in ultralytics/nn/modules/block.py
def __init__(self, ec=256, ch=(), ct=512, nh=8, k=3, scale=False):
    """Initializes ImagePoolingAttn with specified arguments."""
    super().__init__()

    nf = len(ch)
    self.query = nn.Sequential(nn.LayerNorm(ct), nn.Linear(ct, ec))
    self.key = nn.Sequential(nn.LayerNorm(ec), nn.Linear(ec, ec))
    self.value = nn.Sequential(nn.LayerNorm(ec), nn.Linear(ec, ec))
    self.proj = nn.Linear(ec, ct)
    self.scale = nn.Parameter(torch.tensor([0.0]), requires_grad=True) if scale else 1.0
    self.projections = nn.ModuleList([nn.Conv2d(in_channels, ec, kernel_size=1) for in_channels in ch])
    self.im_pools = nn.ModuleList([nn.AdaptiveMaxPool2d((k, k)) for _ in range(nf)])
    self.ec = ec
    self.nh = nh
    self.nf = nf
    self.hc = ec // nh
    self.k = k

forward(x, text)

Executes attention mechanism on input tensor x and guide tensor.

Source code in ultralytics/nn/modules/block.py
def forward(self, x, text):
    """Executes attention mechanism on input tensor x and guide tensor."""
    bs = x[0].shape[0]
    assert len(x) == self.nf
    num_patches = self.k**2
    x = [pool(proj(x)).view(bs, -1, num_patches) for (x, proj, pool) in zip(x, self.projections, self.im_pools)]
    x = torch.cat(x, dim=-1).transpose(1, 2)
    q = self.query(text)
    k = self.key(x)
    v = self.value(x)

    # q = q.reshape(1, text.shape[1], self.nh, self.hc).repeat(bs, 1, 1, 1)
    q = q.reshape(bs, -1, self.nh, self.hc)
    k = k.reshape(bs, -1, self.nh, self.hc)
    v = v.reshape(bs, -1, self.nh, self.hc)

    aw = torch.einsum("bnmc,bkmc->bmnk", q, k)
    aw = aw / (self.hc**0.5)
    aw = F.softmax(aw, dim=-1)

    x = torch.einsum("bmnk,bkmc->bnmc", aw, v)
    x = self.proj(x.reshape(bs, -1, self.ec))
    return x * self.scale + text
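
Before attention is computed, feature maps of different resolutions are reduced to a fixed number of image tokens. The sketch below isolates that pooling step with plain PyTorch layers; the channel counts and spatial sizes are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

bs, ec, k = 2, 32, 3
ch = (16, 32, 64)     # channels of three feature maps (illustrative)
sizes = (40, 20, 10)  # their spatial sizes (illustrative)

feats = [torch.randn(bs, c, s, s) for c, s in zip(ch, sizes)]
projections = [nn.Conv2d(c, ec, kernel_size=1) for c in ch]
pool = nn.AdaptiveMaxPool2d((k, k))

# Project each map to ec channels, pool to k x k, then flatten the k*k patches.
tokens = [pool(proj(f)).view(bs, -1, k * k) for f, proj in zip(feats, projections)]
tokens = torch.cat(tokens, dim=-1).transpose(1, 2)  # (bs, nf * k*k, ec)
print(tokens.shape)
```

Whatever the input resolutions, the keys and values are built from `nf * k**2` tokens, here 27.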



ultralytics.nn.modules.block.ContrastiveHead

Bases: Module

Contrastive Head for YOLO-World: computes the region-text scores according to the similarity between image and text features.

Source code in ultralytics/nn/modules/block.py
class ContrastiveHead(nn.Module):
    """Contrastive Head for YOLO-World compute the region-text scores according to the similarity between image and text
    features.
    """

    def __init__(self):
        """Initializes ContrastiveHead with specified region-text similarity parameters."""
        super().__init__()
        # NOTE: use -10.0 to keep the init cls loss consistency with other losses
        self.bias = nn.Parameter(torch.tensor([-10.0]))
        self.logit_scale = nn.Parameter(torch.ones([]) * torch.tensor(1 / 0.07).log())

    def forward(self, x, w):
        """Forward function of contrastive learning."""
        x = F.normalize(x, dim=1, p=2)
        w = F.normalize(w, dim=-1, p=2)
        x = torch.einsum("bchw,bkc->bkhw", x, w)
        return x * self.logit_scale.exp() + self.bias

__init__()

Initializes ContrastiveHead with specified region-text similarity parameters.

Source code in ultralytics/nn/modules/block.py
def __init__(self):
    """Initializes ContrastiveHead with specified region-text similarity parameters."""
    super().__init__()
    # NOTE: use -10.0 to keep the init cls loss consistency with other losses
    self.bias = nn.Parameter(torch.tensor([-10.0]))
    self.logit_scale = nn.Parameter(torch.ones([]) * torch.tensor(1 / 0.07).log())

forward(x, w)

Forward function of contrastive learning.

Source code in ultralytics/nn/modules/block.py
def forward(self, x, w):
    """Forward function of contrastive learning."""
    x = F.normalize(x, dim=1, p=2)
    w = F.normalize(w, dim=-1, p=2)
    x = torch.einsum("bchw,bkc->bkhw", x, w)
    return x * self.logit_scale.exp() + self.bias
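
Because both inputs are L2-normalised, the einsum yields a cosine similarity for every pixel and text embedding, which is then scaled and shifted. A minimal sketch with random tensors and the initial parameter values from `__init__` (tensor sizes are illustrative assumptions):

```python
import torch
import torch.nn.functional as F

b, c, h, w, k = 1, 16, 4, 4, 3
x = torch.randn(b, c, h, w)  # image features
wt = torch.randn(b, k, c)    # k text embeddings

x_n = F.normalize(x, dim=1, p=2)
w_n = F.normalize(wt, dim=-1, p=2)
cos = torch.einsum("bchw,bkc->bkhw", x_n, w_n)  # cosine similarity in [-1, 1]

logit_scale = torch.tensor(1 / 0.07).log()  # initial value from __init__
bias = torch.tensor(-10.0)                  # initial value from __init__
scores = cos * logit_scale.exp() + bias
print(scores.shape, float(cos.abs().max()))
```

At initialisation the scores are therefore bounded above by roughly `1 / 0.07 - 10`.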



ultralytics.nn.modules.block.BNContrastiveHead

Bases: Module

Batch Norm Contrastive Head for YOLO-World using batch norm instead of l2-normalization.

Parameters:

Name        Type  Description                                    Default
embed_dims  int   Embed dimensions of text and image features.   required

Source code in ultralytics/nn/modules/block.py
class BNContrastiveHead(nn.Module):
    """
    Batch Norm Contrastive Head for YOLO-World using batch norm instead of l2-normalization.

    Args:
        embed_dims (int): Embed dimensions of text and image features.
    """

    def __init__(self, embed_dims: int):
        """Initialize ContrastiveHead with region-text similarity parameters."""
        super().__init__()
        self.norm = nn.BatchNorm2d(embed_dims)
        # NOTE: use -10.0 to keep the init cls loss consistency with other losses
        self.bias = nn.Parameter(torch.tensor([-10.0]))
        # use -1.0 is more stable
        self.logit_scale = nn.Parameter(-1.0 * torch.ones([]))

    def forward(self, x, w):
        """Forward function of contrastive learning."""
        x = self.norm(x)
        w = F.normalize(w, dim=-1, p=2)
        x = torch.einsum("bchw,bkc->bkhw", x, w)
        return x * self.logit_scale.exp() + self.bias

__init__(embed_dims)

Initialize ContrastiveHead with region-text similarity parameters.

Source code in ultralytics/nn/modules/block.py
def __init__(self, embed_dims: int):
    """Initialize ContrastiveHead with region-text similarity parameters."""
    super().__init__()
    self.norm = nn.BatchNorm2d(embed_dims)
    # NOTE: use -10.0 to keep the init cls loss consistency with other losses
    self.bias = nn.Parameter(torch.tensor([-10.0]))
    # use -1.0 is more stable
    self.logit_scale = nn.Parameter(-1.0 * torch.ones([]))

forward(x, w)

Forward function of contrastive learning.

Source code in ultralytics/nn/modules/block.py
def forward(self, x, w):
    """Forward function of contrastive learning."""
    x = self.norm(x)
    w = F.normalize(w, dim=-1, p=2)
    x = torch.einsum("bchw,bkc->bkhw", x, w)
    return x * self.logit_scale.exp() + self.bias
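
The two contrastive heads start from very different multipliers, since `forward` applies `logit_scale.exp()`. The arithmetic below compares the initial values. The source comment only says that -1.0 "is more stable"; one plausible reading, stated here as an assumption, is that batch-normalised image features are not unit length, so a smaller starting scale keeps the initial scores in a moderate range.

```python
import math

l2_head_scale = math.exp(math.log(1 / 0.07))  # ContrastiveHead: logit_scale = log(1 / 0.07)
bn_head_scale = math.exp(-1.0)                # BNContrastiveHead: logit_scale = -1.0
bias = -10.0                                  # initial bias shared by both heads

print(round(l2_head_scale, 4), round(bn_head_scale, 4), bias)
```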



ultralytics.nn.modules.block.RepBottleneck

Bases: Bottleneck

Rep bottleneck.

Source code in ultralytics/nn/modules/block.py
class RepBottleneck(Bottleneck):
    """Rep bottleneck."""

    def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):
        """Initializes a RepBottleneck module with customizable in/out channels, shortcut option, groups and expansion
        ratio.
        """
        super().__init__(c1, c2, shortcut, g, k, e)
        c_ = int(c2 * e)  # hidden channels
        self.cv1 = RepConv(c1, c_, k[0], 1)

__init__(c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5)

Initializes a RepBottleneck module with customizable in/out channels, shortcut option, groups and expansion ratio.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, shortcut=True, g=1, k=(3, 3), e=0.5):
    """Initializes a RepBottleneck module with customizable in/out channels, shortcut option, groups and expansion
    ratio.
    """
    super().__init__(c1, c2, shortcut, g, k, e)
    c_ = int(c2 * e)  # hidden channels
    self.cv1 = RepConv(c1, c_, k[0], 1)



ultralytics.nn.modules.block.RepCSP

Bases: C3

Rep CSP Bottleneck with 3 convolutions.

Source code in ultralytics/nn/modules/block.py
class RepCSP(C3):
    """Rep CSP Bottleneck with 3 convolutions."""

    def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
        """Initializes RepCSP layer with given channels, repetitions, shortcut, groups and expansion ratio."""
        super().__init__(c1, c2, n, shortcut, g, e)
        c_ = int(c2 * e)  # hidden channels
        self.m = nn.Sequential(*(RepBottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))

__init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)

Initializes RepCSP layer with given channels, repetitions, shortcut, groups and expansion ratio.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, n=1, shortcut=True, g=1, e=0.5):
    """Initializes RepCSP layer with given channels, repetitions, shortcut, groups and expansion ratio."""
    super().__init__(c1, c2, n, shortcut, g, e)
    c_ = int(c2 * e)  # hidden channels
    self.m = nn.Sequential(*(RepBottleneck(c_, c_, shortcut, g, e=1.0) for _ in range(n)))



ultralytics.nn.modules.block.RepNCSPELAN4

Bases: Module

CSP-ELAN.

Source code in ultralytics/nn/modules/block.py
class RepNCSPELAN4(nn.Module):
    """CSP-ELAN."""

    def __init__(self, c1, c2, c3, c4, n=1):
        """Initializes CSP-ELAN layer with specified channel sizes, repetitions, and convolutions."""
        super().__init__()
        self.c = c3 // 2
        self.cv1 = Conv(c1, c3, 1, 1)
        self.cv2 = nn.Sequential(RepCSP(c3 // 2, c4, n), Conv(c4, c4, 3, 1))
        self.cv3 = nn.Sequential(RepCSP(c4, c4, n), Conv(c4, c4, 3, 1))
        self.cv4 = Conv(c3 + (2 * c4), c2, 1, 1)

    def forward(self, x):
        """Forward pass through RepNCSPELAN4 layer."""
        y = list(self.cv1(x).chunk(2, 1))
        y.extend((m(y[-1])) for m in [self.cv2, self.cv3])
        return self.cv4(torch.cat(y, 1))

    def forward_split(self, x):
        """Forward pass using split() instead of chunk()."""
        y = list(self.cv1(x).split((self.c, self.c), 1))
        y.extend(m(y[-1]) for m in [self.cv2, self.cv3])
        return self.cv4(torch.cat(y, 1))

__init__(c1, c2, c3, c4, n=1)

Initializes CSP-ELAN layer with specified channel sizes, repetitions, and convolutions.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, c3, c4, n=1):
    """Initializes CSP-ELAN layer with specified channel sizes, repetitions, and convolutions."""
    super().__init__()
    self.c = c3 // 2
    self.cv1 = Conv(c1, c3, 1, 1)
    self.cv2 = nn.Sequential(RepCSP(c3 // 2, c4, n), Conv(c4, c4, 3, 1))
    self.cv3 = nn.Sequential(RepCSP(c4, c4, n), Conv(c4, c4, 3, 1))
    self.cv4 = Conv(c3 + (2 * c4), c2, 1, 1)

forward(x)

Forward pass through RepNCSPELAN4 layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through RepNCSPELAN4 layer."""
    y = list(self.cv1(x).chunk(2, 1))
    y.extend((m(y[-1])) for m in [self.cv2, self.cv3])
    return self.cv4(torch.cat(y, 1))

forward_split(x)

Forward pass using split() instead of chunk().

Source code in ultralytics/nn/modules/block.py
def forward_split(self, x):
    """Forward pass using split() instead of chunk()."""
    y = list(self.cv1(x).split((self.c, self.c), 1))
    y.extend(m(y[-1]) for m in [self.cv2, self.cv3])
    return self.cv4(torch.cat(y, 1))
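
The input width of `cv4`, `c3 + (2 * c4)`, comes from concatenating four branches. The sketch below tallies them under the assumption that `c3` is even, so both chunks of `cv1`'s output have `c3 // 2` channels; the helper name is hypothetical.

```python
def elan_concat_channels(c3, c4):
    """Return the per-branch widths and total channels fed to cv4 in RepNCSPELAN4."""
    half = c3 // 2  # each chunk of cv1's output (c3 assumed even)
    # y holds: two chunks of cv1, then the outputs of cv2 and cv3 (c4 channels each).
    branches = [half, half, c4, c4]
    return branches, sum(branches)


branches, total = elan_concat_channels(128, 64)
print(branches, total, 128 + 2 * 64)
```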



ultralytics.nn.modules.block.ADown

Bases: Module

ADown.

Source code in ultralytics/nn/modules/block.py
class ADown(nn.Module):
    """ADown."""

    def __init__(self, c1, c2):
        """Initializes ADown module with convolution layers to downsample input from channels c1 to c2."""
        super().__init__()
        self.c = c2 // 2
        self.cv1 = Conv(c1 // 2, self.c, 3, 2, 1)
        self.cv2 = Conv(c1 // 2, self.c, 1, 1, 0)

    def forward(self, x):
        """Forward pass through ADown layer."""
        x = torch.nn.functional.avg_pool2d(x, 2, 1, 0, False, True)
        x1, x2 = x.chunk(2, 1)
        x1 = self.cv1(x1)
        x2 = torch.nn.functional.max_pool2d(x2, 3, 2, 1)
        x2 = self.cv2(x2)
        return torch.cat((x1, x2), 1)

__init__(c1, c2)

Initializes ADown module with convolution layers to downsample input from channels c1 to c2.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2):
    """Initializes ADown module with convolution layers to downsample input from channels c1 to c2."""
    super().__init__()
    self.c = c2 // 2
    self.cv1 = Conv(c1 // 2, self.c, 3, 2, 1)
    self.cv2 = Conv(c1 // 2, self.c, 1, 1, 0)

forward(x)

Forward pass through ADown layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through ADown layer."""
    x = torch.nn.functional.avg_pool2d(x, 2, 1, 0, False, True)
    x1, x2 = x.chunk(2, 1)
    x1 = self.cv1(x1)
    x2 = torch.nn.functional.max_pool2d(x2, 3, 2, 1)
    x2 = self.cv2(x2)
    return torch.cat((x1, x2), 1)
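
Both branches end at half the input resolution, although they get there differently. The sketch below traces the shapes using plain `nn.Conv2d` layers as stand-ins for the Ultralytics `Conv` wrapper (an assumption: the real wrapper also includes normalisation and activation, which do not affect shapes).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

c1, c2 = 64, 128
cv1 = nn.Conv2d(c1 // 2, c2 // 2, 3, 2, 1)  # stand-in for Conv(c1 // 2, c, 3, 2, 1)
cv2 = nn.Conv2d(c1 // 2, c2 // 2, 1, 1, 0)  # stand-in for Conv(c1 // 2, c, 1, 1, 0)

x = torch.randn(1, c1, 64, 64)
x = F.avg_pool2d(x, 2, 1, 0, False, True)  # kernel 2, stride 1: 64 -> 63
x1, x2 = x.chunk(2, 1)                     # split channels: 64 -> 32 + 32
y1 = cv1(x1)                               # strided 3x3 conv: 63 -> 32
y2 = cv2(F.max_pool2d(x2, 3, 2, 1))        # strided max pool: 63 -> 32, then 1x1 conv
out = torch.cat((y1, y2), 1)
print(x.shape, out.shape)
```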



ultralytics.nn.modules.block.SPPELAN

Bases: Module

SPP-ELAN.

Source code in ultralytics/nn/modules/block.py
class SPPELAN(nn.Module):
    """SPP-ELAN."""

    def __init__(self, c1, c2, c3, k=5):
        """Initializes SPP-ELAN block with convolution and max pooling layers for spatial pyramid pooling."""
        super().__init__()
        self.c = c3
        self.cv1 = Conv(c1, c3, 1, 1)
        self.cv2 = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
        self.cv3 = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
        self.cv4 = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
        self.cv5 = Conv(4 * c3, c2, 1, 1)

    def forward(self, x):
        """Forward pass through SPPELAN layer."""
        y = [self.cv1(x)]
        y.extend(m(y[-1]) for m in [self.cv2, self.cv3, self.cv4])
        return self.cv5(torch.cat(y, 1))

__init__(c1, c2, c3, k=5)

Initializes SPP-ELAN block with convolution and max pooling layers for spatial pyramid pooling.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2, c3, k=5):
    """Initializes SPP-ELAN block with convolution and max pooling layers for spatial pyramid pooling."""
    super().__init__()
    self.c = c3
    self.cv1 = Conv(c1, c3, 1, 1)
    self.cv2 = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
    self.cv3 = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
    self.cv4 = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
    self.cv5 = Conv(4 * c3, c2, 1, 1)

forward(x)

Forward pass through SPPELAN layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through SPPELAN layer."""
    y = [self.cv1(x)]
    y.extend(m(y[-1]) for m in [self.cv2, self.cv3, self.cv4])
    return self.cv5(torch.cat(y, 1))
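
Since each pool is applied to the previous pool's output, the three stride-1 pools cover progressively larger windows of `cv1`'s output. The arithmetic sketch below (hypothetical helper names) shows the window sizes and why padding `k // 2` keeps the spatial size only for odd `k`.

```python
def stacked_pool_fields(k=5, n_pools=3):
    """Window covered by 0..n_pools chained stride-1 k x k max pools."""
    return [i * (k - 1) + 1 for i in range(n_pools + 1)]


def pooled_size(size, k=5):
    """Output length of one pool with stride 1 and padding k // 2."""
    return size + 2 * (k // 2) - k + 1


print(stacked_pool_fields())                  # windows seen by the four concatenated branches
print(pooled_size(20), pooled_size(20, k=4))  # odd k keeps the size, even k does not
```

With the default `k=5`, the concatenated branches correspond to 1, 5, 9 and 13 pixel windows.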



ultralytics.nn.modules.block.Silence

Bases: Module

Silence.

Source code in ultralytics/nn/modules/block.py
class Silence(nn.Module):
    """Silence."""

    def __init__(self):
        """Initializes the Silence module."""
        super(Silence, self).__init__()

    def forward(self, x):
        """Forward pass through Silence layer."""
        return x

__init__()

Initializes the Silence module.

Source code in ultralytics/nn/modules/block.py
def __init__(self):
    """Initializes the Silence module."""
    super(Silence, self).__init__()

forward(x)

Forward pass through Silence layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through Silence layer."""
    return x



ultralytics.nn.modules.block.CBLinear

Bases: Module

CBLinear.

Source code in ultralytics/nn/modules/block.py
class CBLinear(nn.Module):
    """CBLinear."""

    def __init__(self, c1, c2s, k=1, s=1, p=None, g=1):
        """Initializes the CBLinear module, passing inputs unchanged."""
        super(CBLinear, self).__init__()
        self.c2s = c2s
        self.conv = nn.Conv2d(c1, sum(c2s), k, s, autopad(k, p), groups=g, bias=True)

    def forward(self, x):
        """Forward pass through CBLinear layer."""
        outs = self.conv(x).split(self.c2s, dim=1)
        return outs

__init__(c1, c2s, k=1, s=1, p=None, g=1)

Initializes the CBLinear module, passing inputs unchanged.

Source code in ultralytics/nn/modules/block.py
def __init__(self, c1, c2s, k=1, s=1, p=None, g=1):
    """Initializes the CBLinear module, passing inputs unchanged."""
    super(CBLinear, self).__init__()
    self.c2s = c2s
    self.conv = nn.Conv2d(c1, sum(c2s), k, s, autopad(k, p), groups=g, bias=True)

forward(x)

Forward pass through CBLinear layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, x):
    """Forward pass through CBLinear layer."""
    outs = self.conv(x).split(self.c2s, dim=1)
    return outs
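
`forward` returns a tuple of tensors, one per entry of `c2s`, all produced by a single convolution. A minimal sketch with a plain `nn.Conv2d` and illustrative channel sizes:

```python
import torch
import torch.nn as nn

c1, c2s = 16, [8, 16, 32]                      # illustrative channel sizes
conv = nn.Conv2d(c1, sum(c2s), kernel_size=1)  # one conv produces all branches at once

outs = conv(torch.randn(2, c1, 10, 10)).split(c2s, dim=1)
print(type(outs).__name__, [tuple(o.shape) for o in outs])
```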



ultralytics.nn.modules.block.CBFuse

Bases: Module

CBFuse.

Source code in ultralytics/nn/modules/block.py
class CBFuse(nn.Module):
    """CBFuse."""

    def __init__(self, idx):
        """Initializes CBFuse module with layer index for selective feature fusion."""
        super(CBFuse, self).__init__()
        self.idx = idx

    def forward(self, xs):
        """Forward pass through CBFuse layer."""
        target_size = xs[-1].shape[2:]
        res = [F.interpolate(x[self.idx[i]], size=target_size, mode="nearest") for i, x in enumerate(xs[:-1])]
        out = torch.sum(torch.stack(res + xs[-1:]), dim=0)
        return out
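
The fusion can be traced with constant tensors. In the sketch below, every entry of `xs` except the last is a tuple of branches (such as a `CBLinear` output), `idx` selects one branch from each, and the last entry sets the target size. The shapes and the `idx` values are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

idx = [0, 1]  # which branch to take from each earlier input
xs = [
    (torch.ones(1, 8, 16, 16), torch.ones(1, 8, 16, 16)),
    (torch.ones(1, 8, 8, 8), torch.ones(1, 8, 8, 8)),
    torch.ones(1, 8, 4, 4),  # last entry defines the target spatial size
]

target_size = xs[-1].shape[2:]
# Resize the selected branches to the target size, then sum them with the last input.
res = [F.interpolate(x[idx[i]], size=target_size, mode="nearest") for i, x in enumerate(xs[:-1])]
out = torch.sum(torch.stack(res + xs[-1:]), dim=0)
print(out.shape, float(out[0, 0, 0, 0]))
```

With all-ones inputs each output element equals the number of fused tensors, which confirms that the result is an element-wise sum rather than a concatenation.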

__init__(idx)

Initializes CBFuse module with layer index for selective feature fusion.

Source code in ultralytics/nn/modules/block.py
def __init__(self, idx):
    """Initializes CBFuse module with layer index for selective feature fusion."""
    super(CBFuse, self).__init__()
    self.idx = idx

forward(xs)

Forward pass through CBFuse layer.

Source code in ultralytics/nn/modules/block.py
def forward(self, xs):
    """Forward pass through CBFuse layer."""
    target_size = xs[-1].shape[2:]
    res = [F.interpolate(x[self.idx[i]], size=target_size, mode="nearest") for i, x in enumerate(xs[:-1])]
    out = torch.sum(torch.stack(res + xs[-1:]), dim=0)
    return out





Created 2023-11-12, Updated 2024-03-04
Authors: Laughing-q (1), glenn-jocher (5)