现代CNN架构中使用的不同模块

资讯 2年前

2.07K

今天，我们将看到现代CNN架构中使用的不同模块，如ResNet、MobileNet、EfficientNet，以及它们在PyTorch中的实现。让我们创建一个通用的conv－norm－act层from

今天，我们将看到现代CNN架构中使用的不同模块，如ResNet、MobileNet、EfficientNet，以及它们在PyTorch中的实现。

让我们创建一个通用的conv－norm－act层

from functools import partial

from torch import nn

class ConvNormAct（nn．Sequential）：

def ＿＿init＿＿（

self，

in＿features： int，

out＿features： int，

kernel＿size： int，

norm： nn．Module ＝ nn．BatchNorm2d，

act： nn．Module ＝ nn．ReLU，

＊＊kwargs

）：

super（）．＿＿init＿＿（

nn．Conv2d（

in＿features，

out＿features，

kernel＿size＝kernel＿size，

padding＝kernel＿size ／／ 2，

），

norm（out＿features），

act（），

）

Conv1X1BnReLU ＝ partial（ConvNormAct， kernel＿size＝1）

Conv3X3BnReLU ＝ partial（ConvNormAct， kernel＿size＝3）

import torch

x ＝ torch．randn（（1， 32， 56， 56））

Conv1X1BnReLU（32， 64）（x）．shape

torch．Size（［1， 64， 56， 56］）

残差连接残差连接用于ResNet中，想法是将输入添加到输出中，输出＝层＋输入。下图可能会帮助你将其可视化。但是，我的意思是它只是一个＋运算符。残差操作提高了梯度传播的能力，允许有效地训练具有100层以上的网络。

在PyTorch中，我们可以轻松创建一个ResidualAdd层

from torch import nn

from torch import Tensor

class ResidualAdd（nn．Module）：

def ＿＿init＿＿（self， block： nn．Module）：

super（）．＿＿init＿＿（）

self．block ＝ block

def forward（self， x： Tensor）－＞ Tensor：

res ＝ x

x ＝ self．block（x）

x ＋＝ res

return x

ResidualAdd（

nn．Conv2d（32， 32， kernel＿size＝1）

）（x）．shape

shortcut

有时你的残差没有相同的输出维度，所以我们不能添加它们。我们可以使用shortcut中的卷积投射输入，以匹配输出特征：

from typing import Optional

class ResidualAdd（nn．Module）：

def ＿＿init＿＿（self， block： nn．Module， shortcut： Optional［nn．Module］＝ None）：

super（）．＿＿init＿＿（）

self．block ＝ block

self．shortcut ＝ shortcut

def forward（self， x： Tensor）－＞ Tensor：

res ＝ x

x ＝ self．block（x）

if self．shortcut：

res ＝ self．shortcut（res）

x ＋＝ res

return x

ResidualAdd（

nn．Conv2d（32， 64， kernel＿size＝1），

shortcut＝nn．Conv2d（32， 64， kernel＿size＝1）

）（x）．shape

BottleNeck Blocks

在图像识别的深度残差学习中引入了Bottlenecks。Bottlenecks块接受大小为BxCxHxW的输入，它首先使用1x1 卷积将其变为BxC／rxHxW，然后应用3x3 卷积，最后将输出重新映射到与输入相同的特征维度BxCxHxW，然后再次使用1x1卷积。这比使用三个3x3卷积更快。

因为首先减少了输入，所以我们称之为“Bottlenecks”。下图显示了该块，我们在原始实现中使用了r＝4

前两个卷积之后是batchnorm和一个非线性激活层，而最后一个非线性层在加法后应用。

在PyTorch中为：

from torch import nn

class BottleNeck（nn．Sequential）：

def ＿＿init＿＿（self， in＿features： int， out＿features： int， reduction： int ＝ 4）：

reduced＿features ＝ out＿features ／／ reduction

super（）．＿＿init＿＿（

nn．Sequential（

ResidualAdd（

nn．Sequential（

＃ wide －＞ narrow

Conv1X1BnReLU（in＿features， reduced＿features），

＃ narrow －＞ narrow

Conv3X3BnReLU（reduced＿features， reduced＿features），

＃ narrow －＞ wide

Conv1X1BnReLU（reduced＿features， out＿features， act＝nn．Identity），

），

shortcut＝Conv1X1BnReLU（in＿features， out＿features）

if in＿features ！＝ out＿features

else None，

），

nn．ReLU（），

）

BottleNeck（32， 64）（x）．shape

请注意，仅当输入和输出特征不同时，我们才应用shortcut。

在实践中，当我们希望减小空间维数时，在卷积中使用stride＝2。

Linear BottleNecks

MobileNet V2中引入了Linear Bottleneck。Linear BottleNecks是没有激活函数的Bottlenecks块。

在论文的第3．2节中，他们详细讨论了为什么在输出之前存在非线性会损害性能。简而言之，非线性函数ReLU在＜0时设为0会导致破坏信息。因此，在Bottlenecks中删除nn．ReLU你就可以拥有Linear BottleNecks。

倒残差

MobileNet V2中再次引入了倒残差。

倒残差块是反向的Bottlenecks层。它们通过第一次卷积扩展特征，而不是减少特征。

下图应该可以清楚地说明这一点

我们从BxCxHxW到－＞BxCxHxW－＞BxCxHxW－＞BxCxHxW，其中e是膨胀率，它被设置为4。而不是像在正常的Bottlenecks区那样变宽－＞变窄－＞变宽，而是相反，变窄－＞变宽－＞变窄。

在PyTorch中，实现如下

class InvertedResidual（nn．Sequential）：

def ＿＿init＿＿（self， in＿features： int， out＿features： int， expansion： int ＝ 4）：

expanded＿features ＝ in＿features ＊ expansion

super（）．＿＿init＿＿（

nn．Sequential（

ResidualAdd（

nn．Sequential（

＃ narrow －＞ wide

Conv1X1BnReLU（in＿features， expanded＿features），

＃ wide －＞ wide

Conv3X3BnReLU（expanded＿features， expanded＿features），

＃ wide －＞ narrow

Conv1X1BnReLU（expanded＿features， out＿features， act＝nn．Identity），

），

shortcut＝Conv1X1BnReLU（in＿features， out＿features）

if in＿features ！＝ out＿features

else None，

），

nn．ReLU（），

）

）
InvertedResidual（32， 64）（x）．shape

在MobileNet中，只有当输入和输出特征匹配时，才会应用残差连接

class MobileNetLikeBlock（nn．Sequential）：

def ＿＿init＿＿（self， in＿features： int， out＿features： int， expansion： int ＝ 4）：

＃ use ResidualAdd if features match， otherwise a normal Sequential

residual ＝ ResidualAdd if in＿features ＝＝ out＿features else nn．Sequential

expanded＿features ＝ in＿features ＊ expansion

super（）．＿＿init＿＿（

nn．Sequential（

residual（

nn．Sequential（

＃ narrow －＞ wide

Conv1X1BnReLU（in＿features， expanded＿features），

＃ wide －＞ wide

Conv3X3BnReLU（expanded＿features， expanded＿features），

＃ wide －＞ narrow

Conv1X1BnReLU（expanded＿features， out＿features， act＝nn．Identity），

），

nn．ReLU（），

）

）
MobileNetLikeBlock（32， 64）（x）．shape

MobileNetLikeBlock（32， 32）（x）．shape

MBConv

MobileNet V2的构建块被称为MBConv。MBConv是具有深度可分离卷积的倒残差的Linear BottleNecks层。

深度可分离卷积

深度可分离卷积采用一种技巧，将一个正常的3x3卷积夹在两个卷积中，以减少参数数量。

第一个对每个输入的通道应用单个3x3滤波器，另一个对所有通道应用1x1滤波器。

这与正常的3x3卷积相同，但你节省了参数。

然而它比我们现有硬件上的普通3x3慢得多。

下图显示了这个想法

通道中的不同颜色表示每个通道应用的单个过滤器

PyTorch中：

class DepthWiseSeparableConv（nn．Sequential）：

def ＿＿init＿＿（self， in＿features： int， out＿features： int）：

super（）．＿＿init＿＿（

nn．Conv2d（in＿features， in＿features， kernel＿size＝3， groups＝in＿features），

nn．Conv2d（in＿features， out＿features， kernel＿size＝1）

）
DepthWiseSeparableConv（32， 64）（x）．shape

第一次卷积通常称为depth，而第二次卷积称为point。让我们统计参数量

sum（p．numel（） for p in DepthWiseSeparableConv（32， 64）．parameters（） if p．requires＿grad）

输出：2432

让我们看一个普通的Conv2d

sum（p．numel（） for p in nn．Conv2d（32， 64， kernel＿size＝3）．parameters（） if p．requires＿grad）

输出：18496

有很大的区别

实现MBConv

那么，让我们创建一个完整的MBConv。

MBConv有几个重要的细节，标准化应用于深度和点卷积，非线性仅应用于深度卷积（Linear Bottlenecks）。

class MBConv（nn．Sequential）：

def ＿＿init＿＿（self， in＿features： int， out＿features： int， expansion： int ＝ 4）：

residual ＝ ResidualAdd if in＿features ＝＝ out＿features else nn．Sequential

expanded＿features ＝ in＿features ＊ expansion

super（）．＿＿init＿＿（

nn．Sequential（

residual（

nn．Sequential（

＃ narrow －＞ wide

Conv1X1BnReLU（in＿features，

expanded＿features，

act＝nn．ReLU6

），

＃ wide －＞ wide

Conv3X3BnReLU（expanded＿features，

expanded＿features，

groups＝expanded＿features，

act＝nn．ReLU6

），

＃ here you can apply SE

＃ wide －＞ narrow

Conv1X1BnReLU（expanded＿features， out＿features， act＝nn．Identity），

），

nn．ReLU（），

）

）
MBConv（32， 64）（x）．shape

Fused MBConv

EfficientNetV2中引入了融合倒残差

所以基本上，由于深度卷积比较慢，他们将第一个和第二个卷积融合在一个3x3的卷积中（第3．2节）。

class FusedMBConv（nn．Sequential）：

def ＿＿init＿＿（self， in＿features： int， out＿features： int， expansion： int ＝ 4）：

residual ＝ ResidualAdd if in＿features ＝＝ out＿features else nn．Sequential

expanded＿features ＝ in＿features ＊ expansion

super（）．＿＿init＿＿（

nn．Sequential（

residual（

nn．Sequential（

Conv3X3BnReLU（in＿features，

expanded＿features，

act＝nn．ReLU6

），

＃ here you can apply SE

＃ wide －＞ narrow

Conv1X1BnReLU（expanded＿features， out＿features， act＝nn．Identity），

），

nn．ReLU（），

）

）
MBConv（32， 64）（x）．shape

结论

现在你应该知道所有这些块之间的区别以及它们背后的原因了！

原文标题 : Residual, BottleNeck, Linear BottleNeck, MBConv解释

# 图像识别

文章版权归作者所有，未经允许请勿转载。

AI芯片，产业未来的搅局者

1724

本周（5.15-5.21）XR行业新闻汇总

952

指甲上的可穿戴传感器，监测健康状况

1308

从AI数羊到“相牛”：人工智能应用如何低门槛化？

2038

“零元购”元宇宙资产，中青宝再收关注函

960

Graphcore全力以赴全新架构AI芯片

1207

现代CNN架构中使用的不同模块

商汤首份财报释放了什么信号？

赞同科技资金拆借频繁，毛利率走低

相关文章

热门网址

时间线

热门标签

热门工具