[MobileNetV3] A Detailed Look at the MobileNetV3 Network Architecture

2023-05-16

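Contents

  • 1 What is new in MobileNetV3
  • 2 What the block looks like now
    • 2.1 Overview
    • 2.2 Understanding the SE module
    • 2.3 Understanding the ReLU6 and hardswish activation functions
  • 3 Overall network structure
  • 4 Code walkthrough
  • 5 Acknowledgements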

Before reading this article, I strongly recommend reading my earlier post on MobileNetV2.

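1 What is new in MobileNetV3

  • The bottleneck structure has changed.
  • The network is made wider and deeper. How much wider and how much deeper? Those values come from NAS (Neural Architecture Search).
  • The time-consuming layers were redesigned (this redesign targets the structure found by NAS, so we can skip the details here).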

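2 What the block looks like now

2.1 Overview

The figures referenced here come from the tutorials linked at the end. The block in MobileNetV2 looks like this:
(Figure: MobileNetV2 block)
The block in MobileNetV3 looks like this:

(Figure: MobileNetV3 block)
Compared with MobileNetV2, the MobileNetV3 block adds an SE module and changes the activation functions.
The SE module is covered in the next section.
The updated activation is written as NL (non-linearity) in the figure because it differs from block to block: it is either hardswish or ReLU.
The final 1x1 projection layer, which reduces the channel count, uses a linear activation (f(x) = x), which amounts to using no activation at all.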
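The layer order inside one block can be sketched in plain Python. This is only an illustration of the description above and of the InvertedResidual class in section 4; the function name describe_block and the output strings are hypothetical and not part of the original code.

```python
from typing import List


def describe_block(input_c: int, kernel: int, expanded_c: int, out_c: int,
                   use_se: bool, activation: str, stride: int) -> List[str]:
    """Return a readable list of the layers in one MobileNetV3 block (illustrative only)."""
    nl = "Hardswish" if activation == "HS" else "ReLU"
    layers = []
    # The 1x1 expansion is skipped when it would not change the channel count.
    if expanded_c != input_c:
        layers.append(f"1x1 conv {input_c}->{expanded_c} + BN + {nl}")
    # Depthwise conv: one filter per channel.
    layers.append(f"{kernel}x{kernel} depthwise conv {expanded_c}, stride {stride} + BN + {nl}")
    if use_se:
        layers.append(f"SE on {expanded_c} channels")
    # Linear projection: no non-linearity after the final 1x1 conv.
    layers.append(f"1x1 conv {expanded_c}->{out_c} + BN (linear)")
    # The shortcut is only possible when input and output shapes match.
    if stride == 1 and input_c == out_c:
        layers.append("add shortcut")
    return layers


for line in describe_block(40, 5, 120, 40, True, "RE", 1):
    print(line)
```

With the first bottleneck of either network (16, 3, 16, 16), the expansion line disappears because expanded_c equals input_c.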

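2.2 Understanding the SE module

The SE (Squeeze-and-Excitation) module works like an attention module over channels. The figure below shows how it is applied in MobileNetV3.
(Figure: SE module)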
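To make the data flow concrete, here is a minimal pure-Python sketch of the SE computation as implemented in section 4: global average pooling per channel, a first FC layer with ReLU, a second FC layer with hardsigmoid, then a per-channel rescaling of the input. The helper names and the tiny weights are made up for illustration; a real module learns its weights and uses 1/4 of the channels in the first FC layer.

```python
from typing import List

Matrix = List[List[float]]


def relu(v: float) -> float:
    return max(0.0, v)


def hardsigmoid(v: float) -> float:
    # Piecewise-linear stand-in for sigmoid: clamp(v + 3, 0, 6) / 6.
    return min(max(v + 3.0, 0.0), 6.0) / 6.0


def linear(weights: Matrix, bias: List[float], x: List[float]) -> List[float]:
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def squeeze_excite(feature_maps: List[Matrix], w1: Matrix, b1: List[float],
                   w2: Matrix, b2: List[float]) -> List[Matrix]:
    """feature_maps holds C channels, each an H x W nested list."""
    # Squeeze: global average pooling turns every channel into one number.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation: FC + ReLU, then FC + hardsigmoid gives one weight per channel.
    hidden = [relu(v) for v in linear(w1, b1, squeezed)]
    scale = [hardsigmoid(v) for v in linear(w2, b2, hidden)]
    # Rescale every channel of the input by its weight.
    return [[[scale[c] * v for v in row] for row in ch] for c, ch in enumerate(feature_maps)]


# Two 2x2 channels and hypothetical weights (2 -> 1 -> 2 nodes).
x = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 4.0], [6.0, 8.0]]]
out = squeeze_excite(x, w1=[[0.5, 0.5]], b1=[0.0], w2=[[1.0], [0.0]], b2=[0.0, 0.0])
print(out)  # channel 0 is kept (weight 1.0), channel 1 is halved (weight 0.5)
```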

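2.3 Understanding the ReLU6 and hardswish activation functions

The ReLU6 activation is shown in the figure below. It is ReLU with the output capped at 6: ReLU6(x) = min(max(x, 0), 6).
(Figure: ReLU6)

The hardswish activation is shown in the figure below. It is piecewise with three segments: 0 for x <= -3, x for x >= 3, and x(x + 3)/6 in between.
hardswish is relatively cheap to compute and is friendly to quantization.

(Figure: hardswish formula)
(Figure: hardswish curve)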
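Both activations fit in a few lines of plain Python. The sketch below uses the standard definitions, hardswish(x) = x * ReLU6(x + 3) / 6, on scalars; the listing in section 4 uses nn.ReLU6 and nn.Hardswish instead.

```python
def relu6(x: float) -> float:
    # ReLU with the output capped at 6.
    return min(max(x, 0.0), 6.0)


def hardswish(x: float) -> float:
    # Three segments: 0 for x <= -3, x for x >= 3, x * (x + 3) / 6 in between.
    return x * relu6(x + 3.0) / 6.0


for x in (-4.0, -1.0, 0.0, 1.0, 3.0, 7.0):
    print(f"x={x:5.1f}  relu6={relu6(x):4.1f}  hardswish={hardswish(x):7.4f}")
```

Only comparisons, one addition, one multiplication and a division by a constant are involved, with no exponential, which is why hardswish is cheaper than swish and easier to quantize.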

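3 Overall network structure

To cover different requirements, the authors used NAS to obtain two architectures. The first is MobileNetV3-Large, shown below:

(Figure: MobileNetV3-Large)
Notes on the columns in the table:

  • Input is the input size.
  • NBN in the Operator column means that no BN is used; the final conv2d 1x1 layers play the role of fully connected layers.
  • exp size is the number of channels that the first 1x1 convolution in the bottleneck expands to (the first bottleneck has no 1x1 expansion convolution).
  • out is the number of output channels of the bottleneck.
  • SE indicates whether the SE module is used.
  • NL indicates which activation is used: HS means hardswish, RE means ReLU.
  • s is the stride (with s=2, height and width are halved).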
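The effect of the stride on the feature-map size can be checked with the usual convolution output-size formula. The sketch below assumes padding = (kernel - 1) // 2, as in the ConvBNActivation class in section 4, and walks a 224x224 input through the stem conv and the four stride-2 bottlenecks of MobileNetV3-Large; the function name conv_out_size is hypothetical.

```python
def conv_out_size(size: int, kernel: int, stride: int) -> int:
    # Same padding rule as the listing in section 4.
    padding = (kernel - 1) // 2
    return (size + 2 * padding - kernel) // stride + 1


size = 224
sizes = [size]
# (kernel, stride) of the stem conv and of the stride-2 bottlenecks in MobileNetV3-Large.
for kernel, stride in [(3, 2), (3, 2), (5, 2), (3, 2), (5, 2)]:
    size = conv_out_size(size, kernel, stride)
    sizes.append(size)

print(sizes)  # [224, 112, 56, 28, 14, 7]
```

Layers with s=1 leave the size unchanged under this padding rule, so only the five stride-2 layers matter for the spatial resolution.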

The second is MobileNetV3-Small, shown below:

(Figure: MobileNetV3-Small)

4 Code walkthrough

The comments in the code explain each part, and the code is runnable.

from typing import Callable, List, Optional

import torch
from torch import nn, Tensor
from torch.nn import functional as F
from functools import partial


# ------------------------------------------------------#
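#   This function makes sure the channel count is divisible by 8
#   by rounding it to the nearest multiple of 8.
#   Many embedded devices are optimized around this rule.
# ------------------------------------------------------#
def _make_divisible(ch, divisor=8, min_ch=None):
    if min_ch is None:
        min_ch = divisor
    # int(ch + divisor / 2) // divisor * divisor: round to the nearest multiple of divisor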
    new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
    # Make sure that round down does not go down by more than 10%.
    if new_ch < 0.9 * ch:
        new_ch += divisor
    return new_ch


# -------------------------------------------------------------#
#   Conv + BN + activation is used often, so the three are bundled together
# -------------------------------------------------------------#
class ConvBNActivation(nn.Sequential):
    def __init__(self,
                 in_planes: int,
                 out_planes: int,
                 kernel_size: int = 3,
                 stride: int = 1,
                 groups: int = 1,
                 norm_layer: Optional[Callable[..., nn.Module]] = None,         # norm layer placed after the conv
                 activation_layer: Optional[Callable[..., nn.Module]] = None):  # activation function
        padding = (kernel_size - 1) // 2
        if norm_layer is None:          # default to BN when none is passed
            norm_layer = nn.BatchNorm2d
        if activation_layer is None:
            activation_layer = nn.ReLU6
        super(ConvBNActivation, self).__init__(nn.Conv2d(in_channels=in_planes,
                                                         out_channels=out_planes,
                                                         kernel_size=kernel_size,
                                                         stride=stride,
                                                         padding=padding,
                                                         groups=groups,
                                                         bias=False),           # BN follows, so no bias is needed
                                               norm_layer(out_planes),
                                               activation_layer(inplace=True))


# ------------------------------------------------------#
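#   Attention module: the SE module
#   It is just two FC layers; mind the node counts and the activations
# ------------------------------------------------------#
class SqueezeExcitation(nn.Module):
    # squeeze_factor: int = 4: the first FC layer has 1/4 as many nodes as the input has channels
    def __init__(self, input_c: int, squeeze_factor: int = 4):
        super(SqueezeExcitation, self).__init__()
        # the node count of the first FC layer must also be a multiple of 8
        squeeze_c = _make_divisible(input_c // squeeze_factor, 8)
        # 1x1 convolutions stand in for the FC layers; the effect is the same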
        self.fc1 = nn.Conv2d(input_c, squeeze_c, 1)
        self.fc2 = nn.Conv2d(squeeze_c, input_c, 1)

    def forward(self, x: Tensor) -> Tensor:
        # x has many channels; output_size=(1, 1) reduces each channel to a single number
        scale = F.adaptive_avg_pool2d(x, output_size=(1, 1))
        scale = self.fc1(scale)
        scale = F.relu(scale, inplace=True)
        scale = self.fc2(scale)
        # at this point scale is the output of the second FC layer
        scale = F.hardsigmoid(scale, inplace=True)
        return scale * x        # multiply with the original input to get the SE output


# ------------------------------------------------------#
#   InvertedResidualConfig holds the configuration of one block
# ------------------------------------------------------#
class InvertedResidualConfig:
    def __init__(self,
                 input_c: int,
                 kernel: int,
                 expanded_c: int,   # number of channels the first 1x1 conv in the bottleneck expands to
                 out_c: int,
                 use_se: bool,
                 activation: str,
                 stride: int,
                 width_multi: float):       # same width multiplier as in MobileNetV2; scales each layer's channel count relative to the baseline
        self.input_c = self.adjust_channels(input_c, width_multi)   # the width multiplier is applied here
        self.kernel = kernel
        self.expanded_c = self.adjust_channels(expanded_c, width_multi)
        self.out_c = self.adjust_channels(out_c, width_multi)
        self.use_se = use_se
        # self.use_hs is True when activation == "HS"
        self.use_hs = activation == "HS"  # whether using h-swish activation
        self.stride = stride

    # static method
    @staticmethod
    def adjust_channels(channels: int, width_multi: float):
        return _make_divisible(channels * width_multi, 8)


class InvertedResidual(nn.Module):
    def __init__(self,
                 cnf: InvertedResidualConfig,       # cnf is a config object of the InvertedResidualConfig class above
                 norm_layer: Callable[..., nn.Module]):
        super(InvertedResidual, self).__init__()

        if cnf.stride not in [1, 2]:
            raise ValueError("illegal stride value.")

        # whether to use the shortcut connection
        self.use_res_connect = (cnf.stride == 1 and cnf.input_c == cnf.out_c)

        layers: List[nn.Module] = []    # empty list whose elements are nn.Module instances
        activation_layer = nn.Hardswish if cnf.use_hs else nn.ReLU

        # expand
        if cnf.expanded_c != cnf.input_c:       # the first bottleneck has no 1x1 expansion conv, hence this check
            layers.append(ConvBNActivation(cnf.input_c,
                                           cnf.expanded_c,
                                           kernel_size=1,
                                           norm_layer=norm_layer,
                                           activation_layer=activation_layer))

        # depthwise
        layers.append(ConvBNActivation(cnf.expanded_c,      # the preceding 1x1 conv outputs cnf.expanded_c channels
                                       cnf.expanded_c,
                                       kernel_size=cnf.kernel,
                                       stride=cnf.stride,
                                       groups=cnf.expanded_c,       # depthwise conv: one group per channel
                                       norm_layer=norm_layer,
                                       activation_layer=activation_layer))

        if cnf.use_se:      # add the SE module if configured; it only needs the input channel count
            layers.append(SqueezeExcitation(cnf.expanded_c))

        # project: 1x1 conv that reduces the channel count
        layers.append(ConvBNActivation(cnf.expanded_c,
                                       cnf.out_c,
                                       kernel_size=1,
                                       norm_layer=norm_layer,
                                       # nn.Identity acts as a linear activation and does no processing
                                       #    (its forward simply returns the input)
                                       activation_layer=nn.Identity))

        self.block = nn.Sequential(*layers)
        self.out_channels = cnf.out_c
        self.is_strided = cnf.stride > 1

    def forward(self, x: Tensor) -> Tensor:
        result = self.block(x)
        if self.use_res_connect:
            result += x

        return result


# inherits from nn.Module
class MobileNetV3(nn.Module):
    def __init__(self,
                 inverted_residual_setting: List[InvertedResidualConfig],   # list of block configs; each element is an instance of the class defined above
                 last_channel: int,         # number of channels in the second-to-last layer
                 num_classes: int = 1000,   # number of classes to predict
                 block: Optional[Callable[..., nn.Module]] = None,
                 norm_layer: Optional[Callable[..., nn.Module]] = None):
        super(MobileNetV3, self).__init__()

        if not inverted_residual_setting:
            raise ValueError("The inverted_residual_setting should not be empty.")
        elif not (isinstance(inverted_residual_setting, List) and
                  all([isinstance(s, InvertedResidualConfig) for s in inverted_residual_setting])):
            raise TypeError("The inverted_residual_setting should be List[InvertedResidualConfig]")

        if block is None:
            block = InvertedResidual

        # default norm_layer to BN
        #   partial() binds default arguments to nn.BatchNorm2d so later calls need fewer arguments
        if norm_layer is None:
            norm_layer = partial(nn.BatchNorm2d, eps=0.001, momentum=0.01)

        layers: List[nn.Module] = []

        # building first layer: a plain conv
        firstconv_output_c = inverted_residual_setting[0].input_c
        layers.append(ConvBNActivation(3,
                                       firstconv_output_c,
                                       kernel_size=3,
                                       stride=2,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Hardswish))
        # building inverted residual blocks
        for cnf in inverted_residual_setting:
            layers.append(block(cnf, norm_layer))

        # building last several layers
        lastconv_input_c = inverted_residual_setting[-1].out_c
        lastconv_output_c = 6 * lastconv_input_c            # small:96->576; Large:160->960
        layers.append(ConvBNActivation(lastconv_input_c,
                                       lastconv_output_c,
                                       kernel_size=1,
                                       norm_layer=norm_layer,
                                       activation_layer=nn.Hardswish))
        self.features = nn.Sequential(*layers)
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Sequential(nn.Linear(lastconv_output_c, last_channel),
                                        nn.Hardswish(inplace=True),
                                        nn.Dropout(p=0.2, inplace=True),
                                        nn.Linear(last_channel, num_classes))

        # initial weights
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode="fan_out")
                if m.bias is not None:
                    nn.init.zeros_(m.bias)
            elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)
            elif isinstance(m, nn.Linear):
                nn.init.normal_(m.weight, 0, 0.01)
                nn.init.zeros_(m.bias)

    def _forward_impl(self, x: Tensor) -> Tensor:
        x = self.features(x)
        x = self.avgpool(x)     # height and width are no longer needed after this point
        x = torch.flatten(x, 1) # so flatten them away
        x = self.classifier(x)

        return x

    def forward(self, x: Tensor) -> Tensor:
        return self._forward_impl(x)


def mobilenet_v3_large(num_classes: int = 1000,
                       reduced_tail: bool = False) -> MobileNetV3:
    """
    Constructs a large MobileNetV3 architecture from
    "Searching for MobileNetV3" <https://arxiv.org/abs/1905.02244>.

    weights_link:
    https://download.pytorch.org/models/mobilenet_v3_large-8738ca79.pth

    Args:
        num_classes (int): number of classes
        reduced_tail (bool): Set to True when a smaller network is needed.
            If True, reduces the channel counts of all feature layers
            between C4 and C5 by 2. It is used to reduce the channel redundancy in the
            backbone for Detection and Segmentation.
    """
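    width_multi = 1.0       # scales the channel counts; default 1.0
    bneck_conf = partial(InvertedResidualConfig, width_multi=width_multi)   # partial() binds width_multi as a default argument
    # bind width_multi to the static method as well; still needed despite the line above, because last_channel below is computed with it
    adjust_channels = partial(InvertedResidualConfig.adjust_channels, width_multi=width_multi)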

    reduce_divider = 2 if reduced_tail else 1

    inverted_residual_setting = [
        # input_c, kernel, expanded_c, out_c, use_se, activation, stride
        bneck_conf(16, 3, 16, 16, False, "RE", 1),
        bneck_conf(16, 3, 64, 24, False, "RE", 2),  # C1
        bneck_conf(24, 3, 72, 24, False, "RE", 1),
        bneck_conf(24, 5, 72, 40, True, "RE", 2),   # C2
        bneck_conf(40, 5, 120, 40, True, "RE", 1),
        bneck_conf(40, 5, 120, 40, True, "RE", 1),
        bneck_conf(40, 3, 240, 80, False, "HS", 2),  # C3
        bneck_conf(80, 3, 200, 80, False, "HS", 1),
        bneck_conf(80, 3, 184, 80, False, "HS", 1),
        bneck_conf(80, 3, 184, 80, False, "HS", 1),
        bneck_conf(80, 3, 480, 112, True, "HS", 1),
        bneck_conf(112, 3, 672, 112, True, "HS", 1),
        bneck_conf(112, 5, 672, 160 // reduce_divider, True, "HS", 2),  # C4
        bneck_conf(160 // reduce_divider, 5, 960 // reduce_divider, 160 // reduce_divider, True, "HS", 1),
        bneck_conf(160 // reduce_divider, 5, 960 // reduce_divider, 160 // reduce_divider, True, "HS", 1),
    ]
    last_channel = adjust_channels(1280 // reduce_divider)  # C5    # number of nodes in the second-to-last fully connected layer

    return MobileNetV3(inverted_residual_setting=inverted_residual_setting,
                       last_channel=last_channel,
                       num_classes=num_classes)


def mobilenet_v3_small(num_classes: int = 1000,
                       reduced_tail: bool = False) -> MobileNetV3:
    """
    Constructs a small MobileNetV3 architecture from
    "Searching for MobileNetV3" <https://arxiv.org/abs/1905.02244>.

    weights_link:
    https://download.pytorch.org/models/mobilenet_v3_small-047dcff4.pth

    Args:
        num_classes (int): number of classes
        reduced_tail (bool): If True, reduces the channel counts of all feature layers
            between C4 and C5 by 2. It is used to reduce the channel redundancy in the
            backbone for Detection and Segmentation.
    """
    width_multi = 1.0
    bneck_conf = partial(InvertedResidualConfig, width_multi=width_multi)
    adjust_channels = partial(InvertedResidualConfig.adjust_channels, width_multi=width_multi)

    reduce_divider = 2 if reduced_tail else 1

    inverted_residual_setting = [
        # input_c, kernel, expanded_c, out_c, use_se, activation, stride
        bneck_conf(16, 3, 16, 16, True, "RE", 2),  # C1
        bneck_conf(16, 3, 72, 24, False, "RE", 2),  # C2
        bneck_conf(24, 3, 88, 24, False, "RE", 1),
        bneck_conf(24, 5, 96, 40, True, "HS", 2),  # C3
        bneck_conf(40, 5, 240, 40, True, "HS", 1),
        bneck_conf(40, 5, 240, 40, True, "HS", 1),
        bneck_conf(40, 5, 120, 48, True, "HS", 1),
        bneck_conf(48, 5, 144, 48, True, "HS", 1),
        bneck_conf(48, 5, 288, 96 // reduce_divider, True, "HS", 2),  # C4
        bneck_conf(96 // reduce_divider, 5, 576 // reduce_divider, 96 // reduce_divider, True, "HS", 1),
        bneck_conf(96 // reduce_divider, 5, 576 // reduce_divider, 96 // reduce_divider, True, "HS", 1)
    ]
    last_channel = adjust_channels(1024 // reduce_divider)  # C5

    return MobileNetV3(inverted_residual_setting=inverted_residual_setting,
                       last_channel=last_channel,
                       num_classes=num_classes)

if __name__ == "__main__":
    model = mobilenet_v3_small()
    print(model)

    from torchsummaryX import summary
    summary(model, torch.randn(1,3,224,224))

Output:

MobileNetV3(
  (features): Sequential(
    (0): ConvBNActivation(
      (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (1): BatchNorm2d(16, eps=0.001, momentum=0.01, affine=True, track_running_stats=True)
      (2): Hardswish()
    )
    (1): InvertedResidual(
...
(classifier): Sequential(
    (0): Linear(in_features=576, out_features=1024, bias=True)
    (1): Hardswish()
    (2): Dropout(p=0.2, inplace=True)
    (3): Linear(in_features=1024, out_features=1000, bias=True)
  )
)
================================================================================================
                                           Kernel Shape       Output Shape  \
Layer
0_features.0.Conv2d_0                     [3, 16, 3, 3]  [1, 16, 112, 112]
1_features.0.BatchNorm2d_1                         [16]  [1, 16, 112, 112]
...
123_classifier.Dropout_2                      -          -
124_classifier.Linear_3                  1.025M     1.024M
------------------------------------------------------------------------------------------------
                          Totals
Total params           2.542856M
Trainable params       2.542856M
Non-trainable params         0.0
Mult-Adds             56.516456M
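The channel-rounding helper is easiest to understand from a few worked values. The sketch below repeats the logic of _make_divisible from the listing under the name make_divisible so that it runs on its own. The width multipliers 0.75 and 0.5 are example values chosen for illustration; the listing itself only uses 1.0.

```python
def make_divisible(ch, divisor=8, min_ch=None):
    # Same logic as _make_divisible in the listing above.
    if min_ch is None:
        min_ch = divisor
    new_ch = max(min_ch, int(ch + divisor / 2) // divisor * divisor)
    # Never round down by more than 10%.
    if new_ch < 0.9 * ch:
        new_ch += divisor
    return new_ch


base_channels = (16, 24, 40, 80, 112, 160)
for width_multi in (1.0, 0.75, 0.5):
    print(width_multi, [make_divisible(c * width_multi) for c in base_channels])

# SE squeeze channels for a 72-channel input: 72 // 4 = 18, which rounds to 16,
# but 16 < 0.9 * 18, so the result is bumped up to 24.
print(make_divisible(72 // 4))  # 24
```

The 10% rule is what turns 18 into 24 rather than 16, so the rounded value is not always the nearest multiple of 8.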

5 Acknowledgements

https://www.bilibili.com/video/BV1GK4y1p7uE/?spm_id_from=333.788
https://blog.csdn.net/m0_48742971/article/details/123438626
https://www.bilibili.com/video/BV1zT4y1P7pd/?spm_id_from=333.788