2020-10-22

2023-05-16

SSD Keras Code Walkthrough

1. Model Construction
1.1 Key Parameters

aspect_ratios_per_layer=[[1.0, 2.0, 0.5],
                        [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                        [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                        [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                        [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                        [1.0, 2.0, 0.5],
                        [1.0, 2.0, 0.5]]
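aspect_ratios = aspect_ratios_per_layer
# n_boxes holds the number of prior boxes per cell for each predictor layer.
# two_boxes_for_ar1 is an argument of the model-building function.
n_boxes = []
for ar in aspect_ratios_per_layer:
    if (1 in ar) and two_boxes_for_ar1:
        # +1 for the second box for aspect ratio 1
        # -> [3+1, 5+1, 5+1, 5+1, 5+1, 3+1, 3+1]
        n_boxes.append(len(ar) + 1)
    else:
        n_boxes.append(len(ar))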

# The number of predictor conv layers in the network is 7 for the original SSD512                        
n_predictor_layers = 7
# Account for the background class.
n_classes += 1
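
As a quick check of the box counts above, the following standalone sketch recomputes the number of prior boxes per cell. The helper name `count_boxes_per_cell` is hypothetical and not part of the original code base.

```python
# Hypothetical helper (not from the original code): count the prior boxes
# per feature-map cell for each predictor layer.
def count_boxes_per_cell(aspect_ratios_per_layer, two_boxes_for_ar1=True):
    n_boxes = []
    for ar in aspect_ratios_per_layer:
        # A second, larger box is added for aspect ratio 1 when requested.
        extra = 1 if (1 in ar and two_boxes_for_ar1) else 0
        n_boxes.append(len(ar) + extra)
    return n_boxes


aspect_ratios_per_layer = [[1.0, 2.0, 0.5],
                           [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0],
                           [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0],
                           [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0],
                           [1.0, 2.0, 0.5, 3.0, 1.0 / 3.0],
                           [1.0, 2.0, 0.5],
                           [1.0, 2.0, 0.5]]

print(count_boxes_per_cell(aspect_ratios_per_layer))         # [4, 6, 6, 6, 6, 4, 4]
print(count_boxes_per_cell(aspect_ratios_per_layer, False))  # [3, 5, 5, 5, 5, 3, 3]
```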

1.2 VGG Base Network

conv1_1 = Conv2D(64, (3, 3), activation='relu', padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(l2_reg), name='conv1_1')(x1)
conv1_2 = Conv2D(64, (3, 3), activation='relu', padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(l2_reg), name='conv1_2')(conv1_1)
pool1 = MaxPooling2D(pool_size=(2, 2), strides=(2, 2), padding='same', name='pool1')(conv1_2)
...
conv10_1 = Conv2D(128, (1, 1), activation='relu', padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(l2_reg), name='conv10_1')(conv9_2)
conv10_1 = ZeroPadding2D(padding=((1, 1), (1, 1)), name='conv10_padding')(conv10_1)
conv10_2 = Conv2D(256, (4, 4), strides=(1, 1), activation='relu', padding='valid', kernel_initializer='he_normal', kernel_regularizer=l2(l2_reg), name='conv10_2')(conv10_1)
...

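1.3 Additional Detection Layers

1.3.1 Confidence Layers
If there are $c$ object classes to detect, SSD actually predicts $c+1$ confidence values per box. The first one is the score for "no object", i.e. background. From here on, whenever we speak of $c$ class confidences, keep in mind that this count already includes the special background class, so there are only $c-1$ real object classes. At prediction time, the class with the highest confidence is the class assigned to the bounding box. In particular, if the first confidence value is the highest, the box contains no object.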
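
The class-selection rule can be illustrated with a small NumPy sketch. The scores below are made up for illustration, and index 0 is taken to be the background class, as described above.

```python
import numpy as np

# Made-up softmax scores for 3 boxes with 4 confidences each
# (index 0 = background, indices 1..3 = real object classes).
conf = np.array([[0.70, 0.10, 0.10, 0.10],
                 [0.05, 0.80, 0.10, 0.05],
                 [0.20, 0.10, 0.10, 0.60]])

class_ids = conf.argmax(axis=-1)  # class with the highest confidence per box
is_object = class_ids != 0        # a box whose best class is 0 holds no object

print(class_ids.tolist())  # [0, 1, 3]
print(is_object.tolist())  # [False, True, True]
```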

conv4_3_norm_mbox_conf = Conv2D(n_boxes[0] * n_classes, (3, 3), padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(l2_reg), name='conv4_3_norm_mbox_conf')(conv4_3_norm)
...
conv10_2_mbox_conf = Conv2D(n_boxes[6] * n_classes, (3, 3), padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(l2_reg), name='conv10_2_mbox_conf')(conv10_2)

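1.3.2 Localization Layers

For a feature map of size $m \times n$ there are $mn$ cells. Let $k$ be the number of prior boxes set per cell, i.e. n_boxes[…]. Each cell then needs $(c+4) \times k$ predicted values, and all cells together need $(c+4) \times kmn$ predicted values. Because SSD performs detection with convolutions, $(c+4) \times k$ convolution filters are needed to run detection on this feature map: $c \times k$ in the confidence layer and $4 \times k$ in the localization layer.

n_boxes = [4, 6, 6, 6, 6, 4, 4]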
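
The filter counts can be worked out per predictor layer. In this sketch `n_classes = 21` is an assumption (20 Pascal VOC classes plus background); the value in an actual run depends on the dataset.

```python
n_boxes = [4, 6, 6, 6, 6, 4, 4]
n_classes = 21  # assumption: 20 object classes + 1 background class

# (confidence filters, localization filters) for each predictor layer
filters = [(k * n_classes, k * 4) for k in n_boxes]

print(filters[0])  # (84, 16)   -> 100 = (21 + 4) * 4 filters in total
print(filters[1])  # (126, 24)  -> 150 = (21 + 4) * 6 filters in total
```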

Once the feature maps are available, each of them is convolved to obtain the detection results. The figure below shows the detection process for a $5 \times 5$ feature map, where Priorbox generates the prior boxes. The detection output has two parts, class confidences and bounding-box locations, each produced by a single $3 \times 3$ convolution. Every prior box predicts one bounding box, so SSD512 predicts a total of

$$64 \times 64 \times 4 + 32 \times 32 \times 6 + 16 \times 16 \times 6 + 8 \times 8 \times 6 + 4 \times 4 \times 6 + 2 \times 2 \times 4 + 1 \times 1 \times 4 = 24564$$

bounding boxes, which is why SSD is essentially a dense sampling method.
[Figure: detection process on a 5 × 5 feature map]
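
The total for SSD512 can be verified with a few lines of Python, using the feature-map sizes and per-cell box counts that appear in the sum for SSD512:

```python
feature_map_sizes = [64, 32, 16, 8, 4, 2, 1]  # side length of each square feature map
n_boxes = [4, 6, 6, 6, 6, 4, 4]               # prior boxes per cell

total_boxes = sum(s * s * k for s, k in zip(feature_map_sizes, n_boxes))
print(total_boxes)  # 24564
```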

conv4_3_norm_mbox_loc = Conv2D(n_boxes[0] * 4, (3, 3), padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(l2_reg), name='conv4_3_norm_mbox_loc')(conv4_3_norm)
...
conv10_2_mbox_loc = Conv2D(n_boxes[6] * 4, (3, 3), padding='same', kernel_initializer='he_normal', kernel_regularizer=l2(l2_reg), name='conv10_2_mbox_loc')(conv10_2)

1.3.3 Prior Boxes

# Output shape of anchors: `(batch, height, width, n_boxes, 8)`
# The last axis contains the four anchor box coordinates
# and the four variance values for each box.
# The prior boxes are constants; they are computed as follows (see the call() method of the AnchorBoxes class)
'''
Note that this tensor does not participate in any graph computations at runtime. It is being created
as a constant once during graph creation and is just being output along with the rest of the model output
during runtime. Because of this, all logic is implemented as Numpy array operations and it is sufficient
to convert the resulting Numpy array into a Keras tensor at the very end before outputting it.
'''

# If `scales` is given explicitly, it is used as-is, e.g.:
# scales = [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05]
# Otherwise, the scales increase linearly from min_scale to max_scale:
# scales = np.linspace(min_scale, max_scale, n_predictor_layers+1)

#two_boxes_for_ar1=True
conv4_3_norm_mbox_priorbox = AnchorBoxes(img_height, img_width, this_scale=scales[0], next_scale=scales[1], aspect_ratios=aspect_ratios[0],
                                             two_boxes_for_ar1=two_boxes_for_ar1, this_steps=steps[0], this_offsets=offsets[0], clip_boxes=clip_boxes,
                                             variances=variances, coords=coords, normalize_coords=normalize_coords, name='conv4_3_norm_mbox_priorbox')(conv4_3_norm_mbox_loc)
...
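
To make the role of `this_scale`, `next_scale` and `aspect_ratios` more concrete, here is a minimal sketch of the box-shape rule from the SSD paper ($w = s\sqrt{a}$, $h = s/\sqrt{a}$, plus an extra box of scale $\sqrt{s_k s_{k+1}}$ for aspect ratio 1). The function `anchor_wh` is hypothetical and is not the code of the `AnchorBoxes` layer; `min_scale = 0.1` and `max_scale = 0.9` are assumed values used only to show the `np.linspace` fallback.

```python
import numpy as np


def anchor_wh(this_scale, next_scale, aspect_ratios, size, two_boxes_for_ar1=True):
    """Return an array of (width, height) pairs in pixels for one predictor layer."""
    wh = []
    for ar in aspect_ratios:
        if ar == 1:
            # Square box at the layer's own scale.
            wh.append((this_scale * size, this_scale * size))
            if two_boxes_for_ar1:
                # Second square box at the geometric mean of this and the next scale.
                s = np.sqrt(this_scale * next_scale) * size
                wh.append((s, s))
        else:
            wh.append((this_scale * size * np.sqrt(ar),
                       this_scale * size / np.sqrt(ar)))
    return np.array(wh)


scales = [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05]
boxes = anchor_wh(scales[0], scales[1], [1.0, 2.0, 0.5], size=512)
print(boxes.shape)  # (4, 2)

# Fallback when no explicit scales are given (assumed min/max values).
min_scale, max_scale, n_predictor_layers = 0.1, 0.9, 7
scales_linear = np.linspace(min_scale, max_scale, n_predictor_layers + 1)
print(len(scales_linear))  # 8
```

Note that one more scale than predictor layers is needed, because the last layer still requires a `next_scale` for its second aspect-ratio-1 box.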

1.3.4 Reshape Layers

# Reshape the class predictions, yielding 3D tensors of shape `(batch, height * width * n_boxes, n_classes)`
# We want the classes isolated in the last axis to perform softmax on them
conv4_3_norm_mbox_conf_reshape = Reshape((-1, n_classes), name='conv4_3_norm_mbox_conf_reshape')(conv4_3_norm_mbox_conf)

# Reshape the box predictions, yielding 3D tensors of shape `(batch, height * width * n_boxes, 4)`
# We want the four box coordinates isolated in the last axis to compute the smooth L1 loss
conv4_3_norm_mbox_loc_reshape = Reshape((-1, 4), name='conv4_3_norm_mbox_loc_reshape')(conv4_3_norm_mbox_loc)

# Reshape the anchor box tensors, yielding 3D tensors of shape `(batch, height * width * n_boxes, 8)`
conv4_3_norm_mbox_priorbox_reshape = Reshape((-1, 8), name='conv4_3_norm_mbox_priorbox_reshape')(conv4_3_norm_mbox_priorbox)
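
The effect of these reshapes on the tensor shapes can be reproduced with plain NumPy. This is only a shape illustration with zero-filled arrays; `n_classes = 21` and the batch size are assumed values.

```python
import numpy as np

batch, height, width = 2, 64, 64
n_boxes, n_classes = 4, 21  # n_classes is an assumption (incl. background)

# Shapes as produced by the conf and loc convolutions of the first predictor layer.
conf = np.zeros((batch, height, width, n_boxes * n_classes))
loc = np.zeros((batch, height, width, n_boxes * 4))

# Equivalent of Reshape((-1, n_classes)) and Reshape((-1, 4)) on the non-batch axes.
conf_reshaped = conf.reshape(batch, -1, n_classes)
loc_reshaped = loc.reshape(batch, -1, 4)

print(conf_reshaped.shape)  # (2, 16384, 21)
print(loc_reshaped.shape)   # (2, 16384, 4)
```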

1.3.5 Concatenating the Outputs

# Concatenate the 7 conf layers
# Output shape of `mbox_conf`: (batch, n_boxes_total, n_classes)
mbox_conf = Concatenate(axis=1, name='mbox_conf')([conv4_3_norm_mbox_conf_reshape,
                                                   fc7_mbox_conf_reshape,
                                                   conv6_2_mbox_conf_reshape,
                                                   conv7_2_mbox_conf_reshape,
                                                   conv8_2_mbox_conf_reshape,
                                                   conv9_2_mbox_conf_reshape,
                                                   conv10_2_mbox_conf_reshape])

# Concatenate the 7 loc layers
# Output shape of `mbox_loc`: (batch, n_boxes_total, 4)
mbox_loc = Concatenate(axis=1, name='mbox_loc')([conv4_3_norm_mbox_loc_reshape,
                                                 fc7_mbox_loc_reshape,
                                                 conv6_2_mbox_loc_reshape,
                                                 conv7_2_mbox_loc_reshape,
                                                 conv8_2_mbox_loc_reshape,
                                                 conv9_2_mbox_loc_reshape,
                                                 conv10_2_mbox_loc_reshape])  
# Concatenate the 7 priorbox layers
# Output shape of `mbox_priorbox`: (batch, n_boxes_total, 8)
mbox_priorbox = Concatenate(axis=1, name='mbox_priorbox')([conv4_3_norm_mbox_priorbox_reshape,
                                                           fc7_mbox_priorbox_reshape,
                                                           conv6_2_mbox_priorbox_reshape,
                                                           conv7_2_mbox_priorbox_reshape,
                                                           conv8_2_mbox_priorbox_reshape,
                                                           conv9_2_mbox_priorbox_reshape,
                                                           conv10_2_mbox_priorbox_reshape])  
           
# Add the softmax layer
mbox_conf_softmax = Activation('softmax', name='mbox_conf_softmax')(mbox_conf)

# Output shape of `predictions`: (batch, n_boxes_total, n_classes + 4 + 8)
# Concatenate all output layers
predictions = Concatenate(axis=2, name='predictions')([mbox_conf_softmax, mbox_loc, mbox_priorbox])
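
The two concatenation steps (along the box axis, then along the last axis) can be mimicked in NumPy to check the final shape. Only two predictor layers are used for brevity, and `n_classes = 21` is an assumption.

```python
import numpy as np

batch, n_classes = 1, 21                       # n_classes is an assumption
boxes_per_layer = [64 * 64 * 4, 32 * 32 * 6]   # first two SSD512 predictor layers


def concat_layers(last_dim):
    """Concatenate per-layer tensors of shape (batch, n, last_dim) along the box axis."""
    return np.concatenate(
        [np.zeros((batch, n, last_dim)) for n in boxes_per_layer], axis=1)


mbox_conf = concat_layers(n_classes)
mbox_loc = concat_layers(4)
mbox_priorbox = concat_layers(8)

# Final concatenation along the last axis: n_classes + 4 + 8 values per box.
predictions = np.concatenate([mbox_conf, mbox_loc, mbox_priorbox], axis=2)
print(predictions.shape)  # (1, 22528, 33)
```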

1.4 Creating the Model

# In training mode, the model outputs `predictions` directly (the same format as the training labels);
# the box coordinates (loc) remain in encoded form
if mode == 'training':
    model = Model(inputs=x, outputs=predictions)

# In inference mode, the encoded predictions are additionally decoded
elif mode == 'inference':
    # Output: 3D tensor of shape `(batch_size, top_k, 6)`.
    # The last axis contains the coordinates for each predicted box in the format
    # [class_id, confidence, xmin, ymin, xmax, ymax]
    decoded_predictions = DecodeDetections(confidence_thresh=confidence_thresh,
                                           iou_threshold=iou_threshold,
                                           top_k=top_k,
                                           nms_max_output_size=nms_max_output_size,
                                           coords=coords,
                                           normalize_coords=normalize_coords,
                                           img_height=img_height,
                                           img_width=img_width,
                                           name='decoded_predictions')(predictions)
    model = Model(inputs=x, outputs=decoded_predictions)

1.5 Encoding and Decoding
A bounding box location consists of four values $(cx, cy, w, h)$: the center coordinates and the width and height of the box. What the network actually predicts, however, is only the transformation of the bounding box relative to its prior box (the paper calls it an offset, R-CNN calls it a transformation). Let the prior box be $d = \left(d^{cx}, d^{cy}, d^{w}, d^{h}\right)$ and the corresponding bounding box be $b = \left(b^{cx}, b^{cy}, b^{w}, b^{h}\right)$. The predicted value $l$ is then the transformation of $b$ relative to $d$:

$$l^{cx} = \left(b^{cx} - d^{cx}\right) / d^{w}, \quad l^{cy} = \left(b^{cy} - d^{cy}\right) / d^{h}$$
$$l^{w} = \log\left(b^{w} / d^{w}\right), \quad l^{h} = \log\left(b^{h} / d^{h}\right)$$

This process is conventionally called encoding the bounding box. At prediction time the process has to be reversed, i.e. decoded, to recover the true box position $b$ from the predicted value $l$:

$$b^{cx} = d^{w} l^{cx} + d^{cx}, \quad b^{cy} = d^{h} l^{cy} + d^{cy}$$
$$b^{w} = d^{w} \exp\left(l^{w}\right), \quad b^{h} = d^{h} \exp\left(l^{h}\right)$$

The Caffe source implementation of SSD uses one more trick: a variance hyperparameter that scales the predicted values. Decoding then becomes:

$$b^{cx} = d^{w}\left(\text{variance}[0] \cdot l^{cx}\right) + d^{cx}, \quad b^{cy} = d^{h}\left(\text{variance}[1] \cdot l^{cy}\right) + d^{cy}$$
$$b^{w} = d^{w} \exp\left(\text{variance}[2] \cdot l^{w}\right), \quad b^{h} = d^{h} \exp\left(\text{variance}[3] \cdot l^{h}\right)$$
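
The encoding and decoding with variances can be sketched in NumPy as follows. This is a minimal illustration of the formulas in this section, not the repository's implementation: the function names `encode` and `decode` are hypothetical, the encoder is simply the algebraic inverse of the decoding formulas, and the variance values `[0.1, 0.1, 0.2, 0.2]` are an assumption.

```python
import numpy as np


def encode(b, d, variances):
    """Encode boxes b = (cx, cy, w, h) relative to prior boxes d."""
    b, d, v = (np.asarray(a, dtype=float) for a in (b, d, variances))
    l = np.empty_like(b)
    l[..., 0] = (b[..., 0] - d[..., 0]) / d[..., 2] / v[0]
    l[..., 1] = (b[..., 1] - d[..., 1]) / d[..., 3] / v[1]
    l[..., 2] = np.log(b[..., 2] / d[..., 2]) / v[2]
    l[..., 3] = np.log(b[..., 3] / d[..., 3]) / v[3]
    return l


def decode(l, d, variances):
    """Invert encode(): recover boxes b from predictions l and prior boxes d."""
    l, d, v = (np.asarray(a, dtype=float) for a in (l, d, variances))
    b = np.empty_like(l)
    b[..., 0] = d[..., 2] * (v[0] * l[..., 0]) + d[..., 0]
    b[..., 1] = d[..., 3] * (v[1] * l[..., 1]) + d[..., 1]
    b[..., 2] = d[..., 2] * np.exp(v[2] * l[..., 2])
    b[..., 3] = d[..., 3] * np.exp(v[3] * l[..., 3])
    return b


variances = [0.1, 0.1, 0.2, 0.2]   # assumed values
d = [100.0, 100.0, 50.0, 80.0]     # prior box (cx, cy, w, h)
b = [110.0, 96.0, 60.0, 72.0]      # ground-truth box (cx, cy, w, h)

l = encode(b, d, variances)
b_roundtrip = decode(l, d, variances)
print(np.allclose(b_roundtrip, b))  # True
```

Dividing by the variances during encoding enlarges the regression targets, and the multiplication in the decoder undoes that scaling.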
