Udacity Deep Learning Project 2: Image Classification Walkthrough

2023-05-16

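This project builds a simple convolutional neural network (CNN) to classify images from the CIFAR-10 dataset. This post records a few points to watch out for along the way.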

1. Data preprocessing: the CIFAR-10 image data must first be normalized, and the label data must be one-hot encoded.
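
The post does not show the normalization code. A minimal sketch, assuming the images arrive as 8-bit pixel values in [0, 255] and that plain scaling to [0, 1] is what is wanted; the function name `normalize` and the random test batch are illustrative, not taken from the project:

```python
import numpy as np


def normalize(x):
    """Scale 8-bit image data from [0, 255] to [0.0, 1.0]."""
    return np.asarray(x, dtype=np.float32) / 255.0


# Hypothetical batch: two random 32x32 RGB images with 8-bit pixel values
images = np.random.randint(0, 256, size=(2, 32, 32, 3), dtype=np.uint8)
scaled = normalize(images)
print(scaled.shape, scaled.min(), scaled.max())
```

Dividing by the fixed maximum of 255, rather than by the maximum of each batch, keeps the scaling identical across training, validation, and test data.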

One-hot encoding can be done with the LabelBinarizer class from sklearn's preprocessing module, or with numpy's eye function:

import numpy as np


def one_hot_encode(x):
    """
    One-hot encode a list of sample labels. Return a one-hot encoded vector for each label.
    : x: List of sample labels
    : return: Numpy array of one-hot encoded labels
    """
    n_classes = 10  # CIFAR-10 has 10 classes
    # Row i of the identity matrix is the one-hot vector for label i
    one_hot_labels = np.eye(n_classes)[x]
    return one_hot_labels
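
To make the `np.eye` trick concrete, here is a small standalone check. The label values are made up for illustration:

```python
import numpy as np

labels = [3, 0, 9]            # hypothetical CIFAR-10 class indices
encoded = np.eye(10)[labels]  # picks rows 3, 0 and 9 of the 10x10 identity matrix

print(encoded.shape)           # (3, 10)
print(encoded.argmax(axis=1))  # [3 0 9]
```

Indexing the identity matrix with a list of labels selects one row per label, and each row has a single 1 at the label's position.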

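2. The code for the convolution, max pooling, flatten, fully connected, and output layers is shown below. It uses the TensorFlow 1.x API and assumes tensorflow is imported as tf and numpy as np: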

def conv2d_maxpool(x_tensor, conv_num_outputs, conv_ksize, conv_strides, pool_ksize, pool_strides):
    """
    Apply convolution then max pooling to x_tensor
    :param x_tensor: TensorFlow Tensor
    :param conv_num_outputs: Number of outputs for the convolutional layer
    :param conv_ksize: kernel size 2-D Tuple for the convolutional layer
    :param conv_strides: Stride 2-D Tuple for convolution
    :param pool_ksize: kernel size 2-D Tuple for pool
    :param pool_strides: Stride 2-D Tuple for pool
    : return: A tensor that represents convolution and max pooling of x_tensor
    """
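    # Input shape is [batch, height, width, channels]; the last entry is the input depth
    depth = x_tensor.get_shape().as_list()
    padding = 'SAME'
    # Filter shape: [height, width, input depth, output depth]
    conv_ksize2 = [conv_ksize[0], conv_ksize[1], depth[-1], conv_num_outputs]
    # Strides and pooling windows use the NHWC layout: [batch, height, width, channels]
    conv_strides2 = [1, conv_strides[0], conv_strides[1], 1]
    pool_ksize2 = [1, pool_ksize[0], pool_ksize[1], 1]
    pool_strides2 = [1, pool_strides[0], pool_strides[1], 1]
    # stddev is set to 0.1 instead of the default 1.0
    filter_weights = tf.Variable(tf.truncated_normal(conv_ksize2, mean=0, stddev=0.1))
    filter_bias = tf.Variable(tf.zeros(conv_num_outputs))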

    filter_output = tf.nn.conv2d(x_tensor, filter_weights, conv_strides2, padding)
    filter_output = tf.nn.bias_add(filter_output, filter_bias)

    filter_output = tf.nn.relu(filter_output)

    filter_output = tf.nn.max_pool(filter_output, pool_ksize2, pool_strides2, padding)
    #print (filter_output.get_shape().as_list())
    return filter_output


def flatten(x_tensor):
    """
    Flatten x_tensor to (Batch Size, Flattened Image Size)
    : x_tensor: A tensor of size (Batch Size, ...), where ... are the image dimensions.
    : return: A tensor of size (Batch Size, Flattened Image Size).
    """
    # TODO: Implement Function
    flattened_image_size = np.prod(x_tensor.get_shape().as_list()[1:])
    flat_inputs = tf.reshape(x_tensor,[-1,flattened_image_size])

    #flat_inputs = tf.contrib.layers.flatten(x_tensor)
    return flat_inputs


def fully_conn(x_tensor, num_outputs):
    """
    Apply a fully connected layer to x_tensor using weight and bias
    : x_tensor: A 2-D tensor where the first dimension is batch size.
    : num_outputs: The number of outputs that the new tensor should have.
    : return: A 2-D tensor where the second dimension is num_outputs.
    """
    #output = tf.contrib.layers.fully_connected(x_tensor, num_outputs)
    n_inputs = x_tensor.get_shape().as_list()[-1]
    weights = tf.Variable(tf.truncated_normal([n_inputs, num_outputs], mean=0, stddev=0.1))
    bias = tf.Variable(tf.zeros(num_outputs))
    # Affine transform followed by ReLU
    return tf.nn.relu(tf.add(tf.matmul(x_tensor, weights), bias))


def output(x_tensor, num_outputs):
    """
    Apply an output layer to x_tensor using weight and bias
    : x_tensor: A 2-D tensor where the first dimension is batch size.
    : num_outputs: The number of outputs that the new tensor should have.
    : return: A 2-D tensor where the second dimension is num_outputs.
    """
    # TODO: Implement Function
    image_shape = x_tensor.get_shape().as_list()[1]
    weights = tf.Variable(tf.truncated_normal([image_shape, num_outputs],0,0.01))
    bias = tf.Variable(tf.zeros(num_outputs))
    outputs = tf.add(tf.matmul(x_tensor, weights),bias)
    return outputs
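
All the convolution and pooling calls above use 'SAME' padding. In TensorFlow that means the spatial output size is the input size divided by the stride, rounded up, regardless of the kernel size. The helper names below are illustrative, and the 'VALID' variant is included only for comparison:

```python
import math


def same_output_size(in_size, stride):
    """Spatial output size of a conv or pool layer with 'SAME' padding."""
    return math.ceil(in_size / stride)


def valid_output_size(in_size, ksize, stride):
    """Spatial output size with 'VALID' padding, shown for comparison."""
    return math.ceil((in_size - ksize + 1) / stride)


print(same_output_size(32, 1))      # 32
print(same_output_size(32, 2))      # 16
print(valid_output_size(32, 4, 1))  # 29
```

With a stride of 1 and 'SAME' padding, a layer keeps the 32x32 spatial size of a CIFAR-10 image unchanged.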

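3. Building the model and choosing parameters
The hard part of this project is choosing the model's parameters. They include:
    Filter size and stride of the convolutional layer:
        After repeated testing, a 4*4 filter with a stride of 1 worked well.

    Size and stride of the max pooling:
        A pooling size of 8*8 with a stride of 1 worked well.

    Output size of the fully connected layer:
        A fully connected layer with 384 outputs worked well.

    keep-probability:
        In my tests, a somewhat smaller keep-probability did a better job of preventing overfitting. A keep-probability that is too small, however, slows training down and makes the model's predictions inaccurate.

    Weight initialization for each layer:
        Early on I ran into a big problem: the initial parameters were set badly, so convergence was too slow. When I first noticed the slow convergence, my approach was to raise the optimizer's learning rate: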

optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)

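The results were still poor, though. After some reading, I found that the fix is to adjust the standard deviation of the normal distribution used to initialize the variables, changing it from the default of 1 to 0.1 or 0.01.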
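
One way to see why the standard deviation matters is to look at the size of the values a layer produces right after initialization. The sketch below is a rough NumPy illustration, not the project's code: it uses plain (not truncated) normal draws, a random non-negative vector as a stand-in for real activations, and the layer sizes 18432 and 384 that correspond to the network in this post.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_outputs = 18432, 384   # flattened 32*32*18 features -> 384 units
x = rng.random((1, n_inputs))      # stand-in for one non-negative activation vector

spreads = {}
for stddev in (1.0, 0.1, 0.01):
    weights = rng.normal(0.0, stddev, size=(n_inputs, n_outputs))
    pre_activation = x @ weights
    spreads[stddev] = float(pre_activation.std())
    print(f"stddev={stddev}: pre-activation std = {spreads[stddev]:.2f}")
```

The spread of the layer's outputs scales linearly with the weight standard deviation. With a standard deviation of 1 and thousands of inputs, the initial outputs are very large, which is one plausible explanation for the slow convergence described above.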

The final code for building the network is as follows:

def conv_net(x, keep_prob):
    """
    Create a convolutional neural network model
    : x: Placeholder tensor that holds image data.
    : keep_prob: Placeholder tensor that holds dropout keep probability.
    : return: Tensor that represents logits
    """
    # TODO: Apply 1, 2, or 3 Convolution and Max Pool layers
    #    Play around with different number of outputs, kernel size and stride
    # Function Definition from Above:
    #    conv2d_maxpool(x_tensor, conv_num_outputs, conv_ksize, conv_strides, pool_ksize, pool_strides)

    output1 = conv2d_maxpool(x, 18, (4,4),(1,1),(8,8),(1,1))
    output1 = tf.nn.dropout(output1, keep_prob)

    #output2 = conv2d_maxpool(output1,200,(2,2),(2,2),(2,2),(2,2))
    #output2 = tf.nn.dropout(output2, keep_prob)


    # TODO: Apply a Flatten Layer
    # Function Definition from Above:
    #   flatten(x_tensor)
    output3 = flatten(output1)


    # TODO: Apply 1, 2, or 3 Fully Connected Layers
    #    Play around with different number of outputs
    # Function Definition from Above:
    #   fully_conn(x_tensor, num_outputs)
    output4 = fully_conn(output3, 384)
    #output5 = fully_conn(output4,50)
    output4 = tf.nn.dropout(output4, keep_prob)

    # TODO: Apply an Output Layer
    #    Set this to the number of classes
    # Function Definition from Above:
    #   output(x_tensor, num_outputs)
    logits_output = output(output4, 10)


    # TODO: return output
    return logits_output
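
As a sanity check on the architecture, the tensor sizes and parameter counts can be worked out by hand. The sketch below does the arithmetic in plain Python, assuming 32x32 RGB CIFAR-10 inputs and the layer settings used in conv_net; the helper `same_size` is illustrative:

```python
import math


def same_size(in_size, stride):
    # 'SAME' padding: output size depends only on input size and stride
    return math.ceil(in_size / stride)


height = width = 32          # CIFAR-10 images are 32x32 RGB
in_depth, conv_depth = 3, 18

# conv 4x4, stride 1, then max pool 8x8, stride 1, both with 'SAME' padding
height, width = same_size(height, 1), same_size(width, 1)
height, width = same_size(height, 1), same_size(width, 1)

flat_size = height * width * conv_depth
conv_params = 4 * 4 * in_depth * conv_depth + conv_depth
fc_params = flat_size * 384 + 384
out_params = 384 * 10 + 10

print(flat_size)    # 18432
print(conv_params)  # 882
print(fc_params)    # 7078272
print(out_params)   # 3850
```

Because the pooling stride is 1, the feature map is never downsampled, so nearly all of the parameters sit in the fully connected layer.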

4. Displaying the results:
Note that when printing the results, accuracy should be computed on the validation dataset.

def print_stats(session, feature_batch, label_batch, cost, accuracy):
    """
    Print information about loss and validation accuracy
    : session: Current TensorFlow session
    : feature_batch: Batch of Numpy image data
    : label_batch: Batch of Numpy label data
    : cost: TensorFlow cost function
    : accuracy: TensorFlow accuracy function
    """
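    # x, y, keep_prob, valid_features and valid_labels are not parameters of this
    # function; they are expected to be defined at module level.
    # Dropout is disabled (keep_prob = 1.0) when evaluating.
    loss = session.run(cost, feed_dict={x: feature_batch, y: label_batch, keep_prob: 1.0})
    valid_acc = session.run(accuracy, feed_dict={
                x: valid_features,
                y: valid_labels,
                keep_prob: 1.0})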
    print('Loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                loss,
                valid_acc))
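
The `accuracy` tensor passed to print_stats is defined elsewhere in the project and is not shown in this post. Assuming it is the usual classification accuracy (the fraction of samples whose largest logit matches the one-hot label), the computation looks like this in NumPy; the function name and the sample values are made up for illustration:

```python
import numpy as np


def accuracy_from_logits(logits, one_hot_labels):
    """Fraction of rows whose largest logit matches the one-hot label."""
    predicted = np.argmax(logits, axis=1)
    actual = np.argmax(one_hot_labels, axis=1)
    return float(np.mean(predicted == actual))


# Hypothetical logits for 4 samples and 3 classes
logits = np.array([[2.0, 0.1, -1.0],
                   [0.3, 0.2, 1.5],
                   [0.0, 3.0, 0.5],
                   [1.0, 0.9, 0.8]])
labels = np.eye(3)[[0, 2, 1, 2]]

print(accuracy_from_logits(logits, labels))  # 0.75
```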

In the end, accuracy was about 63% with a single batch and about 70% with all five batches, which is a fairly satisfying result.
