Udacity Self-Driving Car - Project 2: Traffic_Sign_Classifier

2023-05-16

The goal of this project is to use a convolutional neural network (CNN) to classify common traffic signs. CNNs have comprehensively surpassed traditional machine-learning approaches to computer vision (SVC, hand-crafted OpenCV features). Large amounts of data are what make deep learning accurate; when data is scarce, the existing data can be artificially perturbed to improve recognition accuracy.

  • Import the required packages (pickle, numpy, cv2, matplotlib, sklearn, tensorflow, Keras)
# Load pickled data
import pickle
import cv2
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split
from keras.preprocessing.image import ImageDataGenerator
import tensorflow as tf
from tensorflow.contrib.layers import flatten  # TF 1.x API
### Load the images and plot them here.
import os
# Visualizations will be shown in the notebook.
%matplotlib inline
  • Data source:
    Like most machine-learning tasks, a CNN needs a large amount of labeled data. The German Traffic Sign Dataset provides it for this project, together with published results to compare against; the data can be downloaded here. Once unzipped, it can be loaded with Python:
training_file = '/Volumes/SSD/traffic-signs-data/train.p' # change it to your local dir
testing_file = '/Volumes/SSD/traffic-signs-data/test.p' # change it to your local dir

with open(training_file, mode='rb') as f:
    train = pickle.load(f)
with open(testing_file, mode='rb') as f:
    test = pickle.load(f)

X_train, y_train = train['features'], train['labels']
X_test, y_test = test['features'], test['labels']

# Number of training examples
n_train = X_train.shape[0]

# Number of testing examples.
n_test = X_test.shape[0]

# What's the shape of a traffic sign image?
image_shape = X_train[0].shape

# How many unique classes/labels are there in the dataset?
n_classes = len(set(y_train))

print("Number of training examples =", n_train)
print("Number of testing examples =", n_test)
print("Image data shape =", image_shape)
print("Number of classes =", n_classes)

Output:

Number of training examples = 39209
Number of testing examples = 12630
Image data shape = (32, 32, 3)
Number of classes = 43

From the above we can see there are 39209 training images and 12630 test images. 39209 images is not enough for training a CNN (upwards of 100,000 would be a more ideal amount), so a data-augmentation module will be added later to enlarge the dataset artificially. Each image is 32x32 with 3 channels, and there are 43 distinct labels in total.

We can also plot one randomly chosen image for each label.

# show a random sample from each class of the traffic sign dataset
rows, cols = 4, 12
fig, ax_array = plt.subplots(rows, cols)  # ax_array is an array of matplotlib axes
plt.suptitle('RANDOM SAMPLES FROM TRAINING SET (one for each class)')
for class_idx, ax in enumerate(ax_array.ravel()):
    if class_idx < n_classes:
        cur_X = X_train[y_train == class_idx]
        cur_img = cur_X[np.random.randint(len(cur_X))]
        ax.imshow(cur_img)
        ax.set_title('{:02d}'.format(class_idx))
    else:
        ax.axis('off')
# hide both x and y ticks
plt.setp([a.get_xticklabels() for a in ax_array.ravel()], visible=False)
plt.setp([a.get_yticklabels() for a in ax_array.ravel()], visible=False)

plt.draw()

[Figure: one random sample from each of the 43 classes]
We can also look at how the data is distributed:

# bar chart of classes distribution
train_distribution, test_distribution = np.zeros(n_classes), np.zeros(n_classes)
for c in range(n_classes):
    train_distribution[c] = np.sum(y_train == c) / n_train
    test_distribution[c] = np.sum(y_test == c) / n_test
fig, ax = plt.subplots()
col_width = 0.5
bar_train = ax.bar(np.arange(n_classes), train_distribution, width=col_width, color='r')
bar_test = ax.bar(np.arange(n_classes)+col_width, test_distribution, width=col_width, color='b')
ax.set_ylabel('PERCENTAGE OF PRESENCE')
ax.set_xlabel('CLASS LABEL')
ax.set_title('Classes distribution in traffic-sign dataset')
ax.set_xticks(np.arange(0, n_classes, 5) )
ax.set_xticklabels(['{:02d}'.format(c) for c in range(0, n_classes, 5)])
ax.legend((bar_train[0], bar_test[0]), ('train set', 'test set'))
plt.show()

[Figure: bar chart of class distribution in the train and test sets]

The figure above shows that the amount of data per class is far from uniform across the 43 classes. This introduces a bias in the CNN: it will tend to predict the classes that appear most frequently in the training data.
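
To put a number on this imbalance, here is a quick check (an illustrative snippet, not part of the original notebook) using np.bincount:

# count how many training samples each class has (illustrative, not in the original notebook)
class_counts = np.bincount(y_train, minlength=n_classes)
print("Most frequent class: {:02d} with {} samples".format(class_counts.argmax(), class_counts.max()))
print("Least frequent class: {:02d} with {} samples".format(class_counts.argmin(), class_counts.min()))
print("Imbalance ratio: {:.1f}x".format(class_counts.max() / float(class_counts.min())))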

  • Data preprocessing
    Following this paper [Sermanet, LeCun], converting the RGB images to YUV and keeping only the Y channel cuts the amount of computation without hurting accuracy. The features are then standardized to zero mean and unit standard deviation (computed across the whole training set).
def preprocess_features(X, equalize_hist=True):
    # convert from RGB to YUV and keep only the Y channel
    X = np.array([np.expand_dims(cv2.cvtColor(rgb_img, cv2.COLOR_RGB2YUV)[:, :, 0], 2) for rgb_img in X])

    # adjust image contrast
    if equalize_hist:
        X = np.array([np.expand_dims(cv2.equalizeHist(img), 2) for img in X])

    X = np.float32(X)

    # Standardize features
    X -= np.mean(X, axis=0)
    X /= (np.std(X, axis=0) + np.finfo('float32').eps)

    return X

X_train_norm = preprocess_features(X_train)
X_test_norm = preprocess_features(X_test)
  • Data augmentation
    For images, a moderate rotation, shift up/down/left/right, or zoom in/out should not change the label. Even though the pixel values end up quite different, our eyes can still recognize the sign, so this grows the dataset while helping the CNN generalize. Keras provides a convenient class for this: ImageDataGenerator(rotation_range=15., zoom_range=0.2, width_shift_range=0.1, height_shift_range=0.1). rotation_range sets the rotation angle, zoom_range the zoom factor, and width_shift_range / height_shift_range the horizontal and vertical shift fractions; it randomly perturbs the original images within those ranges and can generate an endless stream of new ones. Below I pick one image as a demonstration:
# create the generator to perform online data augmentation
image_datagen = ImageDataGenerator(rotation_range=15.,
                                   zoom_range=0.2,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1)

# take a random image from the training set
img_rgb = X_train[10]

# plot the original image
plt.figure(figsize=(1,1))
plt.imshow(img_rgb)
plt.title('Example of RGB image (class = {})'.format(y_train[10]))
plt.axis('off')
plt.show()

# plot some randomly augmented images
rows, cols = 4, 10
fig, ax_array = plt.subplots(rows, cols)
for ax in ax_array.ravel():
    augmented_img, _ = next(image_datagen.flow(np.expand_dims(img_rgb, 0), y_train[10:11]))
    ax.imshow(np.uint8(np.squeeze(augmented_img)))
plt.setp([a.get_xticklabels() for a in ax_array.ravel()], visible=False)
plt.setp([a.get_yticklabels() for a in ax_array.ravel()], visible=False)
plt.suptitle('Random examples of data augmentation (starting from the previous image)')
plt.show()

[Figure: the original image and a grid of randomly augmented variants]

  • Building the CNN
    Every layer has dropout added to prevent overfitting. The distinctive feature of this CNN is that the outputs of the two conv stages are merged: fc0 = tf.concat([flatten(drop1), flatten(drop2)], 1), which then connects to the fully connected layer and the output layer (43 classes). According to the paper, the benefit is that "the classifier is explicitly provided both the local 'motifs' (learned by conv1) and the more 'global' shapes and structure (learned by conv2) found in the features." My reading: the CNN can use both local and global views of the image as evidence, which improves accuracy.
n_classes = 43
def weight_variable(shape, mu=0, sigma=0.1):
    initialization = tf.truncated_normal(shape=shape, mean=mu, stddev=sigma)
    return tf.Variable(initialization)

def bias_variable(shape, start_val=0.1):
    initialization = tf.constant(start_val,shape=shape)
    return tf.Variable(initialization)

def conv2d(x, W, strides=[1,1,1,1], padding='SAME'):
    return tf.nn.conv2d(input=x, filter=W, strides=strides, padding=padding)

def max_pool2x2(x):
    return tf.nn.max_pool(value=x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME')

# network architecture definition
def my_net(x, n_classes):

    c1_out = 64
    conv1_W = weight_variable(shape=(3,3,1,c1_out))
    conv1_b = bias_variable(shape=(c1_out,))
    conv1 = tf.nn.relu(conv2d(x, conv1_W) + conv1_b)

    pool1 = max_pool2x2(conv1)

    drop1 = tf.nn.dropout(pool1, keep_prob=keep_prob)

    c2_out = 128
    conv2_W = weight_variable(shape=(3,3,c1_out, c2_out))
    conv2_b = bias_variable(shape=(c2_out,))
    conv2 = tf.nn.relu(conv2d(drop1, conv2_W) + conv2_b)

    pool2 = max_pool2x2(conv2)

    drop2 = tf.nn.dropout(pool2, keep_prob=keep_prob)

    fc0 = tf.concat([flatten(drop1), flatten(drop2)],1 )

    fc1_out = 64
    fc1_W = weight_variable(shape=(fc0.shape[1].value, fc1_out))
    fc1_b = bias_variable(shape=(fc1_out,))
    fc1 = tf.matmul(fc0, fc1_W) + fc1_b

    drop_fc1 = tf.nn.dropout(fc1, keep_prob=keep_prob)

    fc2_out = n_classes
    fc2_W = weight_variable(shape=(drop_fc1.shape[1].value, fc2_out))
    fc2_b = bias_variable(shape=(fc2_out,))
    logits = tf.matmul(drop_fc1, fc2_W) + fc2_b

    return logits

tf.reset_default_graph()
# placeholders
x = tf.placeholder(dtype=tf.float32, shape=(None, 32, 32,1))
y = tf.placeholder(dtype=tf.int64, shape=None)
keep_prob = tf.placeholder(tf.float32)

# training pipeline
lr = 0.001
logits = my_net(x, n_classes=n_classes)
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=y)
loss_function = tf.reduce_mean(cross_entropy)
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
training_operation = optimizer.minimize(loss=loss_function)

correct_prediction = tf.equal(tf.argmax(logits, 1), y)
accuracy_operation = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
saver = tf.train.Saver()

def evaluate(X_data, y_data):
    num_examples = len(X_data)
    total_accuracy = 0
    sess = tf.get_default_session()
    for offset in range(0, num_examples, BATCH_SIZE):
        batch_x, batch_y = X_data[offset:offset+BATCH_SIZE], y_data[offset:offset+BATCH_SIZE]
        accuracy = sess.run(accuracy_operation, feed_dict={x: batch_x, y: batch_y, keep_prob:1.0})
        total_accuracy += (accuracy * len(batch_x))
    return total_accuracy / num_examples
  • Set the hyperparameters
EPOCHS = 2             # NOTE: the training log below came from a longer run (30 epochs)
BATCHES_PER_EPOCH = 3  # kept small here; increase both for a real training run
BATCH_SIZE = 128
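
Note that the training loop below evaluates on X_validation and y_validation, which the original listing never defines. A minimal sketch that carves them out of the preprocessed training data with the already-imported train_test_split (the 80/20 split and the random_state are my assumptions):

# assumed 80/20 split; the original post does not show how the validation set was created
X_train_norm, X_validation, y_train, y_validation = train_test_split(
    X_train_norm, y_train, test_size=0.2, random_state=0)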
  • Train the CNN
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    print("Training...")
    print()
    for epoch in range(EPOCHS):
        batch_counter = 0
        # use the preprocessed single-channel images to match the placeholder shape
        for batch_x, batch_y in image_datagen.flow(X_train_norm, y_train, batch_size=BATCH_SIZE):
            batch_counter += 1
            sess.run(training_operation, feed_dict={x: batch_x, y: batch_y, keep_prob:0.5})

            if batch_counter == BATCHES_PER_EPOCH:
                break
        # at epoch end, evaluate accuracy on both training and validation set
        train_accuracy = evaluate(X_train_norm, y_train)
        validation_accuracy = evaluate(X_validation, y_validation)
        print("EPOCH {} ...".format(epoch+1))
        print("Training Accuracy = {:.3f}, Validation Accuracy = {:.3f}".format(train_accuracy, validation_accuracy))
        print()
        saver.save(sess, save_path='./checkpoints/traffic_sign_model.ckpt', global_step=epoch)
  • Training log:
    EPOCH 1 …
    Train Accuracy = 0.889 - Validation Accuracy: 0.890
    EPOCH 2 …
    Train Accuracy = 0.960 - Validation Accuracy: 0.955
    EPOCH 3 …
    Train Accuracy = 0.975 - Validation Accuracy: 0.969
    EPOCH 4 …
    Train Accuracy = 0.985 - Validation Accuracy: 0.977
    EPOCH 5 …
    Train Accuracy = 0.987 - Validation Accuracy: 0.978
    EPOCH 6 …
    Train Accuracy = 0.991 - Validation Accuracy: 0.985
    EPOCH 7 …
    Train Accuracy = 0.991 - Validation Accuracy: 0.984
    EPOCH 8 …
    Train Accuracy = 0.991 - Validation Accuracy: 0.985
    EPOCH 9 …
    Train Accuracy = 0.991 - Validation Accuracy: 0.985
    EPOCH 10 …
    Train Accuracy = 0.994 - Validation Accuracy: 0.988
    EPOCH 11 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.990
    EPOCH 12 …
    Train Accuracy = 0.995 - Validation Accuracy: 0.989
    EPOCH 13 …
    Train Accuracy = 0.995 - Validation Accuracy: 0.991
    EPOCH 14 …
    Train Accuracy = 0.993 - Validation Accuracy: 0.988
    EPOCH 15 …
    Train Accuracy = 0.995 - Validation Accuracy: 0.989
    EPOCH 16 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.992
    EPOCH 17 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.992
    EPOCH 18 …
    Train Accuracy = 0.997 - Validation Accuracy: 0.992
    EPOCH 19 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.992
    EPOCH 20 …
    Train Accuracy = 0.993 - Validation Accuracy: 0.986
    EPOCH 21 …
    Train Accuracy = 0.997 - Validation Accuracy: 0.993
    EPOCH 22 …
    Train Accuracy = 0.995 - Validation Accuracy: 0.988
    EPOCH 23 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.990
    EPOCH 24 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.991
    EPOCH 25 …
    Train Accuracy = 0.997 - Validation Accuracy: 0.991
    EPOCH 26 …
    Train Accuracy = 0.997 - Validation Accuracy: 0.991
    EPOCH 27 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.994
    EPOCH 28 …
    Train Accuracy = 0.997 - Validation Accuracy: 0.992
    EPOCH 29 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.992
    EPOCH 30 …
    Train Accuracy = 0.996 - Validation Accuracy: 0.991

Validation accuracy peaked at 99.4% at EPOCH 27.
We can load that checkpoint at test time for the final evaluation on the test set.

with tf.Session() as sess:

    # restore saved session with highest validation accuracy
    saver.restore(sess, './checkpoints/traffic_sign_model.ckpt-27')

    test_accuracy = evaluate(X_test_norm, y_test)
    print('Performance on test set: {:.3f}'.format(test_accuracy))

>>> Performance on test set: 0.953
95.3% accuracy on the test set!
I also grabbed 5 images from the web as a final test:
[Figure: five traffic-sign images found on the web]

The CNN got 4 of the 5 right. Below are the per-image probabilities, which show how confident the CNN's predictions are:

[Figure: prediction probabilities for each of the five web images]
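
The original post does not show how these probabilities were computed. A minimal sketch using softmax plus tf.nn.top_k, where new_imgs_norm stands for the five preprocessed web images, an assumed variable of shape (5, 32, 32, 1):

# softmax turns the logits into class probabilities; top_k picks the 5 most likely classes
softmax = tf.nn.softmax(logits)
top_k = tf.nn.top_k(softmax, k=5)

with tf.Session() as sess:
    # new_imgs_norm is assumed to hold the five web images after preprocess_features()
    saver.restore(sess, './checkpoints/traffic_sign_model.ckpt-27')
    top_probs, top_classes = sess.run(top_k, feed_dict={x: new_imgs_norm, keep_prob: 1.0})
    for i in range(len(new_imgs_norm)):
        print('Image {}: predicted class {:02d} with p = {:.3f}'.format(
            i, top_classes[i][0], top_probs[i][0]))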
