References:
1. 《TensorFlow技术解析与实战》
2. https://www.cnblogs.com/hellcat/p/7401706.html
3. http://www.jianshu.com/p/3dbeb3ab9aa3
########################################################################################
Build a recurrent neural network (RNN) model with TensorFlow and train it on the MNIST dataset.
RNNs have been very successful in the following areas of natural language processing:
1. Machine translation
2. Speech recognition
3. Image captioning
4. Next-word prediction
# -*- coding:utf-8 -*-
# ==============================================================================
# 20171115
# HelloZEX
# Recurrent Neural Network (RNN)
# Code from https://github.com/aymericdamien/TensorFlow-Examples/blob/master/examples/3_NeuralNetworks/recurrent_network.py
# ==============================================================================
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
tf.set_random_seed(1) # set random seed
# Load the MNIST data
mnist = input_data.read_data_sets("MNIST_Labels_Images", one_hot=True)
# hyperparameters
lr = 0.001 # learning rate
training_iters = 100000 # upper limit on training iterations (step * batch_size)
batch_size = 128
n_inputs = 28 # MNIST data input (img shape: 28*28)
n_steps = 28 # time steps
n_hidden_units = 128 # neurons in hidden layer
n_classes = 10 # MNIST classes (0-9 digits)
# x, y placeholders
x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None, n_classes])
# Initial values for the weights and biases
weights = {
    # shape (28, 128)
    'in': tf.Variable(tf.random_normal([n_inputs, n_hidden_units])),
    # shape (128, 10)
    'out': tf.Variable(tf.random_normal([n_hidden_units, n_classes]))
}
biases = {
    # shape (128, )
    'in': tf.Variable(tf.constant(0.1, shape=[n_hidden_units, ])),
    # shape (10, )
    'out': tf.Variable(tf.constant(0.1, shape=[n_classes, ]))
}
def RNN(X, weights, biases):
    # The original X is 3-D; reshape it to 2-D so it can be multiplied by the 'in' weight matrix
    # X ==> (128 batches * 28 steps, 28 inputs)
    X = tf.reshape(X, [-1, n_inputs])

    # X_in = X*W + b
    X_in = tf.matmul(X, weights['in']) + biases['in']
    # X_in ==> (128 batches, 28 steps, 128 hidden): reshape back to 3-D
    X_in = tf.reshape(X_in, [-1, n_steps, n_hidden_units])

    # Use a basic LSTM cell
    lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=True)
    init_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)  # all-zero initial state
    outputs, final_state = tf.nn.dynamic_rnn(lstm_cell, X_in, initial_state=init_state, time_major=False)

    # Turn outputs into a list [(batch, outputs), ...] * steps
    # list(tensor1, tensor2, ...)
    outputs = tf.unstack(tf.transpose(outputs, [1, 0, 2]))
    # Take the (128, 128) tensor computed at the last step for all images in the batch.
    # Following the RNN idea, the last step contains the information from all earlier steps.
    # It should be common knowledge, but worth stressing: in the various network frameworks
    # and designs, samples within the same batch share the same parameters and do not
    # affect each other.
    results = tf.matmul(outputs[-1], weights['out']) + biases['out']  # take the last output
    return results
pred = RNN(x, weights, biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
train_op = tf.train.AdamOptimizer(lr).minimize(cost)
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    step = 0
    while step * batch_size < training_iters:
        batch_xs, batch_ys = mnist.train.next_batch(batch_size)
        batch_xs = batch_xs.reshape([batch_size, n_steps, n_inputs])
        sess.run([train_op], feed_dict={
            x: batch_xs,
            y: batch_ys,
        })
        if step % 20 == 0:
            print(sess.run(accuracy, feed_dict={
                x: batch_xs,
                y: batch_ys,
            }))
        step += 1

print('Finish!')
Output:
/usr/bin/python2.7 /home/zhengxinxin/Desktop/PyCharm/Spark/SparkMNIST/SparkMNIST_RNN.py
Extracting MNIST_Labels_Images/train-images-idx3-ubyte.gz
Extracting MNIST_Labels_Images/train-labels-idx1-ubyte.gz
Extracting MNIST_Labels_Images/t10k-images-idx3-ubyte.gz
Extracting MNIST_Labels_Images/t10k-labels-idx1-ubyte.gz
2017-11-15 10:56:00.015122: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 10:56:00.015152: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 10:56:00.015157: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 10:56:00.015160: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-15 10:56:00.015164: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
0.15625
0.671875
0.804688
0.796875
0.828125
0.84375
0.90625
0.945312
0.945312
0.945312
0.851562
0.90625
0.960938
0.890625
0.90625
0.90625
0.921875
0.945312
0.96875
0.9375
0.921875
0.953125
0.960938
0.929688
0.976562
0.96875
0.960938
0.945312
0.960938
0.976562
0.945312
0.976562
0.929688
0.96875
0.960938
0.945312
0.976562
0.9375
0.96875
0.960938
Finish!
Process finished with exit code 0
Let's pull out the RNN function and walk through it separately.
The input data format should be [batch, step, input], corresponding to the forward-propagation process in the figure below.
(Figure taken from reference 2.)
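As a quick illustration of that layout, here is a minimal numpy-only sketch (the names flat_batch and rnn_input are made up for illustration, not taken from the code above): each of the 28 rows of an MNIST image becomes one time step of 28 pixel values.

import numpy as np

# e.g. the flat batch returned by mnist.train.next_batch(128)[0]
flat_batch = np.zeros((128, 784), dtype=np.float32)
rnn_input = flat_batch.reshape(128, 28, 28)  # (batch, step = 28 rows, input = 28 pixels per row)
print(rnn_input[0, 5, :].shape)              # time step 5 of image 0 -> (28,)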
The key lines:
lstm_cell = tf.contrib.rnn.BasicLSTMCell(n_hidden_units, forget_bias=1.0, state_is_tuple=True)  # create the LSTM cell
init_state = lstm_cell.zero_state(batch_size, dtype=tf.float32)  # initial cell state
outputs, final_state = tf.nn.dynamic_rnn(lstm_cell, X_in, initial_state=init_state, time_major=False)  # the RNN computation; all steps are computed in one call
Regarding the third line: a tf.contrib.rnn.BasicLSTMCell object is actually callable, and its arguments are the current step's input and the state from the previous step, so if you want to drive the computation yourself, one step at a time, see reference 3 (tensorflow 循环神经网络RNN); a rough sketch of that usage follows below.
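A minimal sketch of that step-by-step usage (not taken from reference 3; it assumes a TF 1.x version in which the same cell instance reuses its variables across repeated calls, and the names x_in/cell/state are illustrative):

import tensorflow as tf

batch_size, n_steps, n_hidden = 128, 28, 128
# x_in plays the role of X_in above: inputs already projected to the hidden size
x_in = tf.placeholder(tf.float32, [batch_size, n_steps, n_hidden])
cell = tf.contrib.rnn.BasicLSTMCell(n_hidden, forget_bias=1.0, state_is_tuple=True)
state = cell.zero_state(batch_size, dtype=tf.float32)

step_outputs = []
for t in range(n_steps):
    # The cell is callable: (input at step t, previous state) -> (output at step t, new state).
    # On older TF 1.x releases you may need to wrap this in a variable scope with reuse enabled.
    out_t, state = cell(x_in[:, t, :], state)
    step_outputs.append(out_t)

last_output = step_outputs[-1]  # plays the same role as outputs[-1] in the dynamic_rnn version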
The idea here is that the RNN's input corresponds to exactly one image (one sample) at a time, and the last step contains the information of all the earlier steps of that image, so we take its output. That is why, when extracting it, the dimensions are rearranged to (step, batch, output), which makes it easy to grab the last step's output for the whole batch of images:
outputs = tf.unstack(tf.transpose(outputs, [1, 0, 2]))
[Note]: the unstack is actually unnecessary. This function splits the tensor along a dimension (0 by default) into a list of smaller tensors, but removing it does not affect the later outputs[-1] operation, since both tensors and lists support indexing/slicing; see the sketch below.
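A minimal sketch of that point (illustrative; it assumes the (batch, steps, hidden) outputs tensor returned by the dynamic_rnn call above, before any transpose):

transposed = tf.transpose(outputs, [1, 0, 2])  # (steps, batch, hidden)
last_a = tf.unstack(transposed)[-1]            # list of per-step tensors, take the last
last_b = transposed[-1]                        # same result without unstack
last_c = outputs[:, -1, :]                     # or slice the original tensor directly

All three give the (batch, hidden) output of the final step.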
Judging from the output, the results are actually not quite as good as a CNN on MNIST, though only by a small margin.