TensorFlow: Practical Google Deep Learning Framework (Part 6): Image Data Processing

2023-11-02

Chapter 7: Image Data Processing

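Chapter 6 covered convolutional neural networks in detail and described the breakthrough they brought to image recognition. This chapter approaches the problem from a different angle: improving recognition accuracy and training speed through the way input data is handled.

In many image recognition problems, external factors such as lighting and contrast strongly affect recognition quality. This chapter therefore shows how to preprocess image data so that the trained model is influenced as little as possible by such irrelevant factors.
Complex preprocessing can slow training down. To reduce the impact of preprocessing on training speed, this chapter also introduces TensorFlow's multithreaded input data processing.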

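7.1 The TFRecord Input Data Format

TensorFlow provides a unified format for storing data: the TFRecord format.

7.1.1 Introduction to the TFRecord Format

All data in a TFRecord file is stored as tf.train.Example Protocol Buffers.

The definition of tf.train.Example is as follows: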

message Example {
  Features features = 1;
}

message Features {
  map<string, Feature> feature = 1;
}

message Feature {
  oneof kind {
    BytesList bytes_list = 1;
    FloatList float_list = 2;
    Int64List int64_list = 3;
  }
}

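tf.train.Example holds a dictionary that maps attribute names to values:
Attribute name: a string
Attribute value: a list of byte strings (BytesList), a list of floats (FloatList), or a list of integers (Int64List)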
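
To make the name-to-value mapping concrete, here is a minimal pure-Python sketch that does not use TensorFlow. The helper to_feature is hypothetical (it is not part of any TensorFlow API); it only mimics how a value would land in one of the three list kinds of the Feature message.

```python
def to_feature(value):
    """Return (kind, values) for a Python value, mirroring the oneof in Feature."""
    values = list(value) if isinstance(value, (list, tuple)) else [value]
    if all(isinstance(v, bytes) for v in values):
        return ("bytes_list", values)
    if all(isinstance(v, int) for v in values):
        return ("int64_list", values)
    if all(isinstance(v, (int, float)) for v in values):
        return ("float_list", [float(v) for v in values])
    raise TypeError("unsupported feature value: %r" % (value,))

# A record shaped like one MNIST sample: attribute name -> typed list.
example = {
    "pixels": to_feature(784),
    "label": to_feature(7),
    "image_raw": to_feature(b"\x00\x01\x02"),
}
for name, (kind, values) in sorted(example.items()):
    print(name, kind, values)
```

Every attribute value ends up as a list, even when a single number is stored, which is why the real helper functions wrap their argument in value=[value].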

7.1.2 TFRecord Sample Programs

1. Converting the MNIST input data to the TFRecord format

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

# 1. Convert the input to the TFRecord format and save it
# Helper functions that wrap a value in the matching Feature type.
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

# Read the MNIST data.
mnist = input_data.read_data_sets("../../datasets/MNIST_data",dtype=tf.uint8, one_hot=True)
images = mnist.train.images
labels = mnist.train.labels
# Number of pixels per image, stored as an attribute of every Example.
pixels = images.shape[1]
num_examples = mnist.train.num_examples

# Path of the output TFRecord file.
filename = "Records/output.tfrecords" # the Records directory must already exist
writer = tf.python_io.TFRecordWriter(filename)
for index in range(num_examples):
    # Serialize the image matrix to a byte string.
    image_raw = images[index].tobytes()

    # Wrap one sample in an Example Protocol Buffer and write it to the file.
    example = tf.train.Example(features=tf.train.Features(feature={
        'pixels': _int64_feature(pixels),
        'label': _int64_feature(np.argmax(labels[index])),
        'image_raw': _bytes_feature(image_raw)
    }))
    writer.write(example.SerializeToString())
writer.close()
print("TFRecord file saved.")

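2. Reading data from a TFRecord file

import tensorflow as tf

# Create a reader for the examples in the TFRecord file.
reader = tf.TFRecordReader()
# Create a queue that maintains the list of input files.
filename_queue = tf.train.string_input_producer(["Records/output.tfrecords"])
# Read one serialized example from the file.
_,serialized_example = reader.read(filename_queue)

# Parse the example that was read.
features = tf.parse_single_example(
    serialized_example,
    features={
        'image_raw':tf.FixedLenFeature([],tf.string),
        'pixels':tf.FixedLenFeature([],tf.int64),
        'label':tf.FixedLenFeature([],tf.int64)
    })

# decode_raw turns the byte string back into a uint8 pixel array.
images = tf.decode_raw(features['image_raw'],tf.uint8)
labels = tf.cast(features['label'],tf.int32)
pixels = tf.cast(features['pixels'],tf.int32)

sess = tf.Session()

# Start the threads that process the input data.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess,coord=coord)

# Each run reads one example from the TFRecord file.
for i in range(10):
    image, label, pixel = sess.run([images, labels, pixels])

# Stop the input threads and wait for them to finish.
coord.request_stop()
coord.join(threads)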

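7.2 Image Data Processing

7.2.1 TensorFlow Image Processing Functions

1. Image encoding and decoding

An image file does not store the raw numbers of the pixel matrix; it stores the result of compression encoding. Restoring an image to a three-dimensional matrix therefore requires a decoding step. TensorFlow provides encoding and decoding functions for the JPEG and PNG formats.

Using TensorFlow's JPEG encoding and decoding functions: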

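# matplotlib.pyplot is a Python plotting library
import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    # Decode the JPEG data to obtain the image's 3-D matrix
    # TensorFlow also provides tf.image.decode_png for PNG images
    # The result is a tensor, so it must be evaluated explicitly before its values can be used
    img_data=tf.image.decode_jpeg(image_raw_data)

    print(img_data.eval())
    # Display the image with pyplot
    plt.imshow(img_data.eval())
    plt.show()

    # decode_jpeg already returns uint8, so this conversion is a no-op; it is kept
    # because encode_jpeg below requires uint8 input. Converting to tf.float32
    # instead is convenient for later processing, but not for re-encoding.
    img_data=tf.image.convert_image_dtype(img_data,dtype=tf.uint8)

    # Re-encode the 3-D matrix in the JPEG format and write it to a file
    # Opening that file shows the same image as the original
    encode_image=tf.image.encode_jpeg(img_data)   # the input must be uint8, otherwise an error is raised
    with tf.gfile.GFile("E:\\Opencv Image\\an.jpg",'wb') as f:
        f.write(encode_image.eval())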

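2. Resizing images

Images come in varying sizes, but the number of input nodes of a neural network is fixed, so images must be brought to a uniform size before their pixels are fed to the network.

1. Resizing with an interpolation algorithm, so that the new image preserves as much of the original image's information as possible

TensorFlow implements four different algorithms and wraps them in the tf.image.resize_images function:

tf.image.resize_images(images, size, method=0)
# Resize images to size = [new_height, new_width] using the specified method.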

Values of the method argument of tf.image.resize_images

method	Resizing algorithm
0	Bilinear interpolation
1	Nearest-neighbour interpolation
2	Bicubic interpolation
3	Area interpolation
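
As an illustration of the simplest of these algorithms, the following pure-Python sketch resizes a small 2-D pixel grid with nearest-neighbour interpolation. The function name resize_nearest and the floor-based index mapping are assumptions made for this example; tf.image.resize_images may differ in details such as corner alignment.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2-D list of pixels by copying the nearest source pixel."""
    old_height, old_width = len(image), len(image[0])
    return [
        [image[r * old_height // new_height][c * old_width // new_width]
         for c in range(new_width)]
        for r in range(new_height)
    ]

small = [[1, 2],
         [3, 4]]
big = resize_nearest(small, 4, 4)
for row in big:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

Nearest-neighbour never invents new pixel values, whereas the bilinear, bicubic, and area methods blend neighbouring pixels and therefore produce real-valued output.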

Code:

# matplotlib.pyplot is a Python plotting library
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    # Decode the JPEG data to obtain the image's 3-D matrix
    # The result is a tensor, so it must be evaluated explicitly before its values can be used
    img_data=tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())

    # Resize to 300x300 with area interpolation (method=3)
    resized = tf.image.resize_images(img_data, [300, 300], method=3)
    # The static shape of the decoded image is unknown until it is evaluated
    print(img_data.get_shape())
    # resize_images returns float32 here, so the result is converted back
    # to uint8 before it is handed to pyplot
    print("Digital type: ", resized.dtype)
    angelababy2 = np.asarray(resized.eval(), dtype='uint8')
    plt.imshow(angelababy2)
    plt.show()

[Figures: the original image, followed by the results of bilinear, nearest-neighbour, bicubic, and area interpolation]

2. Cropping and padding

tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)

# Crops and/or pads an image to a target width and height.
# Resizes an image to a target width and height by either centrally cropping the image or
# padding it evenly with zeros.

# image: the original image
# target_height, target_width: the target size
# If the original image is larger than the target, the central part is cropped out
# If the original image is smaller than the target, it is padded with zeros on all sides

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    img_data=tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # No-op after decode_jpeg (already uint8); use tf.float32 to work with real-valued pixels
    img_data=tf.image.convert_image_dtype(img_data,dtype=tf.uint8)

with tf.Session() as sess:
    # Assuming the source image is between 500 and 1000 pixels on each side:
    # a larger target zero-pads the image, a smaller target crops its centre
    padded = tf.image.resize_image_with_crop_or_pad(img_data, 1000, 1000)
    cropped = tf.image.resize_image_with_crop_or_pad(img_data, 500, 500)
    plt.imshow(padded.eval())
    plt.show()
    plt.imshow(cropped.eval())
    plt.show()

Padded to 1000*1000	Cropped to 500*500
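
The crop-or-pad rule can be sketched without TensorFlow. The code below is an illustrative assumption rather than the library's implementation: crop_or_pad is a hypothetical helper that works on a 2-D list of pixels and, when the surplus or deficit is odd, places the extra pixel at the end.

```python
def crop_or_pad(image, target_height, target_width, fill=0):
    """Centre-crop or zero-pad a 2-D list of pixels to the target size."""
    def fit(items, target, make_fill):
        extra = len(items) - target
        if extra >= 0:                      # too long: keep the centred slice
            start = extra // 2
            return items[start:start + target]
        before = (-extra) // 2              # too short: pad on both sides
        after = -extra - before
        return ([make_fill() for _ in range(before)] + items +
                [make_fill() for _ in range(after)])

    rows = [fit(row, target_width, lambda: fill) for row in image]
    return fit(rows, target_height, lambda: [fill] * target_width)

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(crop_or_pad(image, 1, 1))   # [[5]]
for row in crop_or_pad(image, 5, 5):
    print(row)
```

Height and width are handled independently, so a single call can crop along one axis and pad along the other.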

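Resizing by ratio: keeping the central part of the image

crop = tf.image.central_crop(image, central_fraction=0.5)
# First argument: the original image
# Second argument: the fraction to keep, a real number in (0, 1]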

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    img_data=tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # No-op after decode_jpeg (already uint8); use tf.float32 to work with real-valued pixels
    img_data=tf.image.convert_image_dtype(img_data,dtype=tf.uint8)

# 5. Keep the central 50% of the image
with tf.Session() as sess:
    central_cropped = tf.image.central_crop(img_data, 0.5)
    plt.imshow(central_cropped.eval())
    plt.show()

3. Flipping images

Images can be flipped vertically, flipped horizontally, or flipped along the diagonal (transposed).

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    img_data=tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # No-op after decode_jpeg (already uint8); use tf.float32 to work with real-valued pixels
    img_data=tf.image.convert_image_dtype(img_data,dtype=tf.uint8)

# 6. Flip the image
with tf.Session() as sess:
    # Flip vertically (up/down)
    # flipped1 = tf.image.flip_up_down(img_data)
    # Flip horizontally (left/right)
    # flipped2 = tf.image.flip_left_right(img_data)

    # Flip along the diagonal (transpose)
    transposed = tf.image.transpose_image(img_data)
    plt.imshow(transposed.eval())
    plt.show()

    # Flip vertically with a 50% probability.
    # flipped = tf.image.random_flip_up_down(img_data)
    # Flip horizontally with a 50% probability.
    # flipped = tf.image.random_flip_left_right(img_data)
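
The three flips are simple index permutations. A minimal pure-Python sketch on a 2-D list of pixels follows; the function names mirror the TensorFlow ones only for readability, and the seeded random.Random generator is an assumption of this example.

```python
import random

def flip_up_down(image):
    """Reverse the order of the rows."""
    return image[::-1]

def flip_left_right(image):
    """Reverse the pixels inside every row."""
    return [row[::-1] for row in image]

def transpose(image):
    """Swap rows and columns (flip along the main diagonal)."""
    return [list(column) for column in zip(*image)]

def random_flip_left_right(image, rng):
    """Flip horizontally with probability 0.5, otherwise return the image unchanged."""
    return flip_left_right(image) if rng.random() < 0.5 else image

image = [[1, 2, 3],
         [4, 5, 6]]
print(flip_up_down(image))      # [[4, 5, 6], [1, 2, 3]]
print(flip_left_right(image))   # [[3, 2, 1], [6, 5, 4]]
print(transpose(image))         # [[1, 4], [2, 5], [3, 6]]
print(random_flip_left_right(image, random.Random(0)))
```

Random flips are useful for training because a mirrored object usually keeps its label, so the model sees more variety at no labelling cost.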

4. Adjusting image colour

1. Brightness

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    img_data=tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # No-op after decode_jpeg (already uint8); use tf.float32 to work with real-valued pixels
    img_data=tf.image.convert_image_dtype(img_data,dtype=tf.uint8)

    plt.imshow(img_data.eval())
    plt.show()
    # Decrease the brightness by 0.5.
    # adjusted = tf.image.adjust_brightness(img_data, -0.5)

    # Increase the brightness by 0.5
    adjusted = tf.image.adjust_brightness(img_data, 0.5)

    # Adjust the brightness by a random amount in [-max_delta, max_delta).
    # adjusted = tf.image.random_brightness(img_data, max_delta=0.6)

    plt.imshow(adjusted.eval())
    plt.show()

[Figure: brightness increased by 0.5]

2. Contrast

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    img_data=tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # No-op after decode_jpeg (already uint8); use tf.float32 to work with real-valued pixels
    img_data=tf.image.convert_image_dtype(img_data,dtype=tf.uint8)

    plt.imshow(img_data.eval())
    plt.show()

    # The second argument is a multiplier (contrast_factor), not an offset.
    # Multiply the contrast by -5
    # adjusted = tf.image.adjust_contrast(img_data, -5)

    # Multiply the contrast by 5
    adjusted = tf.image.adjust_contrast(img_data, 5)

    # Multiply the contrast by a random factor in [lower, upper].
    # adjusted = tf.image.random_contrast(img_data, lower, upper)

    plt.imshow(adjusted.eval())
    plt.show()

[Figure: contrast scaled by a factor of 5]
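
The difference between the two adjustments is easy to see on a handful of numbers. The sketch below is a simplified model under stated assumptions: pixel values are floats in [0, 1], one colour channel is flattened into a list, and the result is clipped afterwards. The helper names are chosen for this example and the code is not the TensorFlow implementation.

```python
def adjust_brightness(pixels, delta):
    """Brightness is an offset: add delta to every value."""
    return [p + delta for p in pixels]

def adjust_contrast(pixels, factor):
    """Contrast is a multiplier: scale each value's distance from the mean."""
    mean = sum(pixels) / len(pixels)
    return [(p - mean) * factor + mean for p in pixels]

def clip(pixels, low=0.0, high=1.0):
    """Keep values inside the valid range after an adjustment."""
    return [min(max(p, low), high) for p in pixels]

channel = [0.2, 0.4, 0.6]          # one colour channel, floats in [0, 1]
brighter = clip(adjust_brightness(channel, 0.5))
sharper = clip(adjust_contrast(channel, 5))
print(brighter)
print(sharper)
```

With a factor of 5 the darkest and brightest values are pushed to the limits of the range while the mean stays where it was, which is why strongly increased contrast looks posterized.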

3. Hue

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    img_data=tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # No-op after decode_jpeg (already uint8); use tf.float32 to work with real-valued pixels
    img_data=tf.image.convert_image_dtype(img_data,dtype=tf.uint8)

    # Shift the hue by 0.1 (other values to try: 0.3, 0.6, 0.9)
    adjusted = tf.image.adjust_hue(img_data, 0.1)
    # adjusted = tf.image.adjust_hue(img_data, 0.3)
    # adjusted = tf.image.adjust_hue(img_data, 0.6)
    # adjusted = tf.image.adjust_hue(img_data, 0.9)

    # Shift the hue by a random amount in [-max_delta, max_delta]; max_delta must lie in [0, 0.5].
    # adjusted = tf.image.random_hue(image, max_delta)

    plt.imshow(adjusted.eval())
    plt.show()


4. Saturation

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    img_data=tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())
    # No-op after decode_jpeg (already uint8); use tf.float32 to work with real-valued pixels
    img_data=tf.image.convert_image_dtype(img_data,dtype=tf.uint8)

    # The second argument is a multiplier (saturation_factor), not an offset.
    # Multiply the saturation by -5.
    adjusted = tf.image.adjust_saturation(img_data, -5)
    # Multiply the saturation by 5.
    # adjusted = tf.image.adjust_saturation(img_data, 5)
    # Multiply the saturation by a random factor in [lower, upper].
    # adjusted = tf.image.random_saturation(img_data, lower, upper)

    # Give the numbers in the 3-D matrix of one image zero mean and unit variance.
    # adjusted = tf.image.per_image_standardization(img_data)

    plt.imshow(adjusted.eval())
    plt.show()


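TensorFlow also provides an API for image standardization, which shifts the pixel values of an image to zero mean and scales them to unit variance.

# Give the numbers in the 3-D matrix of one image zero mean and unit variance.
adjusted = tf.image.per_image_standardization(img_data)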
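
Standardization computes (x - mean) / stddev over all values of one image. The sketch below works on a flattened list of pixel values and, as an assumption modelled on the documented behaviour of tf.image.per_image_standardization, bounds the divisor from below by 1/sqrt(N) so that a uniform image does not cause a division by zero.

```python
import math

def per_image_standardization(pixels):
    """Shift a flat list of pixel values to zero mean and scale to unit variance."""
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # Guard against a zero standard deviation on uniform images.
    adjusted_stddev = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [(p - mean) / adjusted_stddev for p in pixels]

flat = [0.0, 64.0, 128.0, 255.0]
standardized = per_image_standardization(flat)
print(standardized)
print(per_image_standardization([5.0] * 4))   # uniform image: all zeros
```

The output contains negative values, so a standardized image is meant as network input and is not suitable for direct display.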


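5. Handling bounding boxes

In many image recognition datasets, the objects of interest are marked with bounding boxes. The following function draws them:

tf.image.draw_bounding_boxes(images, boxes, name=None)

# images: a 4-D tensor of shape [batch, height, width, depth] with dtype float32 or half.
#         The leading batch dimension exists because the function processes a group of images.

# boxes: a 3-D tensor of shape [batch, num_bounding_boxes, 4] with dtype float32.
# num_bounding_boxes is the number of boxes; each box is given by four numbers [y_min, x_min, y_max, x_max].
# For example, tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
#       has shape [1, 2, 4]: two boxes in one image;
#       tf.constant([[[0., 0., 1., 1.]]]) has shape [1, 1, 4]: one box in one image.

# name: optional name for the operation.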

Code example:

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data=tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg",'rb').read()

with tf.Session() as sess:
    img_data=tf.image.decode_jpeg(image_raw_data)
    # Shrink the image first so that the boxes are easier to see
    img_data=tf.image.resize_images(img_data,[180,260],method=1)
    # tf.image.draw_bounding_boxes expects a batch, i.e. a 4-D tensor made of several images,
    # so a leading dimension is added to the decoded image
    # Its input must also be real-valued, hence the conversion to float32
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
    boxes = tf.constant([[[0.01, 0.2, 0.5, 0.7],[0.25, 0.4, 0.32, 0.55]]])
    # Each box is (y_min, x_min, y_max, x_max) in relative coordinates; multiply by the image size to get pixels
    result=tf.image.draw_bounding_boxes(batched,boxes)

    plt.subplot(121), plt.imshow(img_data.eval())
    plt.subplot(122), plt.imshow(result[0].eval())
    plt.show()
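
Because the box coordinates are relative, they have to be multiplied by the image size to obtain pixel positions. A small sketch of that conversion follows; box_to_pixels is a hypothetical helper and rounding to the nearest pixel is an assumption of this example.

```python
def box_to_pixels(box, height, width):
    """Convert a relative [y_min, x_min, y_max, x_max] box to pixel coordinates."""
    y_min, x_min, y_max, x_max = box
    return (int(round(y_min * height)), int(round(x_min * width)),
            int(round(y_max * height)), int(round(x_max * width)))

# The two boxes from the example above on a 180 x 260 (height x width) image.
print(box_to_pixels([0.01, 0.2, 0.5, 0.7], 180, 260))     # (2, 52, 90, 182)
print(box_to_pixels([0.25, 0.4, 0.32, 0.55], 180, 260))   # (45, 104, 58, 143)
```

Relative coordinates have the advantage that the same annotation stays valid after the image has been resized.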


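Randomly cropping an informative region of an image is another way to make a model more robust: the trained model becomes less sensitive to the size of the object being recognized.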

tf.image.sample_distorted_bounding_box(  
    image_size,  
    bounding_boxes,  
    seed=None,  
    seed2=None,  
    min_object_covered=None,  
    aspect_ratio_range=None,  
    area_range=None,  
    max_attempts=None,  
    use_image_if_no_bounding_boxes=None,  
    name=None)

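# image_size: a 1-D tensor holding the three values [height, width, channels]. Its dtype must be one of uint8, int8, int16, int32, int64.

# bounding_boxes: a 3-D tensor of shape [batch, N, 4] with dtype float32. The batch dimension exists because the function handles a group of images, N is the number of bounding boxes associated with the image, and each box is given by four numbers [y_min, x_min, y_max, x_max]. For example, tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]]) has shape [1, 2, 4]: two boxes in one image; tf.constant([[[0., 0., 1., 1.]]]) has shape [1, 1, 4]: one box in one image.

# seed: (optional) int, default 0. If either seed or seed2 is non-zero, the random number generator is seeded with the given values; otherwise a random seed is used.

# seed2: (optional) int, default 0. A second seed that avoids seed collisions.

# min_object_covered: (optional) float, default 0.1. The cropped region must contain at least this fraction of at least one of the supplied bounding boxes. The value must be non-negative; with 0, the cropped region need not overlap any supplied box.

# aspect_ratio_range: (optional) list of floats, default [0.75, 1.33]. The aspect ratio (width / height) of the cropped region must lie in this range.

# area_range: (optional) list of floats, default [0.05, 1]. The cropped region must cover a fraction of the image area that lies in this range.

# max_attempts: (optional) int, default 100. Number of attempts at generating a cropped region that satisfies the constraints. After max_attempts failures, the whole image is returned.

# use_image_if_no_bounding_boxes: (optional) bool, default False. Controls the behaviour when no boxes are supplied. If True, an implicit box covering the whole input is assumed; if False, an error is raised.

# name: optional name for the operation.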
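
How these constraints interact is easiest to see in a rejection-sampling loop. The sketch below is a deliberately simplified model and not TensorFlow's algorithm: sample_crop is a hypothetical function, it handles a single box, and it measures coverage as the overlapping area divided by the box area. It keeps drawing candidate crops until one satisfies the area, aspect-ratio, and coverage constraints, and falls back to the whole image after max_attempts failures.

```python
import random

def sample_crop(height, width, box, rng, min_object_covered=0.1,
                aspect_ratio_range=(0.75, 1.33), area_range=(0.05, 1.0),
                max_attempts=100):
    """Return (top, left, crop_height, crop_width) in pixels.

    box is a relative [y_min, x_min, y_max, x_max] bounding box.
    """
    by0, bx0, by1, bx1 = (box[0] * height, box[1] * width,
                          box[2] * height, box[3] * width)
    box_area = (by1 - by0) * (bx1 - bx0)
    for _ in range(max_attempts):
        area = rng.uniform(*area_range) * height * width
        ratio = rng.uniform(*aspect_ratio_range)        # width / height
        crop_w = int(round((area * ratio) ** 0.5))
        crop_h = int(round((area / ratio) ** 0.5))
        if not (0 < crop_w <= width and 0 < crop_h <= height):
            continue                                    # does not fit: retry
        top = rng.randint(0, height - crop_h)
        left = rng.randint(0, width - crop_w)
        # Overlap between the candidate crop and the annotated box.
        overlap_h = max(0.0, min(by1, top + crop_h) - max(by0, top))
        overlap_w = max(0.0, min(bx1, left + crop_w) - max(bx0, left))
        if box_area == 0 or overlap_h * overlap_w >= min_object_covered * box_area:
            return top, left, crop_h, crop_w
    return 0, 0, height, width       # give up: use the whole image

rng = random.Random(0)
print(sample_crop(330, 200, [0.01, 0.2, 0.5, 0.7], rng))
```

The returned offsets and sizes play the same role as the begin and size values that are passed to tf.slice in the TensorFlow version.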

Code:

import matplotlib.pyplot as plt
import tensorflow as tf

# Read the raw (still encoded) bytes of the image
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\anglababy.jpg", 'rb').read()

with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    print(img_data.eval())

    img_data = tf.image.resize_images(img_data, (330, 200), method=1)

    boxes = tf.constant([[[0.01, 0.2, 0.5, 0.7], [0.25, 0.4, 0.32, 0.55]]])

    # Sample a random crop that covers at least 10% of one of the boxes
    begin, size, bbox_for_draw = tf.image.sample_distorted_bounding_box(
        tf.shape(img_data), bounding_boxes=boxes,min_object_covered=0.1)

    # Draw the sampled region on the image (built here but not displayed)
    batched = tf.expand_dims(tf.image.convert_image_dtype(img_data, tf.float32), 0)
    image_with_box = tf.image.draw_bounding_boxes(batched, bbox_for_draw)
    # Cut the sampled region out of the image and show it
    distorted_image = tf.slice(img_data, begin, size)
    plt.imshow(distorted_image.eval())
    plt.show()


7.2.2 A Complete Image Preprocessing Example

# "TensorFlow: Practical Google Deep Learning Framework", chapter 7: image data processing
# win10 Tensorflow1.0.1 python3.5.3
# CUDA v8.0 cudnn-8.0-windows10-x64-v5.1
# filename:ts07.03.py # complete image preprocessing example

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# 1. Randomly adjust the colours of an image; two different orderings are defined
def distort_color(image, color_ordering=0):
    if color_ordering == 0:
        image = tf.image.random_brightness(image, max_delta=32./255.)
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
    else:
        image = tf.image.random_saturation(image, lower=0.5, upper=1.5)
        image = tf.image.random_brightness(image, max_delta=32./255.)
        image = tf.image.random_contrast(image, lower=0.5, upper=1.5)
        image = tf.image.random_hue(image, max_delta=0.2)

    # The adjustments can push values outside [0, 1], so clip them back
    return tf.clip_by_value(image, 0.0, 1.0)

# 2. Preprocess an image, turning it into input-layer data for the neural network
# Given a decoded image, a target size, and the bounding boxes on the image,
# this function preprocesses the image
# Input: an original training image
# Output: the input layer of the neural network model
# Note: only training data is handled here; prediction data does not need the random transformations
def preprocess_for_train(image, height, width, bbox):
    # If no bounding box is given, treat the whole image as the region of interest
    if bbox is None:
        bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
    # Convert the dtype of the image tensor
    if image.dtype != tf.float32:
        image = tf.image.convert_image_dtype(image, dtype=tf.float32)

    # Randomly crop a patch of the image to reduce the effect of object size on recognition
    bbox_begin, bbox_size, _ = tf.image.sample_distorted_bounding_box(
        tf.shape(image), bounding_boxes=bbox, min_object_covered=0.1)
    distorted_image = tf.slice(image, bbox_begin, bbox_size)

    # Resize the random crop to the size of the network's input layer; the resizing algorithm is chosen at random
    distorted_image = tf.image.resize_images(distorted_image, [height, width], method=np.random.randint(4))
    # Randomly flip the image horizontally
    distorted_image = tf.image.random_flip_left_right(distorted_image)
    # Adjust the colours using a randomly chosen ordering
    distorted_image = distort_color(distorted_image, np.random.randint(2))
    return distorted_image

# 3. Read the image
image_raw_data = tf.gfile.FastGFile("E:\\Opencv Image\\dog.jpg", "rb").read()
with tf.Session() as sess:
    img_data = tf.image.decode_jpeg(image_raw_data)
    boxes = tf.constant([[[0.05, 0.05, 0.9, 0.7], [0.35, 0.47, 0.5, 0.56]]])
    # Run 6 times to obtain 6 different images
    for i in range(6):
        result = preprocess_for_train(img_data, 299, 299, boxes)
        plt.imshow(result.eval())
        plt.show()

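7.3 A Multithreaded Input Data Processing Framework

The preprocessing methods above reduce the influence of irrelevant factors on an image recognition model, but these complex operations slow down the whole training process. To keep preprocessing from becoming the bottleneck of training, TensorFlow provides a framework for processing input data in multiple threads.

[Figure: flowchart of the classic input data processing pipeline]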

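7.3.1 Queues and Multithreading

A queue is a stateful node in the computation graph. Other nodes can modify its contents: they can insert new elements at the rear of the queue or remove elements from its front.

1. TensorFlow provides two kinds of queue, FIFOQueue and RandomShuffleQueue

The main operations that modify a queue's state are Enqueue, EnqueueMany, and Dequeue. They need a handle to the queue rather than an ordinary value, since only then can they modify its contents. In the Python API they are methods of the queue object, for example q.enqueue().

EnqueueMany: initializes the queue with several elements
Dequeue: removes an element from the queue
Enqueue: adds an element to the queue

FIFOQueue: a first-in, first-out queue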

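import tensorflow as tf

# Create a first-in, first-out queue that can hold two elements of integer type
q=tf.FIFOQueue(2,"int32")

# Initialize the elements of the queue with enqueue_many
# As with variable initialization, this step must be run explicitly before the queue is used
init=q.enqueue_many(([0,10],))

# Dequeue the first element of the queue; its value is stored in x
x=q.dequeue()

# Add 1 to the value
y=x+1

# Put the incremented value back into the queue
q_inc=q.enqueue([y])

with tf.Session() as sess:
    # Run the queue initialization
    init.run()
    for _ in range(5):
        # Running q_inc performs the whole cycle: dequeue, add 1, enqueue again
        v,_ =sess.run([x,q_inc])
        # Print the value of the dequeued element
        print(v)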

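Output:

# The queue starts with the two elements [0, 10]. The first element dequeued is 0; after adding 1 it is enqueued again, giving [10, 1].
# The second element dequeued is 10; adding 1 gives 11, so the queue becomes [1, 11], and so on.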

0
10
1
11
2
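
The same sequence can be traced without TensorFlow. The sketch below uses collections.deque as a stand-in for the FIFO queue; it is only meant to make the order of operations visible and says nothing about how the TensorFlow queue is implemented.

```python
from collections import deque

q = deque([0, 10])          # plays the role of q.enqueue_many(([0, 10],))
seen = []
for _ in range(5):
    x = q.popleft()         # dequeue from the front
    q.append(x + 1)         # enqueue x + 1 at the rear
    seen.append(x)
print(seen)                 # [0, 10, 1, 11, 2]
print(list(q))              # [12, 3]
```

Because elements leave in the order in which they arrived, the two initial values alternate in the output, each one incremented every second step.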

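RandomShuffleQueue: shuffles the elements in the queue, so each dequeue operation returns a random element of the current queue

Neural network training benefits from training data that is as random as possible, so this queue is used more often.

2. The role of queues

As shown above, a queue is a data structure.
It is also an important mechanism for computing tensor values asynchronously: multiple threads can write elements to the same queue at the same time, or read elements from it at the same time.

3. Coordinating multiple threads

Two classes, tf.train.Coordinator and tf.train.QueueRunner, provide multithread coordination.

tf.train.Coordinator is mainly used to stop multiple threads together; it provides three functions: should_stop, request_stop, and join.
How it works:
1. Create a tf.train.Coordinator object and pass it to every thread that is created.
2. Each running thread keeps polling should_stop and exits when it returns True.
3. Any running thread can call request_stop to tell the other threads to exit. Once a thread calls request_stop, should_stop returns True, so the other threads terminate as well.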

import tensorflow as tf
import numpy as np
import threading
import time

# The function run in each thread: every second it checks whether it should stop and prints its own ID
def MyLoop(coord,worker_id):
    # Use the coordination tools of tf.train.Coordinator to check whether this thread should stop
    while not coord.should_stop():
        # Stop all threads at random
        if np.random.rand()<0.1:
            print("stop from id: %d\n" % worker_id)
            # Call coord.request_stop() to tell the other threads to stop
            coord.request_stop()
        else:
            # Print the ID of the current thread
            print("working on id: %d" % worker_id)
        # Pause for 1 s
        time.sleep(1)

# Create a tf.train.Coordinator to coordinate the threads
coord=tf.train.Coordinator()
# Create 5 threads
threads=[threading.Thread(target=MyLoop,args=(coord,i,)) for i in range(5)]
# Start all the threads
for t in threads:t.start()
# Wait for all the threads to exit
coord.join(threads)

Output:

working on id: 0
working on id: 1
working on id: 2
working on id: 3
working on id: 4
working on id: 0
working on id: 3
working on id: 2
working on id: 1
working on id: 4
working on id: 3
working on id: 0
working on id: 2
working on id: 4
working on id: 1
stop from id: 4

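Once all the threads have started, each one prints its own ID, which accounts for the first 5 lines. After pausing for 1 s, every thread prints its ID a second time, and so on until one of them requests a stop.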
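
The stop protocol itself needs little more than a shared flag. The following sketch is a hypothetical stand-in built on threading.Event, written only to show the three functions working together; SimpleCoordinator is not the TensorFlow class and omits features such as exception propagation. To keep the run deterministic, worker 0 requests the stop on its third iteration instead of at random.

```python
import threading
import time

class SimpleCoordinator:
    """Hypothetical stand-in offering should_stop, request_stop and join."""

    def __init__(self):
        self._stop_event = threading.Event()

    def should_stop(self):
        return self._stop_event.is_set()

    def request_stop(self):
        self._stop_event.set()

    def join(self, threads):
        for t in threads:
            t.join()

def worker(coord, worker_id, finished):
    steps = 0
    while not coord.should_stop():
        steps += 1
        if worker_id == 0 and steps == 3:
            coord.request_stop()        # worker 0 tells everyone to stop
        else:
            time.sleep(0.01)            # simulate a unit of work
    finished.append(worker_id)          # reached only after the stop request

coord = SimpleCoordinator()
finished = []
threads = [threading.Thread(target=worker, args=(coord, i, finished))
           for i in range(3)]
for t in threads:
    t.start()
coord.join(threads)
print(sorted(finished))                 # [0, 1, 2]
```

Every worker leaves its loop shortly after the single request_stop call, which is the behaviour the three-step description above relies on.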

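tf.train.QueueRunner is mainly used to start multiple threads that operate on the same queue.
All the threads it starts can be managed together through a tf.train.Coordinator.
The following code shows how to use tf.train.QueueRunner and tf.train.Coordinator to manage multithreaded queue operations: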

import tensorflow as tf

# Create a first-in, first-out queue that holds at most 100 elements of float type
queue = tf.FIFOQueue(100,"float")
# Define the enqueue operation of the queue
enqueue_op = queue.enqueue([tf.random_normal([1])])

# Use tf.train.QueueRunner to create several threads that run the enqueue operation
# The first argument of tf.train.QueueRunner is the queue being operated on
# [enqueue_op] * 5 means that 5 threads are started, each of them running enqueue_op
qr = tf.train.QueueRunner(queue, [enqueue_op] * 5)

# Add the QueueRunner to a collection of the TensorFlow computation graph
# If tf.train.add_queue_runner is given no collection, the default tf.GraphKeys.QUEUE_RUNNERS is used
# The call below adds qr to that default collection
tf.train.add_queue_runner(qr)

# Define the dequeue operation
out_tensor = queue.dequeue()

with tf.Session() as sess:
    # Use tf.train.Coordinator to coordinate the started threads
    coord = tf.train.Coordinator()

    # With tf.train.QueueRunner, tf.train.start_queue_runners must be called explicitly to start the threads
    # Otherwise no thread runs the enqueue operation,
    # and a dequeue operation waits forever for an element to be enqueued
    # By default tf.train.start_queue_runners starts every QueueRunner in the tf.GraphKeys.QUEUE_RUNNERS collection
    # Because it only starts the QueueRunners of the collection it is given,
    # tf.train.add_queue_runner and tf.train.start_queue_runners are normally given the same collection
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # Fetch values from the queue
    for _ in range(3):
        print(sess.run(out_tensor)[0])

    # Use tf.train.Coordinator to stop all the threads
    coord.request_stop()
    coord.join(threads)

Output:

-0.574549
1.83348
-0.67578

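7.3.2 The Input File Queue

This section describes how to manage the list of input files with TensorFlow queues.

A single TFRecord file can store many training examples, but when the training set is large, the data can be split across several TFRecord files to make processing more efficient.

Getting all files that match a pattern: tf.train.match_filenames_once (the pattern is a glob-style wildcard such as data.tfrecords-*, not a full regular expression)
Managing the resulting file list: tf.train.string_input_producer

tf.train.string_input_producer creates an input queue from the file list supplied at initialization; the initial elements of the queue are all the files in that list. The queue can then be passed to a file reading function. Each time the reading function is called, it first checks whether a file is already open and still has data. If not, or if the open file has been read to the end, it dequeues a file from the input queue and reads from that file.

shuffle: when shuffle=True, the files are shuffled before being added to the queue, so they are also dequeued in random order.
- Shuffling the files and adding them to the input queue runs in a separate thread, so it does not slow down file retrieval.
- Once every file in the input queue has been processed, all files in the initial file list are added to the queue again.
num_epochs: limits the maximum number of times the initial file list is loaded.
- When it is set to 1, the program stops automatically after one pass: further attempts to read raise an OutOfRange error.
- When testing a neural network model, all the test data needs to be used only once, so num_epochs is set to 1.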
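
The interplay of shuffle and num_epochs can be modelled with a plain generator. This is a sketch under stated assumptions: filename_producer is a hypothetical function, it runs in the calling thread instead of a background thread, and it simply ends when num_epochs is exhausted instead of raising an error. The file names follow the data.tfrecords-0000n-of-0000m convention used in the sample program of this section.

```python
import random
from itertools import islice

def filename_producer(filenames, num_epochs=None, shuffle=True, seed=None):
    """Yield file names epoch by epoch, reshuffling before every epoch."""
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        order = list(filenames)
        if shuffle:
            rng.shuffle(order)          # new random order for this epoch
        for name in order:
            yield name
        epoch += 1

files = ["data.tfrecords-00000-of-00002", "data.tfrecords-00001-of-00002"]
# Two passes in the original order, then the producer is exhausted.
print(list(filename_producer(files, num_epochs=2, shuffle=False)))
# Without num_epochs the list is replayed indefinitely; take the first five.
print(list(islice(filename_producer(files, shuffle=False), 5)))
```

Shuffling happens once per epoch, so every file is still visited exactly once in each pass over the data.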

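A simple program that generates sample data:

import tensorflow as tf

# Helper function for building TFRecord features
def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))
# Simulate writing a large dataset to several files
# num_shards is the total number of files written
# instances_per_shard is the number of examples in each file
num_shards=2
instances_per_shard=2
for i in range(num_shards):
    # When data is split across files, the files can be told apart by a suffix such as 0000n-of-0000m
    # m: the total number of files the data is stored in
    # n: the index of the current file
    # This naming scheme makes it easy to collect the files with a pattern and puts extra information into the file name
    # The directory matches the one that the reading code below searches
    filename=("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-%.5d-of-%.5d" % (i,num_shards))
    writer=tf.python_io.TFRecordWriter(filename)
    # Wrap the data in an Example structure and write it to the TFRecord file
    for j in range(instances_per_shard):
        # The Example only records which file the sample belongs to and its position within that file
        example=tf.train.Example(features=tf.train.Features(feature={
            'i':_int64_feature(i),
            'j':_int64_feature(j)}))
        writer.write(example.SerializeToString())
    writer.close()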

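Running the program creates two files in the target directory.
Each file stores two examples. With the sample data in place, the following code shows how tf.train.match_filenames_once and tf.train.string_input_producer are used: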

import tensorflow as tf

# Get the file list with tf.train.match_filenames_once
files = tf.train.match_filenames_once("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-*")
# Create the input queue; shuffle=False keeps the file order fixed for this demonstration
filename_queue = tf.train.string_input_producer(files, shuffle=False)
# Read and parse one example
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
      serialized_example,
      features={
          'i': tf.FixedLenFeature([], tf.int64),
          'j': tf.FixedLenFeature([], tf.int64),
      })
with tf.Session() as sess:
    # match_filenames_once stores its result in a local variable, so initializing only
    # the global variables raises an error; the local variables must be initialized as well
    sess.run([tf.global_variables_initializer(),tf.local_variables_initializer()])
    print(sess.run(files))

    # Create a tf.train.Coordinator to coordinate the threads, then start them
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # Fetch data several times
    for i in range(6):
        print(sess.run([features['i'], features['j']]))
    coord.request_stop()
    coord.join(threads)

Output:

[b'E:\\pycharm\\TensorFlow chap7\\data.tfrecords-00000-of-00002'
 b'E:\\pycharm\\TensorFlow chap7\\data.tfrecords-00001-of-00002']
[0, 0]
[0, 1]
[1, 0]
[1, 1]
[0, 0]
[0, 1]

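7.3.3 Combining Training Data (Batching)

Organizing several input examples into a batch makes model training more efficient, so after individual examples have been preprocessed they still need to be grouped into batches before being fed to the input layer of the neural network.
1. tf.train.batch: organizes examples into batches. It creates a queue whose enqueue operation produces a single example, while each dequeue returns one batch of examples.
2. tf.train.shuffle_batch: organizes examples into batches in the same way (single examples are enqueued, a whole batch is dequeued), but additionally shuffles the order of the data.

tf.train.batch example: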

import tensorflow as tf

# Get the file list
files = tf.train.match_filenames_once("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-*")

# Create the input file queue
filename_queue = tf.train.string_input_producer(files, shuffle=False)

# Read and parse one Example
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    features={
        'i': tf.FixedLenFeature([], tf.int64),
        'j': tf.FixedLenFeature([], tf.int64)
    })

# Here i stands in for the feature vector and j for the label
example, label = features['i'], features['j']

# Number of examples in one batch
batch_size = 3

# Maximum number of examples the batching queue can hold
capacity = 1000 + 3 * batch_size

# Combine the examples into batches
example_batch, label_batch = tf.train.batch(
    [example, label], batch_size=batch_size, capacity=capacity)

with tf.Session() as sess:
    # match_filenames_once requires local_variables_initializer to initialize its variables
    sess.run(
        [tf.global_variables_initializer(),
         tf.local_variables_initializer()])

    # Coordinate the threads with a Coordinator and start them
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Fetch and print the batches; in a real problem they would be the input of the neural network
    for i in range(2):
        cur_example_batch, cur_label_batch = sess.run(
            [example_batch, label_batch])
        print(cur_example_batch, cur_label_batch)

    coord.request_stop()
    coord.join(threads)

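Output:

[0 0 1] [0 1 0]
[1 0 0] [1 0 1]
# tf.train.batch groups the individual examples into batches of 3.
# The values read into example and label are, in order:
example: 0, label: 0
example: 0, label: 1
example: 1, label: 0
example: 1, label: 1
# tf.train.batch does not shuffle, so consecutive examples (wrapping around to the first file once the data is exhausted) are grouped into the output shown above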
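
The grouping can be reproduced with a few lines of plain Python. In this sketch the hypothetical generator batches plays the part of the batching queue, and itertools.cycle stands in for the file queue that replays the four records indefinitely; threading and queue capacity are left out.

```python
from itertools import cycle, islice

def batches(examples, batch_size):
    """Group consecutive examples into lists of batch_size items."""
    batch = []
    for example in examples:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []

records = [(0, 0), (0, 1), (1, 0), (1, 1)]    # (i, j) pairs in file order
for batch in islice(batches(cycle(records), 3), 2):
    i_values = [i for i, _ in batch]
    j_values = [j for _, j in batch]
    print(i_values, j_values)
# [0, 0, 1] [0, 1, 0]
# [1, 0, 0] [1, 0, 1]
```

The second batch starts with the last record of the second file and then wraps around to the first file, which matches the output listed above.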

tf.train.shuffle_batch example:

import tensorflow as tf
# Get the file list
files = tf.train.match_filenames_once("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-*")

# Create the input file queue
filename_queue = tf.train.string_input_producer(files, shuffle=False)

# Read and parse one Example
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    features={
        'i': tf.FixedLenFeature([], tf.int64),
        'j': tf.FixedLenFeature([], tf.int64)
    })

# Here i stands in for the feature vector and j for the label
example, label = features['i'], features['j']

# Number of examples in one batch
batch_size = 3

# Maximum number of examples the batching queue can hold
capacity = 1000 + 3 * batch_size

# Combine the examples into batches
#  `min_after_dequeue` is specific to this function: it sets the minimum number of
#   elements that must remain in the queue after a dequeue, because shuffling
#   means little when the queue holds too few elements
example_batch,label_batch = tf.train.shuffle_batch(
    [example,label],batch_size=batch_size,
    capacity=capacity,min_after_dequeue=30)

with tf.Session() as sess:
    # match_filenames_once requires local_variables_initializer to initialize its variables
    sess.run( [tf.global_variables_initializer(),tf.local_variables_initializer()])

    # Coordinate the threads with a Coordinator and start them
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Fetch and print the batches; in a real problem they would be the input of the neural network
    for i in range(2):
        cur_example_batch, cur_label_batch = sess.run(
            [example_batch, label_batch])
        print(cur_example_batch, cur_label_batch)

    coord.request_stop()
    coord.join(threads)

Output:

[0 1 1] [0 1 0]
[1 0 0] [0 0 1]
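
The role of min_after_dequeue becomes clearer in a small model of the shuffling buffer. The sketch below is an assumption-laden simplification: shuffle_batches is a hypothetical generator, it ignores capacity and threading, and it removes one random element whenever more than min_after_dequeue elements are waiting.

```python
import random
from itertools import cycle, islice

def shuffle_batches(examples, batch_size, min_after_dequeue, rng):
    """Yield batches drawn at random from a buffer of pending examples."""
    buffer, batch = [], []
    for example in examples:
        buffer.append(example)
        # Dequeue only if at least min_after_dequeue elements stay behind.
        if len(buffer) > min_after_dequeue:
            batch.append(buffer.pop(rng.randrange(len(buffer))))
            if len(batch) == batch_size:
                yield batch
                batch = []

records = [(0, 0), (0, 1), (1, 0), (1, 1)]
rng = random.Random(42)
for batch in islice(shuffle_batches(cycle(records), 3, 30, rng), 2):
    print(batch)
```

A larger min_after_dequeue means each element is drawn from a bigger pool and the batches are better mixed, at the cost of more memory and a longer wait before the first batch; with min_after_dequeue=0 the buffer never holds more than one element and the original order is preserved.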

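7.3.4 The Input Data Processing Framework

This section gives the code that combines the steps above.

The framework covers three main parts:

The TFRecord input data format
Image data processing
Multithreaded input data processing

The code below only outlines an input data processing framework and must be adapted to the actual environment in which it is used.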

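import tensorflow as tf

# Create the file list
files = tf.train.match_filenames_once("E:\\pycharm\\TensorFlow chap7\\data.tfrecords-*")

# Create the input file queue
filename_queue = tf.train.string_input_producer(files, shuffle=False)

# Parse the data. image is assumed to hold the image data, label the label,
# and height, width, channels the dimensions of the image
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(
    serialized_example,
    features={
        'image': tf.FixedLenFeature([], tf.string),
        'label': tf.FixedLenFeature([], tf.int64),
        'height': tf.FixedLenFeature([], tf.int64),
        'width': tf.FixedLenFeature([], tf.int64),
        'channels': tf.FixedLenFeature([], tf.int64)
    })
image, label = features['image'], features['label']
height, width = features['height'], features['width']
channels = features['channels']

# Recover the pixel matrix from the raw bytes and restore the image
decoded_image = tf.decode_raw(image, tf.uint8)
# height, width and channels are tensors, so the shape is restored with tf.reshape;
# set_shape only accepts values that are known when the graph is built
decoded_image = tf.reshape(decoded_image, tf.stack([height, width, channels]))

# Size of the images fed to the input layer of the neural network
image_size = 299

# preprocess_for_train is the image preprocessing function defined in section 7.2.2
distorted_image = preprocess_for_train(decoded_image, image_size, image_size,None)
# shuffle_batch needs fully defined shapes; if the number of channels is not known
# statically, fix it first, e.g. distorted_image.set_shape([image_size, image_size, 3]) for RGB data

# Combine the examples into batches
min_after_dequeue = 10000
batch_size = 100
capacity = min_after_dequeue + 3 * batch_size
image_batch, label_batch = tf.train.shuffle_batch(
    [distorted_image, label],
    batch_size=batch_size,
    capacity=capacity,
    min_after_dequeue=min_after_dequeue)

# Define the structure of the neural network and the optimization process
# inference, calc_loss, learning_rate and TRAINING_ROUNDS are placeholders
# that must be supplied by the surrounding program
logit = inference(image_batch)
loss = calc_loss(logit, label_batch)
train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

with tf.Session() as sess:
    sess.run(
        [tf.global_variables_initializer(),
         tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    # Training loop of the neural network
    for i in range(TRAINING_ROUNDS):
        sess.run(train_step)

    # Stop the input threads and wait for them to finish
    coord.request_stop()
    coord.join(threads)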

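Summary

The processing flow for input data is broadly the same from case to case and can be summarized as follows:

Convert the data into several files in the TFRecord format
Create the file list with tf.train.match_filenames_once() ({A, B, C} in the figure)
Create the input file queue with tf.train.string_input_producer(), which can optionally shuffle the input files before adding them to the queue. The function also creates and maintains the input file queue, which file reading functions in different threads can share
Read the data in the files with tf.TFRecordReader()
Parse the data with tf.parse_single_example()
Decode and preprocess the data
Combine the data into batches with tf.train.shuffle_batch()
Use the batches for training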

[Figure: the complete input data processing flow used by the code above]
