How to implement a fixed-length spatial pyramid pooling layer?

2024-04-20

I would like to implement the spatial pyramid pooling layer introduced in this paper: https://arxiv.org/pdf/1406.4729v4.pdf

As the paper describes, the key point is to give the max_pooling layer a kernel size and a stride size that vary with the input, namely:

kernel_size = ceil(a/n)
stride_size = floor(a/n)

where a is the spatial size of the input tensor and n is the pyramid level, i.e. the number of spatial bins in the pooling output.
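As a quick worked example of these two formulas, the sketch below evaluates them for a single spatial size. The helper name `paper_window` and the choice a = 13 are illustrative assumptions, and Python 3 division semantics are assumed (`/` is true division).

```python
import math


def paper_window(a, n):
    """Kernel and stride for one pyramid level, using the formulas above.

    a: spatial size of the input along one axis
    n: number of bins wanted along that axis
    """
    kernel = math.ceil(a / n)  # ceil(a/n)
    stride = a // n            # floor(a/n)
    return kernel, stride


# Illustrative input size; any a >= n works.
for n in (6, 3, 2, 1):
    kernel, stride = paper_window(13, n)
    print(f"n={n}: kernel={kernel}, stride={stride}")
```

For a = 13 this gives kernel/stride pairs of 3/2, 5/4, 7/6 and 13/13 for n = 6, 3, 2 and 1.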

I tried to implement this layer with TensorFlow:

import numpy as np
import tensorflow as tf


def spp_layer(input_, name='SPP_layer'):
    """
    4 level SPP layer.

    spatial bins: [6_6, 3_3, 2_2, 1_1]

    Parameters
    ----------
    input_ : tensor
    name : str

    Returns
    -------
    tensor
    """
    shape = input_.get_shape().as_list()

    with tf.variable_scope(name):

        spp_6_6_pool = tf.nn.max_pool(input_,
                                      ksize=[1,
                                             np.ceil(shape[1]/6).astype(np.int32),
                                             np.ceil(shape[2]/6).astype(np.int32),
                                             1],
                                      strides=[1, shape[1]//6, shape[2]//6, 1],
                                      padding='SAME')
        print('SPP layer level 6:', spp_6_6_pool.get_shape().as_list())

        spp_3_3_pool = tf.nn.max_pool(input_,
                                      ksize=[1,
                                             np.ceil(shape[1]/3).astype(np.int32),
                                             np.ceil(shape[2]/3).astype(np.int32),
                                             1],
                                      strides=[1, shape[1]//3, shape[2]//3, 1],
                                      padding='SAME')
        print('SPP layer level 3:', spp_3_3_pool.get_shape().as_list())

        spp_2_2_pool = tf.nn.max_pool(input_,
                                      ksize=[1,
                                             np.ceil(shape[1]/2).astype(np.int32),
                                             np.ceil(shape[2]/2).astype(np.int32),
                                             1],
                                      strides=[1, shape[1]//2, shape[2]//2, 1],
                                      padding='SAME')
        print('SPP layer level 2:', spp_2_2_pool.get_shape().as_list())

        spp_1_1_pool = tf.nn.max_pool(input_,
                                      ksize=[1,
                                             np.ceil(shape[1]/1).astype(np.int32),
                                             np.ceil(shape[2]/1).astype(np.int32),
                                             1],
                                      strides=[1, shape[1]//1, shape[2]//1, 1],
                                      padding='SAME')
        print('SPP layer level 1:', spp_1_1_pool.get_shape().as_list())

        spp_6_6_pool_flat = tf.reshape(spp_6_6_pool, [shape[0], -1])
        spp_3_3_pool_flat = tf.reshape(spp_3_3_pool, [shape[0], -1])
        spp_2_2_pool_flat = tf.reshape(spp_2_2_pool, [shape[0], -1])
        spp_1_1_pool_flat = tf.reshape(spp_1_1_pool, [shape[0], -1])

        # tf.concat takes (values, axis) since TensorFlow 1.0
        spp_pool = tf.concat([spp_6_6_pool_flat,
                              spp_3_3_pool_flat,
                              spp_2_2_pool_flat,
                              spp_1_1_pool_flat], axis=1)

    return spp_pool

But when the input size varies, it cannot guarantee a pooling output of the same length.
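To make the mismatch concrete, here is a minimal sketch in plain Python that only computes output sizes, without running TensorFlow. It relies on the fact that with padding='SAME' the output size along an axis is ceil(input / stride), regardless of the kernel size. The helper name `same_pool_bins` and the sample sizes are illustrative assumptions. A square input with a >= n is also assumed.

```python
import math


def same_pool_bins(a, n):
    """Bins produced along one axis by the SAME-padded pooling above."""
    stride = a // n  # same stride as in the code above
    # padding='SAME' yields ceil(input / stride) outputs per axis.
    return math.ceil(a / stride)


levels = (6, 3, 2, 1)
for a in (12, 13, 14):
    bins = [same_pool_bins(a, n) for n in levels]
    per_channel_length = sum(b * b for b in bins)
    print(f"a={a}: bins={bins}, per-channel length={per_channel_length}")
```

Only a = 12, which is divisible by every level, gives the intended bins [6, 3, 2, 1]. For a = 13 the levels produce [7, 4, 3, 1], so the concatenated vector changes length.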

How can I solve this problem?


I believe the authors of the paper are wrong; the formulas should be:

stride_size = floor(a/n)
kernel_size = floor(a/n) + (a mod n)

Note that when n

I modified the code I found at https://github.com/tensorflow/tensorflow/issues/6011; here it is:

def spp_layer(input_, levels=(6, 3, 2, 1), name='SPP_layer'):
    shape = input_.get_shape().as_list()
    with tf.variable_scope(name):
        pyramid = []
        for n in levels:

            # stride = floor(a / n); kernel = stride + (a mod n)
            stride_1 = shape[1] // n
            stride_2 = shape[2] // n
            ksize_1 = stride_1 + (shape[1] % n)
            ksize_2 = stride_2 + (shape[2] % n)
            pool = tf.nn.max_pool(input_,
                                  ksize=[1, ksize_1, ksize_2, 1],
                                  strides=[1, stride_1, stride_2, 1],
                                  padding='VALID')

            # print("Pool Level {}: shape {}".format(n, pool.get_shape().as_list()))
            pyramid.append(tf.reshape(pool, [shape[0], -1]))
        spp_pool = tf.concat(pyramid, axis=1)  # (values, axis) order since TensorFlow 1.0
    return spp_pool
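
As a sanity check on the modified formulas, the sketch below verifies in plain Python that a VALID-padded pooling with stride = floor(a/n) and kernel = floor(a/n) + (a mod n) always yields exactly n bins per axis. The helper names and the tested range are illustrative assumptions. It also assumes a >= n, since otherwise the stride would be zero.

```python
def answer_window(a, n):
    """Kernel and stride from the modified formulas above."""
    stride = a // n
    kernel = stride + a % n
    return kernel, stride


def valid_pool_bins(a, kernel, stride):
    """Output size along one axis for padding='VALID'."""
    return (a - kernel) // stride + 1


def spp_length(channels, levels=(6, 3, 2, 1)):
    """Length of the concatenated SPP vector for one example."""
    return channels * sum(n * n for n in levels)


levels = (6, 3, 2, 1)
for a in range(max(levels), 200):
    for n in levels:
        kernel, stride = answer_window(a, n)
        # a - kernel = (n - 1) * stride, so the window fits exactly n times.
        assert valid_pool_bins(a, kernel, stride) == n

print(spp_length(channels=256))  # 256 * (36 + 9 + 4 + 1)
```

Because every level yields n x n bins whatever the input size, the output length depends only on the channel count and the chosen levels.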