TensorFlow equivalent of PyTorch's transforms.Normalize()

2024-05-15

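I am trying to run inference on a TFLite model that was originally built in PyTorch. I have been following the PyTorch implementation (https://github.com/leoxiaobin/deep-high-resolution-net.pytorch/blob/1ee551d619641268c2ebd80134101db6e962f45f/demo/inference.py#L93) and have to preprocess images along the RGB channels. The closest TensorFlow equivalent of transforms.Normalize() that I found is tf.image.per_image_standardization() (documentation: https://www.tensorflow.org/api_docs/python/tf/image/per_image_standardization). Although this is a fairly good match, tf.image.per_image_standardization() computes a single mean and standard deviation across all channels and applies them to the whole image. Here is their full implementation (https://github.com/tensorflow/tensorflow/blob/r1.1/tensorflow/python/ops/image_ops_impl.py):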

def per_image_standardization(image):
  """Linearly scales `image` to have zero mean and unit norm.
  This op computes `(x - mean) / adjusted_stddev`, where `mean` is the average
  of all values in image, and
  `adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements()))`.
  `stddev` is the standard deviation of all values in `image`. It is capped
  away from zero to protect against division by 0 when handling uniform images.
  Args:
    image: 3-D tensor of shape `[height, width, channels]`.
  Returns:
    The standardized image with same shape as `image`.
  Raises:
    ValueError: if the shape of 'image' is incompatible with this function.
  """
  image = ops.convert_to_tensor(image, name='image')
  _Check3DImage(image, require_static=False)
  num_pixels = math_ops.reduce_prod(array_ops.shape(image))

  image = math_ops.cast(image, dtype=dtypes.float32)
  image_mean = math_ops.reduce_mean(image)

  variance = (math_ops.reduce_mean(math_ops.square(image)) -
              math_ops.square(image_mean))
  variance = gen_nn_ops.relu(variance)
  stddev = math_ops.sqrt(variance)

  # Apply a minimum normalization that protects us against uniform images.
  min_stddev = math_ops.rsqrt(math_ops.cast(num_pixels, dtypes.float32))
  pixel_value_scale = math_ops.maximum(stddev, min_stddev)
  pixel_value_offset = image_mean

  image = math_ops.subtract(image, pixel_value_offset)
  image = math_ops.div(image, pixel_value_scale)
  return image

PyTorch's transforms.Normalize(), on the other hand, lets us specify the mean and standard deviation to apply to each channel, as shown below.

# transformation
    pose_transform = transforms.Compose([
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225]),
    ])
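To make the difference concrete, here is a minimal NumPy sketch, not the TensorFlow or torchvision source. The function names `standardize_whole_image` and `normalize_per_channel` and the random test image are assumptions introduced only for illustration. The first function restates the formula quoted above (one mean and one stddev pooled over all values); the second applies fixed per-channel statistics.

```python
import numpy as np

def standardize_whole_image(image):
    """Restates the quoted formula: one mean and one stddev computed
    over every value in the image, with all channels pooled together."""
    image = image.astype(np.float32)
    min_stddev = 1.0 / np.sqrt(image.size)      # guard against uniform images
    return (image - image.mean()) / max(image.std(), min_stddev)

def normalize_per_channel(image, mean, std):
    """Applies fixed, caller-supplied statistics per channel.
    Expects a float image in [0, 1] with shape (height, width, channels)."""
    mean = np.asarray(mean, dtype=np.float32)   # shape (3,), broadcasts over H and W
    std = np.asarray(std, dtype=np.float32)
    return (image.astype(np.float32) - mean) / std

# Hypothetical stand-in for a real photo: random RGB values in [0, 1].
rng = np.random.default_rng(0)
demo = rng.random((4, 5, 3)).astype(np.float32)

a = standardize_whole_image(demo)
b = normalize_per_channel(demo, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(a.mean(), a.std())        # close to 0 and 1 by construction
print(b.mean(axis=(0, 1)))      # depends on image content, not forced to 0
```

The whole-image version always yields zero mean and unit variance for that particular image, whereas the per-channel version maps every image through the same fixed affine transform, which is what a model trained with those constants expects.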

What is the way to get this functionality in TensorFlow 2.x?

Edit: I put together a quick workaround that seems to solve this by defining a function:

def normalize_image(image, mean, std):
    for channel in range(3):
        image[:,:,channel] = (image[:,:,channel] - mean[channel])/std[channel]
    
    return image

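I'm not sure how efficient this is, but it seems to get the job done. I still have to convert the output to a tensor before feeding it to the model.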
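On the efficiency point, the same per-channel arithmetic can be written without a Python loop by relying on NumPy broadcasting. The following is a sketch under that assumption; the name `normalize_image_vectorized` and the tiny 2x2 test image are hypothetical and not part of the original post.

```python
import numpy as np

def normalize_image_vectorized(image, mean, std):
    """Per-channel (image - mean) / std without a Python loop.
    `image` has shape (height, width, 3); a new float32 array is
    returned and the input is left unmodified."""
    image = np.asarray(image, dtype=np.float32)
    mean = np.asarray(mean, dtype=np.float32)   # shape (3,) broadcasts over the last axis
    std = np.asarray(std, dtype=np.float32)
    return (image - mean) / std

# Hypothetical 2x2 RGB image with 8-bit values.
pixels = np.array([[[0, 0, 0], [255, 255, 255]],
                   [[128, 64, 32], [10, 20, 30]]], dtype=np.uint8)
scaled = pixels / 255.0                         # same [0, 1] scaling as ToTensor()
out = normalize_image_vectorized(scaled, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(out.shape, out.dtype)
```

One difference from the loop version is worth keeping in mind: the loop assigns into its argument in place, so it only behaves correctly on a float array, while this sketch returns a new array regardless of the input dtype.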

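The workaround you mentioned looks fine. However, using a for loop to normalize each RGB channel of a single image can become a bottleneck when you process a large dataset in a data pipeline (a generator or tf.data). Either way, it works. Here is a demo of your approach, followed by two alternatives that might suit you.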

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
from matplotlib.pyplot import imshow, subplot, title, hist

# load image (RGB)
img = Image.open('/content/9.jpg')

def normalize_image(image, mean, std):
    # normalize each channel in place; `image` must be a float array
    for channel in range(3):
        image[:,:,channel] = (image[:,:,channel] - mean[channel]) / std[channel]
    return image

OP_approach = normalize_image(np.array(img) / 255.0, 
                            mean=[0.485, 0.456, 0.406], 
                            std=[0.229, 0.224, 0.225])

Now, let's look at the properties of the transformed image.

plt.figure(figsize=(25,10))
subplot(121); imshow(OP_approach)
title(f'Normalized Image \n min-px: {OP_approach.min()} \n max-pix: {OP_approach.max()}')
subplot(122); hist(OP_approach.ravel(), bins=50, density=True)
title('Histogram - pixel distribution')

After normalization, the minimum and maximum pixel values are (-2.1179039301310043, 2.6399999999999997), respectively.
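These two numbers can be derived by hand, assuming the image contains at least one fully dark red value and one fully bright blue value. The short sketch below computes, for each channel, what an input of 0.0 and of 1.0 maps to under the given mean and std.

```python
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

# Per-channel value that a fully dark (0.0) and a fully bright (1.0) input maps to.
lows = [(0.0 - m) / s for m, s in zip(mean, std)]
highs = [(1.0 - m) / s for m, s in zip(mean, std)]

for name, lo, hi in zip("RGB", lows, highs):
    print(f"{name}: [{lo:.4f}, {hi:.4f}]")

# Overall bounds: the red channel sets the minimum, the blue channel the maximum.
print(min(lows), max(highs))
```

The overall minimum comes from the red channel, (0 - 0.485) / 0.229, and the overall maximum from the blue channel, (1 - 0.406) / 0.225.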

Option 2

We can use the tf.keras...Normalization preprocessing layer (https://www.tensorflow.org/api_docs/python/tf/keras/layers/experimental/preprocessing/Normalization) to do the same thing. It takes two important arguments: mean and variance (the square of std).

# In newer TensorFlow releases the same layer is available as tf.keras.layers.Normalization.
from tensorflow.keras.layers.experimental.preprocessing import Normalization

input_data = np.array(img)/255
layer = Normalization(mean=[0.485, 0.456, 0.406], 
                      variance=[np.square(0.229), 
                                np.square(0.224), 
                                np.square(0.225)])

norm_keras = layer(input_data).numpy()  # run the layer once and reuse the result

plt.figure(figsize=(25,10))
subplot(121); imshow(norm_keras)
title(f'Normalized Image \n min-px: {norm_keras.min()} \n max-pix: {norm_keras.max()}')
subplot(122); hist(norm_keras.ravel(), bins=50, density=True)
title('Histogram - pixel distribution')

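After normalization, the minimum and maximum pixel values are (-2.0357144, 2.64), respectively. Note that -2.0357144 equals (0 - 0.456) / 0.224, the green-channel lower bound; it is the overall minimum only when the red-channel std is entered as 0.299. With the red std set to 0.229, the lower bound is (0 - 0.485) / 0.229 ≈ -2.1179, the same as the other per-channel results.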

Option 3

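This is more of an approximation: subtract the average of the three channel means (0.449) and divide by the average of the three channel stds (0.226).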

import tensorflow as tf

norm_img = ((tf.cast(np.array(img), tf.float32) / 255.0) - 0.449) / 0.226

plt.figure(figsize=(25,10))
subplot(121); imshow(norm_img.numpy()); title(f'Normalized Image \n min-px: \
{norm_img.numpy().min()} \n max-pix: {norm_img.numpy().max()}')
subplot(122); hist(norm_img.numpy().ravel(), bins=50, density=True); \
title('Histogram - pixel distribution')
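As a quick check on where the scalar constants in Option 3 come from and how far the shortcut drifts from true per-channel normalization, the sketch below averages the channel statistics and compares both transforms over the full [0, 1] input range. The variable names are illustrative assumptions.

```python
import numpy as np

mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

avg_mean = mean.mean()     # approximately 0.449
avg_std = std.mean()       # approximately 0.226

# Compare both normalizations over the full [0, 1] input range.
x = np.linspace(0.0, 1.0, 256)[:, None]      # shape (256, 1), broadcasts against (3,)
per_channel = (x - mean) / std               # shape (256, 3)
single_scalar = (x - avg_mean) / avg_std     # shape (256, 1)

# Largest absolute deviation per channel introduced by averaging the statistics.
max_gap = np.abs(per_channel - single_scalar).max(axis=0)
print(round(avg_mean, 3), round(avg_std, 3))
print(max_gap)
```

Under these constants the largest deviation is roughly 0.2 in normalized units, on the blue channel at full brightness. Whether that is acceptable depends on how sensitive the model is to its input statistics.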

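After normalization, the minimum and maximum pixel values are (-1.9867257, 2.4380531), respectively. Finally, if we compare against the PyTorch way, there is not much difference between these approaches.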

import torchvision.transforms as transforms

transform_norm = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                            std=[0.229, 0.224, 0.225]),
])
norm_pt = transform_norm(img)

plt.figure(figsize=(25,10))
subplot(121); imshow(np.array(norm_pt).transpose(1, 2, 0));\
  title(f'Normalized Image \n min-px: \
  {np.array(norm_pt).min()} \n max-pix: {np.array(norm_pt).max()}')
subplot(122); hist(np.array(norm_pt).ravel(), bins=50, density=True); \
  title('Histogram - pixel distribution')

After normalization, the minimum and maximum pixel values are (-2.117904, 2.64), respectively.
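For the tf.data case mentioned earlier, the per-channel normalization can also be expressed directly with tensor broadcasting and mapped over a dataset. This is a minimal sketch assuming TensorFlow 2.x and channels-last uint8 images; the `normalize` function, the constants `MEAN` and `STD`, and the random image batch are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import tensorflow as tf

# ImageNet statistics from the question; shape (3,) broadcasts over (H, W, 3).
MEAN = tf.constant([0.485, 0.456, 0.406], dtype=tf.float32)
STD = tf.constant([0.229, 0.224, 0.225], dtype=tf.float32)

def normalize(image):
    """Scale uint8 pixels to [0, 1], then apply per-channel (x - mean) / std."""
    image = tf.cast(image, tf.float32) / 255.0
    return (image - MEAN) / STD

# Hypothetical data source: four random 8x8 RGB images standing in for real data.
images = np.random.default_rng(0).integers(0, 256, size=(4, 8, 8, 3), dtype=np.uint8)
dataset = tf.data.Dataset.from_tensor_slices(images).map(normalize).batch(2)

for batch in dataset:
    print(batch.shape, batch.dtype)
```

The result is already a tensor, so no separate conversion step is needed before inference. If the converted model expects channels-first input, as PyTorch models typically do, an additional transpose would be required; that depends on how the model was converted.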
