Efficient image dilation in TensorFlow

2024-01-03

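I'm looking for an efficient way to implement morphological image dilation (https://en.wikipedia.org/wiki/Dilation_(morphology)) in TensorFlow with a square kernel. The obvious approaches appear to be hugely inefficient compared to what is possible, as OpenCV demonstrates. See the results of running the source code pasted at the bottom: even the fastest approach is around 30x slower than OpenCV. These timings are from a MacBook Air with an M1 chip.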

Dilation of 640x480 image with a 25x25 kernel took: 
  0.61ms using opencv
  545.40ms using tf.nn.max_pool2d
  228.66ms using tf.nn.dilation2d naively
  17.63ms using tf.nn.dilation2d with row-col

Question: Does anyone know of a way to do image dilation in TensorFlow that is not hugely inefficient?

Source code for the current solutions:

import numpy as np
import cv2
import tensorflow as tf
import time


def tf_dilate(heatmap, width: int, method: str = 'rowcol_dilate'):  # default must name an existing branch below
    """ Dilate the heatmap with a square kernel """
    if method=='maxpool':
        return tf.nn.max_pool2d(heatmap[None, :, :, None], ksize=width, padding='SAME', strides=(1, 1))[0, :, :, 0]
    elif method == 'naive_dilate':
        return tf.nn.dilation2d(heatmap[None, :, :, None], filters=tf.zeros((width, width, 1), dtype=heatmap.dtype),
                                        strides=(1, 1, 1, 1), padding="SAME", data_format="NHWC", dilations=(1, 1, 1, 1))[0, :, :, 0]
    elif method == 'rowcol_dilate':

        row_dilation = tf.nn.dilation2d(heatmap[None, :, :, None], filters=tf.zeros((1, width, 1), dtype=heatmap.dtype),
                                        strides=(1, 1, 1, 1), padding="SAME", data_format="NHWC", dilations=(1, 1, 1, 1))
        full_dilation = tf.nn.dilation2d(row_dilation, filters=tf.zeros((width, 1, 1), dtype=heatmap.dtype),
                                         strides=(1, 1, 1, 1), padding="SAME", data_format="NHWC", dilations=(1, 1, 1, 1))
        return full_dilation[0, :, :, 0]
    else:
        raise NotImplementedError(f'No method {method}')


def test_dilation_options(img_shape=(480, 640), kernel_size=25):

    img = np.random.randn(*img_shape).astype(np.float32)**2

    def get_result_and_time(version: str):

        tf_image = tf.constant(img, dtype=tf.float32)
        t_start = time.time()
        if version=='opencv':
            result = cv2.dilate(img, kernel=np.ones((kernel_size, kernel_size), dtype=np.float32))
            return time.time()-t_start, result
        else:
            result = tf_dilate(tf_image, width=kernel_size, method=version)
            return time.time()-t_start, result.numpy()

    t_opencv, result_opencv = get_result_and_time('opencv')
    t_maxpool, result_maxpool = get_result_and_time('maxpool')
    t_naive_dilate, result_naive_dilate = get_result_and_time('naive_dilate')
    t_rowcol_dilate, result_rowcol_dilate = get_result_and_time('rowcol_dilate')
    assert np.array_equal(result_opencv, result_maxpool), "Maxpool result did not match opencv result"
    assert np.array_equal(result_opencv, result_naive_dilate), "Naive dilation result did not match opencv result"
    assert np.array_equal(result_opencv, result_rowcol_dilate), "Row-col dilation result did not match opencv result"
    print(f'Dilation of {img_shape[1]}x{img_shape[0]} image with a {kernel_size}x{kernel_size} kernel took: '
          f'\n  {t_opencv*1000:.2f}ms using opencv'
          f'\n  {t_maxpool*1000:.2f}ms using tf.nn.max_pool2d'
          f'\n  {t_naive_dilate*1000:.2f}ms using tf.nn.dilation2d naively'
          f'\n  {t_rowcol_dilate*1000:.2f}ms using tf.nn.dilation2d with row-col'
          )


if __name__ == '__main__':
    test_dilation_options()
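
The 'rowcol_dilate' branch relies on a square window being separable: the maximum over a K x K window equals the maximum over K rows of the per-row maxima over K columns. The following is a minimal NumPy sketch of that identity, not the TensorFlow or OpenCV implementation; the helper names `dilate_2d_bruteforce` and `dilate_1d` are made up for illustration, and an odd kernel size with -inf padding is assumed.

```python
import numpy as np


def dilate_2d_bruteforce(img: np.ndarray, k: int) -> np.ndarray:
    """Max over a k x k window centred on each pixel (odd k, -inf padding)."""
    r = k // 2
    padded = np.pad(img, r, mode="constant", constant_values=-np.inf)
    out = np.empty_like(img)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out


def dilate_1d(img: np.ndarray, k: int, axis: int) -> np.ndarray:
    """Max over a window of length k along one axis (odd k, -inf padding)."""
    r = k // 2
    pad = [(0, 0), (0, 0)]
    pad[axis] = (r, r)
    padded = np.pad(img, pad, mode="constant", constant_values=-np.inf)
    n = img.shape[axis]
    # Stack the k shifted views and take their elementwise maximum.
    shifted = [np.take(padded, np.arange(s, s + n), axis=axis) for s in range(k)]
    return np.max(shifted, axis=0)


rng = np.random.default_rng(0)
img = rng.random((12, 15)).astype(np.float32)
k = 5

full = dilate_2d_bruteforce(img, k)                         # k*k comparisons per pixel
rowcol = dilate_1d(dilate_1d(img, k, axis=1), k, axis=0)    # 2*k comparisons per pixel
print(np.array_equal(full, rowcol))
```

Per pixel, the two passes cost about 2K comparisons instead of K*K. For K = 25 that is a factor of 12.5, which is in the same range as the gap between the naive and row-col dilation2d timings reported above.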

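Well, if you're OK with an approximate solution, there's always the "poor man's dilation", which approximates dilation with a weighted local average (a box filter), where the weights are obtained by raising the image to a power. It is O((H+K)*(W+K)), where W and H are the width and height of the image and K is the kernel size.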
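
To see why a power-weighted average can stand in for a maximum, here is a small pure-Python sketch on a single window. The function name `power_weighted_mean` and the sample values are illustrative assumptions; the quantity computed, sum(v**(p+1)) / sum(v**p), is sometimes called a Lehmer mean and is only meaningful here for non-negative values.

```python
def power_weighted_mean(values, power):
    """Weighted average of `values` with weights values**power (values >= 0)."""
    weights = [v ** power for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical pixel values inside one kernel window.
window = [0.2, 0.5, 0.9, 1.0]

# power=0 is the plain mean; larger powers move the result toward the max.
approximations = [power_weighted_mean(window, p) for p in (0, 1, 4, 16, 64)]
for p, value in zip((0, 1, 4, 16, 64), approximations):
    print(p, round(value, 4))
print("true max:", max(window))
```

The result never exceeds the true maximum, so this approximation underestimates the dilation; a higher power tightens it at the cost of a larger dynamic range in the weights.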

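It also has the advantage that gradients flow not only through the local maximum, but also through the runner-up values that compete with it.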
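
The gradient claim can be checked on one window with finite differences. This is a pure-Python sketch with made-up helper names (`power_weighted_mean`, `numerical_gradient`) and made-up values, not the TensorFlow autodiff path.

```python
def power_weighted_mean(values, power=4):
    """Weighted average of `values` with weights values**power (values >= 0)."""
    weights = [v ** power for v in values]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def numerical_gradient(f, values, eps=1e-6):
    """Central finite-difference gradient of f with respect to each entry."""
    grads = []
    for i in range(len(values)):
        up, down = list(values), list(values)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads


window = [0.2, 0.5, 0.9, 1.0]  # hypothetical values inside one kernel window

grad_max = numerical_gradient(max, window)                   # hard max
grad_soft = numerical_gradient(power_weighted_mean, window)  # power-weighted mean

print("hard max:", grad_max)
print("weighted:", [round(g, 4) for g in grad_soft])
```

With the hard maximum only the largest entry receives a gradient. With the weighted average every entry receives a nonzero one, although the sign can be negative for values well below the window's average.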

See the code:

from typing import NewType, Optional, Union

import numpy as np
import tensorflow as tf

TensorImage = NewType('TensorImage', tf.Tensor)  # A (height, width, n_colors) uint8 image
TensorFloatImage = NewType('TensorFloatImage', tf.Tensor)
TensorHeatmap = NewType('TensorHeatmap', tf.Tensor)  # A (height, width) heatmap

def tf_box_filter(image: Union[TensorImage, TensorFloatImage, TensorHeatmap], width: int, normalize: bool = True, weights: Optional[TensorHeatmap] = None,
                  weight_eps: float = 1e-6, norm_weights: bool = True):
    image = tf.cast(image, tf.float32) if image.dtype != tf.float64 else image
    if weights is not None:
        if norm_weights:
            weights = weights/(width**2)
        if len(image.shape) == 3:
            weights = weights[:, :, None]  # Lets us broadcast weights against image

        image = image * weights

    lwidth = width // 2 + 1
    rwidth = width - lwidth

    integral_image_container = tf.pad(image,
                                      paddings=[(lwidth, rwidth), (lwidth, rwidth)] + [(0, 0)] * (len(image.shape) - 2))
    integral_image_container = tf.cumsum(tf.cumsum(integral_image_container, axis=0), axis=1)
    box_image = integral_image_container[width:, width:] \
                - integral_image_container[width:, :-width] \
                - integral_image_container[:-width, width:] \
                + integral_image_container[:-width, :-width]

    if not normalize:
        return box_image if (weights is None or not norm_weights) else box_image*(width**2)
    elif weights is None:
        return box_image / (width ** 2)
    else:
        box_weights = tf_box_filter(weights, width=width, normalize=False)
        return (box_image + weight_eps) / (box_weights + weight_eps)


def tf_poor_mans_dilate(heatmap: TensorHeatmap, width: int, power: int = 4, cast_to_64: bool = False) -> TensorHeatmap:
    """ A 'poor man's' version of dilation, whose runtime is O((image_height+kernel_width) * (image_width+kernel_width))"""
    if cast_to_64:
        heatmap = tf.cast(heatmap, tf.float64)
    return tf_box_filter(heatmap, width, weights=heatmap**power, weight_eps=1e-9)
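
The box sums in tf_box_filter come from a summed-area table (the double cumsum), which makes each window sum a four-term lookup regardless of kernel size. Below is a minimal NumPy sketch of that identity, checked against a brute-force sum. It is a simplified "valid" variant without the padding used above, and the name `box_sum_integral` is an assumption for illustration.

```python
import numpy as np


def box_sum_integral(img: np.ndarray, k: int) -> np.ndarray:
    """Sum over the k x k window whose top-left corner is each pixel ('valid' output)."""
    # sat[a, b] holds the sum of img[:a, :b]; the extra zero row and column
    # let every window sum be written as four lookups.
    sat = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    sat[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return sat[k:, k:] - sat[k:, :-k] - sat[:-k, k:] + sat[:-k, :-k]


rng = np.random.default_rng(0)
img = rng.integers(0, 10, size=(8, 9)).astype(np.float64)
k = 3

fast = box_sum_integral(img, k)

# Brute-force reference: explicit sum over every window.
slow = np.array([[img[i:i + k, j:j + k].sum()
                  for j in range(img.shape[1] - k + 1)]
                 for i in range(img.shape[0] - k + 1)])

print(np.array_equal(fast, slow))
```

Integer-valued test data keeps the comparison exact. With float32 images and large sums, the subtraction of big cumulative totals can lose precision, which is presumably why the answer offers the `cast_to_64` option.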

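Tests show it is roughly 3x faster than the solution in the question for a 40x40 kernel; the advantage grows with kernel size, and at 20x20 it is closer to 1.5x.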


def test_poor_mans_dilate(show=False):
    """ Can be faster for large images and kernels

    Dilating image of shape (1280, 720) with kernel of shape 40x40
        Real Dilate: Elapsed time is 0.09009s
        Poor Man's Dilate: Elapsed time is 0.02953s

    Dilating image of shape (640, 480) with kernel of shape 40x40
        Real Dilate: Elapsed time is 0.03089s
        Poor Man's Dilate: Elapsed time is 0.008736s

    Dilating image of shape (640, 480) with kernel of shape 20x20
        Real Dilate: Elapsed time is 0.01475s
        Poor Man's Dilate: Elapsed time is 0.009809s
    """
    img = tf.random.Generator.from_seed(1234).normal((640, 480))**4
    width = 20
    print(f'Dilating image of shape {img.shape} with kernel of shape {width}x{width}')
    with profile_context('Real Dilate', print_result=True):
        dil_img = tf_dilate(img, width=width, method='rowcol_dilate')
    with profile_context("Poor Man's Dilate", print_result=True):
        poor_dil_img = tf_poor_mans_dilate(img, width=width)

    assert np.allclose(dil_img.numpy().max(), poor_dil_img.numpy().max(), rtol=0.001)
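
The test uses a `profile_context` helper that the answer does not show. Here is a hypothetical stand-in, assuming it only times the enclosed block and optionally prints the result in the format seen in the docstring; the real helper may do more.

```python
import time
from contextlib import contextmanager


@contextmanager
def profile_context(name: str, print_result: bool = False):
    """Hypothetical stand-in: time the enclosed block and optionally print it."""
    result = {"name": name, "elapsed": None}
    t_start = time.perf_counter()
    try:
        yield result
    finally:
        result["elapsed"] = time.perf_counter() - t_start
        if print_result:
            print(f"{name}: Elapsed time is {result['elapsed']:.4g}s")


# Usage: any block can be timed, here a short sleep.
with profile_context("sleep 10 ms", print_result=True) as timing:
    time.sleep(0.01)
```

The first call to a TensorFlow op may include one-time setup cost, so running each function once before timing it tends to give more representative numbers.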
