How can I make the bounding boxes in YOLO v3 tighter (closer to the object)?

2023-12-20

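I am following this repo https://github.com/ayooshkathuria/YOLO_v3_tutorial_from_scratch to create a YOLO v3 model from scratch in PyTorch. The only problem is that, in most of the images I tried, the bounding boxes are not very tight (close to the object). I compared them with this tutorial https://github.com/shahkaran76/yolo_v3-tensorflow-ipynb/blob/master/YOLO%20Tensorflow.ipynb, which also builds a YOLO v3 model but uses TensorFlow. The TensorFlow model produces excellent bounding boxes that are as tight to the object as possible.
I tried to understand how the computations differ between the two, but I found myself lost between torch and tf.
I believe the bounding-box code in the tf tutorial comes from here: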

import tensorflow as tf  # TensorFlow 1.x API (tf.layers, tf.to_float)


def yolo_layer(inputs, n_classes, anchors, img_size, data_format):
    """Creates Yolo final detection layer.

    Detects boxes with respect to anchors.

    Args:
        inputs: Tensor input.
        n_classes: Number of labels.
        anchors: A list of anchor sizes.
        img_size: The input size of the model.
        data_format: The input format.

    Returns:
        Tensor output.
    """
    n_anchors = len(anchors)

    inputs = tf.layers.conv2d(inputs, filters=n_anchors * (5 + n_classes),
                              kernel_size=1, strides=1, use_bias=True,
                              data_format=data_format)

    shape = inputs.get_shape().as_list()
    grid_shape = shape[2:4] if data_format == 'channels_first' else shape[1:3]
    if data_format == 'channels_first':
        inputs = tf.transpose(inputs, [0, 2, 3, 1])
    inputs = tf.reshape(inputs, [-1, n_anchors * grid_shape[0] * grid_shape[1],
                                 5 + n_classes])

    strides = (img_size[0] // grid_shape[0], img_size[1] // grid_shape[1])

    box_centers, box_shapes, confidence, classes = \
        tf.split(inputs, [2, 2, 1, n_classes], axis=-1)

    x = tf.range(grid_shape[0], dtype=tf.float32)
    y = tf.range(grid_shape[1], dtype=tf.float32)
    x_offset, y_offset = tf.meshgrid(x, y)
    x_offset = tf.reshape(x_offset, (-1, 1))
    y_offset = tf.reshape(y_offset, (-1, 1))
    x_y_offset = tf.concat([x_offset, y_offset], axis=-1)
    x_y_offset = tf.tile(x_y_offset, [1, n_anchors])
    x_y_offset = tf.reshape(x_y_offset, [1, -1, 2])
    box_centers = tf.nn.sigmoid(box_centers)
    box_centers = (box_centers + x_y_offset) * strides

    anchors = tf.tile(anchors, [grid_shape[0] * grid_shape[1], 1])
    box_shapes = tf.exp(box_shapes) * tf.to_float(anchors)

    confidence = tf.nn.sigmoid(confidence)

    classes = tf.nn.sigmoid(classes)

    inputs = tf.concat([box_centers, box_shapes,
                        confidence, classes], axis=-1)

    return inputs
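
Read as formulas, the decoding in yolo_layer above appears to amount to the following. This is a sketch of my reading of the code, not something stated in the tutorial: the symbols are my own naming, with t the raw network outputs, c the grid-cell index from x_y_offset, s the stride, and p the anchor size in pixels.

```latex
\begin{aligned}
b_x &= \bigl(\sigma(t_x) + c_x\bigr)\, s_x, &
b_y &= \bigl(\sigma(t_y) + c_y\bigr)\, s_y, \\
b_w &= p_w \, e^{t_w}, &
b_h &= p_h \, e^{t_h}.
\end{aligned}
```

Here (b_x, b_y) is the box center and (b_w, b_h) the box size, all in input-image pixels.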

The bounding-box code of the PyTorch model, in turn, comes from here https://github.com/ayooshkathuria/YOLO_v3_tutorial_from_scratch/blob/master/util.py, with an explanation at https://blog.paperspace.com/how-to-implement-a-yolo-v3-object-detector-from-scratch-in-pytorch-part-3/:

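import numpy as np
import torch


def bbox_iou(box1, box2):
    """Returns the IoU of two sets of bounding boxes.

    Boxes are given in corner format (x1, y1, x2, y2), one box per row.
    """
    # Get the coordinates of the bounding boxes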
    b1_x1, b1_y1, b1_x2, b1_y2 = box1[:,0], box1[:,1], box1[:,2], box1[:,3]
    b2_x1, b2_y1, b2_x2, b2_y2 = box2[:,0], box2[:,1], box2[:,2], box2[:,3]

    # Get the coordinates of the intersection rectangle
    inter_rect_x1 =  torch.max(b1_x1, b2_x1)
    inter_rect_y1 =  torch.max(b1_y1, b2_y1)
    inter_rect_x2 =  torch.min(b1_x2, b2_x2)
    inter_rect_y2 =  torch.min(b1_y2, b2_y2)

    #Intersection area
    inter_area = torch.clamp(inter_rect_x2 - inter_rect_x1 + 1, min=0) * torch.clamp(inter_rect_y2 - inter_rect_y1 + 1, min=0)

    #Union Area
    b1_area = (b1_x2 - b1_x1 + 1)*(b1_y2 - b1_y1 + 1)
    b2_area = (b2_x2 - b2_x1 + 1)*(b2_y2 - b2_y1 + 1)

    iou = inter_area / (b1_area + b2_area - inter_area)

    return iou

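def predict_transform(prediction, inp_dim, anchors, num_classes, CUDA = True):
    """Turns a raw detection feature map into box predictions.

    The returned boxes are (center_x, center_y, width, height) in
    input-image pixels, followed by objectness and class scores.
    """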
    batch_size = prediction.size(0)
    stride =  inp_dim // prediction.size(2)
    grid_size = inp_dim // stride
    bbox_attrs = 5 + num_classes
    num_anchors = len(anchors)

    prediction = prediction.view(batch_size, bbox_attrs*num_anchors, grid_size*grid_size)
    prediction = prediction.transpose(1,2).contiguous()
    prediction = prediction.view(batch_size, grid_size*grid_size*num_anchors, bbox_attrs)
    anchors = [(a[0]/stride, a[1]/stride) for a in anchors]

    # Sigmoid the centre_X, centre_Y and object confidence
    prediction[:,:,0] = torch.sigmoid(prediction[:,:,0])
    prediction[:,:,1] = torch.sigmoid(prediction[:,:,1])
    prediction[:,:,4] = torch.sigmoid(prediction[:,:,4])

    #Add the center offsets
    grid = np.arange(grid_size)
    a,b = np.meshgrid(grid, grid)

    x_offset = torch.FloatTensor(a).view(-1,1)
    y_offset = torch.FloatTensor(b).view(-1,1)

    if CUDA:
        x_offset = x_offset.cuda()
        y_offset = y_offset.cuda()

    x_y_offset = torch.cat((x_offset, y_offset), 1).repeat(1,num_anchors).view(-1,2).unsqueeze(0)

    prediction[:,:,:2] += x_y_offset

    #log space transform height and the width
    anchors = torch.FloatTensor(anchors)

    if CUDA:
        anchors = anchors.cuda()

    anchors = anchors.repeat(grid_size*grid_size, 1).unsqueeze(0)
    prediction[:,:,2:4] = torch.exp(prediction[:,:,2:4])*anchors

    prediction[:,:,5: 5 + num_classes] = torch.sigmoid((prediction[:,:, 5 : 5 + num_classes]))

    prediction[:,:,:4] *= stride

    return prediction
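
One way to compare the two computations without getting lost between torch and tf is to decode a single prediction by hand in both orders. The sketch below is my own, in plain Python. The function names and the sample values (raw outputs, cell index, anchor, stride) are hypothetical and chosen only for illustration. It covers the decoding step alone, not the preprocessing, the weights, or the non-maximum suppression.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_tf_style(t, cell, anchor_px, stride):
    """Order used in yolo_layer: scale the center by the stride,
    multiply exp(t_wh) by the anchor given in pixels."""
    tx, ty, tw, th = t
    cx, cy = cell
    bx = (sigmoid(tx) + cx) * stride
    by = (sigmoid(ty) + cy) * stride
    bw = math.exp(tw) * anchor_px[0]
    bh = math.exp(th) * anchor_px[1]
    return bx, by, bw, bh


def decode_torch_style(t, cell, anchor_px, stride):
    """Order used in predict_transform: divide the anchor by the stride,
    work in grid units, then multiply all four values by the stride."""
    tx, ty, tw, th = t
    cx, cy = cell
    aw, ah = anchor_px[0] / stride, anchor_px[1] / stride
    box = (sigmoid(tx) + cx, sigmoid(ty) + cy,
           math.exp(tw) * aw, math.exp(th) * ah)
    return tuple(v * stride for v in box)


# Hypothetical sample values, for illustration only.
t = (0.2, -0.4, 0.1, 0.3)   # raw t_x, t_y, t_w, t_h
cell = (6, 4)               # grid cell (column, row)
anchor = (116, 90)          # anchor width and height in pixels
stride = 32                 # input size divided by grid size

a = decode_tf_style(t, cell, anchor, stride)
b = decode_torch_style(t, cell, anchor, stride)
max_diff = max(abs(p - q) for p, q in zip(a, b))

print("tf-style:   ", a)
print("torch-style:", b)
print("max abs difference:", max_diff)
```

If the two results agree up to floating-point rounding, the looseness of the boxes would have to come from somewhere outside this step, which this sketch does not examine.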

