I'm trying to warp a frame from view 1 to view 2 using a ground-truth depth map, pose information, and the camera matrix. I've managed to remove and vectorize most of the for loops, except for one. When warping, due to occlusion, multiple pixels in view 1 may map to a single location in view 2. In that case, I need to pick the pixel with the lowest depth value (the foreground object). I was unable to vectorize this part of the code. Any help vectorizing this for loop would be appreciated.
Context:
I'm trying to warp an image into a new view, given ground-truth pose, depth, and the camera matrix. After computing the warped locations, I round them off to the nearest integer pixel. Any suggestions for implementing inverse bilinear interpolation are also welcome. My images are full-HD, so warping a frame to the new view takes a lot of time. Once the code is vectorized, I plan to port it to TensorFlow or PyTorch and run it on a GPU. Any other suggestions to speed up warping, or pointers to existing implementations, are also welcome.
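One idea for the inverse bilinear interpolation is bilinear splatting: each source pixel spreads its color over the four surrounding integer locations in view 2 with bilinear weights, and the accumulated colors are divided by the accumulated weights at the end. The helper below is only a rough, unverified sketch (the name `splat_bilinear` is just illustrative); it assumes the float coordinates `trans_pos` computed in the code below, and it ignores occlusion, so it would still need a depth test or depth-based weighting on top.

import numpy

def splat_bilinear(frame1, trans_pos, height, width):
    """Forward-warp frame1 by splatting each pixel onto its 4 neighbors in view 2."""
    x = trans_pos[:, :, 0].ravel()
    y = trans_pos[:, :, 1].ravel()
    colors = frame1.reshape(-1, 3).astype(numpy.float64)
    x0 = numpy.floor(x).astype(int)
    y0 = numpy.floor(y).astype(int)
    fx, fy = x - x0, y - y0  # fractional offsets -> bilinear weights
    acc = numpy.zeros((height, width, 3))
    weights = numpy.zeros((height, width))
    for dx, dy, w in [(0, 0, (1 - fx) * (1 - fy)), (1, 0, fx * (1 - fy)),
                      (0, 1, (1 - fx) * fy), (1, 1, fx * fy)]:
        xi, yi = x0 + dx, y0 + dy
        valid = (xi >= 0) & (xi < width) & (yi >= 0) & (yi < height)
        # numpy.add.at accumulates correctly even when several pixels hit the same location
        numpy.add.at(acc, (yi[valid], xi[valid]), colors[valid] * w[valid, None])
        numpy.add.at(weights, (yi[valid], xi[valid]), w[valid])
    mask = weights > 0
    acc[mask] /= weights[mask][:, None]
    return acc, mask

`numpy.add.at` is used here instead of plain fancy-index assignment because the latter silently drops contributions when several source pixels land on the same target pixel.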
Code:
import numpy
from tqdm import tqdm


def warp_frame_04(frame1: numpy.ndarray, depth: numpy.ndarray, intrinsic: numpy.ndarray, transformation1: numpy.ndarray,
                  transformation2: numpy.ndarray, convert_to_uint: bool = True, verbose_log: bool = True):
    """
    Vectorized Forward warping. Nearest Neighbor.
    Offset requirement of warp_frame_03() overcome.
    mask: 1 if pixel found, 0 if no pixel found
    Drawback: Nearest neighbor, collision resolving not vectorized
    """
    height, width, _ = frame1.shape
    assert depth.shape == (height, width)
    transformation = numpy.matmul(transformation2, numpy.linalg.inv(transformation1))

    # Back-project every pixel of view 1 to 3D using the depth map
    y1d = numpy.array(range(height))
    x1d = numpy.array(range(width))
    x2d, y2d = numpy.meshgrid(x1d, y1d)
    ones_2d = numpy.ones(shape=(height, width))
    ones_4d = ones_2d[:, :, None, None]
    pos_vectors_homo = numpy.stack([x2d, y2d, ones_2d], axis=2)[:, :, :, None]

    intrinsic_inv = numpy.linalg.inv(intrinsic)
    intrinsic_4d = intrinsic[None, None]
    intrinsic_inv_4d = intrinsic_inv[None, None]
    depth_4d = depth[:, :, None, None]
    trans_4d = transformation[None, None]

    unnormalized_pos = numpy.matmul(intrinsic_inv_4d, pos_vectors_homo)
    world_points = depth_4d * unnormalized_pos
    world_points_homo = numpy.concatenate([world_points, ones_4d], axis=2)

    # Transform the points into view 2 and project them back to pixel coordinates
    trans_world_homo = numpy.matmul(trans_4d, world_points_homo)
    trans_world = trans_world_homo[:, :, :3]
    trans_norm_points = numpy.matmul(intrinsic_4d, trans_world)
    trans_pos = trans_norm_points[:, :, :2, 0] / trans_norm_points[:, :, 2:3, 0]
    trans_pos_int = numpy.round(trans_pos).astype('int')

    # Solve occlusions
    a = trans_pos_int.reshape(-1, 2)
    d = depth.ravel()
    b = numpy.unique(a, axis=0, return_index=True, return_counts=True)
    collision_indices = b[1][b[2] >= 2]  # Unique indices which are involved in collision
    for c1 in tqdm(collision_indices, disable=not verbose_log):
        cl = a[c1].copy()  # Collision Location
        ci = numpy.where((a[:, 0] == cl[0]) & (a[:, 1] == cl[1]))[0]  # Colliding Indices: Indices colliding for cl
        cci = ci[numpy.argmin(d[ci])]  # Closest Collision Index: Index of the nearest point among ci
        a[ci] = [-1, -1]
        a[cci] = cl
    trans_pos_solved = a.reshape(height, width, 2)

    # Offset both axes by 1 and set any out of frame motion to edge. Then crop 1-pixel thick edge
    trans_pos_offset = trans_pos_solved + 1
    trans_pos_offset[:, :, 0] = numpy.clip(trans_pos_offset[:, :, 0], a_min=0, a_max=width + 1)
    trans_pos_offset[:, :, 1] = numpy.clip(trans_pos_offset[:, :, 1], a_min=0, a_max=height + 1)

    warped_image = numpy.ones(shape=(height + 2, width + 2, 3)) * numpy.nan
    warped_image[trans_pos_offset[:, :, 1], trans_pos_offset[:, :, 0]] = frame1
    cropped_warped_image = warped_image[1:-1, 1:-1]

    mask = numpy.isfinite(cropped_warped_image)
    cropped_warped_image[~mask] = 0
    if convert_to_uint:
        final_warped_image = cropped_warped_image.astype('uint8')
    else:
        final_warped_image = cropped_warped_image
    mask = mask[:, :, 0]
    return final_warped_image, mask
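For reference, a hypothetical call with synthetic full-HD inputs could look like this (the intrinsic values and identity poses are placeholders only, not my actual data):

import numpy

height, width = 1080, 1920
frame1 = numpy.zeros((height, width, 3), dtype=numpy.uint8)  # placeholder image
depth = numpy.ones((height, width))                          # placeholder depth map
intrinsic = numpy.array([[1000.0, 0.0, width / 2],
                         [0.0, 1000.0, height / 2],
                         [0.0, 0.0, 1.0]])                    # assumed 3x3 camera matrix
transformation1 = numpy.eye(4)                                # placeholder 4x4 pose matrices
transformation2 = numpy.eye(4)
warped, mask = warp_frame_04(frame1, depth, intrinsic, transformation1, transformation2)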
Code explanation:
- I use equations [1, 2] to get the pixel locations in view 2.
- Once the pixel locations are obtained, I need to determine whether there are any occlusions, and if there are, I have to pick the foreground pixel.
- `b = numpy.unique(a, axis=0, return_index=True, return_counts=True)` gives me the unique locations.
- If multiple pixels in view 1 map to a single pixel in view 2 (a collision), `return_counts` gives a value greater than 1 for that location.
- `collision_indices = b[1][b[2] >= 2]` gives the indices involved in collisions. Note that this provides only one index per collision.
- For each such collision location, `ci = numpy.where((a[:, 0] == cl[0]) & (a[:, 1] == cl[1]))[0]` gives the indices of all pixels in view 1 that map to the same point in view 2.
- `cci = ci[numpy.argmin(d[ci])]` gives the index of the pixel with the lowest depth value.
- `a[ci] = [-1, -1]` and `a[cci] = cl` map all the other (background) pixels to the out-of-frame location (-1, -1), so they are ignored. A vectorized alternative to this loop is sketched right after this list.
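One direction that might replace the per-collision loop entirely (sketch only, not verified; the helper name `resolve_collisions_vectorized` is just illustrative) is to sort all source pixels by target location and, within each location, by depth, keep the first (nearest) pixel of every location, and push the rest to (-1, -1), exactly as the loop does. It assumes the same `trans_pos_int` and `depth` arrays as in `warp_frame_04()`.

import numpy

def resolve_collisions_vectorized(trans_pos_int, depth):
    """Keep only the nearest source pixel per target location; send the rest to (-1, -1)."""
    height, width = depth.shape
    a = trans_pos_int.reshape(-1, 2).copy()
    d = depth.ravel()
    # Sort primarily by target location (y, then x) and, within each location,
    # by depth ascending. numpy.lexsort uses the *last* key as the primary key.
    order = numpy.lexsort((d, a[:, 0], a[:, 1]))
    a_sorted = a[order]
    # The first entry of each run of identical locations is the nearest pixel.
    keep_first = numpy.ones(len(a_sorted), dtype=bool)
    keep_first[1:] = numpy.any(a_sorted[1:] != a_sorted[:-1], axis=1)
    # Every other entry of the run lost the depth test; move it off-frame.
    a[order[~keep_first]] = [-1, -1]
    return a.reshape(height, width, 2)

Since this reduces to one `numpy.lexsort` plus boolean indexing, the same idea should carry over to a GPU port, where a scatter-style minimum reduction over flattened target indices would play the same role.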
[1] https://i.stack.imgur.com/s1D9t.png
[2] https://dsp.stackexchange.com/q/69890/32876