
I'm trying to warp a frame from view1 to view2 using the ground-truth depth map, pose information, and camera matrix. I've been able to remove most of the for-loops and vectorize the code, except for one loop. When warping, multiple pixels in view1 may map to a single location in view2 due to occlusions; in that case, I need to pick the pixel with the lowest depth value (the foreground object). I'm not able to vectorize this part of the code. Any help vectorizing this for-loop is appreciated.

Context:

I'm trying to warp an image into a new view, given ground-truth pose, depth, and camera matrix. After computing the warped locations, I'm rounding them off. Any suggestions for implementing inverse bilinear interpolation are also welcome. My images are full-HD, so warping the frames to the new view takes a lot of time. Once vectorized, I'm planning to port the code to TensorFlow or PyTorch and run it on a GPU. Any other suggestions to speed up warping, or existing implementations, are also welcome.
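For the inverse bilinear interpolation mentioned above, one possibility (a sketch under my own assumptions, not code from the question; `splat_bilinear` and its argument names are hypothetical) is to splat each source pixel onto the four integer neighbours of its warped location, weighted by proximity. `numpy.add.at` accumulates correctly even when target indices repeat:

```python
import numpy

def splat_bilinear(frame1, trans_pos, height, width):
    """Inverse bilinear interpolation (splatting): distribute each source
    pixel's colour over the 4 integer neighbours of its warped location,
    weighted by proximity, then normalize by the accumulated weights."""
    x = trans_pos[..., 0].ravel()
    y = trans_pos[..., 1].ravel()
    x0 = numpy.floor(x).astype(int)
    y0 = numpy.floor(y).astype(int)
    colours = frame1.reshape(-1, 3).astype(float)
    acc = numpy.zeros((height, width, 3))
    wsum = numpy.zeros((height, width))
    for dx in (0, 1):
        for dy in (0, 1):
            xi, yi = x0 + dx, y0 + dy
            # Bilinear weight: proximity of the warped point to this neighbour
            w = (1 - numpy.abs(x - xi)) * (1 - numpy.abs(y - yi))
            valid = (xi >= 0) & (xi < width) & (yi >= 0) & (yi < height)
            # add.at handles repeated target indices correctly (unbuffered)
            numpy.add.at(acc, (yi[valid], xi[valid]), w[valid, None] * colours[valid])
            numpy.add.at(wsum, (yi[valid], xi[valid]), w[valid])
    mask = wsum > 0
    acc[mask] /= wsum[mask, None]
    return acc, mask
```

Normalizing by the accumulated weights turns the scatter into a weighted average; occlusion handling would still have to be combined with this, e.g. by splatting only the pixels that survive depth-based collision resolution.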

Code:

import numpy
from tqdm import tqdm


def warp_frame_04(frame1: numpy.ndarray, depth: numpy.ndarray, intrinsic: numpy.ndarray, transformation1: numpy.ndarray,
                  transformation2: numpy.ndarray, convert_to_uint: bool = True, verbose_log: bool = True):
    """
    Vectorized Forward warping. Nearest Neighbor.
    Offset requirement of warp_frame_03() overcome.
    mask: 1 if pixel found, 0 if no pixel found
    Drawback: Nearest neighbor, collision resolving not vectorized
    """
    height, width, _ = frame1.shape
    assert depth.shape == (height, width)
    transformation = numpy.matmul(transformation2, numpy.linalg.inv(transformation1))

    y1d = numpy.array(range(height))
    x1d = numpy.array(range(width))
    x2d, y2d = numpy.meshgrid(x1d, y1d)
    ones_2d = numpy.ones(shape=(height, width))
    ones_4d = ones_2d[:, :, None, None]
    pos_vectors_homo = numpy.stack([x2d, y2d, ones_2d], axis=2)[:, :, :, None]

    intrinsic_inv = numpy.linalg.inv(intrinsic)
    intrinsic_4d = intrinsic[None, None]
    intrinsic_inv_4d = intrinsic_inv[None, None]
    depth_4d = depth[:, :, None, None]
    trans_4d = transformation[None, None]

    unnormalized_pos = numpy.matmul(intrinsic_inv_4d, pos_vectors_homo)
    world_points = depth_4d * unnormalized_pos
    world_points_homo = numpy.concatenate([world_points, ones_4d], axis=2)
    trans_world_homo = numpy.matmul(trans_4d, world_points_homo)
    trans_world = trans_world_homo[:, :, :3]
    trans_norm_points = numpy.matmul(intrinsic_4d, trans_world)
    trans_pos = trans_norm_points[:, :, :2, 0] / trans_norm_points[:, :, 2:3, 0]
    trans_pos_int = numpy.round(trans_pos).astype('int')

    # Solve occlusions
    a = trans_pos_int.reshape(-1, 2)
    d = depth.ravel()
    b = numpy.unique(a, axis=0, return_index=True, return_counts=True)
    collision_indices = b[1][b[2] >= 2]  # Unique indices which are involved in collision
    for c1 in tqdm(collision_indices, disable=not verbose_log):
        cl = a[c1].copy()  # Collision Location
        ci = numpy.where((a[:, 0] == cl[0]) & (a[:, 1] == cl[1]))[0]  # Colliding Indices: Indices colliding for cl
        cci = ci[numpy.argmin(d[ci])]  # Closest Collision Index: Index of the nearest point among ci
        a[ci] = [-1, -1]
        a[cci] = cl
    trans_pos_solved = a.reshape(height, width, 2)

    # Offset both axes by 1 and set any out of frame motion to edge. Then crop 1-pixel thick edge
    trans_pos_offset = trans_pos_solved + 1
    trans_pos_offset[:, :, 0] = numpy.clip(trans_pos_offset[:, :, 0], a_min=0, a_max=width + 1)
    trans_pos_offset[:, :, 1] = numpy.clip(trans_pos_offset[:, :, 1], a_min=0, a_max=height + 1)

    warped_image = numpy.ones(shape=(height + 2, width + 2, 3)) * numpy.nan
    warped_image[trans_pos_offset[:, :, 1], trans_pos_offset[:, :, 0]] = frame1
    cropped_warped_image = warped_image[1:-1, 1:-1]
    mask = numpy.isfinite(cropped_warped_image)
    cropped_warped_image[~mask] = 0
    if convert_to_uint:
        final_warped_image = cropped_warped_image.astype('uint8')
    else:
        final_warped_image = cropped_warped_image
    mask = mask[:, :, 0]
    return final_warped_image, mask

Code Explanation

  • I'm using the equations [1, 2] to get the pixel locations in view2.
  • Once I have the pixel locations, I need to figure out if there are any occlusions and if so, I have to pick the foreground pixels.
  • `b = numpy.unique(a, axis=0, return_index=True, return_counts=True)` gives me unique locations.
  • If multiple pixels from view1 map to a single pixel in view2 (collision), `return_counts` will give a value greater than 1.
  • `collision_indices = b[1][b[2] >= 2]` gives indices which are involved in collision. Note that this gives only one index per collision.
  • For each of such collision points, `ci = numpy.where((a[:, 0] == cl[0]) & (a[:, 1] == cl[1]))[0]` provides indices of all pixels from view1 which map to the same point in view2.
  • `cci = ci[numpy.argmin(d[ci])]` gives the pixel index with lowest depth value.
  • `a[ci] = [-1, -1]` and `a[cci] = cl` maps all other background pixels to location (-1,-1) which is out of frame and hence will be ignored.

[1] https://i.stack.imgur.com/s1D9t.png
[2] https://dsp.stackexchange.com/q/69890/32876
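Not part of the post, but one common way to vectorize the collision loop described above (a sketch; the function name is hypothetical): sort the flattened source pixels by depth in descending order and scatter them with a single fancy-indexed assignment. When target indices repeat, NumPy's assignment keeps the last value written, so the nearest (lowest-depth) pixel survives:

```python
import numpy

def resolve_collisions_vectorized(trans_pos_int, depth, frame1, height, width):
    # Sort source pixels by depth, farthest first, so that when several
    # pixels scatter to the same target location, the nearest is written last.
    a = trans_pos_int.reshape(-1, 2)
    order = numpy.argsort(-depth.ravel(), kind='stable')
    xs = numpy.clip(a[order, 0] + 1, 0, width + 1)   # same 1-pixel offset
    ys = numpy.clip(a[order, 1] + 1, 0, height + 1)  # trick as warp_frame_04
    warped = numpy.full((height + 2, width + 2, 3), numpy.nan)
    # Duplicate indices: NumPy's fancy-index assignment keeps the last write
    warped[ys, xs] = frame1.reshape(-1, 3)[order]
    return warped[1:-1, 1:-1]
```

The last-write-wins behaviour of duplicate indices in NumPy assignment is documented but easy to overlook; a more explicit alternative is `numpy.minimum.at` on a separate depth buffer, and on GPU the same idea maps to `scatter`-style reductions in PyTorch.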

Nagabhushan S N
