How to rotate an image using only the mxnet ndarray and contrib.nd.array APIs?

Question


I'm trying to rotate an image using only the base mxnet array API. I am also trying to avoid code that breaks the mxnet pipeline.

Perhaps this is pedantic, but it's also an exercise in learning the API. With that said, here is my question in detail.

(NOTE: I originally did this without the batch axis, but I re-added it to make the code more similar to the contrib API example.)

Given a simple 2x2 pixel "image" represented like so:

import mxnet as mx
from mxnet import nd

# a batch of one image
in_data = nd.arange(4).reshape((1,2,2))
print('in_data is ', in_data)

>output
in_data is  
[[[0. 1.]
  [2. 3.]]]
<NDArray 1x2x2 @cpu(0)>

Create an array for the output, with nines as every pixel value.

# set up an output array of all nines, so I can see overwrites with new values
out_data = nd.ones(4).reshape((2,2))*9
# out_data also needs to be in batch form
out_data = nd.expand_dims(out_data, axis=0)
print('out_data is ', out_data)

out_data is  
[[[9. 9.]
  [9. 9.]]]
<NDArray 1x2x2 @cpu(0)>

Set up a rotation matrix

# set up a rotation matrix
# (rotate_90 is not defined in this excerpt; its values are printed below)
rotate_data = rotate_90
print('rotate_90 ', rotate_data)


rotate_90  
[[ 6.123234e-17 -1.000000e+00]
 [ 1.000000e+00  6.123234e-17]]
<NDArray 2x2 @cpu(0)>
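
The definition of rotate_90 is not shown above. The printed values suggest a standard 2-D rotation matrix, since 6.123234e-17 is what cos(pi/2) evaluates to in floating point. A minimal sketch of how such a matrix could be built, using NumPy as a stand-in for the NDArray API; the helper name rotation_matrix is hypothetical and is not taken from the notebook:

```python
import numpy as np

def rotation_matrix(degrees):
    """Hypothetical helper: the 2x2 matrix [[cos, -sin], [sin, cos]]."""
    theta = np.deg2rad(degrees)
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

rotate_90 = rotation_matrix(90)
print(rotate_90)
```

If the matrix was built this way, wrapping the result in nd.array(...) would presumably give an NDArray with the same values.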

Build an iterator and rotate the image indexes

# NDArrayIter(data, label=None, batch_size=1, shuffle=False, 
#            last_batch_handle='pad', data_name='data', 
#            label_name='softmax_label')
#
# Ignore the label parameter.
dataiter = mx.io.NDArrayIter(in_data, batch_size=1, shuffle=False, last_batch_handle='discard')
#batch_index = [0]
for batch in dataiter:
    print('loop entry - a single batch - a single image in batch.data[0] from what is in in_data')
    #print ('batch.data[0] = ', batch.data[0].asnumpy())
    #print ('batch.data[0].shape = ', batch.data[0].shape)
    # Does this copy or get an alias to the input image?
    input_img = batch.data[0]
    #print ('input_img = ', input_img.asnumpy())
    print ('input_img.shape = ', input_img.shape)
    
    # this will print the axis including the batch axis
    # print('indices are ', mx.nd.contrib.index_array(entire_img) )
    # this will print the x,y axis and ignore the batch
    input_img_indexes = mx.nd.contrib.index_array(input_img, axes=(1, 2))
    #print('indexes are: ', input_img_indexes)
    #print ('input_img_indexes.shape = ', input_img_indexes.shape)
    
    
    # copy data from input to output
    # Note, this seems to also use the batch axis?  Its shape is (1,2,2)
    #out_data = input_img
    
    # Try to assign input data to output data based upon indices
    orig_indexes = mx.nd.reshape(input_img_indexes, shape=(4,2))
    #print('new_indexes ', new_indexes)
    orig_indexes = orig_indexes.astype("float32")
    
    # I've seen one variant where matrix-matrix multiply 
    # is done with .T.  why?
    new_indexes = nd.dot(orig_indexes, rotate_data) 
    #print('result = ', result)
    new_indexes = new_indexes.astype('int64')
    print('new_indexes = ', new_indexes)
    new_indexes = new_indexes + nd.array([0, 1]).astype('int64')
    print('new_indexes after shift to positive', new_indexes)
    


loop entry - a single batch - a single image in batch.data[0] from what is in in_data
input_img.shape =  (1, 2, 2)
new_indexes =  
[[ 0  0]
 [ 1  0]
 [ 0 -1]
 [ 1 -1]]
<NDArray 4x2 @cpu(0)>
new_indexes after shift to positive 
[[0 1]
 [1 1]
 [0 0]
 [1 0]]
<NDArray 4x2 @cpu(0)>
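
On the question in the code comment about .T: with one (row, col) pair per row of the index array, nd.dot(indexes, R) treats each pair as a row vector, which is the same as applying the transpose of R to a column vector. The transpose of a rotation matrix is the rotation by the opposite angle, so the two variants rotate in opposite directions. A small NumPy sketch of this (NumPy standing in for the NDArray API, with the matrix values rounded for readability):

```python
import numpy as np

R = np.array([[0.0, -1.0],
              [1.0,  0.0]])          # 90-degree rotation matrix, rounded
points = np.array([[0.0, 0.0],
                   [0.0, 1.0],
                   [1.0, 0.0],
                   [1.0, 1.0]])      # one (row, col) pair per row

row_form = points @ R                # what the loop above computes
col_form = points @ R.T              # identical to (R @ points.T).T

print(row_form)
print(col_form)
```

Here row_form reproduces the new_indexes printed above before the shift, while col_form is the same set of points rotated the other way.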

Dump the ins and outs

print('out_data is ', out_data)
print('in_data is ', in_data)
print('rotate_data is ', rotate_data)
print('orig_indexes is ', orig_indexes)
print('new_indexes is ', new_indexes)
print('new_indexes.dtype is ', new_indexes.dtype)

out_data is  
[[[9. 9.]
  [9. 9.]]]
<NDArray 1x2x2 @cpu(0)>
in_data is  
[[[0. 1.]
  [2. 3.]]]
<NDArray 1x2x2 @cpu(0)>
rotate_data is  
[[ 6.123234e-17 -1.000000e+00]
 [ 1.000000e+00  6.123234e-17]]
<NDArray 2x2 @cpu(0)>
orig_indexes is  
[[0. 0.]
 [0. 1.]
 [1. 0.]
 [1. 1.]]
<NDArray 4x2 @cpu(0)>
new_indexes is  
[[0 1]
 [1 1]
 [0 0]
 [1 0]]
<NDArray 4x2 @cpu(0)>
new_indexes.dtype is  <class 'numpy.int64'>

And now the reason I am asking for your help

At this point, I muddled about a bunch, then went back to pen and paper. Here is a description of my problem as it stands now and why I am here.

Back to the code

Rewriting to match my notes

To avoid the batch dimension and simplify the problem

The original image

orig_image = in_data[0]
print('orig_image ', orig_image)

orig_image  
[[0. 1.]
 [2. 3.]]
<NDArray 2x2 @cpu(0)>

the rotated image memory for output

rotated_image=out_data[0]
print('rotated_image', rotated_image)

rotated_image 
[[9. 9.]
 [9. 9.]]
<NDArray 2x2 @cpu(0)>

The original indexes/indices

print('orig_indexes', orig_indexes)

orig_indexes 
[[0. 0.]
 [0. 1.]
 [1. 0.]
 [1. 1.]]
<NDArray 4x2 @cpu(0)>

The new indices

print('new_indexes', new_indexes)

new_indexes 
[[0 1]
 [1 1]
 [0 0]
 [1 0]]
<NDArray 4x2 @cpu(0)>

do pixel 0,0

orig_indexes[0,0]  # row 0, column 0 = the first row index is in the first row column 0

[0.]
<NDArray 1 @cpu(0)>

[1]
<NDArray 1 @cpu(0)>

orig_image[new_indexes[0,0], new_indexes[0,1]]  # the new (0,0) pixel should be 1, pulled from (0,1)

[1.]
<NDArray 1 @cpu(0)>

Did a 1 move to pixel 0,0?

rotated_image[orig_indexes[0,0], orig_indexes[0,1]] = orig_image[new_indexes[0,0], new_indexes[0,1]]
rotated_image

[[1. 9.]
 [9. 9.]]
<NDArray 2x2 @cpu(0)>

do pixel 0,1

rotated_image[orig_indexes[1,0], orig_indexes[1,1]] = orig_image[new_indexes[1,0], new_indexes[1,1]]
rotated_image

[[1. 3.]
 [9. 9.]]
<NDArray 2x2 @cpu(0)>

Do pixel 1,0

rotated_image[orig_indexes[2,0], orig_indexes[2,1]] = orig_image[new_indexes[2,0], new_indexes[2,1]]
rotated_image

[[1. 3.]
 [0. 9.]]
<NDArray 2x2 @cpu(0)>

do pixel 1,1

rotated_image[orig_indexes[3,0], orig_indexes[3,1]] = orig_image[new_indexes[3,0], new_indexes[3,1]]
rotated_image

[[1. 3.]
 [0. 2.]]
<NDArray 2x2 @cpu(0)>

Is that rotated 90?


print('orig image', orig_image)
print('rotated_image', rotated_image)

orig image 
[[0. 1.]
 [2. 3.]]
<NDArray 2x2 @cpu(0)>
rotated_image 
[[1. 3.]
 [0. 2.]]
<NDArray 2x2 @cpu(0)>

Looks like it to me. I call that success, but not DFD success
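
The four pixel-by-pixel assignments above follow one pattern, so they can be written as a loop over the index pairs. A minimal NumPy sketch (NumPy standing in for the NDArray API), with np.rot90, which rotates counter-clockwise by default, used only as an independent check of the result:

```python
import numpy as np

orig_image = np.arange(4.0).reshape(2, 2)
rotated_image = np.full((2, 2), 9.0)      # all nines, as in the question
orig_indexes = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
new_indexes = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])

# For each destination pixel (r, c), pull the value from the rotated source position.
for (r, c), (src_r, src_c) in zip(orig_indexes, new_indexes):
    rotated_image[r, c] = orig_image[src_r, src_c]

print(rotated_image)
print(np.array_equal(rotated_image, np.rot90(orig_image)))
```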

How to do this implicitly?

orig_image[new_indexes] # nope

[[[0. 1.]
  [2. 3.]]

 [[2. 3.]
  [2. 3.]]

 [[0. 1.]
  [0. 1.]]

 [[2. 3.]
  [0. 1.]]]
<NDArray 4x2x2 @cpu(0)>
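
The 4x2x2 result comes from indexing with a single integer array: each entry of new_indexes selects a whole row of orig_image along the first axis. Passing one index array per axis picks individual pixels instead. A minimal NumPy sketch of the difference (NumPy standing in for the NDArray API):

```python
import numpy as np

orig_image = np.arange(4.0).reshape(2, 2)
new_indexes = np.array([[0, 1], [1, 1], [0, 0], [1, 0]])

# One index array: every entry picks a full row, so the result has shape (4, 2, 2).
rows_only = orig_image[new_indexes]

# One index array per axis: entry i picks the pixel at (row_i, col_i), shape (4,).
pixels = orig_image[new_indexes[:, 0], new_indexes[:, 1]]
rotated = pixels.reshape(2, 2)

print(rows_only.shape)
print(rotated)
```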

Hmm, try this contrib api again

# for reference
x = mx.nd.zeros((5,3))
t = mx.nd.array([[1,2,3],[4,5,6],[7,8,9]])
index = mx.nd.array([0,4,2])  # <--- this is interesting. It has a zero length axes?????
# x is a template for output
# index is the indexes?
# t is "new_tensor?", its the source tensor?
mx.nd.contrib.index_copy(x, index, t)

[[1. 2. 3.]
 [0. 0. 0.]
 [7. 8. 9.]
 [0. 0. 0.]
 [4. 5. 6.]]
<NDArray 5x3 @cpu(0)>

That looks promising. However, as I was copying my code into this question, I realized the problem. The array the API calls new_tensor is really the source array. The index parameter only specifies which rows of the output receive the rows of the source array, which are copied as-is. The array the API calls old_tensor is not really the output array either; the result is written to memory with the same shape as old_tensor. ¯\_(ツ)_/¯

hmm.

FWIW, here is how the nd.contrib.index_copy() routine works

I updated this section with more info on how this API works, in the hope that it can do what I want.

template_tensor = nd.zeros([5,3])
template_tensor

[[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
<NDArray 5x3 @cpu(0)>

source_indices = nd.array([0,2,4])
source_indices

[0. 2. 4.]
<NDArray 3 @cpu(0)>

source_tensor = nd.array([[0,1,2],[3,4,5],[6,7,8]])
source_tensor

[[0. 1. 2.]
 [3. 4. 5.]
 [6. 7. 8.]]
<NDArray 3x3 @cpu(0)>

This API uses source_indices to specify which rows of the output receive the rows of the source tensor: row i of source_tensor is copied to row source_indices[i] of the output. Rows not listed in source_indices keep the values from the template tensor.

nd.contrib.index_copy(template_tensor, source_indices, source_tensor)
[[0. 1. 2.]
 [0. 0. 0.]
 [3. 4. 5.]
 [0. 0. 0.]
 [6. 7. 8.]]
<NDArray 5x3 @cpu(0)>
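
In NumPy terms, the behaviour shown above corresponds to copying the template and then assigning rows by index. This is a sketch of the semantics as observed in the output, not of how MXNet implements the operator:

```python
import numpy as np

template_tensor = np.zeros((5, 3))
source_indices = np.array([0, 2, 4])
source_tensor = np.arange(9.0).reshape(3, 3)

# Copy so that the template is left untouched in this sketch.
result = template_tensor.copy()
# Row i of the source goes to row source_indices[i] of the result.
result[source_indices] = source_tensor
print(result)
```

Seen this way, the operator moves whole rows only, which is why it does not directly express a per-pixel rotation.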

Doing this again using the named-parameter form

out_tensor = nd.ones([5,3])*9 # testing the out parameter with all nines

nd.contrib.index_copy(old_tensor=template_tensor,  # old_tensor is confusing
           index_vector=source_indices,            # index_vector is less confusing, but still not clear
           new_tensor=source_tensor,               # new_tensor is confusing. It's neither new nor modified.
           out=out_tensor,                         # trying the out parameter, since I'm not sure what it does ...
           name=None,)

results in

[[0. 1. 2.]
 [0. 0. 0.]
 [3. 4. 5.]
 [0. 0. 0.]
 [6. 7. 8.]]
<NDArray 5x3 @cpu(0)>

Hmm, this shows that out_tensor receives the output.

So what happens is that all the values from template_tensor are copied to the output, except for the rows specified by index_vector. For those rows, the values in new_tensor are copied to the output, out_tensor.

out_tensor
[[0. 1. 2.]
 [0. 0. 0.]
 [3. 4. 5.]
 [0. 0. 0.]
 [6. 7. 8.]]
<NDArray 5x3 @cpu(0)>

If it helps here is the original code https://github.com/rtp-aws/devpost_aws_disaster_response/blob/main/python/gray_rotate_two.ipynb — netskink, Feb 22 '22 at 21:08

score 0 · Answer 1 · answered Feb 24 '22 at 02:13

This works for what I've tested

I would still welcome comments on how to improve the answer, though; I'm certain it can be done better. It also has an error with a 270-degree rotation. I've tested it with 2x2 and 3x3 images and rotations of 0, 90, and 180 degrees.

def my_rotate(input_image_batch, rotate_matrix, verbose=False):
    # NDArrayIter(data, label=None, batch_size=1, shuffle=False, 
    #            last_batch_handle='pad', data_name='data', 
    #            label_name='softmax_label')
    #
    # Ignore the label parameter.
    dataiter = mx.io.NDArrayIter(input_image_batch, batch_size=1, shuffle=False, last_batch_handle='discard')
    for batch in dataiter:

        my_print(verbose, 'loop entry - a single batch - a single image in batch.data[0] from what is in in_data')
        # Does this copy or get an alias to the input image?
        a_img_batch = batch.data[0]
        my_print(verbose, 'a_img_batch = ', a_img_batch)
        my_print(verbose, 'a_img_batch.shape = ', a_img_batch.shape)

        a_img_indexes = mx.nd.contrib.index_array(a_img_batch, axes=(1, 2))
        my_print(verbose, 'a_img_indexes  ', a_img_indexes)
        my_print(verbose, 'a_img_indexes.shape ', a_img_indexes.shape)



        # Try to assign input data to output data based upon indices
        #
        # Need to reshape so that rows=size of image sans batch
        # 2x2-> 4,2
        # 3x3-> 9,2
        #
        num_rows = a_img_batch[0].size
        my_print(True,'num_rows = ', num_rows)
        orig_indexes = mx.nd.reshape(a_img_indexes, shape=(num_rows,2))
        my_print(verbose, 'orig_indexes ', orig_indexes)
        orig_indexes = orig_indexes.astype("float32")

        # do the rotate
        new_indexes = nd.dot(orig_indexes, rotate_matrix) 
        my_print(verbose, 'new_indexes = ', new_indexes)
        new_indexes = new_indexes.astype('int64')
        my_print(verbose, 'new_indexes = ', new_indexes)
        #
        # shift to lower right quadrant. shift so that index axes is 0,0 in top left
        #
        # find the min row value
        min_row = new_indexes.min(axis=0)[0]
        my_print(verbose, 'min_row = ', min_row)
        adj_row = nd.abs(min_row).asscalar()
        my_print(verbose, 'adj_row = ', adj_row)
        
        # find the min col value
        min_col = new_indexes.min(axis=0)[1]
        my_print(verbose, 'min_col = ', min_col)
        adj_col = nd.abs(min_col).asscalar()
        my_print(verbose, 'adj_col = ', adj_col)
        
        # adjust based upon min row/col
        # with rotate 90 for 3x3 its [0,2]
        new_indexes = new_indexes + nd.array([adj_row, adj_col]).astype('int64')
        my_print(verbose, 'new_indexes after shift to positive', new_indexes)

        
        output_image_batch = nd.zeros(input_image_batch.size).reshape(input_image_batch.shape)
        output_image = output_image_batch[0]
        
        my_print(verbose, 'output_image_batch ', output_image_batch)
        my_print(verbose, 'output_image_batch.shape ', output_image_batch.shape)
        my_print(verbose, 'output_image_batch[0] ', output_image_batch[0])
        my_print(verbose, 'a_img_batch[0][new_indexes[:,0],new_indexes[:,1]] ', a_img_batch[0][new_indexes[:,0],new_indexes[:,1]])
        output_image = a_img_batch[0][new_indexes[:,0],new_indexes[:,1]]
        my_print(verbose, 'output_image ', output_image)
        # output_image is flattened; reshape it back to a square
        new_dims = float(num_rows)**0.5
        new_dims = int(new_dims)
        output_image = output_image.reshape(new_dims,new_dims)
        my_print(verbose, 'output_image ', output_image)
        output_image_batch = nd.expand_dims(output_image, axis=0)
        my_print(verbose, 'output_image_batch ', output_image_batch)
        
        # NOTE: returning inside the loop means only the first image is processed
        return output_image_batch

Test 90 with 3x3

# Input Image 
in_img_batch = nd.arange(9).reshape((1,3,3))
print('in_img_batch is ', in_img_batch)
in_img = in_img_batch[0]
print('in_img ', in_img)

in_img_batch is  
[[[0. 1. 2.]
  [3. 4. 5.]
  [6. 7. 8.]]]
<NDArray 1x3x3 @cpu(0)>
in_img  
[[0. 1. 2.]
 [3. 4. 5.]
 [6. 7. 8.]]
<NDArray 3x3 @cpu(0)>

out_img_batch = my_rotate(in_img_batch, rotate_90)
out_img_batch

('num_rows = ', 9)

[[[2. 5. 8.]
  [1. 4. 7.]
  [0. 3. 6.]]]
<NDArray 1x3x3 @cpu(0)>

Test 90 with 2x2

# Input Image 
in_img_batch = nd.arange(4).reshape((1,2,2))
print('in_img_batch is ', in_img_batch)
in_img = in_img_batch[0]
print('in_img ', in_img)

in_img_batch is  
[[[0. 1.]
  [2. 3.]]]
<NDArray 1x2x2 @cpu(0)>
in_img  
[[0. 1.]
 [2. 3.]]
<NDArray 2x2 @cpu(0)>

out_img_batch = my_rotate(in_img_batch, rotate_90)
out_img_batch

('num_rows = ', 4)

[[[1. 3.]
  [0. 2.]]]
<NDArray 1x2x2 @cpu(0)>
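
For comparison, the same index-rotation idea can be written without the iterator. This is a NumPy sketch rather than NDArray code, and it makes assumptions that differ from the function above: the image is square, the rotated indexes are rounded rather than truncated, and the shift subtracts the per-column minimum. The names rotate_by_index and rotation_matrix are hypothetical.

```python
import numpy as np

def rotation_matrix(degrees):
    """Hypothetical helper: the 2x2 matrix [[cos, -sin], [sin, cos]]."""
    theta = np.deg2rad(degrees)
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

def rotate_by_index(image, rotate_matrix):
    """Rotate a square 2-D image by gathering pixels at rotated index positions."""
    h, w = image.shape
    assert h == w, "this sketch assumes a square image"
    # (h*w, 2) array of (row, col) pairs in row-major order
    orig_indexes = np.indices((h, w)).reshape(2, -1).T
    # Rotate the index pairs; rounding absorbs values such as 6.1e-17.
    new_indexes = np.rint(orig_indexes @ rotate_matrix).astype(np.int64)
    # Shift so that the smallest row and column index are both 0.
    new_indexes -= new_indexes.min(axis=0)
    # Gather one pixel per index pair, then restore the 2-D shape.
    return image[new_indexes[:, 0], new_indexes[:, 1]].reshape(h, w)

image = np.arange(9.0).reshape(3, 3)
print(rotate_by_index(image, rotation_matrix(90)))
```

For the 2x2 and 3x3 inputs this reproduces the 90-degree outputs shown above. In this sketch the 270-degree case also agrees with np.rot90(image, 3), but that does not establish what causes the 270-degree error in the NDArray version.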

Here is the notebook, if you wish to see the helper routines and imports.

https://github.com/rtp-aws/devpost_aws_disaster_response/blob/main/python/rotate_six.ipynb