Numpy convert list of 3D variable size volumes to 4D array

Question

I'm working on a neural network where I am augmenting data via rotation and varying the size of each input volume.

Let me back up, the input to the network is a 3D volume. I generate variable size 3D volumes, and then pad each volume with zero's such that the input volume is constant. Check here for an issue I was having with padding (now resolved).

I generate a variable size 3D volume, append it to a list, and then convert the list into a numpy array. At this point, padding hasn't occured so converting it into a 4D tuple makes no sense...

input_augmented_matrix = []
label_augmented_matrix = []
for i in range(n_volumes):
    if i % 50 == 0:
        print ("Augmenting step #" + str(i))
    slice_index = randint(0,n_input)
    z_max = randint(5,n_input)
    z_rand = randint(3,5)
    z_min = z_max - z_rand
    x_max = randint(75, n_input_x)
    x_rand = randint(60, 75)
    x_min = x_max - x_rand
    y_max = randint(75, n_input_y)
    y_rand = randint(60, 75)
    y_min = y_max - y_rand
    random_rotation = randint(1,4) * 90
    for j in range(2):
        temp_volume = np.empty((z_rand, x_rand, y_rand))
        k = 0
        for z in range(z_min, z_max):
            l = 0
            for x in range(x_min, x_max):
                m = 0
                for y in range(y_min, y_max):
                    if j == 0:
                        #input volume
                        try:
                            temp_volume[k][l][m] = input_matrix[z][x][y]
                        except:
                            pdb.set_trace()
                    else:
                        #ground truth volume
                        temp_volume[k][l][m] = label_matrix[z][x][y]
                    m = m + 1
                l = l + 1
            k = k + 1
        temp_volume = np.asarray(temp_volume)
        temp_volume = np.rot90(temp_volume,random_rotation)
        if j == 0:
            input_augmented_matrix.append(temp_volume)
        else:
            label_augmented_matrix.append(temp_volume)

input_augmented_matrix = np.asarray(input_augmented_matrix)
label_augmented_matrix = np.asarray(label_augmented_matrix)

The dimensions of input_augmented_matrix at this point is (N,)

Then I pad with the following code...

for i in range(n_volumes):
    print("Padding volume #" + str(i))
    input_augmented_matrix[i] = np.lib.pad(input_augmented_matrix[i], ((0,n_input_z - int(input_augmented_matrix[i][:,0,0].shape[0])),
                                                (0,n_input_x - int(input_augmented_matrix[i][0,:,0].shape[0])),
                                                (0,n_input_y - int(input_augmented_matrix[i][0,0,:].shape[0]))),
                                'constant', constant_values=0)
    label_augmented_matrix[i] = np.lib.pad(label_augmented_matrix[i], ((0,n_input_z - int(label_augmented_matrix[i][:,0,0].shape[0])),
                                                (0,n_input_x - int(label_augmented_matrix[i][0,:,0].shape[0])),
                                                (0,n_input_y - int(label_augmented_matrix[i][0,0,:].shape[0]))),
                                'constant', constant_values=0)

At this point, the dimensions are still (N,) even though every element of the list is constant. For example input_augmented_matrix[0] = input_augmented_matrix[1]

Currently I just loop through and create a new array, but it takes too long and I would prefer some sort of method that automates this. I do it with the following code...

input_4d = np.empty((n_volumes, n_input_z, n_input_x, n_input_y))
label_4d = np.empty((n_volumes, n_input_z, n_input_x, n_input_y))
for i in range(n_volumes):
    print("Converting to 4D tuple #" + str(i))
    for j in range(n_input_z):
        for k in range(n_input_x):
            for l in range(n_input_y):
                input_4d[i][j][k][l] = input_augmented_matrix[i][j][k][l]
                label_4d[i][j][k][l] = label_augmented_matrix[i][j][k][l]

Is there a cleaner and faster way to do this?

score 1 · Accepted Answer · answered Aug 16 '16 at 17:40

As I understood this part

k = 0
for z in range(z_min, z_max):
    l = 0
    for x in range(x_min, x_max):
        m = 0
        for y in range(y_min, y_max):
            if j == 0:
                #input volume
                try:
                    temp_volume[k][l][m] = input_matrix[z][x][y]
                except:
                    pdb.set_trace()
            else:
                #ground truth volume
                temp_volume[k][l][m] = label_matrix[z][x][y]
            m = m + 1
        l = l + 1
    k = k + 1

You just want to do this

temp_input = input_matrix[z_min:z_max, x_min:x_max, y_min:y_max]
temp_label = label_matrix[z_min:z_max, x_min:x_max, y_min:y_max]

and then

temp_input = np.rot90(temp_input, random_rotation)
temp_label = np.rot90(temp_label, random_rotation)

input_augmented_matrix.append(temp_input)
label_augmented_matrix.append(temp_label)

Here

input_augmented_matrix[i] = np.lib.pad(
    input_augmented_matrix[i],
    ((0,n_input_z - int(input_augmented_matrix[i][:,0,0].shape[0])),
     (0,n_input_x - int(input_augmented_matrix[i][0,:,0].shape[0])),
     (0,n_input_y - int(input_augmented_matrix[i][0,0,:].shape[0]))),
    'constant', constant_values=0)

Better to do this, because shape property gives you size of array by all dimensions

ia_shape = input_augmented_matrix[i].shape
input_augmented_matrix[i] = np.lib.pad(
    input_augmented_matrix[i],
    ((0, n_input_z - ia_shape[0]),
     (0, n_input_x - ia_shape[1])),
     (0, n_input_y - ia_shape[2]))),
    'constant',
    constant_values=0)

I guess now you're ready to refactor the last part of your code with magic indexing of NumPy.

My common suggestions:

use functions for repeated parts of code to avoid such indents like in your cascade of loops;
if you need so lot of nested loops, think about recursion, if you can't deal without them;
explore abilities of NumPy in official documentation: they're really exciting ;) For example, indexing is helpful for this task;
use PyLint and Flake8 packages to inspect quality of your code.

Do you want to write neural network by yourself, or you just want to solve some patterns recognition task? SciPy library may contain what you need and it's based on NumPy.

thanks for the help! this looks much better :) and this is a data class for a tensorflow network — Kendall Weihe, Aug 16 '16 at 18:01
@KendallWeihe you're welcome! Does it solve your issue or I've missed something? — Charlie, Aug 18 '16 at 05:44

Numpy convert list of 3D variable size volumes to 4D array

1 Answers1