
I have two images with the following dimensions, x, y, z:

img_a: 50, 50, 100

img_b: 50, 50

I'd like to reduce the z-dimension of img_a from 100 to 1 by taking, for each pixel, only the value at the z-index stored in img_b. The indices vary from pixel to pixel across the image.

This should result in a third image with the dimension:

img_c: 50, 50

Is there already a function dealing with this issue?

thanks, peter

KobeJohn
peter
  • What do you mean by "coincide with the indices stored in img_b"? Does img_b contain indices? Or do you simply want something like `img_a[:,:,0]`? –  Sep 27 '15 at 00:20
  • Thanks for your comment, Evert. Yes, img_b contains indices. KobeJohn (see below) already provided a nice, very fast solution to my problem. Thanks for your answer. peter – peter Sep 27 '15 at 13:16

1 Answer


Ok updated with a vectorized method.

Here is a duplicate question, but its solution currently doesn't work when the row and column dimensions differ in size.

The code below includes the method I added. It explicitly creates the lookup indices with numpy.indices() and then applies the same logic as the loops, but in a vectorized way. It's about 2x slower than the numpy.meshgrid() method, but I think it's easier to understand, and it also works with unequal row and column sizes.
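To make the idea concrete on something small enough to check by hand, here is a minimal sketch on a non-square array. The sizes and values are made up purely for illustration (they are not the data used for the timings), and the names `rows`, `cols` and `expected` are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 2 rows, 3 columns, 4 channels (deliberately non-square).
a = np.arange(2 * 3 * 4).reshape(2, 3, 4)
b = np.array([[0, 1, 2],
              [3, 0, 1]])

# np.indices(b.shape) returns one index array per axis, each shaped like b:
# rows[i, j] == i and cols[i, j] == j.
rows, cols = np.indices(b.shape)

# Advanced indexing then picks a[i, j, b[i, j]] for every (i, j) at once.
c = a[rows, cols, b]

# The same lookup written with explicit loops, for comparison.
expected = np.array([[a[i, j, b[i, j]] for j in range(b.shape[1])]
                     for i in range(b.shape[0])])

print(c.shape)                      # same shape as b
print(np.array_equal(c, expected))  # vectorized result agrees with the loops
```

Because `rows`, `cols` and `b` all share the shape of `b`, nothing here depends on the row and column counts being equal.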

The timing is approximate but on my system I get:

Meshgrid time: 0.319000005722
Indices time: 0.704999923706
Loops time: 13.3789999485


import numpy as np
import time


x_dim = 5000
y_dim = 5000
channels = 3

# base data
a = np.random.randint(1, 1000, (x_dim, y_dim, channels))
b = np.random.randint(0, channels, (x_dim, y_dim))


# meshgrid method (from here https://stackoverflow.com/a/27281566/377366 )
start_time = time.time()
# (as written, this only lines up with b when x_dim == y_dim)
i1, i0 = np.meshgrid(range(x_dim), range(y_dim), sparse=True)
c_by_meshgrid = a[i0, i1, b]
print('Meshgrid time: {}'.format(time.time() - start_time))

# indices method (this is the vectorized method that does what you want)
start_time = time.time()
b_indices = np.indices(b.shape)
c_by_indices = a[b_indices[0], b_indices[1], b[b_indices[0], b_indices[1]]]
print('Indices time: {}'.format(time.time() - start_time))

# loops method
start_time = time.time()
c_by_loops = np.zeros((x_dim, y_dim), np.intp)
# range() works on Python 3 (and on Python 2, where xrange was the lazy variant)
for i in range(x_dim):
    for j in range(y_dim):
        c_by_loops[i, j] = a[i, j, b[i, j]]
print('Loops time: {}'.format(time.time() - start_time))


# confirm correctness
print('Meshgrid method matches loops: {}'.format(np.all(c_by_meshgrid == c_by_loops)))
print('Indices method matches loops: {}'.format(np.all(c_by_indices == c_by_loops)))
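
As a possible alternative that is not part of the timings above: NumPy 1.15 and later provide `np.take_along_axis`, which does this kind of per-position lookup directly. The sketch below also shows a meshgrid variant using `indexing='ij'`, which should handle unequal row and column sizes. The small sizes and the names `c_by_take` and `c_by_grid` are assumptions chosen for illustration, and neither variant has been benchmarked here.

```python
import numpy as np

rng = np.random.default_rng(0)
x_dim, y_dim, channels = 4, 7, 3  # deliberately non-square, illustrative sizes

a = rng.integers(1, 1000, size=(x_dim, y_dim, channels))
b = rng.integers(0, channels, size=(x_dim, y_dim))

# take_along_axis expects the index array to have as many dimensions as `a`,
# so add a trailing axis to b and drop it again from the result.
c_by_take = np.take_along_axis(a, b[:, :, np.newaxis], axis=2)[:, :, 0]

# Open-grid variant: with indexing='ij', i0 has shape (x_dim, 1) and
# i1 has shape (1, y_dim), so both broadcast against b's (x_dim, y_dim).
i0, i1 = np.meshgrid(np.arange(x_dim), np.arange(y_dim),
                     indexing='ij', sparse=True)
c_by_grid = a[i0, i1, b]

# Reference result from explicit loops.
c_by_loops = np.zeros((x_dim, y_dim), dtype=a.dtype)
for i in range(x_dim):
    for j in range(y_dim):
        c_by_loops[i, j] = a[i, j, b[i, j]]

print(np.array_equal(c_by_take, c_by_loops))
print(np.array_equal(c_by_grid, c_by_loops))
```

Both variants avoid materializing full row and column index arrays, which may matter for large images.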
KobeJohn
  • Thanks for answering, KobeJohn, and thanks for sharing your code. I am actually searching for a vectorized approach, as I am dealing with huge data sets and want to avoid loops. I was wondering if there is a one-liner out there that solves this problem in a vectorized way? – peter Sep 26 '15 at 12:59
  • @peter I totally understand. I have been trying to find the way to use advanced indexing for it. I'll update if I can find it. There are plenty of numpy masters on here who can tell you immediately if there is a way to do it or not. I'm not sure why there are no answers yet. – KobeJohn Sep 26 '15 at 13:16
  • @peter Added a questionably vectorized version. – KobeJohn Sep 27 '15 at 00:18
  • I am impressed, it works! It's so fast! Great job, KobeJohn. I'm not familiar with meshgrids either; they look interesting. However, unequal row and column sizes are common among my datasets, so I'll stick with the indices method. Deeply grateful, peter. – peter Sep 27 '15 at 12:05
  • @peter I'm glad it worked! If you could mark it solved, I would appreciate it and it will help future users to know that the solution works. – KobeJohn Sep 27 '15 at 14:02
  • OK, found the check mark. thanks for the reminder. cheers and many thanks again – peter Sep 27 '15 at 14:47