4

Suppose I have the following 2D array:

x = np.array([[10,20,30,40], [50,60,70,80],[90,100,110,120]])  
print(x)

array([[ 10,  20,  30,  40],
       [ 50,  60,  70,  80],
       [ 90, 100, 110, 120]])

I would like to construct a new array, y, where each row has the values of a 2x2 block from x in clockwise order:

print(y)
array([[ 10,  20,  60,  50],
       [ 20,  30,  70,  60],
       [ 30,  40,  80,  70],
       [ 50,  60,  100, 90],
       [ 60,  70,  110, 100],
       [ 70,  80,  120, 110]])

I could achieve that using Python for loops as follows:

n_rows, n_cols = x.shape
y = []
for i in range(n_rows-1): 
     for j in range(n_cols-1): 
         row = [x[i,j],x[i,j+1],x[i+1, j+1],x[i+1,j]] 
         y.append(row) 
y = np.array(y)

I wonder if there is a faster way that takes advantage of Numpy functions and avoid using Python loops.

Mad Physicist
  • 107,652
  • 25
  • 181
  • 264
Amazigh_05
  • 241
  • 1
  • 8
  • If I understand correctly, you're trying to implement some kind of convolution-like operation, where each row in `y` has the values from a 2x2 block in `x`, right? – joanis Apr 09 '22 at 17:13
  • @joanis: I want to extract each 2x2 block and put it in a row. – Amazigh_05 Apr 09 '22 at 17:15
  • @joanis. You're overthinking it a bit: there is no summation here, and the extraction is in clockwise order, instead of just raveled. – Mad Physicist Apr 09 '22 at 17:46
  • @MadPhysicist You're right, sliding window is the right tool, this doesn't have anything to do with convolutions. I like your answer. I just didn't find the question very clear in the first place. – joanis Apr 09 '22 at 18:03
  • There, I just edited the question to make the relationship between x and y clear. – joanis Apr 09 '22 at 18:06
  • @joanis. Prose is generally confusing. I generally look at the reference implementation, which OP provided, and is totally unambiguous. – Mad Physicist Apr 09 '22 at 18:11
  • @MadPhysicist I only half agree. Yes, the code is unambiguous, but a little bit of well-writen prose can lead you to understand the code much faster. – joanis Apr 09 '22 at 18:14

2 Answers2

2

You can cache your code since the loop is mainly iterating the same matrice again and again (If you want to keep your same code with loop). I have made a speed comparison for your code before and after caching.

# Before caching
def loop_before_cache():
    n_rows, n_cols = x.shape
    y = []
    for i in range(n_rows-1): 
        for j in range(n_cols-1): 
            row = [x[i,j],x[i,j+1],x[i+1, j+1],x[i+1,j]] 
            y.append(row) 
    return np.array(y)


%timeit loop_before_cache()
11.6 µs ± 318 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

And now with caching

# After caching
from functools import lru_cache

@lru_cache()
def loop_after_cache():
    n_rows, n_cols = x.shape
    y = []
    for i in range(n_rows-1): 
        for j in range(n_cols-1): 
            row = [x[i,j],x[i,j+1],x[i+1, j+1],x[i+1,j]] 
            y.append(row) 
    return np.array(y)

%timeit loop_after_cache()
83.6 ns ± 2.42 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Extra

I have added simulated data with a (1000,5000) array using range to show the efficiency of caching.

x = np.array([i for i in range(1,5000001)])
x = np.reshape(x, (1000,5000))

# Before caching
def loop_before_cache():
    n_rows, n_cols = x.shape
    y = []
    for i in range(n_rows-1): 
        for j in range(n_cols-1): 
            row = [x[i,j],x[i,j+1],x[i+1, j+1],x[i+1,j]] 
            y.append(row) 
    return np.array(y)

%timeit loop_before_cache()
8.58 s ± 113 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# After caching
@lru_cache(maxsize = 256)
def loop_after_cache():
    n_rows, n_cols = x.shape
    y = []
    for i in range(n_rows-1): 
        for j in range(n_cols-1): 
            row = [x[i,j],x[i,j+1],x[i+1, j+1],x[i+1,j]] 
            y.append(row) 
    return np.array(y)

%timeit loop_after_cache()
82.2 ns ± 5.58 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
  • Nice! However, I still want to see an answer that uses numpy functions if possible. – Amazigh_05 Apr 09 '22 at 17:13
  • 1
    I hope I can help with that, but sorry I'm not really good at `numpy`. Well, you can use this caching too after there's a better post with numpy solution :D –  Apr 09 '22 at 17:16
  • 2
    Now try it on a 100x500 array or so. Benchmarks on tiny datasets are pretty useless since they are dominated by overhead, especially in a case like numpy, which has a huge amount of prep-work before the data is passed to the actual C function. – Mad Physicist Apr 09 '22 at 17:51
  • Thanks for the suggestion, I have included an update to the answer –  Apr 10 '22 at 01:25
2

First, create a sliding_window_view into x with the 2x2 boxes you want to see:

b = np.lib.stride_tricks.sliding_window_view(x, (2, 2))

Each of the innermost 2x2 arrays contains an unraveled version of what you want, but with the second part of the array reversed. So far we didn't copy any data. Now make a copy by raveling the last dimension. The reshape will always make a copy here because b is highly non-contiguous:

c = b.reshape(*b.shape[:2], 4)

Swap the last two columns:

c[..., 2:] = c[..., -1:1:-1]

Now ravel the leading dimensions:

y = c.reshape(-1, c.shape[-1])

If you have a version of numpy that is older whan 1.20, you can replace the definition of b with

b = np.lib.stride_tricks.as_strided(x, shape=(x.shape[0] - 1, x.shape[1] - 1, 2, 2), strides=x.strides * 2)
Mad Physicist
  • 107,652
  • 25
  • 181
  • 264