I am trying to write a function that would create a regular grid of 5 pixels by 5 pixels inside a 2d array. I was hoping some combination of numpy.arange and numpy.repeat might do it, but so far I haven't had any luck because numpy.repeat will just repeat along the same row.

Here is an example:

Let's say I want a 5x5 grid inside a 2d array of shape (20, 15). It should look like:

array([[ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
       [ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
       [ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
       [ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
       [ 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2],
       [ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
       [ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
       [ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
       [ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
       [ 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5],
       [ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
       [ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
       [ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
       [ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
       [ 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8],
       [ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
       [ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
       [ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
       [ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11],
       [ 9, 9, 9, 9, 9,10,10,10,10,10,11,11,11,11,11]])

I realize I could simply use a loop and slicing to accomplish this, but I could be applying this to very large arrays and I worry that the performance of that would be too slow or impractical.

Can anyone recommend a method to accomplish this?

Thanks in advance.

UPDATE:

All the answers provided seem to work well. Can anyone tell me which will be the most efficient to use for large arrays? By large array I mean it could be 100000 x 100000 or more with 15 x 15 grid cell sizes.

Brian
  • Here are two solutions, one involving `numpy.kron`, one involving `numpy.repeat`: http://stackoverflow.com/questions/7525214/how-to-scale-a-numpy-array – Brionius Oct 14 '13 at 16:02
  • I posted an answer here, but it turns out to be the same method @NPE used in your linked question. – jorgeca Oct 14 '13 at 16:09
  • @Brionius - `numpy.kron` definitely worked. Do you think there is much of a performance difference between that and @Mr. E's answer? – Brian Oct 14 '13 at 16:36
  • The output array is what dominates everything else in your problem, so differences will be small. If size is `(m, n)`, and the cells are `(d, d)`, Mr. E builds two arrays of size `(m,)` and `(n,)`, and performs `m * n` additions. My solution creates an array `(m * n / d / d,)` and performs no other operations. My guess is that for large `m` and `n` and relatively small `d`, Mr. E's solution will be faster and more efficient, but not by much. By looking at the [source code](https://github.com/numpy/numpy/blob/v1.7.0/numpy/lib/shape_base.py#L665), I am pretty sure `np.kron` will perform worse. – Jaime Oct 14 '13 at 20:43
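
Since the output array dominates the cost, it may help to estimate its size before choosing a method. The following is only a rough sketch: `estimate_grid_memory` is a hypothetical helper (not from any answer here), and it assumes the labels run from 0 to the number of cells minus one. For the 100000 x 100000 case with 15 x 15 cells, storing the labels as 64-bit integers would take about 80 GB, so picking the smallest dtype that can hold the largest label is probably worth it.

```python
import numpy as np

def estimate_grid_memory(m, n, d):
    """Return (smallest unsigned dtype for the labels, bytes for the full array)."""
    # Hypothetical helper: ceil(m / d) * ceil(n / d) cells, labelled 0 .. n_cells - 1
    n_cells = (-(-m // d)) * (-(-n // d))
    dtype = np.min_scalar_type(n_cells - 1)  # smallest dtype that holds the largest label
    return dtype, m * n * dtype.itemsize

dtype, nbytes = estimate_grid_memory(100_000, 100_000, 15)
print(dtype, nbytes / 1e9, "GB")
```

Whatever method is used, an array of this size may need to be built in row bands or backed by `np.memmap` rather than held in RAM all at once.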

3 Answers

Broadcasting is the answer here:

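import numpy as np

m, n, d = 20, 15, 5  # array shape (m, n) and cell size d; d must divide both m and n
arr = np.empty((m, n), dtype=int)  # np.int was removed in NumPy 1.24; plain int is equivalent
arr_view = arr.reshape(m // d, d, n // d, d)  # 4-D view onto arr's memory
vals = np.arange((m // d) * (n // d)).reshape(m // d, 1, n // d, 1)
arr_view[:] = vals  # broadcasting fills each d x d cell with its index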

>>> arr
array([[ 0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2],
       [ 0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2],
       [ 0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2],
       [ 0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2],
       [ 0,  0,  0,  0,  0,  1,  1,  1,  1,  1,  2,  2,  2,  2,  2],
       [ 3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  5,  5,  5,  5,  5],
       [ 3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  5,  5,  5,  5,  5],
       [ 3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  5,  5,  5,  5,  5],
       [ 3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  5,  5,  5,  5,  5],
       [ 3,  3,  3,  3,  3,  4,  4,  4,  4,  4,  5,  5,  5,  5,  5],
       [ 6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  8,  8,  8,  8,  8],
       [ 6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  8,  8,  8,  8,  8],
       [ 6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  8,  8,  8,  8,  8],
       [ 6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  8,  8,  8,  8,  8],
       [ 6,  6,  6,  6,  6,  7,  7,  7,  7,  7,  8,  8,  8,  8,  8],
       [ 9,  9,  9,  9,  9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
       [ 9,  9,  9,  9,  9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
       [ 9,  9,  9,  9,  9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
       [ 9,  9,  9,  9,  9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11],
       [ 9,  9,  9,  9,  9, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11]])
Jaime
  • Thanks for the answer. Unfortunately, this did not work for me. I ended up with `arr_view.shape = (4L, 5L, 3L, 5L)` and not `(20, 15)`. – Brian Oct 14 '13 at 16:32
  • @Brian `arr_view` is a view of your original array that is only used to have a shape that is broadcastable. It is `arr` that you want to look at: it remains with shape `(20, 15)`. – Jaime Oct 14 '13 at 17:10
  • I see. Thanks for clarifying. I'm unfamiliar with the concept of views. I'll have to look into that. – Brian Oct 14 '13 at 17:18
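
For anyone else unfamiliar with views, here is a small sketch of what the comments above describe. It assumes a freshly allocated, C-contiguous array, for which `reshape` returns a view rather than a copy; the fill value 7 is arbitrary and only there to make the effect visible.

```python
import numpy as np

arr = np.zeros((20, 15), dtype=int)
arr_view = arr.reshape(4, 5, 3, 5)  # same memory, different shape

# The two arrays share one buffer, so writes through the view land in arr.
shares = np.shares_memory(arr, arr_view)

# arr_view[i, a, j, b] is arr[i * 5 + a, j * 5 + b],
# so this fills the cell at grid row 0, grid column 1.
arr_view[0, :, 1, :] = 7
block = arr[0:5, 5:10]

print(shares, arr.shape, arr_view.shape)
print(block)
```

This is why `arr` keeps its `(20, 15)` shape while still receiving the values assigned through `arr_view`.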

Similar to Jaime's answer:

import numpy as np
np.repeat(np.arange(0, 10, 3), 5)[..., None] + np.repeat(np.arange(3), 5)[None, ...]
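
The one-liner hard-codes the numbers for the (20, 15) example. A parameterized version might look like the sketch below. Note that it swaps `np.repeat` for integer division of the row and column indices, which is my own substitution and not part of the answer; `label_grid` and its parameters are hypothetical names. The upside of the swap is that it should also handle shapes that are not a multiple of the cell size, such as 100000 with 15 x 15 cells.

```python
import numpy as np

def label_grid(m, n, d, dtype=np.int64):
    """Label each d x d cell of an (m, n) array with a row-major cell index."""
    n_cols = -(-n // d)                        # ceil(n / d): cells per grid row
    row_cell = np.arange(m, dtype=dtype) // d  # grid row of every array row
    col_cell = np.arange(n, dtype=dtype) // d  # grid column of every array column
    # Broadcast a (m, 1) column against a (1, n) row to get the (m, n) result.
    return row_cell[:, None] * n_cols + col_cell[None, :]

grid = label_grid(20, 15, 5)   # the example from the question
ragged = label_grid(7, 7, 5)   # edge cells are simply truncated
print(grid.shape, ragged.shape)
```

Cells on the right and bottom edges come out smaller than `d x d` when the shape is not divisible by `d`; whether that is the desired behavior depends on the application.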
YXD
  • This seems much simpler than Jaime's answer and much closer to what I was trying to do. I'm unfamiliar with `[..., None]`. What exactly does this mean? – Brian Oct 14 '13 at 16:34
  • That's called broadcasting. Best explanations are given [here](http://scipy-lectures.github.io/intro/numpy/numpy.html#broadcasting) and [here](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html). Read about the ellipsis and more about broadcasting [here](http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html). You may see `np.newaxis` used instead of `None` in the docs. – YXD Oct 14 '13 at 16:45
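
To make the `[..., None]` question concrete: indexing with `None` inserts a new axis of length 1, and broadcasting is what then happens when the two differently shaped arrays are added. A short sketch, with the intermediate names (`rows`, `cols`, `col_vec`, `row_vec`) chosen here purely for illustration:

```python
import numpy as np

rows = np.repeat(np.arange(0, 12, 3), 5)  # shape (20,): 0,0,0,0,0,3,3,...
cols = np.repeat(np.arange(3), 5)         # shape (15,): 0,0,0,0,0,1,1,...

col_vec = rows[..., None]   # new trailing axis  -> shape (20, 1)
row_vec = cols[None, ...]   # new leading axis   -> shape (1, 15)

# np.newaxis is just an alias for None, so this is the same operation.
same = rows[:, np.newaxis]

# Adding (20, 1) and (1, 15) broadcasts both operands to (20, 15).
total = col_vec + row_vec
print(col_vec.shape, row_vec.shape, total.shape)
```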

`np.kron` will do this expansion (as Brionius also suggested in the comments):

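import numpy as np
xi, xj, ni, nj = 5, 5, 4, 3  # cell size (xi, xj) and number of cells (ni, nj)
# dtype=int keeps the labels as integers; np.ones defaults to float
r = np.kron(np.arange(ni*nj).reshape((ni,nj)), np.ones((xi, xj), dtype=int))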

Although I haven't tested it, I assume it's less efficient than the broadcasting approach, but a bit more concise and easier to understand (I hope). It's likely less efficient because: 1) it requires the array of ones, 2) it does xi*xj multiplications by 1, and 3) it does a bunch of concatenations.
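
The efficiency guess above is easy to check on a given machine. Below is a rough timing sketch rather than a proper benchmark: the function names `grid_reshape` and `grid_kron` are made up for this example, the 1500 x 1500 size is an arbitrary assumption, and results will vary with NumPy version and hardware, so no numbers are quoted here.

```python
import timeit
import numpy as np

def grid_reshape(m, n, d):
    """Broadcasting into a reshaped view (d must divide m and n)."""
    arr = np.empty((m, n), dtype=int)
    vals = np.arange((m // d) * (n // d)).reshape(m // d, 1, n // d, 1)
    arr.reshape(m // d, d, n // d, d)[:] = vals
    return arr

def grid_kron(m, n, d):
    """Kronecker product of the cell indices with a block of ones."""
    cells = np.arange((m // d) * (n // d)).reshape(m // d, n // d)
    return np.kron(cells, np.ones((d, d), dtype=int))

m, n, d = 1500, 1500, 15  # arbitrary test size
for fn in (grid_reshape, grid_kron):
    seconds = min(timeit.repeat(lambda: fn(m, n, d), number=5, repeat=3))
    print(f"{fn.__name__}: {seconds / 5:.4f} s per call")
```

Both functions should return identical arrays, so the comparison is purely about speed and temporary memory.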

tom10
  • This didn't actually work. A small correction is required. `np.ones((xi*ni, xj*nj))` should be `np.ones((xi, xj))`. With the correction it did work. Thanks for the explanation of efficiency. – Brian Oct 14 '13 at 17:22