I have a NumPy array, say:

a = array([[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]])

I also have an array 'replication' of the same shape, where replication[i, j] (>= 0) is the number of times a[i][j] should be repeated along its row. The replication array satisfies the invariant that np.sum(replication[i]) has the same value for all i, so every expanded row ends up with the same length. For example, if

replication = array([[1, 2, 1],
                     [1, 1, 2],
                     [2, 1, 1]])

then the final array after replicating is:

new_a = array([[1, 2, 2, 3],
               [4, 5, 6, 6],
               [7, 7, 8, 9]])
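
The invariant above can be checked directly before expanding anything. This is a minimal sketch, assuming replication is the array from the example; row_totals and is_rectangular are illustrative names, not part of the original post:

```python
import numpy as np

replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])

# Number of output columns that each row expands to.
row_totals = replication.sum(axis=1)

# The expanded result is rectangular only if all rows expand to the same width.
is_rectangular = bool((row_totals == row_totals[0]).all())

print(row_totals, is_rectangular)
# [4 4 4] True
```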

Presently, I am doing this to create new_a:

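h, w = a.shape
# allocate new_a; by the invariant, every row expands to the same width
new_a = np.empty((h, replication[0].sum()), dtype=a.dtype)
for row in range(h):
    # repeat a[row][j] replication[row][j] times, then flatten the sublists
    ll = [[a[row][j]] * replication[row][j] for j in range(w)]
    new_a[row] = np.array([item for sublist in ll for item in sublist])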

However, this seems to be too slow, since it builds intermediate Python lists. Can I do this entirely in NumPy, without using Python lists?

1 Answer

You can flatten out your replication array, then use the .repeat() method of a:

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])

new_a = a.repeat(replication.ravel()).reshape(a.shape[0], -1)

print(repr(new_a))
# array([[1, 2, 2, 3],
#        [4, 5, 6, 6],
#        [7, 7, 8, 9]])
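
To see why this works, it may help to split the one-liner into its steps. With no axis given, repeat operates on the flattened (row-major) copy of a, and replication.ravel() flattens the counts in the same order, so each element lines up with its own count. The sketch below is only an illustration: counts, flat and expected are names introduced here, and the per-row comparison is a sanity check added for this example rather than part of the answer.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])

# Step 1: one repeat count per element, in row-major order.
counts = replication.ravel()          # [1 2 1 1 1 2 2 1 1]

# Step 2: with axis=None, np.repeat flattens a and repeats element-wise.
flat = np.repeat(a, counts)           # [1 2 2 3 4 5 6 6 7 7 8 9]

# Step 3: fold the flat result back into one row per original row.
new_a = flat.reshape(a.shape[0], -1)

# Sanity check against a straightforward per-row expansion.
expected = np.array([np.repeat(a[i], replication[i]) for i in range(a.shape[0])])
assert np.array_equal(new_a, expected)
```

Note that the reshape only fails when the total count is not divisible by the number of rows. If the row sums of replication differ but the total still divides evenly, the elements silently land in the wrong rows, so the invariant from the question has to hold.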
ali_m