This loops over shifts rather than rows (loop of size 10):
import numpy as np
N = 10  # window length: number of consecutive rows per window
c = np.hstack([b[i:i-N] for i in range(N)])
Explanation: b[i:i-N] is b's rows from i to m-(N-i) (excluding m-(N-i) itself), where m is the number of rows in b. Then np.hstack stacks those selected sub-arrays horizontally (it stacks b[0:m-10], b[1:m-9], b[2:m-8], ..., b[9:m-1]), as the question explains.
c.shape: (990, 20)
Also I think you may be looking for a shape of (991, 20) if you want to include all windows.
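To make the difference between the two shapes concrete, here is a minimal sketch. The array b below is only an assumed stand-in (1000 rows, 2 columns, matching the shapes quoted here), and c_all is a name I made up for the variant that keeps every window by using explicit stop indices instead of negative ones:

```python
import numpy as np

# Assumed stand-in for the question's array: 1000 rows, 2 columns.
b = np.arange(2000).reshape(1000, 2)
N = 10
m = b.shape[0]

# Version from above: the negative stop index i - N drops the last window.
c = np.hstack([b[i:i - N] for i in range(N)])

# Variant with explicit stop indices, so the last window is kept as well.
# (b[i:i - N + 1] would not work: for i = N - 1 the stop index becomes 0.)
c_all = np.hstack([b[i:m - N + 1 + i] for i in range(N)])

print(c.shape)      # (990, 20)
print(c_all.shape)  # (991, 20)
```

Each row of c_all is one window of 10 consecutive rows of b, flattened; c is the same thing without the final window.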
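You can also use strides, but if you want to do operations on the result, I would advise against that: a strided view shares memory with b, so overlapping windows refer to the same elements and in-place operations can behave unexpectedly. Here is a strides solution if you insist: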
from skimage.util.shape import view_as_windows
c = view_as_windows(b, (10,2)).reshape(-1, 20)
c.shape: (991, 20)
If you don't want the last row, simply remove it with c[:-1].
A similar solution is possible with NumPy's np.lib.stride_tricks.as_strided function (the two operate similarly; I am not sure about their internals).
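A rough sketch of what that could look like with NumPy alone, not a definitive implementation. It assumes b is C-contiguous (the as_strided call reads past row boundaries on purpose and is only correct in that case), and it uses an assumed stand-in array; c_strided and c_windows are names I made up. sliding_window_view needs NumPy 1.20 or later and is the safer of the two:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

# Assumed stand-in for the question's array: 1000 rows, 2 columns, C-contiguous.
b = np.ascontiguousarray(np.arange(2000).reshape(1000, 2))
N = 10
m, k = b.shape
row_stride, col_stride = b.strides

# Row j of the result starts at b[j] and reads N * k consecutive elements.
# This relies on b being C-contiguous; writeable=False guards against
# accidental writes through the overlapping windows.
c_strided = as_strided(
    b,
    shape=(m - N + 1, N * k),
    strides=(row_stride, col_stride),
    writeable=False,
)

# Safer alternative (NumPy >= 1.20): bounds are checked for you.
c_windows = sliding_window_view(b, (N, k)).reshape(-1, N * k)

print(c_strided.shape)                 # (991, 20)
print(np.shares_memory(c_strided, b))  # True
```

The last line shows why I would be careful with operations on such a result: it is a view onto b's memory, not an independent array. Call .copy() on it if you need to modify it.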
UPDATE: If you want to find the unique values and their frequencies in each row of c, you can do:
unique_values = []
unique_counts = []
for row in c:
    unique, unique_c = np.unique(row, return_counts=True)
    unique_values.append(unique)
    unique_counts.append(unique_c)
Note that NumPy arrays have to be rectangular, meaning every row (along each dimension) must have the same number of elements. Since different rows in c can have different numbers of unique values, you cannot create a regular NumPy array holding the unique values of each row (an alternative would be to make a structured NumPy array). Therefore, a solution is to make a list of arrays, each holding the unique values of one row of c. unique_values is a list of arrays of unique values, and unique_counts holds their frequencies in the same order.
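To illustrate why the result ends up ragged, here is a small sketch on a tiny made-up array (c_small is just an illustration, not the question's data). The row_counts list at the end is a hypothetical convenience I added that pairs each value with its count:

```python
import numpy as np

# Tiny made-up array standing in for c; its rows have 3, 1 and 2 unique values.
c_small = np.array([[1, 1, 2, 3],
                    [4, 4, 4, 4],
                    [5, 6, 5, 6]])

unique_values = []
unique_counts = []
for row in c_small:
    unique, unique_c = np.unique(row, return_counts=True)
    unique_values.append(unique)
    unique_counts.append(unique_c)

# One {value: count} dictionary per row, for convenient lookup.
row_counts = [dict(zip(u.tolist(), n.tolist()))
              for u, n in zip(unique_values, unique_counts)]

print([len(u) for u in unique_values])  # [3, 1, 2]
print(row_counts[0])                    # {1: 2, 2: 1, 3: 1}
```

Because the per-row lengths differ (3, 1 and 2 here), the results cannot be packed into one regular 2-D array, which is why they are kept as lists.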