numpy.lib.stride_tricks produces different strides for same shaped arrays

Question

I have a dataset whose strides differ from those of another array of the same shape. The dataset can be downloaded from filebin at:

https://filebin.net/e02dm84v5etjyxoq

The code I use is below; I want to extract every 5x5 patch from the NxM array:

import numpy as np
from numpy.lib.stride_tricks import as_strided

N = 827
M = 914
gsize = 5

rand_array = np.random.normal(0, 1, (N, M))
a = as_strided(
    rand_array,
    (N - gsize + 1, M - gsize + 1, gsize, gsize),
    (rand_array.strides[1], rand_array.strides[1]) + rand_array.strides,
).reshape(-1, gsize, gsize)

# Using the same code as above, just replacing with the loaded data
tmp = np.load('tmp.npy', allow_pickle=True)
b = as_strided(
    tmp,
    (N - gsize + 1, M - gsize + 1, gsize, gsize),
    (tmp.strides[1], tmp.strides[1]) + tmp.strides,
).reshape(-1, gsize, gsize)

The results are correct for the array 'a', but I get an empty array for 'b', even though 'np.where(tmp > 0)' clearly returns indices for values > 0. Looking more closely at the data reveals:

>>> rand_array.shape
(827,914)
>>> tmp.shape
(827,914)
>>> a.shape
(748930,5,5)
>>> b.shape
(748930,5,5)
>>> rand_array.strides
(7312,8)
>>> tmp.strides
(3656,4)

Why does the as_strided technique work for the random array but not for the data that I have loaded, when the only difference seems to be the stride values? How can this be fixed to produce the proper results? I can get this to work with a messy-looking loop, but that takes much longer than this method. Also, I currently cannot install skimage for the 'view_as_windows' option.
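For reference, here is a minimal sketch of how every gsize x gsize window is usually pulled out with `as_strided`. It assumes a 2-D input and uses a hypothetical helper name, `extract_patches`, that is not part of the post above. The one difference from the posted call is the strides tuple: the first two entries step by one row and one column, `(strides[0], strides[1])`, where the posted code uses `strides[1]` twice. The small demo array is also an assumption, chosen so that a patch can be checked against plain slicing.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided


def extract_patches(arr, gsize):
    """Return every gsize x gsize window of a 2-D array, shape (n_patches, gsize, gsize).

    Hypothetical helper for illustration; not from the original post.
    """
    n, m = arr.shape
    s0, s1 = arr.strides  # bytes to move one row down, bytes to move one column right
    windows = as_strided(
        arr,
        shape=(n - gsize + 1, m - gsize + 1, gsize, gsize),
        strides=(s0, s1, s0, s1),  # window start moves by rows, then by columns
        writeable=False,           # guard against writing through overlapping views
    )
    # reshape copies here, because the overlapping windows are not contiguous
    return windows.reshape(-1, gsize, gsize)


# Small float32 demo array (an assumption, standing in for the loaded data)
demo = np.arange(6 * 7, dtype=np.float32).reshape(6, 7)
patches = extract_patches(demo, 5)

# Patch k = i * (M - gsize + 1) + j should equal demo[i:i+5, j:j+5]
i, j = 1, 2
k = i * (demo.shape[1] - 5 + 1) + j
print(patches.shape)                                        # (6, 5, 5)
print(np.array_equal(patches[k], demo[i:i + 5, j:j + 5]))   # True
```

Because the strides are read from the array itself, the same helper should behave identically for float64 and float32 input. On NumPy 1.20 or later, `numpy.lib.stride_tricks.sliding_window_view(arr, (gsize, gsize))` returns the same windows without hand-written strides and without needing skimage.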

As far as I can see, the `a.shape == b.shape`. Why do you think that it doesn't work for the saved data? — Quang Hoang, Sep 21 '20 at 14:10
As mentioned above, the only clear difference (that I see) is the stride lengths, but it could (potentially) have something to do with how the data is stored (i.e., bit length / size). Otherwise, that is the purpose of this post asking for help / guidance as I am not sure. — WX_M, Sep 21 '20 at 14:15
It must be how you are saving the data. Your saved data contain all `0`. — Quang Hoang, Sep 21 '20 at 14:16
Download the data (via the filebin link posted) and you will see the data are not full of zeros. In fact, running 'np.count_nonzero' on the loaded data results in '19489'. — WX_M, Sep 21 '20 at 14:26
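On the stride values raised in the comments: NumPy strides are byte counts, so (7312, 8) and (3656, 4) describe the same element layout for 8-byte and 4-byte items. The sketch below checks that, and then looks at which rows the strides tuple from the question, `(strides[1], strides[1]) + strides`, can reach. The `row_ids` array is an assumption made for illustration: every element holds its own row index, so the maximum over all patches shows the deepest row touched. I have not inspected `tmp.npy`, so how this relates to the loaded data is only a guess.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

N, M, gsize = 827, 914, 5

# Strides are in bytes, so they scale with the element size of the dtype.
f64 = np.zeros((N, M), dtype=np.float64)
f32 = np.zeros((N, M), dtype=np.float32)
print(f64.strides, f32.strides)  # (7312, 8) (3656, 4)

# Illustrative array (not the real data): each element stores its row index.
row_ids = np.arange(N, dtype=np.float32).repeat(M).reshape(N, M)

# Same shape and strides pattern as the call in the question.
s0, s1 = row_ids.strides
as_posted = as_strided(
    row_ids,
    (N - gsize + 1, M - gsize + 1, gsize, gsize),
    (s1, s1, s0, s1),
)

# Window (i, j) starts only i + j elements into the buffer, so the
# windows never get past the first few rows of the array.
highest_row_reached = int(as_posted.max())
print(highest_row_reached)  # 5
```

If that reading is right, the posted call only ever samples rows 0 through 5 of an 827-row array. A random normal array is nonzero everywhere, so its patches look plausible, while data whose nonzero values all sit in lower rows would come back as all zeros, regardless of dtype.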

0 Answers