
How can I efficiently slice an array into overlapping subarrays, so that for

>>> N = 5
>>> L = 2  # could be any, less than N
>>> x = range(N)

the expected result is

[[1,0],[2,1],[3,2],[4,3]]

Here's what I've tried:

>>> [ x[i:i-L:-1] for i in range(L-1,len(x)) ]
[[], [2, 1], [3, 2], [4, 3]]  # wrong

>>> [ x[i:i-L:-1] for i in range(L,len(x)) ]
[[2, 1], [3, 2], [4, 3]]  # wrong

>>> [ x[i:i-L if i-L >= 0 else None:-1] for i in range(L-1,len(x)) ]
[[1, 0], [2, 1], [3, 2], [4, 3]]  # correct

It produces the desired result, but is it the best way to achieve it?

Are there some numpy, itertools functions that may help?

mr.tarsa
  • So, is the input a list or NumPy array? Is the expected result a list or array? The title says `array`, whereas the sample is `range(N)` that creates a list. – Divakar Nov 30 '16 at 17:44
  • Thanks, I expect it to be the one for which the better solution exists. – mr.tarsa Nov 30 '16 at 17:48

3 Answers


You can use a simple list comprehension:

>>> [[x[i+1], x[i]] for i in range(len(x) - 1)]
[[1, 0], [2, 1], [3, 2], [4, 3]]

Or use itertools.izip:

>>> from itertools import izip
>>> [list(k) for k in izip(x[1:], x)]
[[1, 0], [2, 1], [3, 2], [4, 3]]

I see that you updated the question, so here's a generic itertools approach using itertools.islice and itertools.imap (Python 2):

>>> from itertools import imap, islice
>>> res = imap(lambda i: islice(reversed(x), i, i+L), xrange(N-L,-1,-1))
>>> [list(e) for e in res]
[[1, 0], [2, 1], [3, 2], [4, 3]]

Or even a plain generator expression:

>>> res = (reversed(x[i:i+L]) for i in xrange(N-L+1))
>>> [list(e) for e in res]
[[1, 0], [2, 1], [3, 2], [4, 3]]
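
The snippets above rely on Python 2 names (izip, imap, xrange). As a minimal sketch of the same idea for Python 3, the helper below slices each window and reverses it; the name reversed_windows is hypothetical and not part of the original answer.

```python
def reversed_windows(seq, L):
    """Return every length-L window of seq, each in reversed order."""
    if not 1 <= L <= len(seq):
        raise ValueError("L must be between 1 and len(seq)")
    # seq[i:i+L] copies one window; [::-1] reverses that copy.
    return [seq[i:i + L][::-1] for i in range(len(seq) - L + 1)]


x = list(range(5))
print(reversed_windows(x, 2))  # [[1, 0], [2, 1], [3, 2], [4, 3]]
print(reversed_windows(x, 3))  # [[2, 1, 0], [3, 2, 1], [4, 3, 2]]
```

Slicing first and reversing afterwards sidesteps the negative-stop problem in the question's x[i:i-L:-1] attempt, at the cost of one extra copy per window.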
Roman Pekar
  • Thanks, you provided a simple and elegant solution for small subarrays. But what about the case of 10-element subarrays (also shifted by 1 element)? So that for `x = range(100)` the result will be `[[9,8,..,0],[10,9,..,1],..,[99,98,..,90]]`. – mr.tarsa Nov 30 '16 at 17:23
  • Something like `[x[i:i+n][::-1] for i in range(len(x) - n + 1)]` should work, but I can't test it at the moment; I'll check when I get home. – Roman Pekar Nov 30 '16 at 17:39

I am assuming the input is a NumPy array. If it is a list, convert it first with x = np.asarray(input_list).

Here's an approach using strides. It returns a view into x rather than a copy, so it should be quite efficient:

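import numpy as np

L = 2  # row length
strided = np.lib.stride_tricks.as_strided
n = x.strides[0]  # byte step between consecutive elements of x
# Start at x[L-1]; step forward along rows and backward along columns.
out = strided(x[L-1:], shape=(x.size-L+1, L), strides=(n, -n))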

Sample runs:

In [85]: L = 2

In [86]: strided(x[L-1:],shape=(x.size-L+1,L),strides=(n,-n))
Out[86]: 
array([[1, 0],
       [2, 1],
       [3, 2],
       [4, 3]])

In [87]: L = 3

In [88]: strided(x[L-1:],shape=(x.size-L+1,L),strides=(n,-n))
Out[88]: 
array([[2, 1, 0],
       [3, 2, 1],
       [4, 3, 2]])

In [89]: L = 4

In [90]: strided(x[L-1:],shape=(x.size-L+1,L),strides=(n,-n))
Out[90]: 
array([[3, 2, 1, 0],
       [4, 3, 2, 1]])
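
As a hedged alternative to calling as_strided by hand, newer NumPy releases (1.20 and later) ship numpy.lib.stride_tricks.sliding_window_view, which computes the shape and strides itself. The sketch below assumes such a NumPy version is installed; it is not part of the original answer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

x = np.arange(5)
L = 2

# Each row of sliding_window_view(x, L) is the forward window x[i:i+L].
# Reversing the last axis flips every window and is still a view of x.
out = sliding_window_view(x, L)[:, ::-1]
print(out.tolist())  # [[1, 0], [2, 1], [3, 2], [4, 3]]
```

The returned view is read-only by default, which guards against accidentally writing to overlapping elements.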

Here's another approach using broadcasting:

L = 2 # Row length
out = x[np.arange(x.size-L+1)[:,None] + np.arange(L-1,-1,-1)]
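
To make the broadcasting step easier to follow, here is a small worked sketch that prints the intermediate index matrix. The names rows, cols and idx are illustrative, and the input values are chosen to differ from their positions so that indices and values are easy to tell apart.

```python
import numpy as np

x = np.arange(10, 15)  # values 10..14, so values differ from indices
L = 3

rows = np.arange(x.size - L + 1)[:, None]  # shape (3, 1): window start positions
cols = np.arange(L - 1, -1, -1)            # shape (3,): offsets L-1 down to 0
idx = rows + cols                          # broadcasts to shape (3, 3)
print(idx.tolist())  # [[2, 1, 0], [3, 2, 1], [4, 3, 2]]

out = x[idx]  # integer-array indexing builds a new array (a copy)
print(out.tolist())  # [[12, 11, 10], [13, 12, 11], [14, 13, 12]]
```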
Divakar

Or with the regular zip (producing a list of tuples):

In [158]: x=list(range(5))
In [159]: x[1:],x[0:-1]
Out[159]: ([1, 2, 3, 4], [0, 1, 2, 3])
In [160]: list(zip(x[1:],x[0:-1]))
Out[160]: [(1, 0), (2, 1), (3, 2), (4, 3)]

or, to get a list of lists:

In [161]: [list(i) for i in zip(x[1:],x[0:-1])]
Out[161]: [[1, 0], [2, 1], [3, 2], [4, 3]]
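
The zip idea can be extended to any window length L by zipping L shifted slices instead of two. This is a sketch under that assumption; the names shifted and windows are illustrative.

```python
x = list(range(5))
L = 3

# One shifted slice per output column: x[L-1:], ..., x[1:], x[0:].
# zip stops at the shortest slice, x[L-1:], giving len(x) - L + 1 rows.
shifted = [x[k:] for k in range(L - 1, -1, -1)]
windows = [list(t) for t in zip(*shifted)]
print(windows)  # [[2, 1, 0], [3, 2, 1], [4, 3, 2]]
```

Like the two-slice version, this copies the list once per slice, so it makes L copies in total.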

This use of zip is a kind of transpose. NumPy arrays also transpose easily:

In [167]: arr=np.array((x[1:],x[:-1]))
In [168]: arr
Out[168]: 
array([[1, 2, 3, 4],
       [0, 1, 2, 3]])
In [169]: arr.T.tolist()
Out[169]: [[1, 0], [2, 1], [3, 2], [4, 3]]

Note that I had to make two copies of the list. Divakar's stride_tricks approach is the only way of creating overlapping 'windows' without copying. It's a more advanced method.
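
The difference between a view and a copy can be checked with np.shares_memory. The sketch below compares Divakar's two approaches; the variable names view and copy are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(5)
L = 2
n = x.strides[0]

# Strided approach: reinterprets the memory of x, no data is copied.
view = as_strided(x[L-1:], shape=(x.size-L+1, L), strides=(n, -n))
# Broadcasting approach: integer-array indexing allocates a new array.
copy = x[np.arange(x.size-L+1)[:, None] + np.arange(L-1, -1, -1)]

print(np.shares_memory(view, x))  # True
print(np.shares_memory(copy, x))  # False

x[0] = 99                # modify the original array
print(view[0].tolist())  # [1, 99]: the view sees the change
print(copy[0].tolist())  # [1, 0]: the copy does not
```

Because the strided rows alias the same memory, writing through such a view changes several windows at once, which is one reason to treat it as read-only.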

For small lists, I'd suggest sticking with the list approaches. There's overhead associated with creating arrays.

hpaulj