Return values from a list where difference != 2

Question

I have a list e.g. my_list = [1, 3, 5, 7, 14, 16, 18, 22, 28, 30, 32, 41, 43]

I want a function that returns every value in the list whose difference from the previous value is not equal to 2, e.g. the function will return [1, 14, 22, 28, 41] for the above list. Note that the first value of my_list will always appear as the first value of the output. The input lists are non-empty, with lengths up to the order of hundreds.

So far I have this:

def get_output(array):
    start = [array[0]]
    for i in range(1, len(array)):  # include the last element in the check
        if (array[i] - array[i-1]) != 2:
            start.append(array[i])

    return start

Is there a vectorised solution that would be faster, bearing in mind I will be applying this function to thousands of input arrays?

@AzatIbrakov that's what I want it to return. First element of output is always first element of input. — Imran, Sep 12 '17 at 07:09
To vectorize your function you need to use numpy. Maybe [this](https://stackoverflow.com/questions/35215161/most-efficient-way-to-map-function-over-numpy-array) may help. — RedEyed, Sep 12 '17 at 07:15

juanpa.arrivillaga · Accepted Answer · 2017-09-12T07:40:45.450

5

To avoid the comparatively inefficient np.concatenate, use np.ediff1d instead of np.diff. np.ediff1d takes a to_begin argument whose value is prepended to the result:

>>> import numpy as np
>>> my_list = [1, 3, 5, 7, 14, 16, 18, 22, 28, 30, 32, 41, 43]
>>> arr = np.array(my_list)
>>> np.ediff1d(arr, to_begin=0)
array([0, 2, 2, 2, 7, 2, 2, 4, 6, 2, 2, 9, 2])

So now, using boolean-indexing:

>>> arr[np.ediff1d(arr, to_begin=0) != 2]
array([ 1, 14, 22, 28, 41])
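
As a minimal sketch, the expression above can be wrapped in a function with the same name as the one in the question. The name get_output comes from the question; the extra inputs in batches are hypothetical sample data added only for illustration. Prepending 0 works here because 0 != 2, so the first element always passes the mask.

```python
import numpy as np


def get_output(values):
    """Return the first value plus every value whose gap to its predecessor is not 2."""
    arr = np.asarray(values)
    # to_begin=0 prepends a difference of 0 for the first element;
    # since 0 != 2, the first element is always kept.
    return arr[np.ediff1d(arr, to_begin=0) != 2]


my_list = [1, 3, 5, 7, 14, 16, 18, 22, 28, 30, 32, 41, 43]
result = get_output(my_list)

# Hypothetical extra inputs, to show the function applied to many lists.
batches = [my_list, [10, 12, 13], [5]]
results = [get_output(b).tolist() for b in batches]
```

Each input is still converted to an array separately, so with thousands of short lists the conversion cost may matter as much as the differencing itself. Timing it on real data is the only way to know.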

edited Sep 12 '17 at 07:40

answered Sep 12 '17 at 07:37

juanpa.arrivillaga


2

or just use boolean indexing: `arr[np.ediff1d(arr, to_begin = 0) != 2]` – Daniel F Sep 12 '17 at 07:39

Julien · Answer 2 · 2017-09-12T07:16:03.233

2

Apart from the first element, which you can add manually (although it doesn't really make sense, as per Azat Ibrakov's comment), you can use np.where:

import numpy as np

a = np.array([1, 3, 5, 7, 14, 16, 18, 22, 28, 30, 32, 41, 43])
a[np.where(a[1:] - a[:-1] != 2)[0] + 1]

array([14, 22, 28, 41])

Adding first element:

[a[0]] + list(a[np.where(a[1:] - a[:-1] != 2)[0] + 1])

[1, 14, 22, 28, 41]

edited Sep 12 '17 at 07:16

answered Sep 12 '17 at 07:10

Julien


Does a[1:] return a copy? I mean, how much memory does a[1:] use? As far as I know, slices return a copy, so it is not memory efficient. – RedEyed Sep 12 '17 at 07:18
No copy: slices are just views in numpy. – Julien Sep 12 '17 at 07:19
Could you prove this (some links)? Because slices of lists are copies, aren't they? – RedEyed Sep 12 '17 at 07:20
1

try it yourself: modify a slice of a np.array, the original will be modified too. – Julien Sep 12 '17 at 07:21
1

Try `a = np.arange(10)`, `b = a[:5]`, `b[0] = 10`, `print(a)` – Daniel F Sep 12 '17 at 07:24
Boolean and lists of indices create a copy, slices create views. – Daniel F Sep 12 '17 at 07:25
@Julien, thanks! I checked it out and you are right! What about python lists, is there a method to make list slices like numpy slices(views)? – RedEyed Sep 12 '17 at 07:26
1

@DanielF it would take some serious hacking with perhaps `struct` to get the underlying buffer for the `list`. This would rely on python-version-specific implementation details. It would also be entirely unsafe, seeing as Python lists are resizable, and the underlying memory would be released and re-allocated somewhere else potentially every time the list resizes. – juanpa.arrivillaga Sep 12 '17 at 07:33
@DanielF, Thanks, I'm not. I heard about memoryview, I'll try it. – RedEyed Sep 12 '17 at 07:33
1

And if you want speed and vectorization, numpy is the way, don't waste time hacking python lists for worse results... – Julien Sep 12 '17 at 07:35
1

I think I created a monster >.< Deleting that comment so no one else tries it. – Daniel F Sep 12 '17 at 07:38
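
The view-versus-copy behaviour discussed in the comments above can be checked directly. This is a small sketch that extends Daniel F's `np.arange(10)` example. The variable names are illustrative, and `np.shares_memory` is used here only as one convenient way to test whether two arrays overlap in memory.

```python
import numpy as np

a = np.arange(10)

# Basic slicing returns a view: writing through b also changes a.
b = a[:5]
b[0] = 10
slice_shares = np.shares_memory(a, b)

# Boolean indexing returns a new array: writing to c leaves a alone.
mask = a > 5
c = a[mask]
c[0] = -1
mask_shares = np.shares_memory(a, c)

# Indexing with a list of indices also returns a new array.
d = a[[1, 2, 3]]
list_shares = np.shares_memory(a, d)

print(a[0], slice_shares, mask_shares, list_shares)
```

So `a[1:] - a[:-1]` in the answer allocates only the array of differences. The two slices themselves are views.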

score 2 · Answer 3 · answered Sep 12 '17 at 07:20

You could use boolean array indexing for NumPy arrays and np.diff to get the difference between values:

>>> my_list = [1, 3, 5, 7, 14, 16, 18, 22, 28, 30, 32, 41, 43]
>>> import numpy as np
>>> my_arr = np.array(my_list)
>>> my_mask = np.ones(my_arr.shape, dtype=bool)  # initial mask
>>> my_mask[1:] = np.diff(my_arr) != 2           # set all elements to False that have a difference of 2
>>> my_arr[my_mask]                              # mask the array
array([ 1, 14, 22, 28, 41])

MSeifert


1

Might be a little more efficient to initialize `my_mask` with `np.empty(my_arr.shape, dtype=bool)`, `my_mask[0] = True`. No need to fill with ones from the start – Daniel F Sep 12 '17 at 07:23
That's right but the solution already is quite long-ish and the benefit will be quite small. But thank you, I didn't think of that! :) – MSeifert Sep 12 '17 at 07:30
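
For reference, the `np.empty` variant suggested in the comment above might look like the sketch below. It reuses the names from the answer. `filtered` is a name introduced here for illustration.

```python
import numpy as np

my_arr = np.array([1, 3, 5, 7, 14, 16, 18, 22, 28, 30, 32, 41, 43])

# np.empty leaves the mask uninitialised, so every entry must be
# assigned before the mask is used.
my_mask = np.empty(my_arr.shape, dtype=bool)
my_mask[0] = True                    # always keep the first element
my_mask[1:] = np.diff(my_arr) != 2   # keep elements whose gap is not 2

filtered = my_arr[my_mask]
```

As MSeifert notes, the saving is likely to be small for arrays of a few hundred elements.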

score 0 · Answer 4 · answered Sep 12 '17 at 13:40

import numpy as np

my_list = [1, 3, 5, 7, 14, 16, 18, 22, 28, 30, 32, 41, 43]
a = np.array(my_list)
output = a[[True] + list(a[1:]-a[:-1] != 2)]
print(output)

FooBar167

