Numpy array counter with reset

Question

I have a numpy array with only -1, 1 and 0, like this:

np.array([1,1,-1,-1,0,-1,1])

I would like a new array that counts the -1 encountered. The counter must reset when a 0 appears and remain the same when it's a 1:

Desired output:

np.array([0,0,1,2,0,1,1])

The solution must be fast when used with larger arrays (up to 100,000 elements).

Edit: Thanks for your contributions, I have a working solution for now.

I'm still looking for a non-iterative way to solve it (no for loop). Maybe with a pandas Series and the cumsum() method?

Please add how large the target array is to your question. – Amin Taghikhani Dec 09 '21 at 07:11

tdy · Accepted Answer · 2021-12-09T07:41:22.080

Maybe with a pandas Series and the cumsum() method?

Yes, use Series.cumsum and Series.groupby:

import pandas as pd

s = pd.Series([1, 1, -1, -1, 0, -1, 1])

s.eq(-1).groupby(s.eq(0).cumsum()).cumsum().to_numpy()
# array([0, 0, 1, 2, 0, 1, 1])

Step-by-step

Create pseudo-groups that reset when equal to 0:

groups = s.eq(0).cumsum()
# groups.to_numpy() -> array([0, 0, 0, 0, 1, 1, 1])

Then groupby these pseudo-groups and cumsum when equal to -1:

s.eq(-1).groupby(groups).cumsum().to_numpy()
# array([0, 0, 1, 2, 0, 1, 1])
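
The same "cumsum with reset" idea can also be sketched in plain NumPy, without pandas. This is not part of the original answer: the function name count_with_reset is hypothetical, and the approach swaps the groupby for np.maximum.accumulate. It takes the running count of -1 values, then subtracts the value that running count had at the most recent 0. Because the running count never decreases, a running maximum over the counts recorded at zeros gives exactly that baseline.

```python
import numpy as np


def count_with_reset(a):
    """Count -1 values cumulatively, resetting to 0 at every 0.

    Hypothetical helper (not from the original answer); assumes the
    input contains only integers, with -1 counted and 0 resetting.
    """
    a = np.asarray(a)
    # Running count of -1 values over the whole array, ignoring resets.
    running = np.cumsum(a == -1)
    # Running count recorded at each 0 (and 0 elsewhere), carried forward.
    # `running` is non-decreasing, so the running maximum equals the count
    # at the most recent 0, or 0 if no 0 has been seen yet.
    base = np.maximum.accumulate(np.where(a == 0, running, 0))
    return running - base


a = np.array([1, 1, -1, -1, 0, -1, 1])
print(count_with_reset(a))
# [0 0 1 2 0 1 1]
```

Whether this is faster than the groupby version depends on the array and the machine, so it is worth timing on your own data.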

Timings

not time consuming when used with larger array (up to 100,000)

groupby + cumsum is ~8x faster than looping, given np.random.choice([-1, 0, 1], size=100_000):

%timeit series_cumsum(a)
# 3.29 ms ± 721 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit miki_loop(a)
# 26.5 ms ± 925 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit skyrider_loop(a)
# 26.8 ms ± 1.36 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
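
The functions timed above are not shown in the answer. A minimal sketch of a comparable harness follows, assuming series_cumsum simply wraps the one-liner above. plain_loop is a hypothetical stand-in for the loop-based answers, not the exact code that was timed. It uses the standard-library timeit module instead of the %timeit magic, and the numbers it prints will differ from those quoted above.

```python
import timeit

import numpy as np
import pandas as pd


def series_cumsum(a):
    # Assumed to be the pandas one-liner from this answer, wrapped in a function.
    s = pd.Series(a)
    return s.eq(-1).groupby(s.eq(0).cumsum()).cumsum().to_numpy()


def plain_loop(a):
    # Hypothetical stand-in for the loop-based answers in this thread.
    out = []
    count = 0
    for x in a:
        if x == -1:
            count += 1
        elif x == 0:
            count = 0
        out.append(count)
    return np.array(out)


rng = np.random.default_rng(0)
a = rng.choice([-1, 0, 1], size=100_000)

# Both approaches should agree before their timings are compared.
assert (series_cumsum(a) == plain_loop(a)).all()

for fn in (series_cumsum, plain_loop):
    # Best of 3 repeats, 5 calls each, reported per call.
    seconds = min(timeit.repeat(lambda: fn(a), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```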

Fatemeh Sangin · Answer 2 · 2021-12-09T10:11:42.457

1

Let's first save your numpy array in a variable:

a = np.array([1,1,-1,-1,0,-1,1])

I define a variable, count, to hold the value you care about, and set it to zero. Then I define a list, l, to hold the new elements. Then I iterate over the elements of a, naming each element i. Inside each iteration, I implement the logic:

if i is -1, increase the counter;
else, if i is 0, reset the counter;
otherwise, do nothing.

Finally, at the end of each iteration, I append the counter to l. After the loop, I convert l to a numpy array, out.

l = []
count = 0
for i in a:
    if i == -1:
        count += 1
    elif i == 0:
        count = 0
    l.append(count)
out = np.array(l)
out

edited Dec 09 '21 at 10:11

answered Dec 09 '21 at 07:46

While this code may answer the question, [including an explanation](https://meta.stackoverflow.com/questions/392712/explaining-entirely-code-based-answers) of how or why this solves the problem would really help to improve the quality of your post. Remember that you are answering the question for readers in the future, not just the person asking now. Please [edit] your answer to add explanations and give an indication of what limitations and assumptions apply. – ppwater Dec 09 '21 at 09:25

dear @ppwater, Is it better now? – Fatemeh Sangin Dec 09 '21 at 10:12

score 1 · Answer 3 · answered Dec 09 '21 at 11:50

With numba, I seem to get a 10x speedup over the pandas solution on this benchmark:

import numpy as np
import pandas as pd
from numba import jit

inp1 = np.array([1,1,-1,-1,0,-1,1], dtype=int)
inp2 = np.random.randint(-1, 10, size=10**6)

@jit
def with_numba(arr):
  val = 0
  put = np.zeros_like(arr)
  for i in range(arr.size):
    if arr[i] == -1:
      val += 1
    elif arr[i] == 0:
      val = 0
    put[i] = val

  return put

def with_pandas(inp):
  s = pd.Series(inp)
  return s.eq(-1).groupby(s.eq(0).cumsum()).cumsum().to_numpy()
  
assert (with_numba(inp1) == with_pandas(inp1)).all()
assert (with_numba(inp2) == with_pandas(inp2)).all()

%timeit with_numba(inp2)
# 100 loops, best of 5: 4.57 ms per loop
%timeit with_pandas(inp2)
# 10 loops, best of 5: 46.3 ms per loop

Skyrider Feyrs · Answer 4 · 2021-12-09T06:37:06.820

0

Use a for loop. Set a variable which starts at 1 and reset it each time you encounter a different number. For example:

npArray = [1, 1, -1, -1, 0, -1, 1]  # sample input from the question
counter = 1
outputArray = []
for number in npArray:
    if number == -1:
        outputArray.append(counter)
        counter += 1
    elif number == 1:
        outputArray.append(0)
    else:
        outputArray.append(0)
        counter = 1
print(outputArray)

edited Dec 09 '21 at 06:37

answered Dec 09 '21 at 06:26

Your solution won't work. When 1 is encountered the counter must stay constant, but your solution will append a new 0 to the outputArray. – Lénis Parge Dec 09 '21 at 06:33
If that's a problem, please edit the question to include that. – Skyrider Feyrs Dec 09 '21 at 06:34
This code won't work if npArray is like [-1,-1,1,-1,-1]: the output will be [1, 2, 0, 1, 2] but it must be [1,2,0,3,4], if I get the question right. – Miki Dec 09 '21 at 06:35
Now I get it. I'll edit :) – Skyrider Feyrs Dec 09 '21 at 06:36
Yes, that's right @Miki, and thanks Skyrider. – Lénis Parge Dec 09 '21 at 06:37
There, I've fixed that. – Skyrider Feyrs Dec 09 '21 at 06:37

Miki · Answer 5 · 2021-12-09T07:03:33.910

0

Here is a fix for @skyrider's code:

npArray = [1,1,-1,-1,0,-1,1]
counter = 0
outputArray = []
for number in npArray:
    if number == -1:
        counter += 1
        outputArray.append(counter)
    elif number == 0:
        outputArray.append(0)
        counter = 0
    else:
        outputArray.append(counter)
print(outputArray)

edited Dec 09 '21 at 07:03

answered Dec 09 '21 at 06:37

The problem is when 1 is encountered in the middle: the counter must stay constant, but your solution will append a new 0 to the outputArray instead. – Lénis Parge Dec 09 '21 at 06:48
What do you mean by constant? Do you mean that when 1 is encountered in the middle it should not be included, or...? – Miki Dec 09 '21 at 06:51
What I mean is that when 1 is encountered, the current state of the counter must be appended. np.array([1,1,-1,-1,0,-1,1]) becomes: np.array([0,0,1,2,0,1,1]) – Lénis Parge Dec 09 '21 at 06:56
It's more like a cumsum with reset when 0 appears. – Lénis Parge Dec 09 '21 at 07:02
OK, so I think it's fixed now; check it. – Miki Dec 09 '21 at 07:06
