I have the following pandas Series of integers:
from pandas import Series
s = Series([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1])
I want to count how many times the value changes from one element to the next over the series. The following list comprehension gives the expected result:
[i != s[:-1][idx] for idx, i in enumerate(s[1:])]
Out[5]:
[True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
True,
False,
False,
False,
False,
False,
False,
False,
False,
False]
From there I could simply count the number of True values. But this is obviously not the best way to operate on a pandas Series, and I'm adding this in a situation where performance matters, so I tried the following, expecting identical results. The output surprised and confused me.
s[1:].ne(s[:-1])
Out[4]:
0 True
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
10 False
11 False
12 False
13 False
14 False
15 False
16 False
17 False
18 False
19 False
20 False
21 False
22 False
23 False
24 False
25 False
26 False
27 False
28 False
29 False
30 False
31 False
32 False
33 False
34 False
35 False
36 False
37 False
38 False
39 True
dtype: bool
Not only does the output of the Series.ne method make no logical sense to me, it is also longer than either of the inputs, which is especially confusing.
I think this might be related to https://github.com/pandas-dev/pandas/issues/1134
Regardless, I'm curious what I'm doing wrong and what the best way to accomplish this would be.
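For anyone reproducing this, here is a minimal self-contained sketch of the discrepancy. It assumes the default integer index and uses .iloc to make the positional slicing explicit; the names left, right, aligned, and positional are mine, chosen for illustration. My reading is that Series.ne matches elements by index label rather than by position, so the result spans the union of both indexes, which would account for the length of 40. Comparing the underlying arrays gives the positional result I expected.

```python
import pandas as pd

s = pd.Series([1, 2, 3] * 10 + [1] * 10)

left = s.iloc[1:]    # index labels 1..39
right = s.iloc[:-1]  # index labels 0..38

# Label-aligned comparison: labels 0 and 39 exist on only one side,
# so they are compared against a missing value and come out True.
aligned = left.ne(right)

# Positional comparison on the raw arrays, ignoring index labels.
positional = left.to_numpy() != right.to_numpy()

print(len(left), len(right), len(aligned))  # 39 39 40
print(int(aligned.sum()))                   # 2
print(int(positional.sum()))                # 30
```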
tl;dr:
Where s is a pandas.Series of ints:
[i != s[:-1][idx] for idx, i in enumerate(s[1:])] != s[:-1].ne(s[1:]).tolist()
Edit
Thanks, all. From some of the answers below, a possible solution is sum(s.diff().astype(bool)) - 1. However, I'm still curious why the approach above doesn't work.
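As a sanity check, here is a small sketch comparing the diff-based count with two other positional approaches on the same data. The variable names are illustrative only. In the first two variants the leading element has no predecessor, so it shows up as NaN and is counted as a change, which is why 1 is subtracted.

```python
import pandas as pd

s = pd.Series([1, 2, 3] * 10 + [1] * 10)

# diff() yields NaN for the first element; NaN casts to True, hence the -1.
via_diff = int(s.diff().astype(bool).sum()) - 1

# shift() keeps the same index, so ne() compares each value with its predecessor.
via_shift = int(s.ne(s.shift()).sum()) - 1

# Plain NumPy arrays carry no index, so slicing compares purely by position.
values = s.to_numpy()
via_numpy = int((values[1:] != values[:-1]).sum())

print(via_diff, via_shift, via_numpy)  # 30 30 30
```

All three agree with the count of True values from the list comprehension.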