I have a df containing rows (sometimes thousands) of data, corresponding to a digital signal. I have added an extra column using:
df['On/Off'] = np.where(df[col] > value, 'On', 'Off')
to label the signal as being on or off (value is set depending on the signal source). The following code gives an example dataframe, albeit without actual measurement data:
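import numpy as np
import pandas as pd

# Example dataframe: 50 samples, all initially labelled "Off"
df = pd.DataFrame({"Time/s": np.arange(0, 100, 2),
                   "On/Off": "Off"})
# .loc accepts label slices (end label included); .at is meant for single values
df.loc[10:13, "On/Off"] = "On"
df.loc[40:43, "On/Off"] = "On"
df.loc[47:, "On/Off"] = "On"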
I want to count how many times the signal registers as being on. For the above code, the result would be 2 (ideally with an index returned).
Given how the dataframe is organised, I think going down the rows and looking for pairs of rows where column On/Off reads as 'On' at row n, then 'Off' at row n+1 should be the approach, as in:
i = 0  # <--- number of on/off pairings
for n in range(len(df) - 1):
    # row n is 'On' and row n+1 is 'Off'
    if df["On/Off"].iloc[n] == "On" and df["On/Off"].iloc[n + 1] == "Off":
        i += 1
My current plan came from an answer to this question (Pandas iterate over DataFrame row pairs).
I think df.shift() offers a potential route, generating 2 dataframes and then comparing rows for mismatches, but it feels like there could be a simpler way, possibly using itertools or df.iterrows() (etc.).
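A minimal sketch of the df.shift() idea, assuming the example dataframe from above is rebuilt with .loc. Each row is compared with the row after it, so no second dataframe or explicit loop is needed. The names state, next_state, on_to_off and off_to_on are only illustrative, not part of any existing code:

```python
import numpy as np
import pandas as pd

# Rebuild the example dataframe (label slices with .loc include the end label).
df = pd.DataFrame({"Time/s": np.arange(0, 100, 2), "On/Off": "Off"})
df.loc[10:13, "On/Off"] = "On"
df.loc[40:43, "On/Off"] = "On"
df.loc[47:, "On/Off"] = "On"

state = df["On/Off"]
next_state = state.shift(-1)  # value of the following row; missing for the last row

# Row n is 'On' and row n+1 is 'Off': the signal switches off after row n.
on_to_off = (state == "On") & (next_state == "Off")
# Row n is 'Off' and row n+1 is 'On': the signal switches on after row n.
off_to_on = (state == "Off") & (next_state == "On")

n_on_to_off = int(on_to_off.sum())
n_off_to_on = int(off_to_on.sum())
on_to_off_rows = df.index[on_to_off.to_numpy()].tolist()
off_to_on_rows = df.index[off_to_on.to_numpy()].tolist()

print(n_on_to_off, on_to_off_rows)
print(n_off_to_on, off_to_on_rows)
```

For the example dataframe this finds 2 On-to-Off pairs (rows 13 and 43) and 3 Off-to-On pairs (rows 9, 39 and 46). The counts differ because the last On block runs to the end of the data and never switches off, so which mask to use depends on whether an unfinished On period should count.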
As usual, any help is greatly appreciated.