New to Python here and trying to see if there is a more elegant solution.
I have time series data from telematics devices that includes a motion indicator. I need to expand the motion indicator to +/- 1 row around the actual motion start and stop (shown in the motion2 column below). I was doing this in SQL using case statements and the lead and lag window functions, and I am now trying to convert my code to Python.
Here is the data.
import pandas as pd
data = {'device':[1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2],
'time':[1,2,3,4,5,6,7,8,9,10,11,12,5,6,7,8,9,10,11,12,13,14],
'motion':[0,0,1,1,1,0,0,0,1,1,0,0,0,0,0,1,1,1,0,1,0,0]}
df = pd.DataFrame.from_dict(data)
df = df[['device','time','motion']]
##sort data chronologically for each device
df.sort_values(['device','time'], ascending = True, inplace = True)
This is what df looks like:
device, time, motion
1,1,0
1,2,0
1,3,1
1,4,1
1,5,1
1,6,0
1,7,0
1,8,0
1,9,1
1,10,1
1,11,0
1,12,0
2,5,0
2,6,0
2,7,0
2,8,1
2,9,1
2,10,1
2,11,0
2,12,1
2,13,0
2,14,0
What I need is the motion2 column below added to the data frame.
device, time, motion, motion2
1,1,0,0
1,2,0,1
1,3,1,1
1,4,1,1
1,5,1,1
1,6,0,1
1,7,0,0
1,8,0,1
1,9,1,1
1,10,1,1
1,11,0,1
1,12,0,0
2,5,0,0
2,6,0,0
2,7,0,1
2,8,1,1
2,9,1,1
2,10,1,1
2,11,0,1
2,12,1,1
2,13,0,1
2,14,0,0
Below is the Python code that works. However, I am wondering if there is a more elegant way.
##create new columns for prior and next motion indicator
df['prev_motion'] = df.groupby(['device'])['motion'].shift(1)
df['next_motion'] = df.groupby(['device'])['motion'].shift(-1)
##create the desired motion2 indicator to expand +/- 1 record of the motion start and stop
df['motion2'] = df[['prev_motion', 'motion', 'next_motion']].apply(
    lambda row: 1 if row['motion']==1 else (1 if row['prev_motion']==1 or row['next_motion']==1 else 0),
    axis=1)
##drop unwanted columns
df.drop(columns=['prev_motion', 'next_motion'], inplace = True)
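For comparison, here is a minimal sketch of one possibly tidier variant of the same logic. It keeps the per-device shift(1) and shift(-1) idea but combines the three conditions with vectorized comparisons instead of a row-wise apply, so no helper columns are needed. The name `g` is only illustrative, and the sketch assumes motion only ever holds 0 or 1.

```python
import pandas as pd

# Same sample data as above
data = {'device': [1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2],
        'time':   [1,2,3,4,5,6,7,8,9,10,11,12,5,6,7,8,9,10,11,12,13,14],
        'motion': [0,0,1,1,1,0,0,0,1,1,0,0,0,0,0,1,1,1,0,1,0,0]}
df = pd.DataFrame(data).sort_values(['device', 'time'])

# Per-device view of the motion column (illustrative name)
g = df.groupby('device')['motion']

# shift() leaves NaN on each device's first/last row; NaN.eq(1) is False,
# so those edges never count as motion from a neighbouring device
df['motion2'] = (
    df['motion'].eq(1) | g.shift(1).eq(1) | g.shift(-1).eq(1)
).astype(int)

print(df)
```

Because the shifts happen inside the groupby, the last row of device 1 and the first row of device 2 do not influence each other.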
This was much easier in SQL using a case statement and window functions (lead and lag).
case
when motion = 1 then 1
when motion = 0 and (lead(motion) over (partition by device order by time) = 1) then 1
when motion = 0 and (lag(motion) over (partition by device order by time) = 1) then 1
else 0
end as motion2
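The case expression above amounts to "1 if the current, previous, or next row within the device has motion = 1". Under the assumption that motion is strictly 0/1, that can be sketched in pandas as a centered rolling maximum over a 3-row window per device. This is one possible translation, not necessarily the most idiomatic one, and the window size of 3 is simply the +/- 1 row from the question.

```python
import pandas as pd

# Same sample data as above
data = {'device': [1,1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2],
        'time':   [1,2,3,4,5,6,7,8,9,10,11,12,5,6,7,8,9,10,11,12,13,14],
        'motion': [0,0,1,1,1,0,0,0,1,1,0,0,0,0,0,1,1,1,0,1,0,0]}
df = pd.DataFrame(data).sort_values(['device', 'time'])

# Centered window of 3 rows = lag, current row, lead.
# min_periods=1 lets the first/last row of each device use a shorter window.
df['motion2'] = (
    df.groupby('device')['motion']
      .transform(lambda s: s.rolling(3, center=True, min_periods=1).max())
      .astype(int)
)

print(df)
```

Like lead and lag, the window here is counted in rows rather than in time units, so a gap in the time column is not treated specially.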