I have a for loop that runs through weeks 1-8 of NFL offensive plays, and within each week a second loop that goes through each unique play. For each play I take the row marked "ball_snap" and the row marked "pass_forward" and calculate the difference between their timestamps, so that I know how long the QB held the ball. I then append the playId and time_to_throw to two separate lists, and outside the loop those lists become my dataframe.
from datetime import datetime

import numpy as np
import pandas as pd

# `weeks` is the list of per-week DataFrames (weeks 1-8) loaded earlier.
play_list = []
time_list = []
for week in weeks:
    for play in week.playId.unique():
        try:
            # Extract the playId and time for this play when the ball was snapped
            ball_snap = week[(week['playId'] == play) &
                             (week['event'] == 'ball_snap')].iloc[0, np.r_[1, 4]]
            # Convert the time recorded when the ball was snapped to datetime
            ball_snap_time = datetime.strptime(' '.join(ball_snap.time.split('T', 1)), "%Y-%m-%d %H:%M:%S.%f")
            print(ball_snap_time)
            # Extract the playId and time for this play when the ball was thrown forward
            pass_forward = week[(week['playId'] == play) &
                                (week['event'] == 'pass_forward')].iloc[0, np.r_[1, 4]]
            # Convert the time recorded when the ball was thrown forward to datetime
            pass_forward_time = datetime.strptime(' '.join(pass_forward.time.split('T', 1)), "%Y-%m-%d %H:%M:%S.%f")
            print(pass_forward_time)
        except (IndexError, ValueError):
            # The play has no 'ball_snap' or 'pass_forward' row, or a timestamp
            # could not be parsed. Skip it rather than reuse the timestamps
            # left over from the previous play.
            continue
        # Calculate the time it took from ball snap to ball thrown forward
        time_to_throw = (pass_forward_time - ball_snap_time).total_seconds()
        print(time_to_throw)
        # Collect the results for this play
        play_list.append(play)
        time_list.append(time_to_throw)

# Build the DataFrame once, after both loops have finished
df_ttt = pd.DataFrame({'playId': play_list,
                       'time_to_throw': time_list})
I was successful, kind of. The loop is extremely slow and takes upwards of 15 minutes to execute. I understand that it has to go through each play, and I believe that is my bottleneck. However, I can't think of another way to get the times where event == 'ball_snap' and event == 'pass_forward' for the same play, convert both to datetime, and then find the difference. Or maybe there is?
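For reference, one possible loop-free direction is sketched below. It is only a sketch under some assumptions: each weekly frame has string columns named `playId`, `event` and `time`, the timestamps are ISO-style strings that `pd.to_datetime` can parse, and both events occur at least once per week. The helper name `time_to_throw_per_play` and the toy data are made up for illustration.

```python
import pandas as pd


def time_to_throw_per_play(week: pd.DataFrame) -> pd.DataFrame:
    """Seconds from 'ball_snap' to 'pass_forward' for each play in one week (hypothetical helper)."""
    # Keep only the rows for the two events of interest.
    mask = week["event"].isin(["ball_snap", "pass_forward"])
    events = week.loc[mask, ["playId", "event", "time"]].copy()
    # Parse every timestamp in one vectorized call instead of one strptime per row.
    events["time"] = pd.to_datetime(events["time"])
    # First timestamp per (playId, event), then one column per event.
    first = events.groupby(["playId", "event"])["time"].first().unstack("event")
    # Plays missing either event are dropped.
    first = first.dropna(subset=["ball_snap", "pass_forward"])
    seconds = (first["pass_forward"] - first["ball_snap"]).dt.total_seconds()
    return seconds.rename("time_to_throw").reset_index()


# Toy data standing in for one week (values are invented for this example).
toy_week = pd.DataFrame({
    "playId": [75, 75, 75, 75, 146, 146, 146],
    "event": ["ball_snap", "ball_snap", "None", "pass_forward",
              "ball_snap", "None", "None"],
    "time": ["2018-09-07T01:07:14.599", "2018-09-07T01:07:14.599",
             "2018-09-07T01:07:15.599", "2018-09-07T01:07:17.099",
             "2018-09-07T01:10:00.000", "2018-09-07T01:10:01.000",
             "2018-09-07T01:10:02.000"],
})
weeks = [toy_week]

# One result frame for all weeks, built without a per-play Python loop.
df_ttt = pd.concat([time_to_throw_per_play(w) for w in weeks], ignore_index=True)
print(df_ttt)
```

In this toy example play 146 never has a 'pass_forward' row, so it is left out of the result instead of getting a stale or zero value. Whether this is actually faster on the real data would need to be measured.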