How do I make this for loop that calculates the difference between two times more efficient?

Question

I have a for loop that runs through weeks 1-8 of NFL offensive plays. Within each week, a second loop iterates over each unique play. For each play I take the row marked "ball_snap" and the row marked "pass_forward" and calculate the difference between their timestamps, so that I know how long the QB held the ball. I then append the playId and time_to_throw to two separate lists, and after the loop I build my dataframe from them.

from datetime import datetime

import numpy as np
import pandas as pd

play_list = []
time_list = []
for week in weeks:
    for play in week.playId.unique():
        time_to_throw = 0
        #Extract the playId, and time for each unique play when ball was snapped
        try:
            ball_snap = week[(week['playId'] == play) & 
             (week['event'] == 'ball_snap')].iloc[0, np.r_[1,4]]

        #convert the time recording when ball was snapped to datetime
            ball_snap_time = datetime.strptime(' '.join(ball_snap.time.split('T',1)), "%Y-%m-%d %H:%M:%S.%f")
            print(ball_snap_time)
        #Extract the playId, and time  for each unique play when ball was thrown forward
            pass_forward = week[(week['playId'] == play) & 
             (week['event'] == 'pass_forward')].iloc[0, np.r_[1,4]]
        #convert the time recording when ball was thrown forward to datetime
            pass_forward_time = datetime.strptime((' '.join(pass_forward.time.split('T',1))), "%Y-%m-%d %H:%M:%S.%f")
            print(pass_forward_time)
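        except Exception:
            # Skip plays that lack one of the two events (or have an unparseable time);
            # otherwise the previous play's times would be reused below.
            continue
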
        #Calculate the time it took from ball snap to ball thrown forward
        time_to_throw = (pass_forward_time - ball_snap_time).total_seconds()
        print(time_to_throw)
        #Now fill data into dictionary and convert to dataframe
        play_list.append(play)
        time_list.append(time_to_throw)

df_ttt = pd.DataFrame({'playId': play_list,
                    'time_to_throw': time_list})

I was successful, kind of. This loop is extremely slow and takes upwards of 15 minutes to execute. I understand that it has to loop through each play, and I believe that is my bottleneck. However, I can't think of another way to get the times where event == 'ball_snap' and event == 'pass_forward' for the same play, convert both to datetime, and then find the difference. Or maybe there is one?

Welcome to Stack Overflow. Please do not post an image of the code; post the code itself as text. Please consider reading [How to ask](https://stackoverflow.com/help/how-to-ask). — Jérôme Richard, Dec 12 '22 at 19:46
Wow, you sure are looking far and wide for a matching pass_forward within each week's worth of plays. The week granularity seems far too coarse. It's not like we'll see a snap in one game and then it corresponds to a pass in a different game. Recommend you sort values by game and time. Then focus on a ten-minute window following the snap -- the QB can only hold for so long. This is similar to https://stackoverflow.com/a/74765186/8431111 , where ordering rows brought a multi-hour running time down to just three minutes. [Let us know](https://stackoverflow.com/help/self-answer) how it goes! — J_H, Dec 12 '22 at 20:08
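
For illustration, here is a rough, untested-on-the-real-data sketch of how the per-play loop might be replaced with grouped pandas operations, in the spirit of the comment above about working per game. It assumes each week dataframe has `playId`, `event` and `time` columns as in the question, plus a `gameId` column, which is an assumption since the question does not show one. The function name `time_to_throw_by_play` is made up for this example.

```python
import pandas as pd


def time_to_throw_by_play(week: pd.DataFrame) -> pd.DataFrame:
    """Seconds from 'ball_snap' to 'pass_forward' for every play in one week.

    Assumes columns 'gameId' (hypothetical), 'playId', 'event' and 'time'.
    """
    keys = ["gameId", "playId"]

    # Keep only the rows for the two events of interest.
    events = week.loc[
        week["event"].isin(["ball_snap", "pass_forward"]),
        keys + ["event", "time"],
    ].copy()

    # Parse every timestamp in one vectorized call instead of strptime per play.
    events["time"] = pd.to_datetime(events["time"])

    # Earliest timestamp of each event per (game, play). The same event may be
    # repeated on many rows of a play, so the rows are reduced with min().
    snap = events[events["event"] == "ball_snap"].groupby(keys)["time"].min()
    throw = events[events["event"] == "pass_forward"].groupby(keys)["time"].min()

    # Subtraction aligns on (gameId, playId); plays missing either event
    # become NaT and are dropped.
    seconds = (throw - snap).dropna().dt.total_seconds()
    return seconds.rename("time_to_throw").reset_index()
```

If this fits the data, the eight weeks could then be combined with something like `df_ttt = pd.concat([time_to_throw_by_play(w) for w in weeks], ignore_index=True)`. Grouping on both `gameId` and `playId` is a deliberate choice here: it avoids matching a snap with a pass from a different game that happens to share a play number.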


0 Answers