I am new to python and pandas. I am trying to assign new session IDs for around 2270 users, based on the time difference between the timestamps. If the time difference exceeds 4 hours, I want a new session ID. Otherwise, it would have to remain the same. In the end, I want a modified data frame with the new session ID column. Here is what I have so far:
Eh_2016["NewSessionID"] = 1 #Initialize 'NewSessionID' column in df with 1
Eh_2016['elapsed'] = datetime.time(0,0,0,0) #Create an empty elapsed to calculate Time diff later
users = Eh_2016['Username'].unique() #find the number of unique Usernames
for user in users: #start of the loop
idx = Eh_2016.index[Eh_2016.Username == user] #Slice the original df
temp = Eh_2016[Eh_2016.Username == user] #Create a temp placeholder for the slice
counter = 1 # Initialize counter for NewSessionIDs
for i,t in enumerate(temp['Timestamp']): #Looping for each timestamp value
if i != 0 :
temp['elapsed'].iloc[i] = (t - temp['Timestamp'].iloc[i-1]) #Calculate diff btwn timestamps
if temp['elapsed'].iloc[i] > datetime.timedelta(hours = 4): #If time diff>4
counter +=1 #Increase counter value
temp['NewSessionID'].iloc[i]=counter #Assign new counter value as NewSessionID
else:
temp['NewSessionID'].iloc[i] = counter #Retain previous sessionID
Eh_2016.loc[idx,:]= temp #Replace original df with the updated slice
Any help on how to make this faster would be greatly appreciated! Let me know if you need more details. Thanks in advance.
Edit: Sample DF
Username Timestamp NewSessionID Elapsed
126842 1095513 2016-06-30 20:58:30.477 1 00:00:00
126843 1095513 2016-07-16 07:54:47.986 2 15 days 10:56:17.509000
126844 1095513 2016-07-16 07:54:47.986 2 0 days 00:00:00
126845 1095513 2016-07-16 07:55:10.986 2 0 days 00:00:23
126846 1095513 2016-07-16 07:55:13.456 2 0 days 00:00:02.470000
... ... ... ...
146920 8641894 2016-08-11 22:26:14.051 31 0 days 04:50:23.415000
146921 8641894 2016-08-11 22:26:14.488 31 0 days 00:00:00.437000
146922 8641894 2016-08-12 20:01:02.419 32 0 days 21:34:47.931000
146923 8641894 2016-08-23 10:19:05.973 33 10 days 14:18:03.554000
146924 8641894 2016-09-25 11:30:35.540 34 33 days 01:11:29.567000