I am running a very large function on a very large file (30 GB). Because Python is slow, I decided to try implementing this function in numba. After some initial reading, it seems that numba's datetime support is very limited: it only works with np.datetime64 and np.timedelta64 values, and only supports very basic operations on them.
One of the columns in the file is a datetime column. I need to check whether the day has changed (a new day is defined as starting at 5:00 pm in the timezone of the dataset) and perform some operations when it has. Unfortunately, I have not found a clean way to run this check directly on the numpy datetime64 values, and I was wondering if there is a way to do this.
Currently, the function takes separate integer arrays for year, month, week, weekday, day, hour, minute, and second, and that is how I work with time inside the numba function, which is very inefficient.
# What I have right now:
import numba as nb

@nb.jit
def check(hour):
    # Compare each hour with the previous one to detect the day boundary.
    for i in range(1, len(hour)):
        if hour[i-1] == 4 and hour[i] == 5:
            pass  # run code
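# What I would like (timestamp is a numpy datetime64 array):
@nb.jit
def check(timestamp):
    # hour() is hypothetical: something that extracts the hour from a datetime64 value.
    for i in range(1, len(timestamp)):
        if hour(timestamp[i-1]) == 4 and hour(timestamp[i]) == 5:
            pass  # run code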
The goal is to get the same result as my current code, without the function needing separate integer arrays for each time component.
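One direction that might work, shown here only as a minimal sketch: reinterpret the datetime64 array as int64 seconds since the epoch before passing it in, and do plain integer arithmetic inside the jitted function. This assumes the timestamps are naive values already expressed in the dataset's local time, that the array can be cast to second resolution, and that 5:00 pm means hour 17 on a 24-hour clock. The names `day_changed`, `hour_of`, and `ROLLOVER_HOUR` are made up for illustration, and the sample timestamps are invented.

```python
import numpy as np

try:
    import numba as nb
    jit = nb.njit
except ImportError:  # fall back to plain Python so the sketch still runs
    def jit(func):
        return func

SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 86400
ROLLOVER_HOUR = 17  # assumption: 5:00 pm on a 24-hour clock


@jit
def hour_of(seconds_value):
    """Hour of day (0-23) for one timestamp given as seconds since the epoch."""
    return (seconds_value // SECONDS_PER_HOUR) % 24


@jit
def day_changed(seconds):
    """Flag rows whose 5 pm-to-5 pm day differs from the previous row's."""
    changed = np.zeros(len(seconds), dtype=np.bool_)
    offset = ROLLOVER_HOUR * SECONDS_PER_HOUR
    for i in range(1, len(seconds)):
        # Shifting by the rollover offset makes each "day" start at 5 pm.
        prev_day = (seconds[i - 1] - offset) // SECONDS_PER_DAY
        curr_day = (seconds[i] - offset) // SECONDS_PER_DAY
        if curr_day != prev_day:
            changed[i] = True  # the day-change code would run here
    return changed


# Invented sample data, only to show the call pattern.
timestamp = np.array(
    [
        "2021-03-01T16:59:59",
        "2021-03-01T17:00:00",
        "2021-03-02T09:30:00",
        "2021-03-02T17:05:00",
    ],
    dtype="datetime64[s]",
)

# Cast to second resolution, then reinterpret the same bytes as int64.
seconds = timestamp.astype("datetime64[s]").view(np.int64)
flags = day_changed(seconds)
```

Comparing the shifted day index, rather than checking for consecutive hours, also catches a day change when the data has gaps around 5 pm. Only one array is passed into the function, and the other components (minute, second, weekday) could be derived from the same integer with similar arithmetic.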