Efficiently find closest points to track in space & time on gridded data

Question

Summary/simplified version

Given a list of track points defined by three 1-dimensional arrays (lats, lons and dtime all with same length) and a gridded 3-dimensional array rr (defined by 2-D lat_radar, lon_radar coordinate arrays and a 1-dimensional time array time_radar) I want to extract all the grid values in rr where the coordinates (latitude, longitude AND time included) are closest to the three 1-dimensional arrays.

I've managed to use cKDTree to select points in space but I don't know how to generalize the solution to space & time together. Right now I have to do the selection on time separately and it makes the code quite bulky and hard to read.

for more details about this problem see hereinafter

Extended version

I'm trying to develop an app that uses precipitation data obtained from weather radar composites to predict the precipitation along a track. Most apps usually predict the precipitation at a point without considering the point moving in time.

The idea is, given points identifying a track in space and time, find the closest grid points from radar data to obtain a precipitation estimate over the track (see plot). The final goal would be to shift the start time to identify the best time to leave to avoid rain.

I just optimized my previous algorithm, that was using plain loops, to use cKDTree from scipy. Execution time went down from 30s to 380ms :). However I think the code can still be optimized. Here is my attempt.

As input we have

lons, lats: coordinates of the track as N-dimensional arrays
dtime: timedelta T-dimensional array containing the time elapsed on the track
lon_radar, lat_radar: M x P matrices containing the coordinates of the radar data
dtime_radar: timedelta Q-dimensional array containing the radar forecast
rr: M x P X Q array containing the radar forecast at every time step

First find the grid points closest to the trajectory using cKDTree:

combined_x_y_arrays = np.dstack([lon_radar.ravel(), 
                                 lat_radar.ravel()])[0]
points_list = list(np.vstack([lons, lats]).T)

def do_kdtree(combined_x_y_arrays, points):
    mytree = cKDTree(combined_x_y_arrays)
    dist, indexes = mytree.query(points)
    return indexes

results = do_kdtree(combined_x_y_arrays, points_list)
# As we have many duplicates, since the itinerary has a much higher resolution than the radar,
# we only select the unique points 
inds_itinerary = np.unique(results)
lon_lat_itinerary = combined_x_y_arrays[inds_itinerary]

then find the closest points in the track to subset it. It doesn't make sense to have a track resolution of 10 m if the radar only has grid points every km.

combined_x_y_arrays = np.vstack([lons, lats]).T
points_list = list(lon_lat_itinerary)

results = do_kdtree(combined_x_y_arrays, points_list)

Now we can use these positions to get the elapsed time on the trajectory and the relative time steps in radar data

dtime_itinerary = dtime[results]
# find indices of these dtimes in radar dtime 
inds_dtime_radar = np.abs(np.subtract.outer(dtime_radar, dtime_itinerary)).argmin(0)

Now we have everything that we need to find the precipitation so we only need one last loop. I also loop on shifts to obtain prediction with different start times.

shifts = (1, 3, 5, 7, 9)

rain = np.empty(shape=(len(shifts), len(inds_itinerary)))
for i, shift in enumerate(shifts):
    temp = []
    for i_time, i_space in zip(inds_dtime_radar, inds_itinerary):
        temp.append(rr[i_time+shift].ravel()[i_space])
    rain[i, :] = temp

In particular I would like to find a way to combine the time search with the lat-lon search for the closest points.

Could you rephrase your question to be more precise and specific? — Serial Lazer, Nov 10 '20 at 11:25
More specific than this? :) I posted a picture and the full working code snippet... Honestly I think already now the question is too long and with too many details. Which parts should I elaborate more? — Droid, Nov 10 '20 at 12:41
What I mean is, quite the opposite. Your question is lost in too many details. Can you rephrase it to ask a specific question instead? — Serial Lazer, Nov 10 '20 at 12:55
Makes sense. I've added a summarized version of the question at the top. I'm not confident in deleting also the detailed part so I will leave it there for now :) — Droid, Nov 10 '20 at 14:37
I think I found out why this is so complicated...https://stackoverflow.com/questions/18135476/datetime-as-a-dimension-in-python-kdtree, https://stackoverflow.com/questions/788005/is-a-kd-tree-suitable-for-4d-space-time-data-x-y-z-time. So probably my question is related to these — Droid, Nov 11 '20 at 08:37

Efficiently find closest points to track in space & time on gridded data

0 Answers0