2

I have a gridded weather data set with dimensions 33 x 77 x 77. The first dimension is time and the other two are lat and lon, respectively. I need to interpolate (linear or nearest neighbour) the data to different points (lat & lon) for each time step and write the result to a CSV file. I've used the interp2d function from scipy, and it works for a single time step. As I have many locations, I don't want to loop over time.

Below is the piece of code that I wrote. Can anyone suggest a better method to accomplish the task?

import numpy as np
import pygrib as pg
from scipy.interpolate import interp2d

grb_f = pg.open('20150331/gfs.20150331.grb2')
# NOTE: `tmp` (the sequence of GRIB messages, one per time step) is not defined in this snippet
_, lat2d, lon2d = tmp[0].data(lat1=4, lat2=42, lon1=64, lon2=102)
lat = lat2d[:, 0]
lon = lon2d[0, :]
temp = np.empty((0, lon.shape[0]))
for i in range(len(tmp)):
    dat = tmp[i].data(lat1=4, lat2=42, lon1=64, lon2=102)
    temp = np.concatenate([temp, dat[0] - 273.15], axis=0)  # Kelvin to Celsius
temp1 = temp.reshape(len(tmp), lat.shape[0], lon.shape[0])
x = 77
y = 28  # (many points)
f = interp2d(lon, lat, temp1[0, :, :], kind='linear', copy=False, bounds_error=True)
Z = f(x, y)

EDIT:

Instead of making a 3D matrix, I appended the data vertically to make a data matrix of size 2541 x 77, with lat and lon of size 2541 x 1. The interp2d function then raises an invalid-length error.

f=interp2d(lon,lat, temp1[0,:,:],kind='linear',copy=False,bounds_error=True )

ValueError: Invalid length for input z for non rectangular grid

The lengths of my x, y and z arrays are the same (2541, 2541, 2541), so why does it throw an error? Could anyone explain? Your help will be highly appreciated.

pkv

3 Answers

1

Processing of time series is very easy with RedBlackPy.

 import datetime as dt
 import redblackpy as rb

 index = [dt.date(2018,1,1), dt.date(2018,1,3), dt.date(2018,1,5)]
 lat = [10.0, 30.0, 50.0]
 # create Series object
 lat_series = rb.Series(index=index, values=lat, dtype='float32',
                        interpolate='linear')
 # Now you can access at any key using linear interpolation
 # Interpolation does not create new items in Series
 # It uses neighbours to calculate value inplace when you call getitem
 print(lat_series[dt.date(2018,1,2)]) #prints 20

So, if you just want to write interpolated values to a CSV file, you can iterate over the list of keys you need, call getitem on the Series object for each one, and write the value to the file:

 # generator for dates range
 def date_range(start, stop, step=dt.timedelta(1)):

     it = start - step

     while it < stop:
         it += step
         yield it

 #------------------------------------------------
 # create list for keeping output strings
 out_data = []    
 # create output file
 out_file = open('data.csv', 'w')
 # add head for output table
 out_data.append('Time,Lat\n')

 for date in date_range(dt.date(2018,1,1), dt.date(2018,1,5)):
     out_data.append( '{:},{:}\n'.format(date, lat_series[date]) )

 # write output Series
 out_file.writelines(out_data)
 out_file.close()

In the same way, you can add the Lon data to your processing.
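For readers without RedBlackPy installed, the same idea can be sketched with NumPy and the standard library. This swaps in `np.interp` for the Series object, so it is not the RedBlackPy API; the `index` and `lat` values are the ones from the example above, and the in-memory buffer is an assumption made to keep the sketch free of side effects.

```python
import csv
import datetime as dt
import io

import numpy as np

index = [dt.date(2018, 1, 1), dt.date(2018, 1, 3), dt.date(2018, 1, 5)]
lat = [10.0, 30.0, 50.0]

# Convert dates to ordinal day numbers so np.interp can treat them as x values.
known_days = [d.toordinal() for d in index]
all_days = np.arange(known_days[0], known_days[-1] + 1)
lat_daily = np.interp(all_days, known_days, lat)

# Write to an in-memory buffer here; pass an open file object to write to disk.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Time", "Lat"])
for day, value in zip(all_days, lat_daily):
    writer.writerow([dt.date.fromordinal(int(day)), float(value)])
csv_text = buffer.getvalue()
```

Unlike the Series approach, this computes all daily values up front, which is fine for short ranges.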

0

If the lat and lon are the same for each time step, you could do it using slices and a manual interpolation. For example, to get a 1D array of values at lat = 4.875, lon = 8.4 (you would need to scale these to match your actual grid spacing):

b = a[:,4:6, 8:10]
# axis 1 is lat (fraction 0.875), axis 2 is lon (fraction 0.4)
c = ((b[:,0,0] * 0.6 + b[:,0,1] * 0.4) * 0.125 +
     (b[:,1,0] * 0.6 + b[:,1,1] * 0.4) * 0.875)

You could do it all in one line, but it would be even uglier.
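The slicing idea can be checked on a synthetic array. This is only a sketch: the array `a` and its plane-shaped values are assumptions for illustration, with axis 1 as lat and axis 2 as lon on a unit-spaced grid. Bilinear interpolation reproduces a plane exactly, so the result can be compared with the known value.

```python
import numpy as np

# Synthetic stand-in for the (time, lat, lon) array; values form a plane
# a[t, i, j] = t + 2*i + 3*j, which bilinear interpolation reproduces exactly.
t, i, j = np.meshgrid(np.arange(33), np.arange(77), np.arange(77), indexing="ij")
a = (t + 2.0 * i + 3.0 * j).astype(float)

lat_pt, lon_pt = 4.875, 8.4          # target point in grid-index units
i0, j0 = int(np.floor(lat_pt)), int(np.floor(lon_pt))
wy, wx = lat_pt - i0, lon_pt - j0    # fractional offsets along lat and lon

b = a[:, i0:i0 + 2, j0:j0 + 2]       # 2x2 neighbourhood for every time step
# Interpolate along lon (last axis) first, then along lat.
low = b[:, 0, 0] * (1 - wx) + b[:, 0, 1] * wx
high = b[:, 1, 0] * (1 - wx) + b[:, 1, 1] * wx
c = low * (1 - wy) + high * wy       # shape (33,): one value per time step
```

The lon weights apply to neighbours along the last axis and the lat weights to neighbours along the middle axis; no loop over time is needed because the slice keeps the full time axis.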

EDIT to allow variable lat and lon at each time period.

lat = np.linspace(55.0, 75.0, 33)
lon = np.linspace(1.0, 25.0, 33)
data = np.linspace(18.0, 25.0, 33 * 77 * 77).reshape(33, 77, 77)

# NB for simplicity I map lat to 0-180 and lon to 0-360 rather than -90..+90 and -180..+180
# also need to ensure values on grid lines or edges work ok
lat_frac = lat * 77.0 / 180.0
lat_fr = np.floor(lat_frac).astype(int)
lat_to = lat_fr + 1
lat_frac -= lat_fr

lon_frac = lon * 77.0 / 360.0
lon_fr = np.floor(lon_frac).astype(int)
lon_to = lon_fr + 1
lon_frac -= lon_fr

# lon_frac weights the neighbours along lon, lat_frac the neighbours along lat
data_interp = ((data[:,lat_fr,lon_fr] * (1.0 - lon_frac) +
                data[:,lat_fr,lon_to] * lon_frac) * (1.0 - lat_frac) +
               (data[:,lat_to,lon_fr] * (1.0 - lon_frac) +
                data[:,lat_to,lon_to] * lon_frac) * lat_frac)
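
The advanced-indexing approach can be wrapped into a function that works with real grid coordinates and many query points at once. The following is a minimal sketch, not a tested production routine: the function name `bilinear_all_times`, the synthetic grid, the query points and the output file name are all assumptions for illustration, and the grid coordinates are assumed to be 1-D and ascending.

```python
import numpy as np

def bilinear_all_times(data, lat, lon, qlat, qlon):
    """Bilinearly interpolate data[time, lat, lon] at points (qlat, qlon).

    lat and lon are 1-D ascending grid coordinates; qlat and qlon are 1-D
    arrays of query coordinates inside the grid. Returns shape (n_time, n_points).
    """
    qlat = np.asarray(qlat, dtype=float)
    qlon = np.asarray(qlon, dtype=float)
    # Index of each cell's lower corner, clipped so that i0 + 1 stays valid.
    i0 = np.clip(np.searchsorted(lat, qlat, side="right") - 1, 0, len(lat) - 2)
    j0 = np.clip(np.searchsorted(lon, qlon, side="right") - 1, 0, len(lon) - 2)
    # Fractional position of each query point inside its cell.
    wy = (qlat - lat[i0]) / (lat[i0 + 1] - lat[i0])
    wx = (qlon - lon[j0]) / (lon[j0 + 1] - lon[j0])
    # Advanced indexing picks the four corners for all time steps at once.
    low = data[:, i0, j0] * (1 - wx) + data[:, i0, j0 + 1] * wx
    high = data[:, i0 + 1, j0] * (1 - wx) + data[:, i0 + 1, j0 + 1] * wx
    return low * (1 - wy) + high * wy

# Hypothetical grid matching the question's 33 x 77 x 77 shape and bounds.
lat = np.linspace(4.0, 42.0, 77)
lon = np.linspace(64.0, 102.0, 77)
tt, la, lo = np.meshgrid(np.arange(33), lat, lon, indexing="ij")
data = tt + 2.0 * la + 3.0 * lo           # synthetic plane, not real weather data

qlat = np.array([28.0, 10.3, 42.0])
qlon = np.array([77.0, 64.0, 101.9])
result = bilinear_all_times(data, lat, lon, qlat, qlon)   # shape (33, 3)

# One CSV row per time step, one column per query point.
np.savetxt("interpolated.csv", result, delimiter=",", fmt="%.4f")
```

Query points outside the grid are not rejected here; they would be extrapolated from the nearest edge cell, so a bounds check may be worth adding.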
paddyg
  • No, my lat and lon are not the same. Rather, the points span the globe, so slicing is not an option. I need a method that is free of loops. – pkv Apr 21 '15 at 04:55
  • Thinking about it, you can use numpy advanced indexing with arrays of integers instead of 4,6,8,10 and arrays of floats instead of 0.125, 0.875 etc. I will sketch out a more detailed answer, but if lat1=[4,4,5,5,5,.. lat2=[5,5,6,6,.. lon1=[8,9,10,10... lon2=[9,10,11,11.. then b=a[:,[lat1,lat2],[lon1,lon2]] – paddyg Apr 21 '15 at 06:32
0

If you want to create an "interpolator" object once, and use it to sequentially query just the specific points you need, you could take a look at the scipy.interpolate.Rbf class:

"A class for radial basis function approximation/interpolation of n-dimensional scattered data."

Here, n-dimensional would work for your data if you adjust the ratio between the temporal and spatial dimensions, and scattered means you can also use it for regular/uniform data.
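
A minimal sketch of that idea on a deliberately tiny synthetic grid follows. The field values, the grid sizes and the `time_scale` ratio are assumptions chosen for illustration, not recommendations.

```python
import numpy as np
from scipy.interpolate import Rbf

# Small synthetic (time, lat, lon) grid; sizes are kept tiny on purpose because
# Rbf solves a dense linear system over all data points.
times = np.arange(4, dtype=float)
lats = np.linspace(4.0, 42.0, 5)
lons = np.linspace(64.0, 102.0, 5)
tt, la, lo = np.meshgrid(times, lats, lons, indexing="ij")
values = 0.5 * tt + 0.1 * la + 0.05 * lo   # made-up field, not real data

# Rescale time so one time step counts like `time_scale` degrees (an assumed ratio).
time_scale = 10.0
rbf = Rbf(tt.ravel() * time_scale, la.ravel(), lo.ravel(), values.ravel(),
          function="linear")

# Build the interpolator once, then query arbitrary (time, lat, lon) points.
q_time = np.array([0.0, 1.0, 2.0])
q_lat = np.array([4.0, 23.0, 42.0])
q_lon = np.array([64.0, 83.0, 102.0])
estimates = rbf(q_time * time_scale, q_lat, q_lon)
```

The query points here coincide with data nodes, where the interpolant reproduces the input values. For a full 33 x 77 x 77 grid, a regular-grid method such as scipy.interpolate.RegularGridInterpolator is likely a better fit, since the cost of Rbf grows quickly with the number of data points.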

heltonbiker