
After a long time of reading here with interest, it's now time for my first question.

I am trying to build a 3D volatility surface from individual data points. The basic structure looks like this:

Strike    42    77   133   224   315   406   595
   7.2   NaN   NaN   NaN   NaN   NaN   NaN    54
  13.0   NaN   NaN    66    46    60   NaN   NaN
  14.0   NaN   NaN    61    60    58    54    51
  15.0   NaN    65    58    57    56   NaN   NaN
  15.5    74    62   NaN   NaN   NaN   NaN   NaN
  18.0    62    55    53    51    50    46    45
  18.5    59    53   NaN   NaN   NaN   NaN   NaN
  19.0    57    52    51    50    48   NaN   NaN
  19.5    56    51   NaN   NaN   NaN   NaN   NaN
  20.0    54    50    49    48    47    45    42
  21.0    51    48   NaN   NaN   NaN   NaN   NaN
  22.0    49    46    46    45    43   NaN   NaN
  27.0    46    43   NaN   NaN   NaN   NaN   NaN
  28.0    46    43    42    41    39    38    37
  44.0   NaN   NaN   NaN   NaN   NaN   NaN    33
  48.0   NaN   NaN   NaN   NaN   NaN   NaN    34

This table contains random data, but it is close to my real data (my values are all floats) and shows the basic traits. The column names are the times to maturity in days, the index is the strike price, and the cell values are the volatilities. I see two potential problems (maybe they are not problems at all):

  1. Randomly distributed NaN values in between the data.
  2. The index and the columns have different lengths.

In the end I want to be able to assign a volatility to any arbitrary pair of days to maturity and strike. In two dimensions this was possible with interp1d, fitted separately for each maturity band. But that gives me one function per maturity band, and if a value lies between two maturity bands I don't know how to calculate it. So I tried the following for the 3D case:

Griddata:

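import numpy as np
from scipy import interpolate

# df2_0 is the full DataFrame (index = strike, columns = days to maturity)
df_test = df2_0.iloc[9:16, 0:7]
X = df_test.columns.values
Y = df_test.index.values
Z = df_test.values

# Mask the NaN entries of the extract
test = np.ma.masked_invalid(df_test)

XX, YY = np.meshgrid(np.linspace(min(X), max(X), 10), np.linspace(min(Y), max(Y), 10))

# Keep only the coordinates and values that are not masked
x1 = XX[~test.mask]
y1 = YY[~test.mask]
df_test1 = test[~test.mask]

d = interpolate.griddata((x1, y1), df_test1.ravel(), (XX, YY), method="cubic")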

Griddata was one of the most promising attempts because it is easy to set up. It worked when I used iloc to take only an extract of the DataFrame above. Without iloc I get the error message: "IndexError: boolean index did not match indexed array along dimension 1; dimension is 34 but corresponding boolean dimension is 7".
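One plausible reading of this IndexError is that the boolean mask has the shape of the DataFrame, while XX and YY come from a separate linspace grid with a different shape. A minimal sketch of an alternative set-up follows, assuming the sample coordinates are built with meshgrid from the actual column and index values, and the finer grid is used only for querying. The names strikes, maturities, vols, qM and qK are placeholders, and the 3x4 array is a small excerpt of the table with two NaN holes placed by hand.

```python
import numpy as np
from scipy.interpolate import griddata

# Hypothetical excerpt: rows = strikes, columns = maturities in days
strikes = np.array([18.0, 20.0, 28.0])
maturities = np.array([42.0, 77.0, 133.0, 224.0])
vols = np.array([
    [62.0, 55.0, 53.0, 51.0],
    [54.0, np.nan, 49.0, 48.0],
    [46.0, 43.0, np.nan, 41.0],
])

# Coordinate grids with exactly the same shape as vols,
# so a boolean mask derived from vols can index them.
MM, KK = np.meshgrid(maturities, strikes)
valid = ~np.isnan(vols)

points = np.column_stack([MM[valid], KK[valid]])  # shape (n_valid, 2)
values = vols[valid]                              # shape (n_valid,)

# A separate, finer grid that is only used for querying
qM, qK = np.meshgrid(np.linspace(42, 224, 10), np.linspace(18, 28, 10))

# rescale=True normalises both axes before triangulating, which can help
# when maturity (tens to hundreds) and strike (tens) differ in scale.
surface = griddata(points, values, (qM, qK), method="linear", rescale=True)
print(surface.shape)
```

With this layout the number of rows and columns in the table no longer has to match, because the table is treated as a list of scattered (maturity, strike, volatility) points.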

Another attempt was to leave out the masking, but that gives me the error message "ValueError: invalid number of dimensions in xi":

m = ((int(max(Y)) - int(min(Y))) * 10) - 1
XX, YY = np.meshgrid(np.linspace(min(X), max(X), (max(X) - min(X) + 1)), np.linspace(min(Y), max(Y), m))
ZZ = griddata(np.array([X, Y]).T, np.array(Z), (XX, YY), method='linear')

Is it possible to use griddata on my dataset? And does it also compute values that are not present in the DataFrame above (for example strike "30" with maturity "100")?
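For what it's worth, here is a hedged sketch of how the table above might be fed to griddata and queried at arbitrary points. It assumes the table is reshaped into long form (one row per non-NaN value) with melt; the names df, long, pts, vals and query are placeholders. With method="linear", griddata returns NaN for query points outside the convex hull of the samples, so the sketch falls back to method="nearest" there. That fallback is an assumption about what is acceptable, not a recommendation for pricing.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import griddata

n = np.nan
maturities = [42, 77, 133, 224, 315, 406, 595]
strikes = [7.2, 13.0, 14.0, 15.0, 15.5, 18.0, 18.5, 19.0,
           19.5, 20.0, 21.0, 22.0, 27.0, 28.0, 44.0, 48.0]
data = np.array([
    [n, n, n, n, n, n, 54],
    [n, n, 66, 46, 60, n, n],
    [n, n, 61, 60, 58, 54, 51],
    [n, 65, 58, 57, 56, n, n],
    [74, 62, n, n, n, n, n],
    [62, 55, 53, 51, 50, 46, 45],
    [59, 53, n, n, n, n, n],
    [57, 52, 51, 50, 48, n, n],
    [56, 51, n, n, n, n, n],
    [54, 50, 49, 48, 47, 45, 42],
    [51, 48, n, n, n, n, n],
    [49, 46, 46, 45, 43, n, n],
    [46, 43, n, n, n, n, n],
    [46, 43, 42, 41, 39, 38, 37],
    [n, n, n, n, n, n, 33],
    [n, n, n, n, n, n, 34],
], dtype=float)
df = pd.DataFrame(data, index=strikes, columns=maturities)

# Long form: one row per (strike, maturity, vol), NaN rows dropped
long = (df.rename_axis("strike").reset_index()
          .melt(id_vars="strike", var_name="maturity", value_name="vol")
          .dropna())

pts = long[["maturity", "strike"]].to_numpy(dtype=float)
vals = long["vol"].to_numpy(dtype=float)

# Query points as (maturity, strike) pairs
query = np.array([
    [100.0, 30.0],   # the example from the question
    [100.0, 20.0],   # well inside the sampled region
    [42.0, 7.2],     # outside the convex hull of the samples
])

lin = griddata(pts, vals, query, method="linear", rescale=True)
near = griddata(pts, vals, query, method="nearest", rescale=True)

# Use the linear result where it exists, nearest neighbour elsewhere
filled = np.where(np.isnan(lin), near, lin)
print(lin)
print(filled)
```

So griddata can return values for pairs that are not in the table, as long as they lie inside the convex hull of the available points. Near the edge of the hull the triangles can be long and thin, so results there deserve a sanity check.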

Interp2d and RBFInterpolator:

I also looked into these functions; they were especially interesting because they can extrapolate the data. I tried this:

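import numpy as np
import scipy.interpolate as interp

X = df_test.columns.values
Y = df_test.index.values
Z = df_test.values

XX, YY = np.meshgrid(X, Y)

# Note: XX and YY are not used below; the points are built from X and Y directly
rbf = interp.RBFInterpolator(np.array([X, Y]).T, Z,
                             smoothing=0, kernel='cubic')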

There was no error, but when I pass numbers to rbf, it only returns "nan" values. Have I used it correctly? Or should I give interp2d a try?
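A likely cause of the "nan" output is that Z still contains NaN entries, and that np.array([X, Y]).T pairs X[i] with Y[i] rather than listing every (maturity, strike) combination. Below is a minimal sketch of a different set-up, assuming the table is flattened to scattered points with the NaN values dropped and both axes min-max scaled before fitting. The names pts, vals, scale and query are placeholders, the array is a five-row excerpt of the table, and the scaling choice is an assumption. As a side note, interp2d is deprecated as of SciPy 1.10, so RBFInterpolator is probably the safer one to invest in.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

n = np.nan
strikes = np.array([18.0, 19.0, 20.0, 22.0, 28.0])
maturities = np.array([42.0, 77.0, 133.0, 224.0, 315.0, 406.0, 595.0])
vols = np.array([
    [62, 55, 53, 51, 50, 46, 45],
    [57, 52, 51, 50, 48, n, n],
    [54, 50, 49, 48, 47, 45, 42],
    [49, 46, 46, 45, 43, n, n],
    [46, 43, 42, 41, 39, 38, 37],
], dtype=float)

# Every (maturity, strike) combination, then keep only the non-NaN cells
MM, KK = np.meshgrid(maturities, strikes)
valid = ~np.isnan(vols)
pts = np.column_stack([MM[valid], KK[valid]])   # shape (n_valid, 2)
vals = vols[valid]                              # shape (n_valid,)

# Min-max scale both axes so neither dominates the distance computation
lo, hi = pts.min(axis=0), pts.max(axis=0)

def scale(p):
    """Map (maturity, strike) pairs into the unit square of the samples."""
    return (np.asarray(p, dtype=float) - lo) / (hi - lo)

rbf = RBFInterpolator(scale(pts), vals, kernel="cubic", smoothing=0)

# Query points as (maturity, strike); the second lies outside the sampled range
query = np.array([[100.0, 21.0], [650.0, 30.0]])
print(rbf(scale(query)))
```

Unlike griddata, this returns a finite number outside the sampled range as well, but extrapolated volatilities are only as trustworthy as the kernel's behaviour far from the data, so they should be treated with caution.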

Thank you very much in advance.

Ooni1
  • Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. – Community Jan 04 '23 at 02:18
