I have read some of the other questions about this issue, but I don't understand how to apply the fixes in my case, since the loss functions used there are usually a lot more complex than mine. I believe the answer involves adding a small value like 1e-10 somewhere in the loss function, but I don't know where. My data consists of 6 time series that I transformed to log returns. I set up an encoder-decoder model that takes input windows of 60 time steps and predicts the following 30 values for all 6 series.
I have tried adding 1e-10 to various parts of the loss function, but when I run the model the loss is still NaN. Also, in case it matters, I plan to use MinMaxScaler to transform the data in my next attempt rather than log returns; I'm not sure whether that changes how to fix this problem, but I'm mentioning it just in case.
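For example, one of the placements of the epsilon that I tried looks like this (the exact spots varied between attempts):

def custom_loss_function(y_actual, y_hat):
    # epsilon added to both numerator and denominator
    return kb.mean(kb.abs(kb.log((y_hat + 1e-10) / (y_actual + 1e-10))))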
I have also tried changing various other aspects: using the original non-transformed data, using tanh activations for the LSTM layers, and trying different optimisers. No matter what, the loss still comes out as NaN.
If I change the loss to 'mse', it does seem to work, and the loss is around 1.5e-6 for both training and validation.
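That is, compiling with the built-in loss instead of my custom one:

model_LSTM.compile(optimizer='adam', loss='mse')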
Any ideas? Thanks!
Here is the transformation I applied to the data just in case:
import numpy as np
import pandas as pd

# Convert each series to log returns: log(x_t / x_{t-1})
data_lr = pd.DataFrame()
for column in data.columns:
    data_lr['log_return_' + column] = np.log(data[column] / data[column].shift(1))

data_lr.dropna(inplace=True)  # drop the first row (NaN from the shift)
data_lr.head()
And here are the first rows of the data, to show what the values look like:
log_return_sample_0 log_return_sample_1 log_return_sample_2 log_return_sample_3 log_return_sample_4 log_return_sample_5
day.minute
0.1 -0.001386 -0.001578 -0.001115 -0.000758 -0.000910 0.000223
0.2 0.001386 0.000226 -0.002514 -0.002847 -0.003647 0.000669
0.3 0.000346 0.001353 -0.000839 -0.000951 -0.001600 0.001336
0.4 0.000692 0.000676 0.000839 0.000380 0.000457 0.000667
0.5 0.000000 -0.000450 0.000000 0.000570 0.000914 0.002443
Here is the loss function:
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.layers import TimeDistributed
import tensorflow.keras.backend as kb

# Mean absolute log of the prediction/target ratio
def custom_loss_function(y_actual, y_hat):
    custom_loss_value = kb.mean(kb.abs(kb.log(y_hat / y_actual)))
    return custom_loss_value
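For reference, here is the kind of standalone check one can run on the loss, using values in the same range as the log returns above (note that both targets and predictions can be negative or exactly zero):

import tensorflow as tf

# Spot-check of the loss outside the model, on made-up values taken from
# the range of the log-return data shown above
y_true = tf.constant([[-0.001386, 0.000226, 0.000000]])
y_pred = tf.constant([[0.001200, -0.000300, 0.000500]])
print(custom_loss_function(y_true, y_pred).numpy())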
Here is the model:
# Encoder-decoder LSTM: 60-step input window in, 30-step forecast out
model_LSTM = Sequential()
model_LSTM.add(LSTM(200, activation='relu', input_shape=(n_steps_in, num_features)))  # encoder
model_LSTM.add(RepeatVector(n_steps_out))  # repeat the encoding for each output step
model_LSTM.add(LSTM(200, activation='relu', return_sequences=True))  # decoder
model_LSTM.add(TimeDistributed(Dense(num_features)))  # one output per series per step
model_LSTM.compile(optimizer='adam', loss=custom_loss_function)
Here's the model summary:
Model: "sequential_2"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
lstm_3 (LSTM) (None, 200) 165600
_________________________________________________________________
repeat_vector_2 (RepeatVecto (None, 30, 200) 0
_________________________________________________________________
lstm_4 (LSTM) (None, 30, 200) 320800
_________________________________________________________________
time_distributed_2 (TimeDist (None, 30, 6) 1206
=================================================================
Total params: 487,606
Trainable params: 487,606
Non-trainable params: 0
_________________________________________________________________
None
And finally what I ran:
model_LSTM.fit(X, y, epochs=10, verbose=2, validation_split=0.1)
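For context, X has shape (samples, 60, 6) and y has shape (samples, 30, 6); they are built by sliding windows over data_lr. The helper below is an illustrative sketch of that step, not my exact code:

import numpy as np

# Illustrative sketch only: windowing X and y from the log-return frame,
# with n_steps_in = 60, n_steps_out = 30, num_features = 6
def make_windows(values, n_steps_in, n_steps_out):
    X, y = [], []
    for i in range(len(values) - n_steps_in - n_steps_out + 1):
        X.append(values[i : i + n_steps_in])
        y.append(values[i + n_steps_in : i + n_steps_in + n_steps_out])
    return np.array(X), np.array(y)

X, y = make_windows(data_lr.values, 60, 30)  # X: (samples, 60, 6), y: (samples, 30, 6)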
EDIT: Just in case, here are the first few rows of the original, untransformed values:
sample_0 sample_1 sample_2 sample_3 sample_4 sample_5
day.minute
0.0 28.88 44.39 35.89 52.81 43.99 44.84
0.1 28.84 44.32 35.85 52.77 43.95 44.85
0.2 28.88 44.33 35.76 52.62 43.79 44.88
0.3 28.89 44.39 35.73 52.57 43.72 44.94
0.4 28.91 44.42 35.76 52.59 43.74 44.97