
I am trying to approximate a function that smoothly maps five inputs to a single probability using Keras, but seem to have hit a limit. A similar problem was posed here (Keras Regression to approximate function (goal: loss < 1e-7)) for a ten-dimensional function and I have found that the architecture proposed there, namely:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(128, input_shape=(5,), activation='tanh'))
model.add(Dense(64, activation='tanh'))
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='mae')

gives me my best results, converging to a best validation loss of around 7e-4 when the batch size is 1000. Adding or removing neurons or layers seems to reduce the accuracy, and dropout regularisation also hurts it. I am currently using 1e7 training samples, which took two days to generate (hence the desire to approximate this function). I would like to reduce the MAE by another order of magnitude; does anyone have any suggestions on how to do this?
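For reference, the training call is essentially the following, where X_train, y_train, X_val and y_val are placeholders for my generated inputs and target probabilities (the epoch count is illustrative):

history = model.fit(X_train, y_train,
                    batch_size=1000,                 # the batch size mentioned above
                    epochs=100,                      # illustrative; I train until the loss plateaus
                    validation_data=(X_val, y_val))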

1 Answer


I recommend using the Keras callbacks ReduceLROnPlateau (documentation here) and ModelCheckpoint (documentation here). For the first, set it to monitor the validation loss; it will reduce the learning rate by a factor (factor) if the loss fails to improve after a fixed number (patience) of consecutive epochs. For the second, also monitor the validation loss and save the weights of the model with the lowest validation loss to a directory. After training, load those weights and use them to evaluate or predict on the test set. My code implementation is shown below.

import tensorflow as tf

# save the weights of the best epoch (lowest validation loss) to save_loc
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath=save_loc, monitor='val_loss', verbose=1,
        save_best_only=True, save_weights_only=True, mode='auto', save_freq='epoch')
# reduce the learning rate by `factor` if val_loss fails to improve for `patience` epochs
lr_adjust = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=1, verbose=1,
        mode='auto', min_delta=0.00001, cooldown=0, min_lr=0)
callbacks = [checkpoint, lr_adjust]
history = model.fit(train_generator, epochs=EPOCHS,
          steps_per_epoch=STEPS_PER_EPOCH, validation_data=validation_generator,
          validation_steps=VALIDATION_STEPS, callbacks=callbacks)
model.load_weights(save_loc)  # load the saved best weights
# after this use the model to evaluate or predict on the test set.
# if you are satisfied with the results you can then save the entire model with
model.save(save_loc)
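As a rough sketch of that last step, evaluation and prediction would look something like this (test_generator and TEST_STEPS are placeholders for your own test data pipeline):

test_mae = model.evaluate(test_generator, steps=TEST_STEPS, verbose=1)  # MAE on the test set
predictions = model.predict(test_generator, steps=TEST_STEPS)           # predicted probabilities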
  • Thank you. This didn't help with my original architecture, but it did allow me to train a deeper network and get better results using this callback with SGD. I am still not quite at my goal in terms of accuracy, but will keep experimenting. – user23 Oct 05 '20 at 14:14