
I am designing a model with two outputs, y and dy, where I have much more training data for y than for dy, while the locations (x) of those data points are the same (please check the image below).

I am handling this issue with sample_weight in keras.model.fit. There are two concerns:

  1. If I pass zero for a sample weight, the loss turns into NaN after the first training step. I instead have to pass a very small number, and I am not sure how that affects the training.

  2. This is inefficient if I have multiple outputs and many of them have training data available at only a few locations, because all of the training data will still be included in the updates. Is there any other way to handle this case?

Note that Keras trains the model fine as it is; however, I am looking for a more efficient way that also lets me pass zero for the unwanted weights.
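
One direction I have in mind, instead of passing tiny sample weights, is to mark the missing dy targets with NaN and give the dy output a custom masked loss, so those samples contribute nothing to the dy updates. This is only a rough sketch of the idea (the masked_mse name and the NaN-marking convention are mine, and I have not verified that it is actually more efficient than sample_weight):

def masked_mse(y_true, y_pred):
    # NaN in y_true means "no data at this location".
    mask = tf.math.is_nan(y_true)
    # Replace NaN targets by the prediction itself, so their error (and gradient) is exactly zero.
    y_clean = tf.where(mask, y_pred, y_true)
    sq_err = tf.square(y_clean - y_pred)
    # Average only over the targets that are actually present.
    n_valid = tf.reduce_sum(tf.cast(tf.logical_not(mask), sq_err.dtype))
    return tf.reduce_sum(sq_err) / tf.maximum(n_valid, 1.0)

# Intended usage: fill the dy targets with np.nan wherever no data is available
# and compile with a per-output loss, e.g. model.compile("adam", loss={'y': 'mse', 'dy': masked_mse})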

[Image: Unbalanced training data for y vs dy]

Please check the code below:

import numpy as np
import keras as k
import tensorflow as tf
from matplotlib.pyplot import plot, show, legend

# Note this is needed to handle lambda layers as Keras' gradient does not work in this setup. 
def custom_grad(y, x):
    return tf.gradients(y, x, unconnected_gradients='zero', colocate_gradients_with_ops=True)

# Setting up keras model.
x = k.Input((1,), name='x', dtype='float32')
lay = k.layers.Dense(10, activation='tanh')(x)
lay = k.layers.Dense(10, activation='tanh')(lay)
y = k.layers.Dense(1, name='y')(lay)
dy = k.layers.Lambda(lambda f: custom_grad(f, x), name='dy')(y)
model = k.Model(x, [y, dy])

# Preparing training data.
num_samples = 10000
x_true = np.linspace(0.0, np.pi, num_samples)
y_true = np.sin(x_true)
dy_true = np.zeros_like(y_true)

# for dy, we only have values at certain points -
# say 10% of what is available for y, taken from the beginning and the end of the interval.
percentage = 0.1
dy_ids = np.concatenate((np.arange(0, int(num_samples*percentage), dtype=int),
                         np.arange(int(num_samples*(1-percentage)), num_samples, dtype=int)))
dy_true[dy_ids] = np.cos(x_true[dy_ids])

# I use sample weight to circumvent unbalanced available data.
y_sample_weight = np.ones_like(y_true)
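# Exact zeros for the dy weights make the loss go NaN after the first step (concern 1), hence the tiny epsilon here.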
dy_sample_weight = np.zeros_like(y_true) + 1.0e-8
dy_sample_weight[dy_ids] = num_samples/dy_ids.size
assert abs(dy_sample_weight.sum() - num_samples) <= 1.0e-3

# training the model.
model.compile("adam", loss="mse")
model.fit(x_true, [y_true, dy_true],
          sample_weight=[y_sample_weight, dy_sample_weight],
          epochs=50, shuffle=True)

[y_pred, dy_pred] = model.predict(x_true)

# expected outputs.
plot(x_true, y_true, '.k', label='y true')
plot(x_true[dy_ids], dy_true[dy_ids], '.r', label='dy true')
plot(x_true, y_pred, '--b', label='y pred')
plot(x_true, dy_pred, '--r', label='dy pred')
legend()
show()
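
For completeness, this is how the NaN in concern 1 shows up for me: with exact zeros outside dy_ids the loss becomes NaN after the first step, and TerminateOnNaN simply stops the run at that point (dy_zero_weight is just my name for this variant):

dy_zero_weight = np.zeros_like(y_true)
dy_zero_weight[dy_ids] = num_samples / dy_ids.size

model.compile("adam", loss="mse")
model.fit(x_true, [y_true, dy_true],
          sample_weight=[y_sample_weight, dy_zero_weight],
          epochs=5, shuffle=True,
          callbacks=[k.callbacks.TerminateOnNaN()])
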
Ehsan
  • Did you mean to say `class_weight`? Because I think you're not using it correctly (purely from looking at the description and not the code) [Difference between class_weight and sample_weight](https://stackoverflow.com/questions/43459317/keras-class-weight-vs-sample-weights-in-the-fit-generator) – IanQ Jul 16 '19 at 19:47
  • No, I meant `sample_weight`. Class weight only applies a factor to different losses and does not handle imbalanced data availability, to my understanding. – Ehsan Jul 17 '19 at 02:16

0 Answers