
I have a multilayer perceptron with a sigmoid cross-entropy loss (tf.nn.sigmoid_cross_entropy_with_logits) and an Adam optimizer (tf.train.AdamOptimizer). My input data has several features, and some of the feature values are nan. When I replace the nan values with 0, I get a result. When I do not replace them, I get loss=nan.

What is the best way to handle nan values in TensorFlow, and how can I use my input data with nan values without replacing them with 0?

nbro
TomDriftwood
  • Getting `loss == NaN` is the expected behaviour if you have NaN in your data, because any arithmetic operation involving NaNs results in NaN. Thus yes, you have to remove them. How to do that is left to you to decide, there are several possibilities (replacing with 0 is just one of them) – GPhilo Sep 19 '17 at 08:11
  • Google "handling missing values" to get an idea of the possibilities and try to figure out which one applies to your case – GPhilo Sep 19 '17 at 08:12
  • You can't do meaningful computations with nan values. I don't know your specific application, but you probably want to ignore these values. If you want to do this in the tensorflow graph, you might have a look at [tf.is_nan](https://www.tensorflow.org/api_docs/python/tf/is_nan), [tf.where](https://www.tensorflow.org/api_docs/python/tf/where), [tf.boolean_mask](https://www.tensorflow.org/api_docs/python/tf/boolean_mask). – chrert Sep 19 '17 at 08:28
  • thank you for your comments. – TomDriftwood Sep 19 '17 at 12:21

1 Answer


Question

How can I tell my network to ignore some of the input data, for example when an input value is nan?

Answer

This is very similar to adding a mask to your input data. You want your input data to pass through with the nans turned to zeros, but you also want to signal to the neural network where the nans were, so that it can ignore those positions and pay attention to everything else.

In this question about adding a mask I review how a mask can successfully be added to an image but also give a code demonstration for a non-image problem.

  • First, create a mask with 1's where data exists in the input and 0's where nans exist.
  • Second, clean up the input by converting the nans to 0's, or 0.5's, or anything really.
  • Third, stack the mask onto the input. If the input is an image, then the mask becomes another colour channel.
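
The three steps above can be sketched for a plain table of numeric features. This is a minimal NumPy illustration, not the code from the linked masking question; the helper name `add_nan_mask` and its `fill_value` parameter are hypothetical and exist only for this example.

```python
import numpy as np

def add_nan_mask(x, fill_value=0.0):
    """Return [cleaned features | mask] for a 2-D float array x.

    Hypothetical helper for illustration only.
    """
    x = np.asarray(x, dtype=np.float32)
    # Step 1: mask is 1 where data exists and 0 where the value is NaN.
    mask = (~np.isnan(x)).astype(np.float32)
    # Step 2: replace NaNs with a placeholder value (0 by default).
    cleaned = np.where(np.isnan(x), np.float32(fill_value), x)
    # Step 3: stack the mask onto the input along the feature axis.
    return np.concatenate([cleaned, mask], axis=1)

batch = np.array([[1.0, np.nan, 3.0],
                  [np.nan, 5.0, 6.0]])
features = add_nan_mask(batch)
print(features.shape)  # (2, 6): 3 cleaned features + 3 mask flags per row
```

The resulting array contains no nan values, so it can be fed to the network in place of the original input; the network's first layer then needs twice as many input units.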

The code in the masking question shows that when the mask is added the neural net is able to learn well and when the mask is not added it is not able to learn well.

Anton Codes
  • On further thought, I'd go so far as to convert the **nan**s to **random** data to further help the neural network learn to associate the mask with meaningful and meaningless data. – Anton Codes Mar 29 '19 at 18:42
  • Any idea on how to input the additional masking layer into the model? Would you use a 1D convolution layer combining it with the actual data? – Phlogi Apr 11 '20 at 18:27
  • @Phlogi use the same shape as your input, without depth. So, for example, if your input is a 2d color image, then you'll have 3d input (h*w*color). If you have sections of image that are nan, I'm assuming they'll be nan in all color channels. So your new improved input is h*w*c + h*w*1 = h*w*(c+1). If it is instead an input of tokenized words with some nan words, say 50 words with an embedding of length 128 per word, then your input is 50*(128+1), where the extra 1 is the nan flag. – Anton Codes Apr 12 '20 at 22:46
  • Thanks, my dataset has numerical and categorical features. I created the mask using pandas and was adding the nan information with additional crossed feature columns (tf.feature_column.crossed_column). I'm still evaluating what the effects are. – Phlogi Apr 13 '20 at 19:00
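
The shape arithmetic from the comments, h*w*c + h*w*1 = h*w*(c+1), can be checked with a small NumPy sketch. It assumes, as the comment does, that a missing pixel is nan in all colour channels, and it fills the nans with random data as suggested in the first comment. The array sizes and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 4x4 RGB image (h*w*c) with one pixel missing in all channels.
image = rng.random((4, 4, 3)).astype(np.float32)
image[1, 2, :] = np.nan

# One mask value per pixel (h*w*1): 1 where data exists, 0 where it is NaN.
missing = np.isnan(image).any(axis=-1, keepdims=True)
mask = (~missing).astype(np.float32)

# Fill the NaNs with random values instead of a constant.
noise = rng.random(image.shape).astype(np.float32)
cleaned = np.where(np.isnan(image), noise, image)

# The mask becomes an extra channel: h*w*(c+1).
stacked = np.concatenate([cleaned, mask], axis=-1)
print(stacked.shape)  # (4, 4, 4)
```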