2

I am writing a mapping function for a dataset in TensorFlow 2. The dataset contains several images and their corresponding labels; there are only three possible label values: 13, 17 and 34. The mapping function is supposed to convert these labels into categorical (one-hot) labels.

There might be better ways to implement this function (please feel free to suggest), but this is my implementation:

def map_labels(dataset):

    def convert_labels_to_categorical(image, labels):
        labels = [1.0, 0., 0.] if labels == 13 else [0., 1.0, 0.] if labels == 17 else [0., 0., 1.0]

        return image, labels

    categorical_dataset = dataset.map(convert_labels_to_categorical)

    return categorical_dataset

The main issue is that I get the error below:

OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed: AutoGraph is disabled in this function. Try decorating it directly with @tf.function.

I have no idea what this error means, and there are few other sources on the internet documenting it. Any ideas?

EDIT (new non-working implementation):

def map_labels(dataset):

    def convert_labels_to_categorical(image, labels):
        labels = tf.Variable([1.0, 0., 0.]) if tf.reduce_any(tf.math.equal(labels, tf.constant(0,dtype=tf.int64))) \
        else tf.Variable([0., 1.0, 0.]) if tf.reduce_any(tf.math.equal(labels, tf.constant(90,dtype=tf.int64))) \
        else tf.Variable([0., 0., 1.0])

        return image, labels

    categorical_dataset = dataset.map(convert_labels_to_categorical)

    return categorical_dataset
Phys

2 Answers

1

I found a working solution. First, I created a dictionary, then the dataset:

dictionary = {"data":data, "labels":labels.astype('int32')}
dataset = tf.data.Dataset.from_tensor_slices(dict(dictionary))

This allows me to easily access the data and the labels inside dataset. There might be other ways that do not require using a dictionary, but this one works for me. For the mapping I used:

def map_labels(dataset):

    def convert_labels_to_categorical(dataset):
        if dataset['labels'] == tf.constant(13):
            dataset['labels'] = tf.constant([1, 0, 0])

        elif dataset['labels'] == tf.constant(17):
            dataset['labels'] = tf.constant([0, 1, 0])

        elif dataset['labels'] == tf.constant(34):
            dataset['labels'] = tf.constant([0, 0, 1])

        return dataset

    categorical_dataset = dataset.map(convert_labels_to_categorical)

    return categorical_dataset

After mapping the dataset, if I inspect it with categorical_dataset.element_spec, I get:

{'data': TensorSpec(shape=(32, 32, 3), dtype=tf.uint8, name=None), 'labels': TensorSpec(shape=(None,), dtype=tf.int32, name=None)}

and if I print the elements, the new categorical labels are correctly assigned to the corresponding images. In summary, == comparisons and = assignments still work on tensors inside the mapped function.
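
Since the question invites alternatives, here is a minimal branch-free sketch of the same mapping. It assumes the dictionary-style dataset built above; CLASS_VALUES, to_one_hot and the toy images are hypothetical names and data introduced only for illustration. Comparing the scalar label against all three class values at once yields the one-hot vector directly, so no if/elif is needed.

```python
import tensorflow as tf

# The three raw label values from the question, in the desired one-hot order.
CLASS_VALUES = tf.constant([13, 17, 34], dtype=tf.int32)


def to_one_hot(element):
    # Broadcasting compares the scalar label with all three class values,
    # e.g. label 17 gives [False, True, False].
    matches = tf.equal(element["labels"], CLASS_VALUES)
    # Build a new dict instead of mutating the incoming element.
    return {"data": element["data"], "labels": tf.cast(matches, tf.float32)}


# Toy stand-in data (hypothetical): four blank 32x32 RGB images.
data = tf.zeros([4, 32, 32, 3], dtype=tf.uint8)
labels = tf.constant([13, 34, 17, 13], dtype=tf.int32)

dataset = tf.data.Dataset.from_tensor_slices({"data": data, "labels": labels})
categorical_dataset = dataset.map(to_one_hot)

# Collect the converted labels as plain Python lists for inspection.
one_hot_rows = [e["labels"].numpy().tolist() for e in categorical_dataset]
```

With this version every element has a label of static shape (3,), which can be checked through categorical_dataset.element_spec.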

Phys
0

Your problem comes from the fact that you intermix Python code with TensorFlow code.

Practically, in your map function you use arbitrary Python code, and not exclusively optimized TensorFlow code.
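
To make the source of the "Python bool" concrete, the following sketch reproduces the same kind of failure outside of a dataset. It is only an illustration under assumptions: pick_index is a hypothetical function, and autograph=False is passed to imitate the "AutoGraph is disabled" situation from the error message. The conditional expression forces Python to call bool() on a symbolic tensor, which is not possible while the graph is being traced.

```python
import tensorflow as tf


@tf.function(autograph=False)  # imitate "AutoGraph is disabled in this function"
def pick_index(label):
    # `label == 13` is a symbolic tf.Tensor during tracing; the Python
    # conditional expression calls bool() on it, which graph mode rejects.
    return 0 if label == 13 else 1


error_name = None
try:
    pick_index(tf.constant(13))
except TypeError as err:
    # The TensorFlow error raised here is a subclass of TypeError.
    error_name = type(err).__name__
```

After running this, error_name holds the name of the exception class that was raised during tracing.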

In a map function, you should use only TensorFlow operations, that is, functions from the tf.* namespace. If you still want to run arbitrary Python code, you need to wrap it in tf.py_function().

You may want to consult this thread to get a better overview:

Is there an alternative to tf.py_function() for custom Python code?
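
For reference, wrapping plain Python in tf.py_function() could look roughly like the sketch below. It assumes a dataset of (image, label) tuples as in the question; python_lookup, convert and the toy data are hypothetical names introduced here. This route is usually slower than pure TensorFlow ops, so treat it as a fallback rather than a recommendation.

```python
import numpy as np
import tensorflow as tf


def python_lookup(label):
    # Runs eagerly, so ordinary Python control flow and dicts are allowed.
    index = {13: 0, 17: 1, 34: 2}[int(label.numpy())]
    return np.eye(3, dtype=np.float32)[index]


def convert(image, label):
    one_hot = tf.py_function(python_lookup, inp=[label], Tout=tf.float32)
    # py_function discards static shape information, so restore it.
    one_hot.set_shape([3])
    return image, one_hot


# Toy stand-in data (hypothetical): three blank 32x32 RGB images.
images = tf.zeros([3, 32, 32, 3], dtype=tf.uint8)
labels = tf.constant([17, 13, 34], dtype=tf.int64)

dataset = tf.data.Dataset.from_tensor_slices((images, labels))
py_rows = [lab.numpy().tolist() for _, lab in dataset.map(convert)]
```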

To solve your problem, you need to use functions from the tf module exclusively, such as those in tf.math or tf.strings.
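
As a rough sketch of what "only tf functions" might mean here, the chained Python conditional from the question can be replaced by nested tf.where calls. This assumes (image, label) tuples with scalar integer labels; the toy data is hypothetical, and the last branch catches every value other than 13 and 17, just like the final else in the question.

```python
import tensorflow as tf


def convert_labels_to_categorical(image, label):
    # tf.where selects element-wise and broadcasts the scalar condition,
    # so no Python bool is ever requested from a tensor.
    one_hot = tf.where(
        tf.equal(label, 13),
        tf.constant([1.0, 0.0, 0.0]),
        tf.where(
            tf.equal(label, 17),
            tf.constant([0.0, 1.0, 0.0]),
            tf.constant([0.0, 0.0, 1.0]),
        ),
    )
    return image, one_hot


# Toy stand-in data (hypothetical): three blank 32x32 RGB images.
images = tf.zeros([3, 32, 32, 3], dtype=tf.uint8)
labels = tf.constant([34, 13, 17], dtype=tf.int64)

dataset = tf.data.Dataset.from_tensor_slices((images, labels))
where_rows = [
    lab.numpy().tolist() for _, lab in dataset.map(convert_labels_to_categorical)
]
```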

Timbus Calin
  • Thanks, I slightly modified my function to take into account your suggestion. To my understanding, I got rid of any Python function, but there is still something missing, because I still get the same error. I definitely want to use pure TensorFlow 2 code, no Python. – Phys Apr 17 '20 at 23:02
  • Also, I do not get why the error is mentioning "using a `tf.Tensor` as a Python `bool` is not allowed". Where is that Python bool? – Phys Apr 17 '20 at 23:05
  • Have you tried using functions such as: tf.math.equal() ? – Timbus Calin Apr 18 '20 at 07:55
  • tf.math.equal(labels, tf.constant([1,0,0])) – Timbus Calin Apr 18 '20 at 07:56
  • Thanks, I forgot about that. I inserted tf.math.equal() and also added tf.reduce_any(), same error message. There is still something not right. – Phys Apr 18 '20 at 11:47
  • I think the final solution to this one is to use tf.cond() instead of 'Python if'. – Timbus Calin Apr 18 '20 at 14:22
  • Have a look here: https://www.tensorflow.org/api_docs/python/tf/cond – Timbus Calin Apr 18 '20 at 14:22
  • Or, as an alternative to using tf.cond(), you may resort to @tf.function decorator. – Timbus Calin Apr 18 '20 at 14:23
  • Ok, I tried with the tf.cond() but I get other errors. Also, I believe this is not strictly necessary, as in the past I managed to use if else conditions with tf datasets. – Phys Apr 18 '20 at 16:46
  • What version of TensorFlow are you using? Exactly 2.0.0? – Timbus Calin Apr 18 '20 at 16:47
  • Those datasets, unlike the one I am using in this thread, were obtained from pandas dataframes by using `tf.data.Dataset.from_tensor_slices(dict(dataframe))`. As a result, I could just access the dataset by using `dataset['variable_name']`, which is what I should do here. Do you know how to set names for the variables in a dataset? – Phys Apr 18 '20 at 16:48
  • I am using version 2.1.0 – Phys Apr 18 '20 at 16:49
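
The tf.cond() suggestion from the comments was never spelled out in the thread, so here is one way it might look. This is a sketch under assumptions, not the code either commenter had in mind: it uses (image, label) tuples with scalar labels, and convert_with_cond plus the toy data are hypothetical. Each branch is passed as a callable, and both branches return tensors of the same dtype and shape.

```python
import tensorflow as tf


def convert_with_cond(image, label):
    # tf.cond takes a scalar boolean tensor and two callables; nesting a
    # second tf.cond in the false branch handles the third class.
    one_hot = tf.cond(
        tf.equal(label, 13),
        lambda: tf.constant([1.0, 0.0, 0.0]),
        lambda: tf.cond(
            tf.equal(label, 17),
            lambda: tf.constant([0.0, 1.0, 0.0]),
            lambda: tf.constant([0.0, 0.0, 1.0]),
        ),
    )
    return image, one_hot


# Toy stand-in data (hypothetical): three blank 32x32 RGB images.
images = tf.zeros([3, 32, 32, 3], dtype=tf.uint8)
labels = tf.constant([13, 17, 34], dtype=tf.int64)

dataset = tf.data.Dataset.from_tensor_slices((images, labels))
cond_rows = [lab.numpy().tolist() for _, lab in dataset.map(convert_with_cond)]
```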