
I am trying to convert the shape property of a Tensor to NumPy in TensorFlow 2.1 and I get this error:

AttributeError: 'Tensor' object has no attribute 'numpy'

I already checked that the output of tf.executing_eagerly() is True.

A bit of context: I load a tf.data.Dataset from TFRecords, then I apply a map. The mapping function tries to convert the shape property of one of the dataset's sample Tensors to NumPy:

import glob
import os

import tensorflow as tf


def _parse_and_decode(serialized_example):
    """ parse and decode each image """
    features = tf.io.parse_single_example(
        serialized_example,
        features={
            'encoded_image': tf.io.FixedLenFeature([], tf.string),
            'kp_flat': tf.io.VarLenFeature(tf.int64),
            'kp_shape': tf.io.FixedLenFeature([3], tf.int64),
        }
    )
    image = tf.io.decode_png(features['encoded_image'], dtype=tf.uint8)
    image = tf.cast(image, tf.float32)

    kp_shape = features['kp_shape']

    kp_flat = tf.sparse.to_dense(features['kp_flat'])
    kp = tf.reshape(kp_flat, kp_shape)

    return image, kp


def read_tfrecords(records_dir, batch_size=1):
    # Read dataset from tfrecords
    tfrecords_files = glob.glob(os.path.join(records_dir, '*'))
    dataset = tf.data.TFRecordDataset(tfrecords_files)
    dataset = dataset.map(_parse_and_decode, num_parallel_calls=batch_size)
    return dataset


def transform(img, labels):
    img_shape = img.shape  # type: <class 'tensorflow.python.framework.ops.Tensor'>
    img_shape = img_shape.numpy()  # <-- Throws the error
    # ...

dataset = read_tfrecords(records_dir)

This throws the error:

dataset.map(transform, num_parallel_calls=1)

While this works perfectly:

for img, labels in dataset.take(1):
    print(img.shape.numpy())

Edit: trying to access img.numpy() instead of img.shape.numpy() results in the same behavior: it fails inside the transform function and works in the code just above.

I checked the type of img_shape and it is <class 'tensorflow.python.framework.ops.Tensor'>.

Has anyone solved this sort of issue in new versions of Tensorflow?

Timbus Calin
Nick Skywalker
  • Is the shape of `img` completely defined? If its shape contains `None` in any one of the dimensions, then this could happen – learner Feb 22 '20 at 05:37
  • I edited my post to add a bit more context. I'm using `tf.io.decode_png` to parse `img` so I'm guessing that the shape is known, isn't it? Also, calling `numpy()` on `img` instead of its shape gives me the same behavior... The weird thing is that none of this results in an error if I do it on elements from `dataset.take()` instead of inside the `map`... – Nick Skywalker Feb 22 '20 at 10:27

2 Answers


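The problem in your code is that you cannot call .numpy() inside a function that is mapped onto a tf.data.Dataset. Dataset.map traces the function into a static graph, so its arguments are symbolic tensors with no concrete values, and .numpy() is only available on eager tensors. That is also why tf.executing_eagerly() returns True outside the mapped function while the call still fails inside it.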

When you call my_dataset.map(my_function), you can only use tf.* operations inside my_function.

This is not a bug in TensorFlow 2.x, but a consequence of how static graphs are generated behind the scenes for performance purposes.
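As a rough illustration of the point above, here is a minimal sketch that stays in pure TensorFlow by using tf.shape() instead of .numpy(). The toy dataset and its shapes are assumptions made up for the example; they stand in for the decoded images from the question.

```python
import tensorflow as tf

# Toy dataset standing in for the decoded images (shapes are made up).
dataset = tf.data.Dataset.from_tensor_slices(tf.zeros([4, 8, 6, 3]))


def transform(img):
    # Dataset.map traces this function into a graph, so `img` is a symbolic
    # tensor here and tf.executing_eagerly() is False at this point.
    print("eager inside map:", tf.executing_eagerly())
    # tf.shape() returns the shape as a tensor that is evaluated at run time,
    # so it also works when the static shape contains None.
    img_shape = tf.shape(img)
    return img, img_shape


mapped = dataset.map(transform)

for img, img_shape in mapped.take(1):
    # Outside the map the tensors are eager again, so .numpy() is available.
    shape_values = img_shape.numpy().tolist()
    print(shape_values)
```

If the shape is only needed to drive further tensor operations (cropping, reshaping, and so on), keeping it as a tensor like this avoids leaving the graph at all.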

If you want to use custom Python code inside a function that you map onto your dataset, you have to wrap it in tf.py_function() (docs: https://www.tensorflow.org/api_docs/python/tf/py_function). Apart from the closely related tf.numpy_function(), there is really no other way to mix Python code and TensorFlow code when mapping on a dataset.
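A minimal sketch of this approach, assuming a toy (image, labels) dataset with made-up shapes and dtypes. The helper name transform_eager and the transpose step are hypothetical; they only demonstrate a Python branch on concrete values.

```python
import tensorflow as tf

# Toy (image, labels) dataset; shapes and dtypes are made up for illustration.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.zeros([4, 8, 6, 3]), tf.zeros([4, 2], dtype=tf.int64)))


def transform_eager(img, labels):
    # Called through tf.py_function, so this body runs eagerly and
    # img / labels are eager tensors that do have .numpy().
    height, width, _ = img.numpy().shape
    # Hypothetical processing: a plain Python branch on concrete values.
    if height > width:
        img = tf.transpose(img, perm=[1, 0, 2])
    return img, labels


def transform(img, labels):
    img, labels = tf.py_function(
        transform_eager, inp=[img, labels], Tout=[tf.float32, tf.int64])
    # tf.py_function drops static shape information; restore what is known.
    img.set_shape([None, None, 3])
    labels.set_shape([2])
    return img, labels


dataset = dataset.map(transform)

for img, labels in dataset.take(1):
    result_shape = img.shape.as_list()
    print(result_shape)
```

The set_shape calls are optional, but later stages such as batching or Keras layers often need at least the rank and the channel dimension to be known.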

You can also consult this question for further information; it's the exact question that I asked a couple of months ago: Is there an alternative to tf.py_function() for custom Python code?

Timbus Calin
  • Thanks! I think I can't convert everything from my pipeline to native tensorflow operations as I need to dynamically populate a list based on tensor values.... Do you think I can still gain performance by mapping everything I can in pure tensorflow ops and then doing the dynamic part in another `map` calling `tf.py_func`? – Nick Skywalker Feb 25 '20 at 09:58
  • You can still gain some performance, but I would benchmark to clearly see the differences/gains/losses in time performance. – Timbus Calin Mar 04 '20 at 06:45
  • This is what I was looking for. Is there a way to inspect the value of those tensors inside the function, for debugging purposes? – Mattia Surricchio Feb 12 '22 at 21:37
  • Yes. One easy solution is to use tf.print() instead of pure Python print(). – Timbus Calin Feb 13 '22 at 09:27
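For the debugging question in the comments, a small sketch of tf.print() inside a mapped function might look like the following. The toy dataset is an assumption for illustration only.

```python
import tensorflow as tf

# Toy dataset standing in for the real images (shapes are made up).
dataset = tf.data.Dataset.from_tensor_slices(tf.zeros([2, 8, 6, 3]))


def transform(img):
    # A plain Python print() would run only once, while the function is
    # traced, and would show a symbolic tensor. tf.print() is a graph op,
    # so it runs for every element and prints the actual values.
    tf.print("image shape:", tf.shape(img), "max value:", tf.reduce_max(img))
    return img


num_elements = 0
for _ in dataset.map(transform):
    num_elements += 1
```

By default tf.print() writes to standard error, so the output may show up separately from regular print() output in some environments.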

To expand on Timbus Calin's answer, here is an implementation example for your use case using tf.py_function(). Keeping your transform function as it is, change the dataset.map call like this (note that map returns a new dataset, so the result has to be assigned):

dataset = dataset.map(lambda img, labels: tf.py_function(transform,
                                                         inp=[img, labels],
                                                         Tout=[tf.float64, tf.float32]))

Change the output types in Tout according to your data.
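One possible way to pick the Tout values, sketched under the assumption that the Python function keeps the dtypes of its inputs, is to read them from the dataset's element_spec. The toy dataset and the body of transform below are made up for the example.

```python
import tensorflow as tf

# Toy (image, labels) dataset; shapes and dtypes are made up for illustration.
dataset = tf.data.Dataset.from_tensor_slices(
    (tf.zeros([4, 8, 6, 3], dtype=tf.float32),
     tf.zeros([4, 2], dtype=tf.int64)))

# element_spec describes each component of a dataset element.
img_spec, labels_spec = dataset.element_spec
tout = [img_spec.dtype, labels_spec.dtype]


def transform(img, labels):
    # Hypothetical eager-side processing that keeps both dtypes unchanged.
    scale = float(img.numpy().shape[0])
    return img + scale, labels


dataset = dataset.map(
    lambda img, labels: tf.py_function(transform, inp=[img, labels], Tout=tout))

for img, labels in dataset.take(1):
    out_dtypes = [img.dtype, labels.dtype]
    print(out_dtypes)
```

If transform changes a dtype (for example by casting the image), the corresponding entry in Tout has to be set by hand instead.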

Aelius