
I have 45,000 images of size 224×224 stored as a NumPy array. This array, called source_arr, has shape (45000, 224, 224) and fits in memory.

I want to split this array into training, test, and validation arrays and preprocess them (normalize and convert grayscale to 3-channel RGB) using the tf.data API.

I have written a preprocessing function like this:

def pre_process(x):
    # Zero-center the scaled dataset
    x_norm = (x - mean_Rot_MIP) / Var_Rot_MIP
    # Stack 3 channels
    x_norm_3ch = np.stack((x_norm, x_norm, x_norm), axis=0)
    print('Rotn MIP 3ch dim:', x_norm_3ch.shape)  # (3, 224, 224)
    # Convert channels-first to channels-last: move axis 0 to position 2
    x_norm_3ch = np.moveaxis(x_norm_3ch, 0, 2)
    print('Rotn MIP ch last dim: ', x_norm_3ch.shape)  # (224, 224, 3)
    return x_norm_3ch

X_train_cases_idx.idx contains the indices of the images in source_arr that belong to the training data.

I load the corresponding training images from source_arr into a Dataset object like this:

X_train = tf.data.Dataset.from_tensor_slices([source_arr[i] for i in X_train_cases_idx.idx])

I then apply pre_process to the training images with X_train = X_train.map(pre_process), but I get the following error:

Traceback (most recent call last):

  File "<ipython-input-37-69aa131a6944>", line 1, in <module>
    X_train = X_train.map(pre_process)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 1695, in map
    return MapDataset(self, map_func, preserve_cardinality=True)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 4045, in __init__
    use_legacy_function=use_legacy_function)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3371, in __init__
    self._function = wrapper_fn.get_concrete_function()

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\function.py", line 2939, in get_concrete_function
    *args, **kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\function.py", line 2906, in _get_concrete_function_garbage_collected
    graph_function, args, kwargs = self._maybe_define_function(args, kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\function.py", line 3213, in _maybe_define_function
    graph_function = self._create_graph_function(args, kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\eager\function.py", line 3075, in _create_graph_function
    capture_by_value=self._capture_by_value),

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\func_graph.py", line 986, in func_graph_from_py_func
    func_outputs = python_func(*func_args, **func_kwargs)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3364, in wrapper_fn
    ret = _wrapper_helper(*args)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\data\ops\dataset_ops.py", line 3299, in _wrapper_helper
    ret = autograph.tf_convert(func, ag_ctx)(*nested_args)

  File "C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 258, in wrapper
    raise e.ag_error_metadata.to_exception(e)

NotImplementedError: in user code:

    <ipython-input-2-746b4230fbd1>:58 pre_process  *
        x_norm_3ch= np.stack((x_norm, x_norm, x_norm),axis=1)
    <__array_function__ internals>:6 stack  **
        
    C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\shape_base.py:419 stack
        arrays = [asanyarray(arr) for arr in arrays]
    C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\shape_base.py:419 <listcomp>
        arrays = [asanyarray(arr) for arr in arrays]
    C:\ProgramData\Anaconda3\lib\site-packages\numpy\core\_asarray.py:138 asanyarray
        return array(a, dtype, copy=False, order=order, subok=True)
    C:\ProgramData\Anaconda3\lib\site-packages\tensorflow\python\framework\ops.py:848 __array__
        " a NumPy call, which is not supported".format(self.name))

    NotImplementedError: Cannot convert a symbolic Tensor (truediv:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

What am I doing wrong, and how do I fix it? I am using TensorFlow 2.0 with Python 3.7 on Windows 10.

1 Answer


As the error message points out, you are calling NumPy functions on TensorFlow tensors. Dataset.map traces pre_process in graph mode, so x is a symbolic tensor with no concrete values that NumPy could read. Use TensorFlow operations instead. This is equivalent to what you were trying to do:

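import tensorflow as tf

def pre_process(x):
    # Zero-center and scale; x is a symbolic tensor here, so only TF ops are used
    x_norm = (x - mean_Rot_MIP) / Var_Rot_MIP
    # Stack along the last dimension to avoid having to move the channel axis
    x_norm_3ch = tf.stack((x_norm, x_norm, x_norm), axis=-1)  # (224, 224, 3)
    return x_norm_3ch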
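
To check the fix end to end, here is a minimal sketch on synthetic data. The random array, the index list train_idx, and the statistics mean_val and scale_val are all assumptions made up for illustration; they stand in for source_arr, X_train_cases_idx.idx, mean_Rot_MIP and Var_Rot_MIP from the question. The sketch also selects the training images with NumPy fancy indexing rather than a list comprehension, which is a substitution on my part.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for source_arr: 8 grayscale images of 224x224.
rng = np.random.default_rng(0)
source_arr = rng.random((8, 224, 224), dtype=np.float32)
train_idx = np.array([0, 2, 5])  # hypothetical training indices

# Assumed normalization statistics, computed from the training images only.
mean_val = tf.constant(source_arr[train_idx].mean(), dtype=tf.float32)
scale_val = tf.constant(source_arr[train_idx].std(), dtype=tf.float32)

def pre_process(x):
    # Only TensorFlow ops, so this traces cleanly inside Dataset.map.
    x_norm = (x - mean_val) / scale_val
    # Replicate the single channel three times along a new last axis.
    return tf.stack((x_norm, x_norm, x_norm), axis=-1)

X_train = tf.data.Dataset.from_tensor_slices(source_arr[train_idx]).map(pre_process)

first = next(iter(X_train))
print(first.shape)  # (224, 224, 3)
```

Because the stack is made along axis=-1, the result is already channels-last and no moveaxis step is needed.
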
jdehesa
  • Thank you, this solved the issue. Do all NumPy functions have a direct TensorFlow counterpart? – Divya Choudhary Aug 17 '20 at 17:03
  • @DivyaChoudhary For the most part, yes. A few do not have an exact counterpart, but it is rare to find something that is not straightforward to "translate" (although not always as efficiently, e.g. `np.isin`). The other thing is that with NumPy you can iterate over arrays, use conditionals, etc., while with TF, in "graph mode" (i.e. within a `@tf.function`), you cannot (or not in the normal way), and you need to use [`tf.numpy_function`](https://www.tensorflow.org/api_docs/python/tf/numpy_function) or [`tf.py_function`](https://www.tensorflow.org/api_docs/python/tf/py_function). – jdehesa Aug 17 '20 at 18:28
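
For the case mentioned in the last comment, where the NumPy code has no convenient TensorFlow equivalent, a rough sketch of wrapping it with `tf.numpy_function` inside `Dataset.map` could look like this. The function names, the tiny 4×4 images, and the hard-coded output shape are assumptions chosen for illustration only.

```python
import numpy as np
import tensorflow as tf

def np_pre_process(x):
    # Plain NumPy code: x arrives here as a real np.ndarray, not a symbolic tensor.
    return np.stack((x, x, x), axis=-1)  # (H, W) -> (H, W, 3)

def tf_wrapper(x):
    # Run the NumPy function as an op; Tout must match the dtype it returns.
    y = tf.numpy_function(np_pre_process, [x], tf.float32)
    # numpy_function drops static shape information, so restore it by hand.
    y.set_shape((4, 4, 3))
    return y

# Two tiny float32 "images" of 4x4 (hypothetical data).
images = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
ds = tf.data.Dataset.from_tensor_slices(images).map(tf_wrapper)

out = next(iter(ds))
print(out.shape)  # (4, 4, 3)
```

The wrapped function runs as ordinary Python, so it is generally slower than native TensorFlow ops and cannot be serialized with the graph; native ops such as `tf.stack` are preferable whenever they exist.
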