
I am trying to use TextLineDataset and Tokenizer together, but I run into problems when I pass the tokenizer to #map. I get an error saying that it is not possible to iterate over a Tensor. My understanding is that #texts_to_sequences is being run on the graph representation of the data rather than on actual strings. Is that correct? Is it even possible to combine these two parts, and does it make sense to do so?

from keras.preprocessing.text import Tokenizer
import tensorflow as tf

# Example used from
# https://stackoverflow.com/a/51203923/998092

docs = ["A heart that",
        "full up like",
        "a landfill",
        "no surprises",
        "and no alarms",
        "a job that slowly",
        "Bruises that",
        "You look so",
        "tired happy",
        "no alarms",
        "and no surprises"]

T = Tokenizer()
T.fit_on_texts(docs)

def encode(sentence):
  return T.texts_to_sequences(sentence)

data = tf.data.TextLineDataset.from_tensor_slices(docs)
encoded_data = data.map(encode)

print("result for test 1:\n%s" %(data))

Minimal example in Colab

Resulting in:

WARNING:tensorflow:Entity <bound method Tokenizer.texts_to_sequences_generator of <keras_preprocessing.text.Tokenizer object at 0x7f9089f5de48>> appears to be a generator function. It will not be converted by AutoGraph.
WARNING: Entity <bound method Tokenizer.texts_to_sequences_generator of <keras_preprocessing.text.Tokenizer object at 0x7f9089f5de48>> appears to be a generator function. It will not be converted by AutoGraph.

---------------------------------------------------------------------------

OperatorNotAllowedInGraphError            Traceback (most recent call last)

<ipython-input-8-46695a877229> in <module>()
     25 
     26 data = tf.data.TextLineDataset.from_tensor_slices(docs)
---> 27 encoded_data = data.map(encode)
     28 
     29 print("result for test 1:\n%s" %(data))

10 frames

/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py in wrapper(*args, **kwargs)
    263       except Exception as e:  # pylint:disable=broad-except
    264         if hasattr(e, 'ag_error_metadata'):
--> 265           raise e.ag_error_metadata.to_exception(e)
    266         else:
    267           raise

OperatorNotAllowedInGraphError: in user code:

    <ipython-input-8-46695a877229>:24 encode  *
        return T.texts_to_sequences(sentence)
    /usr/local/lib/python3.6/dist-packages/keras_preprocessing/text.py:279 texts_to_sequences  *
        return list(self.texts_to_sequences_generator(texts))
    /usr/local/lib/python3.6/dist-packages/keras_preprocessing/text.py:298 texts_to_sequences_generator  **
        for text in texts:
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:561 __iter__
        self._disallow_iteration()
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:557 _disallow_iteration
        self._disallow_in_graph_mode("iterating over `tf.Tensor`")
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:537 _disallow_in_graph_mode
        " this function with @tf.function.".format(task))

    OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
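
The traceback is consistent with the reading in the question: #map appears to trace `encode` once with a symbolic placeholder instead of calling it on each string, so the plain Python `for text in texts` loop inside the tokenizer has nothing concrete to iterate over. Below is a toy sketch of that mechanism in plain Python, with no TensorFlow involved. `SymbolicTensor`, `ToyDataset` and `try_map` are hypothetical stand-ins invented for illustration, not TensorFlow classes, and the real tracing machinery is far more involved.

```python
# Toy model only: these classes are hypothetical stand-ins, not TensorFlow APIs.


class SymbolicTensor:
    """Placeholder for a value that only exists once the graph is executed."""

    def __init__(self, name):
        self.name = name

    def __iter__(self):
        # There is no concrete value yet, so a Python loop cannot run over it.
        raise TypeError("iterating over a symbolic tensor is not allowed")


class ToyDataset:
    def __init__(self, items):
        self.items = list(items)

    def map(self, fn):
        # Assumption of this sketch: fn is traced once with a placeholder,
        # rather than being called on each stored element.
        fn(SymbolicTensor("element"))
        return self


def encode(sentence):
    # Plain Python iteration, similar in spirit to the tokenizer's loop.
    return [len(word) for word in sentence]


def try_map(dataset, fn):
    """Return "ok" if tracing succeeds, otherwise the error message."""
    try:
        dataset.map(fn)
        return "ok"
    except TypeError as err:
        return "failed: %s" % err


dataset = ToyDataset(["a landfill", "no alarms"])
result_iterating = try_map(dataset, encode)
result_passthrough = try_map(dataset, lambda tensor: tensor)
print(result_iterating)
print(result_passthrough)
```

In this sketch only the function that loops over its argument fails, and a function that simply passes the placeholder through traces without error.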
Pascal Zaugg