I am trying to use TextLineDataset together with the Keras Tokenizer, but I run into problems when I call the tokenizer inside map(). I get an error saying that it is not possible to iterate over a Tensor. My understanding is that texts_to_sequences is being run on the graph representation of the data (a symbolic tensor) rather than on actual Python strings. Is that correct? Is it even possible to combine these two parts, and does it make sense to do so?
from keras.preprocessing.text import Tokenizer
import tensorflow as tf
# Example used from
# https://stackoverflow.com/a/51203923/998092
docs = ["A heart that",
        "full up like",
        "a landfill",
        "no surprises",
        "and no alarms",
        "a job that slowly",
        "Bruises that",
        "You look so",
        "tired happy",
        "no alarms",
        "and no surprises"]
T = Tokenizer()
T.fit_on_texts(docs)
def encode(sentence):
    return T.texts_to_sequences(sentence)
data = tf.data.TextLineDataset.from_tensor_slices(docs)
encoded_data = data.map(encode)
print("result for test 1:\n%s" %(data))
Resulting in:
WARNING:tensorflow:Entity <bound method Tokenizer.texts_to_sequences_generator of <keras_preprocessing.text.Tokenizer object at 0x7f9089f5de48>> appears to be a generator function. It will not be converted by AutoGraph.
WARNING: Entity <bound method Tokenizer.texts_to_sequences_generator of <keras_preprocessing.text.Tokenizer object at 0x7f9089f5de48>> appears to be a generator function. It will not be converted by AutoGraph.
---------------------------------------------------------------------------
OperatorNotAllowedInGraphError Traceback (most recent call last)
<ipython-input-8-46695a877229> in <module>()
     25
     26 data = tf.data.TextLineDataset.from_tensor_slices(docs)
---> 27 encoded_data = data.map(encode)
     28
     29 print("result for test 1:\n%s" %(data))

10 frames
/usr/local/lib/python3.6/dist-packages/tensorflow/python/autograph/impl/api.py in wrapper(*args, **kwargs)
    263       except Exception as e:  # pylint:disable=broad-except
    264         if hasattr(e, 'ag_error_metadata'):
--> 265           raise e.ag_error_metadata.to_exception(e)
    266         else:
    267           raise

OperatorNotAllowedInGraphError: in user code:

    <ipython-input-8-46695a877229>:24 encode  *
        return T.texts_to_sequences(sentence)
    /usr/local/lib/python3.6/dist-packages/keras_preprocessing/text.py:279 texts_to_sequences  *
        return list(self.texts_to_sequences_generator(texts))
    /usr/local/lib/python3.6/dist-packages/keras_preprocessing/text.py:298 texts_to_sequences_generator  **
        for text in texts:
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:561 __iter__
        self._disallow_iteration()
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:557 _disallow_iteration
        self._disallow_in_graph_mode("iterating over `tf.Tensor`")
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py:537 _disallow_in_graph_mode
        " this function with @tf.function.".format(task))

OperatorNotAllowedInGraphError: iterating over `tf.Tensor` is not allowed in Graph execution. Use Eager execution or decorate this function with @tf.function.
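
For reference, the traceback suggests that map() traces encode in graph mode, so the tokenizer receives a symbolic tensor instead of a Python string. One possible way around this, sketched here and not necessarily the best fit for every pipeline, is to wrap the Python-only tokenizer call in tf.py_function, which runs the wrapped function eagerly. The helper name encode_eager, the shortened docs list and the use of tf.keras.preprocessing.text.Tokenizer (instead of the standalone keras import above) are assumptions made for this illustration.

```python
import tensorflow as tf

# Shortened example corpus (assumption: any list of strings works the same way).
docs = ["A heart that", "full up like", "a landfill", "no surprises"]

tokenizer = tf.keras.preprocessing.text.Tokenizer()
tokenizer.fit_on_texts(docs)

def encode_eager(sentence):
    # Inside tf.py_function the argument is an eager tensor, so .numpy() is available.
    text = sentence.numpy().decode("utf-8")
    # texts_to_sequences expects a list of texts; take the single result back out.
    ids = tokenizer.texts_to_sequences([text])[0]
    return tf.constant(ids, dtype=tf.int32)

def encode(sentence):
    # Hand the Python-only tokenizer call to tf.py_function so it is not traced.
    ids = tf.py_function(encode_eager, inp=[sentence], Tout=tf.int32)
    # py_function loses static shape information; declare a variable-length vector.
    ids.set_shape([None])
    return ids

data = tf.data.Dataset.from_tensor_slices(docs)
encoded_data = data.map(encode)

# Collect the results as plain Python lists for inspection.
encoded = [ids.numpy().tolist() for ids in encoded_data]
print(encoded)
```

Note that tf.py_function executes Python code on every element, so the mapped step is not serialized into the graph and does not benefit from graph optimizations. If the corpus fits in memory, calling texts_to_sequences on the plain list before building the dataset may be simpler.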