57

Keras' fit_generator() model method expects a generator which produces tuples of the shape (input, targets), where both elements are NumPy arrays. The documentation seems to imply that if I simply wrap a Dataset iterator in a generator, and make sure to convert the Tensors to NumPy arrays, I should be good to go. This code, however, gives me an error:

import numpy as np
import os
import keras.backend as K
from keras.layers import Dense, Input
from keras.models import Model
import tensorflow as tf
from tensorflow.contrib.data import Dataset

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

with tf.Session() as sess:
    def create_data_generator():
        dat1 = np.arange(4).reshape(-1, 1)
        ds1 = Dataset.from_tensor_slices(dat1).repeat()

        dat2 = np.arange(5, 9).reshape(-1, 1)
        ds2 = Dataset.from_tensor_slices(dat2).repeat()

        ds = Dataset.zip((ds1, ds2)).batch(4)
        iterator = ds.make_one_shot_iterator()
        while True:
            next_val = iterator.get_next()
            yield sess.run(next_val)

datagen = create_data_generator()

input_vals = Input(shape=(1,))
output = Dense(1, activation='relu')(input_vals)
model = Model(inputs=input_vals, outputs=output)
model.compile('rmsprop', 'mean_squared_error')
model.fit_generator(datagen, steps_per_epoch=1, epochs=5,
                    verbose=2, max_queue_size=2)

Here's the error I get:

Using TensorFlow backend.
Epoch 1/5
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 270, in __init__
    fetch, allow_tensor=True, allow_operation=True))
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2708, in as_graph_element
    return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 2787, in _as_graph_element_locked
    raise ValueError("Tensor %s is not an element of this graph." % obj)
ValueError: Tensor Tensor("IteratorGetNext:0", shape=(?, 1), dtype=int64) is not an element of this graph.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jsaporta/anaconda3/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/home/jsaporta/anaconda3/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/keras/utils/data_utils.py", line 568, in data_generator_task
    generator_output = next(self._generator)
  File "./datagen_test.py", line 25, in create_data_generator
    yield sess.run(next_val)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1109, in _run
    self._graph, fetches, feed_dict_tensor, feed_handles=feed_handles)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 413, in __init__
    self._fetch_mapper = _FetchMapper.for_fetch(fetches)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 233, in for_fetch
    return _ListFetchMapper(fetch)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 340, in __init__
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 340, in <listcomp>
    self._mappers = [_FetchMapper.for_fetch(fetch) for fetch in fetches]
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 241, in for_fetch
    return _ElementFetchMapper(fetches, contraction_fn)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 277, in __init__
    'Tensor. (%s)' % (fetch, str(e)))
ValueError: Fetch argument <tf.Tensor 'IteratorGetNext:0' shape=(?, 1) dtype=int64> cannot be interpreted as a Tensor. (Tensor Tensor("IteratorGetNext:0", shape=(?, 1), dtype=int64) is not an element of this graph.)

Traceback (most recent call last):
  File "./datagen_test.py", line 34, in <module>
    verbose=2, max_queue_size=2)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/jsaporta/anaconda3/lib/python3.6/site-packages/keras/engine/training.py", line 2011, in fit_generator
    generator_output = next(output_generator)
StopIteration

Strangely enough, adding a line containing next(datagen) directly after where I initialize datagen causes the code to run just fine, with no errors.
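
In other words, a sketch of the change that makes it run (the extra call is the only difference):

datagen = create_data_generator()
next(datagen)  # priming the generator once here, before fit_generator, avoids the error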

Why does my original code not work? Why does it begin to work when I add that line to my code? Is there a more efficient way to use TensorFlow's Dataset API with Keras that doesn't involve converting Tensors to NumPy arrays and back again?

jsaporta
  • I'm not sure if that's the reason, but I find it really strange that you define a function inside a `with` block. – Daniel Möller Sep 09 '17 at 23:32
  • Evidently, putting the `with` block inside the generator definition does make the code work both with and without the extra line, though I could have sworn I tried it that way first. Considering how (I think) TensorFlow `Session`s work, though, I don't see why it should make any difference. Another mystery. – jsaporta Sep 10 '17 at 01:02
  • Doesn't the with block close the session at its end? I think it's really not supposed to contain definitions that will be used outside of it.... If I post that as an answer to the question, would it be marked as answered? – Daniel Möller Sep 10 '17 at 01:04
  • I don't think that would answer the question. If we put `sess = tf.InteractiveSession()` at the top of the file and change the `with` block to `with sess.as_default()` (and have it inside the generator definition), we get the same error as before. Making the interactive-session change and removing the `with` block altogether (since an interactive session sets itself as the default session) also gives the same error. It's not clear to me that this is the crux of the problem. – jsaporta Sep 10 '17 at 01:18
  • I think it's really a "disconnection" of the graph. Once you transform a tensor into a numpy array, you lose the connection (it's not a tensor anymore). Is there a way to create parallel sessions? Maybe your generator should create sub-sessions inside it (independent from the session running the model), so that it does not expect a connection? – Daniel Möller Sep 10 '17 at 01:29
  • Or maybe you just run the iterator before running the model, and save the data as numpy arrays for loading later with a regular generator? – Daniel Möller Sep 10 '17 at 01:30

6 Answers

62

Update June 09, 2018

  • Starting from TensorFlow 1.9, you can pass a tf.data.Dataset object directly to keras.Model.fit(), and it behaves like fit_generator.
  • A complete example can be found in this gist.
# Load mnist training data
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
training_set = tfdata_generator(x_train, y_train, is_training=True)

model = ...  # your keras model here
model.fit(
    training_set.make_one_shot_iterator(),
    steps_per_epoch=len(x_train) // 128,
    epochs=5,
    verbose=1)
  • tfdata_generator is a function that returns an iterable tf.data.Dataset.
def tfdata_generator(images, labels, is_training, batch_size=128):
  '''Construct a data generator using `tf.data.Dataset`.'''

  def map_fn(image, label):
    '''Preprocess raw data into trainable input.'''
    x = tf.reshape(tf.cast(image, tf.float32), (28, 28, 1))
    y = tf.one_hot(tf.cast(label, tf.uint8), _NUM_CLASSES)  # _NUM_CLASSES: e.g. 10 for MNIST
    return x, y

  dataset = tf.data.Dataset.from_tensor_slices((images, labels))

  if is_training:
    dataset = dataset.shuffle(1000)  # depends on sample size
  dataset = dataset.map(map_fn)
  dataset = dataset.batch(batch_size)
  dataset = dataset.repeat()
  dataset = dataset.prefetch(tf.contrib.data.AUTOTUNE)

  return dataset

Old Solution:

In addition to @Yu-Yang's answer, you can also wrap a tf.data.Dataset in a Python generator for fit_generator, as follows:

from tensorflow.contrib.learn.python.learn.datasets import mnist

data = mnist.load_mnist()
model = ...  # your Keras model
model.fit_generator(generator=tfdata_generator(data.train.images, data.train.labels),
                    steps_per_epoch=200,
                    workers=0,  # This is important
                    verbose=1)


def tfdata_generator(images, labels, batch_size=128, shuffle=True):
    def map_func(image, label):
        '''A transformation function.'''
        # image_shape and num_classes are assumed to be defined elsewhere,
        # e.g. image_shape = (28, 28, 1) and num_classes = 10 for MNIST
        x_train = tf.reshape(tf.cast(image, tf.float32), image_shape)
        y_train = tf.one_hot(tf.cast(label, tf.uint8), num_classes)
        return [x_train, y_train]

    dataset = tf.data.Dataset.from_tensor_slices((images, labels))
    dataset = dataset.map(map_func)
    if shuffle:
        dataset = dataset.shuffle(buffer_size=1000)  # shuffle() requires a buffer size
    dataset = dataset.batch(batch_size).repeat()
    iterator = dataset.make_one_shot_iterator()

    next_batch = iterator.get_next()
    while True:
        yield K.get_session().run(next_batch)
Dat
  • This is AFAIK the only way to provide validation data to Keras, using the validation_data parameter of fit_generator – Warrick Jan 23 '18 at 08:50
  • When I used tfdata_generator() again in evaluate_generator() after training the model, I hit the issue that the same tfdata_generator() function cannot be run. It seems that "K.get_session()" cannot be run twice. Any thoughts on correcting this? – Leo5188 Apr 13 '18 at 00:51
  • The workers=0 line is very important. Basically, if workers > 0 you end up with multithreading issues, since multiple threads try to evaluate the same generator. If you call the generator once (initialize it), it will work because you have already created the endpoint, but you might end up with strange results since it is not thread-safe – kmader May 09 '18 at 09:56
  • The result of `K.get_session().run(next_batch)` will be a list of numpy arrays, no? I thought the idea was to avoid going back to the python layer and stay in tensorflow... – FirefoxMetzger May 11 '18 at 08:44
  • -1 ... like @FirefoxMetzger says, the question is how to do it without going through numpy arrays, which is exactly what you do. – Ciprian Tomoiagă May 23 '18 at 09:26
  • @Dat Nguyen, your solution gives me the error "AttributeError: 'Iterator' object has no attribute 'mdim'". I used TensorFlow 1.10, and my dataset is in tfrecords format – W. Sam Aug 12 '18 at 21:22
  • Is there a way to avoid `len(x_train)` in `steps_per_epoch=len(x_train) // 128`? I only have `training_set` at my disposal; all the data is not in memory. – Nitin Aug 30 '18 at 06:21
  • @Nitin This depends on the format of your `training_set`. If it's a tfrecord, then you'll have to iterate through the dataset and count the number of records. An example is here: [link](https://github.com/tensorflow/magenta/blob/master/magenta/common/sequence_example_lib.py); see the method `count_records` – khuang834 Nov 01 '18 at 06:47
  • @khuang834 Thanks for the link showing how to get the count. My question is why we even need to provide this if the dataset is a generator which can be exhausted (repeat=False) and thereby indicate the end of an epoch. – Nitin Nov 02 '18 at 07:06
  • Shouldn't it be `int(np.ceil(len(x_train) / 128))` instead of `len(x_train) // 128`? Otherwise, the number of steps does not account for the last batch if it is incomplete. – Kilian Obermeier Dec 16 '18 at 17:35
  • @dat-nguyen, with the updated version, how do I feed `validation_data`? Neither passing a pair of tensors nor a Dataset made from them works. – LOST Feb 05 '19 at 20:49
  • I am wondering how Keras is able to do 5 epochs when `make_one_shot_iterator()` only supports iterating once through a dataset? – SantoshGupta7 Mar 31 '19 at 19:02
  • Update: you do not have to call an iterator anymore https://stackoverflow.com/questions/55444615/the-established-way-to-use-tf-dataset-api-in-keras-is-to-feed-model-fit-with – SantoshGupta7 Apr 01 '19 at 20:55
  • Does the "Old Solution" work for everyone? For me, model.fit_generator just hangs in the first epoch. It seems the generator is not producing the yields properly. I just copy-pasted the code with `image_shape = (28, 28)` and `num_classes = 10` – coda Jan 24 '20 at 20:41
41

There is indeed a more efficient way to use a Dataset without having to convert the tensors into NumPy arrays. However, it is not (yet?) in the official documentation. According to the release notes, it is a feature introduced in Keras 2.0.7, so you may have to upgrade to keras>=2.0.7 in order to use it.

x = np.arange(4).reshape(-1, 1).astype('float32')
ds_x = Dataset.from_tensor_slices(x).repeat().batch(4)
it_x = ds_x.make_one_shot_iterator()

y = np.arange(5, 9).reshape(-1, 1).astype('float32')
ds_y = Dataset.from_tensor_slices(y).repeat().batch(4)
it_y = ds_y.make_one_shot_iterator()

input_vals = Input(tensor=it_x.get_next())
output = Dense(1, activation='relu')(input_vals)
model = Model(inputs=input_vals, outputs=output)
model.compile('rmsprop', 'mse', target_tensors=[it_y.get_next()])
model.fit(steps_per_epoch=1, epochs=5, verbose=2)

Several differences:

  1. Supply the tensor argument to the Input layer. Keras will read values from this tensor, and use it as the input to fit the model.
  2. Supply the target_tensors argument to Model.compile().
  3. Remember to convert both x and y into float32. Under normal usage, Keras will do this conversion for you. But now you'll have to do it yourself.
  4. Batch size is specified during the construction of Dataset. Use steps_per_epoch and epochs to control when to stop model fitting.

In short, use Input(tensor=...), model.compile(target_tensors=...) and model.fit(x=None, y=None, ...) if your data are to be read from tensors.

Yu-Yang
  • It looks like it's not even necessary to have two separate iterators. You can just zip up the two datasets, create a node like `next_val = it.get_next()`, and provide the elements of its output to the `Input()` and `Model.compile()` functions. – jsaporta Sep 10 '17 at 17:13
  • What about iterator initialisation? Can I somehow tell Keras to initialise it with each and every epoch? Or do I still need to create a session, do it manually, and then just run one epoch each time? – pfulop Oct 28 '17 at 12:15
  • So I have tried this approach, and it not only makes the model quite difficult to use (you have to save and reload it in order to give it inputs like validation data), but it also seems to be slower (https://www.kaggle.com/kmader/tensorflow-data-keras-tensors-retinopathy vs https://www.kaggle.com/kmader/inceptionv3-for-retinopathy-gpu-hr), from 10s per iteration (using session.run() and normal fit) to 15s per iteration (using Input(tensor=...) and target_tensors) – kmader May 10 '18 at 06:10
  • @yu-yang If we are to save a model created like this, when we load it, it has a normal `Input` layer. How can we bind a TF tensor to this input? – Ciprian Tomoiagă Jun 01 '18 at 13:22
  • I believe the best way is to use the same code to re-create the model, and then use `model.load_weights` (to load the weight file saved by `model.save_weights`). The [official example](https://github.com/keras-team/keras/blob/master/examples/mnist_dataset_api.py) also uses this approach to save/load the model. – Yu-Yang Jun 01 '18 at 14:13
  • Have you tried to incorporate logging to TensorBoard? It seems that without validation data it does not log to TensorBoard, which is very unfortunate. – Matěj Račinský Mar 01 '19 at 23:10
4

The other answers are good; however, it is important to note that using from_tensor_slices directly with large NumPy arrays can quickly fill up your memory because, IIRC, the values are copied into the graph as tf.constant nodes. In my experience, this causes a silent failure: training will eventually start, but no improvement in the loss etc. will occur.

A better way is to use placeholders. For example, here is my code to create a generator for images and their one-hot targets:

def create_generator_tf_dataset(self, images, onehots, batch_size, map_fn=None):
    # Get shapes
    img_size = images.shape
    img_size = (None, img_size[1], img_size[2], img_size[3])
    onehot_size = onehots.shape
    onehot_size = (None, onehot_size[1])

    # Placeholders, so the arrays are fed in rather than baked into the graph
    images_tensor = tf.placeholder(tf.float32, shape=img_size)
    onehots_tensor = tf.placeholder(tf.float32, shape=onehot_size)

    # Dataset
    dataset = tf.data.Dataset.from_tensor_slices((images_tensor, onehots_tensor))
    # Optional map function (e.g. augmentation)
    if map_fn is not None:
        dataset = dataset.map(lambda x, y: (map_fn(x), y), num_parallel_calls=tf.data.experimental.AUTOTUNE)
    # Combined shuffle and infinite repeat
    dataset = dataset.apply(
        tf.data.experimental.shuffle_and_repeat(len(images), None))
    dataset = dataset.batch(batch_size)
    dataset = dataset.prefetch(1)

    # Make the iterator
    iterator = dataset.make_initializable_iterator()
    init_op = iterator.initializer
    next_val = iterator.get_next()

    # Feed the actual arrays in when initializing the iterator
    with K.get_session().as_default() as sess:
        sess.run(init_op, feed_dict={images_tensor: images, onehots_tensor: onehots})
        while True:
            inputs, labels = sess.run(next_val)
            yield inputs, labels
geometrikal
2

@Yu-Yang's and @Dat Nguyen's solutions both work fine. It is possible to make @Yu-Yang's solution support a validation set during training as well, by using feedable iterators and passing the validation set's string handle as the validation "data". It's a bit convoluted, but it works.
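
For illustration, here is a rough sketch of the feedable-iterator idea (train_ds and val_ds stand in for your own pre-built tf.data.Dataset objects with matching structure; TF 1.x API):

import keras.backend as K
import tensorflow as tf

# A string-handle placeholder lets a single iterator node switch datasets
handle = tf.placeholder(tf.string, shape=[])
iterator = tf.data.Iterator.from_string_handle(
    handle, train_ds.output_types, train_ds.output_shapes)
next_batch = iterator.get_next()

train_iter = train_ds.make_one_shot_iterator()
val_iter = val_ds.make_initializable_iterator()

sess = K.get_session()
train_handle = sess.run(train_iter.string_handle())
val_handle = sess.run(val_iter.string_handle())
sess.run(val_iter.initializer)  # re-run this to restart validation passes
# Feed {handle: train_handle} during training steps and {handle: val_handle}
# during validation; next_batch then pulls from the corresponding dataset.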

You can also convert the Keras model to an Estimator; Estimators support datasets:

estimator = tf.keras.estimator.model_to_estimator(keras_model=model,
                                                  model_dir=model_dir)
input_name = model.layers[0].input.op.name

def input_fn(dataset):
    dataset = dataset.map(lambda X, y: ({input_name: X}, y))
    return dataset.make_one_shot_iterator().get_next()

train_spec = tf.estimator.TrainSpec(
    input_fn=lambda: input_fn(train_set), max_steps=100)
eval_spec = tf.estimator.EvalSpec(
    input_fn=lambda: input_fn(test_set))

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
MiniQuark
  • Can you please detail, with code maybe, how @Yu-Yang's solution can take validation data? I tried to pass the `validation_data` parameter, but Keras throws an error about it when using native tensors – Ciprian Tomoiagă May 28 '18 at 13:50
0

Here is a solution for the case where you are creating a TensorFlow Dataset using the pandas library. Note that this code will not work without tf.reshape(), since for some reason the tensors coming from tf.py_func() don't have shape information, so this approach doesn't work with tuples. Does anybody have a workaround?

def _get_input_data_for_dataset(file_name):
    df_input = pd.read_csv(file_name.decode(), usecols=['Wind_MWh'])
    X_data = df_input.as_matrix()
    return X_data.astype('float32', copy=False)

# file_names is assumed to be a list of CSV file paths defined elsewhere
X_dataset = tf.data.Dataset.from_tensor_slices(file_names)
X_dataset = X_dataset.flat_map(lambda file_name: tf.data.Dataset.from_tensor_slices(
    tf.reshape(tf.py_func(_get_input_data_for_dataset, [file_name], tf.float32), [-1, 1])))

X_dataset = X_dataset.batch(5)
X_iter = X_dataset.make_one_shot_iterator()
X_batch = X_iter.get_next()
input_X1 = Input(tensor=X_batch, name='input_X1')

y1 = Dense(units=64, activation='relu', kernel_initializer=tf.keras.initializers.Constant(1), name='layer_FC1')(input_X1)
siby
-3

One important observation from my recent experience: use tf.keras instead of standalone keras. It works for me with tf > 1.12.
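
As a minimal sketch of what I mean (toy data and model; assumes TF >= 1.13, where tf.keras's fit() accepts a Dataset directly):

import numpy as np
import tensorflow as tf

# Toy data; any tf.data.Dataset yielding (inputs, targets) batches works
x = np.random.rand(64, 1).astype('float32')
y = np.random.rand(64, 1).astype('float32')
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(8).repeat()

# tf.keras (not standalone keras) consumes the Dataset directly in fit()
model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation='relu')])
model.compile('rmsprop', 'mse')
model.fit(dataset, steps_per_epoch=8, epochs=2)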

Hope it can help others too.

Jason Liu