I recently started using the tensorflow.contrib.learn (skflow) library and really like it. However, I have run into an issue with Estimator: its fit function accepts either

  1. (X, Y, and batch_size): the problem with this approach is that it offers no way to specify the number of epochs and does not allow an arbitrary data source.
  2. input_fn: besides letting me set the number of epochs, it gives me much more flexibility over the source of training data (which in my case comes directly from a database).

Now, I am aware that I could create an input_fn that reads files. However, since I am not interested in dealing with files, the following functions are not useful to me:

  • tf.contrib.learn.read_batch_examples
  • tf.contrib.learn.read_batch_features
  • tf.contrib.learn.read_batch_record_features

Ideally, I would like to use StreamingDataFeeder as the input_fn. Any ideas on how I can achieve this?

Abhi
  • This discussion is meanwhile moving forward on https://groups.google.com/a/tensorflow.org/forum/#!topic/discuss/ZEzEa1TyYuE – Abhi Oct 05 '16 at 10:52

1 Answer

StreamingDataFeeder is used when you pass iterators as x / y to the fit/predict/evaluate methods of Estimator.

Example:

import numpy as np
import tensorflow as tf

# Generators yield one example at a time (Python 2; use range on Python 3).
x = (np.array([i]) for i in xrange(10**10))
y = (np.array([i + 1]) for i in xrange(10**10))
lr = tf.contrib.learn.LinearRegressor(
    feature_columns=[tf.contrib.layers.real_valued_column('')])

# Only consumes 1000 * 10 values from the iterators.
lr.fit(x, y, steps=1000, batch_size=10)
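
Since the question is about data coming from a database, the same iterator approach could be fed from a query instead of a counter. The following is only a rough sketch under stated assumptions: it uses an in-memory SQLite database as a stand-in for the real one, and the `samples` table, its `feature` and `label` columns, and the helper names `make_demo_db` and `stream_column` are all hypothetical. It only shows how to build the two iterators; the `fit` call is left as a comment and assumes the contrib API shown above.

```python
import sqlite3


def make_demo_db():
    # In-memory SQLite database standing in for the real one
    # (hypothetical schema: one feature column, one label column).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE samples "
        "(id INTEGER PRIMARY KEY, feature REAL, label REAL)")
    conn.executemany(
        "INSERT INTO samples (feature, label) VALUES (?, ?)",
        [(float(i), float(i + 1)) for i in range(100)])
    conn.commit()
    return conn


def stream_column(conn, column):
    # Lazily yield one single-element list per row, ordered by id so that
    # the feature and label streams stay aligned row by row.
    if column not in ("feature", "label"):
        raise ValueError("unexpected column: %r" % column)
    cursor = conn.execute(
        "SELECT %s FROM samples ORDER BY id" % column)
    for (value,) in cursor:
        yield [value]


conn = make_demo_db()
x = stream_column(conn, "feature")
y = stream_column(conn, "label")

# Assumed usage with the estimator from the example above: wrap each item
# in np.array and pass the iterators to fit, e.g.
#   lr.fit((np.array(v) for v in x), (np.array(v) for v in y),
#          steps=10, batch_size=10)
```

Each stream runs its own query, so x and y can be consumed independently; the `ORDER BY id` clause is what keeps features and labels paired.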

If you want to use input_fn to feed data, you need to use graph operations to read and process it. For example, you can create a C++ operation that produces your data (for instance an Op that listens on a port or reads from a database) and converts it into a Tensor. This mechanism is mainly used for reading data from files, but other readers can be implemented as well.

ilblackdragon
  • Thanks for the reply. I would like to implement my own reader, but I am lost on where to begin. Can you point me in the right direction, please? – Abhi Oct 12 '16 at 14:13
  • Here is the `TFRecordReader` implementation: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/tf_record_reader_op.cc. In other words, you need to implement a class that inherits from `ReaderBase` and an Op that you can then call in the input function. – ilblackdragon Oct 19 '16 at 18:38