Tensorflow Datasets - what's the point

Question

I'm hoping someone can guide me in the right direction.

I am trying to feed input variables (features) and labels to tf.estimator.DNNClassifier, and TensorFlow keeps recommending that I use TensorFlow datasets instead of reading the data from a pandas DataFrame (using tf.estimator.inputs.pandas_input_fn()).

The issue is that I need to read my CSV file into a DataFrame first so I can apply a lot of transformations before feeding the data into the DNN. As I understand from this blog post, a TensorFlow dataset wants to read the data directly from a CSV file, for reasons that make sense.

So will I then have to write the transformed data to another CSV file just so I can re-import it into a TensorFlow dataset? That doesn't make any sense. Is there a good guide that I can read? I'm frustrated.

You can still do those transformations in pandas and then pass the result to a tf.data dataset at the end. Look at [this thread](https://stackoverflow.com/a/50647478/10111931) on SO to get an idea of how to do this. Alternatively, you can read the CSV files with [make_csv_dataset](https://www.tensorflow.org/api_docs/python/tf/data/experimental/make_csv_dataset) and do all the preprocessing with tf.data methods. How easy that is depends on the type of transformations you need. - kvish, Feb 25 '19 at 17:25
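To make the first suggestion concrete, here is a minimal sketch of doing the transformations in pandas and then wrapping the result in a tf.data dataset, with no second CSV written. It assumes TensorFlow 2.x with eager execution. The CSV contents, the column names (`height`, `weight`, `label`), the derived `ratio` column, and the helpers `load_and_transform` and `make_input_fn` are all made up for illustration; they are not from the question or the linked thread.

```python
import io

import pandas as pd
import tensorflow as tf

# Hypothetical CSV contents standing in for the real file on disk.
CSV_TEXT = """height,weight,label
1.0,10.0,0
2.0,20.0,1
3.0,30.0,0
4.0,40.0,1
"""


def load_and_transform(csv_source):
    """Read the CSV into pandas and apply the transformations there."""
    df = pd.read_csv(csv_source)
    # Placeholder transformation; replace with the real preprocessing.
    df["ratio"] = df["weight"] / df["height"]
    return df


def make_input_fn(df, label_column, batch_size=2, shuffle=True, epochs=1):
    """Build an input function that serves the in-memory DataFrame."""
    # One NumPy array per feature column, keyed by column name.
    features = {
        name: df[name].to_numpy() for name in df.columns if name != label_column
    }
    labels = df[label_column].to_numpy()

    def input_fn():
        # Slice the in-memory arrays row by row into (features, label) pairs.
        ds = tf.data.Dataset.from_tensor_slices((features, labels))
        if shuffle:
            ds = ds.shuffle(buffer_size=len(df))
        return ds.repeat(epochs).batch(batch_size)

    return input_fn


df = load_and_transform(io.StringIO(CSV_TEXT))
input_fn = make_input_fn(df, label_column="label", shuffle=False)
```

An estimator's `train` method takes an `input_fn` that returns a dataset of `(features, labels)` pairs, so in TensorFlow versions that still ship `tf.estimator`, something like `classifier.train(input_fn=input_fn)` should work, provided the feature columns use the same names as the dictionary keys. The data stays in memory the whole time, so this only suits data that fits in RAM.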
