TensorFlow's `tf.data` module provides a functional API for building input pipelines, using the `tf.data.Dataset` and `tf.data.Iterator` classes.
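As a rough illustration of that functional style, here is a minimal sketch of a pipeline. It assumes TensorFlow 2.x with eager execution, and the input values and the doubling transformation are made up for the example; none of it comes from the questions listed below.

```python
import tensorflow as tf

# Assumption: TensorFlow 2.x (eager execution), so the dataset can be
# iterated directly in Python without creating an explicit iterator.
dataset = (
    tf.data.Dataset.from_tensor_slices([1, 2, 3, 4, 5, 6])  # one element per value
    .map(lambda x: x * 2)                                   # element-wise transformation
    .batch(2)                                               # group consecutive elements
)

# Convert each batch tensor to a plain Python list for inspection.
batches = [batch.numpy().tolist() for batch in dataset]
print(batches)
```

Each transformation returns a new `Dataset`, which is why the calls can be chained.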
Questions tagged [tensorflow-datasets]
2091 questions
130 votes, 6 answers
Meaning of buffer_size in Dataset.map, Dataset.prefetch and Dataset.shuffle
As per the TensorFlow documentation, the prefetch and map methods of the tf.contrib.data.Dataset class both have a parameter called buffer_size.
For the prefetch method, the parameter is known as buffer_size, and according to the documentation:
buffer_size: A…

Ujjwal
94 votes, 5 answers
What is the difference between Dataset.from_tensors and Dataset.from_tensor_slices?
I have a dataset represented as a NumPy matrix of shape (num_features, num_examples) and I wish to convert it to the TensorFlow type tf.data.Dataset.
I am struggling to understand the difference between these two methods: Dataset.from_tensors and…

Llewlyn
70 votes, 10 answers
Split a dataset created by the TensorFlow dataset API into Train and Test?
Does anyone know how to split a dataset created by the dataset API (tf.data.Dataset) in Tensorflow into Test and Train?

Dani
66 votes, 19 answers
tf.data.Dataset: how to get the dataset size (number of elements in an epoch)?
Let's say I have defined a dataset in this way:
filename_dataset = tf.data.Dataset.list_files("{}/*.png".format(dataset))
How can I get the number of elements that are inside the dataset (hence, the number of single elements that compose an…

nessuno
62 votes, 2 answers
tf.data with multiple inputs / outputs in Keras
For applications such as pairwise text similarity, the input data looks like: pair_1, pair_2. In these problems, we usually have multiple inputs. Previously, I implemented my models successfully:
model.fit([pair_1, pair_2], labels,…

Amir
53 votes, 13 answers
How to extract data/labels back from TensorFlow dataset
There are plenty of examples of how to create and use TensorFlow datasets, e.g.
dataset = tf.data.Dataset.from_tensor_slices((images, labels))
My question is how to get the data/labels back from the TF dataset in NumPy form. In other words, what would…

Valentin
50 votes, 4 answers
TensorFlow: training on my own image
I am new to TensorFlow. I am looking for help with image recognition, where I can train on my own image dataset.
Is there any example of training on a new dataset?

VICTOR
42 votes, 3 answers
parallelising tf.data.Dataset.from_generator
I have a non-trivial input pipeline that from_generator is perfect for...
dataset = tf.data.Dataset.from_generator(complex_img_label_generator,
(tf.int32, tf.string))
dataset = dataset.batch(64)
iter =…

mat kelcey
41 votes, 2 answers
Tensorflow tf.data AUTOTUNE
I was reading the Data Loading section of the TF performance guide. For prefetch it says,
The tf.data API provides a software pipelining mechanism through the
tf.data.Dataset.prefetch transformation, which can be used to decouple
the time when…

dgumo
41 votes, 2 answers
How do I split Tensorflow datasets?
I have a tensorflow dataset based on one .tfrecord file. How do I split the dataset into test and train datasets? E.g. 70% Train and 30% test?
Edit:
My Tensorflow Version: 1.8
I've checked, there is no "split_v" function as mentioned in the possible…

Lukas Hestermeyer
41 votes, 7 answers
TensorFlow: logits and labels must have the same first dimension
I am new to TensorFlow and I want to adapt the MNIST tutorial https://www.tensorflow.org/tutorials/layers to my own data (images of 40x40).
This is my model function :
def cnn_model_fn(features, labels, mode):
# Input Layer
…

Geoffrey Pruvost
40 votes, 1 answer
How to create only one copy of graph in tensorboard events file with custom tf.Estimator?
I'm using a custom tf.Estimator object to train a neural network. The problem is the size of the events file after training: it is unreasonably large.
I've already solved the problem with saving part of a dataset as constant by using…

Andrii Zadaianchuk
33 votes, 5 answers
How to get the string value out of a tf.Tensor whose dtype is string
I want to use the tf.data.Dataset.list_files function to feed my datasets.
But because the files are not images, I need to load them manually.
The problem is that tf.data.Dataset.list_files passes each value as a tf.Tensor, and my Python code cannot handle tensors.
How…

Ko Ohhashi
32 votes, 4 answers
Difference between tf.data.Dataset.map() and tf.data.Dataset.apply()
With the recent upgrade to version 1.4, Tensorflow included tf.data in the library core.
One "major new feature" described in the version 1.4 release notes is tf.data.Dataset.apply(), which is a "method for applying custom transformation functions".…

GPhilo
29 votes, 4 answers
How to input a list of lists with different sizes in tf.data.Dataset
I have a long list of lists of integers (representing sentences, each of a different size) that I want to feed using the tf.data library. Each inner list has a different length, and I get an error, which I can reproduce here:
t =…

Escachator