Questions tagged [tf.data.dataset]

145 questions
0 votes • 1 answer

Error: expected str, bytes or os.PathLike object, not Tensor

This code prepares image data for training a deep learning model. The classes variable is a list of object categories, and the label_data variable is a dictionary mapping image file names to their corresponding object categories. The…
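The usual cause of this error is calling a Python-level file API (open, PIL.Image.open, os.path functions) on the symbolic string Tensor that `Dataset.map` passes in. A minimal sketch of the TensorFlow-native fix, using a temporary PNG so it runs end to end (file names and shapes here are illustrative):

```python
import os
import tempfile

import tensorflow as tf

# Create one tiny PNG on disk so the sketch is runnable end to end.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "img_001.png")
tf.io.write_file(path, tf.io.encode_png(tf.zeros([8, 8, 3], tf.uint8)))

# Inside `map`, `p` is a symbolic tf.string Tensor, so Python file APIs
# raise "expected str, bytes or os.PathLike object, not Tensor".
# TensorFlow's own I/O ops accept string Tensors directly.
def load_image(p):
    data = tf.io.read_file(p)                   # file bytes as a Tensor
    image = tf.io.decode_png(data, channels=3)  # uint8, shape (H, W, 3)
    return tf.image.convert_image_dtype(image, tf.float32)

dataset = (tf.data.Dataset.from_tensor_slices([path])
           .map(load_image, num_parallel_calls=tf.data.AUTOTUNE))

for image in dataset:
    print(image.shape)  # (8, 8, 3)
```

If the filenames need to be joined with a directory inside the pipeline, `tf.strings.join` plays the role of `os.path.join` for Tensors.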
0 votes • 0 answers

UPDATE with code: tensor operation (slice, insert) to update inputs with outputs in a custom training loop

I want to make an LSTM model able to use its own forecast as one of the input features, in this way adding values for this feature but learning to deal with its own error too. A kind of partial auto-regression model. Note: for the ones who like all details,…
Jonathan Roy • 405 • 1 • 6 • 18
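Since TensorFlow tensors are immutable, "inserting" a forecast into one feature column of the next input window is usually done by slicing and re-concatenating rather than assigning in place. A toy sketch under assumed shapes (batch of 2 windows, 4 timesteps, 3 features, with feature 0 as the autoregressive one):

```python
import tensorflow as tf

# Hypothetical shapes: 2 windows, 4 timesteps, 3 features; assume feature 0
# is the one that should carry the model's own previous forecast.
x = tf.zeros([2, 4, 3])
forecast = tf.ones([2, 4, 1])  # stand-in for a previous model output

# Tensors cannot be assigned to, so "insert into a slice" is done by
# splitting along the feature axis and concatenating back together:
x_updated = tf.concat([forecast, x[:, :, 1:]], axis=-1)
print(x_updated.shape)  # (2, 4, 3)
```

Inside a custom training loop the same `tf.concat` pattern works under `tf.GradientTape`; `tf.tensor_scatter_nd_update` is an alternative when the slice to replace is not a contiguous prefix.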
0 votes • 1 answer

How to feed multi head tensorflow model with tf.data.dataset

I'm using tf.data.dataset for the first time to feed a model. I looked at some examples but couldn't find how to use multiple inputs with a 2-head model. My first input has shape [nb_samples, nb_timesteps, nb_features] to feed the first LSTM head. My second input…
Jonathan Roy • 405 • 1 • 6 • 18
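The standard pattern is to make the dataset yield a tuple of inputs as its first element: `((head1_input, head2_input), label)`. Keras then distributes the nested tuple across the model's Input layers in order. A small runnable sketch with made-up sizes:

```python
import numpy as np
import tensorflow as tf

# Hypothetical sizes: 32 samples, 10 timesteps, 4 features for the LSTM head,
# plus a 6-feature flat vector for the second head.
n, t, f = 32, 10, 4
x_seq = np.random.rand(n, t, f).astype("float32")
x_flat = np.random.rand(n, 6).astype("float32")
y = np.random.rand(n, 1).astype("float32")

# Yield ((head1_input, head2_input), label); Keras unpacks the inner tuple
# across the model's two Input layers in order.
ds = (tf.data.Dataset.from_tensor_slices(((x_seq, x_flat), y))
      .batch(8)
      .prefetch(tf.data.AUTOTUNE))

in_seq = tf.keras.Input(shape=(t, f))
in_flat = tf.keras.Input(shape=(6,))
h = tf.keras.layers.Concatenate()([tf.keras.layers.LSTM(8)(in_seq), in_flat])
out = tf.keras.layers.Dense(1)(h)
model = tf.keras.Model([in_seq, in_flat], out)
model.compile(optimizer="adam", loss="mse")
model.fit(ds, epochs=1, verbose=0)
```

The same nesting works with dictionaries keyed by Input names when positional order is error-prone.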
0 votes • 0 answers

How to save a ShuffleDataset object?

I created a dataset using the from_generator function from the tf.data.dataset API. Then I shuffle my dataset and divide it into batches of size 10, so the pipeline looks like dataset = dataset.shuffle(buffer_size=1000).batch(batch_size=10) (For…
gülsemin • 25 • 4
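One option, assuming a reasonably recent TensorFlow (the `Dataset.save`/`Dataset.load` methods; on versions before roughly 2.9 the same pair lives at `tf.data.experimental.save`/`load`), is to materialize the shuffled, batched pipeline to disk. Note that this freezes the particular shuffle order drawn at save time:

```python
import tempfile

import tensorflow as tf

ds = tf.data.Dataset.range(100).shuffle(buffer_size=100).batch(10)

# Dataset.save materializes the elements the pipeline produces, including
# the shuffle order drawn at save time, to a snapshot on disk.
path = tempfile.mkdtemp()
ds.save(path)
restored = tf.data.Dataset.load(path)

# The same elements come back, in the now-fixed shuffled order:
total = sum(int(tf.reduce_sum(batch)) for batch in restored)
print(total)  # 4950 = sum(range(100))
```

A generator-backed dataset must be finite for this to terminate; the snapshot stores concrete elements, not the generator itself.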
0 votes • 0 answers

tf.data.Dataset.from_generator takes long to initialize

I have a generator that I am trying to put into a tf.data.dataset. def static_syn_batch_generator( total_size: int, batch_size: int, start_random_seed:int=0, fg_seeds_ss:SampleSet=None, bg_seeds_ss:SampleSet=None,…
lr100 • 648 • 1 • 9 • 29
0 votes • 1 answer

tf.data.datasets set each batch (prefetch)

I am looking for help thinking through this. I have a function (not a generator) that will give me any number of samples. Let's say that all the data I want to train on (1000 samples) can't fit into memory. So I want to call this…
lr100 • 648 • 1 • 9 • 29
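A common way to wrap such a sample-producing function is a thin generator fed to `from_generator`, then `batch` plus `prefetch` so only a batch or two is ever resident in memory. A sketch with a hypothetical `get_sample` standing in for the question's function:

```python
import tensorflow as tf

# Hypothetical stand-in for the question's non-generator sample source:
# returns one (features, label) pair per call.
def get_sample(i):
    return [float(i), float(i) * 2.0], float(i % 2)

def sample_gen():
    for i in range(1000):  # 1000 samples, never all in memory at once
        yield get_sample(i)

ds = tf.data.Dataset.from_generator(
    sample_gen,
    output_signature=(
        tf.TensorSpec(shape=(2,), dtype=tf.float32),
        tf.TensorSpec(shape=(), dtype=tf.float32),
    ),
)

# Batch lazily; prefetch overlaps producing the next batch with training
# on the current one.
ds = ds.batch(32).prefetch(tf.data.AUTOTUNE)

x, y = next(iter(ds))
print(x.shape, y.shape)  # (32, 2) (32,)
```

`prefetch` controls pipeline overlap, not batch contents; the number of samples held at once is governed by the batch size and the prefetch buffer.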
0 votes • 0 answers

tf.dataset becomes empty before model.fit is called

I am building a Keras model. The features come from a pandas.DataFrame. I build the tf.Dataset through the from_generator API. I followed this page to process the categorical string features. output_sig= ... features = [...] def iter_to_gen(it): …
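A frequent cause of an "empty" `from_generator` dataset is wrapping an already-consumed iterator: the first full pass (e.g. during feature preprocessing) exhausts it, and `model.fit` then sees nothing. Passing a generator function lets tf.data restart the stream on every iteration. A toy sketch:

```python
import tensorflow as tf

rows = [([0.0, 1.0], 0.0), ([1.0, 0.0], 1.0)]

# Pass a generator FUNCTION, not an iterator object. tf.data calls it
# afresh each time the dataset is iterated, so the data is still there
# when model.fit makes its pass (and on every later epoch).
def gen():
    for features, label in rows:
        yield features, label

ds = tf.data.Dataset.from_generator(
    gen,
    output_signature=(tf.TensorSpec((2,), tf.float32),
                      tf.TensorSpec((), tf.float32)),
).batch(2)

# Two full iterations both see data:
print([len(list(ds)) for _ in range(2)])  # [1, 1]
```

If the source truly is a one-shot iterator (e.g. `DataFrame.itertuples()` captured once), recreate it inside the generator function instead of closing over it.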
0 votes • 0 answers

Memory Leak with tf.Dataset

I have built a pipeline to read my data from file into a tf.data.Dataset. The issue is that memory accumulates with each epoch; after a while the training is killed. I have tried reducing the number of images shuffled and tweaking the number of parallel…
El_Loco • 1,716 • 4 • 20 • 35
0 votes • 0 answers

Custom loss with weights when using BatchDataset in Keras

I have a Keras model for which I have features, labels and an additional array which I want to use as weights for a custom loss function. I am ingesting the data using 2 BatchDataset structures as follows: one containing the features and…
gdstoica • 29 • 1 • 4
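Rather than keeping two parallel BatchDatasets in sync, Keras accepts a single dataset yielding `(features, labels, sample_weight)` triples and routes the third element into the loss as per-sample weights. A minimal sketch with made-up data:

```python
import numpy as np
import tensorflow as tf

x = np.random.rand(16, 3).astype("float32")
y = np.random.rand(16, 1).astype("float32")
w = np.linspace(0.1, 1.0, 16).astype("float32")  # per-sample weights

# One dataset of (features, labels, weights) triples; Keras passes the
# third element as sample_weight when computing the loss.
ds = tf.data.Dataset.from_tensor_slices((x, y, w)).batch(4)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="sgd", loss="mse")
model.fit(ds, epochs=1, verbose=0)
```

If the weights must interact with the loss in a non-multiplicative way, a custom `loss` or a subclassed `train_step` is needed instead; the triple form covers the standard weighted-mean case.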
0 votes • 1 answer

Convert a dense Tensor to a tf.one_hot Tensor in graph execution (TensorFlow)

TF version: 2.11. I am trying to train a simple 2-input classifier with a TFRecords tf.data pipeline. I do not manage to convert the dense Tensor containing only a scalar into a tf.one_hot vector. # get all recorddatasets abspath training_names=…
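`tf.one_hot` works fine in graph mode; the usual snag is that a scalar parsed from a TFRecord arrives as a float or as a shape-(1,) tensor, while `tf.one_hot` wants an integer index. A sketch of the cast-and-squeeze step (the class count is a hypothetical placeholder):

```python
import tensorflow as tf

NUM_CLASSES = 4  # hypothetical class count

@tf.function  # graph execution, as inside a tf.data map function
def to_one_hot(label):
    # A parsed TFRecord scalar is often float32 and/or shape (1,);
    # squeeze to a scalar and cast to an integer index first.
    idx = tf.cast(tf.squeeze(label), tf.int32)
    return tf.one_hot(idx, depth=NUM_CLASSES)

print(to_one_hot(tf.constant([2.0])))  # [0. 0. 1. 0.]
```

The same function can be used directly as `dataset.map(lambda x, y: (x, to_one_hot(y)))` inside the pipeline.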
0 votes • 1 answer

Delete rows with NaN in a tensorflow dataset

Is there a way, inside a TensorFlow dataset, to delete rows containing a NaN, like this in pandas? ds = ds[~np.isnan(ds).any(axis=1)] My test example is: simple_data_samples = np.array([ [1, 11, 111, -1, -11], [2, np.nan, 222, -2, -22], [3,…
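The tf.data equivalent of the pandas one-liner is a `filter` with `tf.math.is_nan`. A runnable sketch on the question's own test array (abbreviated to three rows):

```python
import numpy as np
import tensorflow as tf

simple_data_samples = np.array([
    [1, 11, 111, -1, -11],
    [2, np.nan, 222, -2, -22],
    [3, 33, 333, -3, -33],
], dtype=np.float32)

ds = tf.data.Dataset.from_tensor_slices(simple_data_samples)

# Equivalent of ds[~np.isnan(ds).any(axis=1)]: keep only rows in which
# no element is NaN.
ds = ds.filter(
    lambda row: tf.logical_not(tf.reduce_any(tf.math.is_nan(row))))

print([float(row[0]) for row in ds])  # [1.0, 3.0]
```

Unlike the pandas version this is lazy: rows are dropped as the dataset is iterated, so it also works when the data never fits in memory.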
0 votes • 1 answer

How to clean NaN in a tf.data.Dataset of multivariate sequence inputs for an LSTM

I am trying to feed a huge dataset (out of memory) to my LSTM model. I want to apply some transformations to my data using tf.data.Dataset. I first turn my numpy data into a dataset using tf.keras.utils.timeseries_dataset_from_array. This is an example of my…
0 votes • 1 answer

How to use tf.data to preprocess an n-sample dataset and generate a 100*n-sample dataset given memory limitations (using .from_generator()?)

I have a dataset containing 100 samples with dimensions (5000, 2), meaning the initial dataset shape is (100, 5000, 2) (numbers assumed to make the example clear; the intended dataset is much bigger than that). Now each of the samples is pre-processed…
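Besides `from_generator`, `flat_map` handles this one-to-many expansion natively: the mapped function returns a small Dataset per input sample, so only one sample's expansion is materialized at a time. A toy sketch with shrunk shapes and a hypothetical preprocessing step:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in: 5 samples of shape (6, 2) instead of 100 of (5000, 2).
base = np.random.rand(5, 6, 2).astype("float32")

EXPANSION = 10  # hypothetical: each sample yields 10 processed samples

def expand(sample):
    # Hypothetical preprocessing producing EXPANSION variants per sample.
    # Returning a Dataset from flat_map keeps only one sample's expansion
    # in memory at a time.
    variants = tf.stack([sample * tf.cast(k + 1, tf.float32)
                         for k in range(EXPANSION)])
    return tf.data.Dataset.from_tensor_slices(variants)

ds = tf.data.Dataset.from_tensor_slices(base).flat_map(expand)

print(sum(1 for _ in ds))  # 50 = 5 * 10
```

`interleave` is the parallel variant of the same idea when the per-sample preprocessing is expensive.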
0 votes • 1 answer

How to efficiently filter a specific number of entries and concatenate them into a single tf.data.Dataset?

I have a huge TFRecord file with more than 4M entries. It is a very unbalanced dataset, containing many entries of some labels and few of others, compared to the whole dataset. I want to filter a limited number of entries of some of these labels in…
Marlon Teixeira • 334 • 1 • 14
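A common pattern is one `filter(...).take(k)` branch per label, then merging the branches; `take` stops pulling from a branch as soon as its quota is met, so the whole file is not scanned per label unless a label is rare. A toy sketch with a synthetic labeled stream and an assumed per-label cap:

```python
import tensorflow as tf

# Toy labeled stream: (feature, label) pairs with labels 0..2 repeating,
# standing in for the TFRecord dataset.
ds = tf.data.Dataset.range(30).map(lambda i: (i, i % 3))

PER_LABEL = 4  # hypothetical cap per label

# One filter+take branch per wanted label; take() stops reading a branch
# once its quota is filled. concatenate (or sample_from_datasets, for an
# interleaved result) merges the branches.
subsets = [
    ds.filter(lambda x, y, lbl=lbl: tf.equal(y, lbl)).take(PER_LABEL)
    for lbl in (0, 1)
]
limited = subsets[0].concatenate(subsets[1])

print(sum(1 for _ in limited))  # 8
```

For very rare labels each branch still reads until it finds k matches, so pre-splitting the TFRecords by label is the more scalable fix.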
0 votes • 0 answers

Filtering tf.dataset results in an endless process

I'm filtering the dataset according to certain labels. Once I call the filtering method, everything is fine. But once I call next(iter(dataset)), for certain values it keeps processing for more than 12 hours, while for other values it just gives the result. My…
Marlon Teixeira • 334 • 1 • 14