I am using the from_generator function in tf.data.Dataset to load my data of 9000 samples, but it yields only the first 256 elements and then repeats them to fill 9000 samples.
import tensorflow as tf

z = list(range(9000))  # 9000 is the length of my dataset

def gen():
    # Yield every index of the dataset, one at a time
    for idx in z:
        yield idx

dataset = tf.data.Dataset.from_generator(gen, tf.uint8)
for step, sample in enumerate(dataset):
    print(step)
    print(sample)
Expected behavior:
0
tf.Tensor(0, shape=(), dtype=uint8)
...
8999
tf.Tensor(8999, shape=(), dtype=uint8)
Actual behavior:
0
tf.Tensor(0, shape=(), dtype=uint8)
1
tf.Tensor(1, shape=(), dtype=uint8)
...
255
tf.Tensor(255, shape=(), dtype=uint8)
256
tf.Tensor(0, shape=(), dtype=uint8)
...
It looks as if I am filling some kind of buffer of length 256, but I am not sure. I would appreciate any help!
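One possible explanation worth checking, offered as a sketch rather than a confirmed diagnosis: the values restart at exactly 256, which is the number of distinct values an unsigned 8-bit integer can hold (0 to 255), and the dataset above is declared with tf.uint8. The snippet below does not use TensorFlow at all. It relies only on Python's standard ctypes module and a hypothetical helper, as_uint8, to show that squeezing the indices 0 to 8999 into 8 bits reproduces the pattern in the actual output.

```python
import ctypes


def as_uint8(value: int) -> int:
    """Return `value` as it would be stored in an unsigned 8-bit integer.

    Hypothetical helper for illustration only: ctypes.c_uint8 keeps the
    low 8 bits of the input, so anything above 255 wraps around.
    """
    return ctypes.c_uint8(value).value


z = list(range(9000))  # same indices as in the question
wrapped = [as_uint8(idx) for idx in z]

# Step 255 is still intact, step 256 wraps to 0, and the last step
# (8999) ends up as 8999 % 256.
print(wrapped[255], wrapped[256], wrapped[8999])  # prints: 255 0 39
```

If this is what is happening, there is no buffer involved, and declaring a wider type such as tf.int32 in the from_generator call would be the first thing to try.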