tf.data.Dataset.zip: Can we have some alternative method of tf.data.Dataset.zip?

Question

When tf.data.Dataset.zip is used to zip two datasets, it pairs each element of the first dataset with the element at the same index in the second dataset.

a = tf.data.Dataset.range(1, 4)  # ==> [ 1, 2, 3 ]
b = tf.data.Dataset.range(4, 7)  # ==> [ 4, 5, 6 ]
ds = tf.data.Dataset.zip((a, b))
list(ds.as_numpy_iterator())  # ==> [(1, 4), (2, 5), (3, 6)]

This yields only one pairing per index: (1, 4), then (2, 5), then (3, 6). How can all possible combinations be generated instead, i.e. (1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)?

AloneTogether · Answer 1 · 2022-07-02T12:08:48.457

A pure TensorFlow approach without loops could look like this:

import tensorflow as tf

a = tf.data.Dataset.range(1, 4)
b = tf.data.Dataset.range(4, 7)
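repeats = 3  # both datasets have 3 elements here
# Turn b into a dataset whose every element is the full vector [4, 5, 6], one copy per element of a.
b = b.repeat(repeats).window(repeats, shift=repeats).flat_map(lambda x: x.batch(repeats))
# Pair each x in a with that vector and build a nested dataset of rows [x, y].
ds = tf.data.Dataset.zip((a, b)).map(lambda x, y: tf.data.Dataset.from_tensor_slices(tf.stack([tf.broadcast_to(x, (repeats,)), y], axis=1)))
# Flatten the nested datasets and unpack each row into an (x, y) tuple.
ds = ds.flat_map(lambda x: x.batch(1).map(lambda y: (y[0][0], y[0][1])))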

list(ds.as_numpy_iterator())

[(1, 4), (1, 5), (1, 6), (2, 4), (2, 5), (2, 6), (3, 4), (3, 5), (3, 6)]
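
For comparison, the same pairs can probably be produced more compactly by nesting map inside flat_map. This is a minimal sketch rather than part of the answer above; it assumes eager TensorFlow 2.x, and the name pairs is only introduced here for illustration.

```python
import tensorflow as tf

a = tf.data.Dataset.range(1, 4)  # 1, 2, 3
b = tf.data.Dataset.range(4, 7)  # 4, 5, 6

# For every element x of a, iterate over all of b and pair each y with x.
# flat_map then concatenates the per-x datasets into one flat dataset.
ds = a.flat_map(lambda x: b.map(lambda y: (x, y)))

# Convert the NumPy scalars to plain ints for easy inspection.
pairs = [(int(x), int(y)) for x, y in ds.as_numpy_iterator()]
print(pairs)
```

This version does not need to know the dataset sizes in advance, since b is simply traversed once for each element of a.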

score 1 · Answer 2 · answered Jul 02 '22 at 11:51

You could use a list comprehension:

a = tf.data.Dataset.range(1, 4)  # ==> [ 1, 2, 3 ]
b = tf.data.Dataset.range(4, 7)  # ==> [ 4, 5, 6 ]
d = tf.data.Dataset.from_tensor_slices([(x, y) for x in a for y in b])
for el in d:
  print(el)

Output

tf.Tensor([1 4], shape=(2,), dtype=int64)
tf.Tensor([1 5], shape=(2,), dtype=int64)
tf.Tensor([1 6], shape=(2,), dtype=int64)
tf.Tensor([2 4], shape=(2,), dtype=int64)
tf.Tensor([2 5], shape=(2,), dtype=int64)
tf.Tensor([2 6], shape=(2,), dtype=int64)
tf.Tensor([3 4], shape=(2,), dtype=int64)
tf.Tensor([3 5], shape=(2,), dtype=int64)
tf.Tensor([3 6], shape=(2,), dtype=int64)
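
Note that the list comprehension builds every pair in Python memory before the dataset is created, and each element comes out as a single tensor of shape (2,) rather than a tuple. If tuple elements are wanted, a map can split them. The following is a sketch under those assumptions; it iterates with as_numpy_iterator instead of over the datasets directly, and the name result is hypothetical.

```python
import tensorflow as tf

a = tf.data.Dataset.range(1, 4)
b = tf.data.Dataset.range(4, 7)

# Materialize every pair in Python first (reasonable for small datasets only).
pairs = [(x, y) for x in a.as_numpy_iterator() for y in b.as_numpy_iterator()]
d = tf.data.Dataset.from_tensor_slices(pairs)  # each element has shape (2,)

# Split each shape-(2,) element into a tuple of two scalars.
d = d.map(lambda p: (p[0], p[1]))

result = [(int(x), int(y)) for x, y in d.as_numpy_iterator()]
print(result)
```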
