I have a RaggedTensor whose row lengths range from 1 up to 10k. I would like to select elements from it at random, with an upper limit on the number selected per row, in a scalable way. For example:

vect = [[1,2,3],[4,5],[6],[7,8,9,10,11,12,13]]
limit = 3
sample(vect, limit)

-> output: [[1,2,3],[4,5],[6],[7,9,11]]

My idea was to select every element when len_row <= limit and to sample randomly otherwise. Can this be done with less than batch_size complexity using TensorFlow operations?

Nodiz
  • How do you get `[7,9,11]` in your output? – AloneTogether Feb 11 '22 at 07:00
  • Picking 3 randomly, could be `[9,10,13]` if you want – Nodiz Feb 11 '22 at 10:27
  • You want to sample from every row? – AloneTogether Feb 11 '22 at 10:28
  • Exactly. The code should be in TensorFlow and executable in graph mode. This is where I'm stuck – Nodiz Feb 11 '22 at 10:36
  • Yes, it works! Thank you – Nodiz Feb 11 '22 at 23:23
  • Related to your answer, I had to add `@tf.function(experimental_relax_shapes=True)`; otherwise tracing is re-executed every time the function is called with different shapes – Nodiz Feb 11 '22 at 23:24
  • Hey, @AloneTogether, I wanted to follow up because I actually ran into an issue with TensorFlow (2.7). Setting up the sampling with the map in graph mode throws a `ValueError: as_list() is not defined on an unknown TensorShape.` exception. I tried removing the tf.function decorator and moving the sample function inside and outside the class, but with no change. It seems TensorFlow cannot infer the shapes for graph execution (it runs in Jupyter and in init). Do you know if there is a problem with map_fn, graph execution and sparse tensors? – Nodiz Feb 14 '22 at 00:41
  • I added `fn_output_signature=tf.RaggedTensorSpec(ragged_rank=0, dtype=tf.int32)`, but that did not fix it – Nodiz Feb 14 '22 at 01:19
  • You have to add all dimensions to the `tf.RaggedTensorSpec`. Check https://stackoverflow.com/questions/70557245/how-to-slice-according-to-batch-in-the-tensorflow-array/70558892#70558892 or https://stackoverflow.com/questions/70419275/apply-linear-algebra-to-ragged-tensor-in-tensorflow/70420897#70420897 – AloneTogether Feb 14 '22 at 06:24
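
For readers hitting the same shape error, here is a minimal sketch of what the last comment seems to suggest: passing a `tf.RaggedTensorSpec` that includes the per-row shape (`shape=[None]`) as `fn_output_signature`. The wrapper name `sample_rows` and the `input_signature` are assumptions added for illustration (the latter is one way to avoid retracing when the input shape changes); whether this resolves the exact `ValueError` above may depend on the surrounding code.

```python
import tensorflow as tf

vect = tf.ragged.constant([[1, 2, 3], [4, 5], [6], [7, 8, 9, 10, 11, 12, 13]])

def sample(x, samples=3):
    # x is one row of the ragged tensor, seen here as a dense 1-D tensor.
    length = tf.shape(x)[0]
    return tf.cond(
        tf.less_equal(length, samples),
        lambda: x,  # short row: keep everything
        lambda: tf.gather(x, tf.random.shuffle(tf.range(length))[:samples]),
    )

# Hypothetical wrapper: a fixed input signature so that ragged inputs of
# different sizes reuse the same trace.
@tf.function(input_signature=[tf.RaggedTensorSpec(shape=[None, None], dtype=tf.int32)])
def sample_rows(rt):
    # Each mapped result is a 1-D int32 tensor of unknown length, so the
    # output spec carries shape=[None] rather than leaving the shape unset.
    return tf.map_fn(
        sample,
        rt,
        fn_output_signature=tf.RaggedTensorSpec(shape=[None], dtype=tf.int32),
    )

result = sample_rows(vect)
print(result)
```

The first three rows come back unchanged; the last row holds three randomly chosen elements and differs between runs.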

1 Answer

You can try using tf.map_fn in graph mode:

import tensorflow as tf

vect = tf.ragged.constant([[1,2,3],[4,5],[6],[7,8,9,10,11,12,13]])

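@tf.function
def sample(x, samples=3):
  # x is a single row of the ragged tensor
  length = tf.shape(x)[0]
  # Keep short rows whole; otherwise gather `samples` random positions
  x = tf.cond(tf.less_equal(length, samples), lambda: x, lambda: tf.gather(x, tf.random.shuffle(tf.range(length))[:samples]))
  return x

c = tf.map_fn(sample, vect)
print(c)
# <tf.RaggedTensor [[1, 2, 3], [4, 5], [6], [12, 7, 9]]>  (the last row varies between runs)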

Note that tf.vectorized_map would probably be faster, but there is currently a bug affecting this function when it is used with ragged tensors. Using tf.while_loop is also an option.
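
If the per-row map becomes a bottleneck, one alternative (not the approach above, just a sketch under assumptions) is to avoid mapping over rows altogether and work on the flat values: shuffle all flat indices once, regroup them by row with a stable sort, and keep only the first `limit` entries of each row. The function name `sample_rows_vectorized` is hypothetical. The sketch uses only standard ops, so it should also trace under `tf.function`, though that is not verified here.

```python
import tensorflow as tf

def sample_rows_vectorized(rt, limit):
    # Hypothetical helper: sample at most `limit` elements per row
    # of a ragged tensor with ragged_rank 1, without a per-row loop.
    row_ids = rt.value_rowids()                    # row index of every flat value
    n = tf.shape(row_ids, out_type=tf.int64)[0]

    # Random permutation of all flat positions, then a stable sort by row id:
    # rows are grouped again, but the order inside each row is random.
    perm = tf.random.shuffle(tf.range(n))
    by_row = tf.argsort(tf.gather(row_ids, perm), stable=True)
    order = tf.gather(perm, by_row)
    sorted_ids = tf.gather(row_ids, order)

    # Rank of each element inside its (shuffled) row; keep the first `limit`.
    pos = tf.range(n) - tf.gather(rt.row_starts(), sorted_ids)
    kept = tf.sort(tf.boolean_mask(order, pos < limit))  # restore original order

    return tf.RaggedTensor.from_value_rowids(
        tf.gather(rt.flat_values, kept),
        tf.gather(row_ids, kept),
        nrows=rt.nrows(),
    )

vect = tf.ragged.constant([[1, 2, 3], [4, 5], [6], [7, 8, 9, 10, 11, 12, 13]])
out = sample_rows_vectorized(vect, limit=3)
print(out)
```

The cost is dominated by sorting all flat values once, and the sampled elements keep their original relative order within each row.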

AloneTogether