
Separate (I think) but conceptually related to https://www.tensorflow.org/versions/r1.9/api_docs/python/tf/contrib/data/bucket_by_sequence_length?hl=en.

General setup:

I'd like to support batching that maximizes batch utilization.

Consider the case where:

With max_batch_elements=32, training examples are actually "meta-elements", each comprised of a varying number of batch elements that need to be kept grouped together.

E.g., one training example might fill 5 batch elements, another might fill 20, another 6, etc. We might then group those 3 examples, which will fill 5+20+6=31 batch elements; that is, of course, fairly efficient (31/32 available slots used). (If we were being 100% efficient, we'd wait for one more example of length 1.)

I’d like to set up an efficient batching scheme.

tf.contrib.data.bucket_by_sequence_length would be fairly efficient for items that span few batch elements: with bucket sizes of 1, 2, 3, ..., 16, slotting is reasonably tight. E.g., bucket size = 5 => 6 examples * 5 batch elements = 30 batch elements filled (out of 32 max).

However, this gets progressively more inefficient as sequences get longer. E.g., for bucket size 17, only a single example fits per batch, of course, leaving 15 batch elements unused.
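For concreteness, here's roughly how that baseline would be wired up with the TF 1.x API linked above. This is a minimal sketch, assuming a dataset of variable-length 1-D tensors; the boundaries and batch sizes are chosen so each batch holds at most 32 total batch elements:

```python
import tensorflow as tf

MAX_BATCH_ELEMENTS = 32

# Buckets for lengths 1..16; anything of length >= 17 lands in the final bucket.
bucket_boundaries = list(range(2, 18))  # [2, 3, ..., 17] => 17 buckets
# One batch size per bucket: 32//1=32, 32//2=16, ..., 32//16=2, 32//17=1.
bucket_batch_sizes = [MAX_BATCH_ELEMENTS // l for l in range(1, 18)]

def batch_by_length(dataset):
  """dataset: a tf.data.Dataset of variable-length 1-D tensors."""
  return dataset.apply(
      tf.contrib.data.bucket_by_sequence_length(
          element_length_func=lambda x: tf.shape(x)[0],
          bucket_boundaries=bucket_boundaries,
          bucket_batch_sizes=bucket_batch_sizes))
```

This makes the inefficiency concrete: the final bucket's batch size is 32 // 17 = 1, so anything longer than 16 elements ships alone.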

Ideally, I'd like to set up a bucketing scheme where examples fill buckets to completion:

E.g., if the incoming examples had batch lengths [5, 15, 6, 20, 30, 1, 4, …], then we'd see things fill up like:

Bucket 1: [5, 15, 6, 1, 4] => 31 of 32 filled => gets sent to GPU for training
Bucket 2: [20]
Bucket 3: [30]
Bucket 4: gets created if necessary, presumably up to some maximum # of buckets, at which point the most-filled bucket gets emptied

(Potentially, the 6, 1, and 4 examples could instead get dropped into Buckets 2 and 3; stochastic behavior here would be fine, as long as the algorithm as a whole moves toward better-filled buckets.)

Is the above (reasonably) possible in TF? It's fairly straightforward to implement in Python/Java/C++/whatever, but it's not clear to me how to implement it in TF. (There may very well be a hidden library function I'm missing.)
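For reference, here's the kind of plain-Python logic I have in mind: a minimal sketch of greedy first-fit packing over a bounded number of open buckets, flushing the most-filled bucket when nothing fits. The pack_stream name and its parameters are just illustrative:

```python
def pack_stream(examples, capacity=32, max_buckets=4):
    """Greedy first-fit packing.

    `examples` yields (example, num_batch_elements) pairs, each with
    num_batch_elements <= capacity; yields groups of examples whose
    element counts sum to at most `capacity`.
    """
    buckets = []  # each entry: [fill_level, grouped_examples]
    for ex, size in examples:
        # First fit: drop into the first open bucket with enough room.
        target = next((b for b in buckets if b[0] + size <= capacity), None)
        if target is None:
            if len(buckets) >= max_buckets:
                # No bucket has room and we can't open another:
                # flush the most-filled bucket to make space.
                buckets.sort(key=lambda b: b[0])
                yield buckets.pop()[1]
            target = [0, []]
            buckets.append(target)
        target[0] += size
        target[1].append(ex)
        if target[0] == capacity:  # exactly full => send to GPU
            buckets.remove(target)
            yield target[1]
    for _, group in buckets:  # flush whatever remains at end of stream
        yield group
```

Running this over the lengths above yields [5, 15, 6, 1, 4], then [20], then [30], matching the bucket fills sketched earlier. The open question is how to express this kind of stateful, cross-element grouping inside a tf.data pipeline.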

severian
  • I'm not exactly sure what you're asking, but could this maybe help? https://stackoverflow.com/questions/40994583/how-to-implement-tensorflows-next-batch-for-own-data – Recessive May 09 '19 at 01:24
  • Thank you for the attempt, but I'm pretty sure this isn't relevant. What I'm trying to outline is basically a variant of a classic queuing solution like Token Buckets or Leaky Buckets. Basically, we have some number of buckets, try to maximally stuff them with new examples, and empty a bucket whenever it is full (or we don't have any more space for new elements in any bucket). – severian May 09 '19 at 01:44
  • https://www.tensorflow.org/api_docs/python/tf/data/experimental/group_by_reducer is very possibly relevant, but is sparsely documented. – severian May 09 '19 at 03:06

0 Answers