I want to batch many futures into a single request that is triggered either when a maximum batch size is reached or when a maximum time has elapsed since the earliest future was received.

Motivation

In Flutter, I have many UI elements that each need to display the result of a future, which depends on the data in the UI element.

For instance, I have a widget for a place, and a sub-widget which displays how long it will take to walk to that place. To compute how long the walk will take, I issue a request to the Google Maps API to get the travel time to the place.

It is more efficient and cost-effective to combine all these API requests into a single batch API request. So if the widgets make 100 requests at the same instant, the futures could be proxied through a single provider, which batches them into one request to Google and unpacks Google's response into the individual results.

The provider needs to know when to stop waiting for more futures and actually issue the request. This should be controllable by the maximum "batch" size (i.e., the number of travel time requests) or the maximum amount of time you are willing to wait for batching to take place.

The desired API would be something like:


// Client gives this to tell provider how to compute batch result.
abstract class BatchComputer<K,V> {
  Future<List<V>> compute(List<K> batchedInputs);
}

// Batching library returns an object with this interface
// so that the client can submit inputs to be completed by the batch provider.
abstract class BatchingFutureProvider<K,V> {
  Future<V> submit(K inputValue);
}

// How do you implement this in dart???
BatchingFutureProvider<K,V> create<K,V>(
   BatchComputer<K,V> computer, 
   int maxBatchSize, 
   Duration maxWaitDuration,
);

Does Dart (or a pub package) already provide this batching functionality, and if not, how would you implement the create function above?

Jack Reilly
  • Did you mean [Future.wait()](https://api.dartlang.org/stable/2.5.1/dart-async/Future/wait.html)? It waits for a list of futures and executes a callback when they are done. – Tokenyet Oct 01 '19 at 04:51
  • so you have to use the [Future](https://api.dartlang.org/stable/2.5.1/dart-async/Future-class.html) API - pay attention to the `then()` and `timeout()` methods – pskink Oct 01 '19 at 04:56
  • @Tokenyet I don't see how Future.wait() applies, could you explain how it would help implement `create` above? – Jack Reilly Oct 01 '19 at 06:55
  • @pskink I feel like your responses may be too low-level to be constructive for the question. I have an understanding of the basic units of `Future`'s, but plumbing them together in a manner that allows batching in an ergonomic API is what I'm looking for – Jack Reilly Oct 01 '19 at 07:34

2 Answers

This sounds perfectly reasonable, but also very specialized. You need a way to represent a query, to combine these queries into a single super-query, and to split the super-result into individual results afterwards, which is what your BatchComputer does. Then you need a queue of pending queries that you flush through that computation when certain conditions are met.
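
As a rough illustration of the combine-and-split step, here is a minimal sketch written for current null-safe Dart. The names `fakeBatchLookup` and `computeTravelTimes` are hypothetical, and the lookup is a stand-in that fabricates a number per place instead of calling any real API. It only shows the contract that result `i` must answer query `i`, even when duplicate queries are merged into one super-query.

```dart
import 'dart:async';

/// Stand-in for one batched API call (hypothetical; no network involved).
/// Returns one made-up travel time in minutes per distinct place.
Future<Map<String, int>> fakeBatchLookup(Set<String> places) async {
  return {for (final p in places) p: p.length * 2};
}

/// Combines the queries into one lookup, then splits the combined
/// result so that the returned list is index-aligned with [queries].
Future<List<int>> computeTravelTimes(List<String> queries) async {
  // Duplicate queries collapse into a single entry of the super-query.
  final combined = await fakeBatchLookup(queries.toSet());
  // Every query was part of the lookup, so the map entry is present.
  return [for (final q in queries) combined[q]!];
}
```

A function shaped like `computeTravelTimes` could then serve as the batch computation that the queue flushes into.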

One thing is clear: you will need Completers for the results, because a Completer is what you use whenever you must return a future before you have the value (or another future) to complete it with.
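
To make the Completer point concrete, here is a minimal sketch. `delayedAnswer` is a hypothetical name used only for this illustration. The function returns its future immediately, and a timer supplies the value later.

```dart
import 'dart:async';

/// Returns a future right away and completes it once [delay] has passed.
Future<int> delayedAnswer(Duration delay) {
  final completer = Completer<int>();
  // The value is not available yet; the timer callback provides it later.
  Timer(delay, () => completer.complete(42));
  // The caller receives the still-pending future immediately.
  return completer.future;
}
```

Awaiting `delayedAnswer(const Duration(milliseconds: 10))` yields 42 after the delay. A batcher does the same thing per request, except that the completer is completed when the batch result arrives rather than when a fixed timer fires.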

The approach I would choose is:

import "dart:async";

/// A batch of requests to be handled together.
///
/// Collects [Request]s until the pending requests are flushed.
/// Requests can be flushed by calling [flush] or by configuring
/// the batch to automatically flush when reaching certain
/// thresholds.
class BatchRequest<Request, Response> {
  final int _maxRequests;
  final Duration _maxDelay;
  final Future<List<Response>> Function(List<Request>) _compute;
  Timer _timeout;
  List<Request> _pendingRequests;
  List<Completer<Response>> _responseCompleters;

  /// Creates a batcher of [Request]s.
  ///
  /// Batches requests until calling [flush]. At that point, the
  /// [batchCompute] function gets the list of pending requests,
  /// and it should respond with a list of [Response]s.
  /// The response to a request in the argument list
  /// should be at the same index in the response list,
  /// and as such, the response list must have the same number
  /// of responses as there were requests.
  ///
  /// If [maxRequestsPerBatch] is supplied, requests are automatically
  /// flushed whenever there are that many requests pending.
  ///
  /// If [maxDelay] is supplied, requests are automatically flushed 
  /// when the oldest request has been pending for that long. 
  /// As such, [maxDelay] is not the maximal time before a request
  /// is answered, just how long sending the request may be delayed.
  BatchRequest(Future<List<Response>> Function(List<Request>) batchCompute,
               {int maxRequestsPerBatch, Duration maxDelay})
    : _compute = batchCompute,
      _maxRequests = maxRequestsPerBatch,
      _maxDelay = maxDelay;

  /// Add a request to the batch.
  ///
  /// The request is stored until the requests are flushed,
  /// then the returned future is completed with the result (or error)
  /// received from handling the requests.
  Future<Response> addRequest(Request request) {
    var completer = Completer<Response>();
    (_pendingRequests ??= []).add(request);
    (_responseCompleters ??= []).add(completer);
    if (_pendingRequests.length == _maxRequests) {
      _flush();
    } else if (_timeout == null && _maxDelay != null) {
      _timeout = Timer(_maxDelay, _flush);
    }
    return completer.future;
  }

  /// Flush any pending requests immediately.
  void flush() {
    _flush();
  }

  void _flush() {
    if (_pendingRequests == null) {
      assert(_timeout == null);
      assert(_responseCompleters == null);
      return;
    }
    if (_timeout != null) {
      _timeout.cancel();
      _timeout = null;
    }
    var requests = _pendingRequests;
    var completers = _responseCompleters;
    _pendingRequests = null;
    _responseCompleters = null;

    _compute(requests).then((List<Response> results) {
      if (results.length != completers.length) {
        throw StateError("Wrong number of results. "
           "Expected ${completers.length}, got ${results.length}");
      }
      for (int i = 0; i < results.length; i++) {
        completers[i].complete(results[i]);
      }
    }).catchError((error, stack) {
      for (var completer in completers) {
        completer.completeError(error, stack);
      }
    });
  }
}

You can use that as, for example:

void main() async {
  var b = BatchRequest<int, int>(_compute, 
      maxRequestsPerBatch: 5, maxDelay: Duration(seconds: 1));
  var sw = Stopwatch()..start();
  for (int i = 0; i < 8; i++) {
    b.addRequest(i).then((r) {
      print("${sw.elapsedMilliseconds.toString().padLeft(4)}: $i -> $r");
    });
  }
}
Future<List<int>> _compute(List<int> args) => 
    Future.value([for (var x in args) x + 1]);
lrn

See https://pub.dev/packages/batching_future/versions/0.0.2

I have almost exactly the same answer as @lrn, but I have put some effort into making the main code path synchronous, and added some documentation.

/// Exposes [createBatcher] which batches computation requests until either
/// a max batch size or max wait duration is reached.
///
import 'dart:async';

import 'dart:collection';

import 'package:quiver/iterables.dart';
import 'package:synchronized/synchronized.dart';

/// Converts input type [K] to output type [V] for every item in
/// [batchedInputs]. There must be exactly one item in output list for every
/// item in input list, and assumes that input[i] => output[i].
abstract class BatchComputer<K, V> {
  const BatchComputer();
  Future<List<V>> compute(List<K> batchedInputs);
}

/// Interface to submit (possibly) batched computation requests.
abstract class BatchingFutureProvider<K, V> {
  Future<V> submit(K inputValue);
}

/// Returns a batcher which computes transformations in batch using [computer].
/// The batcher will wait to compute until [maxWaitDuration] has elapsed since
/// the first item in the current batch was received, or [maxBatchSize] items
/// are in the current batch, whichever happens first.
/// If [maxBatchSize] or [maxWaitDuration] is null, then the triggering
/// condition is ignored, but at least one condition must be supplied.
///
/// Warning: If [maxWaitDuration] is not supplied, then it is possible that
/// a partial batch will never finish computing.
BatchingFutureProvider<K, V> createBatcher<K, V>(BatchComputer<K, V> computer,
    {int maxBatchSize, Duration maxWaitDuration}) {
  if (!((maxBatchSize != null || maxWaitDuration != null) &&
      (maxWaitDuration == null || maxWaitDuration.inMilliseconds > 0) &&
      (maxBatchSize == null || maxBatchSize > 0))) {
    throw ArgumentError(
        "At least one of {maxBatchSize, maxWaitDuration} must be specified and be positive values");
  }
  return _Impl(computer, maxBatchSize, maxWaitDuration);
}

// Holds the input value and the future to complete it.
class _Payload<K, V> {
  final K k;
  final Completer<V> completer;

  _Payload(this.k, this.completer);
}

enum _ExecuteCommand { EXECUTE }

/// Implements [createBatcher].
class _Impl<K, V> implements BatchingFutureProvider<K, V> {
  /// Queues computation requests.
  final controller = StreamController<dynamic>();

  /// Queues the input values with their futures to complete.
  final queue = Queue<_Payload>();

  /// Locks access to [listen] to make queue-processing single-threaded.
  final lock = Lock();

  /// [maxWaitDuration] timer, as a stored reference to cancel early if needed.
  Timer timer;

  /// Performs the input->output batch transformation.
  final BatchComputer computer;

  /// See [createBatcher].
  final int maxBatchSize;

  /// See [createBatcher].
  final Duration maxWaitDuration;
  _Impl(this.computer, this.maxBatchSize, this.maxWaitDuration) {
    controller.stream.listen(listen);
  }

  void dispose() {
    controller.close();
  }

  @override
  Future<V> submit(K inputValue) {
    final completer = Completer<V>();
    controller.add(_Payload(inputValue, completer));
    return completer.future;
  }

  // Synchronous event-processing logic.
  void listen(dynamic event) async {
    await lock.synchronized(() {
      if (event.runtimeType == _ExecuteCommand) {
        if (timer?.isActive ?? true) {
          // The timer got reset, so ignore this old request.
          // The current timer needs to be inactive and non-null
          // for the execution to be legitimate.
          return;
        }
        execute();
      } else {
        addPayload(event as _Payload);
      }
      return;
    });
  }

  void addPayload(_Payload _payload) {
    if (queue.isEmpty && maxWaitDuration != null) {
      // This is the first item of the batch.
      // Trigger the timer so we are guaranteed to start computing
      // this batch before [maxWaitDuration].
      timer = Timer(maxWaitDuration, triggerTimer);
    }
    queue.add(_payload);
    if (maxBatchSize != null && queue.length >= maxBatchSize) {
      execute();
      return;
    }
  }

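  void execute() async {
    timer?.cancel();
    if (queue.isEmpty) {
      return;
    }
    // Snapshot and clear the queue before awaiting, so that payloads
    // submitted while this batch is computing start a new batch instead
    // of being dropped, uncompleted, by a later clear().
    final batch = List<_Payload>.of(queue);
    queue.clear();
    List<dynamic> results;
    try {
      results = await computer.compute(List<K>.of(batch.map((p) => p.k)));
    } catch (error, stack) {
      // Fail every request in the batch rather than leaving it pending.
      for (var payload in batch) {
        payload.completer.completeError(error, stack);
      }
      return;
    }
    for (var pair in zip<Object>([batch, results])) {
      (pair[0] as _Payload).completer.complete(pair[1] as V);
    }
  }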

  void triggerTimer() {
    listen(_ExecuteCommand.EXECUTE);
  }
}
Jack Reilly