Is this pattern of batching a subset of a collection for parallel processing ok? Is there a better way to do this that I am missing?
We are given a collection of entity ids that need to be fetched from a service which returns a Scala Future. Instead of making all the requests at once, we batch them, because the service can only handle a certain number of requests at a time. In a way it is a primitive throttling mechanism to avoid overwhelming the data store. It looks like a code smell.
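import scala.collection.generic.CanBuildFrom // Scala 2.12 and earlier
import scala.concurrent.{ExecutionContext, Future}
import scala.language.higherKinds

object FutureHelper {
  // Applies dbFetch to one element at a time: the next call starts only after
  // the previous Future has completed, and results keep the input order.
  def batchSerially[A, B, M[a] <: TraversableOnce[a]](l: M[A])(dbFetch: A => Future[B])(
      implicit ctx: ExecutionContext, buildFrom: CanBuildFrom[M[A], B, M[B]]): Future[M[B]] =
    l.foldLeft(Future.successful(buildFrom(l))) {
      case (accF, curr) =>
        for {
          acc <- accF
          b   <- dbFetch(curr)
        } yield acc += b
    }.map(s => s.result())
}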
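import scala.concurrent.{ExecutionContext, Future}

object FutureBatching extends App {
  implicit val e: ExecutionContext = scala.concurrent.ExecutionContext.Implicits.global

  val entityIds = List(1, 2, 3, 4, 5, 6)
  val batchSize = 2

  // Batches run one after another; the requests inside a batch run concurrently.
  val listOfFetchedResults =
    FutureHelper.batchSerially(entityIds.grouped(batchSize)) { groupedByBatchSize =>
      Future.sequence {
        // Future.successful stands in for the real service call here.
        groupedByBatchSize.map(i => Future.successful(i))
      }
    }.map(_.flatten.toList)
}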
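
To make the intended behaviour easier to check, here is a minimal sketch of the same batching idea, specialised to List so that it does not need CanBuildFrom. It is only an illustration under assumptions: BatchingSketch, inBatches, fakeFetch and observedMax are hypothetical names, and fakeFetch is a stand-in that sleeps briefly in place of the real service. The counter records how many requests were in flight at once, which should never exceed the batch size.

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object BatchingSketch {
  implicit val ec: ExecutionContext = ExecutionContext.global

  // Hypothetical helper: batches run one after another,
  // while the items inside a single batch are fetched concurrently.
  def inBatches[A, B](items: List[A], batchSize: Int)(fetch: A => Future[B]): Future[List[B]] =
    items.grouped(batchSize).foldLeft(Future.successful(Vector.empty[B])) { (accF, batch) =>
      for {
        acc     <- accF                          // wait for all earlier batches
        results <- Future.traverse(batch)(fetch) // only now start this batch
      } yield acc ++ results
    }.map(_.toList)

  // Bookkeeping for the stand-in service (assumption: not part of the original code).
  private val lock = new Object
  private var inFlight = 0
  private var maxInFlight = 0

  // Highest number of simultaneous requests seen so far.
  def observedMax: Int = lock.synchronized(maxInFlight)

  // Stand-in for the real service call: counts concurrent requests and sleeps briefly.
  def fakeFetch(id: Int): Future[String] = Future {
    lock.synchronized {
      inFlight += 1
      maxInFlight = math.max(maxInFlight, inFlight)
    }
    Thread.sleep(20)
    lock.synchronized { inFlight -= 1 }
    s"entity-$id"
  }

  def main(args: Array[String]): Unit = {
    val fetched = Await.result(inBatches(List(1, 2, 3, 4, 5, 6), 2)(fakeFetch), 10.seconds)
    println(fetched)
    println(s"max requests in flight: $observedMax")
  }
}
```

The results come back in input order because Future.traverse preserves the order of each batch and the batches are appended in sequence. The trade-off of this pattern is that a batch waits for its slowest request before the next batch starts, so the service may sit partly idle between batches.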