Akka Streams significantly reduces my boilerplate code and has many useful features. However, I need to limit the rate at which items are processed. I am feeding links to resources (all from a single website) into a Hazelcast queue attached to a source, to be downloaded over time, and the number of links in the queue can grow quite large. Ideally, no more than 50-60 requests would run at once. Is there a feature in Akka Streams that would let me limit the number of items being processed at once?
A further constraint is that interacting with certain websites requires complex state management, code processing, and other features. Akka HTTP cannot help here. My network code is written entirely with Jsoup and Apache HttpComponents, with an occasional call to a JavaFX-based server to render scripts.
My current attempt to control the rate of input, using a buffer as described in the docs, follows:
val sourceGraph: Graph[SourceShape[(FlowConfig, Term)], NotUsed] = new HazelcastTermSource(conf.termQueue, conf)
val source = Source.fromGraph(sourceGraph)
val (killSwitch, last) = source
  .buffer(conf.crawlStreamConf.maxCrawlConcurrency, OverflowStrategy.backpressure)
  .viaMat(new DownloadFlow())(Keep.both)
  .map(x => println(x))
  .to(Sink.ignore).run()
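For comparison, here is a minimal sketch of the kind of bound I am after. A `buffer` stage only limits how many elements wait in the queue, not how many are being worked on, whereas `mapAsyncUnordered(parallelism)` keeps at most `parallelism` futures in flight. The sketch assumes Akka 2.6+ (where an implicit `ActorSystem` supplies the materializer). The names `BoundedDownloads`, `crawl`, and `download` are hypothetical, and the sleeping `download` merely stands in for the blocking Jsoup / HttpComponents calls; it is not my actual `DownloadFlow`.

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicInteger

import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

object BoundedDownloads {

  // Hypothetical helper: downloads every URL with at most `maxConcurrency`
  // requests in flight. Returns the results and the peak concurrency observed.
  def crawl(urls: List[String], maxConcurrency: Int)
           (implicit system: ActorSystem): (Seq[String], Int) = {
    // Unbounded pool for blocking I/O, so the only limit on concurrency
    // in this sketch is the mapAsyncUnordered stage itself.
    val pool = ExecutionContext.fromExecutorService(Executors.newCachedThreadPool())
    val inFlight = new AtomicInteger(0)
    val peak = new AtomicInteger(0)

    // Stand-in for a blocking Jsoup / HttpComponents call (assumption).
    def download(url: String): String = {
      val now = inFlight.incrementAndGet()
      peak.updateAndGet(p => math.max(p, now))
      try {
        Thread.sleep(50) // simulate network latency
        s"fetched $url"
      } finally inFlight.decrementAndGet()
    }

    try {
      val done = Source(urls)
        // At most `maxConcurrency` futures are outstanding at any time;
        // upstream is backpressured until one of them completes.
        .mapAsyncUnordered(maxConcurrency)(url => Future(download(url))(pool))
        .runWith(Sink.seq)
      (Await.result(done, 5.minutes), peak.get())
    } finally pool.shutdown()
  }

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("crawler")
    try {
      val urls = (1 to 200).map(i => s"https://example.com/page/$i").toList
      val (pages, peak) = crawl(urls, maxConcurrency = 50)
      println(s"fetched ${pages.size} pages, peak concurrency $peak")
    } finally system.terminate()
  }
}
```

With this shape the concurrency limit sits in the stage that does the work, so a buffer in front of it would only smooth bursts from the Hazelcast queue rather than cap the number of simultaneous requests.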