Alpakka provides a convenient way to access dozens of different data sources. File-oriented sources such as HDFS and FTP are delivered as Source[ByteString, Future[IOResult]]. However, HTTP requests via Akka HTTP are delivered as entity streams of type Source[ByteString, NotUsed]. In my use case, I would like to retrieve content from HTTP sources as Source[ByteString, Future[IOResult]] as well, so I can build a unified resource fetcher that works across multiple schemes (hdfs, file, ftp and S3 in this case).
In particular, I would like to convert the Source[ByteString, NotUsed] source to a Source[ByteString, Future[IOResult]], calculating the IOResult from the incoming byte stream. There are plenty of methods like flatMapConcat and viaMat, but none seem to be able to extract details from the input stream (such as the number of bytes read) or initialise the IOResult structure properly. Ideally, I am looking for a method with the following signature that will update the IOResult as the stream comes in:
def matCalc(src: Source[ByteString, Any]): Source[ByteString, Future[IOResult]] = {
  // someMatFoldMagic is a placeholder for the operator I am looking for
  src.someMatFoldMagic[ByteString, IOResult](IOResult.createSuccessful(0))((m, b) => m.withCount(m.count + b.length))
}
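
One possible way to get this behaviour is to attach a side sink with alsoToMat and keep its materialized value. The sketch below assumes Akka 2.6 or later (so that an implicit ActorSystem is enough to run a stream); the names MatCalc and Demo and the sample input are only for illustration. A Sink.fold counts the bytes while the elements pass through unchanged, and mapMaterializedValue turns the resulting Future[Long] into a Future[IOResult]. This is a sketch of one approach rather than a definitive implementation.

```scala
import akka.actor.ActorSystem
import akka.stream.IOResult
import akka.stream.scaladsl.{Keep, Sink, Source}
import akka.util.ByteString

import scala.concurrent.duration._
import scala.concurrent.{Await, ExecutionContext, Future}

object MatCalc {
  // Passes every ByteString through unchanged while a side Sink.fold sums
  // the chunk lengths. The fold's Future[Long] replaces the original
  // materialized value and is mapped to a successful IOResult.
  // If the stream fails, the fold's Future (and so the Future[IOResult]) fails.
  def matCalc(src: Source[ByteString, Any])(
      implicit ec: ExecutionContext): Source[ByteString, Future[IOResult]] =
    src
      .alsoToMat(Sink.fold[Long, ByteString](0L)((count, bytes) => count + bytes.length))(Keep.right)
      .mapMaterializedValue(_.map(count => IOResult.createSuccessful(count)))
}

// Hypothetical usage: an in-memory source stands in for an HTTP entity stream.
object Demo extends App {
  implicit val system: ActorSystem = ActorSystem("demo")
  import system.dispatcher // ExecutionContext used to map the Future

  val httpLike = Source(List(ByteString("hello"), ByteString("world")))

  // Keep.left keeps the Future[IOResult] produced by matCalc.
  val ioResult: Future[IOResult] =
    MatCalc.matCalc(httpLike).toMat(Sink.ignore)(Keep.left).run()

  println(Await.result(ioResult, 3.seconds).count)
  system.terminate()
}
```

Note that the IOResult only becomes available once the stream has completed, which matches how the file-based sources behave; the count is not observable while the stream is still running.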