I am hitting a REST server that exposes a GET endpoint returning responses with the headers Content-Encoding: gzip and Content-Type: application/json, so the server compresses the data before sending the response.
I am trying to make a sort of backup on S3 of time-based data, and the server only allows fetching 1-minute chunks, so for one day I need to send 1440 requests (one per minute).
Each minute of data is about 10 MB compressed (about 70 MB uncompressed), and I want to send it to S3, still compressed, via multipart upload.
In all of my tests I could not find a way to stop Netty from decompressing the response, and all my efforts to make this reactive also failed.
My client:
@Client("${url}")
@Header(name = "Accept-Encoding", value = "gzip")
public interface MyClient {

    @Get(value = "/data-feeds", consumes = MediaType.APPLICATION_OCTET_STREAM)
    Flux<byte[]> getData(@QueryValue("from") String from,
                         @QueryValue("minute") String minute,
                         @QueryValue("filters") List<Object> filters);
}
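One cheap way to confirm whether Netty has already inflated the payload is to look at the first two bytes of each received chunk, since every gzip stream starts with the magic header 0x1f 0x8b. A minimal JDK-only helper for that check (the class name is mine):

```java
// Hypothetical helper: detects whether a byte[] chunk is still
// gzip-compressed by checking the gzip magic header (0x1f 0x8b).
// If this returns false for the first chunk, the client pipeline
// has already decompressed the response.
public class GzipCheck {
    public static boolean isGzip(byte[] data) {
        return data != null
                && data.length >= 2
                && (data[0] & 0xff) == 0x1f
                && (data[1] & 0xff) == 0x8b;
    }
}
```

Logging this on the first chunk of each response makes it obvious whether the decompression happens client-side or the bytes arrive raw.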
I tried many other things too, such as a ByteBuffer return type or ReactorStreamingHttpClient's dataStream.
Now to get the data I did:
Flux.range(0, 1440)
    .flatMapSequential(minute ->
        client.getData("2022-01-01", String.valueOf(minute), Collections.emptyList()), 20)
This is the first part of it; after that I map the data through a GZIPOutputStream back to a compressed byte array, and I need bufferUntil to accumulate chunks of 5 MB or more (S3's minimum part size) for the multipart upload.
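The recompression step described above, sketched as a plain-JDK helper (class and method names are mine) that can be called from the map stage:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

// Sketch of the per-minute recompression step: takes the decompressed
// bytes the client hands back and gzips them again before they are
// buffered for the S3 multipart upload.
public class Recompress {
    public static byte[] gzip(byte[] raw) {
        // Pre-size for roughly the expected 7:1 compression ratio.
        ByteArrayOutputStream bos = new ByteArrayOutputStream(Math.max(64, raw.length / 4));
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }
}
```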
I can see in the logs that the calls are made on different event-loop threads, and I do the mapping to a compressed byte array on Schedulers.boundedElastic(), but the application still scales linearly: processing 2 minutes of data takes twice as long as processing 1.
I know these are really two separate issues, but I think that not having to decompress the received data only to compress it again would already save me some time.
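For reference, the 5 MB threshold mentioned above can be expressed as a stateful predicate for bufferUntil; a sketch assuming bufferUntil's default behavior of closing the current buffer (including the triggering element) when the predicate returns true (the class name is mine):

```java
import java.util.function.Predicate;

// Sketch of a stateful size-threshold predicate for Reactor's
// bufferUntil: accumulates chunk sizes and fires once at least
// 5 MB has been seen, matching the S3 multipart minimum part size.
// A fresh instance must be used per subscription, since it holds state.
public class SizeThreshold implements Predicate<byte[]> {
    private static final long MIN_PART = 5L * 1024 * 1024;
    private long accumulated = 0;

    @Override
    public boolean test(byte[] chunk) {
        accumulated += chunk.length;
        if (accumulated >= MIN_PART) {
            accumulated = 0;   // reset for the next S3 part
            return true;       // close the current buffer
        }
        return false;
    }
}
```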