We have spent many hours searching the internet and Stack Overflow, but none of the findings matched the way we planned to handle file uploads in a Spring context.
A few words about our architecture: a Node.js client uploads files to a Spring Boot application. Let us call this REST endpoint our "client endpoint". The Spring Boot application acts as middleware and calls endpoints of a "foreign system", so to keep the two apart we call that endpoint the "foreign endpoint". The main purpose of our application is the file handling between these two endpoints, plus some business logic in between.
Currently, the interface to our client looks like this:
public class FileDO {
private String id;
private byte[] file;
...
}
Here we are very flexible, because it is our client and our interface definition.
Because our system has sometimes run out of memory under load, we plan to reorganize our code into a more stream-based, reactive approach. By "under load" I mean heavily under load, e.g. hundreds of simultaneous uploads of large files, from a few MB up to 1 GB. We know that such tests don't represent real application use cases, but we want to be prepared.
We did some research and profiling, and the profilers showed that our REST endpoints hold each uploaded file completely in memory as a byte array. That works, but it is not efficient.
Our requirement is to provide a REST endpoint for file upload and to push those files on to a REST endpoint of the foreign system; the application's main purpose is to be a middle tier for file uploads. Given that, we would like to avoid ever holding a whole file in memory. Ideally the data would flow through as a stream, perhaps reactively. Some of our business functions are already reactive, but we are still at the very beginning of getting familiar with all of this.
So, what are our steps so far? We introduced a new client interface (Node.js --> Spring Boot) as shown below. It works, but is it really a stream-based approach? First metrics have shown that it does not reduce memory utilization.
@PostMapping(value="/uploadFile", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
@ResponseStatus(HttpStatus.CREATED)
public Mono<Void> upload(@RequestPart(name = "id") String id, @RequestPart(name = "file") Mono<FilePart> file) {
    // Returning the Mono lets WebFlux subscribe to it; without the
    // return, save(..) would never be executed.
    return fileService.save(id, file);
}
First question: is Mono<FilePart> the right type here, or would Flux of DataBuffer (or something else) be better? And if so, how should the client behave and deliver the data so that it is really a streaming approach?
The FileService should then post the file(s) to the foreign system and perhaps do something else with the given data, at least log the id and the file name. :-) Our code in FileService.save(..) currently looks like this in the relevant part:
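For what it's worth, here is a hedged sketch of what such a controller could look like when the part content is kept as a stream. Class names and the FileService signature are our assumptions for illustration, not settled code:

```java
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.http.codec.multipart.FilePart;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestPart;
import org.springframework.web.bind.annotation.ResponseStatus;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

@RestController
public class StreamingUploadController {

    private final FileService fileService;

    public StreamingUploadController(FileService fileService) {
        this.fileService = fileService;
    }

    @PostMapping(value = "/uploadFile", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
    @ResponseStatus(HttpStatus.CREATED)
    public Mono<Void> upload(@RequestPart(name = "id") String id,
                             @RequestPart(name = "file") Mono<FilePart> file) {
        return file.flatMap(part -> {
            // FilePart exposes its content as Flux<DataBuffer>: the body
            // arrives chunk by chunk and is never collected into a byte[].
            Flux<DataBuffer> content = part.content();
            return fileService.save(id, part.filename(), content);
        });
    }
}
```

Whether Mono<FilePart> or Flux<DataBuffer> appears in the signature matters less than what happens downstream: as long as no operator aggregates the buffers (e.g. DataBufferUtils.join), the data stays chunked.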
...
MultipartBodyBuilder bodyBuilder = new MultipartBodyBuilder();
bodyBuilder.asyncPart(...take mono somehow...);
bodyBuilder.part("id", id);
return WebClient.create("url-of-foreign-system")
    .post()
    .uri("/uploadFile")
    .body(BodyInserters.fromMultipartData(bodyBuilder.build()))
.retrieve()
.bodyToMono(Result.class);
...
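One way to fill the asyncPart gap, assuming the FilePart content from the controller is handed down into the service (the method signature and field names here are ours, for illustration): MultipartBodyBuilder.asyncPart accepts a Publisher of DataBuffer, so the incoming part can be piped straight into the outgoing multipart request.

```java
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.http.client.MultipartBodyBuilder;
import org.springframework.web.reactive.function.BodyInserters;
import org.springframework.web.reactive.function.client.WebClient;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

public class FileService {

    private final WebClient webClient = WebClient.create("url-of-foreign-system");

    public Mono<Result> save(String id, String filename, Flux<DataBuffer> content) {
        MultipartBodyBuilder bodyBuilder = new MultipartBodyBuilder();
        // asyncPart streams the Publisher chunk by chunk; nothing is
        // aggregated into a byte[] on our side.
        bodyBuilder.asyncPart("file", content, DataBuffer.class)
                   .filename(filename);
        bodyBuilder.part("id", id);
        return webClient.post()
                .uri("/uploadFile")
                .body(BodyInserters.fromMultipartData(bodyBuilder.build()))
                .retrieve()
                .bodyToMono(Result.class);
    }
}
```

This way the request to the foreign system starts while the upload from the client is still in flight, which is what keeps the memory footprint flat.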
Unfortunately, the second REST endpoint, the one of the foreign system, looks a little different from our first one. It will be enriched with data from another system: it takes a FileDO2 with an id, a byte array, and some other metadata specific to that foreign system.
As said, our goal is to minimize the memory footprint of everything that happens between client and foreign system. Sometimes we not only have to forward data to that system, but also run business logic that may slow down the whole streaming process.
Any ideas how to do this as a whole? Currently we have no clue how to approach it all... We appreciate any help or ideas.
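Business logic can often be hooked into the stream as a side effect per chunk instead of materializing the file first. A minimal Reactor-only sketch (the byte-counting logic is invented for illustration; with WebFlux the chunk type would be DataBuffer rather than byte[], but the same operators apply):

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicLong;
import reactor.core.publisher.Flux;

public class StreamInspection {

    // Count bytes as chunks pass through, without collecting them.
    public static Flux<byte[]> countBytes(Flux<byte[]> chunks, AtomicLong total) {
        return chunks.doOnNext(chunk -> total.addAndGet(chunk.length));
    }

    public static void main(String[] args) {
        AtomicLong total = new AtomicLong();
        Flux<byte[]> data = Flux.just(
                "hello".getBytes(StandardCharsets.UTF_8),
                "world".getBytes(StandardCharsets.UTF_8));
        countBytes(data, total).blockLast();
        System.out.println(total.get()); // prints 10
    }
}
```

Heavy per-chunk logic does slow the stream down, but via backpressure it also throttles the client instead of piling buffers up in memory, which is usually the behavior you want in a middle tier.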