When Jackson maps JSON input to a DTO, it automatically decodes Base64. I want to disable this decoding for one particular field (a byte array), because I pass it through as-is to another service via a REST API, and the decoding increases memory usage because of the intermediate structures (the encoded byte array, the decoded byte array, and another encoded byte array to send to the other service).

Is there an elegant way to achieve this? I debugged a bit to see what the internal code looks like and found the class `Base64Variant`, which unfortunately is final, so I cannot override its behaviour. I suppose I could copy-paste parts of the internal logic of `JsonParser` to read the input stream (minus the Base64 decoding) from within a custom deserializer, but I first wanted to ask here whether anyone has a better solution.

This is the same question as this one but for deserialization instead.

Bruno
  • Err, I don't understand. The reason bytes are encoded is that you can't send them over the wire as-is. You simply CAN'T have arbitrary binary data within JSON text. What if the binary data contains bytes that resemble a quote `"`? So the byte array that comes with your JSON input *must* be encoded already? – GhostCat Apr 04 '22 at 09:04
  • Deserializing a Base64 field decodes it to its binary data `byte[]`, superfluously (though the DTO is then smaller). Can you make the field a `String`? Sent as `byte[]`, but received as a `String`, i.e. a different class on the receiving side. – Joop Eggen Apr 04 '22 at 09:12
  • The incoming data is a JSON string containing Base64 data, which Jackson reads as an `InputStream`. Jackson reads through the stream (the Base64 data) and decodes the whole thing byte by byte, eventually mapping it to the target DTO field, which is a byte array of decoded data. I then send this byte array to my other service, but for that I need to re-encode it into Base64, so I have already consumed 3 times the size of the incoming JSON in memory. What I'm trying to do is bypass the decoding: I never need the actual data in my service, I just need to pass it through as-is. – Bruno Apr 04 '22 at 09:20
  • @JoopEggen if I use a `String` in my DTO I think it uses even more memory, at least based on my limited testing. I think it stores the decoded bytes, then the `String` made out of those bytes. I may be wrong, but that's the impression I got based on memory usage output by `jconsole`. If I use a byte array instead of a `String` it uses half the memory but the byte array is still decoded so I have to re-encode it to send it through, which again uses up as much memory as the size of the incoming JSON (which is large). – Bruno Apr 04 '22 at 09:24
  • If the entire structure is a Base64 string, you are out of luck. Base64 encodes every 3 input bytes as 4 ASCII characters (6 bits per character). Hence the original Base64 is maintained only if the field happens to start on a group boundary, and then the padding at the end still remains to be done. You'll need the decoding. And yes, a String (2 bytes per char) needs 2*4/3 ≈ 2.67 times the memory of the raw bytes. – Joop Eggen Apr 04 '22 at 09:26
  • The structure is a JSON document with a bunch of normal fields and then one field representing an array of files; each file is a JSON node with one field that contains its bytes, Base64-encoded. They're typically PDFs, so the Base64 in the JSON starts with `JVBERi`. But I'm not sure I understand: the JSON is read as an input stream and the Base64 for a given file is a byte array, so can I not just send it through as-is and only consume 1x the input JSON in memory? – Bruno Apr 04 '22 at 09:30
  • I get your point about base64, seems I'm either explaining it wrong or I'm misunderstanding how JSON is sent over the wire. – Bruno Apr 04 '22 at 09:46
  • Probably I am not understanding. For good order: do you get one single Base64 `meYgo35kd....` or `{a='meYgo35kd', b='meYgo35kd', ...}`. – Joop Eggen Apr 04 '22 at 10:25
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/243578/discussion-between-joop-eggen-and-bruno). – Joop Eggen Apr 04 '22 at 10:35
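
To make the arithmetic in the comments concrete, here is a minimal sketch using only `java.util.Base64` from the JDK (no Jackson involved). The class name `Base64PassThrough` and its two helper methods are hypothetical names made up for illustration. It shows that padded Base64 uses 4 characters per 3 raw bytes, and that decoding and then re-encoding standard, padded, unwrapped Base64 gives back the same text, which is why forwarding the encoded form unchanged should be equivalent in that case. This assumes the sender uses the standard alphabet with padding and no line breaks; MIME-style or unpadded input would not round-trip byte for byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
final class Base64PassThrough {

    /** Length of padded Base64 text for rawLength bytes: 4 characters per 3-byte group, rounded up. */
    static long encodedLength(long rawLength) {
        return 4 * ((rawLength + 2) / 3);
    }

    /** The decode-then-re-encode path described in the question (two extra arrays are allocated). */
    static byte[] decodeThenReencode(byte[] base64Ascii) {
        byte[] raw = Base64.getDecoder().decode(base64Ascii); // decoded copy
        return Base64.getEncoder().encode(raw);               // re-encoded copy
    }

    public static void main(String[] args) {
        // A PDF file begins with "%PDF-", which is why its Base64 begins with "JVBERi".
        byte[] pdfHeader = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        byte[] incoming = Base64.getEncoder().encode(pdfHeader);
        byte[] outgoing = decodeThenReencode(incoming);

        System.out.println(new String(incoming, StandardCharsets.US_ASCII));
        System.out.println("round trip unchanged: " + Arrays.equals(incoming, outgoing));
        System.out.println("encoded length for 8 raw bytes: " + encodedLength(pdfHeader.length));
    }
}
```

If the round trip is indeed an identity for the incoming data, the remaining question is only how to make Jackson hand over the field's text without decoding it, for example by mapping it to something other than `byte[]`.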

0 Answers