We have an XML document of about 1.4 MB that we gzip-compress, Base64-encode, and save in Cosmos DB. When an update arrives, we read the document from Cosmos DB, Base64-decode it, and unzip it to recover the original string. Under high load, we observe that the slanted apostrophe character turns into junk data when the document is saved back to Cosmos DB during update processing.
the base64 encoded data looks like - /F9nYk3vKlhqHb65KybqXTJfLvTvuy24HFwOq1wOT55oEkdJ+0bmcuWJJisvbfanpsb7//2//w8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA........
Decoding and unzipping this runs out of memory (OOM): the output grows into gigabytes of this character, �, like:
�����������������������������
Logic for gzip and Base64 encoding:
// Needs java.io.*, java.nio.charset.StandardCharsets, java.util.Base64, java.util.zip.GZIPOutputStream
String compressed;
try (var baos = new ByteArrayOutputStream(); var gzipOut = new GZIPOutputStream(baos)) {
    gzipOut.write(data.getBytes(StandardCharsets.UTF_8));
    // Close explicitly so the gzip trailer is written before the buffer is read.
    gzipOut.close();
    compressed = Base64.getEncoder().encodeToString(baos.toByteArray());
} catch (IOException e) {
    throw new FOSInjestorApplicationException(Errors.UNEXPECTED_ERROR,
            Errors.UNEXPECTED_ERROR.getDescription());
}
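Logic for Base64 decoding and unzip:
// Needs java.io.*, java.util.Base64, java.util.zip.GZIPInputStream
byte[] decodebase64 = Base64.getDecoder().decode(arr);
byte[] unzipped;
// Unzip the decoded bytes, not the Base64 input itself.
try (var bais = new ByteArrayInputStream(decodebase64); var gzipIn = new GZIPInputStream(bais)) {
    unzipped = gzipIn.readAllBytes();
} catch (IOException e) {
    throw new FOSInjestorApplicationException(Errors.UNEXPECTED_ERROR,
            Errors.UNEXPECTED_ERROR.getDescription());
}
// No charset is passed here, so the platform default is used (the encode side uses UTF-8).
return new String(unzipped);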
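For comparison, here is a minimal sketch of the same round trip with UTF-8 named explicitly on both the encode and the decode side. The class name GzipBase64Codec and the use of UncheckedIOException are placeholders for illustration only, not our actual application code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip + Base64 with an explicit charset in both directions.
class GzipBase64Codec {

    static String compress(String data) {
        var baos = new ByteArrayOutputStream();
        try (var gzipOut = new GZIPOutputStream(baos)) {
            gzipOut.write(data.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // gzipOut is closed at this point, so the gzip trailer has been written.
        return Base64.getEncoder().encodeToString(baos.toByteArray());
    }

    static String decompress(String base64) {
        byte[] compressed = Base64.getDecoder().decode(base64);
        try (var gzipIn = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            // Decode with the same charset that compress() used for encoding.
            return new String(gzipIn.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "<note>It\u2019s a test</note>";
        String restored = decompress(compress(original));
        System.out.println(original.equals(restored)); // prints "true"
    }
}
```

With the charset fixed on both sides, the result no longer depends on how the JVM was started.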
When we process the same document, slanted apostrophe included, in non-prod, it works fine.
I am using Java 11 and the Java Cosmos 4.x SDK. What could cause this to fail at high load?
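Since the behavior differs between prod and non-prod, one thing that may be worth comparing is the JVM default charset in each environment. On Java 11 the default comes from the environment (locale and the file.encoding system property), and new String(byte[]) without a charset argument uses it. The small check below is only a diagnostic sketch; the class name CharsetCheck is made up for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic: report the charset that charset-less String conversions fall back to.
class CharsetCheck {

    static String describe() {
        return "defaultCharset=" + Charset.defaultCharset()
                + ", file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running this in both environments would show whether they decode bytes the same way.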
We processed a large number of updates (one at a time) on a document that contained the special character, a slanted apostrophe. The updates should not corrupt the data, but after decoding and unzipping we found this junk character: �����������������������������
The junk kept growing in size, up to 1 GB, and caused an OOM.
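One possible way this kind of growth can arise is a charset mismatch across repeated save/load cycles. The sketch below assumes, without confirmation from our environment, that the slanted apostrophe is U+2019 and that the decoding JVM's default charset is US-ASCII. Under those assumptions, each cycle turns every non-ASCII byte into the replacement character �, which itself takes three bytes in UTF-8, so the junk triples on every update. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo: encode as UTF-8, decode with a different charset, repeat.
class CharsetMismatchDemo {

    // One simulated save/load cycle (gzip and Base64 are lossless, so they are left out).
    static String roundTrip(String text, Charset decodeCharset) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, decodeCharset);
    }

    static int lengthAfterCycles(String text, Charset decodeCharset, int cycles) {
        String current = text;
        for (int i = 0; i < cycles; i++) {
            current = roundTrip(current, decodeCharset);
        }
        return current.length();
    }

    public static void main(String[] args) {
        String apostrophe = "\u2019"; // assumed to be the slanted apostrophe in question

        // Matching charsets: the text survives any number of cycles.
        System.out.println(lengthAfterCycles(apostrophe, StandardCharsets.UTF_8, 3));    // prints 1

        // Mismatched charsets: 1 char -> 3 -> 9 -> 27 replacement characters.
        System.out.println(lengthAfterCycles(apostrophe, StandardCharsets.US_ASCII, 3)); // prints 27
    }
}
```

At that rate a single apostrophe would pass 1 GB of junk after roughly 19 to 20 update cycles, which would be consistent with the problem only showing up on documents that receive many updates.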