I am using an AWS OpenSearch cluster to store historical data in daily indices (one index per date: 2023.03.24, 2023.03.23, etc.). Each index has a 1:1 primary-to-replica shard ratio and approx. 10 million records. Recently we made a change to also store the full data of each record in an additional field on the same record as a gzip-compressed binary blob: we gzip-compress the JSON record, base64-encode the result, and store it in a new field (call it 'compressed') mapped as type keyword.
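For context, here is roughly how the 'compressed' field is produced. This is a minimal sketch; the record shape, field names, and index naming are illustrative, not our actual schema:

```python
import base64
import gzip
import json

# Illustrative record -- the real documents have many more fields.
record = {
    "timestamp": "2023-03-24T10:15:00Z",
    "user_id": 12345,
    "payload": {"status": "ok", "latency_ms": 42},
}

# gzip-compress the serialized JSON, then base64-encode it so the
# result can be stored as a text value in the document.
raw = json.dumps(record, separators=(",", ":")).encode("utf-8")
blob = base64.b64encode(gzip.compress(raw)).decode("ascii")

# Document as indexed: the original fields plus the new 'compressed'
# field, which is mapped as 'keyword' in the daily index (e.g. 2023.03.24).
doc = {**record, "compressed": blob}

# Mapping fragment for the new field (rest of the mapping unchanged).
mapping = {"mappings": {"properties": {"compressed": {"type": "keyword"}}}}

# Note: base64 alone inflates the gzip output by roughly 33%.
print(len(raw), len(blob))
```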
However, after making the change, our index size has jumped by almost 500%! This was unexpected: even with 0% compression the size should have at most doubled. Can anyone think of possible reasons for this massive increase, and suggest how we can store this kind of data in OpenSearch/Elasticsearch more space-efficiently?