
I'm trying to analyze the size of our Redis database and tune how we store our data, following a few articles such as https://davidcel.is/posts/the-story-of-my-redis-database/ and https://engineering.instagram.com/storing-hundreds-of-millions-of-simple-key-value-pairs-in-redis-1091ae80f74c

I've read the documentation about "key sizes" (e.g. https://redis.io/commands/object)

and tried running various tools like:

redis-cli --bigkeys

and also tried to read the output from the redis-cli:

INFO memory

The size semantics are not clear to me.

Does the reported size reflect ONLY the size of the key itself? That is, if my key is "abc" and the value is "value1", is the reported size only for the "abc" portion? The same question applies to complex data structures stored under that key, such as a hash, array, or list.

Trial and error doesn't seem to give me a clear result.

Avba

1 Answer


Different tools give different answers.

First, --bigkeys: it reports the keys with the biggest values in the keyspace, excluding the space taken by the key's name. Note that in this case the "size" of a value means something different for each data type: Strings are sized by their STRLEN (bytes), whereas all other types are sized by the number of their nested elements.

So it gives little indication of actual memory usage. It does what it is intended to do: find big keys (not big key names, only values estimated to be big).
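To make the per-type sizing concrete, here is a minimal Python sketch of a --bigkeys-style measurement. It assumes a redis-py style client (anything exposing `type`, `strlen`, `llen`, `hlen`, `scard` and `zcard`); the function name `bigkeys_style_size` and the unit labels are made up for illustration and are not part of redis-cli.

```python
# Hypothetical helper: sizes a key roughly the way the answer describes
# --bigkeys doing it (bytes for strings, element counts for everything else).
SIZE_COMMANDS = {
    "string": ("strlen", "bytes"),
    "list": ("llen", "items"),
    "hash": ("hlen", "fields"),
    "set": ("scard", "members"),
    "zset": ("zcard", "members"),
}


def bigkeys_style_size(client, key):
    """Return (type_name, size, unit); the key's name is never counted."""
    raw_type = client.type(key)
    # redis-py returns bytes unless the client was created with decode_responses=True.
    type_name = raw_type.decode() if isinstance(raw_type, bytes) else raw_type
    if type_name not in SIZE_COMMANDS:
        return type_name, None, None  # missing key or a type not covered here
    method_name, unit = SIZE_COMMANDS[type_name]
    return type_name, getattr(client, method_name)(key), unit


# Usage against a real server (assumes redis-py is installed and Redis is running):
#   import redis
#   print(bigkeys_style_size(redis.Redis(), "abc"))
```

Note that a hash with three tiny fields and a hash with three huge fields both report a size of 3 here, which is why this number says little about memory.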

INFO MEMORY is a different story. Its used_memory field is reported in bytes and reflects the entire RAM consumption of key names, their values, and all associated overheads of the internal data structures.
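As a small illustration of reading that field, the sketch below parses the plain-text reply of INFO memory into a dictionary. The helper name `parse_info_memory` is hypothetical and the sample values are invented; with redis-py you would more likely call `client.info("memory")`, which already returns a dict.

```python
def parse_info_memory(raw):
    """Parse the 'name:value' lines of an INFO reply into a dict of strings."""
    fields = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks and section headers
            continue
        name, _, value = line.partition(":")
        fields[name] = value
    return fields


# Illustrative reply only; these numbers are not from a real server.
sample = "# Memory\r\nused_memory:1048576\r\nused_memory_human:1.00M\r\n"
info = parse_info_memory(sample)
print(int(info["used_memory"]), "bytes in total, for the whole instance")
```

Because used_memory covers the whole instance, it cannot tell you what a single key costs.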

There's also DEBUG OBJECT, but note that its output is not a reliable way to measure the memory consumption of a key in Redis. The serializedlength field gives the number of bytes needed to persist the object, not its actual footprint in memory, which includes various administrative overheads on top of the data itself.

Lastly, as of v4 we have the MEMORY USAGE command, which does a much better job. See https://github.com/antirez/redis-doc/pull/851 for the details.
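A rough sketch of using MEMORY USAGE from Python, assuming redis-py (whose client exposes `scan_iter` and `memory_usage`) and Redis 4 or later. The helper `total_memory_usage` and the example pattern are hypothetical, and the result is only as good as the per-key estimates.

```python
def total_memory_usage(client, pattern="*", samples=None):
    """Sum the MEMORY USAGE estimate, in bytes, over keys matching a pattern.

    samples is passed through to MEMORY USAGE; for nested types Redis
    samples a few elements by default, and SAMPLES 0 inspects all of them.
    """
    total = 0
    for key in client.scan_iter(match=pattern):
        size = client.memory_usage(key, samples=samples)
        if size is not None:  # the key may have expired since SCAN returned it
            total += size
    return total


# Usage against a real server (assumes redis-py is installed and Redis is running;
# "user:*" is a made-up key pattern):
#   import redis
#   print(total_memory_usage(redis.Redis(), "user:*", samples=0))
```

Scanning a large keyspace this way issues one command per key, so it is best run against a replica or during quiet hours.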

Itamar Haber
  • When you say "big keys" or "key names" etc., does it refer to the size of the key-value pair? With regard to hash types (i.e. hset outer_key inner_key value), does the key size refer to the whole spiel (the full size of the key and the hash it references)? We have some cases where the key is very long (50 characters) but the value is very short (like a single digit), and cases where the value is 5-10 times longer than the key. Thanks for your clarification – Avba Nov 07 '17 at 17:31
  • The key name is only the name of the key, without the value, and with "big keys" that size isn't included. `MEMORY USAGE` provides a good estimate (depending on samples) of the associated value(s) and relevant admin overheads but also doesn't include the key name's "cost". Also, as a rule, try to avoid long key names. – Itamar Haber Nov 08 '17 at 13:03
  • What command would I use to get the memory used by the key plus its value, in human-readable bytes? (The value could be "plain", i.e. string, number, binary etc., or a complex type such as hash, set, list.) – Avba Nov 08 '17 at 13:08
  • Sorry, my mistake - `MEMORY USAGE` does do that (key name and value). There's just a little bug in it - https://github.com/antirez/redis/issues/4430 – Itamar Haber Nov 08 '17 at 14:10