3

As far as I know, every key name is stored "as-is" in the mongo database. It means that a field "name" will be stored using the 4 letters everywhere it is used.

Would it be wise, if I want my app to be ready to store a large amount of data, to rename every key in my mongo documents? For instance, "name" would become "n" and "description" would become "d".

I expect it to significantly reduce the space used by the database as well as the amount of data sent to the client (not to mention that it kind of uglifies the content of the mongo documents). Am I right?

If I rename every key in my code (no need to rename the existing data, I can rebuild it from scratch), is there a good practice or any additional advice I should know about?

Dmytro Shevchenko
Billybobbonnet
  • 1
    Depending on the storage engine you use your mileage may vary. WiredTiger uses compression for example https://docs.mongodb.org/master/reference/glossary/#term-snappy – jpaljasma Nov 18 '15 at 14:51
  • 1
    Do you have really, really long field names? If you don't then I don't think shortening the names will have any impact, as normally most of the size is taken by field *values*. Have you tried to make a simple calculation of the database size you're going to save with this? I would imagine it to be less than 5%. The drawbacks you'll get (e.g. more complicated code maintenance) will outweigh the tiny performance gains you'll get. Besides, RAM and SSDs are getting cheaper and cheaper these days. – Dmytro Shevchenko Nov 18 '15 at 14:56
  • @jpaljasma I am using the meteor platform (and its WiredTiger storage engine, I guess?). I added it as a tag but the edit removed it. Also note that, by design, meteor will, by default, send every published key as it is in the database. That means, I think, that it also impacts client performance. – Billybobbonnet Nov 18 '15 at 15:04

2 Answers

7

Note: this is mainly speculation, I don't have benchmarking results to back this up

While "minifying" your keys technically would reduce the size of your memory/diskspace footprint, I think the advantages of this are quite minimal if not actually disadvantageous.

The first thing to realize is that data stored in MongoDB is not actually stored as raw JSON; it's stored as binary using a standard known as BSON. This allows Mongo to do all sorts of internal optimizations, such as compression if you're using WiredTiger as your storage engine (thanks for pointing that out, @Jpaljasma).
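To get a rough feel for how compression changes the picture, here is a minimal sketch. It is only an approximation: it uses `zlib` from the Python standard library as a stand-in for WiredTiger's block compression, and compact JSON as a stand-in for BSON, so the absolute numbers will not match a real database. The helper names `make_docs` and `sizes` are made up for this illustration.

```python
import json
import zlib


def make_docs(name_key, desc_key, count=1000):
    """Build `count` small documents that differ only in their values."""
    return [
        {name_key: f"item {i}", desc_key: f"description of item {i}"}
        for i in range(count)
    ]


def sizes(docs):
    """Return (raw_bytes, compressed_bytes) for a list of documents.

    Assumption: compact JSON + zlib stands in for BSON + the storage
    engine's compressor. This is not what MongoDB does internally.
    """
    raw = json.dumps(docs, separators=(",", ":")).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


long_raw, long_packed = sizes(make_docs("name", "description"))
short_raw, short_packed = sizes(make_docs("n", "d"))

print("raw bytes saved by short keys:       ", long_raw - short_raw)
print("compressed bytes saved by short keys:", long_packed - short_packed)
```

Repeated key names are exactly the kind of redundancy a general-purpose compressor removes, so the saving after compression should be much smaller than the saving on the raw bytes. It is worth measuring on your own data before deciding anything.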

Second, let's say you do minify your keys. Well, then you need to minify your keys. Every time. Forever. That's a lot of work on your application side. Plus you need to unminify your keys when you read (because users won't know what n is). Every time. Forever. All of a sudden your minor memory optimization becomes a major runtime slowdown.

Third, that minifying/unminifying process is kind of complicated. You need to maintain mappings between the two, keep them tested and up to date, and never have any overlap (if you do, that's pretty much the end of all your data). I wouldn't ever work on that.
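To make the maintenance burden concrete, here is a minimal sketch of what such a mapping layer could look like. Everything in it is hypothetical (`FIELD_MAP`, `build_reverse_map`, `minify`, `unminify`); it handles flat documents only and is not tied to any MongoDB driver.

```python
# Hypothetical mapping from readable field names to short stored names.
FIELD_MAP = {"name": "n", "description": "d"}


def build_reverse_map(field_map):
    """Invert the mapping, refusing two long names that share a short name."""
    reverse = {}
    for long_key, short_key in field_map.items():
        if short_key in reverse:
            raise ValueError(
                f"{long_key!r} and {reverse[short_key]!r} both map to {short_key!r}"
            )
        reverse[short_key] = long_key
    return reverse


REVERSE_MAP = build_reverse_map(FIELD_MAP)


def minify(doc):
    """Rename readable keys to short keys (unknown keys raise KeyError)."""
    return {FIELD_MAP[key]: value for key, value in doc.items()}


def unminify(doc):
    """Rename short keys back to readable keys (unknown keys raise KeyError)."""
    return {REVERSE_MAP[key]: value for key, value in doc.items()}


stored = minify({"name": "Widget", "description": "A small widget"})
restored = unminify(stored)
```

Even this toy version needs a collision check, and a real one would also have to deal with nested documents, arrays, query operators and index definitions, which is where the ongoing cost comes from.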

So overall, I think it's a pretty terrible idea to minify your keys to save a couple of characters. It's important to keep the big picture in mind: the VAST majority of your data will be not in the keys, but in the values. If you want to optimize data size, look there.

David says Reinstate Monica
  • Thanks for pointing out that the data is stored in BSON. Also note that I don't plan to commit to a minifying/unminifying process; I just wonder if it might be worth renaming every key in my code in order to shorten them. – Billybobbonnet Nov 18 '15 at 15:07
  • @Billybobbonnet But if you shorten everything to `n`, `d`, `i`, `zg`, etc., it will be totally unreadable, so you will need to do the unminification. And if you unminify, you need to reminify when it goes back in the DB. – David says Reinstate Monica Nov 18 '15 at 15:09
0

The full name of every field is included in every document. So when your field-names are long and your values rather short, you can end up with documents where the majority of the used space is occupied by redundant field names.
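As a back-of-the-envelope check, the BSON layout for a document containing only string fields can be computed by hand. The sketch below follows the BSON specification for string elements (type byte, key as a NUL-terminated C string, int32 length, value bytes, trailing NUL); the function names are made up for this example, and it deliberately ignores `_id`, other value types, and any storage-engine overhead or compression.

```python
def bson_string_field_size(key, value):
    """Bytes used by one string field in a BSON document."""
    key_bytes = len(key.encode("utf-8"))
    value_bytes = len(value.encode("utf-8"))
    # 1 type byte + key + NUL + 4-byte length prefix + value + NUL
    return 1 + key_bytes + 1 + 4 + value_bytes + 1


def bson_document_size(doc):
    """Bytes used by a flat, string-only BSON document."""
    # 4-byte total-size prefix + all fields + 1 terminating NUL
    return 4 + sum(bson_string_field_size(k, v) for k, v in doc.items()) + 1


long_doc = {"name": "Alice", "description": "A short bio"}
short_doc = {"n": "Alice", "d": "A short bio"}

long_size = bson_document_size(long_doc)
short_size = bson_document_size(short_doc)

print(long_size, short_size, long_size - short_size)
```

The saving per document is exactly the number of key characters removed, so the ratio of key length to value length decides whether it matters. With values this short the keys are a large share of the document; with longer values the share shrinks quickly.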

This affects the total storage size and decreases the number of documents which can be cached in RAM, which can negatively affect performance. But using descriptive field-names does of course improve readability of the database content and queries, which makes the whole application easier to develop, debug and maintain.

Depending on how flexible your driver is, it might also require quite a lot of boilerplate code to convert between your application field-names and the database field-names.

Whether or not this is worth it depends on how complex your database is and how important performance is to you.

Philipp
  • Maybe it's my lack of Mongo experience talking, but I think that if your field names are really long and your values are short, it's a sign of bad design. – David says Reinstate Monica Nov 18 '15 at 14:58
  • @DavidGrinberg Why do you think that? – Philipp Nov 18 '15 at 15:01
  • I think that super long field names (regardless of the size of the value) are a sign of poor naming or bad design. Simply put, it's just hard to read. The same way that you shouldn't create a variable or function with 500 characters, you shouldn't create a field name with 500 characters. – David says Reinstate Monica Nov 18 '15 at 15:03