I'm curious why my MMS has suddenly shown rapid data growth since I upgraded to Mongo 3.0 (WiredTiger storage engine). The slope of the new growth over the past couple of weeks correlates directly with the upgrade. There are only a couple of collections in this database that have over 500 documents. Though these are both huge collections, the document structure has remained the same before and after the upgrade. Also, running aggregations on these collections indicates that the average number of inserts has not changed before/after the upgrade. This leads me to question whether the data size is being calculated differently with the new WiredTiger engine, or something of that nature. Does anyone have any information on this? Here is an image of my MMS data.

[Image: MMS data chart]

A couple of things: there are 2 jumps in size. These are times when I migrated a collection over from another database to Mongo. After both of these the growth rate remained consistent, and it only increased after the upgrade. The data size decreased at the upgrade (consistent with the hypothesis that WiredTiger has compression) but has been growing so quickly that it has almost reached its original size. Even the storage size has started growing much faster than it originally did, though this image doesn't do it justice.

Benjamin Oman
  • The average object size is increasing - the documents going in are bigger. The same number of inserts results in more storage used. Is the increase in data size consistent with the same number of inserts and the increase in object size or is there "missing" storage? Do you know why the average document size is increasing? – wdberkeley Mar 23 '15 at 14:53
  • Look at the "storage size" - it dropped drastically, possibly due to WT compression if you reloaded the data and 3.0 is running with wiredTiger storage engine. The number of indexes went way up though - that means you added indexes - did you also add new collections? – Asya Kamsky Mar 23 '15 at 21:31
  • @wdberkeley, the only explanation I can think of is that I changed some object keys from abbreviations to full names (i.e., instead of "o":123 I used "object_id":123). I did this on the assumption, based on what TokuMX mentioned, that long key names don't really affect storage size, and longer key names are much more convenient in my situation. There is no other reason object size should be increasing. It's also noteworthy that this change in key names happened about 2 weeks before the upgrade, during which time avg obj size remained level. – Benjamin Oman Mar 23 '15 at 22:33
  • @AsyaKamsky It makes sense that storage size dropped with the rolling upgrade due to compression. I did add several new collections, but their average object size is tiny compared to that of the existing collections, and they are small, in the neighborhood of 500 records, whereas the 3 large collections have millions of records. – Benjamin Oman Mar 23 '15 at 22:35

2 Answers

The data size in WiredTiger and in MMAPv1 is going to be largely the same, or at least quite similar. Your documents are still the same size (MMAPv1 may report some additional padding, but WiredTiger will only report actual data size).

What will be significantly different, because WiredTiger compresses data on disk, is the "storageSize". If the data size is growing, it's because your actual data is growing. That can be seen in "average object size" increasing as well.
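One way to check this is to compare two snapshots of the database stats and see how much of the data-size change comes from more documents and how much from larger documents. A minimal sketch, assuming each snapshot is a dict carrying the `dataSize`, `storageSize`, and `objects` fields that `dbStats` reports; the function name and the way the snapshots are collected are illustrative assumptions:

```python
def explain_growth(before, after):
    """Split the change in dataSize into a count part and an object-size part."""
    avg_before = before["dataSize"] / before["objects"]
    avg_after = after["dataSize"] / after["objects"]
    # Growth caused by having more documents, priced at the old average size.
    from_count = (after["objects"] - before["objects"]) * avg_before
    # Growth caused by the current documents being larger on average.
    from_size = (avg_after - avg_before) * after["objects"]
    return {
        "data_growth": after["dataSize"] - before["dataSize"],
        "from_count": from_count,
        "from_size": from_size,
        # Uncompressed data size relative to on-disk size after the change.
        "compression_ratio": after["dataSize"] / after["storageSize"],
    }
```

The two parts always sum to the total data growth, so if `from_size` dominates, the documents themselves got bigger; if `from_count` dominates, more documents are being inserted.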

Asya Kamsky

Looking at this further, I think this has to do with the PHP Mongo driver 1.5.0 update, specifically the mongo.native_long setting defaulting to TRUE in this version and later. Because Mongo 3.0 required a driver version greater than 1.4 (which I was running), I had to upgrade the driver at the same time. Doing this caused all integers in new documents to be stored as the LONG type, which is twice the size (64 bits instead of 32). There's no reason for me to be storing all of my ints this way, especially when many are single digits.

I've changed the native_long setting to 0 and have confirmed that it is again storing everything as a 32-bit integer by default. I assume that over the next few days I will see a decline in the rate of growth. I'll update this after a few days with the results.
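To put a rough number on the effect, the size of a single-field document can be worked out from the BSON layout. This is a minimal sketch in plain Python that does the arithmetic by hand rather than calling a driver; the helper name is hypothetical, and real documents also carry `_id` and other fields:

```python
def bson_int_doc_size(key, int_bytes):
    """Size in bytes of a BSON document holding one integer field.

    Layout per the BSON spec: a 4-byte length prefix, then one element
    (1 type byte + the key as a NUL-terminated string + the value),
    then a 1-byte document terminator.
    """
    key_len = len(key.encode("utf-8")) + 1  # cstring includes the trailing NUL
    return 4 + (1 + key_len + int_bytes) + 1


INT32, INT64 = 4, 8  # value widths in bytes

for key in ("o", "object_id"):
    print(key, bson_int_doc_size(key, INT32), bson_int_doc_size(key, INT64))
```

Each integer field costs 4 extra bytes as a 64-bit value, so the value doubles but the document grows by less than that. The key name is stored in every document too, so going from "o" to "object_id" adds 8 bytes per field in the uncompressed data size.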

Update:

The reason the size of the db was increasing was simply that it actually WAS increasing. I reviewed all 100 collections on a case-by-case basis over the period of a week, found the one that was causing the growth, and examined it to find that significant numbers of rows were being added, and that this had nothing to do with the upgrade to 3.0. The issue was a mere coincidence, and 7 months after this post I have no reason to believe that 3.0 reports size incorrectly to MMS. Further, there is no reason to believe that the exponential growth resulted from integers being stored as 64-bit.
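For anyone repeating this kind of review, comparing per-collection stats between two points in time makes the culprit stand out quickly. A minimal sketch, assuming each snapshot is a dict mapping collection name to the `size` and `count` fields from `collStats` (with PyMongo these could be gathered via `db.command("collstats", name)` for each collection); the function name is hypothetical and this is not the exact process I followed:

```python
def rank_by_growth(before, after):
    """Return (collection, size_delta, count_delta) tuples, largest size growth first."""
    rows = []
    for name, stats in after.items():
        # Collections missing from the earlier snapshot are treated as starting at zero.
        old = before.get(name, {"size": 0, "count": 0})
        rows.append((name, stats["size"] - old["size"], stats["count"] - old["count"]))
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

If the top entry shows a large count delta alongside its size delta, the growth is coming from real inserts rather than from how the size is reported.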

Benjamin Oman