I'm wondering which is the better practice when storing large amounts of unique data in MongoDB. In my collection, each document has a key called shapes, which holds a set of shapes (possibly thousands), each with its own properties, like so:

[
  {
    _id: ...,
    title: 'design one',
    shapes: {
       'shapeid-123126': {
         id: 'shapeid-123126',
         fill: 'red',
         type: 'square'
       },
       'shapeid-928372': {
         id: 'shapeid-928372',
         fill: 'red',
         type: 'square'
       }
    }
  },
  {
    _id: ...,
    title: 'design two',
    shapes: {
       'shapeid-52316': {
         id: 'shapeid-52316',
         fill: 'red',
         type: 'square'
       },
       'shapeid-634372': {
         id: 'shapeid-634372',
         fill: 'red',
         type: 'square'
       }
    }
  }
]

My question is: would it be better practice to store the shapes in an array instead, given that the number of unique shapes could run into the many thousands?

[
  {
    _id: ...,
    title: 'design one',
    shapes: [
       {
         id: 'shapeid-123126',
         fill: 'red',
         type: 'square'
       },
       {
         id: 'shapeid-928372',
         fill: 'red',
         type: 'square'
       },
       // ... and so on, up to thousands of shapes
    ]
  },
  {
    _id: ...,
    title: 'design two',
    shapes: [
       {
         id: 'shapeid-52316',
         fill: 'red',
         type: 'square'
       },
       {
         id: 'shapeid-634372',
         fill: 'red',
         type: 'square'
       },
       // ... and so on, up to thousands of shapes
    ]
  }
]

The reason I currently have the DB set up the first way is that updating data client side is much faster with key access than having to find the index and then update, so I just carried that data structure over to the database. I'm just not sure this is wise.
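To make the trade-off concrete, here is a minimal sketch of keeping key access on the client while storing an array. The helper names `shapesToArray` and `shapesToMap` are hypothetical (they are not part of MongoDB or any library), and the sample data is made up; the sketch assumes every shape carries its own unique `id`.

```javascript
// Hypothetical helpers (not from any library): convert between the keyed
// layout used client side and the array layout stored in the database.

// Keyed object -> array of shapes (order follows the object's key order).
function shapesToArray(shapesById) {
  return Object.values(shapesById);
}

// Array of shapes -> keyed object, using each shape's own `id` as the key.
function shapesToMap(shapeList) {
  return Object.fromEntries(shapeList.map((shape) => [shape.id, shape]));
}

// Made-up sample of what a loaded `shapes` array might look like.
const stored = [
  { id: 'shapeid-928372', fill: 'red', type: 'square' },
  { id: 'shapeid-52316', fill: 'red', type: 'square' },
];

// Build the lookup once after loading, then update by key as before.
const byId = shapesToMap(stored);
byId['shapeid-52316'].fill = 'blue';

// Convert back to an array before saving.
const toSave = shapesToArray(byId);
```

The conversion is a single pass over the shapes in each direction, so the fast keyed updates on the client do not by themselves force the keyed layout onto the database.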

I've already hit my first bump: exporting the data from Compass means having to select ALL the keys (_id, title, shapes, shapes.shapeid-123126, shapes.shapeid-123126.fill, shapes.shapeid-123126.type, ...etc). Because shapes is an object, not an array, every property of every shape has to be selected as part of the schema.

I exported that data, and I'm not even sure everything exported properly: with such a large volume of unique shape ids, the exported schema has hundreds of thousands of keys. (I know this is likely a separate issue; I'm just seeing the downsides of the first approach.)
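The growth in schema keys can be illustrated with a rough sketch that counts distinct dotted field paths for both layouts. It assumes an export tool treats every distinct path as a separate field, which is how the export described above appears to behave; `collectPaths` and `makeShapes` are hypothetical helpers, and the 1000 shapes are generated sample data.

```javascript
// Hypothetical helper: collect the distinct dotted field paths in a document.
// Array elements share one path; each object key adds a path of its own.
function collectPaths(value, prefix, paths) {
  if (Array.isArray(value)) {
    for (const item of value) collectPaths(item, prefix, paths);
  } else if (value !== null && typeof value === 'object') {
    for (const [key, child] of Object.entries(value)) {
      const path = prefix ? `${prefix}.${key}` : key;
      paths.add(path);
      collectPaths(child, path, paths);
    }
  }
  return paths;
}

// Build n made-up shapes for the comparison.
function makeShapes(n) {
  return Array.from({ length: n }, (_, i) => ({
    id: `shapeid-${i}`,
    fill: 'red',
    type: 'square',
  }));
}

const shapeList = makeShapes(1000);

// First layout: shapes keyed by id.
const keyedDoc = {
  title: 'design one',
  shapes: Object.fromEntries(shapeList.map((s) => [s.id, s])),
};

// Second layout: shapes in an array.
const arrayDoc = { title: 'design one', shapes: shapeList };

const keyedCount = collectPaths(keyedDoc, '', new Set()).size;
const arrayCount = collectPaths(arrayDoc, '', new Set()).size;
console.log(keyedCount, arrayCount); // prints "4002 5"
```

With the keyed layout the path count grows with the number of shapes (four paths per shape here), while the array layout stays at five paths no matter how many shapes a design holds.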

Modermo
  • Have a look at https://morpheusdata.com/blog/2014-12-01-how-to-store-large-lists-in-mongodb – varman Jul 29 '20 at 10:14
  • You will also hit a problem with indexing in the first schema design, so the second one is better. You can also restructure your schema if you hit the 16MB maximum document size. – Yahya Jul 29 '20 at 10:23
  • According to the linked article, the first option is much more storage-efficient than the second one. – Modermo Jul 29 '20 at 10:26
  • Storing the shapes in an array is acceptable as long as you know the array will not keep growing. The size needs to be known and bounded; it can be in the thousands. Also, the way you query the data (the kinds of queries you run) should drive how you structure it. – prasad_ Jul 29 '20 at 10:34
