1

I have two schemas for products and folders

const ProductSchema = new Schema<TProductSchema>(
    {
        productName: String,
        vendor: { type: Schema.Types.ObjectId, ref: 'Vendor' },
        folder: { type: Schema.Types.ObjectId, ref: 'ProductFolder' },
    }
);
const ProductFolderSchema = new Schema<TProductFolderSchema>(
    {
        folderName: { type: String, required: true },
        parent: { type: Schema.Types.ObjectId, ref: 'ProductFolder' },
        children: [{ type: Schema.Types.ObjectId, ref: 'ProductFolder' }],
    }
);

I want the following functionality in my application:

When user selects a folder, all products whose folder is equal to selected folder _id and whose folder is direct or indirect children of selected folder should be returned.

Folders and Products are separate entities: user can create and delete folders independently of products and vice versa.

The problem with current schema is that it's hard to implement finding products whose folder is a child of selected folder.

I expect each folder to contain up to 100.000 products.

I was thinking if adding products field that would contain product's _ids to ProductFolder is a good idea.

What schema design would you recommend? Should I add products field or leave it as is and come up with a solution to find nested folders? Or maybe there's some other approach?

1 Answers1

1

When user selects a folder, all products whose folder is equal to selected folder _id and whose folder is direct or indirect children of selected folder should be returned.

Here is a way to implement this, for example the models are:

product: {
  _id: <ObjectId>,
  name: <string>,
  vendor: <number>,
}

folder: {
  _id: <ObjectId>,
  parent: <ObjectId>,
  child: [ <ObjectId>, <ObjectId>, ... ]
  products: [ <ObjectId>, <ObjectId>, ... ],
}

Example data (shows some fields with data for a sample querying):

products:

 [ 
   { _id: "p11" }, { _id: "p12" }, { _id: "p13" }, 
   { _id: "p14" }, { _id: "p15" }, { _id: "p16" }, 
   { _id: "p17" }, { _id: "p18" }, { _id: "p19" }, 
   { _id: "p20" }, { _id: "p21" }, { _id: "p90" }, 
   { _id: "p91" }, { _id: "p92" },
]

folders:

[
  { _id: 1, parent: "...", children: [ 2, 3 ], products: [ "p11", "p12" ] },
  { _id: 2, parent: "...", children: [], products: [ "p13" ] },
  { _id: 3, parent: "...", children: [ 4 ], products: [ "p20", "p21" ] },
  { _id: 4, parent: "...", children: [ 40 ], products: [ "p16", "p17" ] },
  { _id: 9, parent: "...", children: [ ], products: [ "p92" ] },
  { _id: 40, parent: "...", children: [ ], products: [ "p90", "p91" ] },
]

The query is to get all the products for the current folder and its sub folders recursively. Assuming the selected folder is { _id: 1 }, the aggregation query uses $graphLookup and all the resulting "products" are listed in the all_products array.

db.folders.aggregate( [
    { 
      $match: { _id: 1 } 
    }, 
    {
      $graphLookup: {
          from: "folders",
          startWith: "$children",
          connectFromField: "children",
          connectToField: "_id", 
          maxDepth: 99999,
          as: "hierarchy"
      }
    },
    { 
      $project: {
          all_products: {
              $reduce: { 
                  input: "$hierarchy", 
                  initialValue: "$products", 
                  in: { $concatArrays: [ "$$value", "$$this.products" ] } 
              }
          }
      }
    },
] )

Example's output:

{
        "_id" : 1,
        "all_products" : [
                "p11",
                "p12",
                "p16",
                "p17",
                "p13",
                "p20",
                "p21",
                "p90",
                "p91"
        ]
}

Note the $graphLookupoutputs data in no particular order. Further, you can use $lookup to fetch the product details from the products collection (this stage is not shown in the query). The query runs in mongosh.

With this design, modifying data may require changing data in both collections. For example, if you delete a product, the data needs to be deleted from both collections.

prasad_
  • 12,755
  • 2
  • 24
  • 36
  • thanks for the input. So you think it's a good idea to store all products in corresponding property even when there will be like tens of thousands of products? – PYTHON DEVELOPER999 Aug 23 '22 at 09:00
  • You can store 10,000 or 20,000 product ids in an array (and within a document) - as long as you know that the number is not continuously growing (it needs to be bounded). Note that a document can have a size of 16 MB. – prasad_ Aug 23 '22 at 09:05
  • one folder can contain up to 100k product ids – PYTHON DEVELOPER999 Aug 23 '22 at 10:29