I am trying to do a bulk insert into MongoDB using PyMongo. I have millions of product/review documents to insert. Here is the structure of a document:
{
    "_id" : ObjectId("553858a14483e94d1e563ce9"),
    "product_id" : "B000GIKZ4W",
    "product_category" : "Arts",
    "product_brand" : "unknown",
    "reviews" : [
        {
            "date" : ISODate("2012-01-09T00:00:00Z"),
            "score" : 3,
            "user_id" : "A3DLA3S8QKLBNW",
            "sentiment" : 0.2517857142857143,
            "text" : "The ink was pretty dried up upon arrival. It was...",
            "user_gender" : "male",
            "voted_total" : 0,
            "voted_helpful" : 0,
            "user_name" : "womans_roar \"rohrra\"",
            "summary" : "Cute stamps but came with dried up ink"
        }
    ],
    "product_price" : "9.43",
    "product_title" : "Melissa & Doug Deluxe Wooden Happy Handle Stamp Set"
}
There can be multiple reviews for a single product. The requirement is to insert one document per product_id and keep appending further reviews as subdocuments in the reviews array. Can you please provide some pointers on how this can be achieved? It would also be nice to implement this as a bulk insert for performance.
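
To make the requirement concrete, here is a minimal sketch of one possible approach: an upsert per product_id that appends reviews with `$push`/`$each`, sent in batches through PyMongo's `bulk_write`. The input shape is an assumption (each row is a flat dict holding the product fields plus a single `review` subdocument), and the names `build_update_specs`, `bulk_upsert`, `PRODUCT_FIELDS` and `batch_size` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from itertools import islice

# Product-level fields copied onto the document only when it is first created.
PRODUCT_FIELDS = ("product_category", "product_brand",
                  "product_price", "product_title")


def build_update_specs(rows):
    """Group review rows by product_id into (filter, update) pairs.

    Assumed row shape (hypothetical): a dict with "product_id", optional
    product fields, and one "review" subdocument.
    """
    grouped = defaultdict(lambda: {"fields": {}, "reviews": []})
    for row in rows:
        entry = grouped[row["product_id"]]
        entry["fields"].update(
            {k: row[k] for k in PRODUCT_FIELDS if k in row})
        entry["reviews"].append(row["review"])

    specs = []
    for product_id, entry in grouped.items():
        # $push with $each appends every collected review in one update.
        update = {"$push": {"reviews": {"$each": entry["reviews"]}}}
        if entry["fields"]:
            # $setOnInsert only applies when the upsert creates the document.
            update["$setOnInsert"] = entry["fields"]
        specs.append(({"product_id": product_id}, update))
    return specs


def bulk_upsert(collection, rows, batch_size=1000):
    """Send the grouped updates to MongoDB in unordered batches."""
    # Imported here so the grouping helper above can run without PyMongo.
    from pymongo import UpdateOne

    specs = iter(build_update_specs(rows))
    while True:
        batch = [UpdateOne(flt, upd, upsert=True)
                 for flt, upd in islice(specs, batch_size)]
        if not batch:
            break
        collection.bulk_write(batch, ordered=False)
```

With a PyMongo collection object, usage would be along the lines of `bulk_upsert(db.products, rows)`, where `db.products` is a placeholder name. A unique index on product_id (for example `collection.create_index("product_id", unique=True)`) is probably worth having so that each product maps to exactly one document. Note that `build_update_specs` holds all rows in memory, so for millions of reviews the input would likely need to be fed in chunks.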