I have documents that look like this
{
"field1" : "value",
"field3" : "value",
"attributes" : {
"key1" : {
"category" : "4",
"value" : "value"
},
"key2" : {
"category" : "5",
"value" : "value"
},
}
}
I want to transform these to look like the structure below (basically it's taking a dictionary and flattening it into an array). Note the number of keys in attributes
varies.
{
"field1" : "value",
"field3" : "value",
"attributes" : [
{
"key" : "key1",
"category" : "4",
"value" : "value"
},
{
"key" : "key2",
"category" : "5",
"value" : "value"
},
]
}
Because the number of documents is significant, I would like to use the bulk APIs. I have this basic structure to my script.
db = db.getSiblingDB('MyDB');
db.MyCollection.intializeOrderedBukOp(),
var bulk = db.MyCollection.intializeOrderedBukOp(),
count = 0;
db.MyCollection.find({}).forEach(function(doc) {
// Re-map attrs (what follows is some pseudocode
var attrs = function TransformSomehow(doc.attributes) {
// return an array...
};
});
// Queue update operation
bulk.find({ "_id": doc._id })
.updateOne({
"$set": { "attrs": attrs } }
});
count++;
// Execute every 1000
if ( count % 1000 == 0 ) {
bulk.execute();
bulk = db.collection.intializeOrderedBukOp();
}
});
// Drain any queued remaining
if ( count % 1000 != 0 )
bulk.execute();
This was borrowed from this answer, but I can't use it as is because it's going the opposite direction (array to object). It appears that I can't use map
to do this.