0

I want to compare two very large collections. The main goal of the operation is to know which elements have changed or been deleted. Collections 1 and 2 have the same structure, and each has more than 3 million records. Example: record 1 {id:'7865456465465',name:'tototo', info:'tototo'}

So I want to know which elements have changed and which elements are not present in collection 2. What is the best way to do this?

timactive

2 Answers

0

1) Define what equality of two documents means. For me it would be: two documents with the same (unique) id are equal if they contain exactly the same fields with exactly the same values. Note that MongoDB does not guarantee field order; if you update a field, it might move to the end of the document, which is fine.

2) I would use a framework that can connect to MongoDB and fetch the data while converting it to a map-like data structure or even JSON. For instance, I would go with Scala + Lift Record (db.coll.findAll()) + Lift JSON. The Lift JSON library has a Diff function that will give you a diff of two JSON docs.

3) Finally I would sort both collections by ids, open db cursors, iterate and compare.
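
The three steps above can be sketched as a single merge-style pass over two sorted streams. This is only an illustration in Python rather than the Scala + Lift stack mentioned above; the function name `diff_sorted` and the sample records are hypothetical. It assumes both inputs are already sorted ascending by a unique `id` field, in an order that agrees with Python's `<`, and that documents arrive as plain dicts.

```python
from typing import Any, Dict, Iterable, Iterator, Tuple

Doc = Dict[str, Any]


def diff_sorted(
    first: Iterable[Doc], second: Iterable[Doc], key: str = "id"
) -> Iterator[Tuple[str, Any]]:
    """Yield ("changed", id) or ("missing", id) for documents of `first`.

    Assumption: both inputs are sorted ascending by `key` and keys are unique.
    Only one document from each side is held in memory at a time.
    """
    it2 = iter(second)
    other = next(it2, None)
    for doc in first:
        # Skip documents that exist only in the second collection.
        while other is not None and other[key] < doc[key]:
            other = next(it2, None)
        if other is None or other[key] > doc[key]:
            # No document with this id in the second collection.
            yield ("missing", doc[key])
        elif other != doc:
            # Same id, different content. Dict equality ignores field order,
            # which matches the definition of equality in step 1.
            yield ("changed", doc[key])


if __name__ == "__main__":
    coll1 = [
        {"id": "1", "name": "tototo"},
        {"id": "2", "name": "tata"},
        {"id": "4", "name": "titi"},
    ]
    coll2 = [
        {"name": "tototo", "id": "1"},   # same fields, different order
        {"id": "2", "name": "changed"},  # modified
        {"id": "3", "name": "extra"},    # only in collection 2
    ]
    print(list(diff_sorted(coll1, coll2)))
    # prints [('changed', '2'), ('missing', '4')]
```

With a driver such as PyMongo, the two inputs could be cursors like `coll.find({}, {"_id": 0}).sort("id", 1)`. Excluding `_id` is an assumption that fits the case where the two collections were loaded independently and their `_id` values differ; an index on `id` would likely be needed to sort 3 million records efficiently.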

yǝsʞǝla
0

If the schema is flat, as it is in your case, you can use a free tool (dataq.io) to compare the data in the two tables.

Disclaimer: I am the founder of this product.

firemonkey