
I am new to MongoDB and am planning to migrate about 6 TB of data from DB2 to MongoDB. We plan to use a Java utility that reads the data from DB2 and inserts it into MongoDB.

If an error occurs while the Java utility is running and I restart it, it inserts duplicate records into MongoDB. How can I avoid these duplicate records?

Please guide me here!

Thanks!

  • It sounds like your java utility is not a good data migration tool. However, this is outside the scope of stackoverflow.com because it's not about programming code; I am going to recommend that this question be moved to dba.stackexchange.com. – Vince Bowdren Mar 16 '17 at 12:39

1 Answer


Your data coming from DB2 should already have a unique primary key, and possibly additional unique business keys. If you use these field(s) as the _id in MongoDB (rather than letting MongoDB autogenerate an _id), you can avoid duplicates on the MongoDB side, because _id must be unique within a collection. If you attempt to insert the same record twice, the second insert fails with a duplicate key error (for example, a DuplicateKeyException, depending on the driver you use).
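To illustrate why reusing the DB2 key as the _id makes a restart harmless, here is a minimal sketch. It does not talk to a real database: FakeCollection is a hypothetical in-memory stand-in that rejects a second document with the same _id, as a MongoDB collection would, and Db2Row, customerId and name are made-up names for a source row. With a real driver you would insert the document and treat the duplicate key error as "already loaded".

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: FakeCollection is a hypothetical in-memory stand-in for a
// MongoDB collection; it enforces unique _id values like the real thing.
class IdempotentLoadSketch {

    /** Hypothetical shape of a row read from DB2. */
    static final class Db2Row {
        final long customerId; // assumed DB2 primary key
        final String name;

        Db2Row(long customerId, String name) {
            this.customerId = customerId;
            this.name = name;
        }
    }

    /** In-memory stand-in that refuses a second document with the same _id. */
    static final class FakeCollection {
        private final Map<Object, Map<String, Object>> docs = new LinkedHashMap<>();

        /** Returns true if inserted, false if the _id already exists. */
        boolean insertIfAbsent(Map<String, Object> doc) {
            return docs.putIfAbsent(doc.get("_id"), doc) == null;
        }

        int size() {
            return docs.size();
        }
    }

    /** Maps a row to a document, reusing the DB2 primary key as the _id. */
    static Map<String, Object> toDocument(Db2Row row) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", row.customerId);
        doc.put("name", row.name);
        return doc;
    }

    /** Loads all rows and returns how many were skipped as duplicates. */
    static int load(List<Db2Row> rows, FakeCollection target) {
        int skipped = 0;
        for (Db2Row row : rows) {
            if (!target.insertIfAbsent(toDocument(row))) {
                skipped++; // already loaded by an earlier run
            }
        }
        return skipped;
    }

    public static void main(String[] args) {
        List<Db2Row> rows = Arrays.asList(new Db2Row(1L, "Ann"), new Db2Row(2L, "Bob"));
        FakeCollection target = new FakeCollection();
        System.out.println("first run skipped:  " + load(rows, target));
        System.out.println("second run skipped: " + load(rows, target));
        System.out.println("documents stored:   " + target.size());
    }
}
```

Running the load twice over the same rows leaves the collection with one document per source key, which is the property you want from a restart.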

In addition to that, it seems excessive for you to have to completely restart the load process if there are errors on individual records. But perhaps you've got more serious problems that need to be addressed, e.g. the loader is crashing the JVM?

Perhaps you just need to improve your loader process so that you don't have to start completely over. And if you populate the _id as I suggested, you will have the added assurance that you're not inserting duplicate records.
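One way to avoid starting over is to record how far the loader got and to keep going when a single record fails. The sketch below is only an illustration under assumptions that are not in the question: rows are read in ascending primary key order, Checkpoint is a hypothetical store (a real one would persist the last key to a file or table), and the insert callback stands in for the actual MongoDB write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

// Sketch only: all names here are hypothetical, and the checkpoint lives in
// memory, whereas a real loader would persist it between runs.
class ResumableLoadSketch {

    /** Remembers the highest key that has been processed so far. */
    static final class Checkpoint {
        private long lastKey = Long.MIN_VALUE;

        long lastKey() {
            return lastKey;
        }

        void advance(long key) {
            lastKey = key;
        }
    }

    /**
     * Processes keys in ascending order, skipping keys at or below the
     * checkpoint. A failure on one record is collected for a later retry
     * instead of aborting the whole load.
     */
    static List<Long> run(long[] sortedKeys, Checkpoint checkpoint, LongConsumer insert) {
        List<Long> failed = new ArrayList<>();
        for (long key : sortedKeys) {
            if (key <= checkpoint.lastKey()) {
                continue; // handled by an earlier run
            }
            try {
                insert.accept(key);
            } catch (RuntimeException e) {
                failed.add(key); // remember the bad record and move on
            }
            checkpoint.advance(key);
        }
        return failed;
    }

    public static void main(String[] args) {
        Checkpoint checkpoint = new Checkpoint();
        List<Long> inserted = new ArrayList<>();

        // First run sees keys 1..3; key 2 is made to fail on purpose.
        List<Long> failed = run(new long[] {1, 2, 3}, checkpoint, key -> {
            if (key == 2) {
                throw new IllegalStateException("bad record");
            }
            inserted.add(key);
        });

        // Restart over the full key range: only the new keys are processed.
        run(new long[] {1, 2, 3, 4, 5}, checkpoint, key -> inserted.add(key));

        System.out.println("inserted: " + inserted);
        System.out.println("failed in first run: " + failed);
    }
}
```

A checkpoint and a unique _id complement each other: the checkpoint saves the work of re-reading rows that are already loaded, while the unique _id protects against duplicates if the checkpoint is stale or lost.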

helmy