
I have the following architecture: [architecture diagram: SQL database → importer → MongoDB → API → website]

I import data from a SQL database into MongoDB. The importer migrates the data into the MongoDB instance that serves a website via an API.

The import can take a couple of minutes, and if it fails I would like to be able to either roll back (it would be awesome to be able to roll back multiple imports) or drop the database/collections containing the uncommitted rows (if you think of it in terms of SQL transactions).

I tried importing everything into a transactions collection that, on success, moved the data into the correct collection. This took way too much time to be performant. I also tried importing into a temp database and then swapping the two. But then I run into problems if someone e.g. registers a new user on the website after the database copy but before the import is done (that user is lost in the swap).
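One pattern that sidesteps the swap problem entirely is to tag every imported document with the id of the batch that wrote it; a rollback is then just a delete-by-batch-id (possibly over several batch ids), and documents the website wrote in between, like a new user, are untouched. A minimal sketch with an in-memory array standing in for the MongoDB collection (the `importBatch` field name and the batch ids are assumptions, not part of the existing importer):

```javascript
// In-memory stand-in for a MongoDB collection, so the rollback
// pattern can be shown without a live database.
const collection = [];

// Each import run gets its own batch id; every document it writes
// carries that id in an importBatch field (field name is an assumption).
function importBatch(batchId, rows) {
  for (const row of rows) {
    collection.push({ ...row, importBatch: batchId });
  }
}

// Rolling back one or more imports is a delete-by-batch-id; documents
// written by the website (no importBatch field) survive.
function rollback(batchIds) {
  const ids = new Set(batchIds);
  for (let i = collection.length - 1; i >= 0; i--) {
    if (ids.has(collection[i].importBatch)) collection.splice(i, 1);
  }
}

// Example: two imports, one user registration in between.
importBatch("batch-1", [{ name: "a" }, { name: "b" }]);
collection.push({ name: "new-user" }); // written by the website
importBatch("batch-2", [{ name: "c" }]);

rollback(["batch-1", "batch-2"]);
console.log(collection); // only the user registration remains
```

Against a real MongoDB collection the rollback would be a `deleteMany` with a filter on the batch-id field, which makes "roll back multiple imports" a single query.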

How can I perform an import safely and avoid these basic concurrency problems?

EDIT: To clarify the setup: I run the importer in a cron job, at least once a day. I currently keep a timestamp of the latest synchronization and select everything newer than that from the SQL database. Things will automagically appear in the SQL database over time.
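The timestamp-driven selection can be sketched like this; the key point is to advance the saved timestamp only after the whole batch has been applied, so a failed run changes nothing and the same rows are simply retried on the next cron tick. The `updatedAt` field and the `syncOnce` function are assumptions for illustration:

```javascript
// Incremental sync driven by a saved timestamp: select rows newer
// than the last successful sync, apply them, and only then advance
// the timestamp. In a real setup lastSync is persisted between runs.
let lastSync = 0;

function syncOnce(sqlRows, apply) {
  const fresh = sqlRows.filter((r) => r.updatedAt > lastSync);
  if (fresh.length === 0) return 0;
  apply(fresh); // if this throws, lastSync is untouched, so the
                // same rows are picked up again on the next run
  lastSync = Math.max(...fresh.map((r) => r.updatedAt));
  return fresh.length;
}

// First run picks up both existing rows.
const applied = [];
const rows = [{ id: 1, updatedAt: 100 }, { id: 2, updatedAt: 200 }];
syncOnce(rows, (batch) => applied.push(...batch));

// A failed run does not advance the timestamp...
rows.push({ id: 3, updatedAt: 300 });
try {
  syncOnce(rows, () => { throw new Error("import failed"); });
} catch (e) {
  // lastSync is still 200 here
}

// ...so the next run retries exactly the rows that were missed.
syncOnce(rows, (batch) => applied.push(...batch));
```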

At the end of the import I run a downloader that fetches all the images from the URLs in the SQL database.

I don't want to start a new sync before the images are downloaded, since that could result in strange behaviour.
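A simple way to guarantee that is an in-process guard the cron handler checks before starting: if the previous run (import plus image downloads) is still going, the new tick is skipped. This sketch assumes a single importer process; with several processes you would need a shared lock (e.g. a document in MongoDB) instead, and `runSync` is a hypothetical name:

```javascript
// Guard flag so a cron tick never starts a new sync while one is
// still running, including the image-download step at the end.
let syncInProgress = false;

async function runSync(job) {
  if (syncInProgress) return false; // overlapping run skipped
  syncInProgress = true;
  try {
    await job(); // import + image downloads
    return true;
  } finally {
    syncInProgress = false; // released even if the job throws
  }
}
```

The cron handler then becomes `runSync(importAndDownload)`, and overlapping ticks return `false` instead of corrupting a run in progress.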

Victor Axelsson
  • Is there a reason you have to use node for the import process? I would think there are more highly-specialized tools for data migration than node that would be a better choice; https://github.com/compose/transporter for one – markthethomas Dec 18 '15 at 16:34
  • As @markthethomas mentioned, there is very likely a specialised tool for doing this. Regarding rolling back, it's complicated because SQL is an ACID-based database system and Mongo is a BASE-based one, so they have very different characteristics regarding atomicity of transactions. Short story: you'll need to keep track of what has been written to the Mongo DB and then trigger a delete of those documents if you need to roll back your SQL transactions. – Seonixx Dec 18 '15 at 16:45

1 Answer


In cases like this, where you need to move data between very different types of databases, you want something reliable, robust, and, most importantly, whose primary concern is transferring data and doing it well. Node.js is wonderful, but I would highly recommend finding a tool focused solely on the transfer/mapping work and using that, regardless of what language or technology it is built on. There's nothing about Node.js, as much as I love it, that particularly recommends it for this sort of transfer/migration (its best characteristics don't necessarily make it a good fit).

Better to find a mature, well-developed library that handles this sort of thing :)

Some tools/resources that turned up in my research:

Would love if people could suggest more in comments :)

markthethomas