I've set up an Azure Batch process that reads multiple CSV files at the same time and writes to Azure DocumentDB. I need a suggestion on which consistency level best fits my case.

I read through the consistency levels document (http://azure.microsoft.com/en-us/documentation/articles/documentdb-consistency-levels/) but am unable to relate my business case to the options described there.

My process: get the document by id.
- If found, pull a copy of the document, apply the changes and replace it.
- If not found, create a new entry.

Avinash Gadiraju
  • What's the nature of your concern? Are you concerned about having multiple processes writing documents at the same time (concern being performance here), or about clients reading data from docdb that might be stale while the writes are performed? – Luis Delgado Jan 23 '15 at 05:55

1 Answer

If your writes and reads come from the same process (or you can share a single DocumentClient instance), then session consistency will give you the best performance while still ensuring you get consistent reads. This is because the SDK manages the session token, ensuring that each read goes to a replica that has seen the write. Even if you can't do this, in your case a create will fail if a document with the same id already exists: within a collection, document ids are guaranteed to be unique.

Short version - session consistency (the default) is probably a good choice.
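
To make the session token idea concrete, here is a toy Python model. It is only a sketch of the concept and not how DocumentDB is implemented: `ToyStore`, `Replica` and the integer token are hypothetical names invented for illustration. A read that carries the token from its own write is only served by a replica that has applied that write, while a read without the token may hit a lagging replica.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Toy replica: documents plus the number of writes it has applied."""
    lsn: int = 0
    docs: dict = field(default_factory=dict)


class ToyStore:
    """Hypothetical store with one primary and lagging secondaries."""

    def __init__(self, secondaries=2):
        self.primary = Replica()
        self.secondaries = [Replica() for _ in range(secondaries)]

    def write(self, doc_id, body):
        # Writes land on the primary; the returned number acts as a session token.
        self.primary.lsn += 1
        self.primary.docs[doc_id] = body
        return self.primary.lsn

    def replicate(self):
        # Secondaries catch up with the primary at some later point.
        for replica in self.secondaries:
            replica.docs = dict(self.primary.docs)
            replica.lsn = self.primary.lsn

    def read(self, doc_id, session_token=0):
        # Serve from the first replica that has applied at least
        # `session_token` writes; secondaries are tried before the primary.
        for replica in self.secondaries + [self.primary]:
            if replica.lsn >= session_token:
                return replica.docs.get(doc_id)
        return None


store = ToyStore()
token = store.write("order-1", {"qty": 1})

# Same session: the token forces the read onto a replica that saw the write.
same_session = store.read("order-1", session_token=token)

# Another process without the token may land on a lagging replica.
other_session = store.read("order-1")
```

In this model `same_session` holds the document and `other_session` is `None` until `replicate()` runs, which is the difference between sharing a client (and its token) and reading from an unrelated process.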

  • My writes and reads come from a console app that runs as multiple instances on various machines at the same time (Azure Batch process). Is session consistency still the best choice in that case? – Avinash Gadiraju Jan 22 '15 at 21:59
  • In this case you'll see eventually consistent reads across instances, since they aren't sharing the session token. However, document ids are unique within a collection. If you happen to get a stale read and subsequently try to insert a document that already exists, you'll get a conflict. You could then handle this conflict by replacing the doc, throwing away the add/update, or dumping it to a dead letter queue, depending on the nature of the data and the needs of the app. I'm trying to keep you away from setting strong consistency, as you'll see much better performance with the default (session) setting. – John Macintyre - MSFT Jan 23 '15 at 00:44
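
The three conflict-handling options from the comment above could look roughly like the following Python sketch. It uses an in-memory stand-in rather than the real SDK: `ToyCollection`, `ConflictError`, `save` and the `on_conflict` parameter are all hypothetical names, and the only behaviour assumed from the discussion is that creating a document whose id already exists fails with a conflict.

```python
class ConflictError(Exception):
    """Raised by the toy collection when a create hits an existing id."""


class ToyCollection:
    """In-memory stand-in for a collection with unique document ids."""

    def __init__(self):
        self._docs = {}

    def create(self, doc):
        if doc["id"] in self._docs:
            raise ConflictError(doc["id"])
        self._docs[doc["id"]] = doc

    def replace(self, doc):
        self._docs[doc["id"]] = doc

    def get(self, doc_id):
        return self._docs.get(doc_id)


def save(collection, read_result, doc, dead_letters, on_conflict="replace"):
    """Write `doc` based on a read that may have been stale.

    `read_result` is what this worker's earlier read returned
    (None means "not found", which may be wrong if the read was stale).
    """
    if read_result is not None:
        collection.replace(doc)
        return "replaced"
    try:
        collection.create(doc)
        return "created"
    except ConflictError:
        # The read was stale: another worker created the document first.
        if on_conflict == "replace":
            collection.replace(doc)
            return "replaced-after-conflict"
        if on_conflict == "discard":
            return "discarded"
        dead_letters.append(doc)
        return "dead-lettered"


coll = ToyCollection()
dead = []
coll.create({"id": "a", "v": 1})  # written by another worker

# This worker's stale read returned None for "a", so it tries to create.
r1 = save(coll, None, {"id": "a", "v": 2}, dead)
r2 = save(coll, None, {"id": "a", "v": 3}, dead, on_conflict="dead-letter")
r3 = save(coll, None, {"id": "b", "v": 1}, dead)
```

Which branch is appropriate depends on the data: replacing is reasonable when the last writer should win, while a dead letter queue keeps the rejected update around for later inspection.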