
I'm looking for a cloud data storage service which offers the following:

  • Data is stored in duplicate (or more)
  • Data is identical between the original and duplicate(s) at all times (i.e. the original→duplicate sync is instantaneous, or storage requests don't return until all instances have acknowledged the write)
  • If the original fails, we should be able to use the duplicate(s) as if they were the original

So my specific question is:

  • Does such a storage solution exist? If so, where can we find it?
  • And if not, are there any best practices for handling, in code, a duplicate instance that is missing data from the original?

Many cloud services offer some form of persistence and replication, but there is usually a delay or synchronisation moment between the instances, which in many cases means the duplicate does not contain all of the data of the original. This delay is often on the order of a few seconds to a few minutes, but even such a small time frame can be quite significant. We're looking to eliminate this delay entirely.
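
To make the problem concrete, here is a minimal sketch of the window we want to eliminate, using the Python redis client and hypothetical primary/replica hostnames (not our production code): a write that has already been acknowledged by the original can still be invisible on the duplicate for a short time.

```python
# Minimal sketch of the replication-lag window.
# Assumptions: redis-py client, a primary/replica pair at hypothetical hostnames.
import redis

primary = redis.Redis(host='primary.example.com', port=6379)
replica = redis.Redis(host='replica.example.com', port=6379)

primary.set('match:1234', 'player-a,player-b')   # acknowledged by the original

value = replica.get('match:1234')                # may still be None for ~1s
if value is None:
    print('replica has not caught up yet - this is the window we want to close')
```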

Background:
Currently I'm working on a matchmaking system for an online game. This system must be very reliable and must have as little downtime as possible. So far our setup has been to use any number of servers and have them all connect to the same storage unit, so they can all work with the same dataset. Specifically, our servers are currently Azure Web Roles and our storage unit is an Azure Redis cache. However, Redis suffers from the same issue described above (a replication delay of roughly one second), so we're looking for alternatives.
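
For illustration, the closest thing in Redis itself appears to be the WAIT command (introduced around Redis 3.0), which blocks the client until previously issued writes have been acknowledged by a given number of replicas. A hedged sketch, again assuming the redis-py client and a hypothetical hostname; whether Azure Redis cache exposes WAIT at all is something we would still need to verify:

```python
# Sketch only: WAIT (Redis >= 3.0) blocks until a write has been acknowledged
# by at least N replicas. Hostname and key are hypothetical; availability on
# Azure Redis cache is not confirmed.
import redis

r = redis.Redis(host='primary.example.com', port=6379)

r.set('match:1234', 'player-a,player-b')

# Block for up to 1000 ms until at least 1 replica has acknowledged the write.
acked = r.execute_command('WAIT', 1, 1000)
if acked < 1:
    # The write is still only on the original; treat it as not yet replicated.
    print('no replica acknowledged the write within 1000 ms')
```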

Maarten

1 Answer


There is a pretty extensive article on this hosted on the Redis site:

Redis Persistence

We personally use a combination of RDB and AOF on our servers. The benefit of this is that every write operation is recorded in the append-only file alongside the smaller snapshots that are written to disk as it goes, which is great for backing up data. The downside is that more storage space is required and there is a small performance hit, depending on how you configure AOF. There is an "everysec" option which flushes the AOF buffer every second and is a good balance between speed and integrity.
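
For anyone looking for the concrete settings, here is a minimal sketch of the RDB + AOF combination applied at runtime via CONFIG SET with the Python redis client. It assumes a self-managed Redis on localhost; the same directives can simply be put in redis.conf, and note that managed services such as Azure Redis cache generally block the CONFIG command, so there you have to rely on whatever persistence the provider offers.

```python
# Sketch of the RDB + AOF combination described above, applied via CONFIG SET.
# Assumptions: self-managed Redis on localhost, redis-py client.
import redis

r = redis.Redis(host='localhost', port=6379)

# AOF: append every write command to a log, fsync'd once per second ("everysec").
r.config_set('appendonly', 'yes')
r.config_set('appendfsync', 'everysec')

# RDB: also keep periodic snapshots, e.g. every 60s if at least 10000 keys changed.
r.config_set('save', '900 1 300 10 60 10000')
```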

the1dv
  • Thanks for the response. I've read about Redis persistence. Even with everysec there is still a chance of losing data from a window of up to a second. They also have an fsync setting, "every new command", which is described as "Very very slow", so I think that's not recommended either. – Maarten Feb 10 '15 at 12:34
  • It's unfortunately one of those things: if you want persistence then you are relying on disk IO, which is very slow. – the1dv Feb 10 '15 at 22:30
  • Also - you do realise that the actual chances of losing data from one of these are extremely low? Unless a job is actually completed it isn't removed from the queue. You can set up multiple servers for redundancy and even use Redis Cluster to provide multiple points of redundancy. There are also other options like RabbitMQ and ActiveMQ which might fit your use case better than Redis, as they are fully designed message queues, not just key-value stores. – the1dv Feb 10 '15 at 22:36