
We have a C# reliable collection dictionary that implements IReliableDictionary2, and we noticed something odd because of a serialization bug.

We have Class X, which contains another class, Class Y. We forgot to add the serialization attributes for Class Y. Yet for days after an instance of X was added to the reliable collection dictionary, Class Y was still there when we retrieved that instance of X.

Apparently the reliable collection dictionary was only in memory at first, and at some point the collection was persisted to disk. From that point on, Class Y came back null, since it had not been included in the serialization with [DataMember].

So the question is when does the reliable collection get persisted to disk? Is there programmatic control over this? Or is this a cluster setting of some sort?

BZelasky

3 Answers


When you are working with ReliableCollections, each operation (e.g. AddAsync) does the following:

  1. Updates the ITransaction local storage (to provide read-your-own-writes semantics).
  2. Serializes the values and updates the local operations log (persists).
  3. Sends these bytes to all secondary replicas to make sure they have the same information.

Then, when the ITransaction is committed, the commit entry is appended to the log and sent to all secondary replicas. Once a quorum confirms the commit, the operation is considered done (please see here for more information).
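
As a rough illustration of the write path described above, here is a minimal sketch of adding a value and committing. It assumes the code runs inside a stateful service that references the Microsoft.ServiceFabric.Data package; the class name `ReliableWriteSketch`, the method name `AddAndCommitAsync` and the dictionary name "exampleDictionary" are hypothetical, chosen only for this example.

```csharp
using System.Threading.Tasks;
using Microsoft.ServiceFabric.Data;
using Microsoft.ServiceFabric.Data.Collections;

public static class ReliableWriteSketch
{
    // Hypothetical helper: stateManager would normally be the StateManager
    // property of a StatefulService.
    public static async Task AddAndCommitAsync(
        IReliableStateManager stateManager, string key, string value)
    {
        // Get the dictionary, creating it on first use.
        var dictionary = await stateManager
            .GetOrAddAsync<IReliableDictionary<string, string>>("exampleDictionary");

        using (ITransaction tx = stateManager.CreateTransaction())
        {
            // The add is recorded against the transaction (steps 1 to 3 above).
            await dictionary.AddAsync(tx, key, value);

            // The commit completes once a quorum of replicas acknowledges it.
            await tx.CommitAsync();
        }
    }
}
```

If the transaction is disposed without calling CommitAsync, it is aborted and the add is discarded.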

So, in general, the information is serialized on every write.

The reason you saw the 'correct' results is that most of the time you work with the same replica, the primary replica (only the primary replica can modify state), and all reads and writes served by that same replica returned correct values.

The trick here is that Service Fabric can move replicas between nodes. Imagine your primary replica was on Node1. All your reads and writes were fine, but then Service Fabric decided to move your primary replica to Node2. This results in a new idle replica on Node2 that gets initialized by transferring the serialized data to it. Once that replica is initialized, the replica on Node1 is demoted and the replica on Node2 is promoted. Now all your requests are served from Node2 rather than from Node1 (please see here and here for more information about service and replica lifecycles).

Oleg Karasik

The question should be the other way around: when does Service Fabric read the collection from disk?

It persists the data (both to disk and to the in-memory cache) when the transaction is committed. There can be multiple reasons to read the data back from disk (e.g. the primary node changed or restarted).

Aleksey L.

The main question, "When is data persisted to disk when using IReliableDictionary2?", already has many answers on Stack Overflow and in the docs.

How is data in Reliable Dictionary in Azure Service Fabric persisted to disk

This answer details how the data is changed and replicated: Downsides of CommitAsync() w/o any changes to collection

And this one answers how it is stored in memory: Azure Service Fabric reliable collections and memory

The reason why the problem happens in your case is clear:

  1. You didn't serialize the data correctly, so when the data gets replicated to other replicas it is missing information.
  2. As explained in many other questions and posts, the data lives in memory, so it makes no sense to write this data to disk and then read it back into memory. The original copy stays in memory with its full data, which is different from the copy that was replicated.
  3. If a secondary replica becomes primary, the data loaded into its memory will be missing information.
  4. There is also 'caching' for Reliable Dictionary that flushes unused "cold" data to disk to release memory. This is one of the improvements made to use memory better: when your data goes unused for a 'long' time, the dictionary values (not the keys) are removed from memory to free up space, and when you access the data again it is loaded from disk.
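
The difference between the in-memory copy and the serialized copy can be reproduced without Service Fabric. The sketch below assumes the values are serialized with DataContractSerializer, which is consistent with the [DataMember] attribute mentioned in the question. The `X` and `Y` classes, the `Inner` property and the `RoundTrip` helper are hypothetical stand-ins, not the asker's real code.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical stand-in for the question's Class Y.
public class Y
{
    public string Name { get; set; }
}

// Hypothetical stand-in for the question's Class X.
[DataContract]
public class X
{
    [DataMember]
    public int Id { get; set; }

    // No [DataMember] here, so the serializer skips this property.
    public Y Inner { get; set; }
}

public static class SerializationDemo
{
    // Serializes and deserializes a value, roughly what happens when
    // state is replicated to another replica or reloaded from disk.
    public static T RoundTrip<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, value);
            stream.Position = 0;
            return (T)serializer.ReadObject(stream);
        }
    }

    public static void Main()
    {
        var original = new X { Id = 1, Inner = new Y { Name = "y" } };
        var copy = RoundTrip(original);

        Console.WriteLine(original.Inner != null); // True: the in-memory object is intact
        Console.WriteLine(copy.Id);                // 1: marked with [DataMember]
        Console.WriteLine(copy.Inner == null);     // True: never serialized
    }
}
```

Marking `Inner` with [DataMember] and `Y` with [DataContract] (plus [DataMember] on `Name`) makes the round-tripped copy keep its `Inner` value.
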
Diego Mendes