I never had the chance to take a course on distributed systems. I am now reading up on the subject and have come across concepts such as replication.

Which strategy is most commonly used for handling fault tolerance, or does it depend on the use case? If there is no single answer, which one is the simplest to understand?

I have a sample problem:

Suppose I have 3 servers and the degree of replication is 2, i.e. every file is stored on exactly 2 servers.

So Server A has files: x y

Server B: y z

Server C: z x
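
To make the layout concrete, here is the placement above written as a lookup table. This is only a sketch in Python of what I have in mind; the names `placement` and `servers_for` are my own and not taken from any real system.

```python
# Placement from the example: each file lives on exactly 2 of the 3 servers.
# (Illustrative names only; not from any real system.)
placement = {
    "A": {"x", "y"},
    "B": {"y", "z"},
    "C": {"z", "x"},
}


def servers_for(filename):
    """Return the servers that hold `filename`, in alphabetical order."""
    return sorted(server for server, files in placement.items() if filename in files)


if __name__ == "__main__":
    for name in ("x", "y", "z"):
        print(name, "->", servers_for(name))
```

With this approach every server would need its own up-to-date copy of `placement` to answer a request.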

Each server can receive a request from a user, so each one needs to know which server holds which file. I know the general techniques for deciding which server stores a file: order of appearance, hashing the key, using the actual value, etc.

So suppose we are using hashing.

  • We need to store the hash table/lookup on each server, correct? Or can we get away with storing only the hash function itself?

  • By hashing, we get the ID of the first server on which to store the file. But what about the second server? Do we use a separate hash function to choose the replica server?

  • If we do need a hash table, does it have to be stored on every server? How do we ensure that, when a file is stored, the hash tables on all 3 servers are updated and stay consistent?
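
For the first two bullets, this is roughly what I imagine "storing only the hash function" would look like. It is a minimal sketch under my own assumptions: the primary is chosen by hashing the file name modulo the number of servers, and the replica goes on the next server in the list rather than being chosen by a second hash function. The names `SERVERS`, `REPLICATION`, and `replicas_for` are hypothetical, and I do not know whether this is how real systems do it.

```python
import hashlib

# Assumed fixed, ordered server list that every server knows in advance.
SERVERS = ["A", "B", "C"]
REPLICATION = 2  # degree of replication from the example


def replicas_for(key, servers=SERVERS, degree=REPLICATION):
    """Return the `degree` servers that should hold `key`.

    Assumption: primary = hash(key) mod N, replicas = the following
    servers in list order, wrapping around at the end.
    """
    # Use a stable hash; Python's built-in hash() for strings varies
    # between processes, so servers would disagree on placement.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    primary = int.from_bytes(digest[:8], "big") % len(servers)
    return [servers[(primary + i) % len(servers)] for i in range(degree)]


if __name__ == "__main__":
    for name in ("x", "y", "z"):
        print(name, "->", replicas_for(name))
```

If this is valid, any server could compute the location of a file locally from the file name and the server list, with no per-file table to keep consistent. Only the server list itself would have to agree everywhere.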

Finally, can you suggest a video resource (YouTube videos or a Coursera course) or a good book on distributed systems? I want to learn basic concepts like these.

Shubham
rents
