What is the purpose of Chubby sequencers?

Question

While reading Google's article about Chubby, I didn't really understand the purpose of sequencers.

Assume we have 4 entities:

Chubby cell
Client 1
Client 2
The Service we want to use, to which we send the requests that need the lock

As far as I understood the steps are:

Client 1 sends lock_request() to the Chubby cell; Chubby responds with a sequencer (assume SequenceNumber = 1).
Client 1 sends a modify_data() request with the sequencer (SequenceNumber = 1) to the Service.
The Service asks the Chubby cell whether the SequenceNumber (= 1) is valid.
Chubby acknowledges it and sets the LeasePeriod (the lock expiration period) to, say, 60 seconds. During this time no one else can acquire the lock.
After the acknowledgement, the Service caches the data about Client 1 (SequenceNumber = 1) for, say, 40 seconds.

Now, if Client 2 tries to acquire the lock during these 60 seconds, it will be rejected by the Chubby cell.

That means it is impossible for Client 2 to acquire the lock with the next SequenceNumber = 2 and send anything to the Service.

As far as I understand, the whole purpose of the SequenceNumber is to handle the situation where 2 requests reach the Service: the Service can just compare the 2 SequenceNumbers and reject the lower one, without needing to ask the Chubby cell.

But how can this situation ever happen, given the caches and the fact that Client 2 cannot get the lock while Client 1 is holding it?

please post the link to the "article". do you mean the published scientific paper? — simbo1905, Nov 17 '19 at 18:44

score 0 · Answer 1 · answered Aug 28 '21 at 14:11

It is a mistake to reason about timing in distributed systems in terms of actual times (like seconds), but I'll try to answer using the same framing.

As you said, say Client1 acquires a write lock named foo1, foo being the lock name and 1 being the generation number.

Now say the lease period is 60 seconds. At the 58th second, Client1 sends a write, say R1.

Soon after, Client1 dies.

Now, here's the catch. You assumed in your analysis that R1 would reach the server within the remaining 2 seconds, before another client, say Client2, becomes master.

THAT'S JUST NOT CERTAIN.

In a distributed system, with network latencies of fractions of a millisecond on one hand and network partitions on the other, you just cannot be sure what reaches the master first: R1 or Client2's request to become master.

This is where sequence numbers help.

The master, now knowing that foo2 exists, can reject R1, which arrived with foo1 in its metadata.
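
Here is a minimal sketch of that check. It assumes the receiving side (the Service in the question) simply remembers the highest generation number it has seen for each lock. The names FencedService and handle_write are hypothetical and are not part of Chubby's API; a real sequencer carries more than a bare number, and the receiver could also validate it against Chubby instead.

```python
class FencedService:
    """Hypothetical service that rejects writes carrying a stale lock generation."""

    def __init__(self):
        self.highest_seen = {}  # lock name -> highest generation number observed
        self.data = []          # accepted writes, in arrival order

    def handle_write(self, lock_name, generation, payload):
        current = self.highest_seen.get(lock_name, 0)
        if generation < current:
            # A newer holder of this lock has already been seen: stale request.
            return False
        self.highest_seen[lock_name] = generation
        self.data.append(payload)
        return True


service = FencedService()

# Client2 acquired the lock as generation 2, and its write arrives first.
accepted_w2 = service.handle_write("foo", 2, "W2 from Client2")

# R1, sent by Client1 under generation 1, was delayed in the network.
accepted_r1 = service.handle_write("foo", 1, "R1 from Client1")

print(accepted_w2, accepted_r1)  # True False
```

The point is that the decision does not depend on wall-clock time at all: R1 is rejected because its generation number is lower, no matter how late it arrives.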

Read more about generational clocks/logical clocks here.

A logical clock is a mechanism for capturing chronological and causal relationships in a distributed system. Often, distributed systems may have no physically synchronous global clock. Fortunately, in many applications (such as distributed GNU make), if two processes never interact, the lack of synchronization is unobservable. Moreover, in these applications, it suffices for the processes to agree on the event ordering (i.e., logical clock) rather than the wall-clock time.[1]
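
To make the quoted idea concrete, here is a small sketch of a Lamport-style logical clock, one common way to build such a clock. It is an illustration only and is not how Chubby generates its generation numbers; the class and method names are hypothetical.

```python
class LamportClock:
    """Hypothetical Lamport-style logical clock: a counter, not wall-clock time."""

    def __init__(self):
        self.time = 0

    def tick(self):
        # A local event advances the clock by one.
        self.time += 1
        return self.time

    def send(self):
        # Sending is an event; the resulting value travels with the message.
        return self.tick()

    def receive(self, remote_time):
        # On receipt, jump past both the local and the sender's clock.
        self.time = max(self.time, remote_time) + 1
        return self.time


a = LamportClock()
b = LamportClock()

a.tick()                        # local event on a
stamp = a.send()                # a sends a message stamped with its clock
received_at = b.receive(stamp)  # b's clock moves past the stamp

print(stamp, received_at)  # 2 3
```

The receive event always gets a larger value than the send event that caused it, so the two processes agree on the order of events without sharing a synchronized physical clock.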
