0

I recently read the GFS paper, but I am having trouble understanding the data structure of the checkpoint. What does the checkpoint's data structure look like, and how is it designed?

  • Welcome! I think you could make this question clearer. First and foremost, you should probably link to the paper :). It might also help someone get at what you don't understand if you at least take a stab at recapitulating what you were able to understand. – josephkibe Apr 20 '21 at 20:00

1 Answer

1

When the operation log grows beyond a certain size, the GFS master takes a checkpoint: it dumps its in-memory state (the metadata that results from applying the log) to disk in a compact B-tree-like form. In this sense, "checkpoint" refers to a process.

When the master restarts, it loads the most recent checkpoint and then replays only the operation log records written after it, which shortens recovery time. In this sense, "checkpoint" refers to the data produced by the most recent checkpointing process.
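To make the two meanings concrete, here is a minimal sketch of the checkpoint-then-replay idea. It is not GFS code: the `Master` class, the `LOG_LIMIT` threshold, the `create`/`delete` record format, and the use of a plain dict serialized as JSON (instead of a B-tree-like, memory-mappable layout) are all assumptions made for illustration.

```python
import json

LOG_LIMIT = 3  # hypothetical threshold; the paper only says "a certain size"


class Master:
    def __init__(self):
        # path -> metadata; a plain dict stands in for the B-tree-like structure
        self.namespace = {}
        # operation records written since the last checkpoint
        self.log = []
        # serialized snapshot; a string stands in for the on-disk checkpoint file
        self.checkpoint = "{}"

    def apply(self, op):
        """Apply one log record to the in-memory state."""
        if op["op"] == "create":
            self.namespace[op["path"]] = {"chunks": []}
        elif op["op"] == "delete":
            self.namespace.pop(op["path"], None)

    def mutate(self, op):
        """Log the operation, apply it, and checkpoint if the log is large."""
        self.log.append(op)
        self.apply(op)
        if len(self.log) >= LOG_LIMIT:
            self.do_checkpoint()

    def do_checkpoint(self):
        """Checkpoint as a process: dump the state, then truncate the log."""
        self.checkpoint = json.dumps(self.namespace, sort_keys=True)
        self.log = []

    @classmethod
    def recover(cls, checkpoint, log):
        """Checkpoint as data: load the snapshot, then replay the log tail."""
        master = cls()
        master.namespace = json.loads(checkpoint)
        master.checkpoint = checkpoint
        for op in log:
            master.apply(op)
        master.log = list(log)
        return master


m = Master()
for path in ["/a", "/b", "/c", "/d"]:
    m.mutate({"op": "create", "path": path})
m.mutate({"op": "delete", "path": "/a"})

# Only the records after the last checkpoint need to be replayed.
restored = Master.recover(m.checkpoint, m.log)
```

In this sketch the third `create` triggers a checkpoint, so recovery replays just the two later records rather than all five. The real checkpoint avoids even the `json.loads` step, since the paper says its layout can be mapped into memory and used without extra parsing.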

RecharBao
  • Hello, this may be a naive question: what does the checkpoint actually store such that a B-tree is needed here? Isn't the log just a sequence of records? I didn't understand this point in the paper. Thank you! – Pan Jun 08 '22 at 13:01
  • @Pan The checkpoint stores the master state that results from applying the operation log, and it is used for recovery. Keeping it in a suitable structure lets it be loaded into memory quickly when needed. The paper says: ‘The checkpoint is in a compact B-tree like form that can be directly mapped into memory and used for namespace lookup without extra parsing.’ – RecharBao Jun 11 '22 at 15:37