NameNode only has to wait for blocks to be minimally replicated before returning successfully

Question

I have a question about the block reports that DataNodes send to the NameNode when a client writes to HDFS, and about the acknowledgment the NameNode sends back to the client when the file is closed.

Could someone kindly elaborate on this paragraph from the Hadoop book:

"When the client has finished writing data, it calls close() on the stream. This action flushes all the remaining packets to the datanode pipeline and waits for acknowledgments before contacting the namenode to signal that the file is complete. The namenode already knows which blocks the file is made up of, so it only has to wait for blocks to be minimally replicated before returning successfully."

"Returning successfully" is ambiguous. It could mean returning successfully to the client (but that might not happen for hours, until the next block report arrives, which doesn't make sense to me). Or it could mean that the NameNode completes the file locally and asynchronously, possibly a few hours later, without the client having to wait for it.

This question is loosely related to the discussion here, where a comment asks whether a write in HDFS is asynchronous or synchronous.

E.g., in the async case, the close() call would return to the client immediately and the NameNode would check replication asynchronously afterwards;
in the sync case, the NameNode would have to wait for the block reports from the DataNodes before acknowledging the file close() to the client.

Another comment in that discussion points to this source for clarification, which says:

"A call to complete() will not return true until all the file's blocks have been replicated the minimum number of times. Thus, DataNode failures may cause a client to call complete() several times before succeeding".

This source seems to favor the sync explanation.

I would really appreciate it if a Hadoop engineer could advise on this. Thank you very much in advance.
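
To make the sync reading concrete, here is a toy Python model of what that quote seems to describe: complete() returns true only once every block has the minimum number of replicas, so the client retries until it does. This is only a sketch under my own assumptions, not Hadoop code. ToyNameNode, block_received, close_file and pending_reports are hypothetical names, and how or when the real NameNode learns about new replicas is deliberately left abstract, since that is exactly what I am asking about.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for the NameNode; not real Hadoop code."""
    min_replication: int = 1
    # block id -> number of replicas the NameNode has been told about
    replicas: dict = field(default_factory=dict)

    def block_received(self, block_id):
        # Models the NameNode learning about one finished replica of a block.
        self.replicas[block_id] = self.replicas.get(block_id, 0) + 1

    def complete(self, block_ids):
        # True only when every block has reached the minimum replica count.
        return all(
            self.replicas.get(b, 0) >= self.min_replication for b in block_ids
        )


def close_file(namenode, block_ids, pending_reports, max_attempts=5):
    """Call complete() until it returns True.

    Returns the number of attempts used, or None if it never succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if namenode.complete(block_ids):
            return attempt
        # Assumption: between retries, one more replica notification arrives.
        if pending_reports:
            namenode.block_received(pending_reports.pop(0))
    return None
```

In this model, if the NameNode already knows about blk_1 but not blk_2, close_file(nn, ["blk_1", "blk_2"], ["blk_2"]) fails on the first complete() call and succeeds on the second, which matches the "call complete() several times before succeeding" wording. Under the async reading, the client would not loop at all.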


0 Answers