
When a file is ingested using "hdfs dfs -put", the client computes a checksum and sends both the input data and the checksum to the DataNode for storage.
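As an illustrative sketch of that client-side step: the DFSClient checksums each chunk of `dfs.bytes-per-checksum` bytes (512 by default) before sending data down the write pipeline. The snippet below only mimics the idea with Python's stdlib `zlib.crc32`; real HDFS uses CRC32C and this is not the actual client code.

```python
import zlib

# Hypothetical illustration: checksum a byte stream in 512-byte chunks,
# the way the HDFS client checksums data before sending it to DataNodes.
# zlib.crc32 stands in for HDFS's CRC32C here.
BYTES_PER_CHECKSUM = 512  # mirrors the dfs.bytes-per-checksum default

def chunk_checksums(data: bytes) -> list:
    """Return one CRC per BYTES_PER_CHECKSUM-sized chunk of data."""
    return [zlib.crc32(data[i:i + BYTES_PER_CHECKSUM])
            for i in range(0, len(data), BYTES_PER_CHECKSUM)]

checksums = chunk_checksums(b"x" * 1300)
print(len(checksums))  # 1300 bytes -> 3 chunks (512 + 512 + 276)
```

On read, the same per-chunk checksums are recomputed and compared, which is how a corrupt replica is detected.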

How does this checksum calculation/validation happen when a file is read or written using WebHDFS? How is data integrity ensured with WebHDFS?

The Apache Hadoop documentation doesn't mention anything about it.

Chhaya Vishwakarma

1 Answer


WebHDFS is just a proxy over the usual DataNode operations. DataNodes host the WebHDFS servlets, which open standard DFSClients and read or write data through the standard pipeline, so checksums are computed and verified exactly as they are for a native client. It adds an extra hop to the normal process but does not fundamentally change it. Here is a brief overview.
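You can also ask WebHDFS for a file's checksum directly: `GETFILECHECKSUM` is a documented operation in the WebHDFS REST API. A minimal sketch of building that request URL, where the host, port, path, and user below are placeholders rather than values from this thread:

```python
# Sketch: constructing a WebHDFS GETFILECHECKSUM request URL.
# The NameNode redirects this to a DataNode, which computes the
# checksum from the block-level checksum data it already stores.
def getfilechecksum_url(host: str, port: int, path: str, user: str) -> str:
    """Build the REST URL for WebHDFS's GETFILECHECKSUM operation."""
    return (f"http://{host}:{port}/webhdfs/v1{path}"
            f"?op=GETFILECHECKSUM&user.name={user}")

url = getfilechecksum_url("namenode.example.com", 9870, "/data/f.txt", "hdfs")
print(url)
# -> http://namenode.example.com:9870/webhdfs/v1/data/f.txt?op=GETFILECHECKSUM&user.name=hdfs
```

The response is a JSON `FileChecksum` object (algorithm name plus checksum bytes), so you can compare it against another copy of the file without reading either one in full.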

Jakob Homan