I need to copy a directory from one cluster to another with a similar HDFS (both are MapR clusters).
I am planning to use the DistCp Java API, but I want to avoid duplicate copies of files in the directory. Are these operations fault tolerant? That is, if a file is not copied completely because the connection is lost, does DistCp start the copy again so that the file is copied properly?