When I run the distcp command as
hadoop distcp /a/b/c/d gs://gcp-bucket/a/b/c/
where d is a folder on HDFS containing subfolders, the result depends on whether folder c already exists on GCP.
If folder c already exists on GCP, it copies d (and its subfolders) from HDFS into c. If folder c does not exist on GCP, it creates c and copies the subfolders of d (but not d itself) into c.
So if e is a subfolder of d on HDFS and folder c exists on GCP, then the output of the following command:
hadoop distcp /a/b/c/d gs://gcp-bucket/a/b/c/
will be
gs://gcp-bucket/a/b/c/d
If e is a subfolder of d on HDFS and folder c does not exist on GCP, then the output of the following command:
hadoop distcp /a/b/c/d gs://gcp-bucket/a/b/c/
will be
gs://gcp-bucket/a/b/c/e
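For comparison, here is a minimal local sketch that shows the same two outcomes with plain cp -r. This is only an analogy: I am assuming, not confirming, that distcp and the GCS connector resolve the target path the same way. The temporary directory and the case1/case2 names are made up for illustration.

```shell
#!/bin/sh
# Local analogy only: plain cp -r, not distcp. All paths are hypothetical.
set -e
work=$(mktemp -d)
mkdir -p "$work/src/d/e"              # source folder d with subfolder e

# Case 1: target folder c already exists, so d is copied into c
mkdir -p "$work/case1/c"
cp -r "$work/src/d" "$work/case1/c"
case1=$(ls "$work/case1/c")

# Case 2: target folder c does not exist, so c is created as a copy of d
mkdir -p "$work/case2"
cp -r "$work/src/d" "$work/case2/c"
case2=$(ls "$work/case2/c")

echo "c exists:  c/$case1"
echo "c missing: c/$case2"
rm -rf "$work"
```

Locally this prints c/d for the first case and c/e for the second, which matches the two results described above.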
Why is the output of the second command not the same as the output of the first? Both commands are identical.