
We have two clusters, and we need to pull data from one cluster into the other.

The only option available to us is to pull the data through WebHDFS.

Unfortunately, as far as we can see, WebHDFS lets us pull only one file at a time, and each file requires two commands.

My question is straightforward: is there a way to pull the contents of an entire directory through WebHDFS?

**Example:**
**Directory structure in the cluster:**

    dir1
        file1
        file2
        file3


**Current behavior:**

For every file (file1, file2, and file3), I need to execute two commands to get the data.
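
To make the per-file workflow concrete, here is a minimal sketch of how it could be scripted: list the directory with the WebHDFS `LISTSTATUS` operation, then fetch each file with `OPEN`. WebHDFS answers an `OPEN` request on the NameNode with a redirect to a DataNode, which may be the two steps observed above; `urllib` follows that redirect automatically. The function names, the host `namenode.example`, and port `9870` are hypothetical, and the sketch assumes an endpoint reachable without Kerberos/SPNEGO authentication, which may not hold on a secured cluster. It still issues one `OPEN` request per file, so it does not answer the single-call question.

```python
import json
import os
import urllib.parse
import urllib.request


def webhdfs_url(base_url, hdfs_path, op, **params):
    """Build a WebHDFS REST URL such as <base>/webhdfs/v1/dir1?op=LISTSTATUS."""
    query = urllib.parse.urlencode({"op": op, **params})
    quoted = urllib.parse.quote(hdfs_path)  # hdfs_path must start with "/"
    return f"{base_url.rstrip('/')}/webhdfs/v1{quoted}?{query}"


def file_names(liststatus_body):
    """Return the names of plain files from a LISTSTATUS JSON response body."""
    statuses = json.loads(liststatus_body)["FileStatuses"]["FileStatus"]
    return [s["pathSuffix"] for s in statuses if s["type"] == "FILE"]


def pull_directory(base_url, hdfs_dir, local_dir):
    """Copy every file directly under hdfs_dir into local_dir (not recursive)."""
    os.makedirs(local_dir, exist_ok=True)
    list_url = webhdfs_url(base_url, hdfs_dir, "LISTSTATUS")
    with urllib.request.urlopen(list_url) as resp:
        names = file_names(resp.read())
    for name in names:
        remote = f"{hdfs_dir.rstrip('/')}/{name}"
        # urlopen follows the NameNode's redirect to a DataNode for us.
        with urllib.request.urlopen(webhdfs_url(base_url, remote, "OPEN")) as resp, \
                open(os.path.join(local_dir, name), "wb") as out:
            while chunk := resp.read(1 << 20):  # stream in 1 MiB chunks
                out.write(chunk)
    return names
```

Hypothetical usage: `pull_directory("http://namenode.example:9870", "/dir1", "./dir1")` would write file1, file2, and file3 into a local `dir1` folder.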

**Problem statement:** 

Is there a way, through WebHDFS, to get all the files (file1, file2, and file3) from dir1 in a single call?

Can someone please help me with this?

NOTE: DistCp is not an option for us, for security reasons.

Raja