I am trying to upload and download files to and from a Hadoop cluster using a C# app, but I couldn't find the APIs for upload and download in the documentation.

Could you please let me know how to upload and download files from Hadoop using the REST APIs?

Thanks

Kalai

1 Answer

You can use the WebHDFS REST API, as described here: http://hadoop.apache.org/docs/r1.0.4/webhdfs.html

Edit:

Create and Write to a File

Step 1:

Submit an HTTP PUT request without automatically following redirects and without sending the file data.

curl -i -X PUT "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=CREATE[&overwrite=<true|false>][&blocksize=<LONG>][&replication=<SHORT>][&permission=<OCTAL>][&buffersize=<INT>]"

The request is redirected to a datanode where the file data is to be written:

HTTP/1.1 307 TEMPORARY_REDIRECT
Location: http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE...
Content-Length: 0

Step 2:

Submit another HTTP PUT request using the URL in the Location header with the file data to be written.

curl -i -X PUT -T <LOCAL_FILE> "http://<DATANODE>:<PORT>/webhdfs/v1/<PATH>?op=CREATE..."

The client receives a 201 Created response with zero content length and the WebHDFS URI of the file in the Location header:

HTTP/1.1 201 Created
Location: webhdfs://<HOST>:<PORT>/<PATH>
Content-Length: 0

Note that the reason for the two-step create/append is to prevent clients from sending data before the redirect. This issue is addressed by the "Expect: 100-continue" header in HTTP/1.1; see RFC 2616, Section 8.2.3. Unfortunately, some software libraries (e.g. the Jetty 6 HTTP server and the Java 6 HTTP client) have bugs and do not correctly implement "Expect: 100-continue". The two-step create/append is a temporary workaround for these library bugs.
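The two-step upload above can be sketched in code. This is only a minimal illustration in Python using the standard library, not an official client: the function names (create_url, upload) are made up for this example, the whole file is read into memory for simplicity, and the cluster is assumed to accept unauthenticated requests. The same two PUT requests can be issued from any HTTP client that lets you turn off automatic redirects.

```python
import http.client
import urllib.parse


def create_url(host, port, hdfs_path, overwrite=False):
    """Build the step-1 URL (op=CREATE) addressed to the namenode."""
    query = urllib.parse.urlencode(
        {"op": "CREATE", "overwrite": "true" if overwrite else "false"}
    )
    path = urllib.parse.quote(hdfs_path.lstrip("/"))
    return f"http://{host}:{port}/webhdfs/v1/{path}?{query}"


def _put(url, body=None):
    """Send a single PUT and return (status, Location header).

    http.client never follows redirects by itself, which is what step 1 needs.
    """
    parts = urllib.parse.urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port, timeout=30)
    try:
        conn.request("PUT", f"{parts.path}?{parts.query}", body=body)
        resp = conn.getresponse()
        resp.read()  # drain the (empty) body before closing
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def upload(host, port, hdfs_path, local_path, overwrite=False):
    """Hypothetical helper: upload local_path to hdfs_path via WebHDFS."""
    # Step 1: ask the namenode where to write; no file data is sent.
    status, location = _put(create_url(host, port, hdfs_path, overwrite))
    if status != 307 or not location:
        raise RuntimeError(f"expected 307 with a Location header, got {status}")

    # Step 2: send the file data to the datanode URL from the Location header.
    # Simplification: the whole file is loaded into memory.
    with open(local_path, "rb") as f:
        data = f.read()
    status, location = _put(location, body=data)
    if status != 201:
        raise RuntimeError(f"expected 201 Created, got {status}")
    return location  # WebHDFS URI of the new file, as reported by the server
```

A call would look like upload("namenode.example.com", 50070, "/user/kalai/report.csv", "report.csv"), where the host, port, and paths are placeholders for your own cluster.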

Javier Abrego
  • I did refer to the documentation, but I couldn't find any APIs for file upload and download. – Kalai Jun 03 '14 at 04:12
  • Thanks. So for uploading I have to create a new file and write the contents of the uploaded file to it, right? What about download? Is there any way? Currently I'm using shell commands in a 'cmd' process to download a file. – Kalai Jun 05 '14 at 05:06
  • Yes, you can use a GET request [curl -i -L "http://<HOST>:<PORT>/webhdfs/v1/<PATH>?op=OPEN"] – Javier Abrego Jun 05 '14 at 10:40
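
For the download case mentioned in the last comment, a rough equivalent of curl -L with op=OPEN might look like the following. This is a hedged sketch using Python's standard library; the names open_url and download are invented for this example, and it assumes the cluster accepts unauthenticated requests. Unlike the upload, the redirect from the namenode to the datanode can simply be followed automatically.

```python
import shutil
import urllib.parse
import urllib.request


def open_url(host, port, hdfs_path):
    """Build the op=OPEN URL addressed to the namenode."""
    query = urllib.parse.urlencode({"op": "OPEN"})
    path = urllib.parse.quote(hdfs_path.lstrip("/"))
    return f"http://{host}:{port}/webhdfs/v1/{path}?{query}"


def download(host, port, hdfs_path, local_path):
    """Hypothetical helper: copy an HDFS file to local_path via WebHDFS."""
    url = open_url(host, port, hdfs_path)
    # urlopen follows the namenode's redirect to the datanode for GET
    # requests, which mirrors what the -L flag does for curl.
    with urllib.request.urlopen(url, timeout=30) as resp, \
            open(local_path, "wb") as out:
        shutil.copyfileobj(resp, out)  # stream the body to disk in chunks
```

For example, download("namenode.example.com", 50070, "/user/kalai/report.csv", "report.csv"), again with placeholder host, port, and paths.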