I am getting an error when I execute the following command on Hortonworks sandbox HDP 2.3_1:

curl -i "http://localhost:50075/webhdfs/v1/queryresult/part-m-00000?op=OPEN"

HTTP/1.1 400 Bad Request
Content-Type: application/json; charset=utf-8
Content-Length: 161
Connection: close

{"RemoteException":{"exception":"IllegalArgumentException","javaClassName":"java.lang.IllegalArgumentException","message":"java.net.UnknownHostException: null"}}

When I change the port to 50070, I get a message "curl: (7) couldn't connect to host".

The WebHDFS property is enabled in my hdfs-site.xml, and it's a single-node Hadoop cluster.

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
  <final>true</final>
</property>
Chaibi Alaa
bigdata2

3 Answers

0

Does /queryresult/part-m-00000 exist? Try hadoop dfs -ls /queryresult/part-m-00000 and see if the file is listed. If it is, check the permissions on the file: the user making the WebHDFS query needs read access.
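As a rough illustration of the permission check, here is a minimal shell sketch. The helper name `other_can_read` is hypothetical, and it assumes the WebHDFS request runs as a user who is neither the file's owner nor in its group, so only the "other" bits of the mode string matter.

```shell
# Print "yes" if the "other" read bit is set in an ls-style mode string
# such as -rw-r--r--, otherwise print "no".
# other_can_read is a hypothetical helper name, not part of Hadoop.
other_can_read() {
  mode="$1"
  # Character 8 of the 10-character mode string is the read bit for "other".
  case "$(printf '%s' "$mode" | cut -c8)" in
    r) echo yes ;;
    *) echo no ;;
  esac
}

# Usage against a live cluster (needs the hadoop client on the PATH):
#   mode=$(hadoop dfs -ls /queryresult/part-m-00000 | awk '/^[-d]/ {print $1; exit}')
#   other_can_read "$mode"
```

For the listing quoted in the comments (-rw-r--r--), the helper reports that other users can read the file, so permissions are unlikely to be the cause of this error.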

Olivier Twist
  • yes the file exists. -rw-r--r-- 1 root hdfs 59 2015-09-30 01:28 /queryresult/part-m-00000 – bigdata2 Oct 03 '15 at 19:26
  • check what the ip in sandbox is and use that instead of localhost. try that first directly in sandbox from command line and then from windows – Olivier Twist Oct 04 '15 at 21:45
  • using ip in the command in sandbox does not change the error message for port 50075; however, for port 50070, I get redirected to another site when I use ip address in the command: HTTP/1.1 307 TEMPORARY_REDIRECT Pragma: no-cache Expires: Mon, 05 Oct 2015 02:23:02 GMT Date: Mon, 05 Oct 2015 02:23:02 GMT Pragma: no-cache Content-Type: application/octet-stream Location: http://sandbox.hortonworks.com:50075/webhdfs/v1/queryresult/part-m-00000?op=OPEN&namenoderpcaddress=sandbox.hortonworks.com:8020&offset=0 Content-Length: 0 Server: Jetty(6.1.26.hwx) – bigdata2 Oct 05 '15 at 02:22
  • My host file contains the following two entries: 127.0.0.1 localhost.localdomain localhost 10.0.2.15 sandbox.hortonworks.com sandbox ambari.hortonworks.com I changed the second line to 127.0.0.1 sandbox.hortonworks.com sandbox ambari.hortonworks.com but still the same issue of redirecting to sandbox.hortonworks.com – bigdata2 Oct 05 '15 at 05:22
  • If you read the documentation for WebHDFS, OPEN does a temporary redirect; the HTTP response header will contain the "Location". So we are good so far. Having said that, this is what will work. 1. In your Linux sandbox, type ifconfig. Note the ipvm address. Then type curl -v XGET ":50075/webhdfs/v1/queryresult/part-m-0000. " This will print some output with the redirect location in the header. Now do a curl -XGET with this location. 127.0.0.1 is the same as localhost; what you want is the IP address of your Linux sandbox. Make sure to use a bridged adapter with promiscuous mode set to "allow all", not NAT. – Olivier Twist Oct 06 '15 at 06:40
  • XGET does not work for me I get the following message: * getaddrinfo(3) failed for XGET:80 * Couldn't resolve host 'XGET' * Closing connection #0 curl: (6) Couldn't resolve host 'XGET' * About to connect() to port 50075 (#0) * Trying ... connected * Connected to port 50075 (#0) > GET /webhdfs/v1/queryresult/part-m-0000 HTTP/1.1 > User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.16.2.3 Basic ECC zlib/1.2.3 libidn/1.18 libssh2/1.4.2 > Host: :50075 > Accept: */* < HTTP/1.1 400 Bad Request – bigdata2 Oct 06 '15 at 07:18
  • With a bridge adapter and other settings you indicated, I cannot even connect to the VM. So back to using NAT. – bigdata2 Oct 06 '15 at 07:39
  • It is curl -v -XGET. First time using curl?:-) And bridged adapter has to work. If not we have to figure out what you are doing wrong. – Olivier Twist Oct 07 '15 at 00:43
  • Maybe we can do a text chat sometime and I will walk you through it.. It really is not that complicated. – Olivier Twist Oct 07 '15 at 00:43
  • faced the same issue, but still no solution( – serg Jan 10 '16 at 16:31
0

First, port 50075 is not the right one to query; 50070 is the default (NameNode) port. Even so, the request still fails, because the response redirects to sandbox.hortonworks.com.

To fix it, I added the following entry to the hosts file (on Windows, located in C:\Windows\System32\drivers\etc):

127.0.0.1   sandbox.hortonworks.com

After this, my PC was able to follow the redirect. You may need to restart your HTTP client; in my case it was Chrome.
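To confirm which address a hosts file maps the redirect target to, something like the following shell sketch may help (for example inside the sandbox or in a Unix-like shell on the client). `hosts_lookup` is a hypothetical helper name, and the default path /etc/hosts is an assumption that applies to Linux; on Windows the file path from above would have to be passed explicitly.

```shell
# Print the first IP address that a hosts-format file maps to a given name.
# Arguments: <hostname> [hosts-file]; the file defaults to /etc/hosts.
# hosts_lookup is a hypothetical helper name used only for illustration.
hosts_lookup() {
  name="$1"
  file="${2:-/etc/hosts}"
  awk -v name="$name" '
    /^[[:space:]]*#/ { next }   # skip comment lines
    { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
  ' "$file"
}

# Example: hosts_lookup sandbox.hortonworks.com
```

If the lookup prints nothing, the client has no hosts entry for that name and the redirect can only succeed if DNS resolves it.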

serg
0

As per https://hadoop.apache.org/docs/r1.0.4/webhdfs.html, it is better to use the actual hostname of the machine instead of localhost.

This command works for me (HDP 2.5):

curl -i "sandbox.hortonworks.com:50075/webhdfs/v1/data/xyz.json?op=OPEN"

I couldn't get it to work with localhost.
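The comments on the first answer show the NameNode replying to op=OPEN with a 307 redirect whose Location header points at a DataNode URL under the same hostname. A minimal shell sketch of following that redirect by hand is given here; `extract_location` and `webhdfs_open` are hypothetical helper names, and the NameNode address passed in (for instance sandbox.hortonworks.com:50070) is an assumption about the setup.

```shell
# Read raw HTTP response headers on stdin and print the Location value.
# extract_location is a hypothetical helper name.
extract_location() {
  tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# Hypothetical two-step read: ask the NameNode for the file, then fetch the
# data from the DataNode URL it redirects to.
# Arguments: <namenode host:port> <absolute HDFS path>
webhdfs_open() {
  namenode="$1"   # assumption: e.g. sandbox.hortonworks.com:50070
  path="$2"       # e.g. /data/xyz.json
  location=$(curl -s -i "http://${namenode}/webhdfs/v1${path}?op=OPEN" | extract_location)
  if [ -z "$location" ]; then
    echo "no redirect received from ${namenode}" >&2
    return 1
  fi
  curl -s "$location"
}
```

curl can also follow the redirect on its own with the -L flag; either way, the client must be able to resolve the hostname that appears in the Location header.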

Cody Gray - on strike
dinesh rajput