Questions tagged [webhdfs]

WebHDFS is a REST API that supports the complete FileSystem interface for HDFS (Hadoop Distributed File System). This API can be used to connect to a Hadoop data lake from a third-party tool such as SSIS; see "Using WebHDFS to connect Hadoop Data Lake to SSIS".
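
As a rough illustration of what "REST API" means here, the sketch below builds request URLs in the `/webhdfs/v1/<path>?op=<OPERATION>` shape that appears in several of the questions listed on this page. The helper name `webhdfs_url` and the host `namenode.example.com` are hypothetical, and port 50070 is only an assumption borrowed from those excerpts; your cluster may use a different host, port, or scheme.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS-style URL (hypothetical helper, not part of any library).

    The result has the form http://<host>:<port>/webhdfs/v1/<path>?op=<OP>&key=value
    """
    # WebHDFS paths are absolute, so make sure there is a leading slash.
    if not path.startswith("/"):
        path = "/" + path
    # The operation name is conventionally written in upper case.
    query = urlencode({"op": op.upper(), **params})
    # quote() percent-encodes unsafe characters but leaves "/" intact.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed host and port, for illustration only.
list_url = webhdfs_url("namenode.example.com", 50070, "/user/alice", "liststatus")
open_url = webhdfs_url("namenode.example.com", 50070, "/tmp/data.txt", "open", offset=0)
```

A URL built this way can be passed to any HTTP client (curl, a browser, an SSIS HTTP task); secured clusters will additionally need authentication, which this sketch does not cover.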

268 questions
2
votes
2 answers

Accessing kerberos secured WebHDFS without SPnego

I have a working application for managing HDFS using WebHDFS. I need to be able to do this on a Kerberos-secured cluster. The problem is that there is no library or extension to negotiate the ticket for my app; I only have a basic HTTP…
MaBu
  • 23
  • 1
  • 5
2
votes
0 answers

REST HDFS API for HDFS Encryption Process

I want to create an encryption zone using WebHDFS. Below is the command for which I am looking for a REST implementation: hdfs crypto -createZone -keyName myKey -path /zone Can someone point me to documentation or an example for this?
Shashi
  • 2,686
  • 7
  • 35
  • 67
2
votes
1 answer

How to access Azure datalake using the webhdfs API

We're just getting started evaluating the datalake service at Azure. We created our lake, and via the portal we can see the two public URLs for the service. (One is an https:// scheme, the other an adl:// scheme) The datalake documentation states…
RickS
  • 1,071
  • 1
  • 9
  • 8
2
votes
2 answers

Python HDFS gives incorrect file size

I am trying to get the size of a file from HDFS using Python 3.5 and the hdfs library. https://pypi.python.org/pypi/hdfs/ from hdfs.client import Client if __name__ == '__main__': cl = Client("http://hostName:50070") print…
AbtPst
  • 7,778
  • 17
  • 91
  • 172
2
votes
0 answers

hadoop & webhdfs access op = open is getting 404 not found

I'm facing an issue with WebHDFS access on my Amazon EC2 machine. I installed Hadoop following this guide: https://letsdobigdata.wordpress.com/2014/01/13/setting-up-hadoop-1-2-1-multi-node-cluster-on-amazon-ec2-part-2/ I can retrieve the file status…
2
votes
0 answers

Can HDFS support pause and resume of file downloads / uploads?

I could not find information on whether Hadoop HDFS can support pausing/resuming of file downloads and uploads. Is there a capability out of the box that HDFS provides for this purpose, and if not can it be implemented using mapreduce jobs? Say,…
rbk
  • 21
  • 1
2
votes
0 answers

Alternatives to WebHCat

Hive has the option of using WebHCat to query Hive tables via REST-based APIs. WebHCat requires two calls: call 1 to submit the query via WebHCat, and call 2 to retrieve the output file via WebHDFS. Are there any other alternatives to WebHCat…
myloginid
  • 1,463
  • 2
  • 22
  • 37
2
votes
3 answers

webhdfs not working on HDP sandbox

I am getting an error when I execute the following command on Hortonworks sandbox HDP 2.3_1: curl -i "http://localhost:50075/webhdfs/v1/queryresult/part-m-00000?op=OPEN HTTP/1.1 400 Bad Request Content-Type: application/json;…
bigdata2
  • 999
  • 1
  • 11
  • 17
2
votes
2 answers

In hadoop, Is there any limit to the size of data that can be accessed through knox + webhdfs?

In Hadoop, is there any limit to the size of data that can be accessed from or ingested into HDFS through Knox + WebHDFS?
Satheesha
  • 33
  • 1
  • 5
2
votes
1 answer

Insert data in HDFS

I need to create some tables in Hive, and for this I want to insert the data into HDFS so that a Hive table is created automatically. Consider this example: I need this information stored in Hive. Could you give me an example of how I should insert…
Cristina
  • 161
  • 1
  • 1
  • 9
2
votes
0 answers

Use REST API with hadoop's HDFS

Let's say I have one text file (size 1 GB). I want to search for a particular word in the file and, if it is found, return the line number. I can execute my Java program from the command line in Linux. But what I want is some interface using REST…
spt025
  • 2,134
  • 2
  • 20
  • 26
2
votes
0 answers

Accessing kerberos secured webhdfs from browser

I'm unable to access WebHDFS from a browser (IE8) due to Kerberos security. Can anyone please help me with this? Using curl works fine: curl -i --negotiate -u:qjdht93 "http://:50070/webhdfs/v1/user/qjdht93/?op=LISTSTATUS" Below is the error in…
Chhaya Vishwakarma
  • 1,407
  • 9
  • 44
  • 72
2
votes
3 answers

Hadoop name node URL for WebHDFS

I have a clustered NameNode setup. The NameNodes are configured as active and passive. When I make a WebHDFS call, the URL to be provided is http://:/webhdfs/v1/ Since I have two NameNodes available, I have two URLs available…
Vaya
  • 560
  • 6
  • 20
2
votes
1 answer

Get directory size from WebHDFS?

I see that WebHDFS does not support directory size. In HDFS, I can use hdfs dfs -du -s -h /my/directory Is there a way to derive this from WebHDFS? I need to do this programmatically, not by viewing the page.
Brian Dolan
  • 3,086
  • 2
  • 24
  • 35
2
votes
2 answers

Spark with Webhdfs/httpfs

I would like to read a file from HDFS into Spark via HttpFS or WebHDFS. Something along the lines of sc.textFile("webhdfs://myhost:14000/webhdfs/v1/path/to/file.txt") or,…
Brian Hess
  • 21
  • 1
  • 2