Questions tagged [webhdfs]

WebHDFS is a REST API that supports the complete FileSystem interface for HDFS (Hadoop Distributed File System). This API is used to connect to a Hadoop data lake from third-party tools such as SSIS; see "Using WebHDFS to connect Hadoop Data Lake to SSIS".

268 questions
4 votes, 1 answer

webHDFS API returns Exception on every query

I set up a single-node Hadoop cluster to run some experiments with HDFS. Via web access everything looks good: I created a dedicated folder and copied a file into it from the local file system using the command line. It all appeared in the web UI. After that I tried to get access…
4 votes, 1 answer

How to list HDFS directory contents using webhdfs?

Is it possible to check the contents of a directory in HDFS using webhdfs? This would work as hdfs dfs -ls normally would, but using webhdfs instead. How do I list a webhdfs directory using Python 2.6?
DPEZ
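
For the directory-listing question above, a minimal Python sketch of the WebHDFS LISTSTATUS call might look like the following. It assumes Python 3 and only the standard library (the question asks about Python 2.6, where the imports would differ), an unsecured cluster, and a NameNode HTTP port such as 50070. The helper names `liststatus_url`, `parse_liststatus`, and `list_directory`, as well as the example host, are hypothetical.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def liststatus_url(host, port, path, user=None):
    """Build the WebHDFS LISTSTATUS URL for an absolute HDFS path."""
    params = {"op": "LISTSTATUS"}
    if user:
        # user.name only applies to simple (non-Kerberos) authentication.
        params["user.name"] = user
    return "http://%s:%d/webhdfs/v1%s?%s" % (host, port, quote(path), urlencode(params))


def parse_liststatus(body):
    """Return (name, type, length) tuples from a LISTSTATUS JSON response."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [(s["pathSuffix"], s["type"], s["length"]) for s in statuses]


def list_directory(host, port, path, user=None):
    """Fetch and parse a directory listing; needs a reachable NameNode."""
    with urlopen(liststatus_url(host, port, path, user)) as response:
        return parse_liststatus(response.read().decode("utf-8"))
```

A call such as `list_directory("namenode.example.com", 50070, "/tmp")` would then return one tuple per entry, roughly what `hdfs dfs -ls` prints as rows.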
4 votes, 0 answers

WebHDFS's redirection URL to datanode has unresolvable local hostname when on AWS

I have HDFS running on an EC2 node (pseudo multi-node setup) and I access files through the WebHDFS REST API by doing a GET on, e.g., http://…
4 votes, 1 answer

Azure Data Lake Store concurrency

I've been toying with Azure Data Lake Store, and in the documentation Microsoft claims that the system is optimized for low-latency small writes to files. Testing it out, I tried to perform a large number of writes from parallel tasks to a single file,…
evilpilaf
4 votes, 1 answer

append operation in hadoop webhdfs client

A Java client I threw together works: import java.io.File; import java.io.IOException; import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.fs.FileSystem; import org.apache.hadoop.fs.FSDataOutputStream; import…
LetMeSOThat4U
3 votes, 2 answers

PUT with an empty body using httr (on R) to webHDFS

When trying to put to WebHDFS in order to create a file and write to it (using the following link: https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#CREATE) I run into issues using httr. Using RCurl or RWebHDFS is not possible because the target…
3 votes, 1 answer

Returning ip address instead of hostname in webhdfs

I am trying to get files from Hadoop using webhdfs, and webhdfs is redirecting me to the datanodes. It's returning hostnames in the address. Is there a way to make it return IP addresses instead of hostnames?
user9040429
3 votes, 2 answers

Does webhdfs support high availability when failover happens

I am using Hadoop 2.7.1 on CentOS 7. When high availability is enabled on the Hadoop cluster and the active NameNode fails, it becomes standby. But WebHDFS doesn't support high availability, does it? What should be the alternative to send get and put…
oula alshiekh
3 votes, 1 answer

How to take namenode backup

I have a Hadoop cluster with 6 datanodes and 1 namenode, but I do not have a standby namenode or journal node. I know this is not good practice, but due to some constraints I have to continue with this for the time being. Can anyone tell me,…
Sujoy
3 votes, 1 answer

How can I import XML data into Hadoop

I am quite new to Hadoop and want to import semi-structured XML data into HDFS. What are the ways to import XML data from a remote location into HDFS, and are there any open source tools for it? Can Flume import XML data into HDFS? Thanks in advance.
avinash
3 votes, 2 answers

Accessing kerberos protected webhdfs from .Net Application(console)

I'm unable to access WebHDFS from a browser due to Kerberos security. Can anyone help me with this? Below is the error in the browser for “http://****.****/webhdfs/v1/prod/snapshot_rpx/archive?op=LISTSTATUS&user.name=us” HTTP ERROR 401 Problem accessing…
SanthiRam
3 votes, 0 answers

Hadoop WebHDFS Java Client API enable SSL and Basic Authentication

I have a Spring Boot application that uses spring-yarn-boot:2.2.0.RELEASE to get access to a Hadoop filesystem (HDFS). Operations that I do are LISTSTATUS, GETFILESTATUS and OPEN (to read a file). HDFS URI is specified through…
Tarmo
3 votes, 0 answers

Docker with WebHDFS

I have a Spark image running in a Docker container. I want to access the results saved by Spark in HDFS using WebHDFS from the host machine outside the container. For this I am using the OPEN API which has a redirect before serving the file…
Nitin
3 votes, 3 answers

Verifying checksum for files in HDFS

I'm using webhdfs to ingest data from the local file system into HDFS. Now I want to ensure the integrity of the files ingested into HDFS. How can I make sure the transferred files are not corrupted or altered? I used the webhdfs command below to get the checksum of…
Chhaya Vishwakarma
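
For the checksum question above, one possible starting point is the WebHDFS GETFILECHECKSUM operation, sketched here in Python 3 with the standard library only. The helper names and the example host are hypothetical, and the cluster is assumed to be unsecured. Note that HDFS reports a composite checksum (for example `MD5-of-1MD5-of-512CRC32`), so the value is generally not comparable with a plain MD5 of the local file; the sketch only compares two checksums that HDFS itself reported.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def checksum_url(host, port, path):
    """Build the WebHDFS GETFILECHECKSUM URL for an absolute HDFS path."""
    return "http://%s:%d/webhdfs/v1%s?op=GETFILECHECKSUM" % (host, port, quote(path))


def parse_checksum(body):
    """Return (algorithm, hex_bytes, length) from a GETFILECHECKSUM response."""
    checksum = json.loads(body)["FileChecksum"]
    return checksum["algorithm"], checksum["bytes"], checksum["length"]


def same_checksum(first, second):
    """Two HDFS checksums match only if algorithm and bytes both agree."""
    return first[:2] == second[:2]


def fetch_checksum(host, port, path):
    """Fetch a file checksum; urlopen follows the GET redirect to a DataNode."""
    with urlopen(checksum_url(host, port, path)) as response:
        return parse_checksum(response.read().decode("utf-8"))
```

Comparing `fetch_checksum(...)` for the same file on two clusters (or before and after a copy inside HDFS) is meaningful only when block size and bytes-per-checksum settings are the same on both sides, since both feed into the composite value.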
3 votes, 1 answer

Send cURL PUT command to create file in webhdfs programmatically in C++ using libcurl

I'm trying to store files into HDFS from an application written in C++. I know you can use curl from the command line: first send a PUT request, 1) curl -i -X PUT http://:50070/webhdfs/v1/?op=CREATE and then write data to the…
mintuchiha
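
The two-step CREATE flow mentioned in the question above (a PUT to the NameNode that answers with a redirect, followed by a PUT of the data to the redirect location) can be sketched as follows. This is Python 3 rather than the C++/libcurl the question asks about, it assumes an unsecured cluster, and the helper names and example host are hypothetical. `http.client` is used because it does not follow redirects on its own, which makes the two steps explicit.

```python
import http.client
from urllib.parse import quote, urlsplit


def create_url(host, port, path, overwrite=False):
    """Build the NameNode URL for step 1 of a WebHDFS CREATE."""
    flag = "true" if overwrite else "false"
    return "http://%s:%d/webhdfs/v1%s?op=CREATE&overwrite=%s" % (host, port, quote(path), flag)


def request_target(url):
    """Split a URL into (host, port, path-with-query) for http.client."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return parts.hostname, parts.port, target


def put(url, body=None):
    """Send one PUT without following redirects; return (status, Location)."""
    host, port, target = request_target(url)
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request("PUT", target, body=body)
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def create_file(host, port, path, data, overwrite=False):
    """Create an HDFS file from bytes; needs a reachable cluster."""
    # Step 1: empty PUT to the NameNode, which should redirect to a DataNode.
    status, location = put(create_url(host, port, path, overwrite))
    if status != 307 or not location:
        raise RuntimeError("expected a 307 redirect, got %s" % status)
    # Step 2: PUT the file content to the DataNode URL from the redirect.
    status, _ = put(location, body=data)
    if status != 201:
        raise RuntimeError("expected 201 Created, got %s" % status)
    return location
```

In libcurl the same shape would apply: leave redirect following off for the first request, read the `Location` header, then issue the second PUT with the payload.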