
First, I have read this post: Is there an equivalent to `pwd` in hdfs?. It says there is no 'pwd' in HDFS.

However, as I progressed with the instructions of Hadoop: Setting up a Single Node Cluster, I failed on this command:

$ bin/hdfs dfs -put etc/hadoop input
put: 'input': No such file or directory

It's weird that I succeeded with this command the first time I went through the instructions, but failed the second time. It's also weird that the command succeeds on my friend's computer, which has the same system (Ubuntu 14.04) and Hadoop version (2.7.1) as mine.

Can anyone explain what happened here? Is there some 'pwd' in HDFS after all?

  • About the Current Working Directory question: `hdfs groups` will show you which "user" is used when accessing HDFS - either an admin account (say, `hdfs`) or an end-user account (say, `johndoe`). The CWD will be `/user/hdfs/` or `/user/johndoe/` accordingly. – Samson Scharfrichter Sep 04 '15 at 11:12
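
Following up on the comment above, a quick way to check which user HDFS sees and whether its home directory exists is something like this (a rough sketch; the exact output depends on your cluster):

$ whoami             # the local account the command runs as
$ hdfs groups        # the user and groups HDFS resolves for you
$ hdfs dfs -ls       # with no path argument this lists /user/<username>, and fails if that home directory does not exist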

2 Answers


Firstly, you are trying to run the command $ bin/hdfs dfs -put etc/hadoop input as a user that does not exist in HDFS (i.e. has no home directory under /user).
Let me explain clearly with the following example in the HDP VM:

[root@sandbox hadoop-hdfs-client]# bin/hdfs dfs -put /etc/hadoop input
put: `input': No such file or directory 

Here I executed the command as the root user, and that user does not exist in HDFS on the HDP VM. Run the following command to list the existing user directories:

[root@sandbox hadoop-hdfs-client]# hadoop fs -ls /user
Found 8 items
drwxrwx---   - ambari-qa hdfs           0 2015-08-20 08:33 /user/ambari-qa
drwxr-xr-x   - guest     guest          0 2015-08-20 08:47 /user/guest
drwxr-xr-x   - hcat      hdfs           0 2015-08-20 08:36 /user/hcat
drwx------   - hive      hdfs           0 2015-09-04 09:52 /user/hive
drwxr-xr-x   - hue       hue            0 2015-08-20 09:05 /user/hue
drwxrwxr-x   - oozie     hdfs           0 2015-08-20 08:37 /user/oozie
drwxr-xr-x   - solr      hdfs           0 2015-08-20 08:41 /user/solr
drwxrwxr-x   - spark     hdfs           0 2015-08-20 08:34 /user/spark

In HDFS, if you copy a file without giving an absolute path as the destination argument, the destination is resolved against the home directory of the logged-in user (/user/<username>) and your file is placed there. Here no home directory was found for the root user.
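
Concretely, for a hypothetical user johndoe, the following two commands would be equivalent, because the relative destination input resolves to /user/johndoe/input:

hdfs dfs -put /etc/hadoop input
hdfs dfs -put /etc/hadoop /user/johndoe/input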

Now let's switch to the hive user and test:

[root@sandbox hadoop-hdfs-client]# su hive
[hive@sandbox hadoop-hdfs-client]$ bin/hdfs dfs -put  /etc/hadoop input
[hive@sandbox hadoop-hdfs-client]$ hadoop fs -ls /user/hive
Found 1 items
drwxr-xr-x   - hive hdfs          0 2015-09-04 10:07 /user/hive/input

Yay..Successfully Copied..
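
As an alternative to switching users, the HDFS superuser (typically hdfs on the HDP sandbox) could create a home directory for root instead; a minimal sketch, assuming sudo is available on the sandbox:

[root@sandbox hadoop-hdfs-client]# sudo -u hdfs hdfs dfs -mkdir -p /user/root
[root@sandbox hadoop-hdfs-client]# sudo -u hdfs hdfs dfs -chown root:root /user/root
[root@sandbox hadoop-hdfs-client]# hdfs dfs -put /etc/hadoop input      # now resolves to /user/root/input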

Hope it helps..!!!

  • It's very clear! However, as a beginner, I wonder how you enter such a [root@sandbox hadoop-hdfs-client] environment? I suppose it is not the common Linux environment but the one for Hadoop. – Warbean Sep 07 '15 at 04:52
  • It is a Hortonworks Hadoop ecosystem VM with all the setup ready. This is really good for beginners and you can find more details [here](http://hortonworks.com/hdp/downloads/). – Mr.Chowdary Sep 08 '15 at 01:55

It means that the input files need to be moved to an HDFS location.

Suppose you have an input file named input.txt that needs to be moved to HDFS; the general form of the command is:

hdfs dfs -put /input_location /hdfs_location

If there is no specific target directory in HDFS:

hdfs dfs -put /home/Desktop/input.txt /

If there is a specific target directory in HDFS (note: the directory needs to be created before proceeding, as sketched below):
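
For example, the target directory (the /MR_input path used in the next command) could be created with:

hdfs dfs -mkdir /MR_input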

hdfs dfs -put /home/Desktop/input.txt /MR_input

After that, you can run the examples:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input /output

Here the input and output arguments (/input and /output above) are paths that must be in HDFS.
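
Once the job finishes, the result can be inspected directly in HDFS; a small sketch (part-r-00000 is the usual name of the first reducer's output file):

hdfs dfs -ls /output
hdfs dfs -cat /output/part-r-00000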

Hope this helps.

  • I know using an absolute path is one way to make things work. But what confuses me is why a relative path works sometimes but fails at other times. – Warbean Sep 04 '15 at 02:48