I'm sorry if I get something wrong; I'm just a beginner. I tried to find the answer by myself, but I can't, even after going crazy rephrasing the question a thousand ways.
I installed Spark 2.0.2 and Hadoop 2.6 on my laptop, and I just want to be able to create a master and a couple of workers on my computer so I can practice storing data in HDFS, running map-reduce jobs, and so on. That way I'll learn, and eventually I could use it at work, provided the cluster there is all set up for me.
So far, I'm able to use the master and worker classes to create a master on localhost:8080 and a couple of workers on 8081 and 8082, and I can connect to that session from RStudio using the SparkR library.
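For reference, this is roughly what I'm running; the master URL and ports are just the defaults on my machine, so treat them as an example:

    # In separate command prompts (paths depend on where Spark is installed):
    #   spark-class org.apache.spark.deploy.master.Master
    #   spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077
    #   spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077
    # (8080/8081/8082 are the web UIs; the master itself listens on 7077.)

    # Then in RStudio:
    library(SparkR)
    sparkR.session(master = "spark://localhost:7077", appName = "practice")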
Now, let's say I do some work there: I create a SparkR DataFrame, transform it, duplicate it, whatever, and then I use the write.df command to save it to a file path.
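Something like this, where "myresults" is just a made-up path:

    df <- as.DataFrame(faithful)        # built-in R dataset as a SparkR DataFrame
    df2 <- filter(df, df$waiting > 50)  # some transformation
    write.df(df2, path = "myresults", source = "parquet", mode = "overwrite")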
Where on my computer is it actually being stored?
How can I explore the data I'm storing in HDFS, both via Windows Explorer and from RStudio? (getwd() returns something different from the path I wrote to.)
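What I mean is, in RStudio I get something like this, which doesn't seem to match wherever write.df is putting things:

    getwd()
    # e.g. "C:/Users/me/Documents"  <- my RStudio working directory,
    # but I can't find "myresults" under it in Windows Explorer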
And finally, if I just kill those masters and workers and start them up again later, how can I make sure they point to the same locations, so I can pick up my work where I left it?