
I want to read a Parquet file from HDFS in the sparkR shell, so I do this:

./sparkR --master yarn-client

 sqlContext <- sparkRSQL.init(sc)
 path<-"hdfs://year=2015/month=1/day=9"
 AppDF <- parquetFile(sqlContext, path)

Error: No such file or directory

But this file really does exist in HDFS. And when I wrap this code in an R file, like dataframe.R, and run `./spark-submit --master yarn ~/dataframe.R 1000`, it works well. So I think the problem is with running on yarn-client through the sparkR shell. Could anyone help solve this?

I'm using spark-1.4.0-bin-hadoop2.6

ysfseu
  • Can you describe the details of your spark/conf files? What is the content of conf/core-site.xml? – rbyndoor Jul 20 '15 at 08:41
  • @ruby, thanks for replying. I tried this again. Although the error message still appears, it actually reads the file successfully. I don't know why, but it works – ysfseu Jul 20 '15 at 09:33
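As the comment about conf/core-site.xml hints, the filesystem authority that a bare `hdfs://` path resolves against is the `fs.defaultFS` property in that file. A minimal sketch of extracting it (the XML written here is a stand-in for your real `$HADOOP_CONF_DIR/core-site.xml`, and `namenode:9000` is a made-up host/port):

```shell
# Write a sample core-site.xml; in practice you would read the real one
# from $HADOOP_CONF_DIR instead of generating it.
cat > core-site.xml <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode:9000</value>
  </property>
</configuration>
EOF

# Extract the value of fs.defaultFS. A quick grep/sed one-liner is enough
# for a sketch; for robust parsing use xmllint or `hdfs getconf`.
DEFAULT_FS=$(grep -A1 '<name>fs.defaultFS</name>' core-site.xml \
  | sed -n 's/.*<value>\(.*\)<\/value>.*/\1/p')
echo "$DEFAULT_FS"   # prints hdfs://namenode:9000
```

On a live cluster, `hdfs getconf -confKey fs.defaultFS` returns the same value directly.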

1 Answer


I am not sure whether this will help. You might need to use the full path, including the hostname and port of the defaultFS, like:

path<-"hdfs://localhost:9000/year=2015/month=1/day=9"
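To see why the original path fails: in a URI like `hdfs://year=2015/month=1/day=9`, Hadoop parses `year=2015` as the hostname (authority), not as a directory. Prefixing the partition path with the defaultFS value fixes this. A sketch of the composition, where `hdfs://namenode:9000` is a placeholder you would replace with your cluster's actual `fs.defaultFS`:

```shell
# Placeholder defaultFS; on a real cluster, obtain it with:
#   hdfs getconf -confKey fs.defaultFS
DEFAULT_FS="hdfs://namenode:9000"

# The partition directory from the question, as an absolute HDFS path.
PARTITION="/year=2015/month=1/day=9"

# Fully qualified URI to pass to parquetFile() in SparkR.
FULL_PATH="${DEFAULT_FS}${PARTITION}"
echo "$FULL_PATH"   # prints hdfs://namenode:9000/year=2015/month=1/day=9
```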
Abdulrahman
  • I've tried this method, but the warning is still there. As I mentioned in the comments, the file is indeed read in – ysfseu Jul 22 '15 at 02:49