DataFrame Object is not showing any data

Question

I was trying to create a DataFrame from an HDFS file using the spark-csv library, as shown in this tutorial.

But when I tried to get the count of the DataFrame, it returned 0.

Here is what my file looks like:

employee.csv:

empid,empname
1000,Tom
2000,Jerry

I loaded the above file using:

val empDf = sqlContext.read.format("com.databricks.spark.csv").option("header","true").option("delimiter",",").load("hdfs:///user/.../employee.csv");

empDf.printSchema() gives the proper schema, with empid and empname as string fields, so the header and the delimiter were read correctly.

But when I try to display the DataFrame, empDf.show prints only the column header with no data in it, and empDf.count returns 0 records.

Please let me know if I have missed a required step.

score 0 · Accepted Answer · edited May 23 '17 at 12:33


Make sure that the spark-csv artifact you use was built for the same Scala version as your Spark distribution.

For example, if your Spark distro is built with Scala 2.10 (the default Scala version for Databricks prebuilt Spark distros), you will need spark-csv_2.10. The spark-csv_2.11 artifact (shown in the mentioned tutorial) will not work: it returns an empty DataFrame with only the column names. See my answer to this SO question for a similar case.
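One way to check which artifact suffix you need is to print the Scala version from inside spark-shell. The sketch below is only an illustration: the object name SparkCsvArtifact and its helper methods are hypothetical, and the release number 1.5.0 is an assumed example, not necessarily the one used in the tutorial.

```scala
object SparkCsvArtifact {
  // Reduce a full Scala version such as "2.10.5" to its binary version "2.10".
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  // Build the Maven coordinate of the spark-csv artifact for a given Scala version.
  // `release` is the spark-csv release number; the caller chooses it.
  def coordinate(scalaFull: String, release: String): String =
    s"com.databricks:spark-csv_${binaryVersion(scalaFull)}:$release"

  def main(args: Array[String]): Unit = {
    // Scala version of the running JVM process; inside spark-shell this is
    // the Scala version your Spark distribution was built with.
    val running = scala.util.Properties.versionNumberString
    println(s"Scala version: $running")
    // 1.5.0 is an assumed example release; substitute the one you need.
    println(s"Matching artifact: ${coordinate(running, "1.5.0")}")
  }
}
```

The printed coordinate can then be passed to spark-shell with the --packages option, so that the suffix after spark-csv_ matches the Scala version reported above.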


answered Aug 16 '16 at 17:19

desertnaut



Thank you, this resolved my issue. My Scala version is 2.10, but I was using the 2.11 version of spark-csv. Switching to the 2.10 spark-csv library fixed it. – Krishna Reddy Aug 17 '16 at 13:57
