
I created a sample table from spark-shell by writing a DataFrame to an external table in ORC format, partitioned by one column. Both reading and writing work fine within spark-shell, but when I try to execute the same select query from the hive shell, it throws an exception.

Below is the code I implemented:

scala> val df = sc.parallelize(Seq((1,"Sudhir",30),(2,"Sourabh",27),(3,"Suman",35),(4,"Basu",30))).toDF("id","name","age")

scala> import org.apache.spark.sql.SaveMode

scala> df.write.partitionBy("age").format("ORC").mode(SaveMode.Append).saveAsTable("Abc1")

scala> val df3 = sqlContext.sql("select * from abc1")

scala> df3.dropDuplicates(Seq("id")).show()

Time taken: 0.486 seconds, Fetched: 35 row(s)

hive (sba_db_2018)> select * from Abc1;
OK
abc1.col
Failed with exception java.io.IOException:java.io.IOException: hdfs://nag1-vm-sprintba-11.synapse.com:8020/apps/hive/warehouse/sba_db_2018.db/abc1/age=27/part-r-00001-31ebd621-02bb-4db5-9170-5405010e68fd.orc not a SequenceFile
Time taken: 0.147 seconds
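For context, one workaround often suggested for this kind of mismatch is to create the table through Hive DDL first, so the metastore records a real ORC SerDe rather than the Spark-specific source that saveAsTable registers, and then load it with insertInto. A minimal sketch of that path (untested here; the table name abc1_orc and the dynamic-partition setting are my assumptions, and it assumes sqlContext is a Hive-enabled HiveContext):

    scala> import org.apache.spark.sql.SaveMode

    // Create the table via Hive DDL so Hive knows it is stored as ORC.
    scala> sqlContext.sql(
         |   """CREATE TABLE IF NOT EXISTS abc1_orc (id INT, name STRING)
         |     |PARTITIONED BY (age INT)
         |     |STORED AS ORC""".stripMargin)

    // Allow dynamic partitioning for the append.
    scala> sqlContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict")

    // insertInto writes into the existing Hive table definition; the
    // partition column must come last in the DataFrame's column order.
    scala> df.select("id", "name", "age").write.mode(SaveMode.Append).insertInto("abc1_orc")

With the table created this way, a select from the hive shell should read the ORC files through Hive's own ORC input format instead of falling back to SequenceFile.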

Sudhir