I created a Hive table on Avro data using an .avsc schema file and need to rename some of the columns. I used aliases, but they do not seem to work: the renamed columns are returned as null when I query the table.

SPARK DATAFRAME TO SAVE DATA

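import spark.sqlContext.implicits._  // enables toDF on a local Seq

val data = Seq(("john", "adams"), ("john", "smith"))
val columns = Seq("fname", "lname")
val df = data.toDF(columns: _*)
// Writes Avro files whose fields are named fname and lname
df.write.format("avro").save("/test")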

AVSC Spec file

{
  "type" : "record",
  "name" : "test",
  "doc" : " import of test",
  "fields" : [  {
    "name" : "first_name",
    "type" : [ "null", "string" ],
    "default" : null,
    "aliases" : [ "fname" ],
    "columnName" : "fname",
    "sqlType" : "12"
  }, {
    "name" : "last_name",
    "type" : [ "null", "string" ],
    "default" : null,
    "aliases" : [ "lname" ],
    "columnName" : "lname",
    "sqlType" : "12"
  } ],
  "tableName" : "test"
}

EXTERNAL HIVE TABLE

CREATE EXTERNAL TABLE test
STORED AS AVRO
LOCATION '/test'
TBLPROPERTIES ('avro.schema.url'='/test.avsc');

HIVE QUERY

SELECT last_name from test;

This returns null, even though the Avro files contain data under the original name, i.e. lname.
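
To rule out a problem with the written files, the data can be read back in the same spark-shell session and its schema inspected. This is only a minimal sketch, assuming the spark-avro module used for the write above is still on the classpath; the name written is just an illustrative variable.

```scala
// Read back the Avro files written to /test (same path as the write above).
val written = spark.read.format("avro").load("/test")

// Print the field names stored in the files; these come from the
// DataFrame columns (fname, lname), not from the names in the .avsc file.
written.printSchema()

// Show the rows to confirm the values themselves are present.
written.show()
```

If the values appear here under fname and lname, the nulls would seem to come from how the Hive table resolves first_name and last_name against the file schema, rather than from the data itself.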

Ajith Kannan
  • Found this post from 8 years ago. Does Hive still not support aliases? https://user.hive.apache.narkive.com/atNZS0xL/avro-aliases-in-hive – leftjoin Sep 09 '21 at 20:43
  • Thanks @leftjoin, it is returning null. I added the full code snippet, which I tried on Spark 2.4. – Ajith Kannan Sep 09 '21 at 23:08