I am trying to read a GeoJSON file in Hive on Amazon EMR, and it seems this is not possible. I do not get any error, but select * from table
returns only NULL values. I have read here that this is not supported, but that question is three years old. Is there an update regarding this issue?
Code
hadoop fs -cat /home/hadoop/Geographic_unit1/ | head
This shows that the GeoJSON file is in HDFS. Now, in Hive:
CREATE TABLE d (PARCEL_ID bigint ,....)
ROW FORMAT SERDE 'com.esri.hadoop.hive.serde.EsriJsonSerDe'
STORED AS INPUTFORMAT 'com.esri.json.hadoop.EnclosedEsriJsonInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
LOAD DATA INPATH '/home/hadoop/Geographic_unit2/' OVERWRITE INTO TABLE d;
The query returns rows, but every column in them is NULL.
Please let me know if you need any further information.