
My Hive external table (non-partitioned) has its data in an S3 bucket. It is an incremental table.

Until today, all the files arriving at this location through an Informatica process had the same structure, and all of their fields are columns in my incremental table.

Now I have a requirement where a new file comes to the same bucket in addition to the existing ones, but with additional columns.

I have re-created table with the additional columns.

But when I query the table, I see that the newly added columns are still blank for the specific line items where they should have values.

Can I actually have a single external table pointing to an S3 location having different file structures?

spark_dream
  • _"I see that the newly added columns are still blank for those specific line items where they are not supposed to be"_ >> but you don't give any useful information about *(a)* the input file format, i.e. CSV, Avro, JSON, whatever, and the file structure, *(b)* the table structure, *(c)* a sample record in the file, *(d)* the same exact record as displayed by Hive. – Samson Scharfrichter Apr 12 '17 at 22:06
  • You need to have the underlying data in a format that supports schema evolution, something like Parquet, ORC, or Avro, depending on the use case. – Pushkr Apr 13 '17 at 00:28
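The behaviour the comments point at can be sketched outside Hive. With a plain delimited text table, Hive's default SerDe maps fields to columns purely by position, so rows coming from the older, narrower files simply have no values for the trailing added columns and read back as NULL. A minimal Python sketch of that positional mapping (file contents, column names, and values below are hypothetical, not from the question):

```python
import csv
import io

# Hypothetical sample data: old files carry 3 fields, the new file carries 5.
old_file = "1,alice,2017-04-01\n2,bob,2017-04-02\n"
new_file = "3,carol,2017-04-03,NY,premium\n"

# Columns of the re-created table, wider than the old files.
columns = ["id", "name", "load_date", "state", "tier"]

def read_positional(text, columns):
    """Map delimited fields to columns by position, padding short rows
    with None -- roughly what Hive's text SerDe does for missing
    trailing columns."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        padded = fields + [None] * (len(columns) - len(fields))
        rows.append(dict(zip(columns, padded)))
    return rows

rows = read_positional(old_file + new_file, columns)
print(rows[0]["state"])  # None -- old-format row, the column is absent
print(rows[2]["state"])  # NY   -- new-format row carries the value
```

So a single external table over mixed text files only works cleanly when the new columns are appended at the end; for anything more, a self-describing format such as Parquet, ORC, or Avro resolves columns by name/schema rather than position.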

0 Answers