1、Background:
I have a Hive external table A that was created in text format; the HDFS data for its partitions is also text+gz.
Table A is referenced by thousands of SQL files, and any of its 5 years of historical partitions may be queried.
We now have a better storage format, Parquet. To reduce switching costs, I plan to convert table A to a Parquet table, with parquet+gz data for new partitions and text+gz data for the old partitions. The business must be able to read any partition of table A through both SparkSQL and HiveSQL.
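As a sketch of the intended migration (table name, columns, and location below are hypothetical, not the real DDL), the idea in HiveQL would look roughly like this:

```sql
-- Hypothetical illustration of the plan; names and paths are made up.
-- Table A was originally created as a text table:
CREATE EXTERNAL TABLE a (id STRING, val STRING)
PARTITIONED BY (dt STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/warehouse/a';

-- Switch the table-level default format to Parquet so that new
-- partitions are registered as Parquet; partitions already in the
-- metastore keep the SerDe/InputFormat recorded at ADD PARTITION time.
ALTER TABLE a SET FILEFORMAT PARQUET;
```

The key point this relies on is that the Hive metastore stores storage-descriptor information per partition, not only per table.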
2、Verification process:
2.1、Create the table (the CREATE TABLE statement was attached as a screenshot, not shown here)
2.2、Add partitions:
- partition 20210702 points to a text+gz path
- partition 20210703 points to a parquet+gz path
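Since each partition carries its own SerDe/InputFormat in the metastore, the mixed layout above could be expressed per partition like this (partition column name and paths are assumptions for illustration):

```sql
-- Old partition stays text+gz; its partition metadata records TEXTFILE:
ALTER TABLE a ADD PARTITION (dt='20210702')
LOCATION '/warehouse/a/dt=20210702';

-- New partition holds parquet+gz data; set its format explicitly so the
-- metastore records Parquet input/output formats for this partition only:
ALTER TABLE a ADD PARTITION (dt='20210703')
LOCATION '/warehouse/a/dt=20210703';
ALTER TABLE a PARTITION (dt='20210703') SET FILEFORMAT PARQUET;
```

HiveSQL reads each partition with the format recorded for that partition; the question is whether SparkSQL does the same.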
3、Error: (the stack trace was attached as a screenshot, not shown here)
4、Expectation:
Is there a solution, such as a parameter configuration, that can solve this problem?
What I have done: I found https://issues.apache.org/jira/browse/SPARK-24965. Following the stack trace in the error, I looked at the SparkSQL source code, but I could not find where it distinguishes the Hive table metadata from the per-partition metadata.
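One configuration that may be relevant: by default Spark replaces the Hive Parquet SerDe with its own built-in Parquet reader (`spark.sql.hive.convertMetastoreParquet=true`), and that built-in path decides the file format from the table-level metadata rather than from each partition's metadata. Disabling the conversion makes Spark read through the Hive SerDe, which does honor the per-partition format. This is a sketch of the idea, not verified on this exact HDP/Spark combination:

```sql
-- In spark-sql (or via spark.conf), fall back to the Hive SerDe read
-- path so the per-partition InputFormat/SerDe from the metastore is used:
SET spark.sql.hive.convertMetastoreParquet=false;
```

Note that disabling the conversion also disables Spark's vectorized Parquet reader for this table, so Parquet partitions may read more slowly.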
5、Configuration environment: hdp2.7.3, sparksql2.3, hive1.2