How can I tell Spark to collect column statistics only for a specific partition?

WARN SparkSqlAstBuilder: Partition specification is ignored when collecting column statistics: PARTITION(myPart='myValue')

Spark logs the warning above and seems to ignore the partition filter in my statement:

ANALYZE TABLE ${fullyQualifiedTable}
PARTITION(${table.partitionColumn} = '$partitionVal')
COMPUTE STATISTICS
FOR COLUMNS ${co.mkString(", ")}
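
For reference, here is a minimal, Spark-free sketch of how the two variants of the statement can be assembled. It assumes, as the warning suggests, that Spark collects column statistics at table level only, while a `PARTITION` clause without `FOR COLUMNS` yields partition-level statistics (row count and size). `TableInfo`, `AnalyzeStatements`, and the sample table and column names are hypothetical stand-ins, not part of my actual code.

```scala
// Hypothetical stand-in for the object that holds the partition column name.
final case class TableInfo(partitionColumn: String)

object AnalyzeStatements {
  // Partition-level statistics: PARTITION clause, no FOR COLUMNS clause.
  def partitionStats(fullyQualifiedTable: String, table: TableInfo, partitionVal: String): String =
    s"""ANALYZE TABLE $fullyQualifiedTable
       |PARTITION(${table.partitionColumn} = '$partitionVal')
       |COMPUTE STATISTICS""".stripMargin

  // Column statistics: FOR COLUMNS clause, no PARTITION clause (table-wide).
  def columnStats(fullyQualifiedTable: String, co: Seq[String]): String =
    s"""ANALYZE TABLE $fullyQualifiedTable
       |COMPUTE STATISTICS
       |FOR COLUMNS ${co.mkString(", ")}""".stripMargin

  def main(args: Array[String]): Unit = {
    val table = TableInfo("myPart") // sample values, assumed for illustration
    println(partitionStats("db.my_table", table, "myValue"))
    println(columnStats("db.my_table", Seq("col_a", "col_b")))
    // Each string would then be passed to spark.sql(...) on an existing SparkSession.
  }
}
```

Neither variant gives per-partition column statistics, which is what I am actually after.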
Georg Heiler
  • https://community.hortonworks.com/questions/103263/cannot-generate-stats-for-partitioned-hive-table.html might be related – Georg Heiler Jul 02 '18 at 10:27
  • Is it possible to calculate statistics on a temporarily registered table, i.e. a Spark DataFrame filtered to the desired partition (yes), and then reliably store those statistics in the table properties of the original table (?) – Georg Heiler Jul 02 '18 at 11:15

0 Answers