I have a Hive table based on an Avro schema. The table was created with the following query:
CREATE EXTERNAL TABLE datatbl
PARTITIONED BY (date STRING, time INT)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
WITH SERDEPROPERTIES (
'avro.schema.url'='path to schema file on HDFS')
STORED as INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '<path on hdfs>';
So far we have been inserting data into the table after setting the following properties in the session:
hive> set hive.exec.compress.output=true;
hive> set avro.output.codec=snappy;
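For context, a typical insert session looks roughly like the sketch below. The source table `staging_tbl` and the partition values are hypothetical placeholders, not our real names, and the sketch assumes the columns of `staging_tbl` match the Avro schema of `datatbl`.

```sql
-- Session-level settings: they only apply to the current Hive session,
-- so every session that writes to the table has to repeat them.
SET hive.exec.compress.output=true;
SET avro.output.codec=snappy;

-- staging_tbl and the partition values are hypothetical placeholders.
-- Backticks guard against date/time being treated as keywords.
INSERT INTO TABLE datatbl PARTITION (`date`='2015-01-01', `time`=0)
SELECT * FROM staging_tbl;
```

Because both settings live in the session rather than in the table definition, nothing prevents a session from writing without them.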
However, if someone forgets to set the two properties above, the data is written uncompressed. Is there a way to enforce compression on the table itself, so that the data is always compressed even when these two properties are not set?