I am using Hive with MapReduce.
I have tried several configurations (always the same settings, but with different values). The job launches mappers, but no reducers.
The settings I have used are listed below (I have tried the byte values for 64 MB, 128 MB and 256 MB):
SET hive.exec.reducers.bytes.per.reducer=134217728;
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
SET hive.merge.size.per.task=134217728;
SET hive.merge.smallfiles.avgsize=67108864;
SET mapred.max.split.size=134217728;
SET parquet.block.size=134217728;
SET dfs.blocksize=134217728;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
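For reference, this is how the three sizes convert to bytes, shown on one setting as a sketch. Which settings get scaled together is an assumption here; the listing above shows the 128 MB run.

```sql
-- Byte values for the sizes tried (1 MB = 1024 * 1024 bytes):
--   64 MB  =  64 * 1024 * 1024 =  67108864
--   128 MB = 128 * 1024 * 1024 = 134217728
--   256 MB = 256 * 1024 * 1024 = 268435456

-- Example: the 256 MB variant of the bytes-per-reducer setting.
SET hive.exec.reducers.bytes.per.reducer=268435456;
```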
The main objective is to run this query as efficiently as possible:
INSERT OVERWRITE TABLE my_table2 PARTITION(partition) SELECT * FROM mytable1;
This is one of the INFO messages printed when running the Hive query:
INFO : Hadoop job information for Stage-1: number of mappers: 675; number of reducers: 0
I have run this query on four tables of different sizes: fewer than 100,000 rows, fewer than 10,000,000 rows, fewer than 100,000,000 rows, and more than 100,000,000 rows (each with more than 20 and fewer than 30 columns).
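One variant I am considering is sketched below. As far as I understand, a plain SELECT * with no join, aggregation or sort compiles to a map-only job, so the reducer settings may have nothing to apply to. Adding DISTRIBUTE BY on the dynamic partition column should introduce a reduce stage. The column name `partition` is the placeholder from the query above, not a real column name, and would need to be replaced with the actual partition column.

```sql
-- Sketch only: `partition` is the placeholder column name from the query above.
-- DISTRIBUTE BY sends rows with the same partition value to the same reducer,
-- which adds a reduce stage to the job.
INSERT OVERWRITE TABLE my_table2 PARTITION(partition)
SELECT * FROM mytable1
DISTRIBUTE BY partition;
```

If this behaves as expected, the INFO message should report a non-zero number of reducers, and hive.exec.reducers.bytes.per.reducer would then influence how many are started.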