I am new to Hive and Hadoop and have just created a table (ORC file format) in Hive. I am now trying to build a bitmap index on it. Every time I run the index build query, Hive starts a MapReduce job to do the indexing. At some point the job hangs and one of my nodes fails; the failing node differs from one retry to the next, so the problem is probably not with a specific node. I tried increasing mapreduce.child.java.opts to 2048 MB, but that produced errors about using more memory than was available, so I also increased mapreduce.map.memory.mb and mapreduce.reduce.memory.mb to 8 GB. All other settings are left at their defaults.
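For reference, the setup described above corresponds roughly to the following HiveQL. This is only a sketch: the table, column, and index names (my_table, my_col, my_idx) are placeholders, the memory values are the ones mentioned above converted to MB, and the -Xmx form of the heap setting is an assumption about how the 2048 MB value was passed.

```sql
-- Container sizes for map and reduce tasks (8 GB = 8192 MB)
SET mapreduce.map.memory.mb=8192;
SET mapreduce.reduce.memory.mb=8192;

-- JVM heap for the task; property name as written above,
-- the -Xmx syntax is an assumption
SET mapreduce.child.java.opts=-Xmx2048m;

-- Define the bitmap index without building it yet
-- (my_idx, my_table, my_col are placeholder names)
CREATE INDEX my_idx
ON TABLE my_table (my_col)
AS 'BITMAP'
WITH DEFERRED REBUILD;

-- Build the index; this is the step that launches the MapReduce job
ALTER INDEX my_idx ON my_table REBUILD;
```

The JVM heap has to fit inside the container size with some headroom, which is why the heap value here is well below the 8192 MB container limit.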
Any help with which configuration settings I am missing would be really appreciated.
For context, the table I am trying to index has 2.4 billion rows, is 450 GB in size, and has 3 partitions.