I have a Tez problem: when I run about 14 queries at the same time, some of them are delayed by more than 5 minutes, even though cluster utilization is only 14%.
This is the log message I am referring to:
INFO SessionState: [HiveServer2-Background-Pool: Thread-322319]: Get Query Coordinator (AM) 308.84s
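To see how often this delay occurs, the wait time can be pulled out of the HiveServer2 log. The following is a minimal Python sketch, assuming every such line has the same shape as the single sample above; the function names and the 60-second threshold are hypothetical, chosen only for illustration.

```python
import re

# Pattern derived only from the one sample line in this post; other Hive
# versions may format the message differently (assumption).
AM_WAIT = re.compile(r"Get Query Coordinator \(AM\)\s+([0-9]+(?:\.[0-9]+)?)s")


def am_wait_seconds(line):
    """Return the AM wait in seconds, or None if the line does not match."""
    match = AM_WAIT.search(line)
    return float(match.group(1)) if match else None


def slow_am_waits(lines, threshold_s=60.0):
    """Yield (seconds, line) for lines whose AM wait exceeds threshold_s."""
    for line in lines:
        seconds = am_wait_seconds(line)
        if seconds is not None and seconds > threshold_s:
            yield seconds, line.rstrip("\n")


if __name__ == "__main__":
    sample = [
        "INFO SessionState: [HiveServer2-Background-Pool: Thread-322319]: "
        "Get Query Coordinator (AM) 308.84s",
    ]
    for seconds, line in slow_am_waits(sample):
        print(f"{seconds:.2f}s  {line}")
```

Passing an open log file instead of `sample` would list every query that waited longer than the threshold for its AM.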
My configuration is the following:
yarn.scheduler.maximum-allocation-mb = 188000
yarn.app.mapreduce.am.resource.mb = 16000
tez.am.resource.memory.mb = 8000
hive.tez.container.size = 8192
tez.runtime.io.sort.mb = 2048
tez.am.launch.cmd-opts = default - .8
tez.runtime.unordered.output.buffer.size-mb = 800
hive.server2.tez.sessions.per.default.queue = 2
tez.session.am.dag.submit.timeout.secs = 900
tez.am.session.min.held.containers=8
hive.prewarm.enabled = TRUE
This is a 15-node cluster with 254 GB of RAM and 32 cores per node.
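As a rough sanity check on these numbers, the arithmetic can be sketched in Python. This is only a back-of-the-envelope estimate: it uses physical RAM as an upper bound because `yarn.nodemanager.resource.memory-mb` is not given above, and the number of default queues (`DEFAULT_QUEUES`) is an assumption, since the post does not state it.

```python
NODES = 15
RAM_PER_NODE_GB = 254
AM_MB = 8000              # tez.am.resource.memory.mb
CONCURRENT_QUERIES = 14
SESSIONS_PER_QUEUE = 2    # hive.server2.tez.sessions.per.default.queue
DEFAULT_QUEUES = 1        # ASSUMPTION: not stated in the post


def cluster_memory_mb(nodes, ram_per_node_gb):
    """Physical RAM in MB; the memory YARN can allocate will be lower."""
    return nodes * ram_per_node_gb * 1024


def am_memory_share(queries, am_mb, total_mb):
    """Fraction of cluster memory used if every query had its own AM."""
    return queries * am_mb / total_mb


def pooled_sessions(sessions_per_queue, queues):
    """Number of Tez sessions HiveServer2 keeps ready in its default pool."""
    return sessions_per_queue * queues


if __name__ == "__main__":
    total = cluster_memory_mb(NODES, RAM_PER_NODE_GB)
    share = am_memory_share(CONCURRENT_QUERIES, AM_MB, total)
    pool = pooled_sessions(SESSIONS_PER_QUEUE, DEFAULT_QUEUES)
    print(f"cluster memory (physical): {total} MB")
    print(f"14 AMs at {AM_MB} MB each: {share:.1%} of cluster memory")
    print(f"pooled sessions: {pool} for {CONCURRENT_QUERIES} concurrent queries")
```

Under these assumptions, AM memory is a small share of the cluster, while the number of pooled sessions is well below the number of concurrent queries, so the session pool may be worth checking before the AM size.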
Any clue what might be happening? Is the AM sized correctly? I don't get out-of-memory errors, just these long wait times when all the queries are running at once, even though together they process only 35 million records.
Thanks