I'm trying to use Hive on Tez to query ORC-format data stored in S3. The Tez AM schedules tasks very slowly, and many map tasks remain in "PENDING" for a long time.
The cluster has more than enough resources: over 6 TB of memory and more than 1,000 vcores are available, each container in this job needs only 2 GB of memory, and this is the only job running on the YARN cluster. Even so, the AM is slow to schedule tasks.
Is there any way I can speed up task scheduling?