
Our Hive query creates 9 MapReduce jobs and 17 stages (when I ran the EXPLAIN command, the output showed 17 stages along with STAGE DEPENDENCIES). Every child job has the same mapreduce.job.name.

To distinguish these child jobs, is there any way to set mapreduce.job.name inside the Hive query so that each job's name shows which stage it belongs to? The existing job name for all 9 child jobs is:

Job.Name : hive_query_map_reduce_job

Is there a way to get the job names to appear in the JobTracker as:

Job.Name : hive_query_map_reduce_job_stage_1
Job.Name : hive_query_map_reduce_job_stage_2
Job.Name : hive_query_map_reduce_job_stage_3
...

I referred to "How do I control a hive job name but keep the stage info?", but it did not work as expected. I tried setting mapreduce.job.name at multiple places inside the query file with different values, but all the child jobs take the last value I assigned. Say my query file is hiveQuery.q:

hiveQuery.q

set hiveconf:mapreduce.job.name="unique name 1";
...
--some query statements
...
set hiveconf:mapreduce.job.name="unique name 2";
...
--some query statements
...
set hiveconf:mapreduce.job.name="unique name 3";

For the above query, all 9 MapReduce jobs took "unique name 3" as the job name. I also tried hive.query.name and hive.query.string, but those didn't help. Is this possible? Does anyone know how to achieve this?
