The old Hadoop API has been deprecated for some time now, yet there is not much information about the new one (I'm not talking about YARN, but about the new MapReduce API described at http://hadoopbeforestarting.blogspot.com/2012/12/difference-between-hadoop-old-api-and.html). I have searched for days for a way to enable it by default. The only solutions I've seen so far are setting configuration properties in an Oozie workflow.xml or calling JobConf.setUseNewMapper(true) and JobConf.setUseNewReducer(true) inside my own MapReduce job. So my question is: how can I enable the new API by default, so that every single job uses it, including jobs generated by Hive, HBase, etc.? I tried setting mapred.mapper.new-api and mapred.reducer.new-api to true in mapred-site.xml, but it doesn't work.
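For reference, the mapred-site.xml entries I tried look roughly like the fragment below. This is only a sketch of the attempt described above, assuming the reducer counterpart of the mapper property is named mapred.reducer.new-api; it is not a confirmed way to switch the default.

```xml
<!-- mapred-site.xml (fragment): cluster-wide attempt to turn on the new API. -->
<!-- Setting these here did NOT change the behaviour of generated jobs for me. -->
<configuration>
  <property>
    <name>mapred.mapper.new-api</name>
    <value>true</value>
  </property>
  <property>
    <!-- assumed name of the reducer counterpart -->
    <name>mapred.reducer.new-api</name>
    <value>true</value>
  </property>
</configuration>
```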
Moreover, I found the list of deprecated properties: http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-project-dist/hadoop-common/DeprecatedProperties.html. I think the new property names only work with the new API, because after setting mapreduce.tasktracker.map.tasks.maximum and mapreduce.tasktracker.reduce.tasks.maximum in mapred-site.xml to something other than the default value, I still get the default value (2). If I set the deprecated properties mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum instead, it works like a charm.
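To make the comparison concrete, here is a sketch of the two variants in mapred-site.xml. The value 4 is only an illustrative assumption (any value other than the default of 2 shows the same effect); the property names are the ones mentioned above.

```xml
<!-- mapred-site.xml (fragment): slot limits per TaskTracker. -->
<!-- The value 4 is a hypothetical example, not a recommendation. -->
<configuration>
  <!-- New-style names: ignored in my setup, the default (2) is still used. -->
  <property>
    <name>mapreduce.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapreduce.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>

  <!-- Deprecated names: these are the ones that actually take effect for me. -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>4</value>
  </property>
</configuration>
```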