
I installed RHadoop on a Hortonworks Sandbox, following these instructions: http://www.research.janahang.com/install-rhadoop-on-hortonworks-hdp-2-0/

Everything seems to have installed correctly, but when I run the test script at the bottom I get an error. The key message appears to be: "REDUCE capability required is more than the supported max container capability in the cluster. Killing the Job. reduceResourceReqt: 4096 maxContainerCapability:2250".

How can I set maxContainerCapability, or otherwise fix this issue? Any help would be welcome. Thanks.

The full error output is below:

Be sure to run hdfs.init()
14/09/09 14:29:25 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/09/09 14:29:27 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.4.0.2.1.1.0-385.jar] /tmp/streamjob4407691883964292767.jar tmpDir=null
14/09/09 14:29:29 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/192.168.32.128:8050
14/09/09 14:29:29 INFO client.RMProxy: Connecting to ResourceManager at sandbox.hortonworks.com/192.168.32.128:8050
14/09/09 14:29:31 INFO mapred.FileInputFormat: Total input paths to process : 1
14/09/09 14:29:32 INFO mapreduce.JobSubmitter: number of splits:2
14/09/09 14:29:32 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1410297633075_0001
14/09/09 14:29:33 INFO impl.YarnClientImpl: Submitted application application_1410297633075_0001
14/09/09 14:29:33 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1410297633075_0001/
14/09/09 14:29:33 INFO mapreduce.Job: Running job: job_1410297633075_0001
14/09/09 14:29:42 INFO mapreduce.Job: Job job_1410297633075_0001 running in uber mode : false
14/09/09 14:29:42 INFO mapreduce.Job:  map 100% reduce 100%
14/09/09 14:29:43 INFO mapreduce.Job: Job job_1410297633075_0001 failed with state KILLED due to: MAP capability required is more than the supported max container capability in the cluster. Killing the Job. mapResourceReqt: 4096 maxContainerCapability:2250
Job received Kill while in RUNNING state.
REDUCE capability required is more than the supported max container capability in the cluster. Killing the Job. reduceResourceReqt: 4096 maxContainerCapability:2250

14/09/09 14:29:43 INFO mapreduce.Job: Counters: 2
    Job Counters
            Total time spent by all maps in occupied slots (ms)=0
            Total time spent by all reduces in occupied slots (ms)=0    
14/09/09 14:29:43 ERROR streaming.StreamJob: Job not Successful!
Streaming Command Failed!
Error in mr(map = map, reduce = reduce, combine = combine, vectorized.reduce,  :
hadoop streaming failed with error code 1
Calls: wordcount -> mapreduce -> mr
Execution halted
14/09/09 14:29:49 INFO fs.TrashPolicyDefault: Namenode trash configuration: Deletion interval = 360 minutes, Emptier interval = 0 minutes.
Moved: 'hdfs://sandbox.hortonworks.com:8020/tmp/file1f937beb4f39' to trash at: hdfs://sandbox.hortonworks.com:8020/user/root/.Trash/Current
user3357415

2 Answers


To do this on Hortonworks 2.1, I had to:

  1. Increase the VirtualBox memory from 4096 MB to 8192 MB (I don't know if that was strictly necessary).
  2. Enable Ambari from http://my.local.host:8000.
  3. Log into Ambari at http://my.local.host:8080.
  4. Change the values of yarn.nodemanager.resource.memory-mb and yarn.scheduler.maximum-allocation-mb from their defaults to 4096.
  5. Save and restart everything (via Ambari).

This got me past the "capability required" errors, but the actual wordcount.R still doesn't complete. Simpler calls like hdfs.ls("/data") do work, however.
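If you would rather make the same change by hand instead of through Ambari, the two properties from step 4 live in yarn-site.xml on the sandbox. A minimal sketch of the equivalent entries (values assumed to be 4096, matching the 4096 MB the failing job requests) might look like:

    <!-- yarn-site.xml: hypothetical hand-edited equivalent of step 4 -->
    <property>
        <!-- total memory the NodeManager may hand out to containers -->
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>4096</value>
    </property>
    <property>
        <!-- largest single container the scheduler will grant; this is the
             maxContainerCapability value reported in the error above -->
        <name>yarn.scheduler.maximum-allocation-mb</name>
        <value>4096</value>
    </property>

As in step 5, the ResourceManager and NodeManager have to be restarted before the new limits take effect.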

schnee

This memory issue was not easy to fix; I switched over to the Cloudera platform instead, and everything worked as intended.

user3357415