
I'm working with RapidMiner to extract association rules from a big dataset. Radoop (the extension for the Hadoop ecosystem) and its SparkRM operator let me run the whole FP-Growth workflow, from retrieving the data from Hive to exploring the analysis results. My setup:

• Windows 8.1
• Hadoop 2.6
• Spark 1.5
• Hive 2.1

I have configured spark-defaults.conf as follows:

# spark.master                     yarn
# spark.eventLog.enabled           true
# spark.eventLog.dir               hdfs://namenode:8021/directory
# spark.serializer                 org.apache.spark.serializer.KryoSerializer
# spark.driver.memory              2G
# spark.driver.cores                    1
# spark.yarn.driver.memoryOverhead  384MB
# spark.yarn.am.memory             1G
# spark.yarn.am.cores               1
# spark.yarn.am.memoryOverhead      384MB
# spark.executor.memory            1G
# spark.executor.instances          1
# spark.executor.cores              1
# spark.yarn.executor.memoryOverhead    384MB
# spark.executor.extraJavaOptions  -XX:+PrintGCDetails -Dkey=value -Dnumbers="one two three"
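
For reference: lines beginning with # in spark-defaults.conf are treated as comments, so these settings only take effect with the markers removed. The settings I intend are the following (I write the overhead values as plain megabyte numbers here, since I'm not sure Spark 1.5 accepts the MB suffix for these properties):

spark.master                        yarn
spark.eventLog.enabled              true
spark.eventLog.dir                  hdfs://namenode:8021/directory
spark.serializer                    org.apache.spark.serializer.KryoSerializer
spark.driver.memory                 2g
spark.driver.cores                  1
spark.yarn.driver.memoryOverhead    384
spark.yarn.am.memory                1g
spark.yarn.am.cores                 1
spark.yarn.am.memoryOverhead        384
spark.executor.memory               1g
spark.executor.instances            1
spark.executor.cores                1
spark.yarn.executor.memoryOverhead  384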

In my yarn-site.xml file I have:

<property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>localhost:8030</value>
</property>

<property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>localhost:8033</value>
</property>

<property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>localhost:8031</value>
</property>

<property>
    <name>yarn.resourcemanager.resource.cpu-vcores</name>
    <value>2</value>
</property>

<property>
    <name>yarn.resourcemanager.resource.memory-mb</name>
    <value>2048</value>
</property>

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
</property>

<property>
    <name>yarn.resourcemanager.address</name>
    <value>localhost:8032</value>
</property>

<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>localhost:8088</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

<property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/E:/tweets/hadoopConf/userlog</value>
    <final>true</final>
</property>

<property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/E:/tweets/hadoopConf/temp/nm-localdir</value>
</property>

<property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
</property>

<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>2048</value>
</property>

<property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
</property>

<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
</property>

<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>1</value>
</property>     

<property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
</property>

<property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>3</value>
</property>

<property>
<name>yarn.application.classpath</name>
<value>
/tweets/hadoop/,
/tweets/hadoop/share/hadoop/common/*,
/tweets/hadoop/share/hadoop/common/lib/*,
/tweets/hadoop/share/hadoop/hdfs/*,
/tweets/hadoop/share/hadoop/hdfs/lib/*,
/tweets/hadoop/share/hadoop/mapreduce/*,
/tweets/hadoop/share/hadoop/mapreduce/lib/*,
/tweets/hadoop/share/hadoop/yarn/*,
/tweets/hadoop/share/hadoop/yarn/lib/*,
/C:/spark/lib/spark-assembly-1.5.0-hadoop2.6.0.jar
</value>
</property>
</configuration>
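
As I read these settings, the resources available on my single-node cluster come out to:

• NodeManager capacity: 2048 MB of memory and 1 vcore in total
• Container sizes: 512 MB minimum to 2048 MB maximum, 1 to 3 vcores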

The quick connection test to Hadoop completes successfully, but when I run the RapidMiner process, it fails with an error:

Process failed before getting into running state. this indicates that an error occurred during submitting or starting the spark job or writing the process output or the exception to the disc. Please check the logs of the spark job on the YARN Resource Manager interface for more information about the error.

On localhost:8088 I see this diagnostics information: [screenshot: YARN application diagnostics]

This is the scheduler view for the job: [screenshot: YARN scheduler state]

I'm new to Hadoop and Spark, and I can't figure out how to configure the memory efficiently.


1 Answer


This error message means that the submitted job couldn't allocate the required cluster resources (vcores, memory) before a timeout, so it failed to run (most likely more was requested than is available in total, otherwise it could have waited forever). Based on the content of your yarn-site.xml, I assume the cluster is deployed on localhost. In that case you can check the resources available to Spark-on-YARN jobs on the http://localhost:8088/cluster/scheduler page (i.e. the YARN Resource Manager interface). During Radoop process execution you can also check the corresponding YARN/Spark application logs there for more information about the requested amount and type of resources. With that information you can fine-tune your cluster, probably along the lines of allowing applications to use more resources.
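To make this concrete with the numbers from your question (assuming here that Radoop submits in yarn-cluster mode; in yarn-client mode the same arithmetic applies to the AM and executor containers instead): your NodeManager offers 2048 MB in total (yarn.nodemanager.resource.memory-mb) and a single container is capped at 2048 MB (yarn.scheduler.maximum-allocation-mb). The driver container alone would need

    2048 MB (spark.driver.memory = 2G)
  +  384 MB (spark.yarn.driver.memoryOverhead)
  = 2432 MB, rounded up to the next 512 MB allocation step = 2560 MB

which exceeds both limits, so YARN can never grant the request and the job waits until the submission timeout. Even in yarn-client mode, the AM container (1024 MB + 384 MB, rounded up to 1536 MB) plus one executor (likewise 1536 MB) together need 3072 MB, more than the node's 2048 MB. Once you have the application ID from the Resource Manager UI, you can also pull the full log from the command line:

yarn logs -applicationId <application ID>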

I would also suggest looking around in the Radoop docs to check which resource allocation setup would fit both your use case and your system. Radoop can execute its Spark jobs using different resource allocation policies; these policies describe how Radoop requests resources for Spark job execution from YARN. By adjusting this setting you might be able to fit into the resources available on the cluster side. You can read more about these policies here.
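As a sketch of the cluster-side alternative (the concrete values below are assumptions; adapt them to the amount of RAM your machine can spare), you could raise the YARN limits in yarn-site.xml so that a 2 GB driver plus its overhead actually fits:

<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>4096</value>
</property>

<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>4096</value>
</property>

Alternatively, shrink the Spark side (for example spark.driver.memory 1g) so that the requested containers fit under the current 2048 MB caps. Restart YARN after changing these values.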

  • I'm using the static, default configuration policy for Spark. I just added the scheduler state to the question. – asma Dec 20 '18 at 18:28