I am trying to debug a Spark application that runs in a clustered/distributed environment, but I have not succeeded. The application is Java-based and I launch it from Eclipse. The Spark master/worker configuration is provided in Java code only.

I can debug the code on the driver side, but once the flow moves into Spark (i.e. a call to .map(..)), the debugger doesn't stop, because that code runs in the workers' JVMs.

Is there any way I can achieve this?

I have tried passing the following options to Tomcat through Eclipse: -Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=7761,suspend=n

and setting the corresponding port under Debug -> Remote Java Application.

But with these settings I get the error: Failed to connect to remote VM. Connection refused
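
A "Connection refused" error generally means nothing is accepting connections on that host and port at the moment the debugger tries to attach. The sketch below is one hedged way to check this before attaching Eclipse. The class name DebugPortProbe and the method isListening are made up for illustration, and the default host and port simply mirror the values mentioned above. It is a plain TCP probe and does not speak the debug protocol.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether anything accepts TCP connections on host:port.
public class DebugPortProbe {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    public static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "connection refused", timeouts, and unresolvable hosts.
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions taken from the question: local machine, port 7761.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 7761;
        String state = isListening(host, port, 2000)
                ? "is accepting connections"
                : "refused the connection or timed out";
        System.out.println(host + ":" + port + " " + state);
    }
}
```

If the probe reports that the port is refused, the JVM you are trying to reach was most likely not started with the debug agent on that port, or it runs on a different host than the one configured in Eclipse.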

If anybody has any solution to this, please help.

jammer

1 Answer

I faced the same issue while configuring Spark debugging against a remote master. After I installed Spark on my Ubuntu machine, it worked fine. If you really want to debug, my suggestions are:

1- Configure Spark on your test machine; then you can easily debug applications.
2- Use IntelliJ IDEA; I have used it for debugging when I had to use a remote Spark.

EDITED: If you are going to use IntelliJ IDEA, you can easily configure remote debugging as explained here: Debugging Apache Spark Jobs

Zia Kiyani
  • I have tried this on Windows as well as on Linux (CentOS), but nothing works. Can you please share the configuration you used for remote debugging: where you provided the settings required for debugging remote Spark, and which jars you included? – jammer Jun 25 '15 at 05:18
  • I installed Hadoop and Spark on my Ubuntu machine, then debugged the Spark application as I would any other Java or Scala application, with master set to local. No configuration was required when using a local-machine cluster (Spark and Eclipse running on the same machine with one node). – Zia Kiyani Jun 25 '15 at 05:47
  • I can't use IntelliJ IDEA due to project restrictions. I am running a cluster with a master and 3 worker nodes, and the Java application in Eclipse, all on the same machine. – jammer Jun 25 '15 at 08:55
  • Have you tried setting master to local? Try that first; don't run in cluster mode. – Zia Kiyani Jun 25 '15 at 10:03
  • I don't have any problem with local mode. It runs fine, and I am able to debug it with master set to local. – jammer Jun 26 '15 at 05:28
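
For the cluster case discussed in these comments, one approach worth trying is to load the debug agent in the executor JVMs rather than in Tomcat, since the code inside .map(..) runs there. The sketch below only builds the option string and pairs it with the Spark property spark.executor.extraJavaOptions. The class ExecutorDebugOptions and its methods are hypothetical names used for illustration, and port 7761 is taken from the question. This is not the configuration used by anyone in this thread.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds JDWP agent options for Spark executor JVMs.
public class ExecutorDebugOptions {

    /** Builds a JDWP agent option; -agentlib:jdwp is the newer form of -Xdebug -Xrunjdwp. */
    public static String jdwpAgent(int port, boolean suspend) {
        return "-agentlib:jdwp=transport=dt_socket,server=y,suspend="
                + (suspend ? "y" : "n") + ",address=" + port;
    }

    /** Returns the Spark property that passes extra JVM options to executors. */
    public static Map<String, String> sparkDebugProperties(int port, boolean suspend) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("spark.executor.extraJavaOptions", jdwpAgent(port, suspend));
        return props;
    }

    public static void main(String[] args) {
        // Port 7761 is an assumption carried over from the question.
        sparkDebugProperties(7761, false)
                .forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Each entry could then be applied with SparkConf.set(key, value) before the context is created, and Eclipse attached to the worker host on that port. One caveat, stated as an assumption about this setup: with several workers on the same machine, only one executor JVM can bind a fixed port, so it may be simplest to run a single executor while debugging.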