How to attach a debugger to a remote Hadoop instance

Question

I am not looking for the so-called "debugging" solutions that rely on println. I want to attach a real debugger to a running Hadoop instance and debug it from a different machine.

Is this possible? If so, how? With jdb?

How will you know which task tracker you want to attach to? Or is that unimportant? — davek, May 31 '13 at 07:58

score 4 · Accepted Answer · answered May 31 '13 at 13:06

A nice explanation is given at LINK.

To debug the TaskTracker, follow these steps:

1. Edit conf/hadoop-env.sh to include the following line:

export HADOOP_TASKTRACKER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"

2. Start Hadoop (bin/start-dfs.sh and bin/start-mapred.sh). With suspend=n the TaskTracker starts normally and listens on port 5000 for a debugger; set suspend=y instead if you want it to block until a debugger connects.
3. Connect to the server from Eclipse using a "Remote Java Application" debug configuration (the TaskTracker's host, port 5000), and set your breakpoints.
4. Run a MapReduce job.
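
Since the question mentions jdb: the same setup can be sketched from the command line. This is only an illustration, not a tested recipe for any particular cluster. The host name tasktracker.example.com is a placeholder, and -agentlib:jdwp is the newer spelling of the -Xdebug -Xrunjdwp options on Java 5 and later. The script only builds and prints the attach command, so you can check it before running it on the debugging machine.

```shell
# Placeholder values (assumptions): replace with your TaskTracker host and port.
DEBUG_HOST="tasktracker.example.com"
DEBUG_PORT=5000

# Goes in conf/hadoop-env.sh on the remote node.
# suspend=n: the daemon starts without waiting for a debugger.
# suspend=y: the daemon blocks until a debugger attaches.
JDWP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=${DEBUG_PORT}"
export HADOOP_TASKTRACKER_OPTS="${JDWP_OPTS}"

# Command to run on the debugging machine once the TaskTracker is up.
ATTACH_CMD="jdb -connect com.sun.jdi.SocketAttach:hostname=${DEBUG_HOST},port=${DEBUG_PORT}"
echo "${ATTACH_CMD}"
```

The port must be reachable from the debugging machine, so a firewall rule or an SSH tunnel may be needed.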

score 1 · Answer 2 · answered May 31 '13 at 08:03

I've never done it that way, as I'd rather my "real" jobs run unhindered by debugging overhead (which can, in some circumstances, change the environment conditions anyway). Instead, I debug locally against a pseudo-instance (normal debugging in Eclipse is no problem at all), copying specific files from the live environment once I've isolated where the problem lies (for example, by using counters).
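
One possible way to set up such a local run from the command line is sketched below. It assumes Hadoop 1.x property names and a job driver that goes through ToolRunner (so that -D options are honoured). The jar name, class name, and paths are hypothetical. In local mode the tasks run inside the client JVM, so a debugger attached to that JVM can step through mapper and reducer code.

```shell
# Hypothetical job jar and driver class (assumptions, not from the thread).
JOB_JAR="myjob.jar"
JOB_CLASS="com.example.MyJob"

# Hadoop 1.x properties selecting the in-process local job runner
# and the local filesystem instead of HDFS.
LOCAL_FLAGS="-D mapred.job.tracker=local -D fs.default.name=file:///"

# Make the client JVM wait for a debugger on port 5000 before running the job.
export HADOOP_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5000"

# Build and print the command; input/ and output/ are placeholder local paths.
RUN_CMD="hadoop jar ${JOB_JAR} ${JOB_CLASS} ${LOCAL_FLAGS} input/ output/"
echo "${RUN_CMD}"
```

Whether HADOOP_OPTS reaches the client JVM unchanged depends on how conf/hadoop-env.sh is written in your installation.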
