I am not looking for the so-called "debugging" solutions that rely on println. I mean attaching a real debugger to a running Hadoop instance and debugging it from a different machine.
Is this possible? How? jdb?
A nice explanation is given at LINK.
To debug the TaskTracker, follow these steps.
Edit conf/hadoop-env.sh so that it contains the following line:
export HADOOP_TASKTRACKER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"
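Start Hadoop (bin/start-dfs.sh and bin/start-mapred.sh). With server=y and address=5000, the TaskTracker JVM then listens for a debugger connection on port 5000.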
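Since the question mentions jdb: once the TaskTracker is listening, attaching from another machine could look roughly like the sketch below. The host name is a made-up placeholder, and the port is assumed to match the address=5000 setting above; any JDWP-capable debugger (for example a remote debug configuration in an IDE) should be able to connect the same way.

```shell
# Hypothetical values: use your TaskTracker's host name and the port
# given as address=... in HADOOP_TASKTRACKER_OPTS.
DEBUG_HOST="tasktracker.example.com"
DEBUG_PORT=5000

# jdb's socket-attach connector speaks JDWP over TCP to the remote JVM.
JDB_CMD="jdb -connect com.sun.jdi.SocketAttach:hostname=${DEBUG_HOST},port=${DEBUG_PORT}"

# Print the command; run it yourself once the TaskTracker is up.
echo "$JDB_CMD"
```

The debug port has to be reachable from the machine you attach from, so a firewall between the two hosts may need a rule for it.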
I've never done it that way, as I'd rather my "real" jobs run unhindered by debugging overhead (which can, in some circumstances, change the environment conditions anyway). Instead I debug locally against a pseudo-instance, where normal debugging in Eclipse is no problem at all, and copy specific files over from the live environment once I've isolated (e.g. by using counters) where the problem lies.
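If you prefer the command line to Eclipse for the local approach, something along these lines might work on the older Hadoop releases that use conf/hadoop-env.sh. This is only a sketch under assumptions: the jar name, driver class, and paths are hypothetical, the driver is assumed to go through ToolRunner so that the -D options are picked up, and port 5000 is an arbitrary choice.

```shell
# Make the client JVM wait for a debugger on port 5000 before running.
# In local mode the map and reduce tasks run inside this same JVM.
export HADOOP_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=y"

# Generic options selecting the in-process job runner and the local file system.
LOCAL_ARGS="-D mapred.job.tracker=local -D fs.default.name=file:///"

# Hypothetical jar, driver class, and paths. The command is only printed here;
# drop the echo to actually launch the job.
echo bin/hadoop jar myjob.jar MyJob $LOCAL_ARGS input/ output/
```

Because suspend=y is set, the job does not start until a debugger attaches, which makes it possible to set breakpoints in the mapper or reducer first.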