5

I am not looking for the so-called "debugging" solutions that rely on println. I want to attach a real debugger to a running Hadoop instance and debug it from a different machine.

Is this possible? How? jdb?

T. Webster

2 Answers

4

This is nicely explained at LINK

To debug the TaskTracker, follow these steps.

  1. Edit conf/hadoop-env.sh to include the following:

    export HADOOP_TASKTRACKER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=5000,server=y,suspend=n"

  2. Start Hadoop (bin/start-dfs.sh and bin/start-mapred.sh)

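  3. With suspend=n the TaskTracker starts normally and listens on port 5000 for a debugger. Set suspend=y instead if you want it to block until a debugger connects.
  4. Connect to the server using an Eclipse "Remote Java Application" debug configuration and add your breakpoints
  5. Run a MapReduce job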
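
As a rough sketch, the hadoop-env.sh line from step 1 can be written with the port and the suspend flag pulled out into variables, so it is easy to switch between "start normally" and "wait for the debugger". DEBUG_PORT and DEBUG_SUSPEND are hypothetical names chosen for this example, not Hadoop settings, and tasktracker-host in the comment is a placeholder for your own host name.

```shell
# Hypothetical helper variables (not part of Hadoop); adjust as needed.
DEBUG_PORT=5000      # port the JVM listens on for the debugger
DEBUG_SUSPEND=n      # n = start normally, y = wait until a debugger attaches

# Same JDWP options as in step 1, built from the variables above.
export HADOOP_TASKTRACKER_OPTS="-Xdebug -Xrunjdwp:transport=dt_socket,address=${DEBUG_PORT},server=y,suspend=${DEBUG_SUSPEND}"

# From the other machine, jdb can attach over the same socket
# (host name below is a placeholder):
#   jdb -connect com.sun.jdi.SocketAttach:hostname=tasktracker-host,port=5000
```

This also answers the "jdb?" part of the question: any JDWP-capable debugger, jdb or Eclipse, can attach to that port, provided it is reachable from the debugging machine.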
twid
1

I've never done it that way, as I'd rather my "real" jobs run unhindered by debug overhead (which can, in some circumstances, change the runtime conditions anyway). Instead I debug "locally" against a pseudo-instance, where normal debugging in Eclipse is no problem at all, and copy over specific files from the live environment once I've isolated where the problem lies (e.g. by using counters).

davek