
I'm using Linux with Hadoop, Cloudera and HBase.

Could you tell me how to correct this error?

Error: Could not find or load main class org.apache.nutch.crawl.InjectorJob

The following command gave me the error:

src/bin/nutch inject crawl/crawldb dmoz/

If you need any other information, please ask.


1 Answer


I think you probably missed a step or two. Please confirm:

  1. Did you install Apache Ant, then navigate to the Nutch folder and run "ant"?
  2. Did you set the environment variables (a sketch of typical settings follows this list):
    • NUTCH_JAVA_HOME: The java implementation to use. Overrides JAVA_HOME.
    • NUTCH_HEAPSIZE: The maximum amount of heap to use, in MB. Default is 1000.
    • NUTCH_OPTS: Extra Java runtime options. Multiple options must be separated by white space.
    • NUTCH_LOG_DIR: Log directory (default: $NUTCH_HOME/logs)
    • NUTCH_LOGFILE: Log file (default: hadoop.log)
    • NUTCH_CONF_DIR: Path(s) to configuration files (default: $NUTCH_HOME/conf). Multiple paths must be separated by a colon ':'.
    • JAVA_HOME
    • NUTCH_JAVA_HOME
    • NUTCH_HOME
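
As a minimal sketch, the exports could look like the lines below (the JDK path and the Nutch install directory are assumptions; substitute your own locations, e.g. in ~/.bashrc):

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64   # assumed JDK location, adjust for your system
export NUTCH_JAVA_HOME=$JAVA_HOME
export NUTCH_HOME=/home/youruser/apache-nutch-2.2.1   # assumed Nutch source directory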

If you build with "ant", a new folder is created under your Nutch directory at runtime/local, and that is the directory from which you must actually run Nutch.
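As a rough sketch (assuming NUTCH_HOME points at your Nutch source tree, as in the exports above), the build-and-run sequence would be:

cd $NUTCH_HOME        # the unpacked Nutch source tree
ant                   # builds Nutch and creates runtime/local
cd runtime/local
bin/nutch inject crawl/crawldb dmoz/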

Tip: Try reading this page.
