
I've installed Nutch 2.3 on Windows 10 through Cygwin. I'm following this Nutch tutorial, https://cwiki.apache.org/confluence/display/nutch/NutchTutorial, and everything worked until this point, where I'm stuck:

    $ ./nutch inject crawl/crawldb urls
    Error: Could not find or load main class org.apache.nutch.crawl.InjectorJob

However, InjectorJob.java is already present in C:\cygwin64\home\apache-nutch-2.3\src\java\org\apache\nutch\crawl.

I've tried ant -find apache-nutch-2.3, and the build from build.xml completes successfully, but the error still persists.

EDIT:

Solved by running Nutch from the correct path, as quoted from the question "Could not find or load main class org.apache.nutch.crawl.InjectorJob": "If you install using "ant", then you will get a new folder in /nutch called /nutch/runtime/local and this is from where you must actually run nutch."
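
For anyone hitting the same error, here is a minimal sketch of how one might check for the built runtime before running Nutch. The helper name `nutch_runtime_dir` is hypothetical (it is not part of Nutch), and the sketch assumes the ant build has placed an executable `bin/nutch` under `runtime/local` of the source tree, as the quote above describes:

```shell
# Hypothetical helper (not part of Nutch): print the directory from which
# bin/nutch should be run, given the root of the Nutch source tree.
# Assumption: the ant build creates runtime/local/bin/nutch under that root.
nutch_runtime_dir() {
  src_root="$1"
  candidate="$src_root/runtime/local"
  if [ -x "$candidate/bin/nutch" ]; then
    # The built runtime exists; this is the directory to cd into.
    printf '%s\n' "$candidate"
  else
    # No built runtime found, so report it on stderr and fail.
    echo "No runtime/local/bin/nutch under $src_root; build with ant first" >&2
    return 1
  fi
}
```

With a source tree at, say, ~/apache-nutch-2.3 (an example path), one could then run `cd "$(nutch_runtime_dir ~/apache-nutch-2.3)"` and repeat the inject command via `bin/nutch` from there, rather than from the `bin` directory of the source tree.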

Kay zhrani
  • The [linked tutorial](https://cwiki.apache.org/confluence/display/nutch/NutchTutorial#NutchTutorial-Steps) is for Nutch 1.x, so you definitely should install Nutch 1.x (right now, 1.15 is the latest release on this branch). Nutch 1.x and 2.x are based on different architectures, and Nutch 1.x is easier to use and also better maintained. So I would recommend using 1.x (the 1.15 release or a build from branch "master"). – Sebastian Nagel Sep 12 '19 at 06:36
  • Can Nutch 1.x work with HBase and Hadoop? Because later on I want my project to act as a service in a distributed environment. – Kay zhrani Sep 15 '19 at 05:55
  • Nutch 1.x runs on Hadoop but it uses map and sequence files to hold the data. If HBase is a strong requirement, then 2.x is the only option. – Sebastian Nagel Sep 15 '19 at 10:04
