I'm running Ubuntu 14.04 and trying to get a basic Nutch web crawl running, to no avail. Following this tutorial, I set up the following components:
- Ubuntu 14.04
- HBase 0.90.4
- Nutch 2.2.1
- Solr 4.3.1
I confirmed that both HBase and Solr are running and populated the urls/seed.txt file. Then, when I run:
bin/nutch inject urls
I get the following output, and then Nutch seems to hang:
InjectorJob: starting at 2014-06-09 23:38:49
InjectorJob: Injecting urlDir: urls/seed.txt
This Stack Overflow question seems similar to mine; however, I am not behind a proxy, so its answer does not apply.
Any help in resolving this issue would be greatly appreciated.
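
For anyone trying to reproduce this, the preparation steps described above could look roughly like the sketch below. It is not an exact record of what I ran: the seed URL, the use of `jps` to look for the HBase master process, and Solr's default port 8983 are all assumptions.

```shell
# Create the seed directory and a seed file (one URL per line).
# The URL below is only a placeholder.
mkdir -p urls
printf '%s\n' 'http://nutch.apache.org/' > urls/seed.txt

# Check for the HBase master JVM (jps ships with the JDK).
if jps 2>/dev/null | grep -q HMaster; then
  echo "HBase master: running"
else
  echo "HBase master: not found"
fi

# Check that Solr answers over HTTP (port 8983 is assumed to be the default).
if curl -sf -o /dev/null http://localhost:8983/solr/ 2>/dev/null; then
  echo "Solr: reachable"
else
  echo "Solr: not reachable"
fi
```

Both checks only report a status and never abort, so the script can be run from the Nutch runtime directory before calling the inject command.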