
I am using Nutch 2.2, HBase 0.94, and Gora 0.4, and I execute the following steps:

1. nutch inject seed.txt
2. nutch generate -batchId 231
3. nutch fetch 231
4. nutch parse 231
5. nutch updatedb 231

I get the HTML content of a particular page, say (http://www.flipkart.com/mens-clothing/t-shirts?otracker=hp_nmenu_sub_men_0_T-Shirts), but when I execute step 4

nutch parse 231

and look at the webpage table created in HBase, there is an ol (outlink) column family, but it is empty.

Can anyone help me get all the outlinks for this page?

Thanks in advance

sachingupta
  • Make sure that the fetch phase succeeded in downloading the content of the URL. The webpage row is created even if the fetch phase gets empty content for the URL. – Do Do Mar 18 '16 at 20:19
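
One rough way to follow the comment's advice is to run the five steps from the question through a small wrapper that stops at the first step that exits with an error, so a failed fetch is not silently followed by parse. This is only a sketch: the function name run_cycle and the variables NUTCH, SEED, and BATCH_ID are hypothetical names introduced here, and a zero exit status from the fetch step does not by itself prove that page content was actually downloaded.

```shell
#!/bin/sh
# Hypothetical wrapper around the steps listed in the question.
# NUTCH, SEED and BATCH_ID are assumed names, not Nutch settings.
NUTCH="${NUTCH:-nutch}"        # command used to invoke Nutch
SEED="${SEED:-seed.txt}"       # seed file from the question
BATCH_ID="${BATCH_ID:-231}"    # batch id from the question

# Run inject, generate, fetch, parse and updatedb in order.
# The && chain stops at the first step that returns non-zero,
# and the function returns that step's failure status.
run_cycle() {
    "$NUTCH" inject "$SEED" &&
    "$NUTCH" generate -batchId "$BATCH_ID" &&
    "$NUTCH" fetch "$BATCH_ID" &&
    "$NUTCH" parse "$BATCH_ID" &&
    "$NUTCH" updatedb "$BATCH_ID"
}
```

Calling run_cycle and watching for the first failing step narrows down where the cycle breaks. If every step succeeds, running something like scan 'webpage', {COLUMNS => 'ol', LIMIT => 5} in the HBase shell should show whether any outlinks were stored in the ol column family.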

0 Answers