I am trying to use Nutch 1.14 to crawl a website. On some of its pages the content is loaded through AJAX, so I am trying to integrate the interactive Selenium plugin (protocol-interactiveselenium) to run the JavaScript needed to fetch the dynamic data.
As per the documentation, I made the following changes:
#Added SearchHandler to the protocol-interactiveselenium plugin directory
public class SearchHandler implements InteractiveSeleniumHandler {}
#Added the configuration below to nutch-site.xml
<property>
<name>plugin.includes</name>
<value>protocol-interactiveselenium|protocol-(file|http)|urlfilter-regex|parse-(html|tika|text|metatags)|index-(static|basic|anchor|metadata|more)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
<description>
</description>
</property>
<property>
<name>interactiveselenium.handlers</name>
<value>SearchHandler,DefaultHandler</value>
<description></description>
</property>
The browser is invoked only for some seemingly random URLs. I am not sure why it is not triggered for every crawled URL. What am I doing wrong?