I need an open-source web crawler written in Java with incremental crawling support.
The crawler should be easy to customize and to integrate with Solr or Elasticsearch.
It should be an actively maintained project that is still gaining new features.
Aperture is a good crawler and has all the features I mentioned, but it is no longer actively developed, and I ruled it out because of the license of its dependencies (I would be using it for commercial purposes).
Nutch is a web crawler with more features, including Hadoop support. However, after going through many websites and tutorials, I found no proper documentation or API for customizing it programmatically on Windows. I can edit the code in Eclipse, but it causes many errors when running the MapReduce jobs. There is no Java API for Nutch that I can use the way I can with Aperture.
crawler4j is a good web crawler, but it has no incremental crawling feature, and I haven't checked its license for problems.
Is there any other crawler that has all the features I mentioned, or is there a way to use one of the crawlers above to meet my requirements?
Helpful answers will be greatly appreciated.