I am having difficulty getting started with StormCrawler using the StormCrawler+ElasticSearch archetype. The StormCrawler website lists two versions, 1.x and 2.x. Similarly, Apache Storm comes in versions 1 and 2.
- Should I install StormCrawler 1.x or 2.x?
- Which JDK version does StormCrawler require? Do I need the Oracle JDK, or can OpenJDK be used as well?
- I want to use StormCrawler to identify and process images and documents. Where in the topology are these tasks best added?
Update: According to the linked question (Storm Crawler with Java 11), StormCrawler 2 is advised. Which StormCrawler+ElasticSearch archetype should be used with StormCrawler 2?