I have multiple crawlers running at the same time, e.g.:

- crawler 1

- crawler 2

- crawler 3

My question is: what if I want to shut down only crawler 2?

I imagine that every crawler in crawler4j has a session ID, and that I could shut a crawler down by requesting its ID.

How can I implement that?

Edit

I know how to shut down a running crawler. My question is: if I have a crawling system with users, where each user has their own crawlers, and user X wants to shut down their crawler, how do I shut down only user X's crawler without affecting or shutting down user Y's crawler?

Ahmed Sakr
  • https://github.com/dgoiko/crawler4j/commit/0690eb5c02a5d27d31c80c9e56c737d6b86ec4e9 Too lazy to explain in a full answer – DGoiko Mar 26 '20 at 01:34

1 Answer

You have to wrap your crawler in a `CrawlController` instance:

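```java
CrawlController controller = new CrawlController(config, ..);
// Start the crawl without blocking the calling thread.
controller.startNonBlocking(BasicCrawler.class, numberOfCrawlers);

Thread.sleep(30 * 1000);      // let the crawl run for 30 seconds
controller.shutdown();        // request shutdown of this controller's crawlers
controller.waitUntilFinish(); // block until those crawlers have stopped
```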

You will find the complete example here.

Update

Sample code in which each `User` holds its own controller instance:

```java
public class UserCreator {
  public User createNewUser() {
    // A fresh controller per user, so each user's crawlers are managed separately.
    CrawlController controller = new CrawlController(config, ..);
    controller.startNonBlocking(BasicCrawler.class, numberOfCrawlers);

    return new User(controller);
  }
}

public class User {
  private final CrawlController controller;

  public User(CrawlController controller) {
    this.controller = controller;
  }

  public void shutdownCrawler() {
    controller.shutdown();        // shuts down this user's crawlers only
    controller.waitUntilFinish(); // block until they have stopped
  }
}
```
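
If you do want to address crawlers by an ID, as the question suggests, one option is to keep the per-user controllers in a map keyed by user ID. The following is only a minimal sketch under stated assumptions: `StoppableCrawl` and `CrawlerRegistry` are hypothetical names that are not part of crawler4j, and `StoppableCrawl` merely stands in for the two `CrawlController` calls used above so that the example compiles without the library.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for the shutdown() and waitUntilFinish() calls used above. */
interface StoppableCrawl {
    void shutdown();
    void waitUntilFinish();
}

/** Hypothetical registry that keeps one crawl handle per user ID. */
class CrawlerRegistry {
    private final Map<String, StoppableCrawl> crawlsByUser = new ConcurrentHashMap<>();

    /** Registers a user's crawl; rejects a second registration for the same ID. */
    void register(String userId, StoppableCrawl crawl) {
        if (crawlsByUser.putIfAbsent(userId, crawl) != null) {
            throw new IllegalStateException("User already has a crawl: " + userId);
        }
    }

    /** Stops only the given user's crawl. Returns false if the ID is unknown. */
    boolean shutdown(String userId) {
        StoppableCrawl crawl = crawlsByUser.remove(userId);
        if (crawl == null) {
            return false;
        }
        crawl.shutdown();
        crawl.waitUntilFinish();
        return true;
    }

    /** Reports whether a crawl is currently registered for the given user ID. */
    boolean isRegistered(String userId) {
        return crawlsByUser.containsKey(userId);
    }
}
```

In a real setup, each registered `StoppableCrawl` would delegate to that user's own `CrawlController`, so calling `registry.shutdown("userX")` would leave the controller registered for user Y untouched.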
sudipn
  • Hello @sudipn, I know that, but my question was: if I have a crawling system with users, and I want each user to have their own crawlers, and user X wants to shut down their crawler, it should shut down without affecting or shutting down user Y's crawler. – Ahmed Sakr Dec 12 '19 at 08:59
  • Thank you, but does that code affect the other users' crawlers? – Ahmed Sakr Dec 12 '19 at 09:57
  • No, not if you create a new instance of `CrawlController` every time you create a user. Check my updated answer. – sudipn Dec 12 '19 at 09:59
  • Do you mean that if I create a new instance of `CrawlController`, each crawler would have a unique session that I can use to shut it down? – Ahmed Sakr Dec 12 '19 at 10:19
  • The crawler(s) for your user are shut down through the `CrawlController`, since it provides an abstraction over the crawlers. I guess you do not have to deal with session IDs. – sudipn Dec 12 '19 at 10:30