Questions tagged [crawler4j]

Crawler4j is an open source Java crawler which provides a simple interface for crawling the Web.

Reference: https://github.com/yasserg/crawler4j

174 questions
0
votes
1 answer

How to add seed in Crawler4j at the runtime?

I am using crawler4j and I need to add links at runtime. Say I add a seed 'LinkA' and crawler4j starts crawling it. While the program is running, I want to add one more seed, 'LinkB'. Can this be done? If so, how? Thanks in advance.
user801154
0
votes
2 answers

Crawler4j and Tripadvisor

I'm writing a crawler for Tripadvisor using crawler4j. I need to collect all the reviews for an item, but the links to the "next" reviews (the numbered ones) are tied to a JavaScript function rather than a regular link. This function is defined somewhere…
0
votes
1 answer

How to search for a string that exists in different web pages using crawler4j

I am using crawler4j, and it returns the following output: Processed Pages: 10, Total Links found: 369, Total Text Size: 20077. Up to this point it works fine, but I want to search for a string that exists in these pages. How can I achieve this? Could you…
Raghavender Reddy
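
One way to approach the string search asked about above is to scan the plain text of each visited page for the term. The following is a minimal sketch, not crawler4j's own API: `PageTextSearch` and `countOccurrences` are hypothetical names introduced here, and the sketch assumes the page text has already been obtained as a `String` (in crawler4j this would typically come from `HtmlParseData.getText()` inside `visit(Page)`).

```java
import java.util.Locale;

// Hypothetical helper: counts case-insensitive, non-overlapping
// occurrences of a search term in the text of a crawled page.
final class PageTextSearch {

    static int countOccurrences(String pageText, String term) {
        if (pageText == null || term == null || term.isEmpty()) {
            return 0;
        }
        String haystack = pageText.toLowerCase(Locale.ROOT);
        String needle = term.toLowerCase(Locale.ROOT);
        int count = 0;
        int from = 0;
        // indexOf returns -1 once no further match exists.
        while ((from = haystack.indexOf(needle, from)) != -1) {
            count++;
            from += needle.length();
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in for text extracted from a crawled page (assumption).
        String text = "Crawler4j crawls pages. A crawler visits links.";
        System.out.println(countOccurrences(text, "crawl")); // prints 3
    }
}
```

A crawler could call such a helper for every visited page and record the URLs where the count is greater than zero.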
-1
votes
1 answer

Shutting down one specific crawler out of 3 running crawlers in Crawler4j?

I have multiple crawlers running together, e.g. crawler 1, crawler 2, and crawler 3. My question is: what if I want to shut down only crawler number 2? I imagine that every crawler in crawler4j has a session ID and that I can shut it off while requesting…
Ahmed Sakr
-1
votes
3 answers

What is a web crawler and how does it work?

I want to learn web crawling with Java EE. I don't know where to start. What are good books or tutorials?
-1
votes
3 answers

Crawler4j runtime error

I have implemented a web crawler using the crawler4j library. I am encountering the following error: SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See…
-1
votes
1 answer

Crawler4j With Grails App

I am making a crawler application in Groovy on Grails. I am using Crawler4j and following this tutorial. I created a new Grails project, put the BasicCrawlController.groovy file in controllers->package, and did not create any view because I expected on…
clever_bassi
-1
votes
1 answer

Crawler4j Regex Pattern for url

I'm using crawler4j and I want to match only certain URL patterns, but I couldn't work out a regex for this URL: http://www.site.com/liste/product_name_changable/productDetails.aspx?productId={id}&categoryId={category_id} I tried this…
Muhammet Arslan
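
A pattern for the URL shape in this question could be sketched as follows. This is only one possible regex, written under stated assumptions: `{id}` and `{category_id}` are taken to be numeric, the changeable product-name segment is taken to contain no slash, and `ProductUrlFilter` with its methods is a hypothetical helper, not part of crawler4j. Matching is case-insensitive so the pattern still applies if the URL is lowercased before the check.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for recognising product-detail URLs.
final class ProductUrlFilter {

    // Assumptions: numeric ids, and a single path segment after /liste/.
    private static final Pattern PRODUCT_URL = Pattern.compile(
            "http://www\\.site\\.com/liste/[^/]+/productDetails\\.aspx"
                    + "\\?productId=(\\d+)&categoryId=(\\d+)",
            Pattern.CASE_INSENSITIVE);

    // True only if the whole URL has the expected shape.
    static boolean isProductUrl(String url) {
        return url != null && PRODUCT_URL.matcher(url).matches();
    }

    // Returns the productId value, or null if the URL does not match.
    static String productId(String url) {
        if (url == null) {
            return null;
        }
        Matcher m = PRODUCT_URL.matcher(url);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String url = "http://www.site.com/liste/some_product/"
                + "productDetails.aspx?productId=42&categoryId=7";
        System.out.println(isProductUrl(url)); // prints true
        System.out.println(productId(url));    // prints 42
    }
}
```

In a crawler, a check like `isProductUrl` would typically be called from the method that decides whether a URL should be visited.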
-4
votes
3 answers

How to solve the

I'm trying the QuickStart from https://github.com/yasserg/crawler4j. I took the following steps to test the example: 0) Add crawler4j.jar to the Java library. 1) Create a Java package called mycrawler. 2) Paste the Quickstart code into class-mycrawler…
evabb