I want to use Windmill or Selenium to simulate a browser that visits a website, scrapes the content, and then takes some action depending on the analysis of that content.

For example: the browser visits a website where we find, say, 50 links. While the browser is still running, a Python script analyzes the found links and decides which link the browser should click.
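The workflow described above could be sketched roughly as follows. This is only an illustration under assumptions: it uses the Selenium 4 Python bindings with Firefox, and `pick_link`, `run`, and the `keyword` rule are hypothetical names standing in for whatever analysis is actually needed. Reading the links goes through the WebDriver connection to the already-loaded page in the browser; the click is what triggers a new navigation.

```python
from urllib.parse import urlparse


def pick_link(hrefs, keyword):
    """Hypothetical analysis step: first URL whose path contains `keyword`."""
    for href in hrefs:
        if href and keyword in urlparse(href).path:
            return href
    return None


def run(start_url, keyword):
    # Imported here so pick_link() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Firefox()  # assumption: Firefox and its driver are available
    try:
        driver.get(start_url)  # the browser loads the page and its assets
        anchors = driver.find_elements(By.TAG_NAME, "a")
        # Attribute reads query the DOM that the browser has already loaded.
        hrefs = [a.get_attribute("href") for a in anchors]
        target = pick_link(hrefs, keyword)
        if target is not None:
            # Clicking makes the browser navigate, which sends new requests.
            anchors[hrefs.index(target)].click()
        return target
    finally:
        driver.quit()
```

Calling something like `run("http://example.com/", "news")` would open the page, pick the first link with "news" in its path, and click it.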

My big question is how many HTTP requests this takes with Windmill or Selenium. Can these two programs simulate visiting a website in a browser and scrape the content with just one HTTP request, or would they send another internal request to the website to get the links while the browser is still running?

Thanks a lot!

zwieback86
  • Selenium only uses the browser. It makes no additional HTTP requests to the server. Does this answer your question? – User Jul 14 '13 at 16:50
  • Yes, this answers my question. I was curious whether the scraping part would send additional requests even after the browser finished loading. Thanks – zwieback86 Jul 14 '13 at 18:45

1 Answer

Selenium uses the browser, but the number of HTTP requests is not one. There will be multiple HTTP requests to the server for the JS, CSS, and images (if any) referenced in the HTML document.

If you want to scrape the page with a single HTTP request, you need a scraper that fetches only the HTML source and works with what is present there. If you are using Python, check out BeautifulSoup for the parsing part.
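A minimal sketch of the single-request approach is below. To stay dependency-free it uses the standard library's `html.parser` in place of BeautifulSoup, so treat it as a stand-in rather than the BeautifulSoup API; `LinkCollector`, `extract_links`, and `fetch_links` are names made up for this example. Only the HTML document is requested (apart from any redirects the server issues), and no JS, CSS, or images are fetched.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Parse an HTML string and return its link targets in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def fetch_links(url):
    # Fetch only the HTML document; assets referenced by it are not requested.
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_links(html)
```

The trade-off is that nothing generated by JavaScript will show up, since no browser ever runs the page.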

Arun
  • OK, let me phrase my question a bit differently. Say I simulate a browser and open a website, and the simulated browser has finished loading the complete page (I understand that this can take several requests to the website, depending on JavaScript etc.). If I now want to scrape the loaded content to decide what the browser should do next, will the scraping part send any more requests from my machine to the website, or can Selenium scrape the content just by communicating with the browser? – zwieback86 Jul 15 '13 at 11:28