1

I'm trying to run web searches from a Python script. I know how to do this for most sites, for example by using the requests library to fetch the URL plus its query arguments. I'm trying to run searches on wappalyzer.com, but when you run a search there, the URL doesn't change. I also tried inspecting the HTML to figure out where the search takes place, so that I could use Beautiful Soup to change the HTML and run it, but to no avail. I'm really new to web scraping, so I would love some help.

Hamza Umar
  • The search may be using some javascript, which is not being run when using requests. You could try Selenium or similar. I tried to have a look at the page you mentioned, but their search seems to be down. – Ozzy Walsh Jul 31 '18 at 14:17
  • Thanks, JavaScript being run makes sense. It's annoying, though. I will have to learn to use Selenium. Also, their searches seem to be limited for some URLs: it doesn't work for youtube but does for imdb. – Hamza Umar Jul 31 '18 at 14:42
  • @HamzaUmar can you accept the answer if it answers your question? – NanoPish Jul 31 '18 at 16:05

1 Answer

0

The URL does not change because the search runs in JavaScript and uses asynchronous requests. The easiest way to automate such a task is to execute the JavaScript and interact with the page programmatically. This is often easier than reverse engineering the requests the client makes, unless a public API is available.
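If you do want to try the reverse-engineering route, the usual approach is to open the browser's developer tools, watch the Network tab while running a search, and replay the request you see there. Here is a minimal sketch of that replay step. The endpoint `https://example.com/api/lookup` and the `url` parameter are placeholders I made up for illustration, not Wappalyzer's real API; substitute whatever the Network tab shows. It uses only the standard library so it runs without extra packages, but the same URL could be passed to `requests.get`.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical endpoint: replace with the request URL seen in the Network tab.
API_URL = "https://example.com/api/lookup"


def build_lookup_request(site):
    """Build the GET request that the page's JavaScript would send."""
    # "url" is an assumed parameter name; copy the real one from the browser.
    query = urlencode({"url": site})
    return Request(
        f"{API_URL}?{query}",
        headers={"Accept": "application/json"},
    )


def lookup(site, timeout=10):
    """Send the request and decode the JSON body (assumes a JSON response)."""
    with urlopen(build_lookup_request(site), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `build_lookup_request("https://www.imdb.com")` only constructs the request, which is handy for comparing it against what the browser sends before you actually call `lookup`. Real sites often also require cookies, tokens, or extra headers, which is part of why driving a browser tends to be simpler.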

You could use Selenium with Python, which is pretty easy to use, or any automation framework that executes JavaScript by driving a browser through a web driver (geckodriver, chromedriver, PhantomJS).

With Selenium, you can program your scraper pretty easily by selecting the search field (using CSS selectors or XPath, for example), entering a value, and submitting the search. You can then dump the whole page or only the specific parts you need.
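To make those steps concrete, here is a rough sketch using the Selenium 4 Python bindings with Firefox (geckodriver). The function name `run_search` and the selectors in the usage comment are my own placeholders, since I have not checked the site's actual markup; inspect the page to find the real ones.

```python
def run_search(query, page_url, input_selector, result_selector, timeout=15):
    """Type a query into a JavaScript-driven search box and return the result.

    Returns a (result_text, full_page_html) tuple.
    """
    # Imported here so the module still loads if Selenium is not installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Firefox()  # assumes Firefox and geckodriver are available
    try:
        driver.get(page_url)
        wait = WebDriverWait(driver, timeout)

        # Select the search field with a CSS selector and enter the value.
        box = wait.until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, input_selector))
        )
        box.clear()
        box.send_keys(query)
        box.send_keys(Keys.RETURN)  # submit the search

        # Wait for the asynchronous request to fill in the results.
        result = wait.until(
            EC.visibility_of_element_located((By.CSS_SELECTOR, result_selector))
        )
        # Specific part of the page, plus a dump of the whole page.
        return result.text, driver.page_source
    finally:
        driver.quit()


# Example call (both selectors are hypothetical placeholders):
# text, html = run_search(
#     "imdb.com",
#     "https://www.wappalyzer.com/",
#     "input[type='search']",
#     ".results",
# )
```

The explicit waits matter here: because the results arrive asynchronously, reading the page right after pressing Enter would usually return the old content. If you prefer Beautiful Soup for parsing, you can feed the returned `html` into it.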

NanoPish