I have written a script that visits a URL using pycurl, with Tor enabled. The URL redirects to another URL.

Below is the code.

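import pycurl

URL = 'http://example.com/'  # placeholder; substitute the URL being visited

curl = pycurl.Curl()
curl.setopt(pycurl.URL, URL)
# Route the request through the local Tor SOCKS5 proxy
curl.setopt(pycurl.PROXY, '127.0.0.1')
curl.setopt(pycurl.PROXYPORT, 9050)
# SOCKS5_HOSTNAME makes the proxy resolve host names, not the client
curl.setopt(pycurl.PROXYTYPE, pycurl.PROXYTYPE_SOCKS5_HOSTNAME)
curl.setopt(pycurl.USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0')
curl.perform()  # the response body is written to stdout by default
curl.close()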

It prints the expected HTML content. However, each visit to the URL is also supposed to increment a counter elsewhere.

When I run the script, I get the HTML content, but the counter is not incremented. When I paste the same HTML output into an online HTML rendering site (htmledit.squarefree.com/), the counter is incremented.

How can I make the counter increment automatically from the script itself?

Thanks.

Satys

1 Answer

When a server updates some data each time a client visits its website, the update is most likely triggered through JavaScript.

When a page is loaded on the client machine, it can include JavaScript that runs on the client and notifies the server. When the page is visited through a browser, that JavaScript is executed (provided the browser has JavaScript enabled). When the page is fetched through curl, only the HTML is downloaded; curl cannot execute JavaScript, so the server is never notified.
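To see what a browser would act on that pycurl ignores, the fetched HTML can be inspected for scripts and other embedded resources. The following is only a rough sketch using the standard library; the class name, the helper name and the sample HTML are made up for illustration, and the choice of tags to look at (script, img, iframe) is an assumption about where a counter might be hidden.

```python
from html.parser import HTMLParser


class SubresourceFinder(HTMLParser):
    """Collect things a browser would load or run but pycurl would not."""

    def __init__(self):
        super().__init__()
        self.external = []       # (tag, url) pairs a browser would request
        self.inline_scripts = 0  # <script> blocks without a src attribute

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("script", "img", "iframe") and attrs.get("src"):
            self.external.append((tag, attrs["src"]))
        elif tag == "script":
            self.inline_scripts += 1


def find_browser_side_work(html):
    """Return (external resources, number of inline scripts) for an HTML string."""
    finder = SubresourceFinder()
    finder.feed(html)
    finder.close()
    return finder.external, finder.inline_scripts


# Hypothetical page body, standing in for what curl.perform() printed
sample = (
    '<html><body><script src="/count.js"></script>'
    '<img src="/hit.gif"><script>ping()</script></body></html>'
)
external, inline = find_browser_side_work(sample)
print(external, inline)
```

If the output lists a script or image that looks like a hit counter, that is probably the request the browser makes and the script does not.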

I managed to do it using dryscrape, which renders the page and runs its JavaScript. dryscrape uses the HTTP protocol for its proxy. You can read here for a workaround to enable the SOCKS5 protocol with dryscrape.
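A minimal sketch of the dryscrape approach might look like the following. It is not the exact code used; it assumes dryscrape is installed and that an HTTP proxy forwarding to Tor is available. The function name and the proxy address in the usage comment are hypothetical.

```python
def render_with_js(url, proxy_host, proxy_port):
    """Load url in dryscrape's WebKit session so that page JavaScript runs.

    proxy_host and proxy_port are assumed to point at an HTTP proxy,
    not directly at Tor's SOCKS5 port.
    """
    import dryscrape  # imported lazily; third-party package

    session = dryscrape.Session()
    session.set_proxy(proxy_host, proxy_port)
    session.visit(url)     # loads the page and executes its scripts
    return session.body()  # page source after rendering


# Usage (hypothetical URL and proxy address):
# html = render_with_js('http://example.com/', '127.0.0.1', 8118)
```

On a headless Linux machine, calling dryscrape.start_xvfb() before creating the session may also be necessary.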

Satys