
I need to visit a website with pycurl, follow redirects, and print the final URL. I wrote this Python code:

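import pycurl
from io import BytesIO

buf_pagina = BytesIO()  # receives the response body
buf_header = BytesIO()  # receives the response headers

c = pycurl.Curl()
c.setopt(c.URL, 'http://localhost/redirect.php')
c.setopt(c.HTTPPOST, values)  # values: the form fields, defined elsewhere
c.setopt(c.WRITEFUNCTION, buf_pagina.write)
c.setopt(c.HEADERFUNCTION, buf_header.write)
c.setopt(c.CONNECTTIMEOUT, 30)
c.setopt(c.AUTOREFERER, 1)
c.setopt(c.FOLLOWLOCATION, 1)  # follow HTTP redirects
c.setopt(c.COOKIEFILE, '')     # empty string enables the in-memory cookie engine
c.setopt(c.TIMEOUT, 30)
c.setopt(c.USERAGENT, '')
c.perform()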

I need to print the final URL. How can I do this? Thanks.

The solution is this: url_effective = c.getinfo(c.EFFECTIVE_URL)
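A minimal sketch of that solution wrapped in a function, in case a complete example helps. The function name `get_final_url` and its `timeout` parameter are made up for illustration; the pycurl calls are the same ones used in the question, and `pycurl` is imported inside the function only so the snippet loads on machines where it is not installed.

```python
from io import BytesIO


def get_final_url(url, timeout=30):
    """Fetch url, following redirects, and return the last URL curl used.

    Hypothetical helper, not part of pycurl itself.
    """
    import pycurl  # third-party; imported lazily on purpose

    body = BytesIO()  # the body has to go somewhere, even if it is unused
    c = pycurl.Curl()
    try:
        c.setopt(c.URL, url)
        c.setopt(c.FOLLOWLOCATION, 1)          # follow HTTP redirects
        c.setopt(c.WRITEFUNCTION, body.write)
        c.setopt(c.CONNECTTIMEOUT, timeout)
        c.setopt(c.TIMEOUT, timeout)
        c.perform()
        # EFFECTIVE_URL is only meaningful after perform() has run.
        return c.getinfo(c.EFFECTIVE_URL)
    finally:
        c.close()
```

Calling `get_final_url('http://localhost/redirect.php')` should then return the URL of the page curl ended up on, assuming that server is reachable.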

kingcope
  • Do you really need to use `pycurl`? If not, try `requests`; as far as I remember, doing what you want is much more obvious there. – zmo Jan 29 '14 at 23:33
  • Yes, I need to use pycurl; it is a very fast library! – kingcope Jan 29 '14 at 23:34
  • Here's a way someone implemented it in PHP: http://forums.devshed.com/php-development-5/curl-get-final-url-after-inital-url-redirects-544144.html The good thing about curl is that the library behaves the same across languages. – zmo Jan 29 '14 at 23:35

1 Answer


Here's an adaptation of the PHP script I linked in the comments:

import pycurl
from io import BytesIO

o = BytesIO()  # response body
h = BytesIO()  # response headers of every request in the redirect chain

c = pycurl.Curl()
c.setopt(c.URL, 'http://stackoverflow.com/questions/21444891')
# c.setopt(c.HTTPPOST, values)
c.setopt(c.WRITEFUNCTION, o.write)
c.setopt(c.HEADERFUNCTION, h.write)
c.setopt(c.CONNECTTIMEOUT, 30)
c.setopt(c.AUTOREFERER, 1)
c.setopt(c.FOLLOWLOCATION, 1)
c.setopt(c.COOKIEFILE, '')
c.setopt(c.TIMEOUT, 30)
c.setopt(c.USERAGENT, '')
c.perform()
c.close()

location = ""

# Headers arrive as bytes; keep the value of the last Location header seen.
for l in h.getvalue().decode('iso-8859-1').splitlines():
    if l.lower().startswith("location:"):
        # Split on the first colon only, since the URL itself contains colons.
        location = l.split(":", 1)[1].strip()

print(location)

Note, though, that as this example shows, the Location header does not always contain the full URI; sometimes it holds only the path part. If that's the case, it's easy to add the scheme and FQDN back.
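One way to add the host back is the standard library's `urljoin`, which resolves a relative Location value against the URL that was requested. This is only a sketch: the helper name `absolute_location` and the redirect path used below are hypothetical.

```python
from urllib.parse import urljoin


def absolute_location(request_url, location):
    """Resolve a Location header value against the URL that was requested.

    Hypothetical helper: an absolute Location is returned unchanged,
    a path-only Location inherits the scheme and host of request_url.
    """
    return urljoin(request_url, location)


requested = 'http://stackoverflow.com/questions/21444891'

# Path-only redirect target (made-up path, for illustration only).
print(absolute_location(requested, '/questions/21444891/some-title'))
# → http://stackoverflow.com/questions/21444891/some-title
```

If redirects chain across several hosts, each Location would need to be resolved against the URL of the hop that produced it, not against the original URL.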

zmo