1

I have created a page on my site http://shedez.com/test.html this page redirects the users to a jpg on my server

I want to copy this image to my local drive using a python script. I want the python script to goto main url first and then get to the destination url of the picture

and than copy the image. As of now the destination url is hardcoded but in future it will be dynamic, because I will be using geocoding to find the city via ip and then redirect my users to the picture of day from their city.

== my present script ===

import  urllib2, os

req = urllib2.urlopen("http://shedez.com/test.html")

final_link = req.info()
print req.info()

def get_image(remote, local):   
    imgData = urllib2.urlopen(final_link).read()
    output = open(local,'wb')
    output.write(imgData)
    output.close()
    return local

fn = os.path.join(self.tmp, 'bells.jpg')
firstimg = get_image(final_link, fn)

4 Answers4

3

It doesn't seem to be header redirection. This is the body of the url -

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">\n<html>\n<head>\n<title>Your Page Title</title>\n<meta http-equiv="REFRESH" content="0;url=htt
p://2.bp.blogspot.com/-hF8PH92aYT0/TnBxwuDdcwI/AAAAAAAAHMo/71umGutZhBY/s1600/Professional%2BBusiness%2BCard%2BDesign%2B1.jpg"></HEAD>\n<BODY>\nOptional page t
ext here.\n</BODY>\n</HTML>

You can easily fetch the content with urllib or requests and parse the HTML with BeautifulSoup or lxml to get the image url from the meta tag.

Bibhas Debnath
  • 14,559
  • 17
  • 68
  • 96
1

You seem to be using html http-equiv redirect. To handle redirects with Python transparently, use HTTP 302 response header on the server side instead. Otherwise, you'll have to parse HTML and follow redirects manually or use something like mechanize.

Community
  • 1
  • 1
bereal
  • 32,519
  • 6
  • 58
  • 104
0

As the answers mention: either redirect to the image itself, or parse out the url from the html.

Concerning the former, redirecting, if you're using nginx or HAproxy server side you can set the X-Accel-Redirect to the image's uri, and it will be served appropriately. See http://wiki.nginx.org/X-accel for more info.

jtmoulia
  • 660
  • 8
  • 19
0

The urllib2 urlopen function by default follows the redirect 3XX HTTP status code. But in your case you are using html header based redirect for which you will have use what Bibhas is proposing.

Supreet Sethi
  • 1,780
  • 14
  • 24