I'm using the urllib2.urlopen method to open a URL and fetch the markup of a webpage. Some of these sites redirect me using 301/302 redirects. I would like to know the final URL that I've been redirected to. How can I get this?
22

Mark Amery

Mridang Agarwalla
4 Answers
38
Call the .geturl() method of the file object returned. Per the urllib2 docs:
geturl(): return the URL of the resource retrieved, commonly used to determine if a redirect was followed
Example:
import urllib2
response = urllib2.urlopen('http://tinyurl.com/5b2su2')
response.geturl() # 'http://stackoverflow.com/'
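On Python 3, urllib2 was split into urllib.request and urllib.error, and the response object still has geturl(). Here is a minimal sketch of the same idea, assuming a throwaway local server so it runs without network access. The RedirectHandler class, the final_url helper and the /start, /middle, /final paths are made up for illustration. An opener with an empty ProxyHandler is used only so that a proxy configured in the environment cannot intercept the localhost request; urllib.request.urlopen(url).geturl() behaves the same way.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical redirect chain used only for this demo: /start -> /middle -> /final
REDIRECTS = {"/start": (301, "/middle"), "/middle": (302, "/final")}


class RedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in REDIRECTS:
            status, target = REDIRECTS[self.path]
            self.send_response(status)
            self.send_header("Location", target)
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"done"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def final_url(url):
    """Open url, following redirects, and return the URL actually retrieved."""
    # Empty ProxyHandler: ignore any proxy settings from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        return response.geturl()


server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

resolved = final_url(base + "/start")
print(resolved)  # the URL of the last hop, ending in /final

server.shutdown()
server.server_close()
```

The request goes to /start, is redirected twice, and geturl() reports the URL of the last hop rather than the one that was requested.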

Mark Amery

mmmmmm
1 How do I handle the case where there are multiple intermediate URLs and I want the final URL? This does not work for that case. – Kishan Mehta Nov 29 '16 at 08:58
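For what it's worth, geturl() does report the last URL of a chain of HTTP 3xx redirects. Redirects done through an HTML meta refresh or JavaScript are not followed by urllib at all, which may be what is happening in that case. To see each intermediate hop, one option is to subclass the redirect handler. This is a rough Python 3 sketch, assuming a local demo server; ChainHandler, RecordingRedirectHandler and the /a, /b, /c paths are hypothetical names, not part of urllib.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical chain for the demo: /a -> /b -> /c
CHAIN = {"/a": (301, "/b"), "/b": (302, "/c")}


class ChainHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, target = CHAIN.get(self.path, (200, None))
        self.send_response(status)
        if target is not None:
            self.send_header("Location", target)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Remember every redirect that urllib follows."""

    def __init__(self):
        self.hops = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # newurl has already been resolved to an absolute URL by urllib.
        self.hops.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


server = HTTPServer(("127.0.0.1", 0), ChainHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

recorder = RecordingRedirectHandler()
# The empty ProxyHandler keeps environment proxy settings out of the demo.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}), recorder)
with opener.open(base + "/a") as response:
    final = response.geturl()

for code, url in recorder.hops:
    print(code, url)
print("final:", final)

server.shutdown()
server.server_close()
```

The recorded hops list holds one (status code, target URL) pair per redirect, and the last target matches what geturl() returns.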
4
The return value of urllib2.urlopen has a geturl() method which should return the actual (i.e. last redirect) URL.

Michael
1
e.g.:
urllib2.urlopen('ORIGINAL LINK').geturl()
urllib2.urlopen(urllib2.Request('ORIGINAL LINK')).geturl()

kevin
-1
You can use httplib2 with follow_all_redirects = True and read content-location from the response headers. See my answer to 'httplib is not getting all the redirect codes' for an example.