I'm using urllib to build a simple web scraper and need to know whether a URL redirects after being opened with urlopen(). I'm calling geturl() on the response object but getting the original URL back, not the redirected one. Any ideas on how I can get the redirected one?

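from urllib.request import urlopen

# url and ctx are defined earlier in the scraper (not shown)
response = urlopen("https://en.wikipedia.org/wiki/" + url, context=ctx)
final_url = response.geturl()[30:]  # drop the 30-character "https://en.wikipedia.org/wiki/" prefix
print("url:", url)
print("final_url:", final_url)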

Expected output:

url: Qasemabad
final_url: Qasimabad

Actual output:

url: Qasemabad
final_url: Qasemabad

2 Answers

0

I think geturl() just reflects a simple HTTP request whose body is never read, so no redirection takes place. If you want the redirected URL, try another module.

0

No redirect happened.

Wikipedia did not serve a 302 document followed by a 200 document. Rather, it immediately satisfied the request with a 200 success document, which happens to contain JavaScript that changes the URL displayed by a browser. This happens without need of re-fetching the page from a new location.
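For contrast, here is a minimal sketch of what geturl() reports when a server really does answer with a 302. It assumes a throwaway local server on 127.0.0.1; the handler class, the /old and /new paths, and the helper name final_url_after_fetch are all made up for illustration. The opener is built with an empty ProxyHandler so a system proxy does not intercept the local request; apart from that it uses the same redirect handling as urlopen().

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /old answers with a 302 to /new, anything else with a 200."""

    def do_GET(self):
        if self.path == "/old":
            self.send_response(302)
            self.send_header("Location", "/new")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"hello"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the demo quiet: suppress per-request logging.
        pass


def final_url_after_fetch(path):
    """Fetch `path` from a temporary local server and return the path geturl() reports."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)  # port 0: let the OS pick a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        base = "http://127.0.0.1:%d" % server.server_port
        opener = build_opener(ProxyHandler({}))  # no proxies; default redirect handling
        with opener.open(base + path) as response:
            return response.geturl().replace(base, "")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print("requested /old, geturl() path:", final_url_after_fetch("/old"))
    print("requested /new, geturl() path:", final_url_after_fetch("/new"))
```

In this sketch a request for /old should come back with geturl() pointing at /new, because urllib follows the 302 on its own. The Wikipedia request never gets such a response, so there is nothing for urllib to follow.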

The geturl() response is correct: it accurately reflects what the server sent.
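If the goal is still to recover the target title, one possible approach is to read it out of the returned HTML instead of the HTTP layer. The sketch below assumes the page carries a `<link rel="canonical" href="...">` element naming the target article; that is an assumption about the markup Wikipedia serves, so check it against a real response. The class and function names are made up for illustration, and the sample HTML is a hand-written stand-in, not a captured page.

```python
from html.parser import HTMLParser


class CanonicalLinkParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> element it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "link" and self.canonical is None:
            attr = dict(attrs)
            if (attr.get("rel") or "").lower() == "canonical":
                self.canonical = attr.get("href")


def canonical_url(html):
    """Return the canonical URL declared in `html`, or None if there is none."""
    parser = CanonicalLinkParser()
    parser.feed(html)
    return parser.canonical


# Hand-written stand-in for the page body (assumed markup, not a real capture).
sample = (
    "<html><head>"
    '<link rel="canonical" href="https://en.wikipedia.org/wiki/Qasimabad"/>'
    "</head><body></body></html>"
)

if __name__ == "__main__":
    print("final_url:", canonical_url(sample)[30:])
```

With the scraper from the question, the same helper could be fed the real body, for example `canonical_url(response.read().decode("utf-8"))`, and compared against the requested title to decide whether the page was a redirect.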

J_H