I'm trying to programmatically determine the final landing pages of some URLs, and I ran into http://event.four33.co.kr/20131030/redirect.html, which effectively redirects back to itself in a loop:
<script type="text/javascript">
var agent = navigator.userAgent;
var redirectUrl = "";
if (agent.indexOf("Windows NT") != -1)
{
redirectUrl = "https://play.google.com/store/apps/details?id=com.ftt.suhoji_gl_4kakao";
}
else if (agent.indexOf("iPhone") != -1)
{
redirectUrl = "https://itunes.apple.com/kr/app/id705181473?mt=8";
}
else if (agent.indexOf("iPad") != -1)
{
redirectUrl = "https://itunes.apple.com/kr/app//id705181473?mt=8";
}
else if (agent.indexOf("Android") != -1)
{
redirectUrl = "market://details?id=com.ftt.suhoji_gl_4kakao";
}
location.href = redirectUrl;
</script>
When my script (see the snippet below) hits this page, the call to driver.current_url never returns.
from pyvirtualdisplay import Display
from selenium import webdriver
# Run Firefox inside a virtual (headless) display
display = Display(visible=0, size=(1024, 768))
display.start()
driver = webdriver.Firefox()
driver.get('http://event.four33.co.kr/20131030/redirect.html')
driver.current_url  # this is the call that never returns
I also tried urllib2 and requests, but I have not found a way to detect this loop or to prevent it. Any tips?
(Note that this URL inspects the user agent of whatever is accessing it before redirecting. Neither my Firefox nor my Chrome user agent matches any of the branches, so redirectUrl stays empty and the page redirects to itself.)
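To make the behaviour concrete, here is a minimal Python sketch that mirrors the page's user-agent checks. The names RULES and pick_redirect and the two sample user-agent strings are my own, made up for illustration; only the markers and target URLs come from the page's script. An empty result corresponds to location.href = "", which the browser resolves to the current page, hence the reload loop.

```python
# Python mirror of the page's user-agent branches (illustrative sketch only).
# The marker strings and URLs are copied from the page's script; the names
# RULES and pick_redirect are hypothetical.
RULES = [
    ("Windows NT", "https://play.google.com/store/apps/details?id=com.ftt.suhoji_gl_4kakao"),
    ("iPhone", "https://itunes.apple.com/kr/app/id705181473?mt=8"),
    ("iPad", "https://itunes.apple.com/kr/app//id705181473?mt=8"),
    ("Android", "market://details?id=com.ftt.suhoji_gl_4kakao"),
]


def pick_redirect(user_agent):
    """Return the URL the script would assign to location.href.

    Returns "" when no branch matches, which is the self-redirect case.
    """
    for marker, url in RULES:
        if marker in user_agent:
            return url
    return ""


# Sample user-agent strings, assumed for illustration.
LINUX_FIREFOX = "Mozilla/5.0 (X11; Linux x86_64; rv:25.0) Gecko/20100101 Firefox/25.0"
IPHONE_SAFARI = "Mozilla/5.0 (iPhone; CPU iPhone OS 7_0 like Mac OS X) AppleWebKit/537.51.1"

if __name__ == "__main__":
    for ua in (LINUX_FIREFOX, IPHONE_SAFARI):
        target = pick_redirect(ua)
        print(repr(target) if target else "no match: page reloads itself")
```

With a desktop Linux Firefox user agent no branch matches, which is the situation my Selenium script is in.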