
I'm trying to send GET requests to some URLs with a mobile User-Agent to get the redirected URL (for example, http://m.google.com instead of http://google.com).

I've tried the requests library and urllib2 too, and it seems that the User-Agent isn't sent with the request. I also read other questions here, but the answers were not clear enough: is it just buggy, or am I missing something?

This is my code:

try:
    req = requests.get(item.url, headers={'User-Agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1'}, timeout=5)

except requests.exceptions.HTTPError:
    continue

except requests.exceptions.Timeout:
    continue

print (item.url + ' => ' + req.url + ' (' + str(req.status_code) + ')')

Still, I always get the desktop version instead of the mobile version.

RonZ
  • I'm not sure why you think that the user-agent isn't sent. Most websites - including Google - don't redirect to another URL for mobile, they just show the correct layout/stylesheet on the main URL. – Daniel Roseman Mar 22 '17 at 09:15
  • What if not everything is being done by detecting user-agent? How about JS, cookies? Especially JS. – leovp Mar 22 '17 at 09:16
  • Check out [scrapy-splash](https://github.com/scrapy-plugins/scrapy-splash) for JS rendering. – Alex Fung Mar 22 '17 at 09:18
  • You might want to debug with tools like `fiddler` or `charles`, set proxy in your request then compare it with the actual request coming from your iPhone, and you'll find out what's missing. – Shane Mar 22 '17 at 09:19
  • @DanielRoseman because there are websites that I know for sure have mobile versions (if it were just a few links I could do it without scripts, but it's hundreds of links). – RonZ Mar 22 '17 at 10:02
  • @leovp this is something that I thought about. Maybe it's a JS redirect, but can I catch it? – RonZ Mar 22 '17 at 10:02
  • Well then, show an example where you know this redirect on mobile but is not happening with your code. – Daniel Roseman Mar 22 '17 at 10:03
  • @DanielRoseman www.mako.co.il is the regular version; if I open this link on my phone I get mobile.mako.co.il, but the code prints `http://www.mako.co.il => http://www.mako.co.il/ (200)` – RonZ Mar 22 '17 at 10:12
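
On the question of whether a JS redirect can be caught without a browser: one rough option is to scan the returned HTML for common client-side redirect patterns. The sketch below is only a heuristic under stated assumptions. The helper name `find_client_side_redirect` and both regular expressions are hypothetical, and they cover only `<meta http-equiv="refresh">` tags and simple `location = "..."`, `location.replace("...")` or `location.assign("...")` calls. Redirects built dynamically in script will not be found this way.

```python
import re

# Hypothetical patterns (assumptions, not an exhaustive list of redirect styles).
# Matches e.g. <meta http-equiv="refresh" content="0; url=http://m.example.com/">
META_REFRESH = re.compile(
    r'<meta[^>]+http-equiv=["\']?refresh["\']?[^>]*'
    r'content=["\']?\s*\d+\s*;\s*url=([^"\'>\s]+)',
    re.IGNORECASE,
)
# Matches e.g. window.location.href = "..." or location.replace('...')
JS_LOCATION = re.compile(
    r'(?:window\.|document\.)?location(?:\.href)?\s*'
    r'(?:=|\.(?:replace|assign)\s*\()\s*["\']([^"\']+)["\']',
    re.IGNORECASE,
)


def find_client_side_redirect(html):
    """Return the first redirect target found in the markup, or None."""
    for pattern in (META_REFRESH, JS_LOCATION):
        match = pattern.search(html)
        if match:
            return match.group(1)
    return None
```

With requests, the HTML to scan would be `response.text`. HTTP-level redirects (301/302) are followed by requests itself and show up in `response.history`, so an empty history plus a match from this helper suggests the site redirects on the client side.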

2 Answers


Well, I eventually found a solution. It's a bit slow, so if you don't need the mobile version as I did, just use urllib2 or requests.

import requests
import os

from selenium import webdriver
from selenium.webdriver.chrome.options import Options as SeleniumOptions
from selenium.common.exceptions import ErrorInResponseException, TimeoutException, UnexpectedAlertPresentException

headers = SeleniumOptions()
headers.add_argument("user-agent=Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1")

driver = webdriver.Chrome(executable_path=os.path.abspath('app/static/drivers/chromedriver'), chrome_options=headers) # path of the chrome driver
driver.set_page_load_timeout(10) # request timeout - 10 seconds

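try:
    driver.get(YOUR_URL_HERE)  # returns None; the final URL is in driver.current_url

    # the extra requests.get is only used to read the HTTP status code
    print(YOUR_URL_HERE + ' => ' + driver.current_url + ' (' + str(requests.get(driver.current_url).status_code) + ')')

except ErrorInResponseException:
    pass  # use `continue` instead when this runs inside a loop over URLs

except TimeoutException:
    pass  # use `continue` instead when this runs inside a loop over URLs

except UnexpectedAlertPresentException: # dismiss alerts
    alert = driver.switch_to.alert
    alert.dismiss() # can be alert.accept() if you want to accept the alert

driver.quit()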

Note that I used Chrome Driver - you can find it here https://sites.google.com/a/chromium.org/chromedriver/downloads

Enjoy.

RonZ

To request as a cellphone, the quick (non-Selenium) solution is to send a mobile User-Agent with requests and parse the response with BeautifulSoup, as follows:

from bs4 import BeautifulSoup
import requests
headers_mobile = { 'User-Agent' : 'Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B137 Safari/601.1'}
link = 'some link'
B_response = requests.get(link, headers=headers_mobile)
B_soup = BeautifulSoup(B_response.content, 'html.parser')
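
To check that the User-Agent really is attached to the outgoing request, the headers can be inspected on the request object before anything is sent. This is a minimal sketch that assumes Python 3 and uses the standard library's `urllib.request` (the successor to `urllib2`) so that it runs without extra packages. The helper name `build_mobile_request` and the `example.com` URL are placeholders, not part of the original code.

```python
from urllib.request import Request

MOBILE_UA = (
    'Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) '
    'AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 '
    'Mobile/13B137 Safari/601.1'
)


def build_mobile_request(url):
    """Build (but do not send) a GET request carrying the mobile User-Agent."""
    return Request(url, headers={'User-Agent': MOBILE_UA})


req = build_mobile_request('http://example.com/')  # placeholder URL
# urllib stores header names capitalized, so the lookup key is 'User-agent'.
print(req.get_header('User-agent'))
```

With requests, the comparable check is to look at `B_response.request.headers` after the call. If the mobile User-Agent appears there and the site still serves the desktop page, the header is being sent and the site is probably deciding on something else, such as JavaScript or cookies.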
Amirkhm