I am trying to automate login to this page: http://portal.globaltransit.net/. The problem is that the page returns a 401 status when you first reach it, but instead of triggering the standard HTTP basic auth prompt it serves an HTML login form. Here is the output of curl -vvv http://portal.globaltransit.net/:
* About to connect() to portal.globaltransit.net port 80 (#0)
* Trying 124.158.236.65... connected
* Connected to portal.globaltransit.net (124.158.236.65) port 80 (#0)
> GET / HTTP/1.1
> User-Agent: curl/7.19.7 (i486-pc-linux-gnu) libcurl/7.19.7 OpenSSL/0.9.8k zlib/1.2.3.3 libidn/1.15
> Host: portal.globaltransit.net
> Accept: */*
>
< HTTP/1.1 401 Unauthorized
< Date: Thu, 14 Nov 2013 07:18:06 GMT
< Server: Apache
< X-Powered-By: PHP/5.2.11
< Set-Cookie: symfony=1960d9b76a5f9fc3b00786e126cc69af; path=/
< Content-Length: 1211
< Content-Type: text/html; charset=utf-8
<
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title></title>
<link rel="shortcut icon" href="/favicon.ico" />
<link rel="stylesheet" type="text/css" media="screen" href="/css/main.css" />
</head>
<body>
<form action="/login" method="post">
<table>
<tr>
<th><label for="signin_username">Username</label></th>
<td><input type="text" name="signin[username]" id="signin_username" /></td>
</tr>
<tr>
<th><label for="signin_password">Password</label></th>
<td><input type="password" name="signin[password]" id="signin_password" /></td>
</tr>
<tr>
<th><label for="signin_remember">Remember</label></th>
<td><input type="checkbox" name="signin[remember]" id="signin_remember" /><input type="hidden" name="signin[_csrf_token]" value="6bdf80ca900038ada394467752593135" id="signin__csrf_token" /></td>
</tr>
</table>
<input type="submit" value="sign in" />
<a href="/request_password">Forgot your password?</a>
</form>
</body>
</html>
When I try to use mechanize to load the page with the following script:
import mechanize
import mimetypes
import logging
import urllib2
from urlparse import urlparse
import cookielib
from base64 import b64encode


class Browser:
    def __init__(self, url):
        br = mechanize.Browser()
        br.set_handle_robots(False)  # no robots
        br.set_handle_refresh(False)
        br.set_handle_redirect(True)
        br.set_debug_http(True)
        cj = cookielib.LWPCookieJar()
        br.set_cookiejar(cj)  # can sometimes hang without this
        br.addheaders = [('User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1')]
        self.page = br.open(url).read()
        print self.page


if __name__ == '__main__':
    browser = Browser("http://portal.globaltransit.net/")
I get the following error:
mechanize._response.httperror_seek_wrapper: HTTP Error 401: Unauthorized
Is there any way to get mechanize to ignore the 401 returned by the server so I can process the form?
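
To make the question concrete, here is a minimal sketch of the behaviour I am after, written against the Python 3 standard library (urllib) rather than mechanize, so it is an illustration and not the mechanize solution. It starts a throwaway local server that answers 401 with a form body (a stand-in for the portal), then catches the HTTPError and reads the body from the error object instead of giving up. The names FORM, Handler, fetch_ignoring_401 and demo are made up for this example. mechanize's HTTPError appears to be a response-like object as well (the traceback names httperror_seek_wrapper), so wrapping br.open(url) in a similar try/except might work, but I have not verified that.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical stand-in for the portal's login page body.
FORM = b'<form action="/login" method="post"></form>'


class Handler(BaseHTTPRequestHandler):
    """Local test server: always answers 401 but still sends an HTML form."""

    def do_GET(self):
        self.send_response(401)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(FORM)))
        self.end_headers()
        self.wfile.write(FORM)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_ignoring_401(url):
    """Return (status, body), treating a 401 as an ordinary response."""
    # Empty ProxyHandler so the local demo never goes through a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        resp = opener.open(url)
        return resp.getcode(), resp.read()
    except urllib.error.HTTPError as err:
        if err.code != 401:
            raise  # only the 401 case is tolerated here
        # The error object carries the response body, so the form is readable.
        return err.code, err.read()


def demo():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_ignoring_401("http://127.0.0.1:%d/" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = demo()
    print(status, body)
```

The point of the sketch is only that the 401 body is still available after the exception is raised; submitting the form (including the signin[_csrf_token] field and the symfony cookie) would be a separate step.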