I am trying to get urllib2 to work with PyWebKitGtk so that it supports cookies. It mostly works, but cookies don't persist between sessions. The cookies.txt file is saved, and the cookies do appear in the requests (checked in Wireshark), but the page loaded into the browser window doesn't appear to have been fetched with them. After I log in, shut down the app, and restart it, my login session is gone.

My code

def load_uri_in_browser(self):
    self.cookiejar = LWPCookieJar(config_dir + "/cookies.txt")
    if os.path.isfile(self.cookiejar.filename):
        self.cookiejar.load(ignore_discard=True)

    #for testing, this does print cookies    
    for index, cookie in enumerate(self.cookiejar):
        print index, '  :  ', cookie        

    self.opener = urllib2.build_opener(
        urllib2.HTTPRedirectHandler(),
        urllib2.HTTPHandler(debuglevel=0),
        urllib2.HTTPSHandler(debuglevel=0),
        urllib2.HTTPCookieProcessor(self.cookiejar))
    self.opener.addheaders = [('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.13) Gecko/20101203 Firefox/3.6.13')]

    self.view = webkit.WebView()        
    self.view.connect('navigation-policy-decision-requested', self.navigation_policy_decision_requested_cb)

    self.mainFrame = self.view.get_main_frame()
    self.mainFrame.load_uri("http://twitter.com")

    #gtk window loaded earlier
    self.window.add(self.view)
    self.window.show_all() 

    self.window.show()

def navigation_policy_decision_requested_cb(self, view, frame, net_req, nav_act, pol_dec):
    uri=net_req.get_uri()
    if uri.startswith('about:'):
        return False

    page = self.opener.open(uri)
    self.cookiejar.save(ignore_discard=True)
    view.load_string(page.read(),None,None,page.geturl())
    pol_dec.ignore()
    return True
AndrewR

2 Answers

Note: this is somewhat pseudocode, but it should work about 99% as written :) I'd start with code that is as simple as possible:
(I'm not sure whether cj.save(...) discards cookies between sessions, so I've mostly used pickle, both for this and for other things I need to store "as is" between sessions.)

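import cookielib, urllib2, os, pickle

COOKIE_FILE = './cookies.txt'

cj = cookielib.CookieJar()
if os.path.isfile(COOKIE_FILE):
    # A CookieJar holds a thread lock and cannot be pickled directly,
    # so the file stores a plain list of Cookie objects instead.
    with open(COOKIE_FILE, 'rb') as f:
        for cookie in pickle.load(f):
            cj.set_cookie(cookie)

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
r = opener.open("http://example.com/")

# Save every cookie in the jar, session cookies included.
with open(COOKIE_FILE, 'wb') as f:
    pickle.dump(list(cj), f)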

Secondly, are you SURE that the cookie you are getting is not just a session cookie that's supposed to end after a certain period of time or when you close the connection? You know, not one of those "remember me" cookies?
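
One way to check both points (whether save() drops cookies, and whether the cookie in question is a session cookie) is to save a jar with and without ignore_discard and see what comes back. This is only a minimal sketch: the helper names make_cookie and saved_names, the cookie names, and the example.com domain are made up for illustration, and the import fallback is there so it runs on Python 2 (cookielib) as well as Python 3 (http.cookiejar).

```python
import os
import shutil
import tempfile
import time

try:  # Python 3
    from http.cookiejar import Cookie, LWPCookieJar
except ImportError:  # Python 2
    from cookielib import Cookie, LWPCookieJar


def make_cookie(name, value, expires):
    """Build a cookie by hand; expires=None makes it a session cookie."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain='example.com', domain_specified=False,
        domain_initial_dot=False,
        path='/', path_specified=True,
        secure=False, expires=expires,
        discard=(expires is None),  # session cookies are flagged "discard"
        comment=None, comment_url=None, rest={})


def saved_names(ignore_discard):
    """Save two cookies, reload the file, and return the names that survived."""
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, 'cookies.txt')
    try:
        jar = LWPCookieJar(path)
        jar.set_cookie(make_cookie('session_id', 'abc', None))
        jar.set_cookie(make_cookie('remember_me', 'xyz',
                                   int(time.time()) + 3600))
        jar.save(ignore_discard=ignore_discard)

        reloaded = LWPCookieJar(path)
        # Load everything that is in the file, so only save() does the filtering.
        reloaded.load(ignore_discard=True)
        return sorted(c.name for c in reloaded)
    finally:
        shutil.rmtree(tmpdir)


print(saved_names(ignore_discard=False))
print(saved_names(ignore_discard=True))
```

With the default ignore_discard=False only the cookie that has an expiry time is written to the file; with ignore_discard=True the session cookie is written as well, which is what the code in the question already does.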

Try setting up your own "webserver" in Python:

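import socket

# Port 80 usually needs root privileges; any free port works for a test.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(('', 80))
server.listen(5)

# First connection: print the request and answer with a Set-Cookie header.
# Each header line ends in exactly one \r\n; a blank line ends the headers.
ns, na = server.accept()
print ns.recv(8192)
ns.send("HTTP/1.1 200 OK\r\n"
        "Date: Wed, 26 Oct 2011 08:37:34 CET\r\n"
        "Server: Apache/1.3.3.7 (Unix) (Red-Hat/Linux)\r\n"
        "Last-Modified: Wed, 26 Oct 2011 08:37:34 CET\r\n"
        "Accept-Ranges: bytes\r\n"
        "Content-Length: 5\r\n"
        "Connection: close\r\n"
        "Set-Cookie: moo=wtf; path=/\r\n"
        "Content-Type: text/html; charset=UTF-8\r\n"
        "\r\n"
        "Hello")
ns.close()

# Second connection: print the request to see whether the cookie comes back.
ns, na = server.accept()
print ns.recv(8192)
ns.close()
server.close()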

Then look at the actual HTTP data that gets sent. It's always useful to have the "before" data and the "after" data, so you can tell why the cookie isn't being stored or loaded.
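
The same before/after check can also be scripted end to end without touching a privileged port. The following is a rough sketch, not part of the original answer: it assumes the standard library's HTTP server, binds to a random free port on 127.0.0.1, and records the Cookie header of each request. The names CookieEchoHandler and run_cookie_check are made up for this example, and the empty ProxyHandler is there only so that a proxy configured in the environment does not intercept the local requests.

```python
import threading

try:  # Python 3
    from http.cookiejar import CookieJar
    from http.server import BaseHTTPRequestHandler, HTTPServer
    from urllib.request import HTTPCookieProcessor, ProxyHandler, build_opener
except ImportError:  # Python 2
    from cookielib import CookieJar
    from BaseHTTPServer import BaseHTTPRequestHandler, HTTPServer
    from urllib2 import HTTPCookieProcessor, ProxyHandler, build_opener

# Cookie header of every request the test server receives, in order.
seen_cookie_headers = []


class CookieEchoHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record what the client sent, then always set the same cookie.
        seen_cookie_headers.append(self.headers.get('Cookie'))
        body = b'Hello'
        self.send_response(200)
        self.send_header('Set-Cookie', 'moo=wtf; path=/')
        self.send_header('Content-Type', 'text/html; charset=UTF-8')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def run_cookie_check():
    """Request the same URL twice through one cookie-aware opener."""
    seen_cookie_headers[:] = []
    server = HTTPServer(('127.0.0.1', 0), CookieEchoHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        url = 'http://127.0.0.1:%d/' % server.server_address[1]
        opener = build_opener(ProxyHandler({}),
                              HTTPCookieProcessor(CookieJar()))
        opener.open(url).read()  # "before": no cookie stored yet
        opener.open(url).read()  # "after": the stored cookie should be sent
    finally:
        server.shutdown()
        server.server_close()
    return list(seen_cookie_headers)


print(run_cookie_check())
```

The first entry shows what the client sent before it had any cookie, the second what it sent afterwards. If the second entry carries the cookie here but the browser window still behaves as if logged out, the cookie handling in urllib2 is fine and the problem lies in the requests WebKit makes on its own.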

Torxed
  • I messed around with this idea and couldn't get it to work. I think the issue is that while the front page of whatever site I am trying to load gets the cookies passed to it, WebKit internally has no record of those cookies, so any other files, redirects, or page refreshes are requested without them. I thought my navigation_policy_decision event would handle that, but it apparently doesn't catch all requests. – AndrewR Oct 28 '11 at 22:29

I tried a similar approach myself and couldn't get it working. I'm not sure about the LWPCookieJar, but you can get persistent cookie support with pywebkitgtk "natively"; see my answer to the question "python webkit webview remember cookies?"

Matt Luongo
  • @AndrewR have you had a chance to try this? I've got cookies working successfully using ctypes and libsoup. If you really need to use the Python cookiejar, I also came up with a way for the two to interact. – Matt Luongo Jan 18 '12 at 19:28
  • Thank you! The answer you gave at the other question was just what I needed. :) – AndrewR Feb 20 '12 at 22:05
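
The ctypes and libsoup approach mentioned in the comments is not shown on this page. As a rough sketch of the general idea, and not a copy of the linked answer: WebKitGTK 1.x exposes its shared libsoup session through webkit_get_default_session(), and a text-file cookie jar created with soup_cookie_jar_text_new() can be attached to it with soup_session_add_feature(), so WebKit itself stores and sends the cookies for every request. The function name enable_persistent_cookies and the library names 'webkit-1.0' and 'soup-2.4' are assumptions; the actual library names depend on the installation.

```python
import ctypes
import ctypes.util


def enable_persistent_cookies(cookie_path):
    """Attach a libsoup text cookie jar to WebKitGTK's default session.

    Returns False if the shared libraries cannot be located.
    """
    # Assumed library names for WebKitGTK 1.x and libsoup 2.4.
    webkit_name = ctypes.util.find_library('webkit-1.0')
    soup_name = ctypes.util.find_library('soup-2.4')
    if not webkit_name or not soup_name:
        return False

    libwebkit = ctypes.CDLL(webkit_name)
    libsoup = ctypes.CDLL(soup_name)

    # Declare pointer types explicitly; the ctypes default return type (int)
    # would truncate pointers on 64-bit systems.
    libwebkit.webkit_get_default_session.restype = ctypes.c_void_p
    libsoup.soup_cookie_jar_text_new.argtypes = [ctypes.c_char_p, ctypes.c_int]
    libsoup.soup_cookie_jar_text_new.restype = ctypes.c_void_p
    libsoup.soup_session_add_feature.argtypes = [ctypes.c_void_p,
                                                 ctypes.c_void_p]
    libsoup.soup_session_add_feature.restype = None

    if not isinstance(cookie_path, bytes):
        cookie_path = cookie_path.encode('utf-8')

    session = libwebkit.webkit_get_default_session()
    jar = libsoup.soup_cookie_jar_text_new(cookie_path, 0)  # 0 = not read-only
    libsoup.soup_session_add_feature(session, jar)
    return True
```

In an application like the one in the question, this would presumably be called once before the first page is loaded, for example with config_dir + "/cookies.txt" as the path. With the jar attached at the libsoup level, the navigation-policy callback and the urllib2 opener should no longer be needed for cookie handling.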