
I am running into a weird error with spynner, though the question is a generic one. Spynner is a stateful web-browser module for Python. It works fine when it works, but on almost every run I get a failure like this:

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/spynner-2.16.dev0-py2.7.egg/spynner/browser.py", line 1651, in createRequest
    self.cookies,
AttributeError: 'Browser' object has no attribute 'cookies'
Segmentation fault (core dumped)

The problem here is that it's segfaulting and not letting me continue.

Looking at the code for spynner I see that the cookies variable is in fact initialized in the __init__() function for the Browser class like this:

self.cookies = []

Now, on failure, it is effectively saying that __init__() has not run, since it cannot see the cookies variable. I do not understand how that is possible. Without restricting yourself to the spynner module, can someone venture a guess as to how a Python object could fail with an error like this?

EDIT: I would have pasted my code here, except it's not all in one place, so I can't show it compactly. I should have done this earlier, but here is the overall structure and how I instantiate and use spynner.

import spynner

# helper class to get url data
class C:
    def __init__(self):
        self.browser = spynner.Browser()

    def get_data(self, url):
        try:
            self.browser.load(url)
            return self.browser.html
        except:
            raise

# class that does other stuff among saving url data to disk
class B:
    def save_url_to_disk(self, url):
        urlObj = C()
        html = urlObj.get_data(url)
        # do stuff with html


# class that drives everything
class A:
    def do_stuff_and_save_url_data(self, url):
        fileObj = B()
        fileObj.save_url_to_disk(url)

driver = A()
# call this function for multiple URLs.
driver.do_stuff_and_save_url_data(url)

I run it like this:

# xvfb-run python myfile.py

The segfault is probably caused by something else I am doing. Maybe it's because of the xvfb I am using and not handling it properly? I don't know yet. I should mention that I am relatively new to Python.

I noticed that when I run the code above with, say, 'http://www.google.com', I get the segfault every other time.

user220201
  • How are you calling Spynner? Are you subclassing Browser? – Daniel Roseman Dec 11 '13 at 10:01
  • Segmentation fault? Whatever you did, that should not be happening. – user2357112 Dec 11 '13 at 10:04
  • Can you show us your code please? Does any code do `del self.cookies` at any point? – Martijn Pieters Dec 11 '13 at 10:04
  • @user2357112: that happens *after* the traceback. Yes, that is worrying too, but could be unrelated. – Martijn Pieters Dec 11 '13 at 10:05
  • @MartijnPieters: Could be a sign of an unstable extension, though, in which case the solution may be to switch versions or abandon the extension. The `dev0` bit in the file path looks like it might indicate a development version; if so, not using the development version might be something to try. – user2357112 Dec 11 '13 at 10:07
  • Looks like Spynner 2.16 is still in development. The most recent stable release is 2.15. I dunno if 2.15 has this problem, but you might want to try it. – user2357112 Dec 11 '13 at 10:10
  • @user2357112 - I see the same problem reported with 2.15 on the github page. – user220201 Dec 11 '13 at 10:20
  • @MartijnPieters - I added the structure of the code. The only interaction I have with spynner is calling the load() function on the Browser object. I do not call del on anything – user220201 Dec 11 '13 at 10:23
  • can you add a print statement or so after your self.browser = spynner.Browser() to see if it has a cookies attribute? – RemcoGerlich Dec 11 '13 at 10:46
  • The segfaults are undoubtedly caused by xvfb which in my experience (I use it for headless browser automated testing) crashes all the time without much consistency, it just seems random. For that reason instead of trying to fix it I just `retry until it works or 10 tries which ever comes first`. – smassey Dec 11 '13 at 10:51
  • The difference between 2.15 and 2.16 is one commit, ostensibly to fix Travis integration. I see your issue has been reported as a bug: https://github.com/makinacorpus/spynner/issues/49 – Martijn Pieters Dec 11 '13 at 10:53
  • @MartijnPieters - I am hoping the author will get to it. But I thought may be I can fix it if I understand what might be causing that weird behavior in python. But I guess I will wait. In the meantime I will just run again till it works as it goes through sometimes. – user220201 Dec 11 '13 at 10:56

1 Answer

The body of do_stuff_and_save_url_data() never uses self, so its execution does not depend on driver.

The body of save_url_to_disk() never uses self either, so its execution does not depend on the object fileObj.

Only get_data() uses self, and more precisely self.browser, so its execution and result depend on the browser attribute of urlObj, the instance of class C. That attribute is an instance of spynner.Browser.

In the end, you "do stuff with html" using only the html read from a spynner.Browser instance after load(). Creating driver and fileObj is not needed at all.

.

Another point: when the instruction driver.do_stuff_and_save_url_data(url) is executed, a bound method object is first created, then called, and finally discarded (more precisely, left for garbage collection) because it was never assigned to a name.

The name fileObj belongs to the local namespace of do_stuff_and_save_url_data(), so it disappears when the function returns. The instance of class B it referred to then cannot be used later, since no live name refers to it any more.

It's the same for save_url_to_disk(): once fileObj.save_url_to_disk(url) has returned, the object urlObj of class C, which holds the browser created by spynner.Browser(), is lost, and the browser and all its data are lost with it.

I wonder whether this destruction of the browser instance after each call to do_stuff_and_save_url_data() and save_url_to_disk() is the reason the cookies attribute is already gone when something still needs it later.
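To make the lifetime argument concrete, here is a minimal sketch that does not use spynner at all. FakeBrowser and Helper are hypothetical stand-ins I made up for illustration; only the cookies attribute mirrors the question. It shows that an object referenced only from a function's local names is freed once the function returns.

```python
import gc
import weakref


class FakeBrowser(object):
    """Hypothetical stand-in for spynner.Browser (not the real class)."""

    def __init__(self):
        self.cookies = []


class Helper(object):
    """Plays the role of class C: it owns a browser."""

    def __init__(self):
        self.browser = FakeBrowser()


def use_helper_once():
    helper = Helper()
    # Hand back only a weak reference, so the caller does not keep
    # the browser alive by holding on to it.
    return weakref.ref(helper.browser)


browser_ref = use_helper_once()
gc.collect()  # CPython already frees it via reference counting; this is a safety net
browser_is_gone = browser_ref() is None
print(browser_is_gone)
```

On CPython this should print True: the helper and its browser are gone as soon as use_helper_once() returns. Whether this is what actually triggers the spynner error is only my guess.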

.

So, in my opinion, your code merely wraps two functions in the definitions of classes A and B, and they are used as plain functions, not as methods.

1/ I don't think this is a good coding pattern. When you only want plain functions, write them outside of any class.

2/ The problem is that, because the work is triggered by these functions, a new browser is created every time they are called, even though they are dressed up as methods.

You will tell me that you want these functions to work on data provided by the browser created by spynner.Browser().
That's why I think they should not be functions embedded in classes, as they are now, but real methods attached to a stable browser instance. That is the purpose of an object: to keep the data and the tools that process the data in the same namespace.

-

.

All that said, I would personally write:

import spynner

class C(spynner.Browser):
    def __init__(self):
        spynner.Browser.__init__(self)

    def get_data(self, url):
        # load() navigates this browser; the page source is then read from
        # self.html, the same way the question reads self.browser.html
        self.load(url)
        return self.html

    # method that does other stuff among saving url data to disk
    def save_url_to_disk(self, url):
        html = self.get_data(url)
        # do stuff with html

    # method that drives everything
    def do_stuff_and_save_url_data(self, url):
        self.save_url_to_disk(url)


driver = C()
# call this for multiple URLs; the same browser is reused each time
driver.do_stuff_and_save_url_data(url)
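If you would rather keep the browser as an attribute instead of subclassing it, the same idea can be sketched without spynner. FakeBrowser, Downloader, and the example.com URLs are hypothetical names used only for illustration; the point is that a single browser is created once and reused for every URL.

```python
class FakeBrowser(object):
    """Hypothetical stand-in for spynner.Browser that counts its instances."""

    instances_created = 0

    def __init__(self):
        FakeBrowser.instances_created += 1
        self.html = None

    def load(self, url):
        # Pretend to fetch the page; a real browser would do network I/O here.
        self.html = "<html>%s</html>" % url


class Downloader(object):
    """Owns one browser for its whole lifetime."""

    def __init__(self):
        self.browser = FakeBrowser()
        self.pages = {}

    def save_url(self, url):
        self.browser.load(url)
        # Stand-in for "do stuff with html": keep the page in memory.
        self.pages[url] = self.browser.html


urls = ["http://example.com/a", "http://example.com/b", "http://example.com/c"]
downloader = Downloader()
for url in urls:
    downloader.save_url(url)

print(FakeBrowser.instances_created)
print(len(downloader.pages))
```

Three pages are saved while only one browser is ever created, which is the behavior I am suggesting you aim for with the real spynner.Browser.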

But I'm not sure I have understood all your considerations, and I should warn you that I didn't know spynner before reading your post. Everything I've written could be beside the point for your real problem. Please keep a critical eye on my post.

eyquem