
I'm experimenting with a class built around UrlRequest just to check whether a given URL is valid. It is turning out to be a bit more difficult than anticipated!

The issue is that the on_success and on_failure/on_error callbacks defined as part of the class are never called. The script prints the following output (based on the print commands):

http://www.google.com
request sent
URL doesn't work

My suspicion is that I'm getting the return value (None) of the test_connection method, and not that of connectionSuccess or connectionFailure. How can I make the call wait for one of the latter to return? Any suggestions welcome. Thanks.

from kivy.app import App
from kivy.uix.floatlayout import FloatLayout
from kivy.network.urlrequest import UrlRequest

class WebExplorer():      
    def test_connection(self, path):
        self.path = path
        print (self.path)
        req = UrlRequest(self.path,on_failure=self.connectionFailure,on_error=self.connectionFailure,on_success=self.connectionSuccess)
        print ("request sent")

    def connectionSuccess(self,*args):
        print ("connectionSuccess")
        return 0

    def connectionFailure(self,*args):
        print ("connectionFailure")
        return 1


class MainScreen(FloatLayout):
    def __init__(self, **kwargs):
        super(MainScreen, self).__init__(**kwargs)
        self.address = 'http://www.google.com'

        if WebExplorer().test_connection(self.address) == 0:
            print ("URL works")
        else:
            print ("URL doesn't work")

class App(App):
    def build(self):
        return MainScreen()


if __name__ == "__main__": 
    App().run() 

UPDATE 2016-09-27 I changed my code and I have spent hours on trying to figure out the problem. First the code:

from kivy.app import App
from kivy.uix.floatlayout import FloatLayout
from kivy.network.urlrequest import UrlRequest

class WebExplorer():

    def test_connection(self, path):
        self.path = path
        req = UrlRequest(self.path,on_failure=self.connectionFailure,on_error=self.connectionFailure,on_success=self.connectionSuccess)
        req.wait()
        return (self._return_value)

    def connectionSuccess(self, req, results):
        print ("Success")
        self._return_value = [0,results]

    def connectionFailure(self, req, results):
        print ("Failure")
        self._return_value = [1,results]


class MainScreen(FloatLayout):
    def __init__(self, **kwargs):
        super(MainScreen, self).__init__(**kwargs)
        self.URLtest = ['http://www.ikea.com/','https://www.google.com','https://www.sdfwrgaeh.com']
        for URL in self.URLtest:
            self.returnCode = WebExplorer().test_connection(URL)
            if self.returnCode[0] == 0:
                print ("Correct URL")
            else:
                print ("Wrong URL")

class App(App):
    def build(self):
        return MainScreen()

if __name__ == "__main__": 
    App().run() 

Why 3 URLs? Because one is "correct" (IKEA), one is a redirect (Google) and one is completely bogus. Turns out, the code works for the first one only. req.wait() doesn't work when the result is a failure/error (btw I have zero idea what the difference between these two is).

So the question is how to make req.wait() process a failure, or alternatively how to exit the class with the correct error code. I considered Clock.schedule_interval to periodically check the status, but since the event methods are not even executed when the URL is incorrect, I have nowhere to set my variables -_-

Omar Little

1 Answer

UrlRequest is asynchronous, which means that the actual request is processed in the background while your application proceeds. The functions on_failure etc. are callbacks, i.e. they are called when the request is done, so whatever they return is "lost": nothing hands it back to your code. The return value you get is indeed from the test_connection function, which has no return statement and therefore returns None.
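To see why a callback's return value goes nowhere, here is a minimal sketch that uses only the standard library. `start_check` and `on_done` are hypothetical names made up for illustration; they stand in for an asynchronous request and its callback and are not part of Kivy's API.

```python
import threading

results = {}  # filled in by the callback, not by a return value


def on_done(url, ok):
    """Callback: record the outcome somewhere the rest of the program can see."""
    results[url] = ok
    return 0  # nobody receives this value; the worker ignores it


def start_check(url, callback):
    """Hypothetical stand-in for an asynchronous request (not Kivy's API)."""
    def worker():
        ok = url.startswith("http")  # placeholder for real network work
        callback(url, ok)            # the callback's return value is dropped

    thread = threading.Thread(target=worker)
    thread.start()
    return thread  # returned immediately, before the outcome is known


thread = start_check("http://www.google.com", on_done)
thread.join()  # only here so the sketch can show the stored result
print(results)  # {'http://www.google.com': True}
```

In this sketch the callback runs on the worker thread. The point is only that the outcome has to be stored (or acted on) inside the callback, because the function that started the request has already returned.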

You mentioned on the mailing list that you want to make a call to a root url, do some processing on what you get back and then issue new requests to suburls depending on content and user interaction.

The following should provide a skeleton for what you are trying to do.

from kivy.network.urlrequest import UrlRequest
from kivy.uix.boxlayout import BoxLayout
from kivy.uix.button import Button


class WebExplorer:

    def __init__(self, gui):
        self._visited_urls = [] # to prevent endless redirect loops
        self._visiting = set() # to check if we are still doing something
        self.gui = gui # used to display buttons in process_result

    def get_url(self, url):
        self._visited_urls.append(url)
        self._visiting.add(url)
        UrlRequest(url,
            on_success=self.process_result,
            on_failure=self.failure,
            on_error=self.failure, # what is the difference between failure and error?
            on_redirect=self.redirect)

    def failure(self, req, result):
        # shared handler for on_failure and on_error
        self._visiting.discard(req.url)
        print("Request failed:", req.url, result)

    def redirect(self, req, result):
        self._visiting.discard(req.url)
        # the redirect target is in the response headers, not the request headers
        # (the casing of the header name may depend on the server)
        self.get_url(req.resp_headers["Location"]) # check if correct handling of 3xx

    def process_result(self, req, result):
        self._visiting.discard(req.url)
        #TODO process the body (result) to extract urls you want to visit
        urls_you_want_to_visit = []

        box = BoxLayout()
        for url in urls_you_want_to_visit:
            button = Button()
            button.text = url
            # on_press passes the button instance to the callback; url=url binds
            # the current url, otherwise every button would call the last url
            button.bind(on_press=lambda instance, url=url: self.get_url(url))
            box.add_widget(button)

        self.gui.add_widget(box)

Of course there is still a lot to improve. It's also not tested so it can contain bugs.
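The scoping concern around the button callbacks in the skeleton is Python's late-binding closure behaviour, which can be seen in isolation without Kivy. The sketch below is plain Python; `make_late` and `make_bound` are hypothetical helper names used only for this illustration.

```python
def make_late(urls):
    """Each lambda looks up `url` when it is called, after the loop has ended."""
    callbacks = []
    for url in urls:
        callbacks.append(lambda: url)
    return callbacks


def make_bound(urls):
    """A default argument captures the value of `url` at definition time."""
    callbacks = []
    for url in urls:
        callbacks.append(lambda url=url: url)
    return callbacks


urls = ["/a", "/b", "/c"]
print([cb() for cb in make_late(urls)])   # ['/c', '/c', '/c']
print([cb() for cb in make_bound(urls)])  # ['/a', '/b', '/c']
```

For a Kivy button the callback additionally has to accept the button instance that `on_press` passes in, so the bound form would take that as its first parameter.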

syntonym
  • Thanks for your time and for the suggestion to use req.wait(). It still returns "None", though, like if I hadn't put the code in. I applied the changes as you suggested, and added a "print ("return value",self._return_value)" right before the "return self._return_value" of the test_connection method. It does return "None" – Omar Little Sep 23 '16 at 06:14
  • (unable to add to my previous comment). I also tried putting a sleep command after the req.wait(). It still returns "None". The documentation for wait() is that the process returns to the main thread "This method is intended to be used in the main thread, and the callback will be dispatched from the same thread from which you’re calling.", so in theory there is no reason why the on_success method wouldn't be called, thus setting the return code accordingly. Bug? – Omar Little Sep 23 '16 at 07:42
  • what is `req.result`? – syntonym Sep 24 '16 at 10:15
  • Hmm I don't think the script uses something called "req.result"? What do you mean? – Omar Little Sep 26 '16 at 06:54
  • Sorry, I meant "what is the value of `req.result` after the `wait` call". `req.result` should indicate whether the request was successful or not. Maybe you get a redirect and neither on_error nor on_success is getting called? – syntonym Sep 26 '16 at 09:30
  • Oh ok :) It returned the body of a redirect page: "302 Moved. The document has moved here." – Omar Little Sep 27 '16 at 11:36
  • Updated my OP.The problem is that req.wait doesn't do anything if the request failed. The google URL is a redirect and somehow does not count towards success. See my update for more information. Honestly, this is really frustrating. – Omar Little Sep 27 '16 at 12:58
  • What do you mean with `req.wait` does not work? Does it not stall until the request failed because of the move? Try to also give a function for `on_redirect` does that work? – syntonym Sep 27 '16 at 22:30
  • Assigning a function to on_redirect worked for the first URL. Now the only thing not working is the wrong URL (the third one). What I mean by "req.wait does not work" is that although the connectionFailure method is called (the print command seems to work), the WebExplorer class never exits; it's stuck on req.wait. So I'm wondering what req.wait is waiting for, since the request should be complete at this point? – Omar Little Sep 28 '16 at 16:06
  • That is only in the case that the URL you gave has no associated IP (DNS is bogus), correct? Maybe UrlRequest "doesn't work" in that case. Do you get any debug output e.g. an exception? The UrlRequest is in another thread so it will not exit the mainthread. – syntonym Sep 28 '16 at 16:25
  • The debug gave [DEBUG ] [UrlRequest ] 59954736 Download error <[Errno 11004] getaddrinfo failed> Maybe I should try to exit the WebExplorer class from within the connectionFailure method instead – Omar Little Sep 28 '16 at 17:23
  • It looks like the `wait` function only works when there is no exception in the code. Use something like `while not req.is_finished: sleep(1)`; `is_finished` should be set to True when something goes wrong. – syntonym Sep 28 '16 at 20:53
  • I think you should also be able to first check for `req.is_finished` and then call `wait` - but that sounds like a race condition (which means it can fail unpredictably). – syntonym Sep 28 '16 at 21:01
  • Thanks for the suggestion. I had already thought about implementing a loop, although I was waiting on the return value to be set by one of the connectionSuccess/connectionFailure functions instead of the UrlRequest is_finished status flag. But sleep makes the process hang, and I can't set up a kivy clock schedule instead of sleep because the method that needs to wait is test_connection, as it's the one I'm calling and getting the return from, in the main App. I think the problem may just be my design. I don't think I can build a class around UrlRequest like I did. – Omar Little Sep 29 '16 at 05:33
  • I think the API is a bit weird. You could try to overwrite `req.wait` so that it also looks out for the `is_finished` property. – syntonym Sep 29 '16 at 10:21
  • `req.is_finished` seems to not be set to `True` when an error occurs. Maybe it's easier to create your own thread and use something like [requests](http://docs.python-requests.org/en/master/). You could also give the kivy mailing list a try if you don't get a working answer here. – syntonym Sep 29 '16 at 15:13
  • Yeah, I gave it a try yesterday; you can find the discussion at https://groups.google.com/forum/#!topic/kivy-users/ERcZDjr3znE I appreciate your feedback, thanks for helping me on this issue. – Omar Little Sep 29 '16 at 22:26
  • But this didn't answer the question which callback is called when the url is invalid, does it? – syntonym Sep 30 '16 at 11:05
  • I tried to outline how such a program could look like regarding the feedback you got on the mailing list. Somehow when I tested before UrlRequest did not call `on_error` but I think I forgot to use `App().run()`. Sorry about the confusion! – syntonym Sep 30 '16 at 11:37
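The idea from the comments of running the check in your own thread could look roughly like the sketch below. It swaps in the standard library's `urllib.request` instead of the `requests` package mentioned there, and `check_url` and its `callback(url, ok, detail)` signature are assumptions made for this illustration, not an existing API.

```python
import threading
import urllib.error
import urllib.request


def check_url(url, callback, timeout=5.0):
    """Check `url` in a background thread, then call callback(url, ok, detail)."""
    def worker():
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                ok, detail = True, response.status
        except (urllib.error.URLError, ValueError, OSError) as exc:
            # DNS failures, refused connections, HTTP error codes and
            # malformed URLs all end up here
            ok, detail = False, exc
        callback(url, ok, detail)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread  # the caller is not blocked; the result arrives via callback


def report(url, ok, detail):
    print("Correct URL" if ok else "Wrong URL", url, detail)


# A malformed URL fails without touching the network.
check_url("not-a-url", report).join()
```

Note that the callback here runs on the worker thread. Inside a Kivy app, anything that touches widgets would presumably need to be handed back to the main thread, for example with `Clock.schedule_once`.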