9

I have a stream of links coming in, and I want to check them for RSS every now and then. But when I fire off my get_rss() function, it blocks and the stream halts. This is unnecessary; I'd like to fire and forget the get_rss() call (it stores its results elsewhere).

My code looks like this:

self.ff.get_rss(url)    # not async
print 'im back!'

(...)

def get_rss(url):
    page = urllib2.urlopen(url)     # not async
    soup = BeautifulSoup(page)

I'm thinking that if I can fire-and-forget the first call, then I can even use urllib2 without worrying about it not being async. Any help is much appreciated!

Edit: Trying out gevent, but like this nothing happens:

print 'go'
g = Greenlet.spawn(self.ff.do_url, url)
print g
print 'back'

# output: 
go
<Greenlet at 0x7f760c0750f0: <bound method FeedFinder.do_url of <rss.FeedFinder object at 0x2415450>>(u'http://nyti.ms/SuVBCl')>
back

The Greenlet seems to be registered, but the function self.ff.do_url(url) doesn't seem to be run at all. What am I doing wrong?

knutole

3 Answers

13

Fire and forget using the multiprocessing module:

from multiprocessing import Process

def fire_and_forget(arg_one):
    # do stuff
    ...

def main_function(arg_one):
    p = Process(target=fire_and_forget, args=(arg_one,))
    # daemon=True means the parent does not wait for (join) the child on exit;
    # the child is terminated when the parent process exits
    p.daemon = True
    p.start()
    return "doing stuff in the background"
Radix
1
Here is sample code for thread-based invocation. Optionally, threading.stack_size can be set to control how much memory each thread reserves for its call stack, which may help when starting many threads.
import threading
import requests

#The stack size set by threading.stack_size is the amount of memory to allocate for the call stack in threads.
threading.stack_size(524288)

def alpha_gun(url, json, headers):
    #r=requests.post(url, data=json, headers=headers)
    r=requests.get(url)
    print(r.text)


def trigger(url, json, headers):
    threading.Thread(target=alpha_gun, args=(url, json, headers)).start()


url = "https://raw.githubusercontent.com/jyotiprakash-work/Live_Video_steaming/master/README.md"
payload="{}"
headers = {
  'Content-Type': 'application/json'
}

for i in range(10):
    print(i)
    # fire the background request on one chosen iteration
    if i == 5:
        trigger(url=url, json=payload, headers=headers)
        print('invoked')
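
When links arrive quickly, starting one thread per call can pile up. One possible variation is a bounded pool. This is a minimal sketch, assuming Python 3's concurrent.futures (not used in the answer above). The get_rss body is a sleep placeholder rather than a real download, and handle_link, results and the example.com URLs are hypothetical names chosen for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# One shared pool: at most 4 background jobs run at the same time.
executor = ThreadPoolExecutor(max_workers=4)

results = []  # stand-in for "stores its results elsewhere"

def get_rss(url):
    # Placeholder for the blocking fetch-and-parse step (assumption).
    time.sleep(0.5)
    results.append(url)

def handle_link(url):
    # submit() schedules the call and returns a Future immediately,
    # so the caller does not wait for get_rss to finish.
    executor.submit(get_rss, url)
    return "im back!"

start = time.monotonic()
for link in ["http://example.com/a", "http://example.com/b"]:
    handle_link(link)
elapsed = time.monotonic() - start  # time spent submitting, not running

# Only needed at program exit, to let queued jobs finish.
executor.shutdown(wait=True)
```

The loop returns as soon as both jobs are queued. Ignoring the returned Future is what makes this fire-and-forget, but it also means exceptions raised inside get_rss go unnoticed unless the Future is checked.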
    
0

You want to use the threading module or the multiprocessing module and save the result in a database, a file, or a queue.
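
As a rough illustration of the threading-plus-queue option, here is a minimal sketch. It assumes Python 3's standard library, and the get_rss body is a hypothetical placeholder that builds a string instead of downloading anything. The names fire_and_forget, results and collected are likewise chosen for illustration.

```python
import queue
import threading

results = queue.Queue()  # thread-safe hand-off for finished work

def get_rss(url, out):
    # Hypothetical placeholder for the real download and parse.
    feed = "feed for " + url
    out.put((url, feed))

def fire_and_forget(url):
    # Start the work in a background thread and return right away.
    t = threading.Thread(target=get_rss, args=(url, results), daemon=True)
    t.start()
    return t

urls = ("http://example.com/a", "http://example.com/b")
threads = [fire_and_forget(u) for u in urls]

# A real caller would carry on with the stream; joining here is only
# for the demo, so that every result is in the queue before draining it.
for t in threads:
    t.join()

collected = {}
while not results.empty():
    url, feed = results.get()
    collected[url] = feed
```

A consumer elsewhere in the program could call results.get() in its own loop instead of draining the queue after a join.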

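You can also use gevent. Note that a spawned greenlet only starts running once the calling code yields control, for example with gevent.sleep(0) or g.join().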

Bite code