I have a function named spider which takes seed as an argument. seed is the URL I send to the spider function. My question is: how do I use beanstalkc in Python to queue the URLs and process the jobs?

Keith Pinson

Srikanth
1 Answer
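According to the tutorial, you need the following:
- A running beanstalkd server.
Connect (14711 is the tutorial's example port; beanstalkd listens on 11300 by default, so use whichever port your server was started with):
    import beanstalkc
    beanstalk = beanstalkc.Connection(host='localhost', port=14711)
Add jobs using:
    beanstalk.put('seed url')
Get a job and pass its body (the URL you queued) to your function:
    job = beanstalk.reserve()
    spider(job.body)
Mark the job as completed:
    job.delete()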
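Put together, the steps above might look like the following sketch. It is only an illustration under stated assumptions: FakeConnection and FakeJob are hypothetical in-memory stand-ins that mimic just the put, reserve, delete and bury calls, so the flow can be tried without a server, and the spider body and the example URLs are placeholders for your own.

```python
from collections import deque


class FakeJob:
    """Hypothetical stand-in for a reserved job (not part of beanstalkc)."""

    def __init__(self, conn, body):
        self._conn = conn
        self.body = body  # the payload is a plain attribute, not a method

    def delete(self):
        # The job already left the ready queue when it was reserved.
        pass

    def bury(self):
        # Set the job aside for later inspection instead of retrying it.
        self._conn.buried.append(self.body)


class FakeConnection:
    """Hypothetical in-memory queue standing in for beanstalkc.Connection."""

    def __init__(self):
        self.ready = deque()
        self.buried = []

    def put(self, body):
        self.ready.append(body)

    def reserve(self, timeout=None):
        # Simplification: returns None as soon as the queue is empty.
        if not self.ready:
            return None
        return FakeJob(self, self.ready.popleft())


def enqueue_seeds(conn, seeds):
    """Producer side: queue one job per seed URL."""
    for seed in seeds:
        conn.put(seed)


def drain_queue(conn, handler, timeout=0):
    """Consumer side: reserve jobs until none are left.

    Jobs are deleted when the handler succeeds and buried when it raises.
    Returns the number of jobs handled successfully.
    """
    done = 0
    while True:
        job = conn.reserve(timeout=timeout)
        if job is None:
            return done
        try:
            handler(job.body)
        except Exception:
            job.bury()
        else:
            job.delete()
            done += 1


crawled = []


def spider(seed):
    """Placeholder for the asker's spider function."""
    if not seed.startswith("http"):
        raise ValueError("not a URL: %r" % seed)
    crawled.append(seed)


conn = FakeConnection()
enqueue_seeds(conn, ["http://example.com/a", "not-a-url", "http://example.com/b"])
processed = drain_queue(conn, spider)
print(processed, crawled, conn.buried)
```

With a real server, the idea would be to replace FakeConnection() with beanstalkc.Connection(host=..., port=...) and leave enqueue_seeds and drain_queue unchanged. Burying failed jobs is a design choice made for this sketch, not something the answer prescribes; releasing them back to the queue is another option.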
- In job = beanstalk.reserve() followed by spider(job.body), can you please explain where I am sending the URL (or seed) to spider? When I try printing job.body() it prints "True", not the URL. – Srikanth Jun 27 '11 at 08:58
- It's an attribute, so use job.body, not job.body(). Please follow the tutorial step by step first; that should give you a nice intro. – Damian Jun 27 '11 at 09:08
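The attribute-versus-call distinction can be shown with a small stand-in. The Job class below is hypothetical and only mimics a job whose body is a string; it does not claim to reproduce what the real library does when body is called.

```python
class Job:
    """Hypothetical stand-in for a reserved job whose body is a string."""

    def __init__(self, body):
        self.body = body


job = Job("http://example.com/")

# Attribute access gives back the queued string.
url = job.body

# Calling the attribute tries to call a string, which Python rejects.
try:
    job.body()
    call_error = None
except TypeError as exc:
    call_error = exc

print(url)
print(type(call_error).__name__)
```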