
How can I clear Scrapyd's job list? Whenever I start a spider I see a lot of jobs for that spider, and I want to know how to kill all of them. After reading the documentation I wrote the following code, which I run in a loop:

import ast, os
from pprint import pprint

# schedule a job and capture the JSON response
os.system('curl http://localhost:6800/schedule.json -d project=default -d spider=google > kill_job.text')
with open('kill_job.text', 'r') as f:
    a = ast.literal_eval(f.read())
# cancel the job we just scheduled
kill = 'curl http://localhost:6800/cancel.json -d project=default -d job={}'.format(a['jobid'])
pprint(kill)
os.system(kill)

but it looks like it doesn't work. How can I kill all jobs? Even if I kill the Scrapy process manually, all the jobs come back on the next start. I found https://github.com/DormyMo/SpiderKeeper for project management. Does anybody know how to include an existing project?
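
For reference, schedule.json responds with JSON along these lines (the jobid below is just a placeholder):

{"status": "ok", "jobid": "6487ec79947edab326d6db28a2d86511"}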

kolas

2 Answers


So, I don't know what was wrong with my first example, but I fixed the problem with this:

import ast, os

# fetch the full job list for the project
os.system('curl http://localhost:6800/listjobs.json?project=projectname > kill_job.text')
with open('kill_job.text', 'r') as f:
    a = ast.literal_eval(f.read())
# use the response keys directly; indexing a.values() depends on dict ordering
for i in a['pending'] + a['running']:
    kill = 'curl http://localhost:6800/cancel.json -d project=projectname -d job={}'.format(i['id'])
    os.system(kill)
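
If you don't want to shell out to curl, the same flow can be done with the requests library (a minimal sketch, assuming a default Scrapyd instance on localhost:6800 and the documented listjobs.json/cancel.json endpoints):

import requests

BASE = 'http://localhost:6800'
PROJECT = 'projectname'

# list all jobs for the project, then cancel the pending and running ones
jobs = requests.get(BASE + '/listjobs.json', params={'project': PROJECT}).json()
for job in jobs['pending'] + jobs['running']:
    requests.post(BASE + '/cancel.json', data={'project': PROJECT, 'job': job['id']})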
kolas

I took @kolas's script and updated it for Python 3:

import json, os

PROJECT_NAME = "MY_PROJECT"

# fetch the job list for the project
os.system('curl http://localhost:6800/listjobs.json?project={} > kill_job.text'.format(PROJECT_NAME))
with open('kill_job.text', 'r') as f:
    a = json.load(f)

# read the 'pending' key directly instead of indexing a.values(),
# which breaks if the response gains or reorders fields
pending_jobs = a['pending']
for job in pending_jobs:
    kill = 'curl http://localhost:6800/cancel.json -d project={} -d job={}'.format(PROJECT_NAME, job['id'])
    os.system(kill)
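
Note that cancel.json only affects pending and running jobs (for a running job Scrapyd signals the process, so it may take a moment to stop). Entries in the finished list are kept by Scrapyd until they rotate out (the jobs_to_keep setting in scrapyd's config controls how many are kept), which is why they show up again after a restart.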
reading_ant