I'm running into an odd situation where Celery reprocesses a task that has already been completed. The overall design looks like this:
Celery Beat: pulls files periodically; when a file is pulled, it creates a new entry in the DB and delegates processing of that file to another Celery task on a one-worker queue (that way only one file gets processed at a time).
Celery task: processes the file; once it's done it's done, no retries, no loops.
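For reference, the beat side is wired up roughly like this in settings (a sketch; the entry name and interval here are illustrative, not the real values):

from datetime import timedelta

CELERYBEAT_SCHEDULE = {
    'periodic-pull-file': {
        'task': 'periodic_pull_file',
        'schedule': timedelta(minutes=5),  # illustrative interval
    },
}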
@app.task(name='periodic_pull_file')
def periodic_pull_file():
    for f in get_files_from_some_dir(...):
        ingested_file = IngestedFile(filename=filename)
        ingested_file.document.save(filename, File(f))
        ingested_file.save()
        process_import(ingested_file.id)
        # deletes the file from the dir source
        os.remove(....somepath)
def process_import(ingested_file_id):
    ingested_file = IngestedFile.objects.get(id=ingested_file_id)
    if 'foo' in ingested_file.filename.lower():
        f = process_foo
    else:
        f = process_real_stuff
    f.apply_async(args=[ingested_file_id], queue='import')
@app.task(name='process_real_stuff')
def process_real_stuff(file_id):
    #dostuff
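The body is essentially a single pass over the file that bumps a progress value as it goes; a simplified sketch (the progress field and per-line helper are illustrative, not the real code):

def run_import(ingested_file):
    # Sketch only: walk the file once and persist a rough percentage.
    total = ingested_file.document.size                   # file size in bytes
    done = 0
    ingested_file.document.open('rb')
    for line in ingested_file.document:
        handle_line(line)                                 # assumed helper
        done += len(line)
        ingested_file.progress = int(100 * done / total)  # assumed model field
        ingested_file.save(update_fields=['progress'])
    ingested_file.document.close()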
process_foo and process_real_stuff are just functions that loop over the file once, and once they're done, they're done. I can actually track the percentage of progress, and the interesting thing I noticed is that the same file kept getting processed over and over again (note that these are large files and processing is slow, taking hours per file). I started wondering whether duplicate tasks were being created in the queue, so I checked my Redis queue when I had 13 pending files to import:
-bash-4.1$ redis-cli -p 6380 llen import
(integer) 13
And aha, 13. I checked the content of each queued task to see whether it was just repeated ingested_file_ids, using:
redis-cli -p 6380 lrange import 0 -1
And they're all unique tasks with unique ingested_file_ids. Am I overlooking something? Is there any reason why it would finish a task and then loop over the same task again and again? This only started happening recently, with no code changes; before, things were pretty snappy and seamless. I also know it's not a "failed" task that somehow magically retries itself, because it isn't moving down in the queue: the worker receives the same task in the same order again and again, so it never gets to the other 13 files it should have processed.
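For anyone who wants to repeat that check, this is roughly how I decoded the queued payloads (a sketch assuming the default JSON task serializer and the redis-py client; the exact envelope layout depends on the Celery/kombu version):

import base64
import json

import redis

r = redis.StrictRedis(port=6380)
for raw in r.lrange('import', 0, -1):
    envelope = json.loads(raw)
    # kombu's Redis transport base64-encodes the serialized task body
    # inside the JSON message envelope.
    body = json.loads(base64.b64decode(envelope['body']))
    print(body['id'], body['task'], body['args'])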
Note, this is my worker:
python manage.py celery worker -A myapp -l info -c 1 -Q import
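To confirm it really is the same message being re-delivered (and not the beat task queuing duplicates), I'm also thinking of logging the request id inside the task, roughly like this (bind=True is only there for the diagnostic):

import logging

logger = logging.getLogger(__name__)

@app.task(name='process_real_stuff', bind=True)
def process_real_stuff(self, file_id):
    # If the same UUID shows up run after run, the broker is re-delivering
    # one message; if the UUIDs differ, duplicates are being queued.
    logger.info('starting task %s for file %s', self.request.id, file_id)
    #dostuff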