I'm having problems with the code below. I'm generating a list of lists using the chunk function. The code is adapted from here:
Splitting a list into N parts of approximately equal length
Upon inspection in the shell, this is what I'm getting as the types of the list and sublists:
In [18]: urls = list(range(957))
In [19]: type(urls)
Out[19]: list
In [20]: chunks = chunk(urls, 10)
In [21]: type(chunks)
Out[21]: list
In [22]: type(chunks[0])
Out[22]: list
They are all lists. But when I send this list to a Celery task, it raises a TypeError about an unhashable 'slice', as if the argument being sliced were not a list at all.
This is the traceback:
[2016-10-03 12:30:57,360: ERROR/MainProcess] Task tasks.scrape_id_group[3759ff40-45f3-455f-87e7-1dadbbb201fc] raised unexpected: TypeError("unhashable type: 'slice'",)
Traceback (most recent call last):
  File "/home/milan/.virtualenvs/upwork/lib/python3.4/site-packages/celery/app/trace.py", line 240, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/milan/.virtualenvs/upwork/lib/python3.4/site-packages/celery/app/trace.py", line 438, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/milan/skilled/upwork/tasks.py", line 55, in scrape_id_group
    for i, sublist in enumerate(chunk(ids, count))
  File "/home/milan/skilled/upwork/utils.py", line 219, in chunk
    return [l[i::n] for i in range(n)]
  File "/home/milan/skilled/upwork/utils.py", line 219, in <listcomp>
    return [l[i::n] for i in range(n)]
TypeError: unhashable type: 'slice'
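For reference, the same error can be reproduced outside Celery by slicing a mapping instead of a list. This is only a sketch: the `payload` dict is a made-up stand-in, and whether the chained task really receives a dict is an assumption that the traceback alone does not confirm.

```python
def chunk(l, n):
    # Same slice-based helper as in the traceback.
    return [l[i::n] for i in range(n)]


# A real list slices without any problem.
list_chunks = chunk(list(range(7)), 3)
print(list_chunks)

# Hypothetical non-list argument: a dict cannot be indexed with a slice.
payload = {"task": "example", "args": []}
try:
    chunk(payload, 3)
except (TypeError, KeyError) as exc:
    # Python < 3.12 raises TypeError (unhashable type: 'slice').
    # Python >= 3.12 makes slices hashable, so the lookup fails with KeyError.
    error_name = type(exc).__name__
    print(error_name, exc)
```

So the message points at the type of `l` inside chunk, not at the sublists it returns.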
This is the modified code to show the problem:
from celery import Celery, group

app = Celery('tasks', broker='redis://localhost:6379/0')


def chunk(l, n):
    return [l[i::n] for i in range(n)]


@app.task
def scrape_list_task(account, urls):
    print(urls)


@app.task
def scrape_id_task(account, ids):
    print(ids)


@app.task
def scrape_list_group():
    accounts = list(range(10))
    count = len(accounts)
    urls = list(range(957))
    return group(
        scrape_list_task.s(accounts[i], sublist)
        for i, sublist in enumerate(chunk(urls, count))
    )


@app.task
def scrape_id_group(ids):
    accounts = list(range(10))
    count = len(accounts)
    return group(
        scrape_id_task.s(accounts[i], sublist)
        for i, sublist in enumerate(chunk(ids, count))
    )


@app.task
def scrape_task():
    (scrape_list_group.s() | scrape_id_group.s())()
I'm running this code in the default Celery queue with Redis as the message broker. Any suggestions? Thanks!
EDIT:
The actual problem appears when running scrape_task in a Celery queue. When the scrape_list_group task is invoked on its own with delay:
scrape_list_group.delay()
the task runs fine.
The problem seems to be the chaining of the tasks. I tried converting the lists to tuples, but that made no difference. Thanks for any info!
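One way to narrow this down is to check what the chained task actually receives before slicing it. The sketch below is plain Python; `ensure_list` is a hypothetical helper (not part of Celery or of my code above) that fails with a message naming the real type of the argument.

```python
def ensure_list(value):
    """Return value as a list, or raise a TypeError naming the actual type."""
    if isinstance(value, (list, tuple)):
        return list(value)
    raise TypeError(
        "expected a list or tuple, got {}: {!r}".format(
            type(value).__name__, value
        )
    )


def chunk(l, n):
    # Validate first, so a wrong argument type is reported clearly
    # instead of surfacing as an error about slices.
    l = ensure_list(l)
    return [l[i::n] for i in range(n)]


print(chunk((1, 2, 3, 4), 2))

try:
    chunk({"a": 1}, 2)
except TypeError as exc:
    message = str(exc)
    print(message)
```

With a check like this in place, the worker log would show the type of `ids` directly.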
SOLUTION:
I've changed the chunk code to an iterative approach. Thanks for the suggestion @Nf4r :)
def chunk(l, n):
    result = [[] for _ in range(n)]
    for i, el in enumerate(l):
        result[i % n].append(el)
    return result
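A quick check that the iterative version gives the same chunks as the slice-based one for a plain list. The names `chunk_slices` and `chunk_iter` are only used here to tell the two versions apart. Note that the iterative version accepts any iterable, so a non-list argument (for example a dict, which iterates over its keys) no longer raises; that may hide the type problem rather than fix it.

```python
def chunk_slices(l, n):
    # Original slice-based version: requires a sequence that supports slicing.
    return [l[i::n] for i in range(n)]


def chunk_iter(l, n):
    # Iterative version: only requires that l can be iterated over.
    result = [[] for _ in range(n)]
    for i, el in enumerate(l):
        result[i % n].append(el)
    return result


urls = list(range(957))
same = chunk_slices(urls, 10) == chunk_iter(urls, 10)
sizes = [len(c) for c in chunk_iter(urls, 10)]
print(same, sizes)

# A dict is accepted silently: only its keys end up in the chunks.
keys_only = chunk_iter({"a": 1, "b": 2, "c": 3}, 2)
print(keys_only)
```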