Consider recursive Cython code of the following generic form:
cpdef function(list L1, list L2):
    global R
    cdef int i,n #...
    cdef list LL1,LL2 #...
    # ...
    # core of the code
    # ...
    n= #...
    for i in range(n):
        LL1= #...
        LL2= #...
        function(LL1,LL2)
New remark: my relevant code is just a tree exploration that collects fruits, and all the branches are independent. On a computer with several CPUs, I would like to parallelize as follows: each CPU has a queue; when the code reaches a new node of the tree, that node has several possible children, and each child is assigned to the CPU with the shortest queue. This seems to be a generic way to parallelize a tree exploration.
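To make the scheme above concrete, here is a minimal single-process simulation of the "shortest queue" assignment in plain Python. It only illustrates the scheduling policy and does not run anything in parallel. The toy tree and the helpers children, is_fruit and explore_with_queues are hypothetical and stand in for the real exploration.

```python
from collections import deque

DEPTH = 4  # depth of the toy tree (assumption, for illustration only)

def children(node):
    """Hypothetical tree: each node is a tuple of bits with two children."""
    if len(node) < DEPTH:
        return [node + (0,), node + (1,)]
    return []

def is_fruit(node):
    """Hypothetical fruit test: a leaf whose bits sum to an even number."""
    return len(node) == DEPTH and sum(node) % 2 == 0

def explore_with_queues(root, n_workers=3):
    """Simulate n_workers queues; each new child goes to the shortest queue."""
    queues = [deque() for _ in range(n_workers)]
    queues[0].append(root)
    fruits = []
    processed = [0] * n_workers  # number of nodes handled by each worker
    while any(queues):  # an empty deque is falsy
        # One simulated time step: every non-idle worker handles one node.
        for w, q in enumerate(queues):
            if not q:
                continue
            node = q.popleft()
            processed[w] += 1
            if is_fruit(node):
                fruits.append(node)
            for child in children(node):
                # Assign the child to the worker with the shortest queue.
                min(queues, key=len).append(child)
    return fruits, processed

fruits, processed = explore_with_queues(())
print(len(fruits), processed)
```

In a real implementation each queue would be owned by a separate process or thread, and comparing queue lengths would need synchronization; this sketch leaves that out.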
Question: What is the easiest way to implement such a parallelization?
I tried adding the following line at the top of my code:
    from cython.parallel import prange
and then replacing range(n) with prange(n), but I got the error:
    prange() can only be used without the GIL
Then I replaced prange(n) with prange(n,nogil=True), but I got many errors like:
    Assignment of Python object not allowed without gil
    Coercion from Python not allowed without the GIL
    Indexing Python object not allowed without gil
    Calling gil-requiring function not allowed without gil
Below is the relevant code I want to parallelize:
cpdef SmithFormIntegralPointsSuperFiltred(list L, list LL, list co, list A):
    global R,clp
    cdef int i,j,k,l,ll,p,a,c,cc,rc,m,f,b,z,zz,lp,s,la,kk,ccc,zo,jj,lM
    cdef list LB,S,P,CP,F,cco,PP,PPP,coo,V,LLP,LLPO,Mi,M
    m=10000
    l=len(L)
    ll=len(LL)
    la=len(A[0])
    z=0
    zz=0
    P=[]
    for i in range(l):
        if L[i]==-1:
            P.append(i)
    lp=len(P)
    if lp<clp:
        print([lp,L])
        clp=lp
    if lp==0:
        F=list(matrix(LL)*vector(L))
        b=0
        for f in F:
            if f<0:
                b=1
                break
        if b==0:
            R.append(F)
            print(L)
    if lp>0:
        PP=[m for j in range(lp)]
        PPP=[[] for j in range(lp)]
        for i in range(ll):
            a=0
            for j in P:
                if LL[i][j]>0:
                    a+=1
                    if a==2:
                        break
            if a<=1:
                CP=list(set(range(l))-set(P))
                c=sum([LL[i][j]*L[j] for j in CP])
                if a==0 and c<0:
                    z=1
                    break
                if a==1 or (a==0 and c>=0):
                    LLPO=[LL[i][P[k]] for k in range(lp)]
                    for j in range(lp):
                        LLP=LLPO[:]
                        cc=-LLP[j]
                        if cc!=0:
                            del LLP[j]
                            if LLP==[0 for k in range(lp-1)]:
                                PPP[j].append(i)
                                zz=1
                                if cc>0:
                                    rc=c/cc
                                    if rc<PP[j]:
                                        PP[j]=rc
        if z==0 and zz==1:
            zo=0
            for i in range(lp):
                Mi=[]
                if PPP[i]!=[]:
                    for j in range(PP[i]+1):
                        ccc=0
                        coo=copy.deepcopy(co)
                        for k in PPP[i]:
                            s=sum([LL[k][kk]*L[kk] for kk in range(l)])+(j+1)*LL[k][P[i]]
                            V=A[k]
                            for kk in range(la):
                                if V[kk]!=0:
                                    if s>=0 and coo[kk][V[kk]]>=s:
                                        coo[kk][V[kk]]-=s
                                    else:
                                        ccc=1
                                        break
                            if ccc==1:
                                break
                        if ccc==0:
                            Mi.append(j)
                    if len(Mi)<m:
                        zo=1
                        m=len(Mi)
                        M=Mi
                        p=i
            if zo==1:
                M.reverse()
                lM=len(M)
                for jj in range(lM):
                    j=M[jj]
                    cco=copy.deepcopy(co)
                    for k in PPP[p]:
                        s=sum([LL[k][kk]*L[kk] for kk in range(l)])+(j+1)*LL[k][P[p]]
                        V=A[k]
                        for kk in range(la):
                            if V[kk]!=0:
                                cco[kk][V[kk]]-=s
                    LB=L[:]
                    LB[P[p]]=j
                    SmithFormIntegralPointsSuperFiltred(LB,LL,cco,A)
The global variables R and clp are not essential; I can manage without global variables if necessary.
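To illustrate what dropping the globals could look like, here is a rough sketch in plain Python, not the actual Cython function: each call returns the fruits found in its subtree instead of appending to a shared list, so independent subtrees can be explored separately and their results concatenated. The toy tree and the names children, is_fruit, explore and explore_parallel are hypothetical. A thread pool is used only to keep the example self-contained; because of the GIL it will not speed up CPU-bound Python code, and concurrent.futures.ProcessPoolExecutor, which offers the same map interface, would be the natural thing to try instead.

```python
from concurrent.futures import ThreadPoolExecutor

DEPTH = 4  # depth of the toy tree (assumption, for illustration only)

def children(node):
    """Hypothetical tree: each node is a tuple of bits with two children."""
    if len(node) < DEPTH:
        return [node + (0,), node + (1,)]
    return []

def is_fruit(node):
    """Hypothetical fruit test: a leaf whose bits sum to an even number."""
    return len(node) == DEPTH and sum(node) % 2 == 0

def explore(node):
    """Sequential exploration that returns its results (no global R)."""
    found = [node] if is_fruit(node) else []
    for child in children(node):
        found.extend(explore(child))
    return found

def explore_parallel(root, workers=4):
    """Explore each child subtree of the root as a separate task."""
    found = [root] if is_fruit(root) else []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the order of the submitted subtrees.
        for part in pool.map(explore, children(root)):
            found.extend(part)
    return found

print(explore_parallel(()) == explore(()))
```

Splitting only at the root gives as many tasks as the root has children; expanding a few more levels before handing subtrees to the pool would give a finer-grained split.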