When Cythonizing a file and running it via import, an error is presented that claims there is a problem getting the python interpreter state.

I have tried removing the multiprocessing code (calls such as multiprocessing.start()) and switching to Cython's own prange().

I have Googled what the error means, and not a single result has discussed it since I first came across it two years ago.

Versions: Cython 0.29.6, Python 3.7.

Built using:

python3 setup.py build_ext --inplace

where setup.py contains:

from distutils.core import setup
from distutils.extension import Extension
from Cython.Build import cythonize

ext_modules = [Extension("Logic", ["Logic.pyx"], extra_compile_args=['-Ofast'],)]
setup(name="Logic", ext_modules=cythonize("Logic.pyx"))

And here is the problem code. I expected a small list of errors, but an interpreter-state error was not one of them.

import secrets
import os
import time
import Cython

cdef list race = ["Asian", "Black", "White"]
cdef list hair = ["Brown", "Brown", "Brown", "Brown", "Black", "Black", "Black", "Blond", "Blond", "Red"]
cdef list eyes = ["Brown", "Brown", "Brown", "Blue", "Blue", "Blue", "Green", "Green", "Grey", "Hazel"]
cdef list gender = ["Female", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "Female", "Female", "Female", "Female", "Male", "Male", "Male", "Male", "X", "XXY"]
cdef long age_limit_elder = 115
cdef long weight_limit = 400
cdef long age_limit_adult = 60
cdef long age_limit_child = 18
cdef long weight_limit_infant = 15
cdef long age_limit_infant = 4
cdef long population = 4873057333
cdef long infant = 633497453  # 13%
cdef long child = 1461917200  # 30%
cdef long adult = 2095414652  # 43%
cdef long elder = 682228028  # 14%
cdef list infant_population = []
cdef list child_population = []
cdef list adult_population = []
cdef list elder_population = []

cdef list adult_race_pool = []

cpdef dict adult_generation(long adult=adult, long age_limit_adult=age_limit_adult, long weight_limit=weight_limit, list race=race, list hair=hair, list eyes=eyes, list gender=gender, long age_limit_child=age_limit_child):
    cdef long age_ = secrets.choice(range(age_limit_child, age_limit_adult))
    cdef long weight_ = secrets.choice(range(90, weight_limit))
    cdef str race_ = secrets.choice(race)
    cdef str hair_ = secrets.choice(hair)
    cdef str eyes_ = secrets.choice(eyes)
    cdef str gender_ = secrets.choice(gender)
    # adult_population.append({"AGE": age_, "WEIGHT": weight_, "RACE": race_, "HAIR": hair_, "EYES": eyes_, "GENDER": gender_})
    return {"AGE": age_, "WEIGHT": weight_, "RACE": race_, "HAIR": hair_, "EYES": eyes_, "GENDER": gender_}

# adult_generation()
# cdef long adult_processes = adult // 4
# adult_processes = 10000
# cdef long adult_processes = 100000
cdef long adult_processes = adult  # Fallback
cdef long adult_percent = adult_processes // 10000
cdef double percent = 0
cdef long current = 0
cdef double t1 = time.time()
cdef double master = t1
cdef long i

for i in Cython.parallel.prange(adult_processes, nogil=True):
    if current == adult_percent:
        percent += 0.01
        t2 = time.time()
        t3 = t2 - t1
        os.system("tput reset")
        print(f"{percent}% Complete | {current} Profiles in {t3} Seconds")
        current = 0
        t1 = time.time()
    adult_population.append(adult_generation())
    current += 1

cdef long F = 0
cdef long M = 0
cdef long X = 0
cdef long XXY = 0

cdef dict item

for item in adult_population:
    if item["GENDER"] == "Female":
        F += 1
    elif item["GENDER"] == "Male":
        M += 1
    elif item["GENDER"] == "X":
        X += 1
    elif item["GENDER"] == "XXY":
        XXY += 1
    else:
        pass


print(f"FEMALE: {F}")
print(f"MALE: {M}")
print(f"X: {X}")
print(f"XXY: {XXY}")
print(f"Total Time: {time.time() - master} Seconds")

The result should just be a giant list of pseudo random people like in the Python version but without the odd 70% CPU limit on all threads/cores.

However, in Cython, when made parallel or multiprocessed it returns "undefined symbol: PyInterpreterState_GetID" on import and just quits.

Testing was done with smaller numbers as the current numbers would take 146 days of processing... Changing the number of results still produces an error.

I also expected it to get hung up at the "cdef dict item" line, but again, it never even gets that far.

Promus Aster
  • On SO, the asker is supposed to provide a [mcve]. In your case that means: reduce the example (90% of the code is not needed to trigger the issue) and give a precise description of how the extension was built. Also add the Python and Cython versions with which it was built and the Python version in which you try to import the extension. – ead Apr 07 '19 at 04:33
  • I tried making the code smaller but it produced unrelated errors or was pythonic enough that the base interpreter handled it, producing no errors again since it was just running it through python. – Promus Aster Apr 07 '19 at 06:44
  • At the very least, you're appending to a Python list in a nogil block. First, this shouldn't work. Second, modifying global variables in parallel code is something you need to do _very_ carefully (and you need to understand why it's a bad idea) – DavidW Apr 07 '19 at 06:51
  • It is a tell that your example isn’t minimal if there are several issues. However, you don’t tell us how you build it (there are several ways of doing it) and which Python version is used when it is imported (are you sure it is 3.7?) – ead Apr 07 '19 at 07:05
  • I set up the env to use only 3.7 in PyCharm. I know it will run into other errors, and it should: the nogil will fail because it is interacting with Python objects/operations, and so will the dict, since it isn't really a dict... This StateID error was unexpected. The build method will be added in a few seconds. – Promus Aster Apr 07 '19 at 07:38
  • This question doesn't make any sense: if there are errors while cythonizing, there is no resulting so-object, thus you cannot see those errors while importing. In this shape the question isn't useful and is only polluting Google's search results. – ead Apr 08 '19 at 12:55
  • I think part of your problem is that `prange` isn't really an `import` despite looking like one. You appear to have to use the exact line `from cython.parallel import prange` (I think it's a bug that it needs to be that...). If you do that you'll get appropriate errors about what is suitable for `nogil`. I have no idea where your undefined symbol came from though – DavidW Apr 09 '19 at 21:15

1 Answer

I can't say this is a definitive answer, but I noticed that the Python error output was suspiciously short. I took a little time and a gamble and compiled the Cython code directly into an embedded (standalone) program. That build produced the full error output. It was a mess, but it still led somewhere.

The interpreter state seemed to have something to do with threading/multiprocessing, so I tweaked that part and found the error output complaining about the GIL. Apparently the error handed to Python came from Cython but was not shown to the user until the code was compiled as a standalone "C" app.

This error seems to center on the GIL: the output complains that the GIL cannot be released around code that needs it, which is, namely, every piece of Python code except return, yield, print, simple math, etc.

So prange and its parallel ability are very close to useless to someone wanting to speed up a pure Python script. Anything relying on def, open, the threading/multiprocessing modules, or really anything Pythonic will produce this error. Anyone testing their Cython code through a Python import will simply hit a dead end, with a one-line error that does not come up in Google search results.

The fix for the prange error requires rewriting everything the loop interacts with in C, or importing something that is implemented in C instead of Python. This again makes it less useful to a pure Python developer.

In the end, I switched to the map function, which allows multiprocessing without the overhead of spawning so many processes (the total would have been near 3 billion).
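
The map function is not named above. Assuming it refers to `multiprocessing.Pool.map` from the standard library, a minimal sketch of that approach could look like the following. The attribute lists are shortened for illustration (so the weights differ from the question's lists), and `generate_adult`, `count_genders`, and `main` are hypothetical names, not the original code.

```python
import secrets
from collections import Counter
from multiprocessing import Pool

# Attribute pools shortened for illustration; the question uses longer,
# weighted lists, so the resulting distribution here is different.
RACE = ["Asian", "Black", "White"]
HAIR = ["Brown", "Black", "Blond", "Red"]
EYES = ["Brown", "Blue", "Green", "Grey", "Hazel"]
GENDER = ["Female", "Male", "X", "XXY"]

AGE_LIMIT_CHILD = 18
AGE_LIMIT_ADULT = 60
WEIGHT_LIMIT = 400


def generate_adult(_index):
    """Build one pseudo-random adult profile; the index argument is ignored."""
    return {
        "AGE": secrets.choice(range(AGE_LIMIT_CHILD, AGE_LIMIT_ADULT)),
        "WEIGHT": secrets.choice(range(90, WEIGHT_LIMIT)),
        "RACE": secrets.choice(RACE),
        "HAIR": secrets.choice(HAIR),
        "EYES": secrets.choice(EYES),
        "GENDER": secrets.choice(GENDER),
    }


def count_genders(profiles):
    """Tally the GENDER field across an iterable of profiles."""
    return Counter(profile["GENDER"] for profile in profiles)


def main(total=10_000, workers=4):
    # A fixed pool of worker processes shares the work in chunks, so the
    # process count stays at `workers` no matter how large `total` is.
    with Pool(processes=workers) as pool:
        profiles = pool.map(generate_adult, range(total), chunksize=1_000)
    counts = count_genders(profiles)
    for label in GENDER:
        print(f"{label}: {counts[label]}")


if __name__ == "__main__":
    main()
```

`Pool.map` collects every result into one list, which would not fit in memory at the population sizes in the question. For large runs, `Pool.imap_unordered` with a running tally is one way to avoid holding all profiles at once.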

Promus Aster