I'm using papermill in a web app and running execute_notebook inside Celery tasks. I'm logging the output: the entire notebook finishes, the export I'm waiting for shows up in GCS, and it all seems perfect. But the execute_notebook call never returns, so my Celery task never finishes either.
Here's some pared-down code:
import json
import os

import papermill as pm


def execute_nb(parameters: NotebookParameters, product_type: ProductType, notebook_name: str = None):
    try:
        nb_path = f"{settings.NOTEBOOK_DIR_PATH}/{product_type}.ipynb"
        if notebook_name:
            nb_path = f"{settings.NOTEBOOK_DIR_PATH}/{notebook_name}.ipynb"
        nb_content = NOTEBOOK_REPO.get_contents(nb_path, ref=settings.NOTEBOOK_REF).decoded_content
    except Exception as e:
        print(e)
        raise NotebookDoesNotExistException(path=nb_path)

    # Make sure the local notebook folders exist (this also creates NOTEBOOK_DIR_PATH itself)
    os.makedirs(f"{settings.NOTEBOOK_DIR_PATH}/outputs", exist_ok=True)

    # Write the notebook from git to a local file (at the same relative path as in git, from main.py)
    with open(nb_path, "wb") as f:
        f.write(nb_content)

    parameters_dict = json.loads(parameters)
    pm.execute_notebook(
        nb_path,
        f"{settings.NOTEBOOK_DIR_PATH}/outputs/{product_type}_output.ipynb",
        parameters=parameters_dict,
        log_output=True,
    )
    print('done!')
    return True
It never prints 'done!', so my Celery task never finishes. The logs in my container show this:
data-layer-generation-service-worker-1 | [2022-11-16 01:48:43,496: WARNING/ForkPoolWorkerExecuting: 100%|##########| 7/7 [00:39<00:00, 6.36s/cell]
So it's reaching the end. Am I supposed to do something to trigger the end of execute_notebook?
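To narrow down where the call is stuck after the last cell, one option is to have the worker dump its Python stack traces if the call runs past a deadline. The following is only a minimal sketch using the standard-library faulthandler module; the helper name run_with_stack_dump, the dump_path argument, and the timeout value are all hypothetical and not part of papermill or Celery. It writes to a real file rather than stderr on the assumption that the worker may have redirected its standard streams to the logger.

```python
import faulthandler
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_stack_dump(fn: Callable[[], T], dump_path: str, dump_after_seconds: float = 120.0) -> T:
    """Call fn(); if it is still running after the timeout, write the stack
    traces of all threads in this process to dump_path.

    Hypothetical helper for diagnosis only. It does not interrupt fn().
    """
    with open(dump_path, "w") as dump_file:
        # Arm a watchdog that dumps every thread's traceback once the timeout elapses.
        faulthandler.dump_traceback_later(dump_after_seconds, exit=False, file=dump_file)
        try:
            return fn()
        finally:
            # Disarm the watchdog before the file is closed.
            faulthandler.cancel_dump_traceback_later()
```

In the task above, the pm.execute_notebook(...) call could be wrapped as run_with_stack_dump(lambda: pm.execute_notebook(...), "/tmp/execute_nb_stacks.log"), where the log path is just an example. The dump covers only Python frames in the worker process itself, not the notebook kernel's process, so it would show what execute_notebook is waiting on but not what the kernel is doing.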