I'm using papermill in a web app and running execute_notebook inside Celery tasks. I'm logging the output: the entire notebook finishes, I get the export I'm waiting for in GCS, and everything seems perfect. But my execute_notebook call never returns, so my Celery task never finishes either.

Here's some pared down code:

import json
import os
from typing import Optional

import papermill as pm


def execute_nb(parameters: NotebookParameters, product_type: ProductType, notebook_name: Optional[str] = None):
    try:
        nb_path = f"{settings.NOTEBOOK_DIR_PATH}/{product_type}.ipynb"

        if notebook_name:
            nb_path = f"{settings.NOTEBOOK_DIR_PATH}/{notebook_name}.ipynb"
        nb_content = NOTEBOOK_REPO.get_contents(nb_path, ref=settings.NOTEBOOK_REF).decoded_content
    except Exception as e:
        print(e)
        raise NotebookDoesNotExistException(path=nb_path)

    # Make sure the local notebook and output folders exist
    # (creating "outputs" also creates its parent directory if needed)
    os.makedirs(f"{settings.NOTEBOOK_DIR_PATH}/outputs", exist_ok=True)

    # Write the notebook from git to a local file (at the same relative path as git, from main.py)
    with open(nb_path, "wb") as nb_file:
        nb_file.write(nb_content)

    parameters_dict = json.loads(parameters)

    pm.execute_notebook(
        nb_path, f"{settings.NOTEBOOK_DIR_PATH}/outputs/{product_type}_output.ipynb", parameters=parameters_dict, log_output=True
    )
    print('done!')
    return True

The 'done!' line is never printed, so my Celery task never finishes. The logs in my container show this:

data-layer-generation-service-worker-1     | [2022-11-16 01:48:43,496: WARNING/ForkPoolWorkerExecuting: 100%|##########| 7/7 [00:39<00:00,  6.36s/cell]

So it's reaching the end. Am I supposed to do something to trigger the end of execute_notebook?
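
To narrow down where the call is stuck, one option is a sketch like the following, which uses the standard-library faulthandler module to dump every thread's Python traceback if a block runs longer than expected. The helper name dump_stacks_if_stuck, the timeout, and the dump path are hypothetical, and slow_call is only a stand-in for pm.execute_notebook. It writes to a plain file rather than stderr, on the assumption that a worker process may have redirected stderr to something without a real file descriptor.

```python
import faulthandler
import time
from contextlib import contextmanager


@contextmanager
def dump_stacks_if_stuck(timeout_s: float, dump_path: str):
    """Hypothetical helper: if the wrapped block is still running after
    timeout_s seconds, write all thread tracebacks to dump_path."""
    with open(dump_path, "w") as dump_file:
        # Arms a watchdog thread; it writes straight to the file descriptor.
        faulthandler.dump_traceback_later(timeout_s, repeat=False, file=dump_file)
        try:
            yield
        finally:
            # Disarm the watchdog once the block returns (or raises).
            faulthandler.cancel_dump_traceback_later()


def slow_call():
    # Stand-in for pm.execute_notebook(...); it only sleeps.
    time.sleep(0.3)
    return True
```

Wrapping the real call as `with dump_stacks_if_stuck(120, "/tmp/papermill_hang.txt"): pm.execute_notebook(...)` (both values are placeholders) should leave a traceback in that file showing which line the worker is blocked on, if the call really does hang past the timeout.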
