4

I have a Flow in prefect whose with a task whose output is a dataframe. In the example provided below it always fails. I would like the task to return an empty dataframe with the state of SUCCESS using @task(on_failure=handle_task_fail). What is the correct syntax to achieve this?

from pprint import pprint
import pandas as pd

from prefect import Flow, task
from prefect.engine.signals import SUCCESS


def handle_disambig_error(task, old_state, new_state):
    if new_state.is_failed():
        new_state.result["wiki_df"] = pd.DataFrame()

        # Is this needed?
        #set state to SUCCESS
    return new_state


@task(on_failure=handle_disambig_error)
def get_wiki_resource():

    wiki_df = pd.DataFrame(
        {
            "a":[1],
            "b":[1/0]
        }
    )

    return wiki_df

with Flow("Always Fail") as flow:
    wiki_df = get_wiki_resource()

state = flow.run()
task_state = state.result[wiki_df]
pprint(task_state.result)

Traceback:

Traceback (most recent call last):
  File "/miniconda3/lib/python3.7/site-packages/prefect/engine/runner.py", line 161, in handle_state_change
    new_state = self.call_runner_target_handlers(old_state, new_state)
  File "/miniconda3/lib/python3.7/site-packages/prefect/engine/task_runner.py", line 120, in call_runner_target_handlers
    new_state = handler(self.task, old_state, new_state) or new_state
  File "/miniconda3/lib/python3.7/site-packages/prefect/utilities/notifications.py", line 69, in state_handler
    fn(obj, new_state)
TypeError: handle_disambig_error() missing 1 required positional argument: 'new_state'
[2020-01-28 17:39:41,759] INFO - prefect.TaskRunner | Task 'get_wiki_resource': finished task run for task with final state: 'Failed'
[2020-01-28 17:39:41,762] INFO - prefect.FlowRunner | Flow run FAILED: some reference tasks failed.

Some places I searched State Handlers, Logging with a State Handler

Vadim Kotov
  • 8,084
  • 8
  • 48
  • 62
Itay Livni
  • 2,143
  • 24
  • 38

1 Answers1

4

There are two things going on here:

1.) Generic State Handlers: these can be set via the state_handlers kwarg and will be called on every state change. A state handler is required to have a signature state_handler(task: Task, old_state: State, new_state: State) -> Optional[State] (which is the signature you are using); the Task's state after calling this handler will be the state which is returned from the handler, or the new_state if None is returned.

2.) On Failure callbacks: the on_failure kwarg that you are using here is intended to be a convenience API for state handlers; functions which are passed to this keyword are required to have signature fn(task: Task, state: State) -> None and will only be called whenever this Task enters a Failed state. Note that on failure callbacks cannot alter a Task's state in the way that state handlers can.

In your example, you appear to be mixing the two keyword arguments. I believe the following code will do what you're expecting:

from prefect.engine.state import Success


def handle_disambig_error(task, old_state, new_state):
    if new_state.is_failed():
        return_state = Success(result=pd.DataFrame())
    else:
        return_state = new_state
    return return_state

@task(state_handlers=[handle_disambig_error])
def get_wiki_resource():
   return df
Itay Livni
  • 2,143
  • 24
  • 38
chriswhite
  • 1,370
  • 10
  • 21