I am trying to create a decorator in which I need some information about the project and/or catalog. Is it possible to access the project context from inside of the decorator? I am looking for things like project_name, catalog entry name, and pipeline node name.

I created a way to find the root directory of the project by getting the function's file path with inspect.getfile(func), then walking up the path until I find .kedro.yml, but this method breaks when the function comes from a library.
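For reference, a minimal sketch of the path-walking approach described above. The helper names find_root_from and find_project_root are made up for illustration and are not part of Kedro; the only assumption is that the project root is the closest directory containing .kedro.yml.

```python
import inspect
from pathlib import Path
from typing import Callable, Optional


def find_root_from(start: Path, marker: str = ".kedro.yml") -> Optional[Path]:
    """Return the closest directory at or above `start` that contains `marker`."""
    start = start.resolve()
    # Begin at the directory itself, or at the parent directory of a file.
    first = start if start.is_dir() else start.parent
    for directory in (first, *first.parents):
        if (directory / marker).is_file():
            return directory
    return None  # reached the filesystem root without finding the marker


def find_project_root(func: Callable, marker: str = ".kedro.yml") -> Optional[Path]:
    """Locate the project root from the file in which `func` is defined."""
    # For a function imported from an installed library this path points into
    # the library's install location, so the walk never passes through the
    # project directory. That is the failure mode mentioned above.
    return find_root_from(Path(inspect.getfile(func)), marker)
```

The search starts from wherever the function's source file lives, which is why it only works for functions defined inside the project tree.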

Waylon Walker

1 Answer

Passing the context around in the nodes is not recommended and does not fit the Kedro architecture (see https://kedro.readthedocs.io/en/latest/06_resources/02_architecture_overview.html). The ProjectContext holds a collection of nodes (i.e. the pipeline); if those nodes in turn held the ProjectContext, you would get into dangerous recursive territory.

The alternatives are to either a) pass the values via parameters and reference them in the node inputs using params:abc, or b) pass the actual values to the decorator.

a) For instance, you could pass project_name in via parameters, either by entering it in parameters.yml or dynamically, by overriding the method _get_feed_dict in the ProjectContext so that it adds an entry {"project_name": self.project_name} to the returned dictionary.
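A rough sketch of the dynamic variant of a). To keep it runnable without a Kedro project, BaseContext below is a stand-in I made up for the real context base class, and its feed dict contents are placeholders; only the shape of the override in ProjectContext is the point. I am assuming here that _get_feed_dict takes no arguments and returns a plain dictionary, so check it against your Kedro version.

```python
from typing import Any, Dict


class BaseContext:
    """Stand-in for the Kedro context base class (NOT the real API)."""

    project_name = "base"

    def _get_feed_dict(self) -> Dict[str, Any]:
        # Placeholder for whatever the real base class builds from parameters.yml.
        return {"parameters": {"abc": 1}, "params:abc": 1}


class ProjectContext(BaseContext):
    project_name = "my_project"  # hypothetical project name

    def _get_feed_dict(self) -> Dict[str, Any]:
        # Start from the base class result and add the extra entry on top.
        feed_dict = super()._get_feed_dict()
        feed_dict["project_name"] = self.project_name
        return feed_dict


feed = ProjectContext()._get_feed_dict()
print(feed["project_name"])  # my_project
```

A node could then list the new key among its inputs like any other entry, assuming the feed dict entries are exposed to nodes under their keys.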

b) You could have a decorator that looks like this and pass it to Node.decorate():

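from functools import partial, wraps
from kedro.pipeline import Pipeline, node


def print_metadata(func, **outer_kwargs):
    """Wrap func so that the bound metadata is printed before every call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        # outer_kwargs holds whatever was bound with partial() below
        print("METADATA: {}".format(outer_kwargs))
        return func(*args, **kwargs)
    return wrapper


p1 = Pipeline([node(...), ...])
# Bind each node's own name and inputs into the decorator before applying it
decorated_nodes = [
    n.decorate(partial(print_metadata, node_name=n.name, catalog_entries=n.inputs))
    for n in p1.nodes
]
# Build a new pipeline from the decorated copies of the nodes
p2 = Pipeline(decorated_nodes)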

Not super pretty, but it can get you unstuck. What is your exact use case? Why do you need the node name and catalog entry name in the node?
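If you want to try the partial trick from b) outside of a Kedro project first, here is a small standalone sketch. The add function, the node name "add_node" and the entries "a" and "b" are invented for the example; the decorator is the same one as above.

```python
from functools import partial, wraps


def print_metadata(func, **outer_kwargs):
    """Wrap func so that the bound metadata is printed before every call."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        print("METADATA: {}".format(outer_kwargs))
        return func(*args, **kwargs)
    return wrapper


def add(a, b):
    return a + b


# partial() fixes the keyword arguments and leaves `func` open, so the result
# is a one-argument decorator, the same shape as the one passed to decorate().
decorator = partial(print_metadata, node_name="add_node", catalog_entries=["a", "b"])
decorated = decorator(add)

result = decorated(2, 3)
# prints: METADATA: {'node_name': 'add_node', 'catalog_entries': ['a', 'b']}
print(result)  # 5
```

Because of wraps, the decorated function keeps the original function's name and docstring.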