Hi, in my Airflow DAG I'd like to set some operator arguments directly from the values passed with --conf when the DAG is triggered, i.e.,

{{ dag_run.conf['key'] }}
#or
context['dag_run'].conf['key']

For example, suppose I have a DAG that emails a recipient, and I want to set the 'to' argument of the EmailOperator task from conf. I found that I can do this with Variable.set() in a PythonOperator and Variable.get() in the EmailOperator arguments.

with DAG("my-dag") as dag:
    ping = SimpleHttpOperator(endpoint="http://example.com/update/")
    email = EmailOperator(to="admin@example.com", subject="Update complete")

    ping >> email

With the Variable workaround, the DAG becomes:

def conf_func(**context):
    # Copy the address from this run's conf into a global Airflow Variable.
    Variable.set("email_address", context["dag_run"].conf["email_address"])

with DAG("my-dag") as dag:
    ping = SimpleHttpOperator(endpoint="http://example.com/update/")
    set_variable = PythonOperator(python_callable=conf_func, provide_context=True)
    email = EmailOperator(to=Variable.get("email_address"), subject="Update complete")

    ping >> set_variable >> email

However, some folks on Stack Overflow have pointed out that using Variable in this fashion is bad practice. Since an Airflow Variable is global rather than specific to each DAG run, multiple runs of the same DAG that set and retrieve different values under the same Variable key can interfere with each other. I'm guessing bad things also happen if different DAGs use the same Variable key.
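
The concern can be reproduced without Airflow. The sketch below is only an illustration: a plain dict stands in for the global Variable store, and the names `variable_store`, `set_variable_task`, `email_task_shared`, and `email_task_per_run` are hypothetical, not part of Airflow. Two runs are interleaved so that both write the shared key before either reads it.

```python
# Hypothetical stand-in for Airflow's global Variable store (not the real API).
variable_store = {}


def set_variable_task(run_conf):
    """Mimics the PythonOperator step: copy this run's conf into the shared store."""
    variable_store["email_address"] = run_conf["email_address"]


def email_task_shared():
    """Mimics the email step reading the recipient from the shared store."""
    return variable_store["email_address"]


def email_task_per_run(run_conf):
    """Reads the recipient from the run's own conf, as a templated field would."""
    return run_conf["email_address"]


run_a = {"email_address": "a@example.com"}
run_b = {"email_address": "b@example.com"}

# Interleaved schedule: both runs set the variable before either one sends.
set_variable_task(run_a)
set_variable_task(run_b)
recipient_a_shared = email_task_shared()  # run A now sees run B's address
recipient_b_shared = email_task_shared()

# Reading from each run's own conf keeps the values separate.
recipient_a_per_run = email_task_per_run(run_a)
recipient_b_per_run = email_task_per_run(run_b)
```

In this schedule run A ends up with run B's address when it goes through the shared store, while the per-run lookup keeps the two apart.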

That being said, are there any better ways to pass values from dag_run conf into the parameter fields of Airflow tasks? (The values may include JSON types other than strings, such as lists and dictionaries.)

Edit: Updated to include the code for my custom operator, based on this post:

class ExtenedK8sPodOperator(KubernetesPodOperator):
    template_fields = (*KubernetesPodOperator.template_fields, 'volumes', 'volume_mounts', 'V1Toleration')

    def __init__(self, **kwargs):
        vols = kwargs.get('volumes', [])
        for vol in vols:
            vol.template_fields = ('name',)
        kwargs['volumes'] = vols

        vol_mounts = kwargs.get('volume_mounts', [])
        for vol_mount in vol_mounts:
            vol_mount.template_fields = ('name',)
        kwargs['volume_mounts'] = vol_mounts
        super().__init__(**kwargs)
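
One plain-Python detail matters when declaring `template_fields` with a single entry: parentheses alone do not make a tuple, the trailing comma does. The short sketch below (variable names are illustrative only) shows what iterating over each form yields.

```python
not_a_tuple = ('name')   # parentheses only group; this is the string 'name'
one_tuple = ('name',)    # the trailing comma makes a one-element tuple

# Code that loops over template_fields would see these field names:
fields_from_string = [field for field in not_a_tuple]
fields_from_tuple = [field for field in one_tuple]
```

Iterating over the string form yields the individual characters, so anything that treats the entries as attribute names would look for attributes called 'n', 'a', 'm', and 'e'.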

hamhung
  • That works, but when I tried applying it to Airflow objects that are not operators, I ran into an issue with the Jinja template rendering. E.g., if I try ```with DAG("my-dag") as dag: foo = "{{dag_run.conf['email_address']}}"```, foo is assigned the literal string {{dag_run.conf['email_address']}} instead of the actual value behind the 'email_address' key in the dag run conf – hamhung Feb 23 '23 at 18:54
  • Could you describe your use case why you want to get a value outside of operators/tasks? – Emma Feb 23 '23 at 19:02
  • @Emma my use case is nearly identical to the one in this [post](https://stackoverflow.com/questions/70150198/airflow-unable-to-use-jinja-template-for-resources-in-kubernetes-pod-operator). I.e., being able to set resources or other parameters directly from conf – hamhung Feb 23 '23 at 19:10
  • My hunch is that the workaround is to create a custom operator if I want to extend Jinja templating to fields that are not templated by default, e.g. the resources parameter in a Kubernetes Pod Operator – hamhung Feb 23 '23 at 19:28
  • Sure, if you need to pass a template to a non-templated or additional field, you need to create a custom operator. However, you are not showing that in the question. – Emma Feb 23 '23 at 19:31
  • I updated my post. I tried creating a custom operator to extend templating to the Volume, VolumeMount, and V1Toleration objects that can be used with a KubernetesPodOperator. But I'm getting an error that a VolumeMount object has no template_fields attribute - does that mean it is an object that can't use templates even if customized? – hamhung Feb 23 '23 at 21:04
  • Airflow v2.5.1 added `volumes` and `volume_mounts` to `template_fields`. If you can upgrade to it, you only need to add `V1Toleration` handling to your custom operator. You can follow this code to see how to do it. If you cannot upgrade, you can also refer to it to create your custom operator: https://github.com/apache/airflow/pull/27719/files P.S. I would recommend deleting the earlier `EmailOperator` material from the question, as it is irrelevant to your actual question ("how can I extend KubernetesPodOperator"), and updating the title for future users who are looking for the same thing. – Emma Feb 23 '23 at 22:05
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/252093/discussion-between-hamhung-and-emma). – hamhung Feb 23 '23 at 23:14

1 Answer

Airflow v2.5.1+

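# Import paths match the Airflow 2.5.x era; newer cncf.kubernetes provider
# releases expose the operator from ...operators.pod instead.
from typing import Any, Sequence

import jinja2
from kubernetes.client import models as k8s

from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator
from airflow.utils.context import Context


class KubernetesPodExtendedOperator(KubernetesPodOperator):

    template_fields: Sequence[str] = (
        "image",
        "cmds",
        "arguments",
        "env_vars",
        "labels",
        "config_file",
        "pod_template_file",
        "namespace",
        "container_resources",
        "volumes",
        "volume_mounts",
        "tolerations",  # extend to add tolerations
    )

    def _render_nested_template_fields(
        self,
        content: Any,
        context: Context,
        jinja_env: jinja2.Environment,
        seen_oids: set,
    ) -> None:
        if id(content) not in seen_oids:
            # Must be initialised: content is usually not a V1Toleration.
            template_fields = None

            if isinstance(content, k8s.V1Toleration):
                # List the attributes of the object that should be templated
                template_fields = ("key", "value")

            if template_fields:
                seen_oids.add(id(content))
                self._do_render_template_fields(content, template_fields, context, jinja_env, seen_oids)
                return

        # Everything else is handled by the parent operator as before.
        super()._render_nested_template_fields(content, context, jinja_env, seen_oids)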

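If you are on a lower version, handle volumes and volume_mounts the same way as tolerations: add them to template_fields and add a matching isinstance branch for each object type in _render_nested_template_fields.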
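
For intuition about what the override does, here is a toy model of the type-based dispatch in plain Python. It is a sketch under stated assumptions, not Airflow's implementation: `Toleration` is a hypothetical stand-in for `k8s.V1Toleration`, `NESTED_TEMPLATE_FIELDS` and `render` are made-up names, and the placeholder syntax `{{ conf.name }}` is a simplified substitute for real Jinja rendering.

```python
import re
from dataclasses import dataclass


@dataclass
class Toleration:
    """Hypothetical stand-in for k8s.V1Toleration."""
    key: str
    operator: str
    value: str


# Which attributes to render for each nested type (plays the role of the isinstance branch).
NESTED_TEMPLATE_FIELDS = {Toleration: ("key", "value")}

# Simplified placeholder syntax, assumed for this sketch only: {{ conf.some_key }}
PLACEHOLDER = re.compile(r"\{\{\s*conf\.(\w+)\s*\}\}")


def render(content, conf, seen_oids):
    """Recursively render placeholders in strings, lists, and known object types."""
    if isinstance(content, str):
        return PLACEHOLDER.sub(lambda m: str(conf[m.group(1)]), content)
    if isinstance(content, list):
        return [render(item, conf, seen_oids) for item in content]
    fields = NESTED_TEMPLATE_FIELDS.get(type(content))
    if fields and id(content) not in seen_oids:
        seen_oids.add(id(content))  # avoid rendering the same object twice
        for name in fields:
            setattr(content, name, render(getattr(content, name), conf, seen_oids))
    return content


tolerations = [
    Toleration(key="{{ conf.toleration_key }}", operator="Equal", value="{{ conf.toleration_value }}")
]
run_conf = {"toleration_key": "dedicated", "toleration_value": "gpu"}
rendered = render(tolerations, run_conf, set())
```

Only the attributes listed for a type are rendered, which is why an object type with no registered fields passes through with its placeholders untouched.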

How to use

tolerations = [
    k8s.V1Toleration(key="{{dag_run.conf['toleration_key']}}", 
                     operator="Equal",
                     value="{{dag_run.conf['toleration_value']}}")
]

dry_run = KubernetesPodExtendedOperator(
    name="hello-dry-run",
    ...
    tolerations=tolerations
)
Emma