
While running a DAG that runs a JAR inside a Docker image, xcom_push=True is given, which creates another container alongside the worker container in a single pod.

DAG:

jar_task = KubernetesPodOperator(
    namespace='test',
    image="path to image",
    image_pull_secrets="secret",
    image_pull_policy="Always",
    node_selectors={"d-type": "na-node-group"},
    cmds=["sh", "-c", ..~running jar here~..],
    secrets=[secret_file],
    env_vars=environment_vars,
    labels={"k8s-app": "airflow"},
    name="airflow-pod",
    config_file=k8s_config_file,
    resources=pod.Resources(request_cpu=0.2, limit_cpu=0.5, request_memory='512Mi', limit_memory='1536Mi'),
    in_cluster=False,
    task_id="run_jar",
    is_delete_operator_pod=True,
    get_logs=True,
    xcom_push=True,
    dag=dag)

Here are the errors, even though the JAR executes successfully:

    [2018-11-27 11:37:21,605] {{logging_mixin.py:95}} INFO - [2018-11-27 11:37:21,605] {{pod_launcher.py:166}} INFO - Running command... cat /airflow/xcom/return.json
    [2018-11-27 11:37:21,605] {{logging_mixin.py:95}} INFO - 
    [2018-11-27 11:37:21,647] {{logging_mixin.py:95}} INFO - [2018-11-27 11:37:21,646] {{pod_launcher.py:173}} INFO - cat: can't open '/airflow/xcom/return.json': No such file or directory
    [2018-11-27 11:37:21,647] {{logging_mixin.py:95}} INFO - 
    [2018-11-27 11:37:21,647] {{logging_mixin.py:95}} INFO - [2018-11-27 11:37:21,647] {{pod_launcher.py:166}} INFO - Running command... kill -s SIGINT 1
    [2018-11-27 11:37:21,647] {{logging_mixin.py:95}} INFO - 
    [2018-11-27 11:37:21,702] {{models.py:1760}} ERROR - Pod Launching failed: Failed to extract xcom from pod: airflow-pod-hippogriff-a4628b12
    Traceback (most recent call last):
      File "/usr/local/airflow/operators/kubernetes_pod_operator.py", line 126, in execute
        get_logs=self.get_logs)
      File "/usr/local/airflow/operators/pod_launcher.py", line 90, in run_pod
        return self._monitor_pod(pod, get_logs)
      File "/usr/local/airflow/operators/pod_launcher.py", line 110, in _monitor_pod
        result = self._extract_xcom(pod)
      File "/usr/local/airflow/operators/pod_launcher.py", line 161, in _extract_xcom
        raise AirflowException('Failed to extract xcom from pod: {}'.format(pod.name))
    airflow.exceptions.AirflowException: Failed to extract xcom from pod: airflow-pod-hippogriff-a4628b12

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/local/lib/python3.6/site-packages/airflow/models.py", line 1659, in _run_raw_task
        result = task_copy.execute(context=context)
      File "/usr/local/airflow/operators/kubernetes_pod_operator.py", line 138, in execute
        raise AirflowException('Pod Launching failed: {error}'.format(error=ex))
    airflow.exceptions.AirflowException: Pod Launching failed: Failed to extract xcom from pod: airflow-pod-hippogriff-a4628b12
    [2018-11-27 11:37:21,704] {{models.py:1789}} INFO - All retries failed; marking task as FAILED
Deep Nirmal

3 Answers


If xcom_push is True, then KubernetesPodOperator creates one more sidecar container (airflow-xcom-sidecar) in the pod alongside the base container (the actual worker container). This sidecar container reads data from /airflow/xcom/return.json and returns it as the XCom value. So in your base container you need to write the data you want to return to the /airflow/xcom/return.json file.
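For illustration, a minimal sketch of what the base container needs to do before it exits (the result dict and command string below are hypothetical; the sidecar simply runs `cat /airflow/xcom/return.json` and Airflow parses the output as JSON):

```python
import json

# Hypothetical result produced by the JAR / worker step.
result = {"rows_processed": 1024, "status": "ok"}

# Serialize it as JSON -- the sidecar returns the raw file contents,
# and Airflow parses them as JSON.
xcom_payload = json.dumps(result)

# One way to write it from a shell-based `cmds` entry, appended after the
# JAR invocation (the mkdir guards against the directory not existing yet):
write_cmd = "mkdir -p /airflow/xcom && echo '{}' > /airflow/xcom/return.json".format(xcom_payload)
print(write_cmd)
```

In the DAG above, that string would be appended to the existing `sh -c` command after the JAR run, e.g. `<run jar> && <write_cmd>`.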

sdvd

I want to point out an error I faced regarding XCom and KubernetesPodOperator, although it did not have the same cause as the OP's. Just in case anyone stumbles on this question, since it is the only one regarding KPO and XCom.

I am using Google Cloud Platform (GCP) Cloud Composer, which runs a slightly older Airflow version than the latest. So when I referred to the official GitHub, it said to use do_xcom_push, whereas the older Airflow uses the argument xcom_push instead!
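One defensive way to handle the rename, sketched below (a hypothetical helper, not part of Airflow; it inspects whichever operator class is installed instead of hard-coding a version cutoff):

```python
import inspect

def xcom_flag_name(operator_cls) -> str:
    """Return 'do_xcom_push' if the installed operator accepts it,
    falling back to the older 'xcom_push' argument name."""
    params = inspect.signature(operator_cls.__init__).parameters
    return "do_xcom_push" if "do_xcom_push" in params else "xcom_push"

# Stand-in classes to show the behaviour without importing Airflow:
class NewStyleOperator:
    def __init__(self, do_xcom_push=False): ...

class OldStyleOperator:
    def __init__(self, xcom_push=False): ...

print(xcom_flag_name(NewStyleOperator))  # do_xcom_push
print(xcom_flag_name(OldStyleOperator))  # xcom_push
```

With the right name in hand, the flag can be passed as `**{xcom_flag_name(KubernetesPodOperator): True}`.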

cryanbhu

This happened because the result of the task execution is not being pushed to XCom at the path expected by the KubernetesPodOperator. Take a look at the following unit test from the Airflow repository to see how it should be implemented (the source snippet is included below for convenience, followed by the link to the repository):

    def test_xcom_push(self):
        return_value = '{"foo": "bar"\n, "buzz": 2}'
        k = KubernetesPodOperator(
            namespace='default',
            image="ubuntu:16.04",
            cmds=["bash", "-cx"],
            arguments=['echo \'{}\' > /airflow/xcom/return.json'.format(return_value)],
            labels={"foo": "bar"},
            name="test",
            task_id="task",
            xcom_push=True
        )
        self.assertEqual(k.execute(None), json.loads(return_value))

https://github.com/apache/incubator-airflow/blob/36f3bfb0619cc78698280f6ec3bc985f84e58343/tests/contrib/minikube/test_kubernetes_pod_operator.py#L321

edit: it is worth mentioning that the result pushed to XCom must be valid JSON.
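To illustrate why (a sketch of the parsing step, not Airflow's actual code): the value read back from /airflow/xcom/return.json goes through a JSON parser, so anything that does not parse as JSON fails:

```python
import json

valid = '{"foo": "bar", "buzz": 2}'
print(json.loads(valid))  # parses to a dict

invalid = "plain text, not JSON"
try:
    json.loads(invalid)
except json.JSONDecodeError as err:
    print("rejected:", err)
```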

jfunez
  • Hey! I'm trying to run my code as in your example, but the Kubernetes operator (GKEPodOperator in my case) does not return any value, just None. The path I specify is correct and the JSON value is written to result.json. Do you know what the issue could be? – Zarial Oct 12 '19 at 11:26
  • edit: it is worth mentioning that the result pushed to the xcom must be a json. You saved my life – Tan Sang Dec 16 '20 at 08:33
  • THE JSON COMMENT IS GOLD. you saved my life too. – Alex Ingberg Aug 28 '23 at 13:30