
I have a DAG that tries to connect to an HTTP endpoint, but somehow it doesn't work. I defined the connection using an environment variable, but it is not seen by the HttpSensor and I am getting the error below, even though the variable was created on the system. What's wrong here? The DAG and full error are below.

The conn_id `AIRFLOW_VAR_FOO` isn't defined

DAG:

import os
import json
import pprint
import datetime
import requests

from airflow.models import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.sftp.operators.sftp import SFTPOperator
from airflow.providers.sftp.sensors.sftp import SFTPSensor
from airflow.utils.dates import days_ago
from airflow.models import Variable
from airflow.sensors.http_sensor import HttpSensor
from airflow.hooks.base_hook import BaseHook

def init_vars():
    os.environ['AIRFLOW_VAR_FOO'] = "https://mywebxxx.net/"
    print(os.environ['AIRFLOW_VAR_FOO'])

with DAG(
         dag_id='request_test',
         schedule_interval=None,
         start_date=days_ago(2)) as dag:

    init_vars = PythonOperator(task_id="init_vars",
                                  python_callable=init_vars)

    task_is_api_active = HttpSensor(
        task_id='is_api_active',
        http_conn_id='AIRFLOW_VAR_FOO',
        endpoint='post'
    )

    get_data = PythonOperator(task_id="get_data",
                                  python_callable=get_data)

    init_vars >> task_is_api_active

Full Error:

 File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/connection.py", line 379, in get_connection_from_secrets
    raise AirflowNotFoundException(f"The conn_id `{conn_id}` isn't defined")
airflow.exceptions.AirflowNotFoundException: The conn_id `AIRFLOW_VAR_FOO` isn't defined
[2022-11-04 10:32:41,720] {taskinstance.py:1551} INFO - Marking task as FAILED. dag_id=request_test, task_id=is_api_active, execution_date=20221104T103235, start_date=20221104T103240, end_date=20221104T103241
[2022-11-04 10:32:42,628] {local_task_job.py:149} INFO - Task exited with return code 1

EDIT:

import os
import json
import pprint
import datetime
import requests

from airflow.models import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.sftp.operators.sftp import SFTPOperator
from airflow.providers.sftp.sensors.sftp import SFTPSensor
from airflow.utils.dates import days_ago
from airflow.models import Variable
from airflow.sensors.http_sensor import HttpSensor
from airflow.hooks.base_hook import BaseHook

def init_vars():
    os.environ['AIRFLOW_VAR_FOO'] = "https://mywebxxx.net/"
    print(os.environ['AIRFLOW_VAR_FOO'])

with DAG(
         dag_id='request_test',
         schedule_interval=None,
         start_date=days_ago(2)) as dag:

    init_vars = PythonOperator(task_id="init_vars",
                                  python_callable=init_vars)

    call init_vars()
    
    task_is_api_active = HttpSensor(
        task_id='is_api_active',
        http_conn_id='AIRFLOW_VAR_FOO',
        endpoint='post'
    )

    get_data = PythonOperator(task_id="get_data",
                                  python_callable=get_data)

    task_is_api_active
Henry

1 Answer


You need to define the environment variable as `AIRFLOW_CONN_VAR_FOO` and then set `http_conn_id="var_foo"`.

For more details, see this link.
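
As a rough sanity check, here is a minimal sketch (assuming Airflow 2.x and reusing the URL from the question) of how the lookup behaves: Airflow strips the `AIRFLOW_CONN_` prefix and lower-cases the remainder to form the conn_id.

import os
from airflow.hooks.base_hook import BaseHook  # same import path as in the question

# The variable has to exist in the process that resolves the connection,
# and its value must be a valid connection URI.
os.environ['AIRFLOW_CONN_VAR_FOO'] = "https://mywebxxx.net/"

# The AIRFLOW_CONN_ prefix is stripped and the rest is lower-cased,
# so this connection is looked up as "var_foo".
conn = BaseHook.get_connection("var_foo")
print(conn.host)  # mywebxxx.net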

def init_vars():
    # Expose the connection through an env var named AIRFLOW_CONN_<CONN_ID>
    os.environ['AIRFLOW_CONN_VAR_FOO'] = "https://mywebxxx.net/"
    print(os.environ['AIRFLOW_CONN_VAR_FOO'])

with DAG(
         dag_id='request_test',
         schedule_interval=None,
         start_date=days_ago(2)) as dag:

    # Call the function directly at DAG-parse time instead of wrapping it in
    # a PythonOperator, so the variable is set in the same process that
    # resolves the sensor's connection.
    init_vars()

    task_is_api_active = HttpSensor(
        task_id='is_api_active',
        http_conn_id='var_foo',
        endpoint='post'
    )

    # Note: the get_data callable is not shown in the question.
    get_data = PythonOperator(task_id="get_data",
                              python_callable=get_data)

    task_is_api_active
ozs
  • I've done this and I am still getting: `File "/home/airflow/.local/lib/python3.7/site-packages/airflow/models/connection.py", line 379, in get_connection_from_secrets raise AirflowNotFoundException(f"The conn_id {conn_id} isn't defined") airflow.exceptions.AirflowNotFoundException: The conn_id var_foo isn't defined` – Henry Nov 04 '22 at 13:29
  • by doing: `os.environ['AIRFLOW_CONN_VAR_FOO'] = "https://mywebaddres"` and then http_conn_id="var_foo" – Henry Nov 04 '22 at 13:30
  • Call init_vars as-is (not as part of a PythonOperator). – ozs Nov 04 '22 at 13:54
  • Where exactly? I don't understand. – Henry Nov 09 '22 at 19:40
  • Instead of `init_vars = PythonOperator(task_id="init_vars", ... )`, just call `init_vars()` (the Python method). – ozs Nov 10 '22 at 10:41
  • So, simply speaking, replace this line: `init_vars = PythonOperator(task_id="init_vars", ... )` with `call init_vars()` and remove this: `init_vars >>` ?? – Henry Nov 11 '22 at 08:52
  • @Henry, yes, exactly. – ozs Nov 11 '22 at 10:20
  • OK, I will check it later, but I don't understand what the difference would be, since the PythonOperator calls Python code and is then triggered as the first task. What's the difference in this approach? – Henry Nov 11 '22 at 10:59
  • I did as stated above, meaning I replaced this: `init_vars = PythonOperator(task_id="init_vars", ... )` with `call init_vars()`. When I made this change I got an error: `call init_vars() ^ SyntaxError: invalid syntax` – Henry Nov 22 '22 at 09:25
  • @Henry, can you share the full code please? 10x – ozs Nov 22 '22 at 09:29
  • See the EDIT section in the main post. – Henry Nov 22 '22 at 09:34
  • @Henry, remove "call". – ozs Nov 22 '22 at 10:20
  • Seems like it works now; nevertheless, I don't understand the difference between calling the method directly and using the PythonOperator as I had before. Can you explain? – Henry Nov 22 '22 at 11:34
  • @Henry, I am not sure, but Airflow might encapsulate each Python environment (like a venv), which is why you cannot share values through os.environ, since it is not the machine's environment. – ozs Nov 22 '22 at 12:23
  • So that was just your guess, to replace the PythonOperator with a regular method call? Second question: is there any difference between just calling the Python method like we did and calling that method using the PythonOperator? – Henry Nov 22 '22 at 15:48
  • @Henry, yes. Airflow parses the file every 30 seconds (by default) and actually runs this code; the code in an operator runs only when the DAG is triggered by its schedule. The best option in your case is to define the connection in Airflow connections, via airflow.cfg or in docker-compose if you use it that way. – ozs Nov 22 '22 at 16:57
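
A minimal sketch of the parse-time versus run-time point from the last comment (the function name below is illustrative, not from the original DAG):

import os

# Top level of the DAG file: runs every time the scheduler or worker parses
# the file, in the same process that later resolves the HttpSensor's
# connection, so the variable is visible to the sensor.
os.environ['AIRFLOW_CONN_VAR_FOO'] = "https://mywebxxx.net/"

def set_conn_in_task():
    # Body of a PythonOperator callable: runs only inside the worker process
    # of that single task instance; the change to os.environ disappears when
    # the task process exits and is never seen by the HttpSensor task.
    os.environ['AIRFLOW_CONN_VAR_FOO'] = "https://mywebxxx.net/"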