
I have a workflow I want to move into Airflow. The workflow is made up of multiple Python scripts and needs to run in a Conda environment I have created.

At the minute I am using the BashOperator to run a shell script which activates my environment and then runs the Python code, but this will get messy as I add more Python scripts to the workflow. This is what I am doing:

with dag:
    dictionary_generator = BashOperator(
        task_id='dictionary_generator',
        # bash -i runs an interactive shell so that 'conda activate' inside dic.sh works;
        # the trailing space stops Jinja from treating the .sh path as a template
        bash_command='bash -i /home/lnxuser/PycharmProjects/DEN/DEN_ns/dic.sh '
    )

The bash command runs the shell script 'dic.sh':

conda activate DEN_env3.9
cd /home/lnxuser/PycharmProjects/DEN/DEN_ns/
python3 Jbn/Ops/_dict_generation.py Job_File 

This shell script activates the environment, changes the directory, and then runs the Python script.

This is not very neat, so I am looking for a way to specify the working directory Airflow uses through the DAG, as well as a method to activate the environment (the environment activation does not have to be specified in the DAG).

I am relatively new to Airflow so any help or suggestions would be appreciated.

tribo32

1 Answer


I managed to get it to work by specifying the Python executable of the Conda environment, as mentioned here: https://stackoverflow.com/a/58440405

bash_command='/Users/<username>/opt/anaconda3/envs/DEN_env3.9/bin/python /path/to/python/script/my_script.py'
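To tie this back to the DAG in the question, a minimal sketch might look like the one below. The conda install path is an assumption (it depends on where Anaconda/Miniconda lives on your machine), and the cwd argument of BashOperator is only available in reasonably recent Airflow 2.x releases; on older versions you can keep a cd in the command instead.

from airflow.operators.bash import BashOperator

# assumed location of the conda environment's interpreter - adjust to your install
ENV_PYTHON = '/home/lnxuser/anaconda3/envs/DEN_env3.9/bin/python'
PROJECT_DIR = '/home/lnxuser/PycharmProjects/DEN/DEN_ns'

with dag:
    dictionary_generator = BashOperator(
        task_id='dictionary_generator',
        # running the script with the environment's own interpreter means no
        # 'conda activate' or wrapper shell script is needed
        bash_command=f'{ENV_PYTHON} Jbn/Ops/_dict_generation.py Job_File',
        cwd=PROJECT_DIR,  # working directory for the command (newer Airflow only)
    )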
Hans Bambel