
GCP documentation distinguishes between two kinds of logs for Composer:

Cloud Composer has the following Airflow logs:

Airflow logs: These logs are associated with single DAG tasks. You can view the task logs in the Cloud Storage logs folder associated with the Cloud Composer environment. You can also view the logs in the Airflow web interface.

Streaming logs: These logs are a superset of the logs in Airflow. To access streaming logs, you can go to the Logs tab of the Environment details page in the Google Cloud Console, use Cloud Logging, or use Cloud Monitoring.

As documented, I can see the Airflow logs in the bucket, but they do not show up in GCP Logging. Should they? In fact, when I query Logging for resource.type="cloud_composer_environment" I see very few logs and no errors, even though I know there were failed jobs that produced errors during that period. Does Logging need any special configuration for the Airflow logs to be displayed?

Tim
  • Try any one of the filters given [here](https://cloud.google.com/composer/docs/concepts/logs#streaming) for `resource.type`. That might do the trick. The specific one for jobs would theoretically be in `airflow-worker`. – jayg_code Jan 12 '22 at 09:36
  • @jayg_code it doesn't return any results for me for the period where I know for certain that there are logs. – Tim Jan 12 '22 at 11:24

1 Answer


Airflow logs are stored in the corresponding buckets and are not visible in the Logs Explorer; you have to go to Cloud Storage and find the bucket corresponding to the environment you want to inspect:

The logs folder includes folders for each workflow that has run in the environment. Each workflow folder includes a folder for its DAGs and sub-DAGs. Each folder contains log files for each task. The task filename indicates when the task started.

The following example shows the logs directory structure for an environment.

Logs are stored in a folder structure that is also described in the documentation:

us-central1-my-environment-60839224-bucket
   └───dags
   |   │
   |   |   dag_1
   |   |   dag_2
   |   |   ...
   |
   └───logs
       │
       └───dag_1
       |   │
       |   └───task_1
       |   |   │   datefile_1
       |   |   │   datefile_2
       |   |   │   ...
       |   |
       |   └───task_2
       |       │   datefile_1
       |       │   datefile_2
       |       │   ...
       |
       └───dag_2
           │   ...
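
If you prefer to read these task logs programmatically rather than browsing the bucket in the Console, a minimal sketch using the google-cloud-storage Python client could look like the following (the bucket, DAG and task names are placeholders taken from the example tree above, and your environment's layout may differ):

    # Sketch: read Airflow task logs directly from the environment's bucket.
    # Bucket, DAG and task names below are placeholders, not real values.
    from google.cloud import storage

    BUCKET_NAME = "us-central1-my-environment-60839224-bucket"
    PREFIX = "logs/dag_1/task_1/"  # logs/<dag_id>/<task_id>/

    client = storage.Client()
    for blob in client.list_blobs(BUCKET_NAME, prefix=PREFIX):
        print(f"--- {blob.name} ---")
        print(blob.download_as_text())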

Streaming logs, however, are visible in the Logs Explorer, and you can view them with the resource.type="cloud_composer_environment" query, which displays the logs from all environments in the project.

If you want to see only a specific environment, add another condition to your query:

resource.type="cloud_composer_environment"
resource.labels.environment_name="composer_env_name_here"

You can also view the same logs in the Cloud Console, in the Composer environment details under the "Logs" tab. If in doubt, have a look at the documentation describing the process.
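
If you would rather pull the streaming logs programmatically instead of using the Logs Explorer, a small sketch with the google-cloud-logging Python client (the environment name and the six-hour lookback window are placeholders) could look like this:

    # Sketch: query Composer streaming logs with the Cloud Logging client.
    # The environment name and time window are placeholders.
    from datetime import datetime, timedelta, timezone
    import google.cloud.logging

    client = google.cloud.logging.Client()
    since = (datetime.now(timezone.utc) - timedelta(hours=6)).isoformat()
    log_filter = (
        'resource.type="cloud_composer_environment" '
        'resource.labels.environment_name="composer_env_name_here" '
        f'timestamp>="{since}"'
    )
    for entry in client.list_entries(filter_=log_filter,
                                     order_by=google.cloud.logging.DESCENDING):
        print(entry.timestamp, entry.severity, entry.payload)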

Wojtek_B
  • OK, the question is: is it possible to push them to Cloud Logging somehow? – Tim Jan 13 '22 at 15:02
  • There isn't a built-in way. You would have to write a script (with access to the buckets holding the logs) that pushes them to Cloud Logging. – Wojtek_B Jan 27 '22 at 10:50
  • @Tim The best way I can think of to work around the issue that stdout logs are not automatically pushed to Cloud Logging is to use the Python Cloud Logging client library to log from within your tasks (https://cloud.google.com/logging/docs/setup/python#use_the_cloud_client_library_directly). – Matt Welke Aug 06 '22 at 17:40
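
Following up on the last two comments, here is a minimal sketch of logging to Cloud Logging from inside a task with the Python client library; the log name and the task callable are made up for illustration:

    # Sketch: emit messages to Cloud Logging from inside a DAG task.
    # The log name "airflow-task-logs" and the callable are illustrative only.
    import google.cloud.logging

    def my_task(**context):
        client = google.cloud.logging.Client()
        logger = client.logger("airflow-task-logs")  # hypothetical log name
        logger.log_text("Task started", severity="INFO")
        try:
            ...  # actual task logic goes here
        except Exception as exc:
            logger.log_text(f"Task failed: {exc}", severity="ERROR")
            raise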