
For the past few days, some tasks have thrown an error at the start of every DAG run. It seems the log file cannot be found when retrieving the task's logs.

*** 404 GET https://storage.googleapis.com/download/storage/v1/b/europe-west1-ventis-brand-f-65ab79d1-bucket/o/logs%2Fimport_ebay_snapshot_feeds_ES%2Fstart%2F2021-11-30T08%3A00%3A00%2B00%3A00%2F3.log?alt=media: No such object: europe-west1-ventis-brand-f-65ab79d1-bucket/logs/import_ebay_snapshot_feeds_ES/start/2021-11-30T08:00:00+00:00/3.log: ('Request failed with status code', 404, 'Expected one of', <HTTPStatus.OK: 200>, <HTTPStatus.PARTIAL_CONTENT: 206>)

I've upgraded to the latest version of Cloud Composer, and the tasks run on Python 3.

Here is the environment configuration:

Resources:

Workloads configuration:

  • Scheduler: 1 vCPU, 2 GB memory, 2 GB storage
  • Number of schedulers: 2
  • Web server: 1 vCPU, 2 GB memory, 2 GB storage
  • Worker: 1 vCPU, 2 GB memory, 1 GB storage
  • Number of workers: autoscaling between 1 and 4 workers

Core infrastructure:

  • Environment size: Small

GKE cluster: projects/*********************************

There are no related issues regarding this error in the Cloud Composer changelog. How could this be fixed?

  • Probably the cluster's bucket, or some folder in it, has been deleted. Go to `Cloud console` -> `Google Cloud Storage (Browse)` and check whether the bucket named `europe-west1-ventis-brand-f-65ab79d1-bucket` exists. If yes, also check whether the folder `/logs/import_ebay_snapshot_feeds_ES` exists. You can also check from Cloud Shell by running `gsutil ls gs://europe-west1-ventis-brand-f-65ab79d1-bucket/logs`. Please post the results. – ewertonvsilva Dec 01 '21 at 09:18
  • @ewertonvsilva the bucket exists and all the folders are there. I believe the issue is between the Airflow web server and the log storage during the initial handshake (a value in the configuration that cannot be changed in Cloud Composer). – Julian Dec 01 '21 at 09:40
  • @Julian Which version of Composer are you using? I am facing the same issue with Composer version 1.17.2 and Airflow version 2.1.2. – codninja0908 Jan 11 '22 at 11:40
  • For me, this happened quite often, and a temporary fix before the Composer version update was to increase the retry count on DAG tasks (see the sketch after these comments). – Heikura Jan 13 '22 at 10:01
  • We are facing this issue too on Composer 2. We have updated to the latest stable version and recreated the environment. The error happens less now, but it still happens and can fail some tasks, despite support telling us that it's been solved. I'm running out of ideas. Anyone else in the same situation? We use composer-2.0.2-airflow-2.1.4. – Vlad Gheorghe Jan 21 '22 at 14:53
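As a stopgap along the lines of Heikura's comment above, the retry count can be raised on the affected tasks so a transient log lookup failure doesn't fail the run outright. A minimal, hypothetical sketch for Airflow 2.x; the DAG id is taken from the question's log path, while the schedule and task are placeholders:

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

# Higher retry counts as a temporary workaround: these defaults apply
# to every task created inside the DAG below.
default_args = {
    "retries": 3,                         # retry instead of failing on the first hiccup
    "retry_delay": timedelta(minutes=5),  # wait between attempts
}

with DAG(
    dag_id="import_ebay_snapshot_feeds_ES",  # name taken from the question's log path
    start_date=datetime(2021, 11, 1),
    schedule_interval="@daily",              # placeholder schedule
    default_args=default_args,
    catchup=False,
) as dag:
    start = BashOperator(task_id="start", bash_command="echo start")  # placeholder task
```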

1 Answer


It's a bug in the Cloud Composer environment and has already been reported. You can track this conversation: Re: Log not visible via Airflow web log, and other similar forums. To fix the issue, it's recommended to update your Composer environment to a stable version.

There are some suggested workarounds (you can try them in this order; they are independent of each other):

  1. Remove the logs from the /logs folder in the Composer GCS bucket and archive them somewhere else (outside of the /logs folder); see the sketch below.
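A minimal sketch of that archiving step using the google-cloud-storage Python client; the source bucket name comes from the question, while the archive bucket is a hypothetical one you would create outside the environment:

```python
from google.cloud import storage

client = storage.Client()
src = client.bucket("europe-west1-ventis-brand-f-65ab79d1-bucket")  # Composer bucket from the question
dst = client.bucket("my-logs-archive")  # hypothetical archive bucket

# Copy every object under logs/ into the archive bucket, then delete
# the original so the /logs folder in the Composer bucket is emptied.
for blob in client.list_blobs(src, prefix="logs/"):
    src.copy_blob(blob, dst, new_name=f"archive/{blob.name}")
    blob.delete()
```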

or

  2. Manually update the web server configuration to read logs directly from a new bucket in your project. You would first need to grant viewer roles (such as roles/storage.legacyBucketReader and roles/storage.legacyObjectReader) on the bucket to the service account running the web server; see the sketch below.
  • Edit /home/airflow/gcs/airflow.cfg and set `remote_base_log_folder = <newbucket>`, with the proper permissions as described above.
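A sketch of granting those viewer roles with the same Python client; the new bucket name and the web server's service account are placeholders you would substitute:

```python
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("new-logs-bucket")  # placeholder for <newbucket>
web_server_sa = "serviceAccount:webserver@my-project.iam.gserviceaccount.com"  # placeholder

# Fetch the bucket's IAM policy, append the two legacy reader roles
# mentioned above for the web server's service account, and write it back.
policy = bucket.get_iam_policy(requested_policy_version=3)
for role in ("roles/storage.legacyBucketReader", "roles/storage.legacyObjectReader"):
    policy.bindings.append({"role": role, "members": {web_server_sa}})
bucket.set_iam_policy(policy)
```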

or

  3. If you don't have DRS (domain restricted sharing) enabled, which I believe you don't: you can create a new Composer environment, this time through the v1 Composer API (for example with `gcloud composer environments create`) or without Beta features enabled in the Cloud Console. This way Composer will create an environment without the DRS-compliant setup, i.e. without the bucket-to-bucket synchronization. The drawback is that you would need to migrate your DAGs and data to the new environment.
ewertonvsilva
  • I am also facing the issue, and in my opinion option 2 is not possible in a Cloud Composer environment: when creating the environment, the log file path is the default bucket path, which is set automatically and can't be edited, and we cannot override web server configs. – codninja0908 Jan 11 '22 at 11:30