
I installed the google provider with

pip install 'apache-airflow[google]'  

and also tried

pip install apache-airflow-providers-google

But I cannot find the Google Cloud option when adding a new connection via the Airflow webserver.

All I found is Google Dataprep.

I have tried to restart the webserver multiple times but the Google Cloud option is still not listed. It returns this info when I'm starting the webserver:

{providers_manager.py:215} INFO - Optional provider feature disabled when importing 'airflow.providers.google.leveldb.hooks.leveldb.LevelDBHook' from 'apache-airflow-providers-google' package

I also tried setting lazy_discover_providers = False in airflow.cfg, with no luck.

Any help is appreciated. Thanks!

Blaze Tama

2 Answers


I had a similar problem with MWAA. I had two options for creating the connection:

  1. Using the Airflow CLI. To simplify this, I suggest generating the command with Python, then running it in your terminal:
import json
from airflow.models.connection import Connection

connection_extra = {
    "extra__google_cloud_platform__key_path":"path/to/key",
    "extra__google_cloud_platform__key_secret_name": "key_file_name_if_it_is_stored_in_secret_manager",
    "extra__google_cloud_platform__keyfile_dict":"{"
        "\"type\": \"service_account\","
        " \"project_id\": \"<PROJECT_ID>\","
        " \"private_key_id\": \"<PRIVATE_KEY_ID>\","
        " \"private_key\": \"-----BEGIN PRIVATE KEY-----\\n<PRIVATE_KEY>\\n-----END PRIVATE KEY-----\\n\","
        " \"client_email\": \"<CLIENT_EMAIL>\","
        " \"client_id\": \"<CLIENT_ID>\","
        " \"auth_uri\": \"https://<AUTH_URI>\","
        " \"token_uri\": \"https://<TOKEN_URI>\","
        " \"auth_provider_x509_cert_url\": \"https://<AUTH_CERT_URI>\","
        " \"client_x509_cert_url\": \"https://<CLIENT_CERT_URI>\""
    "}",
    "extra__google_cloud_platform__num_retries":"5",
    "extra__google_cloud_platform__project":"PROJECT_NAME",
    "extra__google_cloud_platform__scope":"https://www.googleapis.com/auth/cloud-platform"
}

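c = Connection(
    conn_id="gcp_conn",
    # The registered type name uses underscores; the hyphenated form is only for URIs.
    conn_type="google_cloud_platform",
    description="A connection to access GCP resources",
    # Serialize explicitly so this also works on versions that expect a string.
    extra=json.dumps(connection_extra)
)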

my_connection_json = {
    "conn_type": c.conn_type,
    "login": c.login,
    "password": c.password,
    "host":c.host,
    "port": c.port,
    "schema": c.schema,
    "extra": c.extra
}

print(f"airflow connections add '{c.conn_id}' --conn-json '{json.dumps(my_connection_json)}'")

You can run this script on your scheduler host; it will print an Airflow CLI command. Copy it, paste it into a terminal, and run it to create the connection.

  2. From the UI, by choosing the HTTP connection type and adding the settings as the extra dict (you do not need to set every variable); you can check this doc:
{
   "extra__google_cloud_platform__project":"<PROJECT NAME>",
   "extra__google_cloud_platform__key_path":"",
   "extra__google_cloud_platform__keyfile_dict":{
      "type":"service_account",
      "project_id":"<PROJECT ID>",
      "private_key_id":"<PRIVATE KEY ID>",
      "private_key":"-----BEGIN PRIVATE KEY-----\n<PRIVATE KEY>\n-----END PRIVATE KEY-----\n",
      "client_email":"<CLIENT EMAIL>",
      "client_id":"<CLIENT ID>",
      "auth_uri":"https://<AUTH URI>",
      "token_uri":"https://<TOKEN URI>",
      "auth_provider_x509_cert_url":"https://<AUTH CERT URI>",
      "client_x509_cert_url":"https://<CLIENT CERT URI>"
   },
   "extra__google_cloud_platform__scope":"",
   "extra__google_cloud_platform__num_retries":"10"
}
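
The hand-escaped keyfile string in the CLI script above is easy to get wrong. As a minimal sketch, assuming the service-account key is available as a Python dict, `json.dumps` can do the escaping instead. The `build_gcp_extra` helper and the placeholder values are hypothetical and not part of Airflow; only the `extra__google_cloud_platform__*` field names come from the script above.

```python
import json


def build_gcp_extra(keyfile: dict, project: str, num_retries: int = 5) -> dict:
    """Return an extra dict using the field names from the script above.

    Hypothetical helper for illustration; it is not an Airflow API.
    """
    return {
        # json.dumps handles the quoting and the \n sequences in the private key.
        "extra__google_cloud_platform__keyfile_dict": json.dumps(keyfile),
        "extra__google_cloud_platform__project": project,
        "extra__google_cloud_platform__num_retries": str(num_retries),
        "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform",
    }


# Placeholder key content; in practice, load the real service-account JSON file.
keyfile = {
    "type": "service_account",
    "project_id": "<PROJECT_ID>",
    "private_key": "-----BEGIN PRIVATE KEY-----\n<PRIVATE_KEY>\n-----END PRIVATE KEY-----\n",
    "client_email": "<CLIENT_EMAIL>",
}

extra = build_gcp_extra(keyfile, project="<PROJECT_NAME>")
print(json.dumps(extra, indent=2))
```

The resulting dict can be passed where `connection_extra` is used in the first option.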

You can also create it using an environment variable, but it's not secure:

export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='google-cloud-platform://?extra__google_cloud_platform__project=<PROJECT_NAME>&extra__google_cloud_platform__scope=<SCOPE>&extra__google_cloud_platform__key_path=<KEY_PATH>&extra__google_cloud_platform__num_retries=10'
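
Values such as the scope URL or a key path contain characters (`:`, `/`) that should be percent-encoded inside a connection URI. A rough sketch, assuming the same extra field names used above, builds the value with `urllib.parse.urlencode`. The `build_conn_uri` helper, the project name, and the key path are hypothetical, and I am assuming Airflow decodes the query string when it parses the URI.

```python
from urllib.parse import urlencode


def build_conn_uri(extras: dict) -> str:
    """Build a google-cloud-platform connection URI from extra fields.

    Hypothetical helper for illustration; it is not an Airflow API.
    """
    # urlencode percent-encodes ':' and '/' in values such as the scope URL.
    return "google-cloud-platform://?" + urlencode(extras)


uri = build_conn_uri({
    "extra__google_cloud_platform__project": "my-project",  # placeholder name
    "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform",
    "extra__google_cloud_platform__key_path": "/path/to/key.json",  # placeholder path
    "extra__google_cloud_platform__num_retries": "10",
})

# Print the shell command rather than setting the variable directly.
print(f"export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='{uri}'")
```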

Hussein Awala
  • Hi Hussein, thanks so much for helping! I'm also exploring the CLI but it doesn't work so far. Do you have a step-by-step example of how to do this? (what should I input into the CLI) `from the UI with type http (you are not supposed to set all the variables), you can check this doc` On which file should I put the JSON you attached? and what do you mean by "UI with type http"? Thanks! – Blaze Tama Aug 28 '22 at 02:44
  • By `HTTP` type, I mean you can create a connection via the UI, and instead of selecting `Google Cloud` which is not in your connection types list, you can choose `HTTP` and add the extra dict. For the CLI, I will add an example to my answer. – Hussein Awala Aug 28 '22 at 08:24

Please try upgrading your database with 'airflow db upgrade'. Once that completes, restart the server. I hit a table creation error while upgrading, so I had to remove the database and initialize it again. That is not advisable in general, since it deletes the history, but if you have nothing to preserve you can try that as well.

MI Haque