Our team is currently migrating DAGs to Airflow 2. One of the changes we're applying is moving from the deprecated BigQuery operators to the new ones, for example BigQueryInsertJobOperator.
The problem is that with these new operators the query is no longer passed directly to the operator; instead, one must pass a configuration object (with the query as one of its properties) in JSON format.
Of course, Jinja2 can include a file containing the query:
select_query_job = BigQueryInsertJobOperator(
    task_id="select_query_job",
    configuration={
        "query": {
            "query": "{% include 'example_bigquery_query.sql' %}",
            "useLegacySql": False,
        }
    },
    location=location,
)
but when looking at the rendered template of the task, instead of getting a nicely formatted query like
SELECT
    foo
FROM
    bar
WHERE
    x=y
it just ends up looking like
"query": {
    "query": "SELECT\n foo\n FROM\n bar\n WHERE\n x=y\n",
    "useLegacySql": False,
}
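A minimal sketch of why the newlines show up escaped, assuming the rendered-template view serializes the whole configuration dict as JSON-like text (an assumption about the UI, not something confirmed from Airflow's source). The `sql` string and the helpers `serialized_view` and `extract_query` are hypothetical and only illustrate the effect in plain Python:

```python
import json

# Hypothetical stand-in for the rendered contents of example_bigquery_query.sql.
sql = "SELECT\n    foo\nFROM\n    bar\nWHERE\n    x=y\n"

# Same shape as the configuration passed to the operator.
configuration = {"query": {"query": sql, "useLegacySql": False}}


def serialized_view(config):
    """Serialize the whole dict; newlines inside string values become a literal \\n."""
    return json.dumps(config, indent=2)


def extract_query(config):
    """Return only the SQL string, with its real newlines intact."""
    return config["query"]["query"]


# The whole query sits on one long line inside the serialized dict.
print(serialized_view(configuration))

# The string on its own keeps its original multi-line layout.
print(extract_query(configuration))
```

So the query text itself is unchanged; it is the serialization of the surrounding dict that collapses it onto a single line.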
This may not look like much of a problem with a tiny query, but we have queries that run to thousands of lines, and as you can imagine, this makes debugging really tedious, not just because of the escaped newlines but also because the whole query ends up on a single line that requires horizontal scrolling.
Is there another way to read the query, or to tell Airflow to render it in a nicer format? As it stands, rolling back to the deprecated operators would be better than dealing with this format.