0

I'm reading and executing sql queries from file and I need to inspect the result sets to count all the null values across all columns. Because the SQL is read from file, I don't know the column names and thus can't call the columns by name when trying to find the null values.

I think using CTE is the best way to do this, but how can I call the columns when I don't know what the column names are?

WITH query_results AS
(
<sql_read_from_file_here>
)
select count_if(<column_name> is not null) FROM query_results

FunnyChef
  • 1,880
  • 3
  • 18
  • 30

1 Answers1

0

If you are using Python to read the file of SQL statements, you can do something like this which uses pglast to parse the SQL query to get the columns for you:

import pglast
sql_read_from_file_here = "SELECT 1 foo, 1 bar"
ast = pglast.parse_sql(sql_read_from_file_here)
cols = ast[0]['RawStmt']['stmt']['SelectStmt']['targetList']
sum_stmt = "sum(iff({col} is null,1,0))"
sums = [sum_sql.format(col = col['ResTarget']['name']) for col in cols]
print(f"select {' + '.join(sums)} total_null_count from query_results")


# outputs: select sum(iff(foo is null,1,0)) + sum(iff(bar is null,1,0)) total_null_count from query_results
Nat Taylor
  • 1,122
  • 7
  • 9