My goal is to find the top 1000 values for any column in any table, and how frequently the values appear across the dataset in a given timeframe.
I'm expecting the output:
<column_name>, PercentOfDataSet
Value1, X.XX
Value2, X.XX
Value3, X.XX
Currently I'm doing this in 2 steps, to get the total number of rows in one query, and then divide by that number on the following one:
sql_row_count := 'select count(*) from '|| table_name ||' where '|| date_column ||' > DATEADD(day, -365, getdate());';
total_row_count := (execute immediate :sql_row_count);
sql_query := 'select '|| column_name ||', iff('|| total_row_count ||' = 0, 0.00, cast(count(*) as numeric(18,2))/'|| total_row_count ||'*100) PercentOfDataSet' ||
' from '|| table_name ||' where '|| date_column ||'> DATEADD(day, -365, getdate())' ||
' group by 1 order by 2 desc limit 1000;';
Is there a better way to calculate this with SQL?
(this is based on the code found on Snowflake dynamic SQL: Storing query results as variables between executions)