I am building an analytics page backed by Python (Flask), with Redshift as the data source. The data is modeled as a star schema, so all I want to do is basic aggregation and filtering over a specified time frame (hardly rocket science). Yet I can't find an elegant way to do this.
Let's say I have a SQL query that nicely produces daily stats for the current month:
with current_month as (
  select date
  from date_d
  where month_name = 'November' AND year_actual = '2018'
),
filtered as (
  select date_d.date, fact.id, fact.created_id
  from fact
  join date_d ON date_d.id = fact.created_id
  where date_d.month_name = 'November' AND date_d.year_actual = '2018' AND fact.foo = 'bar'
),
total as (
  SELECT COUNT(id) as count, DATE(date) as date
  from filtered
  GROUP BY 2
)
SELECT current_month.date, COALESCE(total.count, 0) as total
from current_month
LEFT JOIN total ON current_month.date = total.date
order by current_month.date
Is there any way I can execute this query and load the results into some predefined data structure? I looked at SQLAlchemy because I didn't feel like executing raw SQL strings by hand, but the ORM looks useless here. It seems the only way is to execute raw SQL and load the results into some sort of schema (using marshmallow). I looked at this article, which is close but not very elaborate.
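For what it's worth, one common pattern is to skip the ORM entirely and use SQLAlchemy Core: wrap the query in `text()` with bound parameters, then map each row onto a plain dataclass. A minimal sketch follows; the `DailyTotal` dataclass, the `daily` table, and the in-memory SQLite engine are stand-ins I made up so the example is self-contained (in production the engine would point at Redshift via `postgresql+psycopg2://...`):

```python
from dataclasses import dataclass
from datetime import date

from sqlalchemy import create_engine, text

@dataclass
class DailyTotal:
    day: date
    total: int

# Stand-in engine; a real setup would use the Redshift connection string.
engine = create_engine("sqlite://")

# Seed a tiny stand-in table so the sketch actually runs.
with engine.begin() as conn:
    conn.execute(text("create table daily (day text, total integer)"))
    conn.execute(text("insert into daily values ('2018-11-01', 3), ('2018-11-02', 0)"))

def daily_totals(month_prefix):
    # Use :name placeholders so values are bound, not string-formatted.
    stmt = text("select day, total from daily where day like :prefix order by day")
    with engine.connect() as conn:
        rows = conn.execute(stmt, {"prefix": month_prefix + "%"})
        # Each row is mapped onto the predefined structure here.
        return [DailyTotal(day=date.fromisoformat(r.day), total=r.total) for r in rows]
```

The same mapping line works unchanged with the month-of-daily-stats query above; only the `text()` body and the bound parameters differ.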
Maybe I am missing something? People must do this sort of thing fairly often. Or is my approach wrong overall?
P.S. Simple filtering works perfectly on my schema with Flask-Restless.