
In my app an area can have many happenings. I want to find out, over a given timespan, what percentage of the weeks in that span had at least one session listed. It doesn't matter how many were listed, as long as there was at least one. As an example, say in February Area X listed 5 sessions in week 1, 0 in week 2, 1 in week 3, and 1 in week 4. Then I should get back that Area X has a listing rate of 75% (3 out of 4 weeks had at least one session).

I've tried to accomplish this using the following query:

SELECT a.location,
       SUM(CASE WHEN date_trunc('week', h.started_at) IS NULL THEN 0 ELSE 1 END)
         / (({{end}} - {{start}}) / 7)::float * 100 AS percentage
FROM areas a
JOIN happenings h ON h.primary_area_id = a.id
 AND h.started_at BETWEEN {{start}} AND {{end}}
GROUP BY 1

My thinking is to go through each week: if a happening exists, give that week a value of 1; if not, 0. Then divide the total by the number of weeks and multiply by 100 to get the percentage.

But the fact that I'm getting percentages over 100% tells me that weeks with multiple happenings are being counted more than once. What's the best way to limit the count to 0 or 1 for a given week? Is there a better way to calculate this?
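To illustrate the per-week 0-or-1 flag I have in mind, here is a rough sketch for a single area (the area id 123 is just a placeholder, and the dates are hard-coded in place of the {{start}} and {{end}} parameters). It collapses each week to at most one row, but I don't see how to turn it into a percentage across all areas:

-- At most one row (flag = 1) per week that had any happening for this area
select date_trunc('week', h.started_at) as week_start,
       1 as has_happening
  from happenings h
 where h.primary_area_id = 123                      -- placeholder area id
   and h.started_at between '2023-02-01' and '2023-02-28'
 group by 1;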

DavidM
  • I would `GROUP BY` the data by both the area and the actual week-of-year. From there, you would have a result set of distinct areas, with the number of entries per area being the number of weeks of the year they had something occurring (if using the implicit `INNER` join as above). You can go further and perform another group/sum operation to count those rows per area, and then `rowCount/52` is the percentage for that area (see the sketch after these comments for one way to write this). – Rogue Feb 28 '23 at 15:26
  • Did this during lunch: https://dbfiddle.uk/fyvhqxtB Can explain later if you have questions. – Mike Organek Feb 28 '23 at 18:10
  • Please update your question to include table definitions (DDL) and sample data for each table, as formatted text - **no images**. Also post the results desired from that data. See [Senseful Solutions](https://senseful.github.io/text-table/) for a method of creating good-looking/workable tables. Paste its results between lines containing only **```**. Better yet, create a [fiddle](https://dbfiddle.uk/) and post a link. – Belayer Feb 28 '23 at 19:04
  • @MikeOrganek thanks for that. Some pretty insane SQL, but it seems to work. Would be great if you could break down what's going on, just for my sanity. I've never encountered laterals before, and the documentation for them is a little unclear. What is the cross join for? And what are all the `bow` references? – DavidM Mar 01 '23 at 12:16
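One way the group-by-week suggestion from the comments might be written (a sketch only: the dates are hard-coded in place of the {{start}}/{{end}} placeholders, the denominator mirrors the question's ({{end}} - {{start}}) / 7 formula, and the inner join drops areas with no happenings at all, as the comment notes):

with area_week as (    -- one row per (area, week) pair that had any happening
  select h.primary_area_id as area_id,
         date_trunc('week', h.started_at) as week_start
    from happenings h
   where h.started_at between '2023-02-01' and '2023-02-28'
   group by 1, 2
)
select a.location,
       count(*) / (('2023-02-28'::date - '2023-02-01'::date) / 7.0) * 100 as percentage
  from areas a
       join area_week aw on aw.area_id = a.id
 group by a.location;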

1 Answer


This is my approach with annotations:

with parms as (          -- CTE for your input parameters
  select '2023-02-01'::date as start_date, '2023-02-28'::date as end_date
), area_weeks as (       -- Create pigeonholes for all weeks
  select gs.bow, count(*) over () as num_weeks
    from parms 
         cross join lateral 
           generate_series(
             date_trunc('week', start_date),
             date_trunc('week', end_date),
             interval '7 days'
           ) as gs(bow)
), collapse_weeks as (   -- Keep only whether an area had any event
  select distinct primary_area_id, 
         date_trunc('week', started_at) as bow -- "beginning of week"
    from happenings
)
select a.location, 
       count(h.bow) as weeks_with_happenings,
       max(aw.num_weeks) as num_weeks,
       1.0 * avg((h.bow is not null)::int) as result
       -- Use the average of a boolean result cast to int to get a percentage
       -- without having to calculate counts or guard against 
       -- division by zero.  Multiply by 1.0 to force float arithmetic.
  from areas a
       cross join area_weeks aw    -- expand week pigeonholes to weeks * areas
       left join collapse_weeks h  -- happening exists or does not
         on h.bow = aw.bow and h.primary_area_id = a.id
 group by a.location;
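
If the `cross join lateral` and the `bow` ("beginning of week") naming are hard to follow, the week-generation step can be run on its own. Here is a standalone sketch with the same hard-coded February dates as the parms CTE above:

-- generate_series() emits one row per week start ("bow") between the
-- truncated start and end dates; the window count gives the total number
-- of weeks in the range.
select gs.bow::date as week_start,
       count(*) over () as num_weeks
  from generate_series(
         date_trunc('week', '2023-02-01'::date),
         date_trunc('week', '2023-02-28'::date),
         interval '7 days'
       ) as gs(bow);
-- Returns 2023-01-30, 2023-02-06, 2023-02-13, 2023-02-20, 2023-02-27 with num_weeks = 5

In area_weeks, `lateral` lets generate_series refer to start_date and end_date from parms, the FROM item to its left; in the final query, the cross join pairs every area with every week so that weeks with no happenings still show up (and count as 0 via the left join).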

Fiddle here: https://dbfiddle.uk/fyvhqxtB
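
The `avg((h.bow is not null)::int)` trick can also be seen on its own. This toy query (made-up values, unrelated to the schema above) shows that averaging a boolean cast to int gives the fraction of rows where the condition is true:

-- true casts to 1 and false to 0, so avg() is the share of rows that are true
select avg((x > 2)::int) as fraction_over_two   -- 2 of 4 rows, so 0.50
  from (values (1), (2), (3), (4)) as t(x);

Note that in the main query the result column is a fraction between 0 and 1; multiply it by 100 if you want the 75%-style figure from the question.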

Mike Organek