0

I have a table of data which contains attributes like body, offer_id and created_at. When in chronological order I need to find the count of rows until 'body' satisfies my 'where' clause for a particular offer_id, i.e.

created at | offer id | body
---------------------------------------------
Jan        | 12       | does not satisfy
Feb        | 12       | does not satisfy
Mar        | 12       | satisfies
Jan        | 13       | does not satisfy
Feb        | 13       | satisfies
Jan        | 14       | does not satisfy
Feb        | 14       | satisfies
Mar        | 14       | does not satisfy
Apr        | 14       | does not satisfy

Expected output:

offer_id | count
---------|------
12       | 3
13       | 2
14       | 2
Dale K
  • 25,246
  • 15
  • 42
  • 71
Eduard Kim
  • 265
  • 4
  • 15
  • 1
    Answer provided does not demonstrate any _chronological_ logic as well as criteria satisfaction. It's a straightforward `GROUP BY + Count()` – Ivan Starostin Feb 15 '16 at 14:03
  • 2
    use analytic function `rownumber() over (partition by offer_ID order by...)` to generate a count against the query? – xQbert Feb 15 '16 at 14:04
  • In your example the last record for an offer_id is always the one and only matching record. Is this always the case or can there be two matching records for one offer_id and can a record that matches be followed by a mismatching record again? – Thorsten Kettner Feb 15 '16 at 14:26
  • Sorry, that is my mistake, I should have expanded the example. Yes, there will be cases where the body that satisfies 'where' clause will not be the last in chronological order – Eduard Kim Feb 15 '16 at 15:22

7 Answers7

1

First - you need to generate a number for every record inside its offer window:

select t.*, rownumber() over (partition by t.offer_ID order by t.created_at) as rn
from t

it will result in something like:

created at | offer id | body             | rn
---------------------------------------------
Jan        | 12       | does not satisfy | 1
Feb        | 12       | does not satisfy | 2
Mar        | 12       | satisfies        | 3
Jan        | 13       | does not satisfy | 1
Feb        | 13       | satisfies        | 2
Jan        | 14       | does not satisfy | 1
Feb        | 14       | satisfies        | 2
Mar        | 14       | does not satisfy | 3
Apr        | 14       | does not satisfy | 4

from this subquery you can get a minimal rn (first record that satisfies the condition):

with sub as (
    select t.*, rownumber() over (partition by t.offer_ID order by t.created_at) as rn
    from t)
select offer_ID, min(rn)
from sub
where (satisfies)
group by offer_ID
Ihor Romanchenko
  • 26,995
  • 8
  • 48
  • 44
1

straight as an arrow

select t.offer_id, count(*)
from mytable t
where not exists
  (
    select 1 from mytable tt 
    where tt.offer_id = t.offer_id
      and tt.created_at < t.created_at
      and tt.body = 'satisfies'
  )
group by t.offer_id
Ivan Starostin
  • 8,798
  • 5
  • 21
  • 39
0

DO you have tried something like this?

select count(*)
from mytable
where "satisfies"

Or, if you want to count only the different offer_id:

select count(distinct  offer_id)
from mytable
where "satisfies"

Or, finally:

select count(offer_id)
from mytable
where "satisfies"
group by offer_id

Does is this what you need? If not, give me more details! ;)

Piero Alberto
  • 3,823
  • 6
  • 56
  • 108
  • Thanks, sorry but the example I provided didn't actually show the full picture as there will be cases where the body that satisfies 'where' clause will not be the last in chronological order – Eduard Kim Feb 15 '16 at 15:23
0

One way to count the number that don't satisfy the condition is to use a cumulative sum:

select offer_id, count(*)
from (select t.*,
             sum(case when <condition> then 1 else 0 end) over
                 (partition by offer_id order by created_at) as num
      from t
     ) t
where num = 0;

However, this is one less than the number you have. So, instead:

select offer_id,
       (sum(case when num = 0 then 1 else 0 end) +
        max(case when num = 1 then 1 else 0 end)
       )
from (select t.*,
             sum(case when <condition> then 1 else 0 end) over
                 (partition by offer_id order by created_at) as num
      from t
     ) t
where num in (0, 1)
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Can you explain the meaning of t.* as I am not sure I am familiar with this syntax (or point to some further reading)? – Eduard Kim Feb 15 '16 at 15:27
  • @EduardKim . . . That is how you select all the columns from a table. – Gordon Linoff Feb 15 '16 at 17:44
  • @Eduard Kim: `t` is the table name here. `t.*` means all columns from `t`. Gordon uses this here to say: give me all columns from `t` plus something else. This is because some DBMS complain when you use a simply asterix in `select *, sum() over()`, somehow arguing that `*` means all fields, and `*, sum() over()` would mean *all* fields *plus* another field, which would make no sense. I consider this a stupid limitation, but that's the way it is. – Thorsten Kettner Feb 15 '16 at 20:13
0

If you just want the count of offer_id , you can use the below

select offer_id, count(*) as count_1 from table_name
where <<your condition>>
group by offer_id

If my understanding is wrong, please share a detailed description on what exactly you require.

Chendur
  • 1,099
  • 1
  • 11
  • 23
0

You can break the task in two parts:

  1. For each offer ID find the record/date that first satisfies the condition.
  2. Count all records per offer ID until that found record/date.

With a subquery in SELECT:

select 
  offer_id,
  (
    select count(*)
    from mytable m
    where m.offer_id = mfit.offer_id
    and m.created_at <= min(mfit.created_at)
  ) as cnt
from mytable mfit
where <condition>
group by offer_id

or a subquery in FROM:

select
  mfit.offer_id, 
  count(*) as cnt
from
(
  select offer_id, min(created_at) as min_date
  from mytable
  where <condition>
  group by offer_id
) mfit
join mytable m on m.offer_id = mfit.offer_id and m.created_at <= mfit.created_at
group by mfit.offer_id;
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73
  • That works great, although I can' really understand the logic fully – Eduard Kim Feb 15 '16 at 15:37
  • First query: select all records which meet the condition. Group by offer_id, so you only have one record per offer_id, take min(created_date) so you get the first created_date per offer_id. Then select again from the same table (this happens for every record we just built): select all records for the same order_id and a date not later then the minimum satisfy date. – Thorsten Kettner Feb 15 '16 at 16:39
  • Second query (didn't work yet, because the outer group by clause was missing): again select all records which meet the condition. Group by offer_id, so you only have one record per offer_id, take min(created_date) so you get the first created_date per offer_id. Then join the same table again, so as to combine each built record with all records with the same offer_id and a date not later then the minimum satisfy date. Then group by offer_id again and count how many records you joined. – Thorsten Kettner Feb 15 '16 at 16:44
  • You can imagine both queries as two steps and a loop: 1. get the minimum satisfy date per offer_id. 2. for each of these records loop through the table again in order to count the records until then. As we access the same table twice, we must give it two different alias names, so the DBMS knows to which of the two records we are currently looking at we are referring. – Thorsten Kettner Feb 15 '16 at 16:49
0

Here is another query using an analytic function. Analytic functions have the advantage that you read the table just once and get different aggregations on-the-fly. The idea is to have a running total per offer_id with a one for a record matching your condition plus a count per offer_id. This looks as follows:

created at | offer id | body               | s | c
---------------------------------------------------
Jan        | 13       | does not satisfy   | 0 | 1
Feb        | 13       | satisfies          | 1 | 2
Jan        | 14       | does not satisfy   | 0 | 1
Feb        | 14       | satisfies          | 1 | 2
Mar        | 14       | does not satisfy   | 1 | 3
Apr        | 14       | does not satisfy   | 1 | 4
May        | 14       | satisfies          | 2 | 5
Jun        | 14       | does not satisfy   | 2 | 6
Apr        | 14       | does not satisfy   | 2 | 7
May        | 14       | satisfies          | 3 | 8

So we are simply looking for the min(c) for s = 1.

select offer_id, min(c) as cnt
from
(
  select 
    offer_id,
    sum(case when <condition> then 1 else 0 end) 
             over (partition by offer_id order by created_at) as s,
    count(*) over (partition by offer_id order by created_at) as c
  from mytable
) data
where s = 1
group by offer_id
order by offer_id;
Thorsten Kettner
  • 89,309
  • 7
  • 49
  • 73