0

I want to find the count of rows between first and last occurrence of a value. However when there are five or more records of a different value between them, stop counting.

So if last occurrence is today and first occurrence is yesterday, the result would be 2 (today plus yesterday).

If last occurrence is today and first occurrence is 8 days ago AND there is no occurrence in between the two, the result would be '1'. If however there would be another occurrence 3 days ago, the result would be 4 (3+2+1 days ago plus today).

I hope that makes sense.

Here's my data

Date        City    Weather
==============================
2018-08-11  Ankara  Sun
2018-08-10  Ankara  Sun
2018-08-09  Ankara  Sun
2018-08-08  Ankara  Sun
2018-08-07  Ankara  Sun
2018-08-06  Ankara  Sun
2018-08-05  Ankara  Rain
2018-08-04  Ankara  Clouds
2018-08-03  Ankara  Rain
2018-08-02  Ankara  Sun
2018-08-01  Ankara  Sun
2018-08-11  Cairo   Clouds
2018-08-10  Cairo   Sun
2018-08-09  Cairo   Sun
2018-08-08  Cairo   Sun
2018-08-07  Cairo   Sun
2018-08-06  Cairo   Sun
2018-08-05  Cairo   Clouds
2018-08-04  Cairo   Sun
2018-08-03  Cairo   Sun
2018-08-02  Cairo   Sun
2018-08-01  Cairo   Sun

What I'm after is a query that returns for a given city and date the weather that day and the number of days since this weather first occurred. However, when there are gaps of five days or more, the count restarts from 1.

Like when queried for Ankara on the 11th Aug it would return 11 since it's been 11 days (including today) since Sun first occurred.

However for Cairo on the 11th Aug it would return 1 and not 7 because it's been 5+ days between Clouds on the 5th Aug and Clouds today.

I've tried many things with first_value(), LEAD, LAG and ROW_NUMBER but there is nothing that makes sense to how as it all failed miserably.

Here it is anyway...

select 
 city, val,datediff(day, min(datadate), '2018-10-30') + 1 as DaysPresent
from d
where val = last_val
group by city,val;

or...

select 
        date, city, weather, datediff(day,ca.prior,d.date)+1 as daysPresent
from d
cross apply (
    select min(prev.date) as prior
    from d as prev 
    where prev.city = d.city
    and prev.date between dateadd(day,-4,d.date) and dateadd(day,0,d.date)
    and prev.weather = d.weather
    ) ca

order by city,date

Expected Result

+----+---------------------+--------+---------+----------------+-------------+----------+----------------------------
|    |        date         |  city  | weather | prior_the_same | prior_types |expected  | why?
+----+---------------------+--------+---------+----------------+-------------+----------+----------------------------
|  1 | 11.08.2018 00:00:00 | Ankara | Sun     |              5 |           2 |       11 | 11t day since 1st time Sun
|  2 | 10.08.2018 00:00:00 | Ankara | Sun     |              4 |           3 |       10 | 10t day since 1st time Sun
|  3 | 09.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 |        9 | 9th day since 1st time Sun
|  4 | 08.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 |        8 | 8th day since 1st time Sun
|  5 | 07.08.2018 00:00:00 | Ankara | Sun     |              6 |           3 |        7 | 7th day since 1st time Sun
|  6 | 06.08.2018 00:00:00 | Ankara | Sun     |              5 |           3 |        6 | 6th day since 1st time Sun ( <5 days gap since last Sun keeps counting )
|  7 | 05.08.2018 00:00:00 | Ankara | Rain    |              2 |           3 |        3 | 3rd day since 1st time Rain
|  8 | 04.08.2018 00:00:00 | Ankara | Clouds  |              0 |           3 |        1 | 1st day Clouds
|  9 | 03.08.2018 00:00:00 | Ankara | Rain    |              0 |           2 |        1 | 1st day Rain 
| 10 | 02.08.2018 00:00:00 | Ankara | Sun     |              1 |           0 |        2 | 2nd day since 1st time Sun
| 11 | 01.08.2018 00:00:00 | Ankara | Sun     |              0 |           0 |        1 | 1st day Sun
| 12 | 11.08.2018 00:00:00 | Cairo  | Clouds  |              6 |           6 |        1 | 1st time Clouds ( >5 days gap since last Clouds resets the count )
| 13 | 10.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |       10 | 10t day since 1st time Sun
| 14 | 09.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |        9 | 9th day since 1st time Sun
| 15 | 08.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |        8 | 8th day since 1st time Sun
| 16 | 07.08.2018 00:00:00 | Cairo  | Sun     |              6 |           1 |        7 | 7th day since 1st time Sun
| 17 | 06.08.2018 00:00:00 | Cairo  | Sun     |              5 |           1 |        6 | 6th day since 1st time Sun ( <5 days gap since last Sun keeps counting )
| 18 | 05.08.2018 00:00:00 | Cairo  | Clouds  |              0 |           4 |        1 | 1st time Clouds
| 19 | 04.08.2018 00:00:00 | Cairo  | Sun     |              3 |           0 |        4 | 4th day since 1st time Sun
| 20 | 03.08.2018 00:00:00 | Cairo  | Sun     |              2 |           0 |        3 | 3rd day since 1st time Sun
| 21 | 02.08.2018 00:00:00 | Cairo  | Sun     |              1 |           0 |        2 | 2nd day since 1st time Sun
| 22 | 01.08.2018 00:00:00 | Cairo  | Sun     |              0 |           0 |        1 | 1st day Sun
+----+---------------------+--------+---------+----------------+-------------+----------+----------------------------
Peete
  • 125
  • 11
  • 2
    I think you should do a better job of showing sample data which covers all edge cases. Your logic is not completely clear to me. – Tim Biegeleisen Oct 29 '18 at 02:34
  • 1
    @Peete . . . I agree with Tim. The rules seem rather arcane. Some sort of justification for them would help, as would an additional column on the sample data that shows the results. – Gordon Linoff Oct 29 '18 at 02:46
  • 1
    it was 2 questions ago that [the cross apply seen above was proposed](https://stackoverflow.com/a/52959870/2067753) and accepted, then another variant accepted, although there is no point in using `dateadd(day,0,d.date)` just use `d.date` without the function. – Paul Maxwell Oct 29 '18 at 03:07
  • I've added a result table including an explanation why the result would be like that following the business logic described. Thanks for your input, helps me getting better at asking here. – Peete Oct 29 '18 at 20:43

1 Answers1

3

Latest

declare @day_range integer = 5;

select 
        t.date, t.city, t.weather
      , datediff(day,ca1.prior_dt,t.date)+1 as prior_the_same
      , twist.prior_types
      , twist.prior_mx_dt
from mytable t
cross apply (
    select count(prev.weather) as prior_types, max(prev.date) as prior_mx_dt
    from mytable as prev 
    where prev.city = t.city
    and prev.date between dateadd(day,-@day_range,t.date) and t.date
    and prev.weather <> t.weather
    ) twist
cross apply (
    select min(prev.date) as prior_dt
    from mytable as prev 
    where prev.city = t.city
    and (twist.prior_types < @day_range or prev.date >= twist.prior_mx_dt)
    and prev.weather = t.weather
    ) ca1

order by t.city, t.date DESC

result:

+----+---------------------+--------+---------+----------------+-------------+---------------------+
|    |        date         |  city  | weather | prior_the_same | prior_types |     prior_mx_dt     |
+----+---------------------+--------+---------+----------------+-------------+---------------------+
|  1 | 11.08.2018 00:00:00 | Ankara | Sun     |             11 |           0 | NULL                |
|  2 | 10.08.2018 00:00:00 | Ankara | Sun     |             10 |           1 | 05.08.2018 00:00:00 |
|  3 | 09.08.2018 00:00:00 | Ankara | Sun     |              9 |           2 | 05.08.2018 00:00:00 |
|  4 | 08.08.2018 00:00:00 | Ankara | Sun     |              8 |           3 | 05.08.2018 00:00:00 |
|  5 | 07.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 | 05.08.2018 00:00:00 |
|  6 | 06.08.2018 00:00:00 | Ankara | Sun     |              6 |           3 | 05.08.2018 00:00:00 |
|  7 | 05.08.2018 00:00:00 | Ankara | Rain    |              3 |           3 | 04.08.2018 00:00:00 |
|  8 | 04.08.2018 00:00:00 | Ankara | Clouds  |              1 |           3 | 03.08.2018 00:00:00 |
|  9 | 03.08.2018 00:00:00 | Ankara | Rain    |              1 |           2 | 02.08.2018 00:00:00 |
| 10 | 02.08.2018 00:00:00 | Ankara | Sun     |              2 |           0 | NULL                |
| 11 | 01.08.2018 00:00:00 | Ankara | Sun     |              1 |           0 | NULL                |
| 12 | 11.08.2018 00:00:00 | Cairo  | Clouds  |              1 |           5 | 10.08.2018 00:00:00 |
| 13 | 10.08.2018 00:00:00 | Cairo  | Sun     |             10 |           1 | 05.08.2018 00:00:00 |
| 14 | 09.08.2018 00:00:00 | Cairo  | Sun     |              9 |           1 | 05.08.2018 00:00:00 |
| 15 | 08.08.2018 00:00:00 | Cairo  | Sun     |              8 |           1 | 05.08.2018 00:00:00 |
| 16 | 07.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 | 05.08.2018 00:00:00 |
| 17 | 06.08.2018 00:00:00 | Cairo  | Sun     |              6 |           1 | 05.08.2018 00:00:00 |
| 18 | 05.08.2018 00:00:00 | Cairo  | Clouds  |              1 |           4 | 04.08.2018 00:00:00 |
| 19 | 04.08.2018 00:00:00 | Cairo  | Sun     |              4 |           0 | NULL                |
| 20 | 03.08.2018 00:00:00 | Cairo  | Sun     |              3 |           0 | NULL                |
| 21 | 02.08.2018 00:00:00 | Cairo  | Sun     |              2 |           0 | NULL                |
| 22 | 01.08.2018 00:00:00 | Cairo  | Sun     |              1 |           0 | NULL                |

see it online: https://rextester.com/ZSHT63407


Original

with sample data of:

CREATE TABLE mytable(
   Date    DATE  NOT NULL
  ,City    VARCHAR(6) NOT NULL
  ,Weather VARCHAR(6) NOT NULL
);
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-11','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-10','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-09','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-08','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-07','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-06','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-05','Ankara','Rain');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-04','Ankara','Clouds');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-03','Ankara','Rain');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-02','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-01','Ankara','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-11','Cairo','Clouds');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-10','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-09','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-08','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-07','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-06','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-05','Cairo','Clouds');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-04','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-03','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-02','Cairo','Sun');
INSERT INTO mytable(Date,City,Weather) VALUES ('2018-08-01','Cairo','Sun');

Using this query:

declare @day_range integer = 7;
declare @ignore_range integer = 5;

select 
        t.date, t.city, t.weather
      , datediff(day,ca1.prior_dt,t.date) as prior_the_same
      , ca2.prior_types
from mytable t
cross apply (
    select min(prev.date) as prior_dt
    from mytable as prev 
    where prev.city = t.city
    and prev.date between dateadd(day,-@day_range,t.date) and t.date
    and prev.weather = t.weather
    ) ca1
cross apply (
    select count(prev.weather) as prior_types
    from mytable as prev 
    where prev.city = t.city
    and prev.date between dateadd(day,-@day_range,t.date) and t.date
    and prev.weather <> t.weather
    ) ca2
order by t.city, t.date DESC

The following is the result:

+----+---------------------+--------+---------+----------------+-------------+----------+
|    |        date         |  city  | weather | prior_the_same | prior_types |expected? |
+----+---------------------+--------+---------+----------------+-------------+----------+
|  1 | 11.08.2018 00:00:00 | Ankara | Sun     |              5 |           2 |          |
|  2 | 10.08.2018 00:00:00 | Ankara | Sun     |              4 |           3 |          |
|  3 | 09.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 |          |
|  4 | 08.08.2018 00:00:00 | Ankara | Sun     |              7 |           3 |          |
|  5 | 07.08.2018 00:00:00 | Ankara | Sun     |              6 |           3 |          |
|  6 | 06.08.2018 00:00:00 | Ankara | Sun     |              5 |           3 |          |
|  7 | 05.08.2018 00:00:00 | Ankara | Rain    |              2 |           3 |          |
|  8 | 04.08.2018 00:00:00 | Ankara | Clouds  |              0 |           3 |          |
|  9 | 03.08.2018 00:00:00 | Ankara | Rain    |              0 |           2 |          |
| 10 | 02.08.2018 00:00:00 | Ankara | Sun     |              1 |           0 |          |
| 11 | 01.08.2018 00:00:00 | Ankara | Sun     |              0 |           0 |          |
| 12 | 11.08.2018 00:00:00 | Cairo  | Clouds  |              6 |           6 |          |
| 13 | 10.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |          |
| 14 | 09.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |          |
| 15 | 08.08.2018 00:00:00 | Cairo  | Sun     |              7 |           1 |          |
| 16 | 07.08.2018 00:00:00 | Cairo  | Sun     |              6 |           1 |          |
| 17 | 06.08.2018 00:00:00 | Cairo  | Sun     |              5 |           1 |          |
| 18 | 05.08.2018 00:00:00 | Cairo  | Clouds  |              0 |           4 |          |
| 19 | 04.08.2018 00:00:00 | Cairo  | Sun     |              3 |           0 |          |
| 20 | 03.08.2018 00:00:00 | Cairo  | Sun     |              2 |           0 |          |
| 21 | 02.08.2018 00:00:00 | Cairo  | Sun     |              1 |           0 |          |
| 22 | 01.08.2018 00:00:00 | Cairo  | Sun     |              0 |           0 |          |
+----+---------------------+--------+---------+----------------+-------------+----------+

Over more than one question you have expanded on your requirements. May I suggest you consider the above and decide if you can use the 2 calculations to arrive at the wanted final result. If you are still unable to come to a conclusion use the text table format to include the "expected result" as a new column

Paul Maxwell
  • 33,002
  • 3
  • 32
  • 51
  • Thank you so much for this. I'm learning to get better at asking questions here. Thanks for your patience! +++ It's very close. When I remove the two lines `and prev.date between dateadd(day,-@day_range,t.date) and t.date` (lines 13 and 20) from your query it's even closer. However '@ignore_range' is defined but not used. Perhaps that's why it keeps counting even after gaps of 5+ days? I've tried getting the right result but I'm not quite there. +++ Please see the edited question for the 'expected result' column. I also added a 'why' column. Cheers – Peete Oct 29 '18 at 20:11
  • Ignore_range was something I thought was going to need but didn't use. It has no impact at all on the results I show above. – Paul Maxwell Oct 29 '18 at 22:07
  • The biggest issue for us (to provide answers) is that you keep changing the rules. This is now the third question and on each iteration the rules have moved. Up till now there was always a limit on how far back you wanted to go. Now it seems that there is NO LIMIT to the number of days backward you wish to consider, and that **the only "twist"** is a gap of 6 days or more (>5 days). – Paul Maxwell Oct 29 '18 at 22:12
  • Sorry for the confusion. Please ignore the other questions. All logic is supposed to be in this one, I hope I didn't miss anything. +++ The limit for going backwards is just that gap of 5 days (>=5). So stop going back once there are >=5 days of different weather. Or back to the very first record if there is no gap. Many thanks. – Peete Oct 29 '18 at 22:58
  • 1
    I have proposed a new query to consider (it contains a "twist") – Paul Maxwell Oct 29 '18 at 23:17
  • Excellent, you are a hero! I've taken the opportunity to learn more about cross apply as well. Also thanks for helping me improve my SO question style. – Peete Oct 30 '18 at 21:33