2

I have the following two tables:

  • DimensionTime is a table that contains every month, ordered by ID, in a YYYMM00 form
  • LogPlayer is a table where there are some statistics related to a player and a specific month.

What I'd like to get is the following:

+--------+--------+----------+----------+
| Player |  Team  |  Start   |   End    |
+--------+--------+----------+----------+
| John   | Red    | 20180100 | 20180300 |
| John   | Red    | 20180600 | 20180700 |
| Luke   | Yellow | 20180100 | 20180100 |
| Luke   | Yellow | 20190100 | 20190100 |
+--------+--------+----------+----------+

I can't use MIN and MAX functions because the periods are discontinuous...how can I resolve? I have tryied with MIN/MAX combined with GROUP BY but I get nothing useful. I dind't find any question or answer here on Stackoverflow.

SELECT *
    INTO #DimensionTime
    FROM (
        SELECT 1 AS [ID], 20180100 AS [TIMEID]
        UNION ALL
        SELECT 2 AS [ID], 20180200 AS [TIMEID]
        UNION ALL
        SELECT 3 AS [ID], 20180300 AS [TIMEID]
        UNION ALL
        SELECT 4 AS [ID], 20180400 AS [TIMEID]
        UNION ALL
        SELECT 5 AS [ID], 20180500 AS [TIMEID]
        UNION ALL
        SELECT 6 AS [ID], 20180600 AS [TIMEID]
        UNION ALL
        SELECT 7 AS [ID], 20180700 AS [TIMEID]
        UNION ALL
        SELECT 8 AS [ID], 20180800 AS [TIMEID]
        UNION ALL
        SELECT 9 AS [ID], 20180900 AS [TIMEID]
        UNION ALL
        SELECT 10 AS [ID], 20181000 AS [TIMEID]
        UNION ALL
        SELECT 11 AS [ID], 20181100 AS [TIMEID]
        UNION ALL
        SELECT 12 AS [ID], 20181200 AS [TIMEID]
        UNION ALL
        SELECT 13 AS [ID], 20190100 AS [TIMEID]
        UNION ALL
        SELECT 14 AS [ID], 20190200 AS [TIMEID]
        UNION ALL
        SELECT 15 AS [ID], 20190300 AS [TIMEID]
    ) A

SELECT *
INTO #LogPlayer
FROM (
    SELECT 'John' AS [Player], 'Red' AS [Team], 20180100 AS [TIMEID]
    UNION ALL
    SELECT 'John' AS [Player], 'Red' AS [Team], 20180200 AS [TIMEID]
    UNION ALL
    SELECT 'John' AS [Player], 'Red' AS [Team], 20180300 AS [TIMEID]
    UNION ALL
    SELECT 'John' AS [Player], 'Red' AS [Team], 20180600 AS [TIMEID]
    UNION ALL
    SELECT 'John' AS [Player], 'Red' AS [Team], 20180700 AS [TIMEID]
    UNION ALL
    SELECT 'Luke' AS [Player], 'Yellow' AS [Team], 20180100 AS [TIMEID]
    UNION ALL
    SELECT 'Luke' AS [Player], 'Yellow' AS [Team], 20190100 AS [TIMEID]
) B
Nicolaesse
  • 2,554
  • 12
  • 46
  • 71
  • 2
    Are you really using SQL Server, and especially SQL Server 2005 (which is way out of support)? SQL Server doesn't support `yyyyMM00` dates; that's a MySQL thing. – Thom A Dec 11 '18 at 14:54
  • what datatype is timeID varchar? – xQbert Dec 11 '18 at 14:55
  • @Larnu I thing his field is of type INT or BIG INT, in any case if indeed SQL Server 2005 is not the best scenario – apomene Dec 11 '18 at 14:56
  • 4
    Aside from the strange "dates" this seems to be a situation where you need to group contiguous dates. Jeff Moden has a great way of handling this [here](http://www.sqlservercentral.com/articles/T-SQL/71550/). You would just need to modify this slightly to use month instead of day. – Sean Lange Dec 11 '18 at 14:56
  • @Lamu Yes, unfortunally I am using a vintage SQL Server 2005 and YYYYMM00, for these reason, is stored as VARCHAR. If could be helpful I could run the stored procedure on a new SQL Server 2018 by linked server. – Nicolaesse Dec 11 '18 at 14:57
  • `L a r n u`. Also SQL Server 2018? That doesn't exist. There is 2017 and 2019, which is currently in preview. Sean, however, gives a good option for you though. – Thom A Dec 11 '18 at 15:16
  • @Larnu, You are defenitely right, I have SQL Server 2017 :-) – Nicolaesse Dec 11 '18 at 15:31
  • Every time you store a date as a varchar, Satan drowns a kitten. – Joel Coehoorn Dec 11 '18 at 15:50

2 Answers2

2

This is a type of gaps-and-islands problem. It is solvable even in non-supported ancient software like SQL Server 2005, because that version has row_number().

One trick is converting the time id to a bona fide date/time. The other trick is to define the groups by subtracting a sequential number of months from the date/time value:

select player, team, min(timeid), max(timeid)
from (select lp.*,
             row_number() over (partition by player, team order by timeid) as seqnum,
             cast(cast(timeid + 1 as varchar(255)) as datetime) as yyyymm
      from logplayer lp
     ) lp
group by player, team, dateadd(month, - seqnum, yyyymm)
order by player, team, min(timeid);

Here is a db<>fiddle.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

you could do this sort of thing to find the begining and end of runs of dates.

Convert to 'date' via a CTE (which I think exists in 2005) then use Cross Apply an EXIST to find the start and end of a run of dates

you give no data for player and team, but you could add to the WHERE conditions in EXISTS , then GROUP BY - if needed

;WITH dats as (SELECT CAST(LEFT(timeid, 6) + '01' as datetime) as DT from #DimensionTime)
    select CONVERT(varchar(7),d1.DT,112) +'0'  as strt, 
           CONVERT(varchar(7),dq.dt,112) +'0' as [end] from dats d1 
        CROSS  APPLY
          (SELECT TOP 1 d3.dt from dats d3 where 
                                            d3.dt >  d1.dt 
                                            and 
                                            not exists(
                                                    select 0 from dats d4 where d4.DT = dateadd(month,1,d3.DT)
                                                      )
        ORDER BY d3.dt asc) DQ           
        where not exists(select 0 from dats d2 where d2.DT = dateadd(month,-1,d1.DT)) ;

taking a guess as to some sample data, I tried

SELECT *
INTO #DimensionTime
FROM (
    SELECT 1 AS [ID], 20180100 AS [TIMEID], 'john' as player, 'red' as team 
    UNION ALL
    SELECT 2 AS [ID], 20180200 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 3 AS [ID], 20180300 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 4 AS [ID], 20180400 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 5 AS [ID], 20180500 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 7 AS [ID], 20180700 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 8 AS [ID], 20180800 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 9 AS [ID], 20180900 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 11 AS [ID], 20181100 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 12 AS [ID], 20181200 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 13 AS [ID], 20190100 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 14 AS [ID], 20190200 AS [TIMEID], 'john','red'
    UNION ALL
    SELECT 15 AS [ID], 20190300 AS [TIMEID], 'john','red'
    UNION ALL 
    SELECT 1 AS [ID], 20180100 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 2 AS [ID], 20180200 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 4 AS [ID], 20180400 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 5 AS [ID], 20180500 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 8 AS [ID], 20180800 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 9 AS [ID], 20180900 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 12 AS [ID], 20181200 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 13 AS [ID], 20190100 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 14 AS [ID], 20190200 AS [TIMEID], 'luke','yellow'
    UNION ALL
    SELECT 15 AS [ID], 20190300 AS [TIMEID], 'luke','yellow'


) A



;WITH dats as (SELECT CAST(LEFT(timeid, 6) + '01' as datetime) as DT,player,team from #DimensionTime)
    select d1.team,d1.player,
            CONVERT(varchar(7),d1.DT,112) +'0'  as strt, 
           CONVERT(varchar(7),dq.dt,112) +'0' as [end] from dats d1 
        CROSS  APPLY
          (SELECT TOP 1 d3.dt from dats d3 where 
                                            d3.dt >  d1.dt 
                                            and
                                            d3.player = d1.player
                                            and
                                            d3.team = d1.team
                                            and 
                                            not exists(
                                                    select 0 from dats d4 where d4.DT = dateadd(month,1,d3.Dt)
                                                         and  d4.team = d3.team
                                                         and d4.player = d3.player
                                                      )
        ORDER BY d3.dt asc) DQ           
        where not exists(select 0 from dats d2 where 
                    d2.player=d1.player
                    and
                    d2.team = d1.team
                    and 
                    d2.DT = dateadd(month,-1,d1.DT) and d1.team=d2.team and d1.player = d2.player ) ;

drop table #DimensionTime;

with the latest tables that sorry I missed, I devised

;WITH dats as (SELECT CAST(LEFT(timeid, 6) + '01' as datetime) as DT,player,team from #LogPlayer)
    select d1.team,d1.player,
            CONVERT(varchar(7),d1.DT,112) +'0'  as strt, 
           CONVERT(varchar(7),dq.dt,112) +'0' as [end] from dats d1 
        CROSS  APPLY
          (SELECT TOP 1 d3.dt from dats d3 where 
                                            d3.dt >  d1.dt 
                                            and
                                            d3.player = d1.player
                                            and
                                            d3.team = d1.team
                                            and 
                                            not exists(
                                                    select 0 from dats d4 where d4.DT = dateadd(month,1,d3.Dt)
                                                         and  d4.team = d3.team
                                                         and d4.player = d3.player
                                                      )
        ORDER BY d3.dt asc) DQ           
        where not exists(select 0 from dats d2 where 
                    d2.player=d1.player
                    and
                    d2.team = d1.team
                    and 
                    d2.DT = dateadd(month,-1,d1.DT) and d1.team=d2.team and d1.player = d2.player ) ;
Cato
  • 3,652
  • 9
  • 12
  • Sorry but I can't understand how to run it. I have run the code and I only get strt 20180100 AS [strt] and 20190300 AS [end], which is the same as calculate MIN and MAX of #DimensionTime table – Nicolaesse Dec 11 '18 at 15:31
  • so which dates would you want to be listed from your sample data? 20180100 is the start of a run of consecutive months, and 20190300 is the end of the run - if you delete a record in between, you will get two sequences – Cato Dec 11 '18 at 15:33
  • I'd like to have the start/end data for every period of every player and avery team, like the example in the first question. – Nicolaesse Dec 11 '18 at 15:34
  • your sample data has no teams or players mentioned - i'll try and put some in for you – Cato Dec 11 '18 at 15:35
  • sorry I saw some data tghere after all, I added an attempt at doing this – Cato Dec 11 '18 at 15:56