1

I Have the following SQL Tables:

[Calendar]

[CalendarId]
[Name]

SAMPLE DATA:

CalendarId  ResourceKey    Name
1           1              tk1-Room1
2           2              tk1-Room2
3           3              tk1-noentries

[CalendarEntry]

[CalendarId]
[CalendarEntryId]
[Start]
[End]

SAMPLE DATA:

CalendarId  Start                            End
1           2019-11-18 16:00:00.0000000      2019-11-18 17:00:00.0000000
1           2019-11-19 16:00:00.0000000      2019-11-19 17:00:00.0000000
2           2019-11-25 16:00:00.0000000      2019-11-25 17:00:00.0000000
1           2019-11-25 17:00:00.0000000      2019-11-25 18:00:00.0000000

Expected output:

Name             StartDate             EndDate                ResourceKey
tk1-Room1        2019-11-25 17:00:00   2019-11-25 17:00:00    1
tk1-Room2        2019-11-25 16:00:00   2019-11-25 17:00:00    2
tk1-noentries    NULL                  NULL                   3

I am trying to list all Calendar entries, with their corresponding Start(Most Recent) and End Times.

I have the following code which is working partially:

SELECT Name,StartDate,ResourceKey FROM [Calendar].[dbo].[Calendar] CAL
LEFT JOIN(
    SELECT 
        CalendarId, 
        MAX(ENT.[Start]) as StartDate
    FROM [CalendarEntry] ENT
    GROUP BY CalendarId
    )
AS ST on CAL.CalendarId = ST.CalendarId

However, If i was to include that column, In my sub SELECT, EG:

    SELECT 
        CalendarId, 
        MAX(ENT.[Start]) as StartDate,
        ENT.[End] as endDate

I get the following error:

Column 'CalendarEntry.End' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

However, including it in the GROUP BY now causes Multiple CalendarEntry Rows to come back for each Calendar..

What is the best way for me to grab the most recent row out of CalendarEntry which allows me access to all the columns?

Thanks!

GMB
  • 216,147
  • 25
  • 84
  • 135
tornup
  • 263
  • 1
  • 4
  • 15
  • 1
    Sample data and desired results would help. Your code has columns that are not mentioned as part of the tables. – Gordon Linoff Nov 25 '19 at 13:02
  • When GROUP BY, UNION and SELECT DISTINCT, only select list items can be used in the ORDER BY. Makes perfect sense. – jarlh Nov 25 '19 at 13:07

3 Answers3

2

This is a typical top 1 per group question.

You can either use row_number():

select *
from (
    select 
        c.*,
        e.*,
        row_number() over(partition by c.CalendarId order by e.Start desc) rn
    from [Calendar].[dbo].[Calendar] c
    left join [CalendarEntry] e ON c.CalendarId = e.CalendarId
) t
where rn = 1

Or you can filter with a correlated subquery:

select c.*, e.*
from [Calendar].[dbo].[Calendar] c
left join [CalendarEntry] e 
    on c.CalendarId = e.CalendarId
    and c.Start = (
        select max(e1.Start) from [CalendarEntry] e where c.CalendarId = e1.CalendarId
    ) 
GMB
  • 216,147
  • 25
  • 84
  • 135
  • This worked, However on closer inspection, it looks like any Calendar Which does not have any Entries is not returned... my Test data contains 13 Calendars, only 12 have entries. and I am only getting 12 results back. – tornup Nov 25 '19 at 14:25
1

I am trying to list all Calendar entries, with their corresponding Start(Most Recent) and End Times.

I interpret this as the most recent record from CalendarEntry for each CalendarId:

select ce.*
from CalendarEntry ce
where ce.StartDate = (select max(ce2.StartDate)
                      from CalendarEntry ce2
                      where ce2.CalendarId = ce.CalendarId
                     );
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
0

You can try OUTER APPLY too, however @GMB's answer is a better approach from performance prospective

SELECT Name,
       StartDate,
       EndDate,
       ResourceKey
FROM dbo.Calendar AS C
    OUTER APPLY
(
    SELECT TOP 1 *
    FROM dbo.CalendarEntry
    WHERE CalendarId = C.CalendarId
    ORDER BY StartDate DESC,
             EndDate DESC
) AS K;

You can also try LAST_VALUE/FIRST_VALUE(available in SQL Server 2012 and later) functions too, as below, , however again @GMB's answer is a better approach from performance prospective:

SELECT DISTINCT
       Name,
       LAST_VALUE(StartDate) OVER (PARTITION BY C.CalendarId
                                   ORDER BY StartDate,EndDate
                                   ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
       LAST_VALUE(EndDate) OVER (PARTITION BY C.CalendarId
                                 ORDER BY StartDate,EndDate
                                 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
       ResourceKey
FROM dbo.Calendar AS C
    LEFT JOIN dbo.CalendarEntry
        ON CalendarEntry.CalendarId = C.CalendarId;

If you want to use FIRST_VALUE function, then you should rewrite the order by as below:

ORDER BY StartDate DESC,EndDate DESC

And you also you will not need to specify the ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING section

SELECT DISTINCT
       Name,
       FIRST_VALUE(StartDate) OVER (PARTITION BY C.CalendarId
                                   ORDER BY StartDate DESC,EndDate DESC),
       FIRST_VALUE(EndDate) OVER (PARTITION BY C.CalendarId
                                 ORDER BY StartDate DESC,EndDate DESC),
       ResourceKey
FROM dbo.Calendar AS C
    LEFT JOIN dbo.CalendarEntry
        ON CalendarEntry.CalendarId = C.CalendarId;
Vahid Farahmandian
  • 6,081
  • 7
  • 42
  • 62