30

I am facing a rather interesting problem. I have a table with the following structure:

CREATE TABLE [dbo].[Event]
(
    Id int IDENTITY(1,1) NOT NULL,
    ApplicationId nvarchar(32) NOT NULL,
    Name nvarchar(128) NOT NULL,
    Description nvarchar(256) NULL,
    Date nvarchar(16) NOT NULL,
    Time nvarchar(16) NOT NULL,
    EventType nvarchar(16) NOT NULL,
    CONSTRAINT Event_PK PRIMARY KEY CLUSTERED ( Id ) WITH (
        PAD_INDEX = OFF, 
        STATISTICS_NORECOMPUTE = OFF, 
        IGNORE_DUP_KEY = OFF, 
        ALLOW_ROW_LOCKS = ON, 
        ALLOW_PAGE_LOCKS  = ON
    )
)

So the issue is that I have to display this data in a grid. There are two requirements. The first one is to display all events regardless of what application threw them. This is simple - a select statement will do the job very easily.

The second requirement is to be able to group events by Application. In other words display all events in a way that if the ApplicationId is repeated more than once, grab only the last entry for every application. The primary key of the Event (Id) at this point is no longer needed in this query/view.

You may also notice that the Event Date and Time are in string format. This is ok because they follow the standard date time formats: mm/dd/yyyy and hh:mm:ss. I can pull those as follows:

Convert( DateTime, (Date + ' ' +  Time)) AS 'TimeStamp'

My issue is that if I use AGGREGATE functions on the rest of the columns I don't know how would they behave:

SELECT
    ApplicationId,
    MAX(Name),
    MAX(Description),
    MAX( CONVERT(DateTime, (Date + ' ' + Time))) AS 'TimeStamp',
    MAX( EventType )
FROM
    Event
GROUP BY
    ApplicationId

The reason I am hesitant to do so is because a function such as MAX will return the largest value for a given column from a (sub)set of records. It does not necessary pull the last record!

Any ideas on how to select only the last record on a per application basis?

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
bleepzter
  • 9,607
  • 11
  • 41
  • 64
  • 3
    Use windowing functions (in Oracle, something like row_number() over (partition by...), AFAIK SQL server has similar functionality. – Frank Schmitt Jun 01 '11 at 12:42

10 Answers10

50

You can use a ranking function and a common table expression.

WITH e AS
(
     SELECT *,
         ROW_NUMBER() OVER
         (
             PARTITION BY ApplicationId
             ORDER BY CONVERT(datetime, [Date], 101) DESC, [Time] DESC
         ) AS Recency
     FROM [Event]
)
SELECT *
FROM e
WHERE Recency = 1
Anthony Faull
  • 17,549
  • 5
  • 55
  • 73
  • You can't just order by Date and Time without converting to a datetime value, because `mm/dd/yyyy` format doesn't sort correctly as a string. – Damien_The_Unbeliever Jun 01 '11 at 12:56
  • 1
    Thank you @Anthony Faull. This works however I don't understand how. – bleepzter Jun 01 '11 at 12:56
  • @damien Good catch. I have updated the ORDER BY clause to convert US dates (month-day-year) into sortable dates. – Anthony Faull Jun 01 '11 at 13:11
  • 1
    Even though it is a very late comment and doesn't help @bleepzter anymore it might help others to understand how it works: The partition by devides the data (the events) into subsets where each subset has the same appliationid. Each subset is ordered by date and time. Then each row gets a row number. That describes the "with" part. The follwing select takes the result of the "with" statement and outputs all entries with a row number of 1. – Sascha Mar 16 '18 at 11:33
10

Since SQL Server 2012 you can simply

SELECT 
    [Month]
    , [First] = FIRST_VALUE(SUM([Clicks])) OVER (ORDER BY [Month])
    , [Last]  = FIRST_VALUE(SUM([Clicks])) OVER (ORDER BY [Month] DESC)
FROM 
    [dbo].[Table]
GROUP BY [Month]
ORDER BY [Month]
tartakynov
  • 2,768
  • 3
  • 26
  • 23
  • FIRST_VALUE with OVER - very impressive! learned something new today!! thank you. A plus for SQL version ref too! – Leo Gurdian Feb 14 '19 at 23:56
1

You can use a sub query with group by - the group by argument does not need to be in the select. This assumes Id is a auto incremented so that the largest one is the most recent.

SELECT
    ApplicationId,
    Name,
    Description,
    CONVERT(DateTime, (Date + ' ' + Time)) AS 'TimeStamp',
    EventType
FROM
    Event e
WHERE
    Id in (select max(Id) from Event GROUP BY ApplicationId)
cordsen
  • 1,691
  • 12
  • 10
0

I think it will work for many out there willing to fetch the last inserted record and it should be group by:

select * from (select * from TableName ORDER BY id DESC) AS x GROUP BY FieldName

It will work for the following:

Table Structure ID Name Status 1 Junaid Yes 2 Jawad No 3 Fahad Yes 4 Junaid No 5 Kashif Yes

Results After Query Above ID Name Status 4 Junaid No 2 Jawad No 3 Fahad Yes 4 Kashif Yes

It is simply resulting the last record of group by names.

0

After 6 years another answer for SQL Server:

select t1.[Id], t2.[Value]  
from [dbo].[Table] t1  
  outer apply (  
    select top 1 [Value]  
      from [dbo].[Table] t2  
        where t2.[Month]=t1.[Month]  
      order by [dbo].[Date] desc  
  )  

Although I like Postgresql solution much better with its distinct on feature which is nicer to type and much more efficient:

select distinct on (id),val  
from tbl  
order by id,val  
0

At first I used to use CTE with row_number, but one example in SQL server certification course showed me better example (judging by getting consistently better execution plans):

SELECT
  ApplicationId,
  Name,
  Description,
  CONVERT(DateTime, (Date + ' ' + Time)) AS 'TimeStamp',
  EventType
FROM
  Event AS E
WHERE
  NOT EXISTS(SELECT * FROM Event AS Newer WHERE Newer.ApplicationId = E.ApplicationId AND Newer.Id > E.Id)
GROUP BY
  ApplicationId

I'm assuming that larger Id implies larger Date+Time (otherwise I would use convert to datetime, which is not SARGable though). This query will find youngest record - for which younger record doesn't exist. If indexes are set right, this will use index seeks. The alternative with ranking function generally uses table scan, since it ranks all records.

andowero
  • 439
  • 4
  • 13
0
SELECT
    E.ApplicationId,
    E.Name,
    E.Description,
    CONVERT(DateTime, (E.Date + ' ' + E.Time)) AS 'TimeStamp',
    E.EventType
FROM
    Event E
    JOIN (SELECT ApplicationId,
                 MAX(CONVERT(DateTime, (Date + ' ' + Time))) AS max_date
            FROM Event
        GROUP BY ApplicationId) EM 
      on EM.ApplicationId = E.ApplicationId
     and EM.max_date = CONVERT(DateTime, (E.Date + ' ' + E.Time)))
manji
  • 47,442
  • 5
  • 96
  • 103
0

Because you don't have a where clause in there, the subset of records, is all the records. But you are putting max on the wrong column(s) I think. This query will give you what you're looking for.

Select max(applicationid), name, description, CONVERT(DateTime, (Date + ' ' + Time)) 
from event
group by name, description, CONVERT(DateTime, (Date + ' ' + Time)) 
Mikael Eriksson
  • 136,425
  • 22
  • 210
  • 281
Jody
  • 8,021
  • 4
  • 26
  • 29
0

You can use a subqery or CTE table to do this:

;WITH CTE_LatestEvents as (
SELECT
    ApplicationId,    
    MAX( CONVERT(DateTime, (Date + ' ' + Time))) AS 'LatestTimeStamp',
FROM
    Event
GROUP BY
    ApplicationId
)
SELECT
    ApplicationId,
    Name,
    Description,
    CONVERT(DateTime, (Date + ' ' + Time))) AS 'TimeStamp',
    EventType
FROM
    Event e
    Join CTE_LatestEvents le 
        on e.applicationid = le.applicationid
        and CONVERT(DateTime, (e.Date + ' ' + e.Time))) = le.LatestTimeStamp
Jon Egerton
  • 40,401
  • 11
  • 97
  • 129
0

I had the same issue. Now, I didn't want to overly complicate things with CTEs and "OVER". Here's a simple example. I wrote a sub-query with a group by MAX(DateEntered). You can possibly want to do by ID if it's int, for instance, that'd be more accurate than Date/Time. In any case, once you have this sub-query, you just inner-join that to your main query to act as a filter for the records. It's that simple.

Table a is my users table. Table b is the sub-query and table c is the table I want "filtered".

SELECT DISTINCT a.FirstName,a.LastName,a.ImagePath, c.MessageText
        FROM [AuthUsers] a 
            INNER JOIN (SELECT MessageFromId,MAX(DateEntered) AS LastEntered FROM ChatRoomConversation GROUP BY MessageFrom) AS b
                ON a.Id=b.MessageFromId
            INNER JOIN ChatRoomConversation c
                ON b.LastEntered=c.DateEntered
Umar AlFarooq
  • 53
  • 1
  • 8