0

I'm attempting to pull some custom reports from a mandated SQL program than we must use at work and I'm running into a couple issues. I can pull all the data I need easily but for each unique person id/task id combination I only need the most current value. Additionally, if possible I want the latest value from either the due date or the waiver date column whichever is greater.


    PersonnelTrainingEvent PersonnelID  TrainingEventTypeID DueDate   WaiverDate    Personnel ID    TrainingEventType ID    Taskcode    PersonnelDetail PersonnelID   
    5351                                25947               1/1/1900  1/1/1900      5351            25947                   Mob2        5351  
    5351                                28195               8/1/2012  1/1/1900      5351            28195                   CA01        5351  
    5351                                26551               7/29/2010 1/1/1900      5351            26551                   Mob10       5351  
    5351                                25947               1/31/2012 1/1/1900      5351            25947                   Mob2        5351  
    5351                                28196               11/1/2012 1/1/1900      5351            28196                   CA02        5351  
    5418                                28195               1/1/1900  1/1/1900      5418            28195                   CA01        5418  
    5418                                30174               1/1/1900  1/1/1900      5418            30174                   PJ18        5418  
    5418                                28624               1/31/2014 2/1/2014      5418            28624                   GA42        5418  
    5418                                28595               6/30/2014 6/30/2014     5418            28595                   GA43        5418  
    5418                                28196               1/1/1900  1/1/1900      5418            28196                   CA02        5418  
    6022                                28195               3/3/2011  1/1/1900      6022            28195                   CA01        6022  
    6022                                28885               10/31/20121/1/1900      6022            28885                   CA07        6022  
    6022                                28884               1/1/1900  1/1/1900      6022            28884                   CA06        6022  
    6022                                28884               1/31/1901 1/1/1900      6022            28884                   CA06        6022  
    6022                                28196               1/1/1900  1/1/1900      6022            28196                   CA02        6022  
    6022                                28196               2/28/2011 1/1/1900      6022            28196                   CA02        6022  
    6022                                28624               9/30/2013 1/1/1900      6022            28624                   GA42        6022  
    6022                                28595               2/28/2014 1/1/1900      6022            28595                   GA43        6022  
    6022                                30174               2/28/2014 1/1/1900      6022            30174                   PJ18        6022  

Here is the query I'm using...

SELECT
  PersonnelTrainingEvent.PersonnelID AS [PersonnelTrainingEvent PersonnelID]  
  ,PersonnelTrainingEvent.TrainingEventTypeID  
  ,PersonnelTrainingEvent.DueDate  
  ,PersonnelTrainingEvent.WaiverDate  
  ,Personnel.ID AS [Personnel ID]  
  ,TrainingEventType.ID AS [TrainingEventType ID]  
  ,TrainingEventType.Taskcode  
  ,PersonnelDetail.PersonnelID AS [PersonnelDetail PersonnelID]  
FROM  
  PersonnelTrainingEvent  
  INNER JOIN TrainingEventType  
    ON PersonnelTrainingEvent.TrainingEventTypeID = TrainingEventType.ID  
  INNER JOIN Personnel  
    ON PersonnelTrainingEvent.PersonnelID = Personnel.ID  
  INNER JOIN PersonnelDetail  
    ON Personnel.ID = PersonnelDetail.PersonnelID  
WHERE  
  TrainingEventType.Taskcode IN (N'GA43', N'MOB2', N'CA01', N'CA02', N'Mob10', N'PJ67', N'CA06', N'CA07', N'T104', N'GA42', N'PJ18')  
Group By  
  Personnel.ID, TrainingEventType.Taskcode;  

I'm currently on vacation and getting glared at by my wife but I've been working on this query for 3 weeks now and I'm pounding my head against the wall. I've included a sample of the preferred outcome below...


    PersonnelTrainingEvent PersonnelID  TrainingEventTypeID DueDate   WaiverDate    Personnel ID    TrainingEventType ID    Taskcode    PersonnelDetail PersonnelID   
    5351                                28195               8/1/2012  1/1/1900      5351            28195                   CA01        5351  
    5351                                26551               7/29/2010 1/1/1900      5351            26551                   Mob10       5351  
    5351                                25947               1/31/2012 1/1/1900      5351            25947                   Mob2        5351  
    5351                                28196               11/1/2012 1/1/1900      5351            28196                   CA02        5351  
    5418                                28195               1/1/1900  1/1/1900      5418            28195                   CA01        5418  
    5418                                30174               1/1/1900  1/1/1900      5418            30174                   PJ18        5418  
    5418                                28624               1/31/2014 2/1/2014      5418            28624                   GA42        5418  
    5418                                28595               6/30/2014 6/30/2014     5418            28595                   GA43        5418  
    5418                                28196               1/1/1900  1/1/1900      5418            28196                   CA02        5418  
    6022                                28195               3/3/2011  1/1/1900      6022            28195                   CA01        6022  
    6022                                28885               10/31/20121/1/1900      6022            28885                   CA07        6022  
    6022                                28884               1/31/1901 1/1/1900      6022            28884                   CA06        6022  
    6022                                28196               2/28/2011 1/1/1900      6022            28196                   CA02        6022  
    6022                                28624               9/30/2013 1/1/1900      6022            28624                   GA42        6022  
    6022                                28595               2/28/2014 1/1/1900      6022            28595                   GA43        6022  
    6022                                30174               2/28/2014 1/1/1900      6022            30174                   PJ18        6022  


Here are the links to the other answers I've looked at but I'm a learn by doing kinda guy and these seemed to help a little but I'm not understanding all the syntax...

SQL Select, Specific Rows based on multiple conditions?
SQL server select distinct rows using most recent value only
SQL Select with Group By and Order By Date
SQL server select distinct rows using values before a certain date
How to select only the latest rows for each user?
Get Distinct rows from a result of JOIN in SQL Server
http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47479
Selecting latest rows in subgroups

I appreciate any help as I'm working to learn to do this myself. I will provide any input requested or a screenshot if I increase my rep enough to allow that. Thanks!

Community
  • 1
  • 1
Ryan
  • 53
  • 6
  • Have a look at using [ROW_NUMBER](http://technet.microsoft.com/en-us/library/ms186734(v=sql.105).aspx) using partition by `PersonID, TaskID` and ordering it by what you need, then only selecting where the `Row_Number = 1` – Adriaan Stander Dec 31 '13 at 05:08
  • If this is SQL Server, your current example query should throw a syntax error because of a mismatch between your `SELECT` list and the `GROUP BY` clause (and no aggregates being used). Could we get some source/example data? I'm not convinced the `GROUP BY` clause is completely necessary. – Clockwork-Muse Dec 31 '13 at 06:20
  • Add Inner join somthing like this query : `Inner join (Select PersonID, TaskID, MAX(dueDate) as MaxdueDate From yourTable Group By PersonID, TaskID) ` – Al-3sli Dec 31 '13 at 06:43
  • @Clockwork-Muse the `Group by` was added after I pulled the data as I was thinking that should be included in my next step. I meant to pull it out prior to posting, the original query includes references to additional tables that I didn't feel were relevant to the question. The top table is the result of the query as posted. – Ryan Dec 31 '13 at 06:53

3 Answers3

0

Use Common tabular Expression

Here is the query for the requirement.

WITH cte ( personneltrainingeventpersonnelid, trainingeventtypeid, duedate, 
     waiverdate, personnelid, taskcode, personneldetailpersonnelid, dupcount) 
     AS (SELECT personneltrainingeventpersonnelid, 
                trainingeventtypeid, 
                duedate, 
                waiverdate, 
                personnelid, 
                taskcode, 
                personneldetailpersonnelid, 
                Row_number() 
                  OVER( 
                    partition BY personnelid 
                    ORDER BY duedate, waiverdate) AS DupCount 
         FROM   personneltrainingevent) 
SELECT C.* 
FROM   cte C, 
       (SELECT personneltrainingeventpersonnelid, 
               personnelid, 
               Max(duedate) AS Maximum 
        FROM   cte 
        GROUP  BY personnelid, 
                  personneltrainingeventpersonnelid) CC 
WHERE  C.personnelid = cc.personnelid 
       AND C.personneltrainingeventpersonnelid = 
           cc.personneltrainingeventpersonnelid 
       AND c.duedate = CC.maximum 
ORDER  BY C.personnelid 
Hawk
  • 5,060
  • 12
  • 49
  • 74
  • Given that you don't do anything with `DupCount`, this isn't going to work the way you expect (your query is at least doing more work than it needs). And you can't just use `MAX(dueDate)`, because the OP needs one of two _different_ dates (`dueDate` **or** `waiverDate`); your `ORDER BY` in the window statement won't give you the correct `1` row all the time. – Clockwork-Muse Dec 31 '13 at 06:07
  • DupCount is Tracked based on Duplicate PersonID.OP Needs an one of the date as requested. Else we can use Case Statement for choosing the date. Order By PersonID fetch the record of distinct top Dupcount based on Maximum Date requested. – Premchand Yelavarthi Dec 31 '13 at 06:15
  • You do nothing with `DupCount` after defining it - the final `SELECT` doesn't count! In order for the OP to get the results he needs, the _query itself_ should perform the exclusion/selection; among other things, the DB server will likely perform this better than application level code. – Clockwork-Muse Dec 31 '13 at 06:24
  • Exactly. Dupcount was not used in the Final window query.Its just the count of Duplicates existing with respect to PersonID Column. – Premchand Yelavarthi Dec 31 '13 at 06:52
0

What you probably want is something like this:

SELECT Event.personnelId, Event.trainingEventTypeId, Event.dueDate, Event.waiverDate,
       Event.taskCode
FROM (SELECT Event.personnelId, Event.trainingEventTypeId, Event.dueDate, Event.waiverDate,
             TrainingEventType.taskCode
             ROW_NUMBER() OVER(PARTITION BY Event.personnelId, Event.trainingEventTypeId
                               ORDER BY CASE WHEN Event.dueDate >= Event.waiverDate
                                             THEN Event.dueDate
                                             ELSE Event.waiverDate END DESC) rn
      FROM PersonnelTrainingEvent Event
      JOIN TrainingEventType
        ON TrainingEventType.id = Event.trainingEventTypeId
           AND TrainingEventType.taskCode IN (N'GA43', N'MOB2', N'CA01', N'CA02', N'Mob10', N'PJ67', N'CA06', N'CA07', N'T104', N'GA42', N'PJ18')) Event
WHERE Event.rn = 1  

However, it's difficult to tell because the query you've provided has syntax errors, has additional unneeded columns, and references tables which should not influence the results (the tables are likely to have at least one row, but no data from those tables is referenced, and multiple rows are usually unwanted).

Clockwork-Muse
  • 12,806
  • 6
  • 31
  • 45
0

Thanks for all the assistance but I went back to basics. I cut the joined tables out and went to the basic four columns that I needed without the translation data. Once I did that I saw that the table only created duplicate entries with a 1900 date field so I simply used a WHERE clause to strip out the extra entry. I was really interested in Clockwork-Muse's method but I kept receiving a syntax error that didn't make sense and once I saw the issue I was having it was less code to strip what I needed. Thank you again for the support and information.

Ryan
  • 53
  • 6