0

The code below joins two tables and I need to extract only the latest date per account, though it holds multiple accounts and history records. I wanted to use the MAX function, but not sure how to incorporate it for this case. I am using My SQL server.

Appreciate any help !

select 
    PROP.FileName,PROP.InsName, PROP.Status,
    PROP.FileTime, PROP.SubmissionNo, PROP.PolNo, 
    PROP.EffDate,PROP.ExpDate, PROP.Region,  
    PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label
from
    Property.dbo.PROP 
inner join 
    Property.dbo.PROP_DATA on Property.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
where 
    (PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV')) 
    and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
    and (PROP.Status = 'Bound')
    and (Prop.FileTime = Max(Prop.FileTime))
order by
    PROP.EffDate DESC 
Shasti
  • 45
  • 9
  • Possible duplicate of [SQL Server - Only Select Latest Date](http://stackoverflow.com/questions/35442758/sql-server-only-select-latest-date) – paparazzo Sep 29 '16 at 22:23
  • 4
    @Shasti SQL is a language, not a product. Which DBMS do you use (Oracle, PostgreSQL, SQLite, DB2, ...)? – ypercubeᵀᴹ Sep 29 '16 at 22:35
  • 1
    @Paparazzi: the only tag on the question is "sql," so we may not be able to presume it's SQL Server. And even then, Microsoft SQL Server, Sybase SQL Server, Sybase SQL Anywhere, Sybase Adaptive Server Enterprise? – Craig Tullis Sep 29 '16 at 22:59
  • 1
    What was the goal in removing the code sample? The question no longer presents enough context to be able to provide any kind of cogent answer. ;-) – Craig Tullis Sep 30 '16 at 00:57
  • @Craig If the OP comes back and tags it properly it will not get closed. It take 5 votes to close. – paparazzo Sep 30 '16 at 01:40
  • 1
    @Paparazzi: I was just wondering why the OP deleted the code sample. It did a pretty good job of describing the parameters of the problem. I just wanted to encourage the OP to put it back or put back an equivalent sample. ;-) – Craig Tullis Sep 30 '16 at 02:48
  • 2
    Which RDBMS is this for? Please add a tag to specify whether you're using `mysql`, `postgresql`, `sql-server`, `oracle` or `db2` - or something else entirely. – marc_s Sep 30 '16 at 05:07
  • 1
    `PROP.EffDate >= '42278'` looks worrying. Why are you storing dates a strings? (with a strange format as well) –  Sep 30 '16 at 05:54

3 Answers3

2

Assuming your DBMS supports windowing functions and the with clause, a max windowing function would work:

with all_data as (
  select
   PROP.FileName,PROP.InsName, PROP.Status,
   PROP.FileTime, PROP.SubmissionNo, PROP.PolNo, 
   PROP.EffDate,PROP.ExpDate, PROP.Region,  
   PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label,
   max (PROP.EffDate) over (partition by PROP.PolNo) as max_date

  from Actuarial.dbo.PROP 
      inner join Actuarial.dbo.PROP_DATA 
      on Actuarial.dbo.PROP.FileID = Actuarial.dbo.PROP_DATA.FileID
  where (PROP_DATA.Label in ('Occupancy' , 'OccupancyTIV')) 
      and (PROP.EffDate >= '42278' and PROP.EffDate <= '42643')
      and (PROP.Status = 'Bound')
      and (Prop.FileTime = Max(Prop.FileTime))
)
select
  FileName, InsName, Status, FileTime, SubmissionNo,
  PolNo, EffDate, ExpDate, Region, UnderWriter, Data, Label
from all_data
where EffDate = max_date
ORDER BY EffDate DESC 

This also presupposes than any given account would not have two records on the same EffDate. If that's the case, and there is no other objective means to determine the latest account, you could also use row_numer to pick a somewhat arbitrary record in the case of a tie.

Hambone
  • 15,600
  • 8
  • 46
  • 69
  • Good point... he said "account," and I made my best guess as to what field constituted an account. – Hambone Sep 29 '16 at 22:46
  • You should also remove the `and (Prop.FileTime = Max(Prop.FileTime))`. This is probably their failed attempt to write the query. – ypercubeᵀᴹ Sep 29 '16 at 22:47
1

Using straight SQL, you can use a self-join in a subquery in your where clause to eliminate values smaller than the max, or smaller than the top n largest, and so on. Just set the number in <= 1 to the number of top values you want per group.

Something like the following might do the trick, for example:

select
    p.FileName
    , p.InsName
    , p.Status
    , p.FileTime
    , p.SubmissionNo
    , p.PolNo
    , p.EffDate
    , p.ExpDate
    , p.Region
    , p.Underwriter
    , pd.Data
    , pd.Label
from Actuarial.dbo.PROP p
inner join Actuarial.dbo.PROP_DATA pd
    on p.FileID = pd.FileID
where (
    select count(*)
    from Actuarial.dbo.PROP p2
    where p2.FileID = p.FileID
    and p2.EffDate <= p.EffDate
    ) <= 1
and (
    pd.Label in ('Occupancy' , 'OccupancyTIV')
    and p.Status = 'Bound'
)
ORDER BY p.EffDate DESC 

Have a look at this stackoverflow question for a full working example.

Craig Tullis
  • 9,939
  • 2
  • 21
  • 21
  • CTEs and window functions are standard SQL, not vendor-specific. You may call them fancy though ;) – ypercubeᵀᴹ Sep 29 '16 at 23:47
  • Well, vendor-specific in the sense that they are not universally supported or identically implemented across all vendors. MSSQL is not the only database around. There's no support for common table expressions in MySQL, for instance, or versions of Sybase SQL Anywhere or ASE in the very recent past. So if you're using those tools, CTE's are useless to you. And they don't necessarily result in better performance where they are supported, either. – Craig Tullis Sep 30 '16 at 00:54
  • MySQL has a giant user base, and the OP didn't specify which server is involved here, so there's no way to know without more input whether a CTE solution will work. I'm directly aware of people who have no choice but to use MySQL v5, and versions of other databases **and reporting tools** that do not support CTE's. And, the SQL I presented here will work and perform just fine, very possibly outperforming a CTE based solution. I have no moral object to CTE's, but there isn't adequate evidence that they'd work here. – Craig Tullis Sep 30 '16 at 02:46
  • The CTE isn't required by that solution, it is an option, a derived table could be used instead. The lack of windowing functions in MySQL is a deficiency of that product rather then a problem with a proposed solution that uses them. While a generic SQL solution is always useful there can be a performance trade-off. Windowing functions like row_number() over() are efficient and can avoid multiple passes of the data. – Paul Maxwell Sep 30 '16 at 05:22
  • And yet the OP still has not specified which database is in use. So this solution *will* work with the OP's database, and a CTE solution *might* work with the OP's database. Anyway, have a nice evening folks. – Craig Tullis Sep 30 '16 at 05:37
  • @ypercubeᵀᴹ: referring back to our earlier discussion, I currently, right this moment, have a case with almost identical queries, producing identical results, but one using group by and one using a CTE. The group by version is over twice as fast. This is in Sybase SQL Anywhere 16, so of course mileage may vary. But Sybase (and me) isn't exactly new to databases, and my point is that dogmatically insisting that CTE's are universally superior doesn't seem particularly helpful. I have still given the only solution here guaranteed to work based on the information provided by the OP. ;-) – Craig Tullis Oct 12 '16 at 01:02
  • Your claim however that your answer is the only working is completely unfounded. The only information provided by the OP is the `[sql]` tag. Provided that both yours and Hambone's answer work. – ypercubeᵀᴹ Oct 12 '16 at 01:28
  • That would be true **if** all SQL implementations supported window functions/CTE's, but we know they don't. That's my only point. ;-) There was discussion of CTE's outperforming older-style aggregate functions. Anyway, I'm not trying to be super contrary here. Maybe I just feel like I got a lot of grief for providing a perfectly acceptable working solution. – Craig Tullis Oct 12 '16 at 02:13
  • Not all SQL implementations support correlated subqueries either. So what? I don't know why you feel you got a lot of grief. You have a nice answer (althoudh I'd prefer a `NOT EXISTS` instead of the `(COUNT ) < 1` subquery) and both yours and the other answer have a +1. The question should have been closed as "not clear" anyway, much sooner. Until the OP edits and adds what dbms they use. – ypercubeᵀᴹ Oct 12 '16 at 07:26
  • Using count instead of NOT EXISTS let's me easily pull the top n records from each partition. In general, I obviously prefer EXISTS, as well. – Craig Tullis Oct 12 '16 at 14:49
  • And it's a pretty rare SQL implementation that doesn't support correlated subqueries these days. :). I've seen SQL implementations that only implemented full Cartesian joins, too, but that's too rare to bother with here. But I did obviously notice the dbo references in the OP's query. The possibility that the OP is using a pre-CTE version of MSSQL or a Sybase product is much greater, and they did not specify. – Craig Tullis Oct 12 '16 at 14:59
-2

Not tested

with temp1 as
(
select foo
from bar
whre xy = MAX(xy)
)
                select PROP.FileName,PROP.InsName, PROP.Status,
                 PROP.FileTime, PROP.SubmissionNo, PROP.PolNo, 
                 PROP.EffDate,PROP.ExpDate, PROP.Region,  
                 PROP.Underwriter, PROP_DATA.Data , PROP_DATA.Label

                from Actuarial.dbo.PROP 
                    inner join temp1 t
                    on Actuarial.dbo.PROP.FileID = t.dbo.PROP_DATA.FileID

                    ORDER BY PROP.EffDate DESC 
Christian Bohli
  • 331
  • 1
  • 4
  • 15