0

This is sybase 15. Here's my problem.

I have 2 tables.

t1.jobid          t1.date
------------------------------
1                   1/1/2012
2                   4/1/2012
3                   2/1/2012
4                   3/1/2012

t2.jobid   t2.userid    t2.status
-----------------------------------------------
1              100            1
1              110            1
1              120            2
1              130            1
2              100            1
2              130            2
3              100            1
3              110            1
3              120            1
3              130            1
4              110            2
4              120            2

I want to find all the people who's status for THEIR two most recent jobs is 2.

My plan was to take the top 2 of a derived table that joined t1 and t2 and was ordered by date backwards for a given user. So the top two would be the most recent for a given user.

So that would give me that individuals most recent job numbers. Not everybody is in every job.

Then I was going to make an outer query that joined against the derived table searching for status 2's with a having a sum(status) = 4 or something like that. That would find the people with 2 status 2s.

But sybase won't let me use an order by clause in the derived table.

Any suggestions on how to go about this?

I can always write a little program to loop through all the users, but I was gonna try to make one horrendus sql out of it.

Juicy one, no?

stu
  • 8,461
  • 18
  • 74
  • 112

2 Answers2

1

You could rank the rows in the subquery by adding an extra column using a window function. Then select the rows that have the appropriate ranks within their groups.

I've never used Sybase, but the documentation seems to indicate that this is possible.

Matt Fenwick
  • 48,199
  • 22
  • 128
  • 192
1
With Table1 As
    (
    Select 1 As jobid, '1/1/2012' As [date]
    Union All Select 2, '4/1/2012'
    Union All Select 3, '2/1/2012'
    Union All Select 4, '3/1/2012'
    )
    , Table2 As
    (
    Select 1 jobid, 100 As userid, 1 as status
    Union All Select 1,110,1
    Union All Select 1,120,2
    Union All Select 1,130,1
    Union All Select 2,100,1
    Union All Select 2,130,2
    Union All Select 3,100,1
    Union All Select 3,110,1
    Union All Select 3,120,1
    Union All Select 3,130,1
    Union All Select 4,110,2
    Union All Select 4,120,2
    )
    , MostRecentJobs As
    (
    Select T1.jobid, T1.date, T2.userid, T2.status
        , Row_Number() Over ( Partition By T2.userid Order By T1.date Desc ) As JobCnt
    From Table1 As T1
        Join Table2 As T2
            On T2.jobid = T1.jobid
    )
Select *
From MostRecentJobs As M2
Where Not Exists    (
                    Select 1
                    From MostRecentJobs As M1
                    Where M1.userid = M2.userid
                        And M1.JobCnt <= 2
                        And M1.status <> 2
                    )
     And M2.JobCnt <= 2

I'm using a number of features here which do exist in Sybase 15. First, I'm using common-table expressions both for my sample data and clump my queries together. Second, I'm using the ranking function Row_Number to order the jobs by date.

It should be noted that in the example data you gave, no user satisfies the requirement of having their two most recent jobs both be of status "2".

__

Edit

If you are using a version of Sybase that does not support ranking functions (e.g. Sybase 15 prior to 15.2), then you need simulate the ranking function using Counts.

Create Table #JobRnks
    (
    jobid int not null
    , userid int not null
    , status int not null
    , [date] datetime not null
    , JobCnt int not null
    , Primary Key ( jobid, userid, [date] )
    )

Insert #JobRnks( jobid, userid, status, [date], JobCnt )    
Select T1.jobid, T1.userid, T1.status, T1.[date], Count(T2.jobid)+ 1 As JobCnt
From    (
        Select T1.jobid, T2.userid, T2.status, T1.[date]
        From @Table2 As T2
            Join @Table1 As T1
                On T1.jobid = T2.jobid
        ) As T1
    Left Join   (
                Select T1.jobid, T2.userid, T2.status, T1.[date]
                From @Table2 As T2
                    Join @Table1 As T1
                        On T1.jobid = T2.jobid
                ) As T2
        On T2.userid = T1.userid
            And T2.[date] < T1.[date]
Group By T1.jobid, T1.userid, T1.status, T1.[date]

Select *
From #JobRnks As J1
Where Not Exists    (
                    Select 1
                    From #JobRnks As J2
                    Where J2.userid = J1.userid
                        And J2.JobCnt <= 2
                        And J2.status <> 2
                    )
    And J1.JobCnt <= 2

The reason for using the temp table here is for performance and ease of reading. Technically, you could plug in the query for the temp table into the two places used as a derived table and achieve the same result.

Thomas
  • 63,911
  • 12
  • 95
  • 141
  • I was thinking of using @@identity to have something to sort on/search max on, but row count is good too. and the partition, good idea there. – stu Apr 27 '12 at 18:51
  • Two problems: 1) I'm getting syntax error near the word over, and when I run your example verbatim, I get syntax error on 'with'. I'll keep playing, but thanks for the ideas. – stu Apr 27 '12 at 18:52
  • ahh, it seems my version of sybase doesn't support row_number() – stu Apr 27 '12 at 19:06
  • @stu - The documentation for Sybase 15 (http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc38151.1520/html/iqrefbb/iqrefbb262.htm) shows that it does support Row_Number. Are you sure you are using Sybase 15? – Thomas Apr 28 '12 at 04:31
  • I'm inclined to agree with you that it is supported. Yet it doesn't work when I try it. I asked my dba (yes sybase 15) and they found a knowledgebase article offering unhelpful workarounds. I think there is more than one thing named sybase 15. – stu Apr 29 '12 at 12:48
  • 1
    @stu - Which exact version of Sybase are you using? I believe that ranking functions were added in 15.2. – Thomas Apr 29 '12 at 16:44
  • ahhhh. that explains it. Silly me, I thought sybase 15 was sybase 15. Didn't know they added major features in minor release versions. – stu Apr 30 '12 at 18:09
  • I am running... Adaptive Server Enterprise/15.0.3/EBF 17156 ESD#3/P/Sun_svr4/OS 5.8/ase1503/2726/64-bit/FBO/Fri Feb 5 05:26:23 2010 – stu Apr 30 '12 at 18:09
  • @stu - Without ranking functions, the problem is obviously significantly more difficult. There is still a way to simulate the Row_Number function, but performance will suffer. I'll have to look at it again tonight. – Thomas Apr 30 '12 at 23:52
  • @stu - What is the primary key of the second table? – Thomas May 01 '12 at 04:54
  • the first two columns, job_id and the user_id. – stu May 01 '12 at 19:00
  • but I think we've all answered my question here. At this point, without windowing functions it becomes easier to write a program and not try and do it all in one sql. My example is a simple case of what I actually need to do (which by the way includes getting three most recent dates, not two) so while it might be a fun exercise to work out, the actual run would cause more grief on the database than writing a program would, so I'll just end up doing that. Thanks for all your help. – stu May 01 '12 at 19:01
  • A friend of mine came up with a simpler solution I'm working on not sure if it will be viable though. He does a correlated query to compare the t2.date to (select max(t4.date) where t2.user_id = t4.user_id). (t3 and t4 are derived tables) That gets you the most recent date, then you say "or t2.date = ..." and then you do another 2 subqueries to get the max(t6.date) where the t6.date < t4.date so it gets you the biggest date less than the first biggest date. I have to extend that to a bunch more subqueries to get the third most recent date. – stu May 01 '12 at 19:04
  • needless to say, this is a lot of pain on the database as well. – stu May 01 '12 at 19:04