3

I have 2 similar queries which both work on the same table, and I essentially want to combine their results such that the second query supplies default values for what the first query doesn't return. I've simplified the problem as much as possible here. I'm using Oracle btw.

The table has account information in it for a number of accounts, and there are multiple entries for each account with a commit_date to tell when the account information was inserted. I need get the account info which was current for a certain date.

The queries take a list of account ids and a date.

Here is the query:

-- Select the row which was current for the accounts for the given date. (won't return anything for an account which didn't exist for the given date)
SELECT actr.*
FROM Account_Information actr 
WHERE actr.account_id in (30000316, 30000350, 30000351) 
AND actr.commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ')
AND actr.commit_date = 
(
    SELECT MAX(actrInner.commit_date) 
    FROM Account_Information actrInner 
    WHERE actrInner.account_id = actr.account_id
    AND actrInner.commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ')
) 

This looks a little ugly, but it returns a single row for each account which was current for the given date. The problem is that it doesn't return anything if the account didn't exist until after the given date.

Selecting the earliest account info for each account is trival - I don't need to supply a date for this one:

-- Select the earliest row for the accounts.
SELECT actr.*
FROM Account_Information actr 
WHERE actr.account_id in (30000316, 30000350, 30000351) 
AND actr.commit_date = 
(
    SELECT MAX(actrInner .commit_date) 
    FROM Account_Information actrInner 
    WHERE actrInner .account_id = actr.account_id
)  

But I want to merge the result sets in such a way that:

For each account, if there is account info for it in the first result set - use that. Otherwise, use the account info from the second result set.

I've researched all of the joins I can use without success. Unions almost do it but they will only merge for unique rows. I want to merge based on the account id in each row.

Sql Merging two result sets - my case is obviously more complicated than that

SQL to return a merged set of results - I might be able to adapt that technique? I'm a programmer being forced to write SQL and I can't quite follow that example well enough to see how I could modify it for what I need.

Community
  • 1
  • 1
Scott
  • 33
  • 4

3 Answers3

5

The standard way to do this is with a left outer join and coalesce. That is, your overall query will look like this:

SELECT ...
FROM defaultQuery
LEFT OUTER JOIN currentQuery ON ...

If you did a SELECT *, each row would correspond to the current account data plus your defaults. With me so far?

Now, instead of SELECT *, for each column you want to return, you do a COALESCE() on matched pairs of columns:

SELECT COALESCE(currentQuery.columnA, defaultQuery.columnA) ...

This will choose the current account data if present, otherwise it will choose the default data.

anon
  • 4,578
  • 3
  • 35
  • 54
  • Ah, thank you for the easy to follow explanation. I thought I should be able to do it with a join but couldn't figure out how to stop it doubling up on the columns. I haven't used coalesce before and that seems to be the missing piece of the puzzle. I'm off to read up on it now! – Scott Mar 07 '13 at 05:33
1

You can do this more directly using analytic functions:

select *
from (SELECT actr.*, max(commit_date) over (partition by account_id) as maxCommitDate,
             max(case when commit_date <= to_date( '2010-DEC-30','YYYY-MON-DD ') then commit_date end) over
                   (partition by account_id) as MaxCommitDate2
      FROM Account_Information actr 
      WHERE actr.account_id in (30000316, 30000350, 30000351)
     ) t
where (MaxCommitDate2 is not null and Commit_date = MaxCommitDate2) or
      (MaxCommitDate2 is null and Commit_Date = MaxCommitDate) 

The subquery calculates two values, the two possibilities of commit dates. The where clause then chooses the appropriate row, using the logic that you want.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • That's an interesting way to do it, combining both the queries and then removing the unwanted rows. – Scott Mar 07 '13 at 05:47
  • @Scott . . . It seemed to me that this would be more efficient and maintainable than dealing with two separate but similar queries. – Gordon Linoff Mar 07 '13 at 11:47
  • I agree completely. The queries are actually in hsql since we're using hibernate, and the query builder that's being used is... not pretty. There's only so much I can change without breaking everything else. If I had my way I would just be using hibernate to do the data binding on result sets from stored procs in the database. I'm certainly no SQL expert and COALESCE was enough to get me where I wanted to go. – Scott Mar 08 '13 at 04:37
  • @Scott . . . In the future, you should tag your questions with Hibernate if that is what you are using to generate the queries. – Gordon Linoff Mar 08 '13 at 11:56
1

I've combined the other answers. Tried it out at apex.oracle.com. Here's some explanation.

MAX(CASE WHEN commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD')) will give us the latest date not before Dec 30th, or NULL if there isn't one. Combining that with a COALESCE, we get COALESCE(MAX(CASE WHEN commit_date <= to_date('2010-DEC-30', 'YYYY-MON-DD') THEN commit_date END), MAX(commit_date)).

Now we take the account id and commit date we have and join them with the original table to get all the other fields. Here's the whole query that I came up with:

SELECT *
FROM Account_Information
JOIN (SELECT account_id,
             COALESCE(MAX(CASE WHEN commit_date <=
                                    to_date('2010-DEC-30', 'YYYY-MON-DD')
                               THEN commit_date END),
                      MAX(commit_date)) AS commit_date
      FROM Account_Information
      WHERE account_id in (30000316, 30000350, 30000351)
      GROUP BY account_id)
  USING (account_id, commit_date);

Note that if you do use USING, you have to use * instead of acrt.*.

Dave Johnson
  • 411
  • 3
  • 9
  • Very tidy looking query, I knew my original queries were not the best way to filter to just 1 row per account id. I think it's pretty clear I need to read up on coalesce, it looks incredibly useful! – Scott Mar 07 '13 at 05:43