3

I have two tables table which are identical in structure but belong to different schemas (schemas A and B). All rows in question will always appear in the A.table but may or may not appear in B.table. B.table is essentially an override for the defaults in A.table.

As such my query uses a COALESCE on each field similar to:

SELECT COALESCE(B.id, A.id) as id,
       COALESCE(B.foo, A.foo) as foo,
       COALESCE(B.bar, A.bar) as bar
FROM A.table LEFT JOIN B.table ON (A.id = B.id)
WHERE A.id in (1, 2, 3)

This works great, but I also want to add the source of the data. In the example above, assuming id=2 existed in B.table but not 1 or 3, I would want to include some indication that A is the source for 1 and 3 and B is the source for 2.

So the data might look like the following

+---------------------------------+
|  id  |  foo  |  bar  |  source  |
+---------------------------------+
|   1  |    a  |    b  |       A  |
|   2  |    c  |    d  |       B  |
|   3  |    e  |    f  |       A  |
+---------------------------------+

I don't really care what the value of source is as long as I can distinguish A from B.

I am no pgsql expert (not by a long shot) but I have tinkered around with EXISTS and a subquery but have had no luck so far.

Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
sberry
  • 128,281
  • 18
  • 138
  • 165

2 Answers2

3

As records showing the default value (from A.table) have NULLs for B.id, all you need is to add this column specification to your query:

CASE WHEN B.id IS NULL THEN 'A' ELSE 'B' END AS Source
Terje D.
  • 6,250
  • 1
  • 22
  • 30
1

The USING clause would simplify the query you have:

SELECT id
     , COALESCE(B.foo, A.foo) AS foo
     , COALESCE(B.bar, A.bar) AS bar
     , CASE WHEN b.id IS NULL THEN 'A' ELSE 'B' END AS source  -- like @Terje provided
FROM   a
LEFT   JOIN b USING (id)
WHERE  a.id IN (1, 2, 3);

But typically, this alternative query should serve you better:

SELECT x.*  --  or list columns of your choice
FROM  (VALUES (1), (2), (3)) t (id)
     , LATERAL (
   SELECT *, 'B' AS source FROM b WHERE id = t.id
   UNION ALL
   SELECT *, 'A'           FROM a WHERE id = t.id
   LIMIT 1
   ) x
ORDER  BY x.id;

Advantages:

  • You don't have to add another COALESCE construct for every column you want to add to the result.
  • The same query works for any number of columns in a and b.
  • The query even works if the column names are not identical. Only number and data types of columns must match. Of course, you can always list selected, compatible columns as well:

    SELECT *  --  or list columns of your choice
    FROM  (VALUES (1), (2), (3)) t (id)
         , LATERAL (
       SELECT foo, bar, 'B' AS source FROM b WHERE id = t.id
       UNION ALL
       SELECT foo2, bar17, 'A'        FROM a WHERE id = t.id
       LIMIT 1
       ) x
    ORDER  BY x.id;

    The first SELECT determines names, data types and number of columns.

  • This query doesn't break if columns in b are not defined NOT NULL.
    COALESCE cannot tell the difference between b.foo IS NULL and no row with matching id in b. So the source of any result column (except id) can still be 'A', even if the result row says 'B' - if any relevant column in b can be NULL.
    My alternative returns all values from b if the row exists - including NULL values. So the result can be different if columns in b can be NULL. It depends on your requirements which behavior is desirable.

Either query assumes that id is defined as primary key (so exactly 1 or 0 rows per given id value).

Related:

Community
  • 1
  • 1
Erwin Brandstetter
  • 605,456
  • 145
  • 1,078
  • 1,228
  • Thanks for the answer and thorough breakdown. As it turns out the tables don't have the same number of columns so I guess the second one won't work. Interesting read regardless. – sberry Aug 16 '16 at 05:09
  • @sberry: The 2nd one works for that, too. You just can't take the syntax shortcut with `SELECT *` in the LATERAL query. I added an example and some more explanation. – Erwin Brandstetter Aug 16 '16 at 11:14