1

I have the following query that produces multiple records for a single id. I'm trying to figure out how to merge these multiple records into one record:

SELECT DISTINCT id, gender, dateofbirth, city, state, zip
FROM t

This may give me the following resultset:

1, M, 2000-01-01, dallas, tx, 12345
1, M, 2000-01-01, NULL, NULL, NULL

What I want is a single record:

1, M, 2000-01-01, dallas, tx, 12345

A similar scenario occurs when the second row has different data:

1, M, 2000-01-01, dallas, tx, 12345
1, M, 2000-01-01, houston, tx, 67890

In this case I would just want to pick one of the records and ignore the other in order to only have a single record per id.

Is there a way to do this in PostgreSQL? I've tried coalesce to no avail and is wondering if there is some way to handle this.

BDad
  • 95
  • 1
  • 5
  • So you have no primary key in that table, no unique column? – michaJlS Oct 14 '16 at 22:51
  • 1
    What do you expect if there are two more rows (in addition to the ones you have shown) with `1, M, 2000-01-01, NULL, NULL, 45678` and `1, M, 2000-01-01, NULL, 'wa', NULL`. Do you always have at least one row where all columns are NOT NULL? –  Oct 14 '16 at 23:25

2 Answers2

1

The query below appears to be working, at least for your sample data. Have a look at the Fiddle below for a demo. I used MySQL, because Fiddle tends to break for any other database type.

SELECT t1.*
FROM yourTable t1
INNER JOIN
(
    SELECT id, MAX(city || ', ' || state || ', ' || zip) AS location
    FROM yourTable
    GROUP BY id
) t2
    ON t1.id = t2.id AND
       t1.city || ', ' || t1.state || ', ' || t1.zip = t2.location

SQLFiddle

The trick I use is to concatenate the city, state, and zip into a single string and then choose the max value for group of id values. This would work assuming that it is not possible to somehow form the same string from two different addresses. I think this would hold true for your US address format.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • The only issue with such approach, is that it may mix data from different rows. – michaJlS Oct 14 '16 at 22:51
  • @michaJlS You're spot on, I posted too fast. Let's see if I can fix this before Gordon Linoff arrives. – Tim Biegeleisen Oct 14 '16 at 22:52
  • If I remember, in MySQL you are able to not apply any aggregate func to the field (http://stackoverflow.com/questions/2197693/any-reason-for-group-by-clause-without-aggregation-function), but I have no idea if that works the same way in psql, neither what it really returns then. – michaJlS Oct 14 '16 at 22:56
0
Select * from 
(select a.*, row_number () over (partition by id order by city desc)rn
From tablename a
) b where rn=1

The above query will always return one record for an id. If there are multiple values for city, one with a value and one with null, the query will get the record which has a value for city.

Bagavathi
  • 438
  • 2
  • 7
  • 17
  • I like it - that partition logic always throws me for a loop (no pun intended), but I got this one working as expected. Thanks very much for the help. – BDad Oct 17 '16 at 23:02