2

I need to select distinct values from multiple columns in an h2 database so I can have a list of suggestions for the user based on what is in the database. In other words, I need something like

SELECT DISTINCT a FROM table
SELECT DISTINCT b FROM table
SELECT DISTINCT c FROM table

in one query. In-case I am not clear enough, I want a query that given this table (columns ID, thing, other, stuff)

0 a 5 p
1 b 5 p
2 a 6 p
3 c 5 p

would result in something like this

a 5 p
b 6 -
c - -

where '-' is an empty entry.

Andriy M
  • 76,112
  • 17
  • 94
  • 154
BrainStorm.exe
  • 1,565
  • 3
  • 23
  • 40
  • 1
    Please specify the RDBMS that you are targeting by adding the appropriate tag (Oracle, SQL Server, MySQL, etc.). There may be answers that take advantage of language or product features that are not universally supported. Also, by tagging it with a specific RDBMS, your question may receive attention from people better suited to answer it – Taryn Mar 19 '13 at 20:54
  • possible duplicate of [Selecting distinct values for multiple columns](http://stackoverflow.com/questions/5196194/selecting-distinct-values-for-multiple-columns) – Andriy M Mar 19 '13 at 21:36
  • Generic doesn't help anyone because even the "standard" syntax is likely different between platforms, never mind the most efficient path which will probably take advantage of platform-specific extensions. – Aaron Bertrand Mar 19 '13 at 22:33
  • Understood, I guess I'll tailor the question to h2 then. – BrainStorm.exe Mar 19 '13 at 22:38
  • The tags already cover the fact that your question is about H2, [you don't need to put the name in the title](http://meta.stackexchange.com/questions/19190/should-questions-include-tags-in-their-titles/130208#130208). – Andriy M Mar 19 '13 at 22:47

1 Answers1

2

This is a bit complicated, but you can do it as follows:

select max(thing) as thing, max(other) as other, max(stuff) as stuff
from ((select row_number() over (order by id) as seqnum, thing, NULL as other, NULL as stuff
       from (select thing, min(id) as id from t group by thing
            ) t
      ) union all
      (select row_number() over (order by id) as seqnum, NULL, other, NULL
       from (select other, min(id) as id from t group by other
            ) t
      ) union all
      (select row_number() over (order by id) as seqnum, NULL, NULL, stuff
       from (select stuff, min(id) as id from t group by stuff
            ) t
      )
     ) t
group by seqnum

What this does is assign a sequence number to each distinct value in each column. It then combines these together into a single row for each sequence number. The combination uses the union all/group by approach. An alternative formulation uses full outer join.

This version uses the id column to keep the values in the same order as they appear in the original data.

In H2 (which was not originally on the question), you can use the rownum() function instead (documented here). You may not be able to specify the ordering however.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786