SELECT DISTINCT ON in SAS PROC SQL

Question

How can I select ALL ROWS and where A ROW is existent more then once, (based on ID) just take one (doesn't matter which one).

So I have:

ID name val
===========
1 | aa | 32
2 | aa | 32
3 | ad | 32
3 | zy | 11
4 | rr | 21
5 | oi | 44
5 | df | 11

result should be

ID name val
===========
1 | aa | 32
2 | aa | 32
3 | zy | 11
4 | rr | 21
5 | df | 11

It doesn't matter if 5 | df | 11 or 5 | oi | 44 is taken.

On Postgres, SELECT DISTINCT ON ID would be my choice.

Many thanks in advance

Does it *have* to be done using SQL? SQL is not really suited to this kind of task. SAS provides other ways to do it which are much easier. — Robert Penridge, Sep 10 '15 at 16:55

score 2 · Accepted Answer · answered Sep 10 '15 at 17:10

The most convenient way to do it will be using Proc Sort:

PROC SORT DATA=HAVE OUT=WANT NODUPKEY;
BY ID;
RUN;

Otherwise if it is presorted, then @Reeza's solution is nice. If you have to use Proc SQL, then an undomented feature may need to be implemented: monotonic():

PROC SQL;
CREATE TABLE WANT AS
SELECT * FROM HAVE
GROUP BY ID
HAVING MONOTONIC()=MIN(MONOTONIC());
QUIT;

score 1 · Answer 2 · answered Sep 10 '15 at 15:55

1

A SAS data step answer, using First/Last processing

data want;
  set have;
  by id;
  if first.id;
 run;

answered Sep 10 '15 at 15:55

Reeza

20,510
4
21
38

Your previous SQL answer worked nicely, as I'm new to SQL, could you explain me, why the `MAX()` function in the String works in combination with `GROUP BY`? Actually this gave me a great solution and improvement for my project! Thanks!! – G-M Sep 10 '15 at 16:00

score 0 · Answer 3 · answered Sep 10 '15 at 16:07

0

With proc sql:

proc sql;

select id, max(name) as maxName, val
from work.table
group by id, val;

answered Sep 10 '15 at 16:07

rwking

1,032
6
18

SELECT DISTINCT ON in SAS PROC SQL

3 Answers3