I have a dataset that contains 250 variables. I think some rows may be exact duplicates. If I only had 3 variables, I could run this code to check for dupes:
proc sql;
create table checkDupe as
select count(*) as N, *
from bigTable
group by 1, 2, 3
having N > 1;
quit;
However, with 250 variables, I don't want to type out group by 1, 2, 3, ..., 250.
The following group by statements don't work:
group by *
group by _ALL_
group by 1:250
Is there a concise way to group by all variables?
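For reference, here is a minimal sketch of two approaches that might get around this. It assumes bigTable lives in the WORK library and that its variable names are ordinary SAS names. The output table names uniqueRows and dupeRows and the macro variable allVars are hypothetical, chosen only for illustration. The first option leaves PROC SQL and uses PROC SORT, where by _all_ is accepted. The second stays in PROC SQL and builds the group by list from dictionary.columns.

```sas
/* Option 1: PROC SORT accepts _ALL_ in the BY statement.            */
/* NODUPKEY keeps one copy of each distinct row in uniqueRows;       */
/* DUPOUT= writes the extra copies to dupeRows.                      */
/* uniqueRows and dupeRows are hypothetical names.                   */
proc sort data=bigTable out=uniqueRows dupout=dupeRows nodupkey;
  by _all_;
run;

/* Option 2: build the column list once, then reuse it in GROUP BY.  */
/* Assumes bigTable is in WORK; LIBNAME and MEMNAME are stored in    */
/* uppercase in dictionary.columns.                                  */
proc sql noprint;
  select name into :allVars separated by ', '
  from dictionary.columns
  where libname = 'WORK' and memname = 'BIGTABLE';
quit;

proc sql;
  create table checkDupe as
  select count(*) as N, *
  from bigTable
  group by &allVars
  having count(*) > 1;
quit;
```

With the first option, an empty dupeRows table would indicate that there are no exact duplicate rows. With the second, checkDupe should hold one row per duplicated combination along with its count. Note that it would clash if bigTable already has a variable named N.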