-1

I want to assign a value in new_col based on value in column 'ind' when months = 1;

idnum1  months  ind new_col
1       1       X   X
1       2       X   X
1       3       Y   X
1       4       Y   X
1       5       X   X
2       1       Y   Y
2       2       Y   Y
2       3       X   Y
2       4       X   Y
2       5       X   Y

Below query just assign the value X where months = 1 but I want in all the rows of new_col for all the id -

create table tmp as
select t1.*, 
case when months = 1 then ind end as new_col
from table t1;

I am trying to do it in SAS using proc sql;

Richard
  • 25,390
  • 3
  • 25
  • 38
Swapnil
  • 25
  • 1
  • 9

4 Answers4

2

Ideally you would use RETAIN within a data step:

data want;
set have;

retain new_var;
if month=1 then new_var = ind;
run;

SQL isn't as good with this as a data step.

But assuming your variable ID is repeated then this would work. If it's not then you really do need the data step approach.

proc sql;
create table want as 
select *, max(ind) as new_col
from have
group by ID;
quit;

EDIT: If you want to retain the first per ID just use FIRST. instead of If month =1.

data want;
set have;
by ID;
retain new_var;

if first.id then new_var = ind;
run;
Reeza
  • 20,510
  • 4
  • 21
  • 38
  • Thanks for the data step solution. For it to work correctly - months should be in sorted order - in case let's say months column is having values < 0, otherwise first encounter of anything not months = 1 will have null values. @Reeza – Swapnil Aug 07 '20 at 17:12
  • No idea what that means. Does this not work? I see you updated your data sample, please post it as text. Solutions are untested because I'm not going to type out your data to test a solution. – Reeza Aug 07 '20 at 17:39
  • I mean with retain - it looks for the 1st entry in months column that = 1, if the 1st entry is not 1 - it will set the value of new_col for those rows as null - right? – Swapnil Aug 07 '20 at 18:12
  • Yes it will. This solution works for your shown data and use case. If it doesn't expand your sample data to be reflective of your actual data. And use FIRST. instead of if month=1. – Reeza Aug 07 '20 at 18:15
1

A robust Proc SQL statement that deals with possibly repeated first month situations that chooses the lowest ind to distribute to the group

data have; input
idnum1  months  ind $ new_col $; datalines;
1       1       X   X
1       2       X   X
1       3       Y   X
1       4       Y   X
1       5       X   X
2       1       Y   Y
2       2       Y   Y
2       3       X   Y
2       4       X   Y
2       5       X   Y
3       1       Z   .
3       1       Y   .
3       1       X   .
3       2       A   .
;

create table want as
select 
  have.idnum1, months, ind, new_col, lowest_first_ind
from
  have
join 
  ( select idnum1, min(ind) as lowest_first_ind from 
    (
    select idnum1, ind 
    from have
    group by idnum1
    having months = min(months)
    )
    group by idnum1
  ) value_seeker
on
  have.idnum1 = value_seeker.idnum1
;

enter image description here

Richard
  • 25,390
  • 3
  • 25
  • 38
0

You can use a window function:

select t1.*,
       max(case when months = 1 then ind end) over (partition by id) as new_col
from t1;
Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • 1
    Thanks but I am trying to do it in SAS using proc sql; and it doesn't support window functions. – Swapnil Aug 07 '20 at 15:44
0

If there is only one MONTH=1 observation per BY group then just use a simple join.

create table WANT as
  select t1.*,t2.ind as new_col 
  from table t1
  left join (select idnum1,ind from table where month=1) t2
    on t1.idnum1 = t2.idnum1
;
Tom
  • 47,574
  • 2
  • 16
  • 29
  • Yeah - i think this will also work out. I wrote it as - create table WANT as select t1.*,t2.ind as new_col from table t1, table t2 where t1.idnum1 = t2.idnum1 and t2.month=1 ; but, i think SQL joins will take more time when compared to DATA STEP; – Swapnil Aug 11 '20 at 18:33
  • 1
    @Swapnil Your the one that asked for SQL solution. Data step is easy if MONTH=1 is first observation per IDNUM1 group. A little more difficult if there might be earlier months for some of the INDNUM1 values. – Tom Aug 11 '20 at 18:41