
I have the following two SAS datasets:

data have ;
 input a b;
cards;
1 15
2 10
3 40
4 200
1 25
2 15
3 10
4 75
1 1
2 99
3 30
4 100
;

data ref ;
 input x y;
cards;
1 10
2 20
3 30
4 100
;

I would like to have the following dataset:

data want ;
 input a b outcome ;
cards;
1 15 0
2 10 1
3 40 0
4 200 0
1 25 0
2 15 1
3 10 1
4 75 1
1 1 1
2 99 0
3 30 1
4 100 1
;

I would like to create a variable 'outcome' using an if statement on the variables a, b, x and y: outcome should be 1 when b is less than or equal to the y value of the 'ref' row where x = a, and 0 otherwise. Because the real 'have' dataset is extremely large, I would like to avoid sorting it and merging the two datasets (on a = x).

I am trying to use macro variables with the following code:

data _null_ ;
set ref ;
  call symput('listx', x) ;
  call symput('listy', y) ;
run ;

data want ;
set have ;
if a=&listx and b le &listy then outcome = 1 ; else outcome = 0 ;
run ;

However, this does not produce the desired result shown above.

user2568648

1 Answer

I have redone my solution using hash tables. My approach is below:

data ref2(rename=(x=a));
set ref ;
run;

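data want;
    /* Hash table keyed on a, holding the threshold y from ref2 */
    declare hash plan();
    rc = plan.defineKey('a');   /* x originally */
    rc = plan.defineData('a', 'y');
    rc = plan.defineDone();

    /* Load every record from ref2 into the hash table */
    do until (eof1);
        set ref2 end=eof1;
        rc = plan.add();
    end;

    /* Read have once, in its original order; no sort or merge needed */
    do until (eof2);
        set have end=eof2;
        call missing(y);        /* clear y left over from the previous row */
        rc = plan.find();       /* rc = 0 when a is found; y is then filled in */
        outcome = (rc = 0 and b <= y);   /* le, as in the question's condition */
        output;
    end;
    stop;
    drop rc y;   /* keep only a, b and outcome, as in the desired dataset */
run;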
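
A slightly more compact variant is sketched here, not as a tested replacement. It assumes SAS 9.2 or later, where the hash object's `dataset:` argument accepts data set options, so the rename from x to a can happen while the table is loaded and no intermediate ref2 step is needed. The comparison uses `<=` to match the `le` in the question.

```sas
data want;
    /* Put y in the program data vector at compile time; no rows are read */
    if 0 then set ref(keep=y);

    /* Load ref into the hash once, renaming x to a so the key matches have */
    if _n_ = 1 then do;
        declare hash plan(dataset: "ref(rename=(x=a))");
        plan.defineKey('a');
        plan.defineData('y');
        plan.defineDone();
    end;

    set have;                 /* single sequential pass, original order kept */
    call missing(y);          /* y is retained, so clear the previous value */
    rc = plan.find();         /* rc = 0 when a exists in ref */
    outcome = (rc = 0 and b <= y);
    drop rc y;
run;
```

Either way, only the small ref table is held in memory, and the large have dataset is read once without sorting.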

Hope it helps.
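
As for why the macro-variable attempt in the question does not work: the `data _null_` step calls `call symput` once for every row of ref, so each row overwrites the previous value and only the last row (x=4, y=100) is left in `listx` and `listy`. A small check follows. It uses `call symputx` instead, which also strips the leading blanks that `call symput` leaves when it is given a numeric value.

```sas
data _null_;
    set ref;
    /* Each iteration overwrites the macro variables set by the previous row */
    call symputx('listx', x);
    call symputx('listy', y);
run;

/* Writes the surviving values to the log: those of the last row of ref */
%put listx=&listx listy=&listy;
```

With the ref data from the question, the if statement therefore tests `a=4 and b le 100` for every row of have, not the threshold that belongs to each value of a.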

Altons
  • Oops, I misread your `if a=&listx and b le &listy then outcome = 1 ; else outcome = 0 ;` so my solution does not quite work. I think `hash` tables are a better solution. Give me some time and I will be back with a solution. – Altons Aug 10 '16 at 09:19
  • Added a new solution using hash tables. If it works, please accept my answer. – Altons Aug 10 '16 at 09:37