1

I have 3 datasets in SAS:

main_1

ID Rep Dose Response
1   2   34    567
1   1   45    756
2   1   35    456
3   1   56    345

main_2

ID Rep Hour Day
1   1   89  157
2   1   62  365
3   1   12  689

main_3

ID  Rep  Town   City
1   1    3268   151643
2   1    2574   165435
3   1    1254   135435
2   2    1563   135435

After importing the tables into SAS I currently have the following proc SQL code (I am learning so sorry of its terrible) to merge the tables so that each ID and Rep contains all relevant information. I cannot seem to merge all 3 tables effectively as main_1 does not include all possible ID and Rep combinations (indeed none of them do):

  PROC SQL; 
        create table merged_sql_2 as 
        select A.*,
               B.hour,
               B.day,
               C.Town,
               C.City
        from main_1 as A 
        LEFT JOIN main_2 as B 
            on A.ID = B.ID
            and A.Rep = B.Rep
        LEFT JOIN main_3 as C 
            on A.ID = C.ID
            and A.Rep = C.Rep;
    QUIT;

Produced the following with 1 observation missing:

   ID   Rep Dose    Response    Hour    Day   Town    City
    1    1   45       756        89     157   3268   151643
    1    2   34       567        -       -      -      -
    2    1   35       456        62     365   2574   165435
    3    1   56       345        12     689   1254   135435

I need to merge on all possible matches to achieve the following:

ID  Rep Dose    Response    Hour    Day   Town    City
1    1   45       756        89     157   3268   151643
1    2   34       567        -       -      -      -
2    1   35       456        62     365   2574   165435
2    2   -         -         -       -    1563   135435
3    1   56       345        12     689   1254   135435

If the code above is the complete wrong way top achieve this I am happy to learn other SQL alternatives!

Many thanks.

Jason Rogers
  • 667
  • 1
  • 6
  • 19

2 Answers2

1

You need to use FULL JOIN instead of LEFT JOIN

Try this:

PROC SQL; 
    create table merged_sql_2 as 
    select coalesce(a.id,b.id,c.id) as id,
           coalesce(a.rep,b.rep,c.rep) as rep,
           a.dose,
           a.response,
           B.hour,
           B.day,
           C.Town,
           C.City
    from main_1 as A 
    FULL JOIN main_2 as B 
        on A.ID = B.ID
        and A.Rep = B.Rep
    FULL JOIN main_3 as C 
        on A.ID = C.ID
        and A.Rep = C.Rep;
QUIT;

The COALESCE() function picks the first non-null value from the list. This lets you keep the key from any table that contributes a row.

DomPazz
  • 12,415
  • 17
  • 23
0

Using a full natural join also works:

proc sql;
    create table new as 
    select rep, dose, response, hour, day, town, city from 
    Main_1 natural full join Main_2 natural full join Main_3 
 ;
quit;
Todd
  • 46
  • 2