0

I'm trying to functionise some of my SAS operations

I have a bunch of tables which all have the same index (id) and the same number of rows

Each time I call the function that I want to write, the number of tables can differ. I intend to store the names of the tables in a dataset to loop over them. Sometimes there will only be 3 tables, but sometimes there could be hundreds

How do I write an automatic join to all the tables, but only keep one column of the "ID" - the ID exists in all tables so I would like to drop them all apart from the first table I join

Joe
  • 62,789
  • 6
  • 49
  • 67
shecode
  • 1,716
  • 6
  • 32
  • 50
  • I'm assuming you're referring to a macro here. If you post your macro, we can make suggestions but as your question is phrased there are too many possible answers. If you're using a Data Step vs PROC SQL the answers will differ. – Reeza Feb 14 '16 at 21:10
  • ok - i realise my question is too generic. it would be proc sql rather than data step – shecode Feb 14 '16 at 21:14
  • My next question - are you sure you need a join rather than an append. Long data is easier to automate than wide data. And you can always transpose it afterwards if required. – Reeza Feb 14 '16 at 21:15
  • how would you append - if i set all of them together they would stack together but they all have different columns – shecode Feb 14 '16 at 21:17
  • If the columns are different then probably not :( – Reeza Feb 14 '16 at 21:17
  • Are you assuming that you're going to bring in all columns other than ID from each table? – Reeza Feb 14 '16 at 21:18
  • if the id column contains a unique set of id's i.e. no ids are repeated in a given table and if the columns other then id do not overlap between tables then this is pretty easy to do. Tell us more about the data – DCR Feb 15 '16 at 01:33
  • @DCR the tables are all the same number of rows, have the same id (named the same) but each has different numbers of columns. very simple datasets. I may end up with over 100 datasets though, and won't always be joining all of them – shecode Feb 15 '16 at 03:31

2 Answers2

0

If you want PROC SQL to figure out the key variable(s) for you then use the NATURAL JOIN operation. If some id's will appear in some datasets and not others then use NATURAL FULL JOIN.

data a (keep=id a1 a2) b(keep=id b1 b2) c(keep=id c1 c2) ;
  input id a1 a2 b1 b2 c1 c2 ;
  if id ne 1 then output a;
  if id ne 2 then output b;
  if id ne 3 then output c;
cards;
1 1 2 3 4 5 6
2 7 8 9 10 11 12
3 13 14 15 16 17 18 
;;;;
proc sql noprint ;
  create table want as
  select * from c natural full join 
   (select * from b natural full join a)
  ;
quit;

Which yields:

enter image description here

Tom
  • 47,574
  • 2
  • 16
  • 29
0

I think the following will do what you're looking. Look it over and if you have any questions post back:

 data test.table1;                                                                                                                      
 ID = 1;                                                                                                                                
 A=7;                                                                                                                                   
 B=8;                                                                                                                                   
 OUTPUT;                                                                                                                                
 ID = 2;                                                                                                                                
 A=9;                                                                                                                                   
 B=10;                                                                                                                                  
 OUTPUT;                                                                                                                                
 ID = 3;                                                                                                                                
 A=11;                                                                                                                                  
 B=12;                                                                                                                                  
 OUTPUT;                                                                                                                                
 RUN;                                                                                                                                   


 data test.table2;                                                                                                                      
 ID = 1;                                                                                                                                
 C=9;                                                                                                                                   
 D=10;                                                                                                                                  
 OUTPUT;                                                                                                                                
 ID = 2;                                                                                                                                
 C=11;                                                                                                                                  
 D=12;                                                                                                                                  
 OUTPUT;                                                                                                                                
 ID = 3;                                                                                                                                
 C=13;                                                                                                                                  
 D=14;                                                                                                                                  
 OUTPUT;                                                                                                                                
 RUN;                                                                                                                                   


 data test.table3;                                                                                                                      
 ID = 1;                                                                                                                                
 E=11;                                                                                                                                  
 F=12;                                                                                                                                  
 OUTPUT;                                                                                                                                
 ID = 2;                                                                                                                                
 E=13;                                                                                                                                  
 F=14;                                                                                                                                  
 OUTPUT;                                                                                                                                
 ID = 3;                                                                                                                                
 E=15;                                                                                                                                  
 F=16;                                                                                                                                  
 OUTPUT;                                                                                                                                
 RUN;                                                                                                                                   


proc sql noprint;                                                                                                                       
    create table test.names_tables as                                                                                                   
    select distinct libname,memname                                                                                                     
    from sashelp.vcolumn                                                                                                                
    where libname = "TEST"                                                                                                              
    and substr(memname,1,5)="TABLE";                                                                                                    
quit;                                                                                                                                   

proc sql noprint;                                                                                                                       
    select count(*) into: cnt                                                                                                           
    from test.names_tables;                                                                                                             
quit;                                                                                                                                   

%let cnt = &cnt;                                                                                                                        

%macro jmac;                                                                                                                            


proc sql noprint;                                                                                                                       
     select libname,memname into: lib, :table                                                                                           
     from test.names_tables                                                                                                             
     where monotonic() = 1;                                                                                                             
quit;                                                                                                                                   

%let lib = &lib;                                                                                                                        
%let table = &table;                                                                                                                    

%let fname = &lib..&table;                                                                                                              

data test.tablejoin;                                                                                                                    
set &fname;                                                                                                                             
run;                                                                                                                                    

%let fname = test.tablejoin;                                                                                                            

%do i=2 %to &cnt;                                                                                                                       
proc sql noprint;                                                                                                                       
     select libname,memname into: lib1, :table1                                                                                         
     from test.names_tables                                                                                                             
     where monotonic() = &i;                                                                                                            
quit;                                                                                                                                   

%let fname1 = &lib1..&table1;                                                                                                           


proc sql noprint;                                                                                                                       
     create table &fname as                                                                                                             
     select a.*,b.*                                                                                                                     
     from &fname as a                                                                                                                   
     full join                                                                                                                          
          &fname1 as b                                                                                                                  
     on a.id = b.id;                                                                                                                    
quit;                                                                                                                                   


%end;                                                                                                                                   

%mend jmac;                                                                                                                             
%jmac;                                                                                                                                  
DCR
  • 14,737
  • 12
  • 52
  • 115