0

I have a table_A. Sample data for table_A with name and dates

name date
Mark 29-APR-22
Dave 05-May-22

I would like to pass the name and date column of table_A to the following code as values. The whole code is running in SAS.

proc sql;
select table_A.name, table_A.date, count(distinct(table_B.shipments)) as cul_count
from table_B 
where table_B.names = table_A.name
and table_B.dates between (table_A.date and date());/*date () is to get today's date*/
quit;

I want to pass table_A values. Is there a way to do this without using joins? Trying to learn dynamic/parametric coding. And anyway with using joins? The expected output is as follows

name date cul_count
Mark 29-APR-22 65
Dave 05-May-22 102
Rogue258
  • 109
  • 1
  • 9

1 Answers1

1
  • Use a right join to limit the data to the 'right' table and then join on the names and specify that the dates should be after (including) the date in table A.

  • Alias make it simpler to read the code, ta/tb.

  • Table references to variables need to apply to the variable not the function (ie tb.shipments not tb.count).

  • For summations, you want to include a GROUP BY typically


proc sql;
select ta.name, ta.date, count(distinct(tb.shipments)) as cul_count
from table_B tb
right join table_A ta
on tb.name=ta.name
and ta.date >= tb.date
group by ta.name, ta.date;
quit;

EDIT: based on your comment
This is a manual version of the query you could build dynamically:

proc sql;
select tb.name, min(tb.date) as date, count(distinct(tb.shipments)) as cul_count
from table_B tb
where (name='Mark' and date>'29APR2022'd)
or (name = 'Dave' and date>'05May2022'd )
group by tb.name;
quit;

Here's my initial attempt at it. First you have to figure out how to manually build the part above, then you can use CALL EXECUTE to construct the code as desired. But the part above is key. Untested, so parentheses may be off, included for illustration purposes.

data _null_;
set table_A end=eof;

*runs only for first record;
if _n_=1 then do;
call execute('proc sql;
select tb.name, min(tb.date) as date, count(distinct(tb.shipments)) as cul_count
from table_B tb
where ');
end;

*runs for every record;
str = catt('(name = ', quote(name), ' and date > ' , date, ')');
call execute(str);

*runs for every record except for the last one;
if not  eof then call execute(' or ');

*runs only for last record;
if eof then call execute(' group by tb.name;quit;');

run;

Reeza
  • 20,510
  • 4
  • 21
  • 38
  • Anyway to do this without join? I wanted to learn how to pass the table_A values as parameters to table_B. I work with big data sets and I am trying to lessen the use of joins. – Rogue258 May 09 '22 at 17:07
  • 1
    With two fields, it's more difficult because you need to align dates/names. If it's just name then it's doable but you could do code generation via macro/call execute I suppose. – Reeza May 09 '22 at 19:23
  • 1
    Note that I had to add in the min(date) for the date calculation. Depending on the logic, adding in data from a different table could get complex. – Reeza May 09 '22 at 20:25