0

The following is a macro I wrote to left out join two datasets.

/*
* Macro to add the t-rate to the data
*/
%MACRO RFRATE(WITHOUT_RATE, WITH_RATE);
/*read the tbill rate data*/
data TBRATE;
  INFILE "T-RATE/T-RATE-FORMATTED.csv"
  DELIMITER = "," 
  MISSOVER DSD 
  FIRSTOBS=2
  LRECL=32767;
  INPUT DATE one_M three_M six_M one_Y two_Y three_Y five_Y seven_Y ten_Y twenty_Y thirty_Y;
  format DATE yymmddn8.;
  length Date 4;
run;

data &WITH_RATE;
  merge &WITHOUT_RATE(IN=A) TBRATE(IN=B);
  by Date;
  if A;
run;

/*I have also tried the following way (SQL)*/
/*
PROC SQL;
  Create table &WITH_RATE as
  Select a.*,b.*
  from
    &WITHOUT_RATE a
  left join
    TBRATE b
  on a.Date=b.Date;
quit;
*/

proc export data=&WITH_RATE (obs=99999)
  outfile= 'samplesmall.csv'
  dbms=CSV REPLACE;
  putname=YES;
run;

proc contents data= TBRATE  position; 
run;

proc contents data= &WITHOUT_RATE  position; 
run;

%mend;

The TBRATE is some data similar to the following format:

| Date     | Rate     |
|----------|----------|
| 20120101 | 1.0      | 
| 20120102 | 1.5      | 
| 20120103 | 1.5      |
| 20120104 | 1.3      |
| 20120105 | 1.1      |

The WITHOUT_RATE is some data similar to the following:

| Date     | Other Data    |
|----------|---------------|
| 20120101 | 7.0           | 
| 20120101 | 3.5           | 
| 20120101 | 4.5           |
| 20120101 | 2.3           |
| 20120101 | 11.1          |
| 20120102 | 23.0          | 
| 20120102 | 12.5          | 
| 20120102 | 12.5          |
| 20120102 | 11.3          |
| 20120102 | 11.1          |

I wanted to create something similar to the following:

| Date     | Other Data    |   Rate  |
|----------|---------------|----------
| 20120101 | 7.0           | 1.0     | 
| 20120101 | 3.5           | 1.0     |
| 20120101 | 4.5           | 1.0     |
| 20120101 | 2.3           | 1.0     |
| 20120101 | 11.1          | 1.0     |
| 20120102 | 23.0          | 1.5     |
| 20120102 | 12.5          | 1.5     |
| 20120102 | 12.5          | 1.5     |
| 20120102 | 11.3          | 1.5     |
| 20120102 | 11.1          | 1.5     |

The format of the date variable in WITHOUT_RATE is:

 Variable      Type    Len    Format       Label
 DATE          Num       4    YYMMDDN8.    Quote date

The format of the date variable in TBRATE is:

 Variable    Type    Len    Format
 DATE        Num       4    YYMMDDN8.

It seems that they are the same.

However, the code keep producing null results:

| Date     | Other Data    |   Rate  |
|----------|---------------|----------
| 20120101 | 7.0           |         | 
| 20120101 | 3.5           |         |
| 20120101 | 4.5           |         |
| 20120101 | 2.3           |         |
| 20120101 | 11.1          |         |
| 20120102 | 23.0          |         |
| 20120102 | 12.5          |         |
| 20120102 | 12.5          |         |
| 20120102 | 11.3          |         |
| 20120102 | 11.1          |         |

Therefore, I was wondering where did I go wrong?

Update #2

The code used to call the macro: 

/*The output dataset*/
%NBBO(20130102, tempoutputset);
%NBBOReturn(tempoutputset, NBBODATA);
%RFRATE(NBBODATA, RFRATEDATA);
FlipperPA
  • 13,607
  • 4
  • 39
  • 71
Jinhua Wang
  • 1,679
  • 1
  • 17
  • 44
  • Macro is text replacement. Literally take your macro code and substitute it in and you'll see the errors because you won't have SAS syntax right away. – Reeza May 22 '17 at 19:08
  • Why do you have both a `SET` and a `MERGE` statement? With your posted example data that will mean that the rates for 03JAN2012 and beyond do not make it into the output since the data step will stop when the SET statement reads past the end of the WITHOUT dataset. – Tom May 22 '17 at 21:31
  • Why have wrapped the macro calls inside of data steps? That makes no sense because the macro is generating multiple steps (2 data steps and 3 proc steps). – Tom May 22 '17 at 21:33
  • @Tom I removed the set statement but it seems that it is still not matching. – Jinhua Wang May 23 '17 at 02:52
  • @Reeza hmm... Does removing macro have an effect on the results at all? – Jinhua Wang May 23 '17 at 02:53
  • @Tom Thanks for the reminder. I removed the data steps wrapping the macros. The reason that I added them was I thought, like OOP, the variables need to be declared before using them ... – Jinhua Wang May 23 '17 at 02:56
  • 1
    Your posted data merges fine. If your actual data is not merging then look at the data to see why it does not match. Note your posted code is still referring to macro code that you didn't post. You should probably NOT be trying to use macro code until you understand what actual SAS code you want the macros to generate. – Tom May 23 '17 at 15:28

3 Answers3

1

Remove set from data merge statement. You cannot write set in Data Merge statement. Also you need to sort datasets, if they are not already sorted by default by date for them to Merge on Date:-

Proc sort data=&WITHOUT_RATE; by Date; run;
Proc sort data=TBRATE; by Date; run;

data &WITH_RATE;
  merge &WITHOUT_RATE(IN=A) TBRATE(IN=B);
  by Date;
  if A;
run;

OR 

PROC SQL:
Create table &WITH_RATE as
Select a.*,b.Rate
from
&WITHOUT_RATE a
left join
TBRATE b
on a.Date=b.Date;
quit;
India.Rocket
  • 1,225
  • 1
  • 8
  • 11
  • Nope still empty. I deleted the set statement but it seems that it is not working. – Jinhua Wang May 22 '17 at 18:34
  • Can you give the code which you are using to call the macro? – India.Rocket May 22 '17 at 18:41
  • Try using PROC SQL instead of Data merge statement as in my updated edit. Also, you don't need a data set statement to call a macro. Directly replace this statement:- data RFRATEDATA; %RFRATE(NBBODATA, RFRATEDATA); run; as this %RFRATE(NBBODATA, RFRATEDATA); – India.Rocket May 22 '17 at 18:58
  • I tried the SQL statement, but still no luck. Is it possible that something is wrong with the data formatting? – Jinhua Wang May 23 '17 at 03:17
  • Query is absolutely correct. There's no reason for dates not to merge, if the formats are same. The only problem could be the data. Check your data again. Take it outside the macro. Run step by step by creating datsets in work/ in any library you want and check these datasets. sometimes what we think is happening is actually totally different then what is actually happening during the code run – India.Rocket May 23 '17 at 03:22
1

Agree with @Reeza, I would remove the macro code to work on finding the source of the problem. Once you have it working outside of the macro language, go back to working on the macro. From a quick look, I would guess your BY values are not matching.

Try something like below to look at the mismatched records:

data WITH_RATE
     Aonly
     Bonly
     BothAandB
  ;
  merge WITHOUT_RATE(IN=A) TBRATE(IN=B);
  by Date;
  if A then output With_Rate;

  if A=1 and B=1 then output BothAandB;  
  else if B=0 then output Aonly;
  else if A=0 then output Bonly;

run;

Then remove the formats on date and look at the values in Aonly and Bonly to see why they don't match. The output dataset Aonly will have records from WITHOUT_RATE which do not have a matching DATE in TBRATE. The output dataset Bonly will have records from TBRATE which do not have a matching DATE in WITHOUT_RATE. The output dataset BothAandB will have the records that did match.

If you see values in Aonly and Bonly that look like they should match, and assuming these are dates, perhaps round them before the merge to make sure they are integers to avoid any possible numeric precision / floating point differences.

Quentin
  • 5,960
  • 1
  • 13
  • 21
  • What is the Aonly and Bonly here? – Jinhua Wang May 23 '17 at 02:47
  • @JinhuaWang I added an explanation in the answer. – Quentin May 23 '17 at 17:20
  • The data I am analyzing is above 1.5 G, and there are hundreds of those files. It seems that I cannot see manually where is Aonly and Bonly ... – Jinhua Wang May 23 '17 at 19:55
  • I would grab 10 records from Aonly. Then look into Bonly and see if you can find those dates. If you think you can see them, then you can compare the values of the 10 dates in Aonly to the values of the 10 dates in Bonly and see why they are not an exact match. – Quentin May 23 '17 at 19:59
  • Good idea. Trying now. – Jinhua Wang May 23 '17 at 20:01
  • For some reason, the date in Bonly are all ********. So weird! – Jinhua Wang May 23 '17 at 20:49
  • Also, if I dump the data TBRATE, the dates are in correct formats (such as 20130101). But the dates in Bonly is all messed up! (they are all *******) What happened? – Jinhua Wang May 23 '17 at 21:00
  • Formats only effect how values are displayed. They shouldn't effect the merge. Try removing all the formats, e.g. `proc print data=Bonly; format date;run;` SAS dates are numeric (usually integers), the number of days since Jan 1, 1960. – Quentin May 23 '17 at 21:08
0

I solved it. Thanks to @Quentin and @India.Rocket's suggestions, I was able to find the solution.

The problem was reading the dates from csv file. It seems that dates in the format of yyyymmdd or yyyy-mm-dd cannot be read correctly. I had to reformat the dates in the csv to mm/dd/yyyy with python, then input them as DATE:mmddyy10 in sas. However, the other dataset I was merging with has the date format in yymmddn8. Therefore, I had to reformat DATE:mmddyy10. to yymmddn8. to make the whole thing work.

Following code was used.

data TBRATE;
  INFILE "T-RATE/T-RATE-FORMATTED.csv"
  DELIMITER = ","
  MISSOVER DSD
  FIRSTOBS=2
  LRECL=32767;
  INPUT DATE:mmddyy10. DateRaw one_M three_M six_M one_Y two_Y three_Y five_Y seven_Y ten_Y twenty_Y thirty_Y;
  format DATE yymmddn8.;
run;
Jinhua Wang
  • 1,679
  • 1
  • 17
  • 44
  • SAS has many INFORMATs to support the data formats you mention. I suspect your problem is caused by using PROC IMPORT to read a CSV file. It is not necessary to use PROC IMPORT to read a CSV file since all the proc does is generate a data step to read it. You can write the data step yourself and save yourself all of the headache. – Tom May 24 '17 at 17:42
  • How is this different than the original code other than the format used for the date variable? Note that your original program should work since the YYMMDD informat can read strings like '20120101' just as easily as the MMDDYY informat can read strings like '01/01/2012'. – Tom May 24 '17 at 17:45