0

I am trying to pair 2 tables (nicknamed PERSIP and ECIF) on their ID field, (labeled TABLE1 & TABLE2) to create a RESULTTABLE, where the ym_id (for both tables) variable is set to my timekey0 variable for a specific datetime.

I am wondering why this code produces 0 rows of resulting data. After looking online, this was the format people posted as solutions to similar problems.

%let timekey0 = 202110;
proc sql;

CREATE TABLE RESULTTABLE AS
SELECT

PERSIP.col1,
PERSIP.col2,
PERSIP.col3,

ECIF.col1,
ECIF.col2,
ECIF.col3,
ECIF.col4

FROM DB.TABLE1 PERSIP

LEFT JOIN DB.TABLE2 ECIF

ON PERSIP.ID = ECIF.ID 

WHERE ECIF.ym_id = &timekey0.

AND PERSIP.ym_id = &timekey0.;

quit;

I got a result of 0 rows with many columns. Not sure if my join type is incorrect but I have 0 rows in the table.

Dario
  • 3
  • 3
  • Does the variable name `timekey0.` ends with a dot ? – The Impaler Jun 21 '22 at 18:40
  • 2
    The left join is silently converted into an inner join by the `ECIF.ym_id = &timekey0.` predicate. Change to: `LEFT JOIN DB.TABLE2 ECIF ON PERSIP.ID = ECIF.ID AND ECIF.ym_id = &timekey0. WHERE PERSIP.ym_id = &timekey0.;`. – The Impaler Jun 21 '22 at 18:51
  • I assigned it as %let timekey0 = 202110; – Dario Jun 21 '22 at 18:51
  • I entered your edit changes and it has been loading for 5 mins. Will see shortly. – Dario Jun 21 '22 at 18:59
  • Running still.. does this mean it is not working? – Dario Jun 21 '22 at 19:08
  • I would guess it's selecting a massive amount of data. Is there an index on `DB.TABLE1 (ym_id)` and also in `DB.TABLE2 (ID, ym_id)`? – The Impaler Jun 21 '22 at 19:11
  • What type of values does the YM_ID variable have? You are testing if the value is equal to the integer value 202,110. Does YM_ID have date values? If you meant to test if the value as '01OCT2021'd then you are using the wrong value in your macro variable. – Tom Jun 21 '22 at 19:30
  • ym_id is in the format of 202110, meaning October 2021. (Monthly), as I am pulling month end data. – Dario Jun 21 '22 at 19:45
  • So YM_ID just has the number 202,110 and you have decided that means the 10 month of the the 2021st year? Are you positive that both dataset have coded it that way? Or does one dataset have the actual date '01OCT2021'd (which is actually the number 22,554 instead) and is just displaying it as 202110 because it is using the YYMMN6. format? – Tom Jun 22 '22 at 14:36

2 Answers2

0

There may be two reasons for this:

  1. There is no records matching to your where criteria (ECIF.ym_id = &timekey0. AND PERSIP.ym_id = &timekey0.)
  2. There is no records to join matching your on criteria (ON PERSIP.ID = ECIF.ID)
Ozan BAYRAM
  • 2,780
  • 1
  • 28
  • 35
  • You can confirm that it is not a syntax error? – Dario Jun 21 '22 at 18:58
  • No I cannot. Looks like valid SQL to me as a T-SQL guy but don't have any idea about SAS – Ozan BAYRAM Jun 21 '22 at 19:06
  • This is a big DB but i am beginner coder in SQL, so im so unsure if it is an issue that it has said 'running...' on the task status on SAS enterprise guide. – Dario Jun 21 '22 at 19:07
  • Try querying tables separately. That may help to ensure that you have the records or not. And may be it's worth to give it a try using nested (inner) query. – Ozan BAYRAM Jun 21 '22 at 19:10
  • It finally produced an output, but by the looks of it, I've got no data from my ECIF tables' columns that merged. – Dario Jun 21 '22 at 19:18
  • How would I go about querying them separately? I wanted to join 5 tables so this was the problem already being broken down, not sure how to query separately. – Dario Jun 21 '22 at 19:18
  • It's not a big thing just take single table and related where clauses. By that way you may check that table has the related records. Than later you may start creating more complex queries like joins – Ozan BAYRAM Jun 21 '22 at 19:22
0

Your logic seems off. You say you want a LEFT JOIN then use a variable from the "RIGHT" table in your WHERE condition.

Most likely you just want to add those conditions to the ON condition.

FROM TABLE1 PERSIP
LEFT JOIN TABLE2 ECIF
  ON PERSIP.ID = ECIF.ID 
  AND ECIF.ym_id = &timekey0.
  AND PERSIP.ym_id = &timekey0.

Or perhaps just keep the condition that will limit the observations read from the "LEFT" table in the WHERE condition

FROM TABLE1 PERSIP
LEFT JOIN TABLE2 ECIF
  ON PERSIP.ID = ECIF.ID 
  AND PERSIP.ym_id = ECIF.ym_id 
WHERE PERSIP.ym_id = &timekey0.
Tom
  • 47,574
  • 2
  • 16
  • 29
  • I'd like to take various columns from 4 different tables essentially, and they have to ALL fall under the condition of the timekey0. Can I follow the same steps for multiple left joins? – Dario Jun 21 '22 at 19:21
  • Can we move this conversation to a chat? – Dario Jun 21 '22 at 19:23
  • Create a simple example and test it. – Tom Jun 21 '22 at 19:25
  • I did your first example and it produced an output with missing values from the ECIF table. There were no values (empty col) for it. The data from PERSIP was there though. – Dario Jun 21 '22 at 19:29
  • So there are no matching values in the ECIF table. Why did you think that there should have been matching values in the ECIF table? – Tom Jun 21 '22 at 19:34
  • I thought it would be something related to the way I am joining. I am selecting the desired columns from the ECIF table now under the timekey condition to ensure the data exists for the time period. – Dario Jun 21 '22 at 19:42