1

My issue is with trying to join tables that don't have primary keys with a full join. I need to generate a table that is joined on COMPANY and Program Year, but the participant doesn't need to participate in both events, just one of the two. The PHA and BIO tables have multiple entries for each user as they participate in multiple years. When I join one of the PHA or BIO tables to the registration table (RL) then try a full join with the remaining table, it essentially acts like a left join.

What I am looking for is outlined in the simple excel example attached. For each company the participation in either of the events could vary (as in BIO could have higher participation one year, but PHA could the next).

Simple example of data and desired output

Below is some of what I have been experimenting with:

SELECT
    RL.UserName
    ,RL.Company
    ,BIO.ScreeningDate
    ,BIO.ProgramYear
    ,PHA.PHADate
    ,PHA.ProgramYear            

FROM    Registration RL 
        LEFT OUTER JOIN PHADetails PHA
            on RL.UserName = PHA.Username
        FULL OUTER JOIN Biometrics BIO
            on RL.UserName = BIO.Username 
                AND BIO.ProgramYear = PHA.ProgramYear

--FROM  Registration RL 
--      LEFT OUTER JOIN Biometrics BIO
--          on RL.UserName = BIO.Username
--      FULL OUTER JOIN PHADetails PHA
--          on RL.UserName = PHA.Username 
--              AND BIO.ProgramYear = PHA.ProgramYear

WHERE RL.Company = 'City' 
        and CASE WHEN BIO.ProgramYear = 2019 and PHA.ProgramYear = 2019 THEN 1
            WHEN BIO.ProgramYear = 2019 AND PHA.ProgramYear IS NULL THEN 1
            WHEN BIO.ProgramYear IS NULL AND PHA.ProgramYear = 2019 THEN 1 
            ELSE 0 END > 0

-- have 27 participants in biometrics program
-- have 37 particiapnts in PHA program
-- want to always include participants who did both or either one, but
--   the participation for either one will vary
-- need to combined the two left joins shown above into the same full join
willi7c7
  • 33
  • 3
  • 2
    The WHERE clause using a column from the "outer table", turns the outer join back into an inner join. You need to move the CASE expression for `bio.programyear` out off the WHERE condition into the JOIN condition. –  Feb 11 '20 at 21:45
  • Does this answer your question? [Full outer join not returning all rows?](https://stackoverflow.com/questions/16167870/full-outer-join-not-returning-all-rows) – philipxy Feb 11 '20 at 23:55
  • This is a faq. Before considering posting please always google any error message or many clear, concise & precise phrasings of your question/problem/goal, with & without your particular strings/names & site:stackoverflow.com & tags, & read many answers. See [ask] & the voting arrow mouseover texts. Please [use text, not images/links, for text--including tables & ERDs](https://meta.stackoverflow.com/q/285551/3404097). Use images only for what cannot be expressed as text or to augment text. Include a legend/key & explanation with an image. – philipxy Feb 11 '20 at 23:57
  • Please in code questions give a [mre]--cut & paste & runnable code; example input (as initialization code) with desired & actual output (including verbatim error messages); tags & versions; clear specification & explanation. For errors that includes the least code you can give that is code that you show is OK extended by code that you show is not OK. (Debugging fundamental.) For SQL that includes DBMS & DDL, which includes constraints & indexes & tabular initialization. When you get a result that you don't expect, stop trying to find your overall goal & find out what your misunderstanding is. – philipxy Feb 11 '20 at 23:57

1 Answers1

1

Notwithstanding the WHERE clause situation, your current query can also produce undesired results if records that appear in one table does not match by year of record in other table. Consequently, rows with NULL can be returned when there supposedly should be values. For this reason, consider only joining on UserName then running DISTINCT and COALESCE to de-dupe and combine years.

SELECT DISTINCT
    RL.UserName
    ,RL.Company
    ,COALESCE(BIO.ProgramYear, PHA.ProgramYear) AS ProgramYear
    ,BIO.ScreeningDate
    ,PHA.PHADate

FROM    Registration RL 
        LEFT OUTER JOIN PHADetails PHA
            on RL.UserName = PHA.Username
        FULL OUTER JOIN Biometrics BIO
            on RL.UserName = BIO.Username 
...

To use your Excel data, see Online demo,

SELECT DISTINCT
     RL.UserName
    ,RL.Firstname
    ,RL.Lastname
    ,COALESCE(BIO.Year, PHA.Year) As Year
    ,BIO.One    
    ,PHA.Four


FROM    UserTable RL 
        LEFT OUTER JOIN AddInfoTable PHA
            ON RL.UserName = PHA.Username
        FULL OUTER JOIN BaseInfoTable BIO
            ON RL.UserName = BIO.Username
ORDER BY 
    COALESCE(BIO.Year, PHA.Year) DESC;

SQL Output

Parfait
  • 104,375
  • 17
  • 94
  • 125
  • Thank you! That makes complete sense. This was the main issue I was wrestling with and was trying a bunch of things that didn't work (the above code included). – willi7c7 Feb 12 '20 at 14:35
  • Great to hear and glad to help! Happy SQL-ing! – Parfait Feb 12 '20 at 15:30