0

I have two tables where I want to compare multiple columns in the tables. Then, I want to output a compare result column for each column comparison.

Table 1 (t1)

id gender age state
1 M 15 CA
2 F 20 NY

Table 2 (t2)

id gender age
1 M 15
2 F 21

Resulttable

id t1.gender t2.gender gender_compare t1.age t2.age age_compare
1 M M true 15 15 true
2 F F true 20 21 false

I am able to create a select statement to return the compared columns from each table. The part I'm uncertain about is how to create a compare column for each comparison (t1.age vs. t2. age and t1.gender vs. t2.gender).

My guess is I could write a case statement for each compare column (i.e. gender_compare, age_compare, etc). But, I have close to 30 columns to compare. This can become very messy to write 30 case statements. Is there a better way?

SELECT t1.id ,t2.id, 
t1.gender, t2.gender
t1.age, t2 age
FROM table1 t1
join table2 t2 on t1.id=t2.id

For reference, I am running the query in AWS Athena.

Guru Stron
  • 102,774
  • 10
  • 95
  • 132
TechNewbie
  • 164
  • 2
  • 15
  • Are you after a boolean result or a string value? You can either compare one column to the other or use a case expreesion. – Stu Mar 08 '23 at 20:26
  • @Stu, I'd be happy with boolean or string value. I had thought about the case statement, but was hoping there's a way around writing 30 different case statements. – TechNewbie Mar 08 '23 at 20:28
  • 1
    30 case expressions aren't that bad. As Stu mentioned, the alternative would be a comparison such as `t1.age = t2.age as age_compare` which returns boolean value. – Isolated Mar 08 '23 at 20:29
  • 1
    Ahhh... I had no idea I could just do t1.age=t2.age. True face palm moment. – TechNewbie Mar 08 '23 at 20:34

1 Answers1

2

This can become very messy to write 30 case statements. Is there a better way?

You don't need the case statements, just compare the columns (i.e. t1.gender = t2.gender as gender_compare, etc.):

SELECT t1.id id
    t1.gender, t2.gender, t1.gender = t2.gender as gender_compare
    t1.age, t2.age, t1.age = t2.age as age_compare
FROM table1 t1
join table2 t2 on t1.id = t2.id

If you need some other output instead of true/false - use if function which is more succinct approach then using case :

The IF function is actually a language construct that is equivalent to the following CASE expression:

CASE
    WHEN condition THEN true_value
    [ ELSE false_value ]
END

For example - if(t1.gender = t2.gender, 'Y', 'N') as gender_compare

I can't think of better options then listing all needed columns and comparison. Potentially you can query information_schema.columns (see this answer) to "automate" manual query generation (i.e. you write the SQL query which produces another SQL query as a result - have done this myself several times)

AFAIK Presto/Trino does not provide options for dynamic SQL generation and execution. If you want to fully automate this then you will need to use some outside scripting (for example generating and executing query via AWS Lambda).

Guru Stron
  • 102,774
  • 10
  • 95
  • 132
  • Thank you very much for this detailed response. I learn two SQL things today. I have equate two fields and return a boolean True/False and there's an IF function. I will have to play around with that capability. Much appreciated. – TechNewbie Mar 09 '23 at 19:29