I have two large tables that I am joining in Spark SQL like this:
select * from table1 A join table2 B on (A.client = B.client and A.sitecode = B.sitecode and A.spec_nbr = B.spec_nbr)
table1 has skewed data, which makes the query run for a long time. I want to work around the skew by using the salting technique.
How do I apply the salting technique in this scenario?
I have not been able to find any relevant material on how to apply salting. Any help is appreciated.
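
For reference, here is a rough sketch of what salting usually means for a join like this one; it is untested against real data. The idea is to add a random "salt" column to the skewed table (table1), replicate each row of the other table (table2) once per salt value, and add the salt to the join keys so that a hot key is spread over several partitions. The helper name build_salted_join_sql, the column name salt, and the default of 16 salts are made up for illustration. The small Python function below only builds the Spark SQL text, which would then be passed to spark.sql(...) on an existing SparkSession.

```python
def build_salted_join_sql(num_salts: int = 16) -> str:
    """Return Spark SQL text for a salted version of the table1/table2 join.

    num_salts is a hypothetical tuning knob: more salts spread a hot key
    over more partitions, but table2 is replicated num_salts times.
    """
    if num_salts < 1:
        raise ValueError("num_salts must be at least 1")
    last_salt = num_salts - 1
    return f"""
SELECT A.*, B.*
FROM (
    -- Skewed side: tag every row with a random salt in [0, {last_salt}].
    SELECT t1.*, CAST(FLOOR(RAND() * {num_salts}) AS INT) AS salt
    FROM table1 t1
) A
JOIN (
    -- Other side: emit one copy of every row per salt value.
    SELECT t2.*, s.salt
    FROM table2 t2
    LATERAL VIEW EXPLODE(SEQUENCE(0, {last_salt})) s AS salt
) B
ON  A.client   = B.client
AND A.sitecode = B.sitecode
AND A.spec_nbr = B.spec_nbr
AND A.salt     = B.salt
""".strip()


# Print the generated query so it can be inspected before running it.
print(build_salted_join_sql(8))
```

Because every table1 row gets exactly one salt and table2 carries a copy of each row for every salt, the salted join should return the same matches as the original join. The output contains a salt column from each side, which can be dropped afterwards. The trade-off is that table2 grows by a factor of num_salts, so the value is best kept as small as the skew allows.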