We know how map join and SMB (sort-merge-bucket) map join reduce execution time: they eliminate the reduce phase, and with it the shuffle.
Ex: For a join between two tables: select a.col1, b.col2 from a join b on a.col1 = b.col1 (both tables are bucketed on col1 into the same number of buckets)
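For reference, the two-table case might be set up roughly as below. This is only a sketch: the column types, the bucket count of 32, and the SORTED BY clause are assumptions, not details of the real tables; only the names a, b, col1 and col2 come from the example above. On older Hive versions the tables would also need to be loaded with hive.enforce.bucketing=true so that the data actually lands in the declared buckets.

```sql
-- Assumed schema: types and the bucket count (32) are placeholders.
CREATE TABLE a (col1 INT, col2 STRING)
CLUSTERED BY (col1) SORTED BY (col1 ASC) INTO 32 BUCKETS;

CREATE TABLE b (col1 INT, col2 STRING)
CLUSTERED BY (col1) SORTED BY (col1 ASC) INTO 32 BUCKETS;

-- Settings commonly used to let Hive choose a bucket map join / SMB map join.
SET hive.optimize.bucketmapjoin=true;
SET hive.optimize.bucketmapjoin.sortedmerge=true;
SET hive.auto.convert.sortmerge.join=true;

-- Join key matches the bucketing (and sort) column on both sides,
-- so matching buckets can be joined map-side without a shuffle.
SELECT a.col1, b.col2
FROM a
JOIN b ON a.col1 = b.col1;
```

Running EXPLAIN on the select is one way to check whether the plan really uses a sorted-merge bucket map join rather than a common (shuffle) join.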
But when joining 3 or more tables on different columns,
Ex: select a.col1, b.col3, c.col2, d.date from a join b on a.id = b.id join c on a.state = c.state join d on c.date = d.date
In a scenario like this, how will bucketing help if we don't want to split the query into multiple smaller queries?