Can someone please help me understand which approach would be the most efficient.
The first table users_of_interest_table
has one column users
that has ~1,000 unique user ID's.
The second table app_logs_table
has a users
column as well as an app_log
column. The table has more than 1 billion rows and over 10 million unique users.
What is the most efficient way to get all the app log data for the users in users_of_interest
. Here is what I have come up with so far.
Option 1: Use Inner Join
SELECT
u.users, a.app_logs
FROM
users_of_interest_table u
INNER JOIN
app_logs_table a
ON
u.users = a.users
Option 2: Subquery in Where Clause
SELECT
a.users, a.app_logs
FROM
app_logs_table a
WHERE
a.users IN (SELECT u.users FROM users_of_interest_table u)