Questions tagged [anti-join]

An anti-join, also called an excluding-join or left-outer-join, comes from requesting data from a table where some value is not in another table

Formal Definition

The antijoin, written as R ▷ S (where R and S are relations), is similar to the semijoin, but the result of an antijoin is only those tuples in R for which there is no tuple in S that is equal on their common attribute names.

Venn Diagram

anti-join

SQL Phrasing

SELECT <select_list>
FROM TableA A
LEFT JOIN TableB B
ON A.Key = B.Key
WHERE B.Key IS NULL

Related to

References:

78 questions
0
votes
2 answers

Oracle - Finding missing /non-joined records

I have an issue in Oracle 12 that is easiest explained with the traditional database design scenario of students, classes, and students taking classes called registrations. I understand this model well. I have a scenario where I need to get a…
user1009073
  • 3,160
  • 7
  • 40
  • 82
0
votes
2 answers

dplyr top_n with group_by - error in as.list(x) : trying to get slot "matsin" from an object (class "quosures") that is not an S4 object

I have been successfully using the below piece of code for quite a while, but for some reason now it got broken. Although I haved loaded the very same packages as before. No changes to R or dplyr versions in the meantime. It looks like that dplyr's…
lrdbs
  • 33
  • 1
  • 7
0
votes
0 answers

Why is my code getting rid of rows when I subtract?

So I have some code that looks at two data frames and subtracts a column value named "Intensity" for certain Compositions of molecules. However for instance if the molecule is not in the other data frame, it completely gets rid of that row for some…
David
  • 43
  • 5
0
votes
4 answers

Finding missing emails in SQL Server

I am trying to do something I've done a million times and it's not working, can anyone tell me why? I have a table for people who sent in resumes, and it has their email address in it... I want to find out if any of these people have NOT signed up…
Jasmine
  • 4,003
  • 2
  • 29
  • 39
0
votes
2 answers

SQL Anti-join Delete Optimisation

I have two tables in a postgres database, posts and users. posts has a user_id foreign key that references the users.id primary key column. Both tables are very large. I have just deleted a random set of users (about 80% of the total users) and I…
James Stonehill
  • 1,125
  • 10
  • 22
0
votes
2 answers

SQL: How do I delete rows that are also in another table?

Say I have two tables. E.g, Table 1: Store Product 1 2 1 1 2 3 2 4 And Table 2 Store Product 1 2 2 3 How do I delete all the rows in Table 1 that are also in Table 2? So, new Table 1 would be:…
Jacob Curtis
  • 788
  • 1
  • 8
  • 22
0
votes
1 answer

Left Anti join in Spark dataframes

I have two dataframes, and I would like to retrieve only the information of one of the dataframes, which is not found in the inner join, see the picture: I have tried several ways: Inner join and filtering the rows that return at least one null,…
0
votes
1 answer

R: How to get missing records based on values in two columns

I have two large dataframes of longitude/latitude coordinates, CoastalStates_Tax and CoastalStates, which are mostly the same except CoastalStates_Tax has a few million more coordinates. I want to figure out which rows in CoastalStates_Tax are not…
Alissa
  • 99
  • 1
  • 5
0
votes
1 answer

Spark Dataframe leftanti Join Fails

We are trying to publish deltas from a Hive table to Kafka. The table in question is a single partition, single block file of 244 MB. Our cluster is configured for a 256M block size, so we're just about at the max for a single file in this case.…
Stuart
  • 1,572
  • 4
  • 21
  • 39
0
votes
1 answer

Apply a dplyr function to one common column across 30 dataframes

I have up to 30 data frames with a common ID column. There are other colums in each df but im just showing the ID here. Library DF1 DF2 DF3 ID# ID# ID# .... 1111 1111 1112 .... 2222 …
Numan Karim
  • 43
  • 1
  • 6
0
votes
3 answers

Returning results from a NOT IN query very quickly

We've got a system where we've got a collection of items (> 1 million), and several things processing it. Each processor should only process each item once, and the processors have a heirarchy. Our current implementation is to have a 'processed'…
thecoop
  • 45,220
  • 19
  • 132
  • 189
0
votes
2 answers

find only parent records with children of certain status across multiple tables

I need all tickets where all the related children records are of status closed. So if one of the children is something else, I don't want the ticket in the result set. I tried this using the antijoin patern, but my problem is that the children live…
Jeroen
  • 1,638
  • 3
  • 23
  • 48
0
votes
2 answers

sql optimization with anti join

i have a recursive table of category and a company table with fields like: category(id, name, parent) // parent is foreign key to category id :) company(id, category_1, category_2, category_3) // category_* is foreign key to category id category…
alizelzele
  • 892
  • 2
  • 19
  • 34
0
votes
1 answer

Search for values in table1 that match values in table2 without joins (Sql Server 2008)

I ran into a situation where I have two tables that store employee information, Table1 are the employees and table2 are 3rd party temps. These tables use a different convention for the IDs (not something I have control over). The problem is that…
Jeremy
  • 131
  • 1
  • 10
-1
votes
3 answers

mongodb - left join with conditions

I am attempting an left antijoin on these two collections. I want all users where department is equal to 'IT' that aren't in a meeting that had an endAt time > 175. Either as a creator or receiver. So essentially whoever hasn't been in a meeting in…
CirqueM
  • 77
  • 2
  • 9