Navicat Blog

The SQL Anti Join Oct 21, 2024 by Robert Gravelle

One of the most powerful SQL features is the JOIN operation, providing an elegant and simple means of combining every row from one table with every row from another table. However, there are times that we may want to find values from one table that are NOT present in another table. As we'll see in today's blog article, joins can be utilized for this purpose as well, by including a predicate on which to join the tables. Known as anti joins, these can be helpful in answering a variety of business-related questions, such as:

  • Which customers did not place an order?
  • Which employees have not been assigned a department?
  • Which salespeople did not close a deal this week?

This blog will offer a primer on the types of anti joins and how to write them using a few examples based on the PostgreSQL dvdrental database. We'll write and execute the queries in Navicat Premium Lite 17.

Two Types of Anti Joins

There are two types of anti joins:

  • left anti join: returns rows in the left table that have no matching rows in the right table
  • right anti join: returns rows in the right table that have no matching rows in the left table

Returned rows are shown in blue in the diagram below:

anti-join_venn_diagram (56K)

The next section will walk through a few different syntaxes we can use to create an anti join, using a left anti join for examples.

Left Anti Join Using EXISTS

Let's say that we wanted to find all the actors in the dvdrental database that didn't appear in any film. Unfortunately, SQL doesn't have a built-in syntax for this operation, but we can emulate it using EXISTS, or, more specifically, NOT EXISTS. Here's what that query would look like:

SELECT *
FROM actor a
WHERE NOT EXISTS (
  SELECT * FROM film_actor fa
  WHERE a.actor_id = fa.actor_id
)

If we run it in Navicat Premium Lite 17, we get the following results:

left_anti-join (85K)

Beware of NOT IN!

Since EXISTS and IN are equivalent, you might be tempted to conclude that NOT EXISTS and NOT IN are also equivalent, but this is not always the case! They are only equivalent if the right table (in this instance, film_actor) has a NOT NULL constraint on the foreign key (the actor_id).

film_actor_table_design (82K)

In this specific instance, the NOT IN query returns the same results because of the NOT NULL constraint on the actor_id column:

left_anti-join_using_not_in (78K)

If the actor_id column did allow nulls, an empty result set would be returned. We can verify this via the following query:

SELECT *
FROM actor
WHERE actor_id NOT IN (1, 2, 3, 4, 5, NULL)
no_results_using_not_in (57K)

The above query doesn't return any rows because NULL represents an UNKNOWN value in SQL. Since we cannot be sure whether actor_id is in a set of values of which one value is UNKNOWN, the whole predicate becomes UNKNOWN!

The easiest way to avoid the danger posed by the NOT IN syntax is to stick with NOT EXISTS. It's really not even worth gambling on the presence of a NOT NULL constraint as the DBA might temporarily turn off the constraint to load some data, rendering your query useless in the interim.

Alternate Syntax

As alluded to in the introduction, it's also possible to perform an Anti Join using LEFT and RIGHT JOINs. For that to work, you need to add a WHERE clause with the IS NULL predicate. Here's the LEFT JOIN version of that syntax:

SELECT a.*
FROM actor a
  LEFT JOIN film_actor fa
	  ON a.actor_id = fa.actor_id
WHERE fa.actor_id IS NULL
left_anti-join_using_left_join (80K)

Just be aware that the LEFT/RIGHT JOIN syntax may run more slowly because the query optimizer doesn't recognize this as an ANTI JOIN operation.

Conclusion

In today's blog we learned how to emulate a Left Anti Join using three variations of SQL syntax. Of these, NOT EXISTS should be your first choice as it best communicates the intent of an ANTI JOIN and tends to execute the fastest.

Interested in giving Navicat Premium Lite 17 a try? You can download it for a 14-day fully functional FREE trial. It's available for Windows, macOS, and Linux operating systems.

Navicat Blogs
Feed Entries
Blog Archives
Share