Navicat Blog

Choosing between a Subquery and Join Nov 11, 2022 by Robert Gravelle

In the Joins versus Subqueries: Which Is Faster? blog article we learned that joins tend to execute faster than subqueries. Having said that, it's not a universal rule, so you may not want to automatically assume that a join will be preferable. As mentioned in that article, if you need to add many joins to a query, the database server has to do more work, which can translate to slower data retrieval times. This article will present a couple of quick tests you can perform to compare a query that employs joins to one that contains subqueries so that you can choose which performs best.

Two Queries, Same Result

Most of the time, a query can be written using joins or subqueries. To illustrate, here is a query that selects countries, along with their associated cities and addresses from the MySQL Sakila Sample Database. The first SELECT statement uses joins while the second one fetches the exact same data using subqueries:

SELECT
    co.Country,
    COUNT(DISTINCT ci.city_id) AS city_cnt,
    COUNT(a.city_id)           AS address_cnt
FROM country co
INNER JOIN city ci
    ON co.country_id = ci.country_id
INNER JOIN address a
    ON ci.city_id = a.city_id
GROUP BY
    co.country_id;
		
		
SELECT 
    Co.Country,
    (Select COUNT(1) 
		 FROM City Ci 
		 WHERE Ci.country_id=co.country_id) AS city_cnt,
    (Select COUNT(1) 
		 FROM Address A 
		   INNER JOIN city c on a.city_id=c.city_id 
		 WHERE C.country_id=co.country_id) AS address_cnt
From Country Co;

We can easily compare the results in Navicat, because it can run multiple queries simultaneously. Each result set is shown in its own tab below the SQL Editor. In the image below, the contents of the Result 2 tab is shown next to Result 1 for quick comparison:

country, cities, and addresses (142K)

Query Execution Time

Having verified that both statements are equivalent, we can now compare their execution times.

To do that, we can select an individual statement, and click the Run button, whose label changes to Run Selected whenever text is selected in the editor. An Elapsed Time of 0.020s can be seen at the bottom of the screen:

join query elapsed time (138K)

Doing the same with the second statement yields an Elapsed Time of 0.021s. A minute difference, but one that would grow as the volume of data increases:

subquery elapsed time (123K)

Comparing Execution Plans

A query's Execution Plan can reveal a lot of information about how quickly it will execute. In Navicat, we can view the Execution Plan by clicking the Explain button. While it takes some practice to become adept at interpreting the results of Explain, doing so can pay dividends when trying to ascertain a query's efficiency.

The Explain1 tab shows the Execution Plan for the first (join) query. We can see at a glance that it involves 3 SIMPLE selects:

explain1 (99K)

Meanwhile, the Explain2 tab lists one PRIMARY select, followed by three DEPENDENT SUBQUERIES. Even without digging deeper, we can already see that there is an additional step required to execute the second (subquery) statement:

explain2 (42K)

Conclusion

While this blog seems to confirm the conclusion reached by Joins versus Subqueries: Which Is Faster? article, it can be a worthwhile exercise to compare both a join and subquery approach. In any event, there are still times that a subquery is advantageous over joins, such as when you have to calculate an aggregate value on-the-fly and use it in the outer query for comparison.

Navicat Blogs
Feed Entries
Blog Archives
Share