Limiting Results with HAVING
What if we wanted a list of only those authorsthat had no jokes to their name? Once again, let's look at the query that
most users would try first:
mysql>SELECT Authors.Name, COUNT(Jokes.ID) AS NumJokes
->FROM Authors LEFT JOIN Jokes ON AID = Authors.ID
->WHERE NumJokes = 0
->GROUP BY AID;
ERROR 1054: Unknown column 'NumJokes' in 'where clause'
By now you're probably not surprised that it didn't work as expected.
The reason why WHERE NumJokes = 0 didn't do the job has
to do with the way MySQL processes result sets. First, MySQL produces the
raw, combined list of authors and jokes from the Authors and Jokes tables. Next, it processes the WHERE clause
and the ON portion of the FROM clause so that only the
relevant rows in the list are returned (in this case, rows that match up authors
with their jokes). Finally, MySQL processes the GROUP BY clause
by grouping the results according to their AID, COUNTing
the number of entries in each group that have non-NULL Jokes.ID values,
and producing the NumJokes column as a
result.Notice that the NumJokes column isn't
actually created until the GROUP BY clause is processed,
and that doesn't happen until after the WHERE clause
does its thing! If you wanted to exclude jokes that contained the word "chicken"
from the count, you could use the WHERE clause without
a problem, because that exclusion doesn't rely on a value that the GROUP
BY clause is responsible for producing. Conditions that affect the
results after grouping takes place, however, must appear in a special HAVING clause.
Here's the corrected query:
mysql>SELECT Authors.Name, COUNT(Jokes.ID) AS NumJokes
->FROM Authors LEFT JOIN Jokes ON AID = Authors.ID
->GROUP BY AID
->HAVING NumJokes = 0;
+---------------+----------+
| Name | NumJokes |
+---------------+----------+
| Amy Mathieson | 0 |
+---------------+----------+
Some conditions work both in the HAVING and the WHERE clause.
For example, if we wanted to exclude a particular author by name, we could
use Authors.Name != "Author Name" in either the WHERE or
the HAVING clause to do it, because whether you filter
out the author before or after grouping the results, the same results are
returned. In such cases, it is always best to use the WHERE clause,
because MySQL is better at internally optimizing such queries so they happen
faster.