SQL queries can become quite complicated. Getting them to produce the right results is not always trivial. Especially if verifying if the result is correct is difficult. Last week I encountered a case where at least it was obvious the result was wrong, it simply didn’t give any results. But finding the cause wasn’t that trivial even though the query itself wasn’t that complicated. And it has a lot to with my all time favourite database, MySQL (note the sarcasm here).
The query was something like this:
select t.* from ap_posts t where t.status='publish' and match(t.content) against(:query) and t.id not in (select post_id from ap_providers union select post_id from ap_methods)
Basically, it tries to do a full text search on ‘published posts’ and filter out any posts that are ‘providers’ or ‘methods’. While not terribly efficient, at first glance there are no obvious problems. Another weird thing was that there were no obvious changes made in this part of the code and the search used to work.
My strategy for finding the problem was simple. As the query didn’t give any results, the problem must be that one of the constraints is too strict. So I ran several variations of the query, each time dropping one of the constraints. Also, I run it directly against the database, skipping any JPA issues or other problems.
First reproducing the problem using the command line and a parameter that should give results was easy:
mysql> select t.* -> from ap_posts t -> where t.status='publish' -> and match(t.content) against('something') -> and t.id not in -> (select post_id from ap_providers -> union select post_id from ap_methods) -> ; Empty set (0.05 sec)
So, indeed one of the constraints should be the problem. Removing the ‘status’ constraint didn’t produce any results, removing the ‘match’ constrains also still didn’t produce any results. It has to be the third constraint. But that didn’t make any sense, looking at the following queries:
mysql> select count(*) from ap_providers -> union select count(*) from ap_methods; +----------+ | count(*) | +----------+ | 385 | | 186 | +----------+ 2 rows in set (0.00 sec) mysql> select count(*) from ap_posts; +----------+ | count(*) | +----------+ | 1999 | +----------+ 1 row in set (0.78 sec)
We are excluding a maximum of 385+186=571 id’s. And we have 1999 posts, so assuming there are no duplicate id’s, we should get at least 1999-571=1428 results.
But:
mysql> select t.* -> from ap_posts t -> where -> t.id not in -> (select post_id from ap_providers -> union select post_id from ap_methods) -> ; Empty set (0.79 sec)
Huh? Ah!
Using IN or NOT IN in a query should really only be used with a small number of constants, using a subquery as the value really is asking for trouble. There is a limit on the number of entries an IN clause can handle. In case of MySQL it is limited by some parameter called max_allowed_packet. If you go above the limit, MySQL simply doesn’t give any results anymore. A warning or error would have been helpful. But even an explicit show warnings doesn’t give any output.
It also explained why, without any changes in code, all of a sudden it stopped working: the number of ‘providers’ and ‘methods’ crossed a magic boundary and it stopped working.
Rewriting it to a proper form is easy. We could simply use a NOT EXISTS here to minimize impact:
select p.* from ap_posts p where p.status='publish' and match(p.content) against('something') and not exists ( select * from ap_providers where post_id=p.id union select * from ap_methods where post_id=p.id )
Even better would be to use joins, but that involves a bit more rewriting:
select p.* from ap_posts p left outer join (select post_id from ap_providers union select post_id from ap_methods) tmp on tmp.post_id = p.id where p.status='publish' and match(p.content) against('something') and tmp.post_id is null
So, some lessons learned: never use IN with a subquery, rewrite it to use a proper JOIN, or if you really must, an EXISTS clause. It might work at the time of testing, but it’ll break when your data grows. And, be aware of the ‘friendly’ behaviour of MySQL.
Read more:
- http://dev.mysql.com/doc/refman/5.1/en/comparison-operators.html#function_in
- http://stackoverflow.com/questions/452824/what-is-the-size-limitation-for-in-and-not-in-in-mysql