7.2.8. DISTINCT Optimization
DISTINCT combined with ORDER
BY needs a temporary table in many cases.
Because DISTINCT may use GROUP
BY, you should be aware of how MySQL works with
columns in ORDER BY or
HAVING clauses that are not part of the
selected columns. See Section 12.11.3, “GROUP BY with Hidden Fields”.
In most cases, a DISTINCT clause can be
considered as a special case of GROUP BY. For
example, the following two queries are equivalent:
SELECT DISTINCT c1, c2, c3 FROM t1 WHERE c1 > const;
SELECT c1, c2, c3 FROM t1 WHERE c1 > const GROUP BY c1, c2, c3;
Due to this equivalence, the optimizations applicable to
GROUP BY queries can be also applied to
queries with a DISTINCT clause. Thus, for
more details on the optimization possibilities for
DISTINCT queries, see
Section 7.2.13, “GROUP BY Optimization”.
When combining LIMIT
row_count with
DISTINCT, MySQL stops as soon as it finds
row_count unique rows.
If you do not use columns from all tables named in a query,
MySQL stops scanning any unused tables as soon as it finds the
first match. In the following case, assuming that
t1 is used before t2
(which you can check with EXPLAIN), MySQL
stops reading from t2 (for any particular row
in t1) when it finds the first row in
t2:
SELECT DISTINCT t1.a FROM t1, t2 where t1.a=t2.a;