1. Using GROUP BY - this strategy uses GROUP BY across all of the items in the SELECT clause. Duplicates will be aggregated, with only one occurrence in the result set. The execution plan must be verified; specifically, that the late DISTINCT SORT has now moved to an earlier SORT.
2. Using the EXISTS() function - This works best if the items in the SELECT clause are from one table. It can sometimes be done across multiple tables, though I don't know of a generalized form for re-writing the multi-table query. In the single-table case, the form is:
SELECT DISTINCT S.c1, S.c2, ..., S.cn
FROM tableName1 S
JOIN tableName2 T ON S.i1 = T.i1
...
WHERE (... other constraints ...)
<< which becomes >>
SELECT S.c1, S.c2, ..., S.cn
FROM tableName1 S
WHERE EXISTS(SELECT 1
FROM tableName1 S2
JOIN tableName2 T ON S2.i1 = T.i1
...
WHERE S.PK = S2.PK AND (... other constraints ...) )
The efficiency of this re-write is best demonstrated by the actual execution time of the query. The execution plan doesn't have a way to show that the query within the EXISTS function is actually a series of short-circuit ORs. Once any one is made, it does not need to further evaluate the condition. The PK field in the above query represents the primary key field of table S.
-- Steve
No comments:
Post a Comment