2012-02-25

Code Patterns: Setting Booleans

This comes up often enough that it deserves some thought: setting booleans.  The two most used booleans are the .Visible and .Enable properties available on all UI Forms and Web Forms controls from Microsoft.  There are four forms for setting these (a and b are of similar type; bc is a boolean):

In C-ish:

        // form one
        if (a == b)
            bc = true;
        else
            bc = false;

        // form two
        bc = false;
        if (a == b)
            bc = true;

        // form three
        bc = (a == b) ? true : false;

        // form four
        bc = (bool)(a == b);

In Visual Basic:

        'Form one
        If a = b Then
            bc = True
        Else
            bc = False
        End If

        'Form two
        bc = False
        If a = b Then bc = True

        'Form three
        bc = If(a = b, True, False)

        'Form four
        bc = CBool(a = b)

I advocate form four in both languages.  Without the "If" keyword, directive, or function, there is no zero-bit check in the assembly with the subsequent code jumps.  Even if the compiler is efficient and recodes these scenarios, then I still favor form four, because it is concise. 

I run across this frequently when I'm reviewing code, and I hope that this may remove one more rewrite in future reviews.

-- Steve

2012-02-20

TSQL DISTINCT qualifier

Using the DISTINCT qualifier in the SELECT clause typically causes a late sort in SQL Server's execution plan.  Since that sort contains duplicated rows, it is inefficient.  There are alternatives that may be more efficient.
1. Using GROUP BY - this strategy uses GROUP BY across all of the items in the SELECT clause.  Duplicates will be aggregated, with only one occurrence in the result set.  The execution plan must be verified; specifically, that the late DISTINCT SORT has now moved to an earlier SORT.
2. Using the EXISTS() function - This works best if the items in the SELECT clause are from one table.  It can sometimes be done across multiple tables, though I don't know of a generalized form for re-writing the multi-table query. In the single-table case, the form is:

SELECT DISTINCT S.c1, S.c2, ..., S.cn
FROM tableName1 S
    JOIN tableName2 T ON S.i1 = T.i1
    ...
WHERE (... other constraints ...)

<<  which  becomes  >>

SELECT S.c1, S.c2, ..., S.cn
FROM tableName1 S
WHERE EXISTS(SELECT 1
  FROM tableName1 S2 
     JOIN tableName2 T ON S2.i1 = T.i1
     ...  
  WHERE S.PK = S2.PK
     AND (... other constraints ...) )


The efficiency of this re-write is best demonstrated by the actual execution time of the query.  The execution plan doesn't have a way to show that the query within the EXISTS function is actually a series of short-circuit ORs.  Once any one is made, it does not need to further evaluate the condition.  The PK field in the above query represents the primary key field of table S.

-- Steve