Sybase® SQL Server Performance and Tuning Guide
Chapter 9: Advanced Optimizing Techniques


dbcc traceon 302

dbcc traceon (302) can often help you understand why the optimizer makes choices that seem incorrect. It can help you debug queries and help you decide whether to use specific options, like specifying an index or a join order for a particular query. It can also help you choose better indexes for your tables.

showplan tells you the final decisions that the optimizer makes about your queries. dbcc traceon (302) helps you understand why the optimizer made the choices that it did. When you turn on this trace facility, you eavesdrop on the optimizer as it examines query clauses.

The output from this trace facility is more cryptic than showplan output, but can go further in explaining such questions as why a table scan is done rather than an indexed access, why index1 is chosen rather than index2, or why a reformatting strategy is applied. The trace provides detailed information on the costs the optimizer has estimated for each permutation of the table, search clause, and join clause as well as how those costs were determined.

Invoking the dbcc Trace Facility

Execute the following command from an isql batch followed by the query or stored procedure call you want to examine:

dbcc traceon(3604, 302)

This is what the trace flags mean:

Table 9-2:

Trace Flag   Explanation 
3604  Directs trace output to the client rather than the errorlog 
302  Print trace information on index selection 

To turn off the output, use:

dbcc traceoff(3604, 302)

General Tips for Tuning with This Trace Facility

When you trace queries through this facility, run your queries in the same manner as your application, as follows:

Be very careful to ensure that your trace tests cause the optimizer to make the same decisions as in your application. You must supply the same parameters and values to your stored procedures or where clauses.

If you are using stored procedures, make sure that they are actually being optimized during the trial by executing them with recompile.

Checking for Join Columns and Search Arguments

In most situations, SQL Server can use only one index per table in a query. This means the optimizer must often choose between indexes when there are multiple where clauses supporting both search arguments and join clauses. The optimizer's first step is to match search arguments and join clauses to available indexes.

The most important item that you can verify using this trace facility is that the optimizer is evaluating all possible where clauses included in each Transact-SQL statement.

If a clause is not included in this output, then the optimizer has determined it is not a valid search argument or join clause. If you believe your query should benefit from the optimizer evaluating this clause, find out why the clause was excluded, and correct it if possible. The most common reasons for "non-optimizable" clauses include:

See "Search Arguments and Using Indexes" for more information on requirements for search arguments.

Determine How the Optimizer Estimates I/O Costs

Identifying how the optimizer estimates I/O often leads to the root of the problems and to solutions. You will be able to see when the optimizers uses your distribution page statistics and when it uses default values.

Trace Facility Output

Each set of clauses evaluated by the optimizer is printed and delimited within two lines of asterisks. If you issue an unqualified query with no search arguments or join clause, this step is not included in the output, unless the query is covered (that is, a nonclustered index contains all referenced columns).

Output for each qualified clause looks like this:

 *******************************
Entering q_score_index() for table 'name' (objectid obj_id, varno = varno). 
The table has X rows and Y pages.
Scoring the clause_type CLAUSE
    column_name operator [column_name}

<other query specific info explained later>
*******************************

q_score_index() is the name of a routine that SQL Server runs to cost index choices. It finds the best index to use for a given table and set of clauses. The clauses can be either constant search arguments or join clauses. There is a wealth of information contained in this routine's output. The next few pages evaluate each line separately.

Identifying the Table

The first line identifies the table name and its associated object ID. The actual output for this line looks like this:

Entering q_score_index() for table 'titles' (objecti 208003772), varno = 0

The optimizer analyzes all search arguments for all tables in each query, followed by all join clauses for each table in the query.

Therefore, you first see q_score_index() called for all tables in which the optimizer has found a search clause. The routine numbers the tables in the order in which they were specified in the from clause and displays the numbers as the varno. It starts numbering with 0 for the first table.

Any search clause not included in this section should be evaluated to determine whether its absence impacts performance.

Following the search clause analysis, q_score_index() is called for all tables where the optimizer has found a join clause. As above, any join clause not included in this section should be evaluated to determine whether its absence is impacting performance.

Estimating Table Size

The next line prints the size of the table in both rows and pages:

 The table has 5000 rows and 624 pages. 

These sizes are pulled from the system tables where they are periodically maintained. There are some known problems where inaccurate row estimates cause bad query plans, so verify that this is not the cause of your problem.

Identifying the where Clause

The next two lines indicate the type of clause and a representation of the clause itself with column names and abbreviations for the operators. It indicates:

All search clauses for all tables are evaluated before any join clauses are evaluated.

The operator codes are defined in Table 9-2.

Table 9-3: Operators in dbcc traceon(302) output

dbcc output  Comparison 
EQ  Equality comparisons (=)  
LT  Less than comparisons (<) 
LE  Less than or equal to comparisons (<=)  
GT  Greater than comparisons (>)  
GE  Greater than or equal to comparisons (>=) 
NE  Not equals (!=) 
ISNULL  is null comparison 
ISNOTNULL  is not null comparison 

Output for Range Queries

If your queries include a range query or clauses that are treated like range queries, they are evaluated in a single analysis to produce an estimate of the number of rows for the range. For example,

Scoring the SEARCH CLAUSE: 
    au_lname LT 
    au_lname GT

Range queries include:

Specified Indexes

If the query has specified the use of a specific index by including the index keyword and the index name in parentheses after the table name in the from clause, this is noted in the output:

User forces index IndexID.

Specifying an index prevents consideration of other alternatives.

If the I/O size and cache strategy are also included in the query, these messages are printed:

User forces data prefetch of 8K
User forces LRU buffer replacement strategy 

Calculating Base Cost

The next line of output displays the cost of a table scan for comparison, provided that there is at least one other qualification or index that can be considered. It reports index ID 0 and should match the table size estimate displayed earlier. The line looks like this:

Base cost: indid: IndexID rows: rows pages: pages prefetch: <S|N>
         I/O size: io_size cacheid: cacheID  replace: <LRU | MRU>

Here is an example:

Base cost: indid: 0 rows: 5000 pages: 624 prefetch: N 
         I/O size: 2 cacheid: 0 replace: LRU

Table 9-4: Base cost output

Output  Meaning 
indid  The index ID from sysindexes; 0 for the table itself. 
rows  The number of rows in the table. 
pages  The number of pages in the table. 
prefetch  Whether prefetch would be considered for the table scan. 
I/O size   The I/O size to be used. 
cacheid   The ID of the data cache to be used. 
replace  The cache replacement strategy to be used, either LRU or MRU. 

Verify page and row counts for accuracy. Inaccurate counts can cause bad plans. To get a completely accurate count, use the set statistics io on command along with a select * from tablename query. In a VLDB (very large database) or in 24x7 shops (applications that must run 24 hours a day, 7 days a week), where that is not practical, you may need to rely on the reasonable accuracy of the sp_spaceused system procedure. dbcc allocation-checking commands print the object size and correct the values on which sp_spaceused and other object-size estimates are based.

Costing Indexes

Next, the optimizer evaluates each useful index for a given clause to determine its cost. The optimizer first looks for a unique index that is totally qualified--meaning that the query contains where clauses on each of the keys in the index. If such an index is available, the optimizer immediately knows that only a single row satisfies the clause, and it prints the following line:

Unique index_type index found--return rows 1 pages pages 

The index_type is either clustered or nonclustered. There are three possibilities for the number of pages:

If the index is not unique, then the optimizer determines the cost, in terms of logical I/Os, for the clause. Before doing so, it prints this line:

Relop bits are: integer 

This information can be ignored. It merely restates the comparison operator (that is, =, <, >, interval, and so on) listed in the q_score_index() line mentioned earlier as an integer bitmap. This information is only necessary for Sybase Engineering to debug optimizer problems and it has no value for customer-level troubleshooting.

To estimate the I/O cost for each clause, the optimizer has a number of tools available to it, depending on the clause type (search clause or join clause) and the availability of index statistics. For more information, see "Index Statistics".

Index Statistics Used in dbcc 302

For each index, SQL Server keeps a statistical histogram of the indexed column's data distribution. This histogram is built automatically during index creation and is stored in a distribution page with the index unless the table is empty. This histogram is a sampling of the index key values every N rows.

N is dependent on the full size of the key (including overhead) and the number of rows in the table. Each sampling is known as a step. Since the optimizer knows how many rows exist between steps and the density of keys in the index, it can estimate the number of rows satisfying a clause with reasonable accuracy.

Evaluating Statistics for Search Clauses

For search clauses, the optimizer can look up specific values on the distribution page, if these values are known at compile time. In this case, it first identifies the distribution page and the number of steps with the following trace output:

Qualifying stat page; pgno: page_number steps: steps

For atomic datatypes (datatypes such as tinyint, smallint, int, char, varchar, binary, and varbinary, which are not internally implemented as structures), it prints the constant value the search argument supplied to the optimizer. It looks like this:

Search value: constant_value

If the value is implemented as a structure, the following message is output to indicate that the optimizer does not waste time building the structure's printable representation:

*** CAN'T INTERPRET ***

Distribution Page Value Matches

If an exact match is found on the distribution page, the following message is printed:

Match found on statistics page

This is followed by information pertaining to the number and location of step values found on the distribution page. Since the optimizer knows approximately how many rows exist between step values, it uses this information to estimate how many logical I/Os would be performed for this clause. To indicate this information, one of the following messages is displayed:

equal to several rows including 1st or last -use endseveralSC

This indicates that several steps matched the constant and that they were found either at the beginning or at the end of the distribution page.

equal to a single row (1st or last) -use endsingleSC

This indicates that only one step matched the constant and it was found either at the beginning or at the end of the distribution page.

equal to several rows in middle of page -use midseveralSC

This indicates that several steps matched the constant and that they were found in the middle of the distribution page.

equal to single row in middle of page -use midsingleSC

This indicates that several steps matched the constant and that they were found in the middle of the distribution page.

Values Between Steps or Out of Range

If an exact match of the search value is not found on the distribution page, the optimizer uses different formulas in its statistical estimate. The computation is based on the relational operator used in the clause, the step number, the number of steps, and the density. First, the optimizer needs to find the first step value that is less than the search value. In these cases, you see the following message:

No steps for search value -qualpage for LT search value finds

Depending on whether the search value is outside the first or last step value or contained within steps, the optimizer will print one of the following messages:

value < first step -use outsideSC
value > last step -use outsideSC

The first message indicates that the query's search value is smaller than the first entry on the distribution page. The second message indicates that the query's search value is larger than the last entry on the distribution page. If the constant is a valid value in the table, these messages indicate that update statistics may need to be run.

value between step K, K+1, K=step_number -use betweenSC

This message indicates that the query's search value falls between two steps on the distribution page. You can only confirm here that the step number seems reasonable.

For example, if the step value of "K" is 3 and you suspect that the query's search value should fall towards the end of the table, something could be wrong. It would be reasonable then to expect the value of "K" to be larger (that is, towards the end of the distribution page). This may be another indication that update statistics needs to be run.

Range Query Messages

For a range query, the trace facility looks up the steps for both the upper and lower bounds of the query. This message appears:

Scoring SARG interval, lower bound.

After displaying the costing estimates for the lower bound, the net selectivity is calculated and displayed as:

Net selectivity of interval: float_value

Search Clauses with Unknown Values

A common problem the optimizer faces is that values for search criteria are not known until run time. Common scenarios that make the optimizer unable to use distribution statistics include:

In cases like these, the optimizer tries to make intelligent guesses based on average values. For example, if a distribution page exists for the index and the query is an equality comparison, the optimizer uses the density of the index (that is, the average number of duplicates) to estimate the number of rows.

Otherwise, the optimizer uses a "magic" number: it assumes that 10 percent of the table will match an equality comparison, 25 percent of the table will match a closed interval comparison, and 33 percent of the table will match for inequality and open interval comparisons. In these cases, the trace facility prints:

SARG is a subbed VAR or expr result or local variable (constat = number) -use magicSC or densitySC

Stored procedures and triggers need special attention to ensure efficient query plans. Many procedures are coded with where clauses based on input parameters. This behavior can cause some difficulty in troubleshooting. Consider a query whose single where clause may be very selective under one parameter and return nearly the entire table under another. These types of stored procedures can be difficult to debug, since the procedure cache could potentially have multiple copies in cache, each with a different plan. Since these plans are already compiled, users may be assigned a plan that may not be appropriate for their input parameter.

Another reason for the optimizer being unable to use an index's distribution table is that the distribution page can be nonexistent. This occurs:

In these cases, the optimizer uses the above mentioned "magic" numbers to estimate I/O cost and you see:

No statistics page -use magicSC

At this point, the selectivity and cost estimates are displayed.

Cost Estimates and Selectivity

For each qualified clause, the trace facility displays:

These values are printed as variables in this message:

Estimate: indid indexID, selectivity float_val, rows rows pages pages

If this clause had no qualifications, but the optimizer found a nonclustered index that covered the entire query, it identifies the table, since it would not be listed with a q_score_index() section. In this case, you see:

Finishing q_score_index() for table table_name (objectid) ID.

At this point, the cheapest index is examined and its costs are displayed:

Cheapest index is index IndexID, costing pages pages and generating rows rows per scan using no data prefetch (size 2) 
on dcacheid N with [MRU|LRU] replacement

This can be somewhat misleading. If there are any nonclustered indexes that match the search arguments in the query, the costs for the cheapest index are printed here, even though a table scan may be used to execute the query. The actual decision on whether to perform a table scan or nonclustered index access is delayed until join order is evaluated (the next step in the optimization process).

This is because the most accurate costing of a nonclustered index depends on the ratio of physical vs. logical I/O and the amount of cache memory available when that table is chosen. Therefore, if the base cost (table scan cost) printed earlier is significantly less than the "cheapest index" shown here, it is more likely that a table scan will be used. Use showplan to verify this.

Estimating Selectivity for Search Clauses

The selectivity for search clauses is printed as the fraction of the rows in the table expected to qualify. Therefore, the lower the number, the more selective the search clause and the fewer the rows that are expected to qualify. Search clauses are output as:

Search argument selectivity is float_val.

Estimating Selectivity for Join Clauses

For joins, the optimizer never looks up specific values on the distribution page. At compile time, the optimizer has no known values for which to search. It needs to make a sophisticated estimate about costing these clauses.

If an index with a distribution page is available, the optimizer uses the density table, which stores the average number of duplicate keys in the index. All leading permutations of the composite key have their density stored, providing accurate information for multi-column joins.

If no distribution page is available for this index, the optimizer estimates the join selectivity to be 1 divided by the number of rows in the smaller table. This gives a rough estimate of the cardinality of a primary key-foreign key relationship with even data distribution. In both of these cases, the trace facility prints the calculated selectivity and cost estimates, as described below.

The selectivity of the clause is printed last. Join clauses are output as:

Join selectivity is float_val.

The selectivity for join clauses is output as the whole number of the fraction 1 divided by the selectivity. Therefore, the higher the number selectivity, the more selective the join clause, and the fewer the rows are expected to qualify.

At this point, the optimizer has evaluated all indexes for this clause and will proceed to optimize the next clause.