dbcc traceon (302) can often help you understand why the optimizer makes choices that seem incorrect. It can help you debug queries and help you decide whether to use specific options, like specifying an index or a join order for a particular query. It can also help you choose better indexes for your tables.
showplan tells you the final decisions that the optimizer makes about your queries. dbcc traceon (302) helps you understand why the optimizer made the choices that it did. When you turn on this trace facility, you eavesdrop on the optimizer as it examines query clauses.
The output from this trace facility is more cryptic than showplan output, but can go further in explaining such questions as why a table scan is done rather than an indexed access, why index1 is chosen rather than index2, or why a reformatting strategy is applied. The trace provides detailed information on the costs the optimizer has estimated for each permutation of the table, search clause, and join clause as well as how those costs were determined.
Execute the following command from an isql batch followed by the query or stored procedure call you want to examine:
dbcc traceon(3604, 302)
This is what the trace flags mean:
| Trace Flag | Explanation |
|---|---|
| 3604 | Directs trace output to the client rather than the errorlog |
| 302 | Print trace information on index selection |
To turn off the output, use:
dbcc traceoff(3604, 302)
When you trace queries through this facility, run your queries in the same manner as your application, as follows:
Supply the same parameters and values to your stored procedures or SQL statements.
If the application uses cursors, use cursors in your tests.
Be very careful to ensure that your trace tests cause the optimizer to make the same decisions as in your application. You must supply the same parameters and values to your stored procedures or where clauses.
If you are using stored procedures, make sure that they are actually being optimized during the trial by executing them with recompile.
In most situations, SQL Server can use only one index per table in a query. This means the optimizer must often choose between indexes when there are multiple where clauses supporting both search arguments and join clauses. The optimizer's first step is to match search arguments and join clauses to available indexes.
The most important item that you can verify using this trace facility is that the optimizer is evaluating all possible where clauses included in each Transact-SQL statement.
If a clause is not included in this output, then the optimizer has determined it is not a valid search argument or join clause. If you believe your query should benefit from the optimizer evaluating this clause, find out why the clause was excluded, and correct it if possible. The most common reasons for "non-optimizable" clauses include:
Data type mismatches
Use of functions, arithmetic, or concatenation on the column
Numerics compared against constants that are larger than the definition of the column
See "Search Arguments and Using Indexes" for more information on requirements for search arguments.
Identifying how the optimizer estimates I/O often leads to the root of the problems and to solutions. You will be able to see when the optimizers uses your distribution page statistics and when it uses default values.
Each set of clauses evaluated by the optimizer is printed and delimited within two lines of asterisks. If you issue an unqualified query with no search arguments or join clause, this step is not included in the output, unless the query is covered (that is, a nonclustered index contains all referenced columns).
Output for each qualified clause looks like this:
*******************************
Entering q_score_index() for table 'name' (objectid obj_id, varno = varno).
The table has X rows and Y pages.
Scoring the clause_type CLAUSE
column_name operator [column_name}
<other query specific info explained later>
*******************************
q_score_index() is the name of a routine that SQL Server runs to cost index choices. It finds the best index to use for a given table and set of clauses. The clauses can be either constant search arguments or join clauses. There is a wealth of information contained in this routine's output. The next few pages evaluate each line separately.
The first line identifies the table name and its associated object ID. The actual output for this line looks like this:
Entering q_score_index() for table 'titles' (objecti 208003772), varno = 0
The optimizer analyzes all search arguments for all tables in each query, followed by all join clauses for each table in the query.
Therefore, you first see q_score_index() called for all tables in which the optimizer has found a search clause. The routine numbers the tables in the order in which they were specified in the from clause and displays the numbers as the varno. It starts numbering with 0 for the first table.
Any search clause not included in this section should be evaluated to determine whether its absence impacts performance.
Following the search clause analysis, q_score_index() is called for all tables where the optimizer has found a join clause. As above, any join clause not included in this section should be evaluated to determine whether its absence is impacting performance.
The next line prints the size of the table in both rows and pages:
The table has 5000 rows and 624 pages.
These sizes are pulled from the system tables where they are periodically maintained. There are some known problems where inaccurate row estimates cause bad query plans, so verify that this is not the cause of your problem.
The next two lines indicate the type of clause and a representation of the clause itself with column names and abbreviations for the operators. It indicates:
That it is evaluating a search clause, like this:
Scoring the SEARCH CLAUSE:
au_fname EQ
That it is evaluating a join clause, like this:
Scoring the JOIN CLAUSE:
au_id EQ au_id
All search clauses for all tables are evaluated before any join clauses are evaluated.
The operator codes are defined in Table 9-2.
| dbcc output | Comparison |
|---|---|
| EQ | Equality comparisons (=) |
| LT | Less than comparisons (<) |
| LE | Less than or equal to comparisons (<=) |
| GT | Greater than comparisons (>) |
| GE | Greater than or equal to comparisons (>=) |
| NE | Not equals (!=) |
| ISNULL | is null comparison |
| ISNOTNULL | is not null comparison |
If your queries include a range query or clauses that are treated like range queries, they are evaluated in a single analysis to produce an estimate of the number of rows for the range. For example,
Scoring the SEARCH CLAUSE:
au_lname LT
au_lname GT
Range queries include:
Queries using the between clause
Interval clauses with and on the same column name, such as:
datecol1 >= "1/1/94" and datecol1 < "2/1/94"
like clauses such as:
like "k%"
If the query has specified the use of a specific index by including the index keyword and the index name in parentheses after the table name in the from clause, this is noted in the output:
User forces index IndexID.
Specifying an index prevents consideration of other alternatives.
If the I/O size and cache strategy are also included in the query, these messages are printed:
User forces data prefetch of 8K
User forces LRU buffer replacement strategy
The next line of output displays the cost of a table scan for comparison, provided that there is at least one other qualification or index that can be considered. It reports index ID 0 and should match the table size estimate displayed earlier. The line looks like this:
Base cost: indid: IndexID rows: rows pages: pages prefetch: <S|N>
I/O size: io_size cacheid: cacheID replace: <LRU | MRU>
Here is an example:
Base cost: indid: 0 rows: 5000 pages: 624 prefetch: N
I/O size: 2 cacheid: 0 replace: LRU
| Output | Meaning |
|---|---|
| indid | The index ID from sysindexes; 0 for the table itself. |
| rows | The number of rows in the table. |
| pages | The number of pages in the table. |
| prefetch | Whether prefetch would be considered for the table scan. |
| I/O size | The I/O size to be used. |
| cacheid | The ID of the data cache to be used. |
| replace | The cache replacement strategy to be used, either LRU or MRU. |
Verify page and row counts for accuracy. Inaccurate counts can cause bad plans. To get a completely accurate count, use the set statistics io on command along with a select * from tablename query. In a VLDB (very large database) or in 24x7 shops (applications that must run 24 hours a day, 7 days a week), where that is not practical, you may need to rely on the reasonable accuracy of the sp_spaceused system procedure. dbcc allocation-checking commands print the object size and correct the values on which sp_spaceused and other object-size estimates are based.
Next, the optimizer evaluates each useful index for a given clause to determine its cost. The optimizer first looks for a unique index that is totally qualified--meaning that the query contains where clauses on each of the keys in the index. If such an index is available, the optimizer immediately knows that only a single row satisfies the clause, and it prints the following line:
Unique index_type index found--return rows 1 pages pages
The index_type is either clustered or nonclustered. There are three possibilities for the number of pages:
The unique index is clustered. The logical I/O cost is the height of the index tree. In a clustered index, the data pages are the leaf level of the index, so the data page access is included.
The unique nonclustered index covers the query. The logical I/O is the height of the index tree. The data page access is not needed, and not counted.
The unique nonclustered index does not cover the query. An additional logical I/O is necessary to get from the leaf level of the nonclustered index to the data page, so the logical I/O cost is the height of the nonclustered index plus one page.
If the index is not unique, then the optimizer determines the cost, in terms of logical I/Os, for the clause. Before doing so, it prints this line:
Relop bits are: integer
This information can be ignored. It merely restates the comparison operator (that is, =, <, >, interval, and so on) listed in the q_score_index() line mentioned earlier as an integer bitmap. This information is only necessary for Sybase Engineering to debug optimizer problems and it has no value for customer-level troubleshooting.
To estimate the I/O cost for each clause, the optimizer has a number of tools available to it, depending on the clause type (search clause or join clause) and the availability of index statistics. For more information, see "Index Statistics".
For each index, SQL Server keeps a statistical histogram of the indexed column's data distribution. This histogram is built automatically during index creation and is stored in a distribution page with the index unless the table is empty. This histogram is a sampling of the index key values every N rows.
N is dependent on the full size of the key (including overhead) and the number of rows in the table. Each sampling is known as a step. Since the optimizer knows how many rows exist between steps and the density of keys in the index, it can estimate the number of rows satisfying a clause with reasonable accuracy.
For search clauses, the optimizer can look up specific values on the distribution page, if these values are known at compile time. In this case, it first identifies the distribution page and the number of steps with the following trace output:
Qualifying stat page; pgno: page_number steps: steps
For atomic datatypes (datatypes such as tinyint, smallint, int, char, varchar, binary, and varbinary, which are not internally implemented as structures), it prints the constant value the search argument supplied to the optimizer. It looks like this:
Search value: constant_value
If the value is implemented as a structure, the following message is output to indicate that the optimizer does not waste time building the structure's printable representation:
*** CAN'T INTERPRET ***
If an exact match is found on the distribution page, the following message is printed:
Match found on statistics page
This is followed by information pertaining to the number and location of step values found on the distribution page. Since the optimizer knows approximately how many rows exist between step values, it uses this information to estimate how many logical I/Os would be performed for this clause. To indicate this information, one of the following messages is displayed:
equal to several rows including 1st or last -use endseveralSC
This indicates that several steps matched the constant and that they were found either at the beginning or at the end of the distribution page.
equal to a single row (1st or last) -use endsingleSC
This indicates that only one step matched the constant and it was found either at the beginning or at the end of the distribution page.
equal to several rows in middle of page -use midseveralSC
This indicates that several steps matched the constant and that they were found in the middle of the distribution page.
equal to single row in middle of page -use midsingleSC
This indicates that several steps matched the constant and that they were found in the middle of the distribution page.
If an exact match of the search value is not found on the distribution page, the optimizer uses different formulas in its statistical estimate. The computation is based on the relational operator used in the clause, the step number, the number of steps, and the density. First, the optimizer needs to find the first step value that is less than the search value. In these cases, you see the following message:
No steps for search value -qualpage for LT search value finds
Depending on whether the search value is outside the first or last step value or contained within steps, the optimizer will print one of the following messages:
value < first step -use outsideSC
value > last step -use outsideSC
The first message indicates that the query's search value is smaller than the first entry on the distribution page. The second message indicates that the query's search value is larger than the last entry on the distribution page. If the constant is a valid value in the table, these messages indicate that update statistics may need to be run.
value between step K, K+1, K=step_number -use betweenSC
This message indicates that the query's search value falls between two steps on the distribution page. You can only confirm here that the step number seems reasonable.
For example, if the step value of "K" is 3 and you suspect that the query's search value should fall towards the end of the table, something could be wrong. It would be reasonable then to expect the value of "K" to be larger (that is, towards the end of the distribution page). This may be another indication that update statistics needs to be run.
For a range query, the trace facility looks up the steps for both the upper and lower bounds of the query. This message appears:
Scoring SARG interval, lower bound.
After displaying the costing estimates for the lower bound, the net selectivity is calculated and displayed as:
Net selectivity of interval: float_value
A common problem the optimizer faces is that values for search criteria are not known until run time. Common scenarios that make the optimizer unable to use distribution statistics include:
where clauses based on expressions. For example:
select *
from tableName
where dateColumn >= getdate()
where clauses based on local variables. For example:
declare @fKey int
select @fKey=lookUpID
from mainTable
where pKey = "999"
select *
from lookUpTable
where pKey >= @fKey
In cases like these, the optimizer tries to make intelligent guesses based on average values. For example, if a distribution page exists for the index and the query is an equality comparison, the optimizer uses the density of the index (that is, the average number of duplicates) to estimate the number of rows.
Otherwise, the optimizer uses a "magic" number: it assumes that 10 percent of the table will match an equality comparison, 25 percent of the table will match a closed interval comparison, and 33 percent of the table will match for inequality and open interval comparisons. In these cases, the trace facility prints:
SARG is a subbed VAR or expr result or local variable (constat = number) -use magicSC or densitySC
Stored procedures and triggers need special attention to ensure efficient query plans. Many procedures are coded with where clauses based on input parameters. This behavior can cause some difficulty in troubleshooting. Consider a query whose single where clause may be very selective under one parameter and return nearly the entire table under another. These types of stored procedures can be difficult to debug, since the procedure cache could potentially have multiple copies in cache, each with a different plan. Since these plans are already compiled, users may be assigned a plan that may not be appropriate for their input parameter.
Another reason for the optimizer being unable to use an index's distribution table is that the distribution page can be nonexistent. This occurs:
When an index is created before any data is loaded.
When truncate table is used, and then the data is loaded.
In these cases, the optimizer uses the above mentioned "magic" numbers to estimate I/O cost and you see:
No statistics page -use magicSC
At this point, the selectivity and cost estimates are displayed.
For each qualified clause, the trace facility displays:
The index ID
The selectivity as a floating-point value
The cost estimate in both rows and pages
These values are printed as variables in this message:
Estimate: indid indexID, selectivity float_val, rows rows pages pages
If this clause had no qualifications, but the optimizer found a nonclustered index that covered the entire query, it identifies the table, since it would not be listed with a q_score_index() section. In this case, you see:
Finishing q_score_index() for table table_name (objectid) ID.
At this point, the cheapest index is examined and its costs are displayed:
Cheapest index is index IndexID, costing pages pages and generating rows rows per scan using no data prefetch (size 2) on dcacheid N with [MRU|LRU] replacement
This can be somewhat misleading. If there are any nonclustered indexes that match the search arguments in the query, the costs for the cheapest index are printed here, even though a table scan may be used to execute the query. The actual decision on whether to perform a table scan or nonclustered index access is delayed until join order is evaluated (the next step in the optimization process).
This is because the most accurate costing of a nonclustered index depends on the ratio of physical vs. logical I/O and the amount of cache memory available when that table is chosen. Therefore, if the base cost (table scan cost) printed earlier is significantly less than the "cheapest index" shown here, it is more likely that a table scan will be used. Use showplan to verify this.
The selectivity for search clauses is printed as the fraction of the rows in the table expected to qualify. Therefore, the lower the number, the more selective the search clause and the fewer the rows that are expected to qualify. Search clauses are output as:
Search argument selectivity is float_val.
For joins, the optimizer never looks up specific values on the distribution page. At compile time, the optimizer has no known values for which to search. It needs to make a sophisticated estimate about costing these clauses.
If an index with a distribution page is available, the optimizer uses the density table, which stores the average number of duplicate keys in the index. All leading permutations of the composite key have their density stored, providing accurate information for multi-column joins.
If no distribution page is available for this index, the optimizer estimates the join selectivity to be 1 divided by the number of rows in the smaller table. This gives a rough estimate of the cardinality of a primary key-foreign key relationship with even data distribution. In both of these cases, the trace facility prints the calculated selectivity and cost estimates, as described below.
The selectivity of the clause is printed last. Join clauses are output as:
Join selectivity is float_val.
The selectivity for join clauses is output as the whole number of the fraction 1 divided by the selectivity. Therefore, the higher the number selectivity, the more selective the join clause, and the fewer the rows are expected to qualify.
At this point, the optimizer has evaluated all indexes for this clause and will proceed to optimize the next clause.