Saturday, June 9, 2018

Understanding the query execution plan

Whenever you issue a SQL statement in the SQL Server engine, SQL Server first has to determine the best possible way to execute it. In order to carry this out, the Query Optimizer (a system that generates the optimal query execution plan before executing the query) uses several information like the data distribution statistics, index structure, metadata, and other information to analyze several possible execution plans and finally select one that is likely to be the best execution plan most of the time.

Did you know? You can use SQL Server Management Studio to preview and analyze the estimated execution plan for the query that you are going to issue. After writing the SQL in SQL Server Management Studio, click on the estimated execution plan icon (see below) to see the execution plan before actually executing the query.

(Note: Alternatively, you can switch the actual execution plan option "on" before executing the query. If you do this, Management Studio will include the actual execution plan that is being executed along with the result set in the result window.)

Understanding the query execution plan in detail

Each icon in the execution plan graph represents an action item (Operator) in the plan. The execution plan has to be read from right to left, and each action item has a percentage of cost relative to the total execution cost of the query (100%).

In the above execution plan graph, the first icon in the right most part represents a "Clustered Index Scan" operation (reading all primary key index values in the table) in the HumanResources table (that requires 100% of the total query execution cost), and the left most icon in the graph represents a SELECT operation (that requires only 0% of the total query execution cost).

Following are the important icons and their corresponding operators you are going to see frequently in the graphical query execution plans:


(Each icon in the graphical execution plan represents a particular action item in the query. For a complete list of the icons and their corresponding action items, go to http://technet.microsoft.com/en-us/library/ms175913.aspx.)

Note the "Query cost" in the execution plan given above. It has 100% cost relative to the batch. That means, this particular query has 100% cost among all queries in the batch as there is only one query in the batch. If there were multiple queries simultaneously executed in the query window, each query would have its own percentage of cost (less than 100%).

To know more details for each particular action item in the query plan, move the mouse pointer on each item/icon. You will see a window that looks like the following:

This window provides detailed estimated information about a particular query item in the execution plan. The above window shows the estimated detailed information for the clustered index scan and it looks for the row(s) which have/has Gender = 'M' in the Employee table in HumanResources schema in the AdventureWorks database. The window also shows the estimated IO, CPU, number of rows, with the size of each row, and other costs that is used to compare with other possible execution plans to select the optimal plan.

I found an article that can help you further understand and analyze TSQL execution plans in detail. You can take a look at it here: http://www.simple-talk.com/sql/performance/execution-plan-basics/.

What information do we get by viewing the execution plans?

Whenever any of your query performs slowly, you can view the estimated (and, actual if required) execution plan and can identify the item that is taking the most amount of time (in terms of percentage) in the query. When you start reviewing any TSQL for optimization, most of the time, the first thing you would like to do is view the execution plan. You will most likely quickly identify the area in the SQL that is creating the bottleneck in the overall SQL.

Keep watching for the following costly operators in the execution plan of your query. If you find one of these, you are likely to have problems in your TSQL and you need to re-factor the TSQL to improve performance.

Table Scan: Occurs when the corresponding table does not have a clustered index. Most likely, creating a clustered index or defragmenting index will enable you to get rid of it.

Clustered Index Scan: Sometimes considered equivalent to Table Scan. Takes place when a non-clustered index on an eligible column is not available. Most of the time, creating a non-clustered index will enable you to get rid of it.

Hash Join: The most expensive joining methodology. This takes place when the joining columns between two tables are not indexed. Creating indexes on those columns will enable you to get rid of it.

Nested Loops: Most cases, this happens when a non-clustered index does not include (Cover) a column that is used in the SELECT column list. In this case, for each member in the non-clustered index column, the database server has to seek into the clustered index to retrieve the other column value specified in the SELECT list. Creating a covered index will enable you to get rid of it.

RID Lookup: Takes place when you have a non-clustered index but the same table does not have any clustered index. In this case, the database engine has to look up the actual row using the row ID, which is an expensive operation. Creating a clustered index on the corresponding table would enable you to get rid of it.

0 comments: