(Latest version 2012-10-27)
sp_sqltrace is a stored procedure written by Lee Tudor (a.k.a. Mr Tea), which I am happy to host here. I may also have made some contributions to the procedure myself, but the original ideas are Lee's.
The basic idea of sp_sqltrace is that you give it a batch text, for instance a call to a stored procedure. sp_sqltrace sets up a trace filtered for your spid. On SQL 2008 and later, sp_sqltrace by default also sets up an extended events session to collect wait-stats information for your spid. Alternatively, you can give sp_sqltrace a spid and a time for how long to snoop that spid, and the trace filters will be for that spid instead.
When the batch or snooping has completed, sp_sqltrace analyses the data in the trace files and produces a result set with important performance data from the trace, aggregated per statement, so that you can see which statements took the longest time to execute, needed the most CPU, generated the most reads and writes, and which waits they generated. This is particularly powerful with code that runs loops with many short-running statements – of which some may not be short-running enough. The result set also includes information about recompilations.
sp_sqltrace has several parameters that permit you to configure what information to collect and how to present it. Once you have traced a process, you may find that you want to see the data in a different way. No problem! sp_sqltrace permits you to reanalyse the trace files without rerunning the trace itself. For large traces from a production system, you may even prefer to move the files and run the analysis on a different server.
An extra feature in sp_sqltrace is the ability to capture query plans. Say that you have a stored procedure that runs many statements and you want to see the query plan for one. If you do this directly in SSMS, you will have to wade through lots of uninteresting plans. With sp_sqltrace, each execution plan is an XML document which you can double-click to see the plan. (In some versions of SSMS, see more below.) But...! This is not a feature to use casually, see further the section Including Query Plans.
sp_sqltrace runs on SQL 2005 and up.
Get the source code to sp_sqltrace here. Latest version is SourceSafe version 4, checked in 2012-10-27.
Note that the script creates the procedure in the master database. If you don't want to create procedures in master but in some other database, we recommend that you rename the procedure. See further the section Analysing Your Own Process below.
To run sp_sqltrace, you need at least ALTER TRACE permission on the server. To collect wait-stats information, you also need CONTROL SERVER (which is almost the same as being a member of sysadmin).
This can be one of:
Add a filter to the trace to include only statements that generate at least this many reads. The default is set to exclude trivial statements like variable assignments, IF statements etc.
Add a filter to the trace to include only statements that take at least this much CPU in the trace. The unit is ms.
Add a filter to the trace to include only statements with at least this duration in the trace. Note that the unit is µs. See also the section An Issue with Durations.
The measure for the column Factor in the result set. Can be one of
Output order for the result set. Can be blank, or one of
|@plans||Whether to include execution plans in the output. Can be blank, |
|@group_plans||bit||0||Instructs sp_sqltrace to remove parameter information from query plans so that they group better. It only comes into play when you specify a value for @plans. See further the section Including Query Plans.|
|@plan_reads||int||1||This parameter only matters when you specify a value for @plans. It instructs sp_sqltrace to only include plans for queries that generate at least this many reads in the result set. See further the section Including Query Plans.|
|@rollback||bit||0||Put the batch in a transaction and roll it back when it's done. The ROLLBACK is not included in the trace. This parameter only applies when you analyse your own process.|
How long, in seconds, the trace is to run. Normally, sp_sqltrace stops the trace when the batch has completed. However, if the batch is aborted because you cancel execution or lose the network connection, the trace keeps running. To prevent a runaway trace from adding load to your system for an extended time, sp_sqltrace always configures the trace to self-destruct when @trace_timeout seconds have elapsed.
|@snoop_time||int||10||How long, in seconds, to snoop a process when you specify a numeric value for @batch. Does not apply when you analyse your own process.|
|@server_directory||Directory where SQL Server is to write files for the trace and the X-event session. This should be an existing directory on the SQL Server machine to which SQL Server has write permission.|
|@file_size||bigint||10||Size for a trace file, before SQL Server closes a file and commences on the next rollover file. This parameter has little effect on the function or the performance of sp_sqltrace. However, if you trace something that generates a lot of trace events, you will get very many files with the default size.|
|@show_waits||bit||NULL||Whether to collect and present wait-stats information. This parameter defaults to 1, unless you specify a value for @plans, in which case it defaults to 0. On SQL 2005, where this information is not collectible, the parameter is always forced to zero. See further below the section Wait-stats Information.|
Note: the choices for the parameters @factor, @order and @plans are always case-insensitive, even if you have case-sensitive database or server collations.
Caveat: sp_sqltrace does not perform any rigorous validation of the parameters, so if you specify a value outside the expected ones, you may get strange error messages or just unexpected behaviour.
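As a sketch, a call that combines several of the parameters above could look like this. The procedure name, directory and thresholds are placeholders, and 'Duration' stands in for one of the @order choices; adjust to your environment:

```sql
-- Hypothetical example: trace a procedure call, write the trace files to
-- F:\temp\, skip statements with fewer than 50 reads, and list the worst
-- offenders first.
EXEC sp_sqltrace 'EXEC my_new_shiny_sp 12',
     @server_directory = 'F:\temp\',
     @minimum_reads    = 50,
     @order            = 'Duration'
```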
When sp_sqltrace starts running, it prints its command line so you can use the command if you want to reanalyse the trace, although it does not include all parameters.
When the batch has completed or the snooping has stopped, sp_sqltrace prints the message Batch completed. Analysis started. If your trace generated many trace and wait-stats events, the analysis can take some time. If you want to run the analysis on a different server to avoid adding extra load to a production server, you can cancel the execution at this point without harm.
The files generated by sp_sqltrace are always based on a GUID, which you find in the command line that sp_sqltrace prints. Take this example:
EXEC sp_sqltrace '07FB1BD4-BFF3-4114-B7AC-7229530E5312', @server_directory = 'F:\temp\'
In this case, sp_sqltrace has generated the file F:\temp\07FB1BD4-BFF3-4114-B7AC-7229530E5312.trc, and if your trace generated rollover files, there are also F:\temp\07FB1BD4-BFF3-4114-B7AC-7229530E5312_1.trc etc. If @show_waits was active, there are one or more files matching the pattern F:\temp\07FB1BD4-BFF3-4114-B7AC-7229530E5312*.xel. On SQL 2008 the event session also generates files matching the pattern F:\temp\07FB1BD4-BFF3-4114-B7AC-7229530E5312*.xem. To permit you to reanalyse the files, sp_sqltrace never removes any files; you will need to delete them manually.
The result is an aggregation by statement text, procedure name, line number and nest level. Say that you have a batch that goes "EXEC A". A calls B and C directly, and B also calls C. In this case, the statements from C appear twice in the result: once each for nest levels 2 and 3. If you specify a value for @plans, the execution plan is also an aggregation term, see further the section Including Query Plans.
These are the output columns:
|Factor||How much, in per cent, this statement contributes to the measure chosen by the parameter @factor. Blank if the percentage is < 0.5%. When you snoop a process, the values in this column can sometimes be funny; see the section Snooping a Process below.|
|Text||The text of the statement. EXEC statements are given a prefix of |
|Calls||Number of times this statement appeared in the trace. Blank if the statement only appeared a single time.|
|Nesting||On which nesting level the statement was executed.|
|Object - Line||SQL module and line number for the statement. When there is no object name in the trace, the value reads |
|Duration||Total duration for all executions of the statement captured in the trace. The value is in milliseconds with two decimals. See also the section An Issue with Durations.|
|Cpu||Total CPU time for all executions of the statement captured in the trace. The unit is milliseconds. Blank when 0.|
|Reads||Total number of reads for all executions of the statement according to the trace. Blank when 0.|
|Writes||Total number of writes for all executions of the statement according to the trace. Blank when 0.|
|Waits||Wait states related to this statement. For more details, see the section Wait-stats Information.|
|Compiles||Number of times the statement was recompiled. Blank if there were no recompiles.|
|Reason||Reason why the statement was recompiled. See the table below for the meaning of the codes. If there were recompiles for different reasons, sp_sqltrace arbitrarily prints only one of them and adds the text |
|Comp_Duration||The total recompilation time for the statement. This value is derived from the execution plan, and thus only populated if you supply a value for @plans; else you will only see a |
|Comp_Cpu||Total CPU time for all the recompilations of the statement. Like Comp_Duration, the value is taken from the execution plan, and if you did not specify a value for @plans, you will only see a |
|XPlan||The execution plan for the statement. NULL if you did not specify a value for @plans or if the plan was filtered out by @plan_reads. You can double-click the XML document to see the graphical plan. However, this is broken in some versions of SSMS 2008 R2; see this blog post from SQL Server MVP Aaron Bertrand for how to repair this. In SSMS 2005 this feature is not available; you will have to save the XML document with the extension .sqlplan and reopen it.|
By default, the statements are listed by their first appearance in the trace. You can use the parameter @order to get the statements listed by Duration + Comp_Duration, Reads, CPU + Comp_CPU or Writes in descending order to find any resource hog quickly.
There is always a total line, for which Text reads ---- EXEC (@batch); if you traced your own process, or ---- EXEC (snoop) if you snooped. This line has the totals for the columns Duration, Cpu, Reads, Writes, Waits, Compiles, Comp_Duration and Comp_Cpu. The columns Calls, Nesting and Object - Line are always blank for this line. With the default ordering the line appears last in the result set. When you use @order, it will appear at the top, usually as the first row.
This table details the codes in the Reason column. SubClassNo is the EventSubClass column in the trace. The descriptions are taken from Books Online.
|SET||4||SET option changed.|
|Temp||5||Temp table changed.|
|Remote||6||Remote rowset changed.|
|Browse||7||FOR BROWSE permissions changed.|
|QN||8||Query notification environment changed.|
|MPI||9||Partitioned view changed.|
|Cursor||10||Cursor option changed.|
|Manual||11||Option(RECOMPILE) or WITH RECOMPILE requested.|
When you supply a batch text for the @batch parameter, sp_sqltrace sets up a trace that is filtered for your own spid. Assuming that you put sp_sqltrace in master, you can use sp_sqltrace as in these examples:
EXEC sp_sqltrace 'EXEC my_new_shiny_sp 12'
EXEC sp_sqltrace 'DECLARE @x int; EXEC my_other_sp ''somestring'', @x OUTPUT; EXEC that_other_sp @x'
Since sp_sqltrace is in master and the name starts with sp_, SQL Server will find the procedure when you invoke it from another database and execute it in the context of that database, which is very convenient. However, to my knowledge this is not officially supported by Microsoft, and it could stop working one day.
If you put sp_sqltrace in another database, we recommend that you rename it to, for instance, sqltrace. In this case the procedure will execute in the context of that database, so if you put it in a utility database, you need to use it like this:
EXEC utildb..sqltrace 'EXEC mycurrentdb..my_new_shiny_sp 12'
Note that if the batch text includes string literals, you need to double the quotes to be able to use it as a parameter and conform to the SQL syntax. For a longer script, this can be quite inconvenient. An alternative is to run sp_sqltrace from another window and snoop the window where the script is.
sp_sqltrace runs the batch within TRY-CATCH. If the batch fails, sp_sqltrace re-raises the error message, but then continues and analyses the files that were collected nevertheless. (Since SSMS shows the error message, you will have to find your way to the Results tab on your own initiative.)
Beware that if you expect the batch to run for more than five minutes, you should increase @trace_timeout.
Instead of providing a batch text, you can supply an integer number, and sp_sqltrace will add a filter to the trace for that spid. Here are some examples when this is convenient:
In most cases you will want to use the @snoop_time parameter to control for how long you snoop the process. If you set it above @trace_timeout, sp_sqltrace automatically adjusts the lifetime of the trace.
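To make the snooping call concrete, here is a sketch. The spid 73 is a made-up example; you would find the real one with, e.g., sp_who or sys.dm_exec_sessions:

```sql
-- Snoop spid 73 for 60 seconds instead of the default 10.
EXEC sp_sqltrace 73,
     @snoop_time       = 60,
     @server_directory = 'F:\temp\'
```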
When you snoop a process, the Factor column can show bizarre numbers. sp_sqltrace uses the lowest value in the Nesting column to compute the base for the values in Factor. This works well when the process has a base nesting level that it regularly returns to during the trace. A prime example is a script which you start while sp_sqltrace is running and which completes before @snoop_time runs out. On the other hand, if you trace a process that is already running and stops executing while sp_sqltrace is snooping, you may see factors that exceed 100%. There are a few more variations on that theme.
You can use sp_sqltrace to reanalyse a trace without having to rerun the trace itself. Maybe you forgot to specify @order, maybe you want to display by a different @factor. Maybe you collected plans, but you want to see the result without plans (as they affect the aggregation). To reanalyse a trace, you supply the GUID for the file names for the @batch parameter as in these examples:
EXEC sp_sqltrace 'E141C39B-88D0-4D46-88BF-FE22D92E0F60', @server_directory = 'F:\temp\'
EXEC sp_sqltrace '1EAC8268=1E42=476B=84B9=023227636407'
The second example is from a snooped process, whence the = in the name. (sp_sqltrace needs to know whether the trace was from a snoop or not, and tracks this by the delimiter in the GUID.)
When you reanalyse a trace, you can use the same parameters as you did originally, or you can change them to get a different presentation. However, you cannot change all of them at will. The following applies:
|@minimum_reads||Has no impact, since this is a filter to the trace.|
|@factor||Can be altered freely.|
|@order||Can be altered freely.|
|@plans||If you supplied a value for @plans in the original call, you can leave the parameter blank when you reanalyse to ignore the plans. If you left @plans blank in the original invocation, there are no plans in the trace, so changing it does not help.|
|@group_plans||Can be altered freely (but is only meaningful, if you specify @plans).|
|@plan_reads||Can be altered freely (but is only meaningful, if you specify @plans).|
|@rollback||Does not apply.|
|@trace_timeout||Does not apply.|
|@snoop_time||Does not apply.|
|@server_directory||Must be set to the value for the original invocation, unless you have moved the files related to the trace.|
|@show_waits||If you collected wait stats in the original invocation, you can set this parameter to 0 to shorten analysis time when you reanalyse.|
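For instance, reusing the GUID from the example above, a reanalysis with a different ordering and without the wait-stats analysis could look like this sketch ('Reads' is assumed to be one of the @order choices):

```sql
-- Reanalyse existing trace files; no new trace is run.
EXEC sp_sqltrace 'E141C39B-88D0-4D46-88BF-FE22D92E0F60',
     @server_directory = 'F:\temp\',
     @order            = 'Reads',
     @show_waits       = 0
```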
On SQL 2008 and later, sp_sqltrace by default collects wait-stats information, unless you specify a value for the @plans parameter. sp_sqltrace collects this information by setting up an extended events session filtered for the spid being traced. (Your own or the one you snoop.) The name of this session is
sqlTraceWaits_spid, where spid is your own spid, never the spid you snoop. When the batch has completed or snooping has stopped, sp_sqltrace drops the event session.
Note however that if you cancel sp_sqltrace before the batch completes, the event session will remain active and you will need to drop it on your own. Unlike SQL Trace, there is no provision to set up an extended events session to self-destruct. The event session will also survive a server restart, but after a restart it will be in a stopped state. If there is an existing event session with the name
sqlTraceWaits_spid when you start sp_sqltrace, sp_sqltrace assumes that this is an orphaned session and drops it and creates a new one.
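If you need to clean up such an orphaned session yourself, the standard DDL applies. A sketch, assuming the spid that ran sp_sqltrace was 55 (check sys.server_event_sessions for the exact name):

```sql
-- List any leftover sessions, then drop the orphan.
-- The spid 55 below is an example; substitute the real one.
SELECT name FROM sys.server_event_sessions
WHERE  name LIKE 'sqlTraceWaits%'

DROP EVENT SESSION sqlTraceWaits_55 ON SERVER
```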
The wait-stats information appears in the Waits column. Here is an example of how it is presented:
[8sig]413ms=WRITELOG(760) | [9sig]15ms=PAGELATCH_EX(712) | 0ms=SOS_SCHEDULER_YIELD(81)
The line in the example tells us that over all executions of this statement, the process seems to have experienced 760 WRITELOG waits for a total of 413 ms. Of these, 8 ms were signal wait time, that is, the log-writing operation had completed, but the process had to wait before it could start executing. Likewise, the statement appears to have generated 712 waits for PAGELATCH_EX for a total of 15 ms, with an additional 9 ms signal wait time. Finally, there were 81 SOS_SCHEDULER_YIELD waits, but none of these were long enough to register a non-zero value for the duration, which is measured in milliseconds. (For the same reason, it is likely that the true totals for the other two wait types were higher than the listed ones.)
The list is sorted on descending order of duration (not including signal time) and count.
The wait stats can give you more information of why a statement takes long time; the above indicates that the I/O subsystem may be a bottleneck. If you see long LCK waits, this indicates blocking. ASYNC_NETWORK_IO indicates that the client is slow in processing the result set. For a full list of all wait states, see the topic for sys.dm_os_wait_stats in Books Online. Also see this blog post from SQL Server MVP Paul Randall.
sp_sqltrace maps the waits to the statements in the trace by means of the timestamp for the wait and the completion time for the statement. This mapping is by no means perfect for a variety of reasons:
Note that the total line for the trace, at the top or the bottom of the result set, includes the total of all waits in the session, including those that for some reason were not mapped to a statement.
Collecting the wait-stats information is fairly inexpensive, we believe. However, if there are many waits, the analysis can take quite some time, because of the somewhat impractical XML format the information is delivered in. Thus, if the wait-stats information does not tell you anything, set @show_waits to 0. Or, if you plan to reanalyse the trace anyway, have @show_waits on when you run the trace, and then off when you reanalyse, if you don't need the wait data at that point.
The parameter @plans permits you to include query plans in the trace. This can be very useful, if your batch is long and you want to look closer at one or two plans, but you don't want the Execution Plan tab in Mgmt Studio to be cluttered by all the other dull and uninteresting plans.
However, @plans is a parameter you should use most carefully, since it can have a negative impact on several levels:
'Actual', the execution time is three minutes.
There is not much you can do about the first two points, except: be careful! However, you can mitigate the effect of the last point, and to some extent the third, by making a judicious choice of which type of plans to include and by using the parameters @group_plans and @plan_reads.
@plans can be one of Estimated and Actual. Estimated plans only include the estimates from the optimizer, while actual plans include the actual number of rows processed by each operator and the actual number of executions. If you set @plans to Estimated, the result set will only be fragmented in case of recompilations. For Actual, on the other hand, the result set can be very fragmented, which makes it difficult to make sense of the data. Then again, maybe there is a statement that most of the time executes in 20 ms, but which for some odd input values runs for a lengthy time. Those executions will appear higher up in the output (with the appropriate value for @order), and by accessing the plan and looking at the plan properties, you can see which values cause the bad performance. More generally, the actual values in the plans can reveal misestimates, which often are a reason for bad performance.
When you set @group_plans to 1, sp_sqltrace deletes the ParameterList node from the XML document. This node holds both the values the plan was compiled for and, in the case of actual plans, the run-time values of parameters and variables. The parameter has limited effect for Estimated, but can reduce the size of the result set considerably for Actual. In ideal cases, it can remove the fragmentation from the result set entirely, although in many cases the difference in row counts will still cause some fragmentation. You should beware that @group_plans is a fairly expensive operation and, as noted above, sometimes the parameter values can be of interest.
@plan_reads serves as a filter for which statements sp_sqltrace includes the plan in the output. sp_sqltrace only adds the plan to the output for statements which generate at least this many reads. This reduces the size of the result set as well as the processing time for sp_sqltrace. Observe that the filter applies to the individual statement, not the aggregated value you see in the Reads column in the result set. Note that this parameter has no effect whatsoever on the size of the trace file; it's only a filter for sp_sqltrace.
The columns Comp_Duration and Comp_Cpu are only populated when you set @plans, as these values are taken from the XML document for the execution plan. sp_sqltrace collects these values also for plans that fall below the @plan_reads limit.
When @plans is active, sp_sqltrace defaults to not collecting wait-stats information. You can request wait stats by explicitly setting @show_waits = 1. Beware, though, that because of the overhead for collecting plans, the number of waits will increase drastically. The presence of the plans in the trace also affects the ability of sp_sqltrace to map waits to statements.
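Putting the advice in this section together, a cautious plan-collecting call might look like this sketch (the procedure name and thresholds are placeholders):

```sql
-- Collect actual plans, but only attach plans for statements with at least
-- 500 reads, and strip parameter values so that equal plans group together.
EXEC sp_sqltrace 'EXEC my_new_shiny_sp 12',
     @server_directory = 'F:\temp\',
     @plans            = 'Actual',
     @group_plans      = 1,
     @plan_reads       = 500
```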
The Duration column in SQL Trace is in microseconds, and the default presentation in Profiler is in milliseconds. With SQL 2005 up to Service Pack 2, you could also see values like 30 µs if you enabled microsecond display. However, on modern computers, the frequency of the CPU clock often varies to save energy, which is why these values are not accurate. Since SQL 2005 SP3, Microsoft uses interrupt ticks, which are approximately 1 ms, but the exact length depends on your hardware. Therefore you will mainly see values ending in .00 in the Duration column, although you may see occasional decimals here.
For more information on this issue see this blog post and this other blog post from Bob Dorr, Senior SQL Server Escalation Engineer and KB Article 931279.
If you have any comments on sp_sqltrace, contact me on firstname.lastname@example.org or Lee Tudor directly on email@example.com.
2012-10-27 – Major overhaul:
2010-08-21 – Behzad Sadeghi pointed out that plans were missing for dynamic SQL invoked through sp_executesql, due to an inconsistency in SQL Server, which I have reported on Connect. With the help of Behzad I have implemented a workaround for the issue.
2008-11-29 – First release posted on my web site.
Back to my home page.