An SQL text by Erland Sommarskog, SQL Server
MVP. Most recent update 2021-03-11.
Copyright applies to this text. See here for font conventions used in this article.
This article tackles two related questions:
In this text I will discuss a number of possible solutions and point out their advantages and drawbacks. Some methods apply only when you want to access the output from a stored procedure, whereas other methods are good for the input scenario, and yet others are good for both input and output. In the case you want to access a result set, most methods require you to rewrite the stored procedure you are calling (the callee) in one way or another, but some solutions do not.
Here is a summary of the methods that I will cover. Required version refers to the earliest version of SQL Server where the solution is available. When the column is empty, this means all versions from SQL 2000 and up.
|Method||In/Out||Rewrite callee?||Required version||Comment|
||Output||Yes||Not generally applicable, but sometimes overlooked.|
|Table-valued Functions||Output||Yes||Often the best choice for output-only, but there are several restrictions.|
|Inline Functions||Use this to reuse a single SELECT.|
|Multi-statement Functions||When you need to encapsulate more complex logic.|
|Using a Table||In/Out||Yes||The most general solution. My favoured choice for input/output scenarios.|
|Sharing a Temp Table||Mainly for a single pair of caller/callee.|
|Process-keyed Table||Best choice for many callers to the same callee.|
|Table-valued Parameters||Input||Yes||SQL 2008||Mainly useful when passing data from a client.|
||Output||No||Deceivingly appealing, but should be used sparingly.|
||Output||No||SQL 2005||Complex, but useful as a last resort when INSERT-EXEC does not work.|
||Output||No||Tricky with many pitfalls. Discouraged.|
||In/Out||Yes||SQL 2005||A bit of a kludge, but not without advantages.|
|Using Cursor Variables||Output||Yes||Not recommendable.|
|Session Context||In/Out||Yes||SQL 2016||Not a general method, but useful to keep data globally available in a process.|
At the end of the article, I briefly discuss the particular situation when your stored procedures are on different servers, which is a quite challenging situation.
A related question is how to pass table data from a client, but this is a topic which is outside the scope for this text, but I discuss this in my article Using Table-Valued Parameters in SQL Server and .NET.
Examples in the article featuring tables such as authors, titles, sales etc run in the old sample database pubs. You can download the script for pubs from Microsoft's web site. (Some examples use purely fictive tables, and do not run in pubs.)
This method can only be used when the result set is one single row. Nevertheless, this is a method that is sometimes overlooked. Say you have this simple stored procedure:
CREATE PROCEDURE insert_customer @name nvarchar(50), @address nvarchar(50), @city nvarchar(50) AS DECLARE @cust_id int BEGIN TRANSACTION SELECT @cust_id = coalesce(MAX(cust_id), 0) + 1 FROM customers (UPDLOCK) INSERT customers (cust_id, name, address, city) VALUES (@cust_id, @name, @address, @city) COMMIT TRANSACTION SELECT @cust_idThat is, the procedure inserts a row into a table, and returns the id for the row.
Rewrite this procedure as:
CREATE PROCEDURE insert_customer @name nvarchar(50), @address nvarchar(50), @city nvarchar(50), @cust_id int OUTPUT AS BEGIN TRANSACTION SELECT @cust_id = coalesce(MAX(cust_id), 0) + 1 FROM customers (UPDLOCK) INSERT customers (cust_id, name, address, city) VALUES (@cust_id, @name, @address, @city) COMMIT TRANSACTION
You can now easily call insert_customer from another stored procedure. Just recall that in T‑SQL you need to specify the OUTPUT keyword also in the call:
EXEC insert_customer @name, @address, @city, @cust_id OUTPUT
Note: this example has a single output parameter, but a stored procedure can have many output parameters.
When all you want to do is to reuse the result set from a stored procedure, the first thing to investigate is whether it is possible to rewrite the stored procedure as a table-valued function. This is far from always possible, because SQL Server is very restrictive with what you can put into a function. But when it is possible, this is often the best choice.
There are two types of table functions in SQL Server: inline and multi-statement functions.
Here is a example of an inline function adapted from Books Online for SQL 2000:
CREATE FUNCTION SalesByStore (@storeid varchar(30)) RETURNS TABLE AS RETURN (SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid)To use it, you simply say:
SELECT * FROM SalesByStore('6380')You can filter the data with WHERE or use it in a bigger query that includes other tables. That is, you use the function just like was a table or a view. You could say that an inline function is a parameterised view, because the query optimizer expands the function as if it was a macro, and generates the plan as if you had provided the expanded query. Thus, there is no performance cost for packaging a SELECT statement into an inline function. For this reason, when you want to reuse a stored procedure that consists of a single SELECT statement, rewriting it into an inline UDF is without doubt the best choice. (Or instead of rewriting it, move the SELECT into a UDF, and rewrite the existing procedure as a wrapper on the function, so that the client is unaffected.)
There are a couple of system functions you cannot use in a UDF, because SQL Server thinks it matters that they are side-effecting. Two examples are newid(), and rand(). On SQL 2000 this restriction goes further and disallows all system functions that are nondeterministic, that is, functions that do not return the same value for the same input parameters on each call. A typical example is getdate().
A multi-statement function has a body that can have as many statements as you like. You need to declare a return table, and you insert the data to return into that table. Here is the function above as a multi-statement function:
CREATE FUNCTION SalesByStore (@storeid varchar(30)) RETURNS @t TABLE (title varchar(80) NOT NULL PRIMARY KEY, qty smallint NOT NULL) AS BEGIN INSERT @t (title, qty) SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid RETURN END
You use multi-statement functions in the same way as you use inline functions, but in difference to inline functions, they are not expanded in place, but instead it's like you would call a stored procedure in the middle of the query and return the data in a table variable. This permits you to move the code of a more complex stored procedure into a function.
As you can see in the example, you can define a primary key for your return table. I like to point out that this definitely best practice for two reasons:
It goes without saying, that this is only meaningful if you define a primary key on the columns you produce in the body of the UDF. Adding an IDENTITY column to the return table only to get a primary key is pointless.
Compared to inline functions, multi-statement functions incur some overhead due to the return table. More important, though, is that if you use the function in a query where you join with other tables, the optimizer will have no idea of what the function returns, and will make standard assumptions. This is far from always an issue, but the more rows the function returns, the higher the risk that the optimizer will make incorrect estimates and produce an inefficient query plan. One way to avoid this is to insert the results from the function into a temp table. Since a temp table has statistics this helps the optimizer to make a better plan.
Note: In SQL 2017, Microsoft introduced something they called interleaved execution for multi-statement functions. When compiling the query, they first run the function, and once the optimizer knows the number of rows the function returns, it builds the rest of the plan based on this. However, because of the lack of distribution statistics, bouncing the data over a temp table may still yield better results. Also, beware that interleaved execution only applies to SELECT statements, but not to SELECT INTO, INSERT, UPDATE, DELETE or MERGE. Particularly the limitation with the first two can be deceitful, since a SELECT that works fine when it returns data to the client, may start to misbehave when you decide to capture the rows in a table in an INSERT or SELECT INTO statement instead, because you now get the blind estimate.
User-defined functions are quite restricted in what they can do, because a UDF is not permitted to change the database state. The most important restrictions are:
Please see the Remarks section in the topic for CREATE FUNCTION in Books Online for a complete list of restrictions.
What could be better for passing data in a database than a table? When using a table there are no restrictions like there is when you use a table-valued function. There are two main variations of this method: 1) Sharing a local temp table. 2) Using a process-keyed table. The former is more lightweight, but it comes with a maintainability problem which the second alternative addresses by using a table with a persisted schema. Both solution comes with recompilation problems that can be a serious problem for stored procedures that are called with a high frequency, although for a process-keyed tables this can be mitigated. Using a local temp table also introduces a risk for cache littering. I will discuss these problems in more details as we move on.
The method itself is as simple as this:
CREATE PROCEDURE inner_sp @par1 int, @par2 bit, ... AS ... INSERT/UPDATE/DELETE #mytemp go CREATE PROCEDURE outer_sp AS DECLARE ... CREATE TABLE #mytemp (col1 int NOT NULL, col2 char(5) NULL, ...) ... EXEC inner_sp @par1, @par2 ... SELECT * FROM #mytemp go
In this example, outer_sp creates the temp table, and inner_sp fills it in, that is, the table is output-only. A different scenario is that outer_sp fills the table with input data whereupon inner_sp performs some general computation, and the caller uses the result from that computation for some purpose. That is, the table is used for both input and output. Yet a scenario is that the caller prepares the temp table with data, and the callee first performs checks to verify that a number of business rules are not violated, and then goes on to update one or more tables. This would be an input-only scenario.
Say that you have this procedure:
CREATE PROCEDURE SalesByStore @storeid varchar(30) AS SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid
You want to reuse this result set in a second procedure that returns only titles that have sold above a certain quantity. How would you achieve this by sharing a temp table without affect existing clients? The solution is to move the meat of the procedure into a sub-procedure, and make the original procedure a wrapper on the original like this:
CREATE PROCEDURE SalesByStore_core @storeid varchar(30) AS INSERT #SalesByStore (title, qty) SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid go CREATE PROCEDURE SalesByStore @storeid varchar(30) AS CREATE TABLE #SalesByStore(title varchar(80) NOT NULL PRIMARY KEY, qty smallint NOT NULL) EXEC SalesByStore_core @storeid SELECT * FROM #SalesByStore go CREATE PROCEDURE BigSalesByStore @storeid varchar(30), @qty smallint AS CREATE TABLE #SalesByStore(title varchar(80) NOT NULL PRIMARY KEY, qty smallint NOT NULL) EXEC SalesByStore_core @storeid SELECT * FROM #SalesByStore WHERE qty >= @qty go EXEC SalesByStore '7131' EXEC BigSalesByStore '7131', 25 go DROP PROCEDURE SalesByStore, BigSalesByStore, SalesByStore_core
Note: This script is a complete repro script that creates some objects, tests them, and then drops them, to permit simple testing of variations. We will look at more versions of these procedures later in this text.
Just like in the example with the multi-statement function, I have defined a primary key for the temp table, and exactly for the same reasons. Speaking of best practices, some readers may wonder about the use of SELECT * here. I think using SELECT * from a temp table created in the same procedure is OK, particularly if the purpose is to return all columns in the temp table. (In difference to using SELECT * from a table created elsewhere, and which may be altered without your knowledge.)
While this solution is straightforward, you may feel uneasy by the fact that the CREATE TABLE statement for the temp table appears in two places, and there is a third procedure that depends on the definition. Here is a solution which is a little more convoluted that to some extent alleviates the situation:
CREATE PROCEDURE SalesByStore_core @storeid varchar(30), @wantresultset bit = 0 AS IF object_id('tempdb..#SalesByStore') IS NULL BEGIN CREATE TABLE #SalesByStore(title varchar(80) NOT NULL PRIMARY KEY, qty smallint NOT NULL) END INSERT #SalesByStore (title, qty) SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid IF @wantresultset = 1 SELECT * FROM #SalesByStore go CREATE PROCEDURE SalesByStore @storeid varchar(30) AS EXEC SalesByStore_core @storeid, 1 go
I've moved the CREATE TABLE statement from the wrapper into the core procedure, which only creates the temp table only if it does not already exist. The wrapper now consists of a single EXEC statement and passes the parameter @wantresultset as 1 to instruct the core procedure to produce the result set. Since this parameter has a default of 0, BigSalesByStore can be left unaffected from the previous example. (Thus, there are still two CREATE TABLE statements for the temp table.)
Before we move on, I like to point out that the given example as such is not very good practice. Not because the concept of sharing temp tables as such is bad, but as with all solutions, you need to use them in the right place. As you realise, defining a temp table and creating one extra stored procedure is too heavy artillery for this simple problem. But an example where sharing temp tables would be a good solution would have to consist of many more lines of code, which would have obscured the forest with a number of trees. Thus, I've chosen a very simple example to highlight the technique as such.
Keep in mind that compared to languages such as C# and Java, Transact-SQL is poorly equipped for code reuse, and therefore solutions in T‑SQL to reuse code are clumsier. For this reason, the bar for reuse is somewhat higher in T‑SQL. It's still a virtue, but not as big virtue as in modern object-oriented languages. In this simple problem, the best would of course be to add @qty as a parameter to SalesByStore. And if that would not be feasible for some reason, it would still be better to create BigSalesByStore by means of copy-paste than sharing a temp table.
There are two performance issues with this technique that you need to be aware of: it can cause quite some cache littering and all statements in the inner procedure that refer to the shared temp table will be recompiled every time.
The first issue is something I managed to sleep over myself for many years, until Alex Friedman made me aware of it. It is covered in the white paper Plan Caching in SQL Server 2008 (which despite the title apply to later versions as well) where it says:
If a stored procedure refers to a temporary table not created statically in the procedure, the spid (process ID) gets added to the cache key. This means that the plan for the stored procedure would only be reused when executed again by the same session. Temporary tables created statically within the stored procedure do not cause this behavior.
That is, if two sessions invoke outer_sp there be will be one entry each in the cache of inner_sp for the two sessions. Say now that you have a busy system where there are thousand sessions that invoke outer_sp. That results in thousand cache entries for one single procedure. Imagine now that you employ this technique in some 10-20 pairs of stored procedure with that usage pattern. That's up to 20000 cache entries. That is quite a lot. A second consequence of this is that depending on the session id you get you can get different execution plans due to parameter sniffing, which can lead to some confusion until you realise what is going on.
There is a simple way to avoid this cache littering, although it comes with its own price: add WITH RECOMPILE to the inner procedure. This prevents the plan for this procedure to be put in the cache at all, but it also means that the procedure has to be compiled every time. This is however not as bad it may sound at first, if you are on SQL 2017 or earlier. Or more precisely: even without WITH RECOMPILE, you may still have to face a lot of recompilation of the inner procedure and that is the second performance issue with this technique on versions before SQL 2019.
To wit: every time outer_sp is called, inner_sp sees a new instance of the shared temp table, and there is no guarantee that the schema is the same as last time. For this reason, SQL Server will recompile all statements in inner_sp that refers to the shared temp table. As long as the procedure is called with moderate frequency, this recompilation is not so much of concern, but in a high-frequency scenario it can cause quite an increase in CPU usage.
If the pattern is such that outer_sp is called with high frequency, but from a small number of sessions, so that you don't need WITH RECOMPILE to avoid cache littering, and the number of rows in the shared temp table is moderate, you can reduce the number of recompilations in this way: you write inner_sp so that it most of the time it works with a temp table local to the procedure (which you must not create with SELECT INTO!). On entry and exit from the procedure you copy data from/to the shared temp table. To be any point in this, I think you should use the query hint OPTION (KEEPFIXED PLAN) in the queries referring to this local temp table to prevent recompilation because of changes in statistics.
Much to my surprise – and delight! – Microsoft addressed this in SQL 2019. Now SQL Server caches the definition of the temp table table and checks if the schema is the same as in the previous call, and if it is, the recompile is skipped. For a pattern where A calls B in a loop, and B creates a temp table and then calls C and works on the temp table, this change is quite a big performance booster. Note that the first issue with the spid getting added to the cache key still remains in SQL 2019.
If the inner procedure is called from many places, and you want to change which columns it reads/writes, you need to revisit all calling stored procedures to edit the temp-table definition. For this reason, sharing temp tables is mainly useful when you have a single pair of caller and callee. Then again, if the temp table is narrow, maybe only a single column of customer IDs to process, the table is likely to be very stable.
There are some alternatives to overcome the maintenance problem. One is to use a process-keyed table, which we will look into in the next section. I have also received some interesting ideas from readers of this article.
One solution comes from Richard St-Aubin. The callers create the temp table with a single dummy column, and then call a stored procedure that uses ALTER TABLE to add the real columns. It would look something like this:
CREATE PROCEDURE inner_sp @par1 int, @par2 bit, ... AS ... INSERT/UPDATE/DELETE #mytemp go CREATE PROCEDURE define_temp_table AS ALTER TABLE #mytemp ADD col1 int NOT NULL, col2 char(5) NULL, ... go CREATE PROCEDURE outer_sp AS DECLARE ... CREATE TABLE #mytemp (dummycol bit) EXEC define_temp_table ... EXEC inner_sp @par1, @par2 ... SELECT * FROM #mytemp go
You must create the temp table in outer_sp, since if you were to put the CREATE TABLE statement in define_temp_table, the table would be dropped when that procedure exits. This method can definitely be worth exploring, but beware that now you will get the schema-induced recompile that I discussed in the previous section in the outer procedures as well. Also, beware that this trick prevents SQL Server from caching the temp-table definition. This is one more thing that could have a significant impact in case of high-frequency calls.
Another solution, which requires SQL 2008, comes from Wayne Bloss. He creates a table type that holds the definition of the temp table. You can only use table types for declaring table variable and table parameters. But Wayne has a cure for this:
DECLARE @dummy my_table_type SELECT * INTO #mytemp FROM @dummy
From this point you work with #mytemp; the sole purpose of @dummy is to be able to create #mytemp from a known and shared definition. (If you are unacquainted with table types, we will take a closer look on them in the section on table-valued parameters.) A limitation with this method is that you can only centralise column definitions this way, but not constraints as they are not copied with SELECT INTO. You may think that constraints are odd things that you rarely put in a temp table, but I have found that it is often fruitful to add constraints to my temp tables as assertions for my assumptions about the data. This does not the least apply for temp tables that are shared between stored procedures. Also, defining primary keys for your temp tables can avoid performance issues when you start to join them.
Let me end this section by pointing out that sharing temp tables opens for some flexibility. The callee only cares about the columns it reads or writes. This permits a caller to add extra columns for its own usage when it creates the temp table. Thus, two callers to the same inner procedure could have different definitions of the temp table, as long as the columns accessed by the inner procedure are defined consistently.
Note: A more advanced way to tackle the maintenance problem is to use a pre-processor and put the definition of the temp table in an include-file. If you have a C compiler around, you can use the C pre-processor. My AbaPerls includes a pre-processor, Preppis, which we use in the system I spend most of my time with.
SQL Server Data Tools, SSDT, is a very versatile environment that gives you many benefits. One benefit is that if you write a stored procedure like:
CREATE PROCEDURE example_sp AS CREATE TABLE #temp(a int NOT NULL) SELECT a FROM #temmp
SSDT will tell you up front of the misspelling about the temp table name, before you try to create the procedure. This is certainly a very helpful feature to have the typo trapped early. However, SSDT has no notion about sharing temp tables, so SSDT will also give you a warning for a procedure like SalesByStore_core, or more precisely three: one per column. They are only warnings, so you can proceed, but it only takes a handful of such procedures to clutter up the Error List window so there is a risk that you miss other and more important issues.
There is a way to suppress the warning: right-click the file in Solution Explorer and select Properties. There is a property Suppress T‑Sql Warning and here you can enter the code for the error. But this means that you lose the checking of all table names in the procedure; there is no means to only suppress the warning only for the shared temp table.
All and all, if you are using SSDT, you will find this to be an extra resistance barrier against sharing temp tables.
This method evades cache-littering problem and the maintenance problem by using a permanent table instead. There is still a recompilation problem, though, but of a different nature.
A process-keyed table is simply a permanent table that serves as a temp table. To permit processes to use the table simultaneously, the table has an extra column to identify the process. The simplest way to do this is the global variable @@spid (@@spid is the process id in SQL Server). In fact, this is so common, that these tables are often referred to as spid-keyed tables. Here is an outline; I will give you a more complete example later.
CREATE TABLE process_keyed (spid int NOT NULL, col1 int NOT NULL, col2 char(5) NULL, ...) go CREATE CLUSTERED INDEX processkey_ix ON process_keyed (spid) -- Add other columns as needed. go ... DELETE process_keyed WHERE spid = @@spid INSERT process_keyed (spid, col1, col2, ....) VALUES (@@spid, @val1, @val2, ...) ... SELECT col1, col2, ... FROM process_keyed WHERE spid = @@spid ... DELETE process_keyed WHERE spid = @@spid
A few things to note here:
WHERE spid = @@spid.
While it's common to use @@spid as the process key there are two problems with this:
One alternative for the process-key is to use a GUID (data type uniqueidentifier). If you create the process key in SQL Server, you can use the function newid(). (You can rely on newid() to return a unique value, which is why it addresses the first point.) You may have heard that you should not have guids in your clustered index, but that applies when the guid is the primary key alone, since this can cause fragmentation and a lot of page splits. In a process-keyed table, you will typically have many rows for the same guid, so it is a different situation. And more to the point: you don't get more page splits if you use a GUID than if you use @@spid.
Another alternative is to generate the process key from a sequence object, which you create with the statement CREATE SEQUENCE, for instance:
CREATE SEQUENCE MySequence AS int
You can then use NEXT VALUE FOR to get values from the sequence:
DECLARE @processkey int = NEXT VALUE FOR MySequence INSERT tbl (processkey, col1, col2, ...) SELECT @processkey, col1, col2 FROM ...
You need to use the variable, or else the sequence will generate a different value on each row. (The typical use for sequences is to generate surrogate keys just like IDENTITY.)
Note: Sequences were introduced in SQL 2012.
Let's say that there are several places in the application where you need to compute the total number of sold books for one or more stores. You put this computation in a procedure ComputeTotalStoreQty, which operates on the table stores_aid. In this example, the procedure is nothing more than a simple UPDATE statement that computes the total number of books sold per store. A real-life problem could have a complex computation that runs over several hundred lines of code. There is also an example procedure TotalStoreQty which returns the returns the total sales for a certain state. It fills stores_aid with all stores in that state, calls ComputeTotalStoreQty and then returns the result to the client. Note that TotalStoreQty is careful to clear its entry in stores_aid both before and after the call.
CREATE TABLE stores_aid (process_key uniqueidentifier NOT NULL, storeid char(4) NOT NULL, totalqty smallint NULL, CONSTRAINT pk_stores_aid PRIMARY KEY (process_key, storeid) ) go CREATE PROCEDURE ComputeTotalStoreQty @process_key uniqueidentifier AS UPDATE stores_aid SET totalqty = s.totalqty FROM stores_aid sa JOIN (SELECT stor_id, SUM(qty) AS totalqty FROM sales GROUP BY stor_id) AS s ON s.stor_id = sa.storeid WHERE sa.process_key = @process_key go CREATE PROCEDURE TotalStoreQty @state char(2) AS DECLARE @process_key uniqueidentifier SELECT @process_key = newid() DELETE stores_aid WHERE process_key = @process_key INSERT stores_aid(process_key, storeid) SELECT @process_key, stor_id FROM stores WHERE state = @state EXEC ComputeTotalStoreQty @process_key SELECT storeid, totalqty FROM stores_aid WHERE process_key = @process_key DELETE stores_aid WHERE process_key = @process_key go EXEC TotalStoreQty 'CA' go DROP PROCEDURE TotalStoreQty, ComputeTotalStoreQty DROP TABLE stores_aid
Please note that I have defined a proper key for stores_aid adhering to best practices.
You may wonder what that _aid in the table name comes from. In the environment where I do my daily chores, we have quite a few process-keyed tables, and we have adapted the convention that all these tables end in -aid. This way, when you read some code, you know directly that this is not a "real" table with persistent data. (Nevertheless, some of our aid tables are very important in our system as they are used by core functions.)
There is a second point with this name convention. It cannot be denied that a drawback with process-keyed tables is that sloppy programmers could forget to delete data when they are done. Not only this wastes space, it can also result in incorrect row-count estimates leading to poor query plans. For this reason, it is a good idea to clean up these tables on a regular basis. For instance, in our night job we have a procedure that runs the query below and then executes the generated statements:
SELECT 'DELETE ' + quotename(name) FROM sys.objects WHERE type = 'U' AND name LIKE '%aid'
As we saw, when sharing temp tables, this causes recompilations in the inner procedure, because the temp table is a new table every time. While this issue does not exist with process-keyed tables, you can still get a fair share of recompilation because of auto-statistics, a feature which is enabled in SQL Server by default. For a permanent table, auto-statistics kicks in when the first 500 rows have been added, and then every time 20 % of the rows have changed. (For full details on recompilation, see this white paper by Eric Hanson and Yavor Angelov.) Since a process-keyed table is typically empty when it is not in use, auto-statistics sets in often. Sometimes this can be a good thing, as the statistics may help the optimizer to find a better plan. But the recompilation may also cause an unacceptable performance overhead. There are three ways to deal with this:
Compared to sharing temp tables, one disadvantage with process-keyed tables is that you tend to put them in the same database as your other tables. This has two ramifications:
You can address the second point putting your process-keyed tables in a separate database with simple recovery. Both points can be addressed by using a memory-optimised table or a global temp table, discussed in the next two sections.
In SQL 2014, Microsoft introduced In-Memory OLTP, also known as Hekaton, which amongst other things introduced so-called memory-optimised tables. In SQL 2014, In-Memory OLTP is only available in Enterprise and Developer Edition, but starting with SQL 2016 SP1, it is available all editions, save for localdb and the very lowest editions of Windows Azure SQL Database.
A memory-optimised table is entirely in memory. By default, updates are also logged and written to files, so that data in the table survives a restart of the server. However, you can define a memory-optimised table to have a durability only for the schema. That is, all data in the table is lost on a server restart. This makes them a perfect fit for process-keyed tables, as these tables have hardly have any logging at all.
A couple of notes:
Here is an in-memory version of the stores_aid table:
CREATE TABLE stores_aid (process_key uniqueidentifier NOT NULL, storeid char(4) NOT NULL, totalqty smallint NULL, CONSTRAINT pk_stores_aid PRIMARY KEY NONCLUSTERED (process_key, storeid) ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY)
Using a global temp table is another solution to reduce logging for your process-keyed table, which you mainly would choose if you are on an edition or version of SQL Server where memory-optimised tables are not available to you. (Or you bump into to any of the restrictions for such tables.)
A global temp table has a name which has two leading hash marks (e.g. ##temp). In difference to a regular temp table, a global temp table is visible to all processes. When the process that created the table goes away, so does the table (with some delay if another process is running a query against the table in that precise moment). Under that condition, it is difficult to use such a table as a process-keyed table.
However, there is a special case that SQL Server MVP Itzik Ben-Gan made me aware of: if you create a global temp table in a start-up procedure, the global temp table will be around as long as the server is up, unless someone explicitly drops it. This permits you to use the table as a process-keyed table, because you can rely on the table to always be there. Since the table is in tempdb, you get less logging that for a table in your main database.
Here is a quick sample of how you create a global temp table when SQL Server starts:
USE master go CREATE PROCEDURE create_global_temp AS CREATE TABLE ##global(process_key uniqueidentifier NOT NULL, -- other columns here ) go EXEC sp_procoption create_global_temp, 'startup', 'true'
It cannot be denied that there are some problems with this solution. What if you need to change the definition of the global temp table in way that cannot be handled with ALTER TABLE? Restarting the server to get the new definition in place may not be acceptable. Or what if you have different versions of your database on a test server, and the different versions require different schemas for the process-keyed table? I would recommend that you refer to your process-keyed table through a synonym, so that in a development database or on a common test server it can point to a database-local table, and only on production or acceptance-test servers it would point to the global temp table. This also permits you retarget the synonym if the schema has to be changed without a server restart.
While process-keyed tables are not without issues when it comes to performance, and they are certainly a bit heavy-handed for the simpler cases, I still see this is the best overall solution that I present in this article. It does not come with a ton of restrictions like table-valued functions and it is robust, meaning that code will not break because of seemingly innocent changes in difference to some of the other methods we will look at later.
But that does not mean that using a process-keyed table is always the way to go. For instance, if you only need output-only, and your procedure can be written as a table-valued function, that should be your choice.
Table-valued parameters (TVP) were introduced in SQL 2008. They permit you to pass a table variable as a parameter to a stored procedure. When you create your procedure, you don't put the table definition directly in the parameter list, instead you first have to create a table type and use that in the procedure definition. At first glance, it may seem like an extra step of work, but when you think of it, it makes very much sense: you will need to declare the table in at least two places, in the caller and in the callee. So why not have the definition in one place?
Here is a quick example of a table-valued parameter in play:
CREATE TYPE my_table_type AS TABLE(a int NOT NULL, b int NOT NULL) go CREATE PROCEDURE inner_sp @indata my_table_type READONLY AS INSERT targettable (col1, col2) SELECT a, b FROM @indata go CREATE PROCEDURE outer_sp AS DECLARE @data my_table_type INSERT @data (a, b) VALUES (5, 7) EXEC inner_sp @data go
One thing to note is that a table-valued parameter always has an implicit default value of an empty table. So saying
EXEC inner_sp in this example would not be an error.
Table-valued parameters certainly seem like the definite solution, don't they? Unfortunately, TVPs have a very limited usage for the problem I'm discussing in this article. If you look closely at the procedure definition, you find the keyword READONLY. And that is not an optional keyword, but it is compulsory for TVPs. So if you want to use TVPs to pass data between stored procedures, they are usable solely for input-only scenarios. I don't know about you, but in almost all situations where I share a temp table or use a process-keyed table it's for input-output or output-only.
When I first heard that SQL 2008 was to have TVPs, I was really excited. And when I learnt that they were readonly, I was equally disappointed. During the beta of SQL 2008 I wrote an article, Why read-only table parameters is not enough, where I tried to whip up support for a feedback item in order to persuade the dev team to permit read-write TVPs when they are passed between stored procedures inside SQL Server. (Making them read-write when called from a client is likely to be a bigger challenge.) Ten years later, the feedback item is still active, but table-valued parameters are still readonly.
Note: While outside the scope for this article, table-valued parameters is still a welcome addition to SQL Server, since it makes it a lot easier to pass a set of data from client to server, and this context the READONLY restriction is not a big deal. I give an introduction how to use TVPs from ADO .Net in my article Using Table-Valued Parameters in SQL Server and .NET.
INSERT-EXEC is a method that has been in the product for a long time. It's a method that is seemingly very appealing, because it's very simple to use and understand. Also, it permits you use the result of a stored procedure without any changes to it. Above we had the example with the procedure SalesByStore. Here is a how we can implement BigSalesByStore with INSERT-EXEC:
CREATE PROCEDURE SalesByStore @storeid varchar(30) AS SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid go CREATE PROCEDURE BigSalesByStore @storeid varchar(30), @qty smallint AS CREATE TABLE #SalesByStore(title varchar(80) NOT NULL PRIMARY KEY, qty smallint NOT NULL) INSERT #SalesByStore (title, qty) EXEC SalesByStore @storeid SELECT * FROM #SalesByStore WHERE qty >= @qty go EXEC SalesByStore '7131' EXEC BigSalesByStore '7131', 25 go DROP PROCEDURE SalesByStore, BigSalesByStore
In this example, I receive the data in a temp table, but it could also be a permanent table or a table variable. (Except on SQL 2000, where you cannot use a table variable.)
It cannot be denied that this solution is simpler than the solution with sharing a temp table. So why then did I first present a more complex solution? Because when we peel off the surface, we find that this method has a couple of issues that are quite problematic.
If you for some reason would try:
CREATE TABLE #BigSalesByStore(titleid varchar(80) NOT NULL PRIMARY KEY, qty smallint NOT NULL) INSERT #BigSalesByStore (titleid, qty) EXEC BigSalesByStore '7131', 25
SQL Server will tell you:
Msg 8164, Level 16, State 1, Procedure BigSalesByStore, Line 8
An INSERT EXEC statement cannot be nested.
This is a restriction in SQL Server and there is not much you can do about it. Except than to save the use of INSERT-EXEC until when you really need it. That is, when rewriting the callee is out of the question, for instance because it is a system stored procedure.
Six months later there is a user requirement for the application function that uses the result set from SalesByStore that the column title_id should be displayed. A developer merrily adds the column to the result set. Unfortunately, any attempt to use the form that calls BigSalesByStore now ends in tears:
Msg 213, Level 16, State 7, Procedure SalesByStore, Line 2
Column name or number of supplied values does not match table definition.
What it says. The result set from the called procedure must match the column list in the INSERT statement exactly. The procedure may produce multiple result sets, and that's alright as long as all of them match the INSERT statement.
From my perspective, having spent a lot of my professional life with systems development, this is completely unacceptable. Yes, there are many ways to break code in SQL Server. For instance, a developer could add a new mandatory parameter to SalesByStore and that would also break BigSalesByStore. But most developers are aware the risks with such a change to an API and therefore adds a default value for the new parameter. Likewise, most developers understand that removing a column from a result set could break client code that expects that column and they would not do this without checking all code that uses the procedure. But adding a column to a result set seems so innocent. And what is really bad: there is no way to find out that there is a dependency – save searching through all the database code for calls.
Provided that you can change the procedure you are calling, there are two ways to alleviate the problem. One is simply to add a comment in the code of the callee, so that the next developer that comes around is made aware of the dependency and hopefully changes your procedure as well.
Another way is to use table types (if you are on SQL 2008 or later). Here is an example:
CREATE TYPE SalesByStore_tbl AS TABLE (title varchar(80) NOT NULL PRIMARY KEY, qty smallint NOT NULL) go CREATE PROCEDURE SalesByStore @storeid varchar(30) AS DECLARE @ret SalesByStore_tbl INSERT @ret (title, qty) SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid SELECT * FROM @ret go CREATE PROCEDURE BigSalesByStore @storeid varchar(30), @qty smallint AS DECLARE @data SalesByStore_tbl INSERT @data EXEC SalesByStore @storeid SELECT title, qty FROM @data WHERE qty >= @qty go EXEC SalesByStore '7131' EXEC BigSalesByStore '7131', 25 go DROP PROCEDURE SalesByStore, BigSalesByStore DROP TYPE SalesByStore_tbl
It is interesting to note that this code makes virtue of two things that usually are bad practice, to wit SELECT * and INSERT without an explicit column list. This is not a matter of sloppiness – it is essential here. If someone wants to extend the result set of SalesByStore, the developer has to change the table type, and BigSalesByStore will survive, even if the developer does not know about its existence.
You could argue that this almost like an output TVP, but don't forget the other problems with INSERT-EXEC – of which there are two more to cover.
Even if there is no explicit transaction started with BEGIN TRANSACTION, an INSERT statement constitutes a transaction of its own. (So that the statement can be rolled back in case of an error.) That includes any procedure called through INSERT-EXEC. Is this bad or not? In many cases, this is not much of an issue. But there are a couple of situations where this can cause problems:
In my articles on Error and Transaction Handling in SQL Server, I suggest that you should always have an error handler like
BEGIN CATCH IF @@trancount > 0 ROLLBACK TRANSACTION EXEC error_handler_sp RETURN 55555 END CATCH
The idea is that even if you do not start a transaction in the procedure, you should always include a ROLLBACK, because if you were not able to fulfil your contract, the transaction is not valid.
Unfortunately, this does not work well with INSERT-EXEC. If the called procedure executes a ROLLBACK statement, this happens:
Msg 3915, Level 16, State 0, Procedure SalesByStore, Line 9
Cannot use the ROLLBACK statement within an INSERT-EXEC statement.
The execution of the stored procedure is aborted. If there is no CATCH handler anywhere, the entire batch is aborted, and the transaction is rolled back. If the INSERT-EXEC is inside TRY-CATCH, that CATCH handler will fire, but the transaction is doomed, that is, you must roll it back. The net effect is that the rollback is achieved as requested, but the original error message that triggered the rollback is lost. That may seem like a small thing, but it makes troubleshooting much more difficult, because when you see this error, all you know is that something went wrong, but you don't know what.
If you at run-time can find out whether you are in INSERT-EXEC? Hm, yes, but that is a serious kludge. See this section in Part Two of Error and Transaction Handling in SQL Server for how to do it, if you absolutely need to.
You can also use INSERT-EXEC with dynamic SQL:
INSERT #tmp (...) EXEC sp_executesql @sql, @params, @par1, ...
Presumably, you have created the statement in @sql within your stored procedure, so it is unlikely that a change in the result set will go unnoticed. So from this perspective, INSERT-EXEC is fine. But the restriction that INSERT-EXEC can't nest remains, so if you use it, no one can call you with INSERT-EXEC. For this reason, in many cases it is better to put the INSERT statement inside the dynamic SQL.
There is also a performance aspect, that SQL Server MVP Adam Machanic has detailed in a blog post. The short summary is that with INSERT-EXEC, data does not go directly to the target table but bounces over a "parameter table", which incurs some overhead. Then again, if your target table is a temp table, and you put the INSERT inside the dynamic SQL, you may face a performance issue because of recompilation.
Occasionally, I see people who use INSERT-EXEC to get back scalar values from their dynamic SQL statement, which they typically invoke with EXEC(). In this case, you should not use INSERT-EXEC at all, but instead use sp_executesql which permits you to use OUTPUT parameters. Dynamic SQL is a complex topic, and if you are not acquainted with it, I recommend you to read my article The Curse and Blessings of Dynamic SQL.
INSERT-EXEC is simple to use, and if all you want to do is to grab a big result set from a stored procedure for further analysis ad hoc, it's alright.
But you should be very restrictive to use it in application code. Only use it when rewriting the procedure you are calling is completely out of the question. That is, the procedure is not part of your application: a system stored procedure or part of a third-party product. And in this case, you should make it a routine to always test your code before you take a new version of the other product in use.
If INSERT-EXEC shines in its simplicity, using the CLR is complex and bulky. It is not likely to be your first choice, and nor should it. However, if you are in the situation that you cannot change the callee, and nor it possible for you to use INSERT-EXEC because of any of its limitations, the CLR can be your last resort.
As a recap, here are the main situations where INSERT-EXEC fails you, and you may want to turn to the CLR:
The CLR has one more advantage over INSERT-EXEC: it is less sensitive to changes in the procedure you call. If a column is added to the result set of the procedure, your CLR procedure will not break.
The idea as such is simple: you write a stored procedure in a CLR language like C# or VB .NET that runs the callee and captures the result set(s) into a DataSet object. Then you write the data from the DataSet back to the table where you want the data. While simple, you need to write some code.
Let's have a look at an example. When you call the system procedure sp_helpdb for a specific database, it produces two result sets, of which the second lists the files for the database. Say that you want to gather this output for all databases on the server. You cannot use INSERT-EXEC due to the multiple result sets. To address this issue, I wrote a stored procedure in C# that you find in the file helpdb.cs. In the script helpdb.sql you can see how I employ it. This script defines a temp table #helpdb which has the same structure as the output as the second result set from sp_helpdb with two alterations: 1) I have added a column dbname to hold the name of the database the row comes from. 2) I have renamed the column name to logicalname to make things a little clearer.
The C# procedure runs sp_helpdb with the DataAdapter.Fill method to get the data into a DataSet. It then inserts the data in the second DataTable in the DataSet to #helpdb. This is done in a single INSERT statement by passing the DataTable directly to a table-valued parameter. The database name is passed in a separate parameter.
Note: If you still are on SQL 2005 which does not support table-valued parameters, you can use helpdb-2005.cs which inserts the rows one by one.
Undoubtedly, this solution requires a bit of work. You need to write more code than with most other methods, and you get an assembly that you must somehow deploy. If you already are using the CLR in your database, you probably already have routines for dealing with assemblies. But if you are not, that first assembly you add to the database is quite of a step to take. A further complication is that the CLR in SQL Server is disabled by default. To enable it, you (or the DBA) need to run:
EXEC sp_configure 'clr enabled', 1 RECONFIGURE
Another issue is that this solution goes against best practices for using the CLR in SQL Server. First of all, data access from the CLR should be avoided, simply because T‑SQL is better equipped for this. But here we are talking about situations where we need to circumvent limitations in T‑SQL. Another violation of best practice is the use of the DataAdapter, DataTable and DataSet classes. This is something to be avoided in CLR stored procedures, because it means that you have data in memory in SQL Server outside the buffer pool. Of course, a few megabytes is not an issue, but if you would read several gigabytes of data into a DataSet, this could have quite nasty effects for the stability of the entire SQL Server process.
The alternative is to use a plain ExecuteReader and insert the rows as they come, possibly buffering them in small sets of say 500 rows to improve performance. This is certainly a viable solution, but it makes deployment even more difficult. To wit, you cannot perform the INSERT statements on the context connection while the DataReader is running, so you would need to open a second connection and this requires that the assembly has the permission EXTERNAL_ACCESS. So for practical purposes, you would only go this road, if you are anxious that you will read too much data than what is defensible for a DataSet.
Just like INSERT-EXEC this is a method where you can use the called stored procedure as-is. The purpose of OPENQUERY and its cousin OPENROWSET is to permit you to run pass-through queries on linked servers. It can be very useful, not the least if you want to join multiple tables on the remote server and want to be sure that the join is evaluated remotely.
Instead of accessing a remote server, you can make a loopback connection to your own server, so you can say things like:
SELECT * FROM OPENQUERY(LOCALSERVER, 'EXEC sp_who') WHERE status = 'runnable'
If you want to create a table from the output of a stored procedure with SELECT INTO to save typing, this is the only method in the article that fits the bill.
So far, OPENQUERY looks very simple, but as this chapter moves on you will learn that OPENQUERY can be very difficult to use. Moreover, it is not aimed at improving performance. It may save you from rewriting your stored procedure, but most likely you will have to put in more work overall – and in the end you get a poorer solution. While I'm not enthusiastic over INSERT-EXEC, it is still a far better choice than OPENQUERY.
In the example, LOCALSERVER may look like a keyword, but it is only name. This is how you define it:
EXEC sp_addlinkedserver @server = 'LOCALSERVER', @srvproduct = '', @provider = 'SQLOLEDB', @datasrc = @@servername
To create a linked server, you must have the permission ALTER ANY SERVER, or be a member of any of the fixed server roles sysadmin or setupadmin. Instead of SQLOLEDB, you can specify SQLNCLI, SQLNCLI10 or SQLNCLI11 depending on your version of SQL Server. SQL Server seems to use the most recent version of the provider anyway.
It's important to understand that OPENQUERY opens a new connection to SQL Server. This has some implications:
The settings ANSI_NULLS and ANSI_WARNINGS must be ON for queries involving linked servers. Thankfully, these settings are also on by default in most contexts. However, if you have a database which started its life on SQL 2000 or earlier there can be problems with the ANSI_NULLS setting as it is saved with the procedure. (ANSI_WARNINGS is a pure run-time setting.) One of the tools that came with SQL 2000 defaulted to create procedures with ANSI_NULLS OFF. If you script this procedure in SSMS, SSMS with faithfully include SET ANSI_NULLS OFF in the script, retaining the setting.
You know that you are victim to this issue if you see this error message:
Msg 7405, Level 16, State 1, Line 17
Heterogeneous queries require the ANSI_NULLS and ANSI_WARNINGS options to be set for the connection. This ensures consistent query semantics. Enable these options and then reissue your query.
The remedy is to change the ANSI_NULLS setting for the procedure (by altering it when SET ANSI_NULLS ON is in effect). but you will need to check that the procedure does not include logic which depends on ANSI_NULLS being OFF. (When ANSI_NULLS is OFF a comparison with NULL will yield TRUE or FALSE, not UNKNOWN.)
The second parameter to OPENQUERY is the query to run on the remote server, and you may expect to be able to use a variable here, but you cannot. The query string must be a constant, since SQL Server needs to be able to determine the shape of the result set at compile time. This means that you as soon your query has a parameter value, you need to use dynamic SQL. Here is how to implement BigSalesByStore with OPENQUERY:
CREATE FUNCTION quotestring(@str nvarchar(MAX)) RETURNS nvarchar(MAX) AS BEGIN DECLARE @ret nvarchar(MAX), @sq char(1) SELECT @sq = '''' SELECT @ret = replace(@str, @sq, @sq + @sq) RETURN(@sq + @ret + @sq) END go CREATE PROCEDURE SalesByStore @storeid varchar(30) AS SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid go CREATE PROCEDURE BigSalesByStore @storeid varchar(30), @qty smallint, @debug bit = 0 AS DECLARE @remotesql nvarchar(MAX), @localsql nvarchar(MAX) SELECT @remotesql = 'EXEC ' + quotename(db_name()) + '.dbo.SalesByStore ' + dbo.quotestring(@storeid) SELECT @localsql = 'SELECT * FROM OPENQUERY(LOCALSERVER, ' + dbo.quotestring(@remotesql) + ') WHERE qty >= @qty' IF @debug = 1 PRINT @localsql EXEC sp_executesql @localsql, N'@qty smallint', @qty go EXEC SalesByStore '7131' EXEC BigSalesByStore '7131', 25, 1 go DROP PROCEDURE BigSalesByStore, SalesByStore DROP FUNCTION quotestring
What initially seemed simple to use, is no longer so simple. What I did not say above is there are two reasons why we need dynamic SQL here. Beside the parameter @storeid, there is also the database name. Since OPENQUERY opens a loopback connection, the EXEC statement must include the database name. Yes, you could hardcode the name, but sooner or later that will bite you, if nothing else the day you want to restore a copy of your database on the same server for test purposes. From this follows that in practice, there are not many situations in application code where you can use OPENQUERY without using dynamic SQL.
The code certainly requires some explanation. The function quotestring is a helper, taken from my article on dynamic SQL. It encloses a string in single quotes, and doubles any quotes within it to conform to the T‑SQL syntax. The problem with writing dynamic SQL which involves OPENQUERY is that you get at least three levels of nested strings, and if you try to do all at once, you will find yourself writing code which has up to eight consecutive single quotes that you or no one else cannot read. Therefore, it is essential to approach the problem in a structured way like I do above. I first form the query on the remote server, and I use quotestring to embed the store id. Then I form the SQL string to execute locally, and again I use quotestring to embed the remote query. I could also have embedded @qty in the string, but I prefer to adhere to best practices and pass it as a parameter to the dynamic SQL string. As always when I use dynamic SQL, I include a @debug parameter, so that I can inspect the statement I've generated.
To be able to compile the query, SQL Server needs to know the shape of the result set returned from the procedure. From SQL 2012 and on, SQL Server does this with the help of the procedure sp_describe_first_result_set which trawls code for the procedure as well all sub-procedures to figure out the shape of the first result set. Beware that this procedure will not always succeed, and the classic example is when temp tables are involved. (This issue does not apply to table variables, only temp tables.) Here is one example. This query:
SELECT * FROM OPENQUERY(LOCALSERVER, 'EXEC msdb..sp_helpindex sysjobs')
results in this error message:
Msg 11526, Level 16, State 1, Procedure sp_describe_first_result_set, Line 1 [Batch Start Line 17]
The metadata could not be determined because statement 'insert into #spindtab values (@indname, @indid, @ignore_dup_key, @is_unique, @is_hypothetical,
@' in procedure 'sp_helpindex' uses a temp table.
There is a remedy for this, though. Starting with SQL 2012, the EXEC statement provides the clause WITH RESULT SETS that permits you to specify the shape of the result set yourself. So this returns the data as desired:
SELECT * FROM OPENQUERY(LOCALSERVER, 'EXEC msdb..sp_helpindex sysjobs WITH RESULT SETS ((index_name sysname, index_description nvarchar(500), index_keys nvarchar(500)))')
Note that SQL Server will validate that the result set returned from the procedure actually aligns with what you say in WITH RESULT SETS and raise an error if not. Also observe the syntax – there are two pairs of parentheses.
Obviously, this requires that you know what the result set looks like. If you really wanted to do something like
SELECT a FROM (EXEC mysp) WHERE b = 123
And even if you don't care about the umpteen other columns returned by mysp, you still have to list them all. And if someone adds one more column to the result set, your query will break – exactly the same problem as we saw with INSERT-EXEC.
Another situation where sp_describe_first_result_set bites the dust is when there is dynamic SQL involved. Again, you can use WITH RESULT SETS to save the show. If you have the powers to change the procedure you are calling, it is better to attach the clause directly to where you execute the dynamic SQL:
EXEC sp_executesql @sql, @params, @param1, ... WITH RESULT SETS ((...))
Thanks to the extra clause, sp_describe_first_result_set can return data successfully.
Up to SQL 2008, SQL Server used a different mechanism to get the metadata. During compilation SQL Server would run the SQL text fed to OPENQUERY preceded by the command SET FMTONLY ON. When FMTONLY is ON, SQL Server does not execute any data-retrieving statements, but only sifts through the statements to return metadata about the result sets. This is not a very robust mechanism, and FMTONLY can be a source for confusion in more than one way, and you will see some hilarious code later on.
As with sp_describe_first_result_set, temp tables is likely to cause you grief. When FMTONLY is ON, the temp table is not created. If you run this on SQL 2008:
SELECT * FROM OPENQUERY(LOCALSERVER, 'EXEC msdb..sp_helpindex sysjobs')
this results in:
Msg 208, Level 16, State 1, Procedure sp_helpindex, Line 104
Invalid object name '#spindtab'.
Here you cannot add the WITH RESULT SETS clause, as this clause was introduced in SQL 2012. However, there is a trick you can do: you can turn off FMTONLY in your batch and lure SQL Server:
SELECT * FROM OPENQUERY(LOCALSERVER, 'SET FMTONLY OFF EXEC msdb..sp_helpindex sysjobs')
At first glance, this may seem like a nicer solution than having to specify the exact shape of the result set using WITH RESULT SETS, but that is absolutely not the case, for two reasons:
For these reasons, I strongly recommend you to stay away from this trick! The best solution if you want to use the data from your stored procedure is to use any other of the methods described in this article, include INSERT-EXEC. But if you absolutely want to run your stored procedure with a temp table through OPENQUERY on SQL 2008 or earlier, here is a trick that at least addresses the second point above. Beware, you are about to look at one of the most obscure pieces of T‑SQL code I've ever come up with:
CREATE PROCEDURE temp_temp_trick AS DECLARE @fmtonlyon int SELECT @fmtonlyon = 0 IF 1 = 0 SELECT @fmtonlyon = 1 SET FMTONLY OFF CREATE TABLE #temp(...) IF @fmtonlyon = 1 SET FMTONLY ON -- Rest of the code goes here.
What happens here is that when FMTONLY is ON, variable assignments are still carried out. When it comes to IF statements, the conditions are not evaluated, but both the IF and ELSE branches are "executed" (that is, sifted through). Thus, only when FMTONLY is ON, the flag @fmtonlyon will be set. We turn off FMTONLY before the creation of the temp table, to prevent compilation from failing, but then we restore the setting immediately after thanks to the @fmtonlyon flag.
Again, keep in mind that this is not future-proof. The day you upgrade to a newer version of SQL Server, your loopback query will fail, and you will need to change it to add WITH RESULT SETS. And, of course, since this trick only possible when you are in the position that you can change the procedure, you are better off anyway by sharing a temp table or use a process-keyed table.
If the stored procedure you call returns its result set through dynamic SQL, this may work if the SQL string is formed without reading data from any table, but if not, the query will fail with a message like this one:
Msg 7357, Level 16, State 2, Line 21
Cannot process the object "EXEC sp_who2". The OLE DB provider "SQLNCLI10" for linked server "LOCALSERVER" indicates that either the object has no columns or the current user does not have permissions on that object.
Again, the workaround with SET FMTONLY OFF, but it is still not future-proof.
Finally, an amusement with FMTONLY you may run into is that since variable assignments are carried but conditions for IF are ignored, you may get seemingly inexplicable errors. For instance, if you have a procedure that can call itself, the attempt to call it through OPENQUERY is likely to end with exceeding the maximum nest level of 32 even if the procedure has perfectly correct recursion handling.
Yet a problem with OPENQUERY is demonstrated by this script:
USE tempdb go CREATE TABLE nisse (a int NOT NULL) go CREATE PROCEDURE silly_sp @x int AS --SET NOCOUNT ON INSERT nisse VALUES (@x) SELECT @x * @@trancount SELECT @x * 3 go SELECT * FROM OPENQUERY(LOCALSERVER, 'EXEC tempdb.dbo.silly_sp 7') go SELECT * FROM nisse
When you run this, you get this error message:
Msg 7357, Level 16, State 1, Line 11
Cannot process the object "EXEC tempdb.dbo.silly_sp 7". The OLE DB provider "SQLNCLI11" for linked server "LOCALSERVER" indicates that either the object has no columns or the current user does not have permissions on that object.
The reason for this message is that the first "result set" is the rows affected message generated by the INSERT statement, and this message lures OPENQUERY to think that there were no columns in the result set. Adding SET NOCOUNT ON to the procedure resolves this issue. You could also add SET NOCOUNT ON the command string you pass to OPENQUERY. (And in difference to tricking with SET FMTONLY ON, this is a perfectly valid thing to do with no menacing side effects.)
When SQL Server executes the query for real, the OLE DB provider wraps the submitted batch in a transaction which it rolls back at the end, so if you try to sneak in an update in your query you are in for a surprise. For instance, take the script above. Once SET NOCOUNT ON is in force, this is the output:
(1 row(s) affected)
(0 row(s) affected)
We get back 7 from the call to silly_sp, which indicates that @@trancount is 1, and there is thus an open transaction, despite there is no BEGIN TRANSACTION in the procedure. (We don't get the '21' that we get when we execute silly_sp directly, because with OPENQUERY, we only get one result set.) You also see that when we SELECT directly from nisse after the call to OPENQUERY, that the table is empty, because that transaction started under the covers was rolled back. If you are dead set on it, you can prepend your query batch with COMMIT TRANSACTION to have your update to really be persisted.
As you have seen, at first OPENQUERY seemed very simple to use, but the stakes rise steeply. If you are still considering to use OPENQUERY after having read this section, I can only wish you good luck and I hope that you really understand what you are doing. OPENQUERY was not intended for accessing the local server, and you should think twice before you use it that way.
XML is a solution that aims at the same spot as sharing a temp table and process-keyed tables. That is, the realm of general solutions without restrictions, to the price of a little more work. While SQL 2000 has support for XML, if you want to use XML to pass data between stored procedures, you need to have at least SQL 2005..
We will look at a version of SalesByStore and BigSalesByStore which uses XML, but since this is a little too much to digest in one go, we first only look at SalesByStore_core to see how we construct the XML:
CREATE PROCEDURE SalesByStore_core @storeid varchar(30), @xmldata xml OUTPUT AS SET @xmldata = ( SELECT t.title, s.qty FROM sales s JOIN titles t ON t.title_id = s.title_id WHERE s.stor_id = @storeid FOR XML RAW('SalesByStore'), TYPE) go
In the previous version of SalesByStore_core, we stored the data from the result in a temp table. Here we use FOR XML RAW to generate an XML document that we save to the output parameter @xmldata.
This is how the resulting XML document may look like (with one title abbreviated for space):
<SalesByStore title="Is Anger the Enemy?" qty="20" /> <SalesByStore title="The Gourmet Microwave" qty="25" /> <SalesByStore title="Computer Phobic AND Non-Phobic Individuals" qty="20" /> <SalesByStore title="Life Without Fear" qty="25" /> <SalesByStore title="Prolonged Data Deprivation: Four Case Studies" qty="15" /> <SalesByStore title="Emotional Security: A New Algorithm" qty="25" />
FOR XML has three more options beside RAW: AUTO, ELEMENTS and PATH, but for our purposes here, RAW is the simplest to use. You don't have to specify a name for the elements; the default in this case will be row, but I would suggest that using a name is good for clarity.
The keyword TYPE ensures that the return type of the SELECT query is the xml data type; without TYPE the type would be nvarchar(MAX). TYPE is not needed here, since there will be an implicit conversion to xml anyway, but it can be considered good practice to include it. However, I need to add a caveat. There is a bug which may make XML documents created with TYPE to take longer time to read and shread than documents created as a string and then converted to xml. I submitted this bug long ago, but when I test my repro scripts on SQL 2017, the issue still persists.
Since SalesByStore should work like it did originally, it has to convert the data back to tabular format, a process known as shredding. Here is how the XML version looks like:
CREATE PROCEDURE SalesByStore @storeid varchar(30) AS DECLARE @xmldata xml EXEC SalesByStore_core @storeid, @xmldata OUTPUT SELECT T.c.value('@title', 'varchar(80)') AS title, T.c.value('@qty', 'smallint') AS qty FROM @xmldata.nodes('SalesByStore') AS T(c) go
To shred the document, we use two of the xml type methods. The first is nodes which shreds the documents into fragments of a single element. That is, this part:
FROM @xmldata.nodes('SalesByStore') AS T(c)
The part T(c) defines an alias for the one-column table as well as an alias for the column. To get the values out of the fragments, we use another xml type method, value. The value method takes two arguments whereof the first addresses the value we want to extract, and the second specifies the data type. The first parameter is a fairly complex story, but as long as you follow the example above, you don't really need to know any more. Just keep in mind that you must put an @ before the attribute names, else you would be addressing an element. In the XML section of my article Arrays and Lists in SQL Server (The Long Version), I have some more information about nodes and value.
To make the example complete, here is the XML version of BigSalesByStore. To avoid having to repeat the call to value in the WHERE clause, I use a CTE (Common Table Expression).
CREATE PROCEDURE BigSalesByStore @storeid varchar(30), @qty smallint AS DECLARE @xmldata xml EXEC SalesByStore_core @storeid, @xmldata OUTPUT ; WITH SalesByStore AS ( SELECT T.c.value('@title', 'varchar(80)') AS title, T.c.value('@qty', 'smallint') AS qty FROM @xmldata.nodes('SalesByStore') AS T(c) ) SELECT title, qty FROM SalesByStore WHERE qty >= @qty go
In this example the XML document is output-only, but it's easy to see that the same method can be used for input-only scenarios. The caller builds the XML document and the callee shreds it back to a table.
What about input-output scenarios like the procedure ComputeTotalStoreQty? One possibility is that the callee shreds the data into a temp table, performs its operation, and then converts the data back to XML. A second alternative is that the callee modifies the XML directly using the xml type method modify. I will spare you from an example of this, however, as it unlikely that you would try it, unless you already are proficient in XQuery. A better alternative may be to mix methods: use a table-valued parameter for input and only use XML for output.
The result set in the example is from a single table, but what if we have some form of parent/child-relationship? Say that we want to return the name of all authors, as well as all the titles they have written. With temp tables or process-keyed tables, the natural solution would be to use two tables (or actually three, since in pubs there is a many-to-many relationship between titles and authors, which I overlook here.) But since XML is hierarchical, it would be more natural to put everything in a single XML document, and here is a query to do this:
SELECT a.au_id , a.au_lname, a.au_fname , (SELECT t.title FROM pubs..titleauthor ta JOIN pubs..titles t ON t.title_id = ta.title_id WHERE a.au_id = ta.au_id FOR XML RAW('titles'), TYPE) FROM pubs..authors a FOR XML RAW('authors'), TYPE
Rather than a regular join query, I use a subquery for the titles, because I only want one node per author with all titles. With a join, I get one author node for each title, so that authors with many books appear in multiple nodes. The subquery uses FOR XML to create a nested XML document, and this time the TYPE option is mandatory, since without it the nested XML data would be included as a plain string.
To retrieve the titles from the XML document, you could use this query:
SELECT au_id = A.item.value('@au_id', 'varchar(11)'), title = T.item.value('@title', 'varchar(80)') FROM @x.nodes('/authors') AS A(item) CROSS APPLY A.item.nodes('titles') AS T(item)
The first call to nodes gives you a fragment per authors node, and then you use CROSS APPLY to dig down to the titles node. For a little longer discussion on this way of shredding a hierarchical XML document, see the XML section of my article Arrays and Lists in SQL Server (The Long Version).
So far the technique to use this method. Let's now assess it. If you have never worked with XML in SQL Server, you are probably saying to yourself I will never use that!. And one can hardly blame you. This method is like pushing the table camel through the needles eye of the parameter list of a stored procedure. Personally, I think the method spells k-l-u-d-g-e. But it's certainly a matter of opinion. I got a mail from David Walker, and he went as far as saying this is the only method that really works.
And, that cannot be denied, there are certainly advantages with XML over about all the other methods I have presented here. It is less contrived than using the CLR, and it is definitely a better option than OPENQUERY. You are not caught up with the limitations of table-valued functions. Nor do you have any of the issues with INSERT-EXEC. Compared to temp tables and process-keyed tables, you don't have to be worried about recompilation or that programmers fail to clean up a process-keyed table after use.
When it comes to performance, you get some cost for building the XML document and shredding it shortly thereafter. Then again, as long as the amount of data is small, say less than 200 KB, the data will stay in memory and there is no logging involved like when you use a table of any sort. Larger XML documents will spill to disk, though. A general caveat is that inappropriate addressing in a large XML document can be a real performance killer, so if you expect large amounts of data, you have to be careful. (And these issues can appear with sizes below 200 KB.)
Besides the daunting complexity, there are downsides with XML from a robustness perspective. XML is more sensitive to errors. If you make a spelling mistake in the first argument to value, you will silently get NULL back, and no error message. Likewise, if you get the argument to nodes wrong, you will simply get no rows back. The same problem arises if you change a column alias or a node name in the FOR XML query, and forget to update a caller. When you use a process-keyed table or a temp table you will get an error message at some point, either at compile-time or at run-time.
Another weak point is that you have to specify the data type for each column in the call to value, inviting you to make the mistake to use different data types for the same value in different procedures. This mistake is certainly possible when use temp tables as well, although copy-and-paste are easier to apply on the latter. With a process-keyed table it cannot happen at all.
One thing I like with tables is that they give you a description of the data you are passing around; this is not the least important when many procedures are using the same process-keyed table. This is more difficult to achieve with XML. You could use schema collections for the task, but you will not find very many SQL Server DBAs who speak XSD fluently. Also, schema-bound XML tends to incur a performance penalty in SQL Server.
For these reasons, I feel that using a temp table or a process-keyed table are better choices than XML. And while I find XML an overall better method than INSERT-EXEC or OPENQUERY, these methods have the advantage that you don't have to change the callee. So that kind of leaves XML in nowhere land. But as they say, your mileage may vary. If you feel that XML is your thing, go for it!
If you are on SQL 2016 or later, you can use JSON if you prefer JSON over XML. I am not taking up space with examples, but I leave it to the reader to explore this option of your own. The assessment is the very much the same as for XML. Personally, I think it is a kludge, but you may love it.
This method was suggested to me by Peter Radocchia. Cursor variables were introduced in SQL 7, but I suspect that many SQL developers are at most only dimly aware of their existence. I never use them myself. Here is an example of how you use them to bring the result set from one procedure to another:
CREATE PROCEDURE get_cursor @cursor CURSOR VARYING OUTPUT AS SET @cursor = CURSOR STATIC FOR SELECT au_id, au_lname, au_fname FROM pubs..authors OPEN @cursor go CREATE PROCEDURE caller AS DECLARE @cursor CURSOR DECLARE @au_id char(11), @au_fname varchar(40), @au_lname varchar(40) SET NOCOUNT ON EXEC get_cursor @cursor OUTPUT WHILE 1 = 1 BEGIN FETCH NEXT FROM @cursor into @au_id, @au_lname, @au_fname IF @@fetch_status <> 0 BREAK PRINT 'Au_id: ' + @au_id + ', name: ' + @au_fname + ' ' + @au_lname END DEALLOCATE @cursor go EXEC caller go DROP PROCEDURE caller, get_cursor
Note that the cursor is STATIC. Static cursors are much preferable over dynamic cursors, the default cursor type, since the latter essentially evaluates the query for every FETCH. When you use a static cursor, the result set of the SELECT statement is saved into a temp table, from where FETCH retrieves the data.
I will have to admit that I see little reason to use this method. Just like INSERT-EXEC, this method requires an exact match between the caller and the callee for the column list. And since data is processed row by row, performance is likely to take a serious toll if there are any volumes.
Starting with SQL 2016 you can say things like:
EXEC sp_set_session_context N'SomeKey', 'Somevalue'
You can then retrieve the value of SomeKey this way:
Instead of literals, you can use variables in all places. The data type for the key is nvarchar, while the data type for the value is sql_variant. That is, sp_set_session_context permits you to store key-value pairs, where they value is of any type in SQL Server that fits into 8000 bytes (CLR types excluded). In total, you can store up to 256 KB of data (including keys) this way.
This can be used to pass data between stored procedures, but you may ask what the point would be with outer_sp setting a value with sp_set_session_context and inner_sp retrieving the value with session_context or vice versa. And, correct, if outer_sp calls inner_sp directly, it is a lot more straightforward to use parameters. It's a different matter if there are one or more procedure in between. In that case a top SP can set a value which an SP three levels down the stack can retrieve without the value being passed as a parameter all the way down.
Session context is intended to be used for things that are common to the session. For instance, an application that logs in with an application login can use sp_set_session_context to set the name of the actual user so that it can be written to auditing column which would be set up with a DEFAULT constraint using session_context. (In this case the third parameter sp_set_session_context, @read_only, should be 1, so that the value cannot be changed once set.)
There is one situation where session_context does make sense in the communication between a direct pair of caller and callee, to wit triggers. You may have checks in a trigger to prevent violation of business rules. However, you may have a stored procedure that needs to make a temporary breach of a certain business rule within a transaction. The procedure can use sp_set_session_context to set a flag and the trigger can look at the flag and don't perform the check of this rule when the flag is set. (This is possible to achieve in SQL 2014 earlier as well, by checking in the trigger for the existence of a temp table with a specific name, but session context is a lot cleaner.) If you are so inclined, you can also use session context to have output parameters from your trigger.
Note: the concept of session context as such is not new to SQL 2016, but has existed since SQL 2000. However, before SQL 2016, the session context was limited to 128 bytes which you set with the command SET CONTEXT_INFO and you retrieved the value with the function context_info. While it could be used for "anything", I would recommend to not use it for anything else than the name of the actual user when an application logs in on behalf of the user, as this is a very common scenario. For an example of using SET CONTEXT_INFO for this purpose, see here.
If your procedures are on different servers, the level of difficulty rises steeply. There are many restrictions with linked servers, and several of the methods I have presented cannot be used at all. Ironically, some of the methods that I have discouraged you from suddenly step up as the better alternatives. One reason for this is that with linked servers, things are difficult anyway.
It is somewhat easier to retrieve data from a procedure on a linked server than passing data to it, so let's look at output first. If you have an input-output scenario, you should probably look into mixing methods.
If all you want to do is to get data back, these methods works:
OUTPUT parameters – but only for data types that are 8000 bytes or less. That is, you cannot retrieve the value of output parameters that are varchar(MAX) etc.
INSERT-EXEC – INSERT-EXEC works fine with linked servers. Actually, even better than with local procedures, since if the procedure you call uses INSERT-EXEC, this will not matter. The only restriction is that the result set must not include types that are not supported in distributed queries, for instance xml. ((n)varchar(MAX) is OK.) The fact that INSERT-EXEC starts a transaction can cause a real nightmare, since the transaction now will be a distributed transaction and this requires that you configure MSDTC (Microsoft Distributed Transaction Coordinator) correctly. If both servers are in the same domain, it often works out of the box. If they are not, for instance because you only have a workgroup, it may be impossible (at least I have not been able to). On SQL 2008 and later, you may be able to escape the problem by setting the option remote proc transaction promotion for the linked server to false. (Note that this affects all use of the linked server, and there may be situations where a distributed transaction is desirable.)
OPENQUERY – since OPENQUERY is a feature for linked servers in the first place, there is no difference to what I discussed above. It is still difficult with lots of pitfalls, but the land of linked servers is overall difficult. Nevertheless, INSERT-EXEC will in many cases be simpler to use. But with OPENQUERY you don't have to bounce the remote data over a table, and if result set of the remote procedure is extended with more columns, your code will not break.
Using the CLR – Using the CLR for linked servers is interesting, because the normal step would be to connect to the remote server directly, and bypass the local definition of linked servers – and thereby bypass all restrictions with regards to data types. When you make a connection to a remote server through the CLR, the default is to enlist into the current transaction, which means that you have to battle MSDTC. However, you can easily escape this battle by adding
enlist=false in the connection string to the remote server. This works on all versions of SQL Server from SQL 2005 and on. When using the CLR to access a remote server, there are no obstacles with using ExecuteReader and store the data into a local table as they come, since you are using two different connections. For a CLR procedure to be able to access a remote server, the assembly must be installed with the permission EXTERNAL_ACCESS.
XML – You cannot use the xml data type in a call to a remote stored procedure. However, you can make the OUTPUT parameter to be varchar(8000) and return the XML document that way – if it fits.
The other methods do not work at all, and that includes user-defined functions. You cannot call a user-defined function on a linked server.
If you want to pass a large amount of data for input over a linked server, there are three possibilities. Or three kludges if you like.
XML might be the easiest. The xml data type is not supported in calls to remote procedures, so you need to convert the XML document to nvarchar(MAX) or varbinary(MAX). The parameter on the other side can still be xml. (Note that the restriction mentioned in the previous section only applies to OUTPUT parameters. For input, there is no restriction.)
You cannot pass a table-valued parameter to a remote stored procedure. But you could have a CLR stored procedure that connects to the remote server and passes the TVP directly; as noted above, the assembly needs to have the permission EXTERNAL_ACCESS. You cannot pass a TVP to a CLR stored procedure from T‑SQL, so you would either have to pass the data as XML to the CLR procedure, or the CLR procedure would have to read the data from a (temp) table.
The last alternative is really messy. The caller stores the data in a process-keyed table locally and then calls the remote procedure, passing the process-key. The remote procedure then calls back to the first server and either selects directly from the process-keyed table, or calls a procedure on the source server with INSERT-EXEC. For an input-output scenario, the callee could write data back directly to the process-keyed table.
The issue about using SET FMTONLY ON is something that I learnt from Umachandar Jayachandran at Microsoft. SQL Server MVP Tony Rogerson pointed out that a process-keyed table should have a clustered index on the process key. Simon Hayes suggested some clarifications. Peter Radocchia suggested the cursor method. Richard St-Aubin and Wayne Bloss both suggested interesting approaches when sharing temp tables. Thanks to SQL Server MVP Iztik Ben-Gan for making me aware of global temp tables and start-up procedures. Sankar Reddy pointed out to me that my original suggestion for XML as a solution for linked servers was flawed. Greg Borota pointed out that an old leftover from SQL 2000 still was in the text. SQL Server MVP Adam Machanic made some interesting revelations about INSERT-EXEC with dynamic SQL. David Walker encouraged me to write more in depth on XML, and SQL Server MVP Denis Gobo gave me a tip on that part. Jay Michael and Tom Stagg both pointed out errors in the section on table-valued parameters. Alex Friedman made aware of the cache-littering problem when sharing temp tables and encouraged me to write about session context. Graham Lower reminded me that I ought to mention JSON.
If you have suggestions for improvements, corrections on topic, language or formatting, please mail me at firstname.lastname@example.org. If you have technical questions that any knowledgeable person could answer, I encourage you to post to the Transact-SQL forum on MSDN/Technet or any other SQL forum you frequent.
enlist=trueto the connection string.
Back to my home page.