An SQL text by Erland
Sommarskog, SQL Server MVP. Most recently updated 2018-06-15.
Copyright applies to this text. See here for font conventions used in this article.
This is a text that I originally wrote when SQL 2008 was still in beta. My plea was not honoured in that time frame – and still have not been. So the article is still here, only slightly brushed up.
In this article I discuss table-valued parameters and particularly the restriction that they have to be read-only. I argue that this makes this feature considerably less useful than it could be, and that in order to build scalable applications, it's essential to be able to pass read-write table parameters between stored procedures. Assuming that the read-only restriction is related to the communication with client API, I suggest that it would be acceptable that stored procedures with table parameters that are read-write cannot be called from client APIs.
The ultimate purpose for this article is to whip up support for an item that I originally filed on Microsoft Connect, these days found on the Uservoice site for SQL Server. I urge you to go to https://feedback.azure.com/forums/908035-sql-server/suggestions/32894995-relax-restriction-that-table-parameters-must-be-re and cast your vote.
When I first heard that SQL 2008 would have table parameters, I was very excited, because in the system I work with there are several places where we pass data between stored procedures, using some of the techniques discussed in my article How to Share Data Between Stored Procedures. While those methods work, they are not optimal, and all through the years I have felt that table-valued parameters is what we really need.
But when I learnt that the table parameters in SQL 2008 are read-only, I was equally disappointed. In almost all the cases that I share a temp table or use a process-keyed table (techniques discussed in the aforementioned article), I want to pass data back, or at least use the input table as a working area. Essentially, SQL 2008 is not bringing me any closer to a real solution.
Don't get me wrong, table-valued parameters is still a valuable addition to SQL Server even when read-only when you want to pass structured data from the client. In my article Arrays and Lists in SQL Server, I discuss a whole bunch of methods to do this in SQL 2005 and earlier. Table parameters could make history of all of them. But this article is about passing data between stored procedures.
The week I originally wrote this article I had been working with designing a set of interfaces in our system where read-write table-valued parameters would be very very useful to make this is a truly scalable solution. (As I had to support SQL 2005, I had to use process-keyed tables no matter what.)
First a short background of the business problem: clients of a brokerage firm may place orders to buy fund units. Commonly when you buy funds, you buy for a given amount, for instance €1000. At some point during the day, the brokerage firm makes a cut-off and aggregates all orders for one or more funds, and sends a total order to the fund company. At the same time the order amount is withdrawn from the client accounts and paid to the fund company. A few days later, the fund company confirms the transaction and informs the broker how many units he received and at which price. These units are then deposited into the client accounts.
My system keeps track of the cash holdings and the positions of instruments (funds, stocks, bonds, etc.) The purpose of the interface I have been designing is to permit an external system to handle the logic to aggregate the orders, communicate with the fund company etc. Currently, I know of two such external systems, and there could be more in the future. My interface is intended to be general, and I expect that for about every external system that will communicate with us, we will need some glue code specific to that system. That code would also reside in the database. Thus, the interface is a set of stored procedures that receives and returns data.
For this article, I will focus on the interface get_fund_orders. The external fund system passes me one or more funds to retrieve orders for, and in return it should get:
At the same time, in our system I should perform some updates:
A typical number of funds in a single call would be 100-200. The number of client orders can easily be 100 000 – 200 000 at the end of the month.
In the nineties, I would maybe have defined this procedure as simply as this.
CREATE PROCEDURE get_fund_orders @fundid int, @batchreference int OUTPUT AS ...
The orders would have been returned as a result set, and any errors would have been communicated by RAISERROR and a RETURN with a non-zero value.
But I have learnt since long, that this does not scale. Retrieving the orders and performing the updates per fund, is likely to take about 100 times longer for 100 funds, than retrieving all orders and performing all updates at once. OK, maybe not 100 times, but easily 80 times longer.
And while returning the orders as a result set would be perfectly OK, if the fund system really is external and runs in a different machine, it would exclude the possibility that the fund system would be a component of our own system or at least running in a database on the same server. Well, not really since INSERT-EXEC could come to the rescue, but as I argue in my article How to Share Data..., INSERT-EXEC is not very robust.
Had I had access to read-write table parameters, I would define the interface as:
CREATE TYPE fundlist_type AS TABLE (fundid int NOT NULL PRIMARY KEY, batchref int NULL, errmsg nvarchar(255) NULL) CREATE TYPE orderdata_type AS TABLE (batchref int NOT NULL, orderid int NOT NULL, orderamt decimal(10,2) NOT NULL, currency char(3) NOT NULL, PRIMARY KEY(batchref, orderid))
CREATE PROCEDURE get_fund_orders @funds fundlist_type OUTPUT, @orders orderdata_type OUTPUT AS
Clean and simple. And scalable.
I like to stress that it is essential that errmsg is added as a column in fundlist_type. If one fundid is in error, for instance there is no such fund, it's completely unacceptable that the procedure fails for all funds. get_fund_orders must return data for all funds that are in order. Of course, the procedure could still fail for all funds if there is an unanticipated error like a foreign-key violation or a deadlock, but the key is that since this is communication with an external system, some errors are anticipated, and the effects of such errors should be limited. After all, that was how the interface of nineties ago behaved, so why not now?
As I've mentioned, in my article How to Share Data Between Stored Procedures, I look at several methods to pass data between stored procedures, and in this section I will take a look at some of them, and see how they could be used in my business case, and what disadvantages they have. Here I do not describe the methods, so if you are unfamiliar with them, please refer to the article for reference.
Table-valued functions cannot be used at all in this scenario, because I need to perform updates.
There are two major problems with this method here:
The major gain here is that the table definition is one place. But there are other drawbacks:
INSERT-EXEC permits for a workaround for read-only table parameters, as illustrated in this example:
CREATE TYPE my_table_type AS TABLE(a int NOT NULL PRIMARY KEY, b int NOT NULL, c int NULL) go CREATE PROCEDURE the_callee @indata my_table_type READONLY AS DECLARE @copy my_table_type INSERT @copy SELECT * FROM @indata UPDATE @copy SET c = a + b FROM @copy c SELECT * FROM @copy go go CREATE PROCEDURE the_caller AS DECLARE @indata my_table_type, @outdata my_table_type INSERT @indata (a, b) VALUES (5, 7), (9, 2), (12, 23) INSERT @outdata EXEC the_callee @indata
Normally, it's bad practice to use SELECT * and INSERT without explicit column lists, but since we are using a table type we have full control over that we copy all columns.
While this workaround can make table-valued parameters a serious option for passing data between stored procedures, there are still drawbacks:
The latter point is actually more important than it may look at first glance. A couple of years ago, a colleague of mine got the task of rewriting a stored procedure from single-row operation to be multi-row capable. This procedure was some 700 lines long and called several other procedures, of which the most also worked with scalar parameters. When he was done, he had one big monolith of 3000 lines of code. He had to incorporate the sub-procedures into the main procedure, as it was not realistic to define process-keyed tables for all of them. Good coding practice of regular programming languages would of course call for the modularisation we had with the scalar procedure. But you need read-write table-valued parameters for this.
Yes, I could use parameters of the xml data type to pass data in and out of get_fund_orders, but seriously!. If creating and shredding XML to pass data between T‑SQL stored procedures does not spell k-l-u-d-g-e, nothing does. In a relational database we work with tables.
If you find that argument too much on the emotional side, consider the overhead for copying to XML and back to a table.
Also, keep in mind that the maximum size for a value of the xml data type is 2 GB, a limit that is not too unrealistic to hit in this context.
I think you already have a clear picture of how I would like to work with table parameters, but I still like to summarise the different cases:
Actually, while table-valued parameters may look new and exotic to some, there are probably a lot of stored procedures in this world that takes scalar input that really should take table input. There is a lot of code out there that loops over a table to call a stored procedure for one row at a time – when that procedure should accept a set of data for the application to be scalable.
Of course, Microsoft did not introduce the restriction that table parameters have to be read-only on a whim, but obviously they ran into some technical problem which time did not permit them to overcome.
What those problems were, I don't know, but I do have a theory: the difficulties lie in passing the data of read-write table parameters back to client APIs. And also defining methods in the client API to retrieve the data.
Working from that theory, my suggestion to make it a viable option to add read-write table parameters in SQL Server is to replace the current restriction with a less severe restriction: it is only permitted to call a stored procedure with a read-write table parameter from T‑SQL, not directly from a client API.
I believe this restriction is far more acceptable than the current one. If you want to pass a table from a client application to a stored procedure to have some rows updated in it, you can always return the data as regular result set. In difference to communication between stored procedures, this does not incur any overhead, since the data will have to cross the wire one way or another, so whether it is a result set, or a table parameter that is updated should not matter. Note also, that in an example such as my get_fund_orders, handling multiple result sets is trivial in the communication between client and server.
Strictly speaking, a CLR module in SQL Server is not a client, but since CLR modules uses the same API, SqlClient, that real client programs use, it is obvious that my restriction would apply to CLR modules as well. An author of a CLR module that would like to work with read-write table parameters would have to return the data as result sets from the T‑SQL stored procedure. This would admittedly mean one round of copying that would not be needed with read-write table parameters. But I think passing data between T‑SQL procedures is far more common. Note that SQL Server does not support passing table parameters to CLR modules.
It could be questioned whether it is acceptable to have stored procedures that only can be called from T‑SQL code, but not from a client API. However, there is a precedent: the cursor data type. Stored procedures with cursor parameters cannot be called from a client API.
Finally a quick note about linked servers. When you access a linked server, your local server is after all a client to the remote server. It would certainly be nice to be able pass read-write parameters when calling a remote procedure. However, currently SQL Server does not permit you to pass table parameters to remote procedures at all.
By default, parameters of scalar types are input-only, but not read-only; the called procedure may modify the parameter as it sees fit. A parameter can be made an input/output parameter by specifying the OUTPUT keyword both in the procedure definition and in the call to the procedure. That is, even if a parameter is declared as OUTPUT, a caller can still opt to pass the parameter as input-only.
This pattern is not really suitable for table parameters, as it would more or less require SQL Server to pass table variables by value, that is copy on input and copy back on output.
Therefore, I propose that a table-valued parameter that is not defined as READONLY, is always an input-output parameter. This could be amplified by requiring that the OUTPUT keyword is specified for table parameters when READONLY is not present. Likewise, it could be a requirement that OUTPUT is specified for actual parameters if the formal parameter is declared as OUTPUT. This would thwart the risk that a programmer thinks that his table variable will not be affected when he passes it to a stored procedure that takes a non-read-only table-valued parameter.
Obviously, if a table parameter is read-only, you cannot pass it to an inner procedure as an output parameter.
A special note about functions: I don't think the restriction that table-valued parameters must be read-only should be lifted for functions. It's a basic rule of user-defined functions in SQL Server that they should not have side effects.
I hope this text has convinced the reader that in order to build really scalable applications with SQL Server requires passing data between stored procedures through tables, both for input and output. Therefore the restriction that table-valued parameters must be read-only should be lifted in the next version of SQL Server. And if you agree with this, go to https://feedback.azure.com/forums/908035-sql-server/suggestions/32894995-relax-restriction-that-table-parameters-must-be-re and vote!