Arrays and Lists in SQL Server 2008
Using Table-Valued Parameters

An SQL text by Erland Sommarskog, SQL Server MVP. Latest revision: 2010-01-06.

Introduction

In the public forums for SQL Server, you often see people asking How do I use arrays in SQL Server? Or Why does SELECT * FROM tbl WHERE col IN (@list) not work? The short answer to the first question is that SQL Server does not have arrays – SQL Server has tables. Upto SQL Server 2005, there was no way to pass a table from a client, but you had to pass a comma-separated string or similar. This changed with SQL 2008 and the advent of table-valued parameters.

In this article I will show examples of how to pass a table-value parameter (TVP) from ADO .NET. I will also briefly cover other client APIs and whether you can use TVPs with them. In the last section, I discuss the performance of TVPs and compare them to the other methods of passing lists we had to use in SQL 2005 and earlier.

This article has an accompanying article: Arrays and Lists in SQL Server 2005 and Beyond where I in detail describe various methods to pass a list of values in a string and unpack them into a table. There are also two performance appendixes where I relate data from performance tests of the methods described in the articles, including table-valued parameters. There is one newer appendix labelled SQL 2008 (this is the one that cover TVPs) and one older labelled SQL 2005. I should hasten to add that these appendixes are more for the hard-core geeks. For SQL 2000 and earlier versions there is a separate article: Arrays and Lists in SQL Server 2000 and Earlier.

Note: the sample code in this article refers to the Northwind database. This database does not ship with SQL 2008, but you can download the script to install it from Microsoft's web site.

Contents:

   Introduction
   Background
   Table-Valued Parameters in T-SQL
   Passing Table-Valued Parameters from ADO .NET
      Using a List<SqlDataRecord>
      Using a DataTable
      Using a DataReader
      Final Remarks
   Using Table-Valued Parameters from Other APIs
      ODBC
      OLE DB
      ADO
      LINQ and Entity Framework
      JDBC
      PHP
      Perl
      What If Your API Does Not Support TVPs
   Performance Considerations
      Server-side
      Client-side
      Primary Key or Not?
   Acknowledgements and Feedback
   Revision History

Background

You have a number of key values, identifying a couple of rows in a table, and you want to retrieve these rows. If you are the sort of person who composes your SQL statements in client code, you might have something that looks like this:

cmd.CommandText = "SELECT ProductID, ProductName FROM Northwind..Products " & _
                  "WHERE ProductID IN (" & List & ")"
reader = cmd.ExecuteReader()

List is here a string variable that you somewhere have assigned a comma-separated list, for instance "9, 12, 27, 39".

This sort of code is bad practice, because you should never interpolate parameter values into your query string. (Why, is beyond the scope of this article, but I discuss this in detail in my article The Curse and Blessings of Dynamic SQL, particularly in the sections on SQL Injection and Caching Query Plans.)

Since this is bad practice, you want to use stored procedures. However, at first glance you don't seem to find that any apparent way to do this. Many have tried with:

CREATE PROCEDURE get_product_names @ids varchar(50) AS
SELECT ProductID, ProductName
FROM   Northwind..Products
WHERE  ProductID IN (@ids)

But when they test:

EXEC get_product_names '9, 12, 27, 37'

The reward is this error message:

Server: Msg 245, Level 16, State 1, Procedure get_product_names, Line 2
Syntax error converting the varchar value '9, 12, 27, 37' to a column
of data type int.

This fails, because we are no longer composing an SQL statement dynamically, and @ids is just one value in the IN clause. An IN clause could also read:

... WHERE col IN (@a, @b, @c)

Or more to the point, consider this little script:

CREATE TABLE #csv (a varchar(20) NOT NULL)
go
INSERT #csv (a) VALUES ('9, 12, 27, 37')
INSERT #csv (a) VALUES ('something else')
SELECT a FROM #csv WHERE a IN ('9, 12, 27, 37')

So now you know why col IN (@list) does not work. Or rather: you know now that it works differently from what you might have thought it did. In the following we will look into how to solve this kind of problem with table-valued parameters.

Before I go on, I should add that sometimes you may be in the (very unfortunate) situation when you have a delimited list in a table column in your database. To unpack such a list, you would need any of the methods that I discuss in Arrays and Lists for SQL 2005 and Beyond; TVPs cannot help you here.

Table-Valued Parameters in T-SQL

Let's first look at how to use TVPs in T-SQL without involving a client. To be able to declare a TVP, you first need to create a table type like this:

CREATE TYPE integer_list_tbltype AS TABLE (n int NOT NULL PRIMARY KEY)

That is, after CREATE TYPE you specify the type name followed by AS TABLE and then comes the table definition, using the same syntax as CREATE TABLE. You cannot use exactly everything you can use with CREATE TABLE, but you can define PRIMARY KEY, UNIQUE and CHECK constraints, you can use IDENTITY and DEFAULT definitions, and you can define computed columns. Once you have this table type, you can use it to declare table variables:

DECLARE @mylist integer_list_tbltype

However, you cannot use the type with CREATE TABLE (it could have been nutty with temp tables!), and nor can you use it for the declaration of the return table in multi-step table functions. The raison d'être for table types is to use them when you declare table-valued parameters for stored procedures or user-defined functions. Here is one example:

CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
   SELECT p.ProductID, p.ProductName
   FROM   Northwind..Products p
   WHERE  p.ProductID IN (SELECT n FROM @prodids)

The body of the procedure brings no surprises. The code looks just as it would have if @prodids had been a local table variable. The parameter declaration on the other hand includes a keyword hitherto not seen in this context: READONLY. This keyword means what it says: you cannot modify the contents of the table parameter in any way in the procedure. As an aside, this restriction makes TVPs far less useful than they could have been; it's common to want to pass data between data between stored procedure as I discuss in my article How to Share Data Between Stored Procedures. However, for the task at hand, passing data from a client, the READONLY restriction is no major obstacle.

Calling this procedure is straightforward:

DECLARE @mylist integer_list_tbltype
INSERT @mylist(n) VALUES(9),(12),(27),(37)
EXEC get_product_names @mylist

(Here I use the new syntax for INSERT that permits me to specify values for more than one row in the VALUES clause.) You can also use TVPs with sp_executesql:

DECLARE @mylist integer_list_tbltype,
        @sql nvarchar(MAX)
SELECT  @sql = N'SELECT p.ProductID, p.ProductName
                 FROM    Northwind..Products p
                 WHERE   p.ProductID IN (SELECT n FROM @prodids)'
INSERT @mylist VALUES(9),(12),(27),(37)
EXEC sp_executesql @sql, N'@prodids integer_list_tbltype READONLY', @mylist

There are a few peculiarities, though. This does not work:

EXEC get_product_names NULL

but results in this error message:

Operand type clash: void type is incompatible with integer_list_tbltype

It is quite logical when you think of it: NULL is a scalar value, and not a table value. But what do you think about this:

EXEC get_product_names

You may expect this to result in an error about missing parameters, but instead this runs and produces an empty result set! The same happens with:

EXEC get_product_names DEFAULT

The scoop is that a table-valued parameter always has the implicit default value of an empty table. Whether this is good or bad can be disputed, but if there were to be explicit default values, Microsoft would have to invent a lot of syntax for it. And in most cases, the default value you want is probably the empty table, so it is not entirely unreasonable.

Finally, I should add that you cannot create stored procedures or user-defined functions in the CLR that take a table-valued parameter. But the other way works: you can call a T-SQL procedure with a TVP from a CLR procedure, using the same mechanisms you use from a client and which is what we will look at next.

Passing Table-Valued Parameters from ADO .NET

Passing values to TVPs from ADO .NET is very straightforward, and requires very little extra code compared to passing data to regular parameters. In fact, if you already have the data in a DataTable you can pass it as-is to the TVP. All you need is one extra line to define the table type. Rather than passing a DataTable, you can also use the List generic instantiated with the SqlDataRecord class or you can pass a class that implements the IDataReader interface.

I have implemented examples for all three options, and you can find these examples in the files tvpdemo.cs (C#) and tvpdemo.vb (Visual Basic). In the following text I will discuss the examples with List<SqlDataRecord> and DataTable in detail, using the code from the C# version. I will also briefly discuss the IDataReader example. If you want to run the examples, you will need to change the connection string, which you find at the top of the files. Before I go on, I need to add a disclaimer: I'm only an occasional .NET programmer, so if you see something that you think disagrees with best practice in C# or VB .NET, you are probably right.

And, oh, while I say ADO .NET, I really mean SqlClient. If you use System.Data.OleDb or System.Data.Odbc to connect to SQL Server, you cannot use table-valued parameters. Then again, the only reason to use these classes rather than SqlClient is that you need to also support other data sources, in which case you should not use proprietary features like TVPs anyway.

Using a List<SqlDataRecord>

I will first look at using a List instantiated with the class SqlDataRecord. This class exists in the Microsoft.SqlServer.Server namespace. In this example we will call the procedure get_product_names in the previous section.

The pertinent part of the tvpdemo program is the procedure list_example, which looks like this in entirety, with comments stripped:

private static void list_example() {

   int[] products = {9, 12, 27, 37};

   List <SqlDataRecord> product_list = new List<SqlDataRecord>();

   SqlMetaData[] tvp_definition = {new SqlMetaData("n", SqlDbType.Int)};

   foreach (int prodid in products) {
      SqlDataRecord rec = new SqlDataRecord(tvp_definition);
      rec.SetInt32(0, prodid);
      product_list.Add(rec);
   }

   using (SqlConnection cn = setup_connection()) {
      using (SqlCommand cmd = cn.CreateCommand()) {

         cmd.CommandType = CommandType.StoredProcedure;
         cmd.CommandText = "dbo.get_product_names";

         cmd.Parameters.Add("@prodids", SqlDbType.Structured);
         cmd.Parameters["@prodids"].Direction = ParameterDirection.Input;
         cmd.Parameters["@prodids"].TypeName = "integer_list_tbltype";
         cmd.Parameters["@prodids"].Value = product_list;


         using (SqlDataAdapter da = new SqlDataAdapter(cmd))
         using (DataSet        ds = new DataSet()) {
            da.Fill(ds);
            PrintDataSet(ds);
         }
      }
   }
}

The first line:

int[] products = {9, 12, 27, 37};

is just the data for the demo. Normally you would get this data from somewhere: a file, an input form or similar. On the next line we instantiate a list:

List <SqlDataRecord> product_list = new List<SqlDataRecord>();

The List class is a generic class, meaning that you cannot create a List object as such, but you have to specify what sort of objects the list is to contain, and you specify the type parameter(s) in angle brackets. (I guess most .NET programmers reading this have encountered generics before, but if you haven't you don't really need to worry, just think of List<SqlDataRecord> as a quirky identifier for now.)

On the next line we define the metadata for our table type by setting up an array of SqlMetaData elements:

SqlMetaData[] tvp_definition = {new SqlMetaData("n", SqlDbType.Int)};

The SqlMetaData constructor has a number of overloads, whereof the simplest is seen here: you specify the name of the column and its type. Other overloads permit you to specify the length for varchar, the scale and precision for decimal etc. In this case, since we are only passing a list of values, the array has one single element. But if you were to use a TVP to insert many rows into a table, your TVP would have many columns, and so would your SqlMetaData array.

Next we run a loop to populate the list:

foreach (int prodid in products) {
   SqlDataRecord rec = new SqlDataRecord(tvp_definition);
   rec.SetInt32(0, prodid);
   product_list.Add(rec);
}

First we create a new SqlDataRecord instance and define the metadata for it. The constructor for SqlDataRecord takes an array of SqlMetaData objects as its only argument, and that is the array we defined just above. Next we set the value for the only column in the record. Note that column numbers starts with 0 in this context. Finally in the loop, the record is added to the list.

The next step in the routine is to set up the connection and the command objects:

using (SqlConnection cn = setup_connection()) {
   using (SqlCommand cmd = cn.CreateCommand()) {

To set up the connection, I use an internal helper routine that I don't show here, but which you find in the tvpdemo files. (In case you are unacquainted with using: this statement permits you to declare a variable that is accessible in the enclosed block, and if the class of the variable has a Dispose method, this method will be invoked when the using block exits. This is highly recommendable for SqlConnection and SqlCommand objects, to ensure that they are explicitly disposed when you leave the block. If you just trust on garbage collection to take care of them, you may spew SQL Server with a lot of extra connections.)

Next we define what to call:

cmd.CommandType = CommandType.StoredProcedure;
cmd.CommandText = "dbo.get_product_names";

That's pretty much routine, but now comes the more interesting part, to define the parameter of this procedure. First we add the parameter and define the direction:

cmd.Parameters.Add("@prodids", SqlDbType.Structured);
cmd.Parameters["@prodids"].Direction = ParameterDirection.Input;

The name of the parameter is given by the procedure definition, and for the type we say SqlDbType.Structured, which is a general catch-all for all sorts of table-valued parameters. I also explicitly state the direction for the parameter for clarity. Next step is to define the exact table type of our procedure by setting the special parameter property TypeName:

cmd.Parameters["@prodids"].TypeName = "integer_list_tbltype";

Strictly speaking, this is not necessary when calling a stored procedure, since SQL Server knows the type anyway. However, it is definitely best practice to always specify the type. For one thing, this will give you a clearer error message when you make a mistake, like assuming that a parameter is of table type A, but in is of type B.

Finally, we set the parameter value, our product list:

cmd.Parameters["@prodids"].Value = product_list;

Once that is done, we can run our command:

using (SqlDataAdapter da = new SqlDataAdapter(cmd))
using (DataSet        ds = new DataSet()) {
   da.Fill(ds);
   PrintDataSet(ds);
}

To keep the code short, I've opted to use the Fill method, and PrintDataSet is a routine which loops over all datatables in a dataset and prints them in a fairly simple fashion. You find this routine in the tvpdemo files.

Using a DataTable

In the next example we will make a two things differently. We will pass the data using a DataTable, and rather than calling a stored procedure, we use an SQL batch to show that this is possible. For this example we use this type:

CREATE TYPE custid_list_tbltyp AS TABLE(custid nchar(5) NOT NULL PRIMARY KEY)

The tvpdemo program includes the procedure datatable_example, which follows here with comments below:

private static void datatable_example() {

   string [] custids = {"ALFKI", "BONAP", "CACTU", "FRANK"};

   DataTable custid_list = new DataTable();
   custid_list.Columns.Add("custid", typeof(String));

   foreach (string custid in custids) {
      DataRow dr = custid_list.NewRow();
      dr["custid"] = custid;
      custid_list.Rows.Add(dr);
   }

   using(SqlConnection cn = setup_connection()) {
      using(SqlCommand cmd = cn.CreateCommand()) {

         cmd.CommandText =
           @"SELECT C.CustomerID, C.CompanyName
             FROM   Northwind.dbo.Customers C
             WHERE  C.CustomerID IN (SELECT id.custid FROM @custids id)";
         cmd.CommandType = CommandType.Text;

         cmd.Parameters.Add("@custids", SqlDbType.Structured);
         cmd.Parameters["@custids"].Direction = ParameterDirection.Input;
         cmd.Parameters["@custids"].TypeName = "custid_list_tbltype";
         cmd.Parameters["@custids"].Value = custid_list;

         using (SqlDataAdapter da = new SqlDataAdapter(cmd))
         using (DataSet        ds = new DataSet()) {
            da.Fill(ds);
            PrintDataSet(ds);
         }
      }
   }
}

As in the previous example, the first line:

string [] custids = {"ALFKI", "BONAP", "CACTU", "FRANK"};

is just the test data for this demo. On the next line, I create the DataTable and define its only column to match my table type:

DataTable custid_list = new DataTable();
custid_list.Columns.Add("custid", typeof(String));

Then I run a loop to add rows to that table for my test data, in the same vein as I did in the previous example. As you may know, a DataTable consists of a number of DataRow objects:

foreach (string custid in custids) {
   DataRow dr = custid_list.NewRow();
   dr["custid"] = custid;
   custid_list.Rows.Add(dr);
}

Now, it is not unlikely that you already have the data in a DataTable, in which case there is no need to run this loop. However, keep in mind that your DataTable must match your table type exactly.

I set up the connection and command objects as above, and then I define the text for my command batch:

cmd.CommandText =
    @"SELECT C.CustomerID, C.CompanyName
      FROM   Northwind.dbo.Customers C
      WHERE  C.CustomerID IN (SELECT id.custid FROM @custids id)";
cmd.CommandType = CommandType.Text;

Note that the command batch includes a reference to a table variable, @custids. This is in fact the TVP that we will pass the DataTable to, and the next step is to set up that parameter:

cmd.Parameters.Add("@custids", SqlDbType.Structured);
cmd.Parameters["@custids"].Direction = ParameterDirection.Input;
cmd.Parameters["@custids"].TypeName = "custid_list_tbltype";
cmd.Parameters["@custids"].Value = custid_list;

You can see that it is very much the same as the previous example, only the names have changed. And one small thing you don't see: when you use an SQL batch, the .TypeName property is mandatory. As I noted above it's optional with a stored procedure, but as I also said, best practice is to always include the type name.

Finally, we run they query like in the previous example.

Using a DataReader

A third option is to use a class that implements the IDataReader interface. This is useful when you get data from an external source, a file, a TCP/IP connection or some other database, as it permits you stream the data directly to the TVP, without having to bounce the data over a List or a DataTable. Well, at least on according to this MSDN Library topic, using a DataReader gives you streaming. But according to what I heard from a Microsoft developer, the current implementation is non-streaming: the DataReader is fully consumed, before the data is passed to the TVP. If you need to use true streaming to feed your TVP, you should consider using SQL Server Native Client instead, see below. That is not to say the DataReader option is useless, it does after all save you from some extra variables. And who knows, maybe Microsoft will improve the implementation in a future release of the .NET Framework.

I don't include any example in full here, as it is quite difficult to make a meaningful example that is also is brief. It definitely goes beyond the scope of this article to show how to implement IDataReader. In the tvpdemo files you find the routine datareader_example where I use an OleDbDataReader to read data from SQL Server to pass it to a TVP, which is not terribly useful. But say that you want to pass data from an Access database or an Excel file to a TVP, it would be a good choice to use an OleDbDataReader to read the data from that source and then pass the reader to the TVP.

Final Remarks

As you have seen from these examples, passing a value to a table-valued parameter is a very simple affair in ADO .NET, no matter you use a List<SqlDataRecord> or a DataTable. And if you are able to fill the DataTable with data binding, it may be even easier. If you have external data, you can also pass a DataReader directly with no need for extra variables.

You may think that since you can use data binding with a DataTable, there is little reason to ever consider List<SqlDataRecord>, but there are situations where a DataTable doesn't cut it. One such situation is when your table type has an IDENTITY column or a computed column. Here you must use List<SqlDataRecord>. The trick is to use an SqlMetaData constructor that permits you to set the property UseServerDefault when you define your SqlDataRecord. Other things you can do with List<SqlDataRecord> and SqlMetaData is to specify the length of string columns and the precision and scale of decimal columns. A general tip is that if you cannot get your DataTable to play with your TVP, try using List<SqlDataRecord> instead.

Using Table-Valued Parameters from Other APIs

So far I have talked only about ADO .NET and SqlClient. In this section I will give a brief exposé over other APIs around and whether they support table-valued parameters. I also discuss what options you have if your API does not support TVPs.

ODBC

You can use table-valued parameters with ODBC. You need to specify SQL Server Native Client 10.0 as your ODBC driver. SQL Server Native Client 10, is a combined ODBC and OLE DB DLL that comes with SQL Server 2008, and which is freely redistributable.

As I have not worked with ODBC myself, I cannot assess how smooth or difficult to use TVPs with ODBC. I believe there are two way to pass a TVP through ODBC: streaming and non-streaming. (And I am told in difference to IDataReader this is real streaming.) Books Online have two examples on using table-valued parameter in the section Table-Valued Parameters (ODBC) . There is also a sample on Codeplex at http://msftdpprodsamples.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=18947.

OLE DB

You can use table-valued parameters with OLE DB, if you use the SQLNCLI10 provider, that is the OLE DB half of SQL Server Native Client 10. OLE DB offers two models for passing TVPs. One is the push model, where you create a rowset with the metadata, fill the rowset with your data, and in the regular parameter area, you pass the rowset pointer. The same basic idea as passing a List or a DataTable with ADO .NET, but you need to write more code. (As always with OLE DB, I'm tempted to say.)

The alternative is the pull model, which essentially is a role reversal where the consumer needs to implement IRowset on its own whereupon the provider will read from the Rowset as a consumer. This model is intended for streaming scenarios, where you get data from an external source, and don't want any intermediate storage. (And as with ODBC this is real streaming.)

Whether you can use TVPs if you use the OLE DB Consumer Templates, I don't know. I've only worked with "naked" OLE DB myself, never the consumer templates.

You can find a sample for the pull model at http://msftdpprodsamples.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=18947. (Yes, that the same link as above. The download gives you access to a number of useful samples for OLE DB and ODBC.) I have not found any sample for the push model, but if you are desperate I have some code that I can share. (It's not really in the shape that I want to put it out as a public sample.)

ADO

No, you cannot use table-valued parameters with old ADO. Yes, ADO sits on top of OLE DB, and you can use SQLNCLI10 as the OLE DB provider with ADO. But ADO itself has not been updated for the new data types in SQL 2005 and SQL 2008 and cannot work with them.

LINQ to SQL and Entity Framework

None of them have support for table-valued parameters. Ironic isn't it? Microsoft touts them as a the hottest and best way to access SQL Server, and then you find you don't have access to all features. The obvious workaround, beside using one of the older list-to-table methods, is to make a direct call from ADO .NET and bypass that language-integrated thing. Arguable, this causes your code to have a mix of paradigms. But it could be a first step from moving away from LINQ/EF entirely. (Wait, did I just say that? OK, let it be said: I am not a fan of neither of these technologies, as I feel that they serve to increase the object/relational impedance between client-side developers and SQL Server people.)

JDBC

The version of the Microsoft SQL Server JDBC Driver that is current of this writing (2.0) does not support table-valued parameters. But please check Microsoft's sites for updates. I don't know whether other vendor's of JDBC drivers for SQL Server support TVPs, but it could definitely be worth investigating.

PHP

Microsoft has a PHP driver for SQL Server. The current version of this writing is 1.0. There is also a CTP of 1.1 out. None of these versions support table-valued parameters of what I can see. I advice you check out Microsoft's PHP site for updates.

Perl

If you use the standard DBI/DBD modules, I doubt that you will find any support for table-valued parameters. However, the best option for connecting to SQL Server – as long you do not need support other data sources – is Win32::SqlServer, of which I am the author myself. And, yes, it supports table-valued parameters. (Although I will have to make an embarrassing confession in the performance section.)

What If Your API Does Not Support TVPs

As you have realised when you've read this small summary is that if you are using VB6, VBA, Access, Java, PHP – and probably a few more environments which I did not list here – you cannot use TVPs directly in your API. If you need to pass a list of values, you should use any of the methods that I discuss in my article Arrays and Lists for SQL 2005 and Beyond.

If you need to call a stored procedure that takes a table-valued parameter, you can always do this by writing a wrapper procedure that takes a comma-separated list (or XML document for multi-column TVPs) as parameter and inserts that to a table variable and then calls the inner procedure. Say for instance that you need to call get_product_names from VB6:

CREATE PROCEDURE get_product_names_wrapper @prodids nvarchar(MAX)
DECLARE @prodid_table integer_list_tbltype
INSERT @prodid_table(n)
    SELECT number FROM iter_intlist_to_tbl(@prodids)
EXEC get_product_names @prodid_table

If you think creating a procedure is too much, you can just submit a parameterised command batch:

DECLARE @prodid_table integer_list_tbltype
INSERT @prodid_table(n)
    SELECT number FROM iter_intlist_to_tbl(?)
EXEC get_product_names @prodid_table

If you've never seen a parameterised command before, see the section on SQL Injection in my article on dynamic SQL for a brief introduction.

Performance Considerations

Server-side

As you have understood from the fact that I devoted an article solely to table-valued parameters, this is the preferred method for passing a list of values. One important reason is simplicity: writing a stored procedure that accepts a table-valued parameter is straight-forward. Not that using a list-to-table function is a big deal, but using a table is the normal way to work in a relational database. And as you have seen, passing a value to a TVP from ADO .NET is a simple affair. TVPs also have the advantage that you can add constraints to the table type to disallow duplicates, or enforce some other type of contract. Not talking of the simple fact that with a comma-separated list, format errors can give you nasty surprises that just cannot happen with table parameters.

Does this also mean that this the method gives you the best performance? In general, yes, but in each and every case? No. When running the tests for my performance appendix, I did find situations where other methods outperformed TVPs. However, I believe that in the long run TVPs will you give you better performance than any other method. There are two reasons for this:

Data is already in table format. With all other methods, cycles need to be spent on parsing a character string to get the data into table format. In my tests, the fixed-length method performed better in some tests with integer data. Indeed, this method just chops up a fixed-length string reading from a table of numbers, so it is very similar to reading from a table variable. However, TVPs have one more ace up the sleeve:

The optimizer gets a clue. For all the other methods, the optimizer has no understanding of what is going on. In most situations you get a useful plan, but with methods based on inline T-SQL functions the optimizer may lose grip entirely and produce plans that are nothing but a disaster. And even in the case when the plan is useful, the optimizer has no idea how many rows your list of values will produce, which means that if you use the list in a query with other tables, row estimates may be way off.

This is different for table-valued parameters. Just like table variables, table-valued parameters do not have distribution statistics, but there is one piece of information: cardinality. That is, the first time you call a procedure that takes a TVP, the optimizer sniffs the TVP, as it sniffs all other parameters, and the optimizer sees that the TVP has so and so many rows. This gives the optimizer better odds to get good estimates for the number of rows in the rest of the query.

Not that it is perfect: There is the general problem with parameter sniffing that the sniffed value may be atypical. And it is not always correct information leads to the best plan; in the performance appendix for SQL 2008 you can read about a case where SQL Server chooses an incorrect plan, because it has more accurate cardinality information. But the appendix also shows that this only concerns a window of the input table size.

And of course, cardinality is far from sufficient in all cases. Consider the query:

SELECT * FROM Orders WHERE CustomerID IN (SELECT custid FROM @custids)

Say that there are four values in @custids. If they are just four plain customers, seeking the non-clustered index on CustomerID is good. But if they are the top four customers accounting for 40% of the volume, you want a table scan. But since a TVP does not have distribution statistics, the optimizer cannot distinguish the cases. The workaround is simple: bounce the data over a temp table and take benefit of that temp tables have distribution statistics. But of course, that workaround is the same as for all list-to-table functions, so you could argue that if you need to do this, there is no special advantage of TVPs.

Client-side

A reasonable question is: does TVP incur more calling overhead than regular parameters? The answer is yes. In my tests I found that passing 50 000 integer values to an unindexed TVP from ADO .NET took 40-50 ms compared to 20-35 ms for comma-separated lists. For a TVP with a primary key, the overhead was around 150 ms. If I also added the time to load the DataTable or List<SqlDataRecord>, there was even more overhead. (In fairness, this last test should be compared with the time it takes to build a comma-separated list or some other input.)

While this overhead may seem considerable, you need to put it in perspective and look at the total execution time, and in most cases, the server-side execution time exceeds the the numbers in the previous paragraph with a wide margin. As just one data point: in my test, the server-side execution time for my join test over 50 000 list elements was 213 ms for a non-indexed TVP, and the best non-TVP method (fixed-length binary input) needed 420 ms. The performance appendix for SQL 2008 has more details.

I need to conclude this section with an embarrassing confession. While the overhead in ADO .NET nowhere close to alarming, the same cannot be said about Win32::SqlServer, my own API. The overhead for passing 50 000 values to a TVP are two shameful seconds. I cannot say whether this is due to OLE DB or my own miserable programming.

Primary Key or Not?

In all examples in this article, I've augmented the table with a primary key, but the previous section tells us that using a PK on our TVP adds a higher call overhead, so this raises the question whether we should use one.

This is a question where the answer has two layers: correctness and performance. And correctness is always more important than performance. All queries in this article uses IN, which means that if there are duplicates in the list, this will not cause duplicates in the output. But say that your query goes:

SELECT C.CustomerID, C.CompanyName
FROM   Northwind.dbo.Customers C
JOIN   @custids t ON t.custid = C.CustomerID

If @custids would include two occurrences of ALFKI, Afreds Futterkiste would also be returned twice. If this is exactly what you want for some reason, you should not put any primary key on your table type. But in most cases you don't want any duplicates, and in this case, slapping a PK constraint on the TVP adds a safeguard against accidents. If you pass duplicates by mistake, it will be caught directly when you call your stored procedure or command batch.

Now, let's put on our performance glasses, and consider this procedure again:

CREATE PROCEDURE get_product_names @prodids integer_list_tbltype READONLY AS
   SELECT p.ProductID, p.ProductName
   FROM   Northwind..Products p
   WHERE  p.ProductID IN (SELECT n FROM @prodids)

I would suggest that for this procedure it does not matter whether the TVP has a primary key or not! Assume first that you have a PK, and you pass very many values, like 50 000, which was the highest value I used in my tests. To be able to insert the values in the TVP, SQL Server has to sort them as they arrive, adding to the call overhead. However, when the query is executed, SQL Server knows that @prodids holds unique values, and the query can be executed as a regular join. If we instead remove the PK from the table type, receiving the values from the client is now a faster affair. But instead, SQL Server somewhere needs to eliminate any duplicates there may be when running the query, which will be by sorting or hashing. In other words: what you lose on the swings, you will gain on the roundabout. But you may note that if you use your TVP in several queries of this style in the same procedure/batch, having a PK is the winner, as SQL Server only has to look for duplicates once.

Acknowledgements and Feedback

I like to thank my MVP colleagues who helped me by reviewing my demo program and with other research: Bob Beauchemin, Alejandro Mesa, Greg Low and Daniel Joskovski.

If you have opinions, additions or just have spotted a language/grammar error, please mail me at esquel@sommarskog.se. If you have questions about using TVPs or arrays and lists in SQL Server in general, I advice you to post your question to a public forum, for instance microsoft.public.sqlserver.programming.

Revision History

2010-01-06 – First version.

Back to my home page.