My SQL Server Wishlist
Version 2018

An SQL text by Erland Sommarskog, SQL Server MVP. Most recently updated 2019-04-27.
Copyright applies to this text. See here for font conventions used in this article.


As you may imagine, my wishlist for new features in of SQL Server exceeds what most kids send to Santa Claus. This article is a collection of suggestions filed on the Uservoice site for SQL Server feedback, and my hope is that you will vote for those you like. This is not to say that this is all that I wish for in SQL Server. To keep a focus, I have constrained this list to development-related items, so you will not find items related to high availability, security etc on the wishlist.

The list consists of four sections. There is one for general programming, one for improvements to querying and there is a section with DDL improvements. But the most important section is the first section which focuses on area where I feel that Microsoft rather have gone backwards over the years: facilitating robust system development. Robustness implies that you detect errors as early as possible and the first gate is compile time. My first and most important wish is for much stronger compile-time checks, and I have compiled these ideas in a separate article, Ideas for SET STRICT_CHECKS ON. Robustness is also about error handling at run-time, an area in which Microsoft made some improvements in SQL 2005, but it's still a mess. Robustness is also about expressiveness, for instance to have domains with rules attached to them and that cannot be converted by mistake to another domain. Another such example is using identifiers, e.g. enums, instead of constants, so that spelling errors can be detected at compile-time. It's also about contracts, for instance being able to guarantee the shape of the result set returned from a stored procedure. All these are wishes you find in the section Robust System Development.

Far from all the feedback items on this wishlist are my own. Other people have submitted great ideas that I have embraced. Or they have simply submitted a suggestion for a feature before I came around to do it myself. In this article, I am giving credit to these persons as much as I can. However, this is not always possible for items that were originally submitted on the old Connect site that preceeded the current site on Uservoice. When Connect closed down, Microsoft migrated open items to Uservoice, but in that process the identity of the submitter was lost. For some of these old items, I have the name/alias of the submitter because I had the item on this wishlist already before the move to Uservoice. But there are also items that I did not discover until after migration, and in this case I can only refer to the submitter as unknown or anonymous.

I first published this wishlist when SQL 2008 had been released. A few items on the original list have been implemented, and a few have been added over the years, but many have been there from the beginning.

Table of Contents


Robust System Development

SET STRICT_CHECKS ON – Find More Errors at Compile Time

Add a Virtual Errors Table

TRY-CATCH Should Always Work!

Permit Recoverable Error Handling – Don't Doom Transactions Willy-nilly

Savepoints and Distributed Transactions

Enums in T‑SQL

Contracts for Stored Procedures

Give us Real Domains!

Immutable and Auto-Updated Column Values


Other General Programming Improvements

Make Table Parameters Read-Write

Get the Call Stack!

Autonomous Transactions

Module-Level Table Expressions

Expose All Error Messages

Improvements for Bulk Load

Deleting Lots of Data Efficiently

Support for Multi-line Verbatim Strings

Get Position within an XML Document

Inline Scalar UDFs

Really Small Things

Improvements for Querying

A Built-in Table of Numbers – generate_series

Row Constructors

Regular Expressions

Type the Literal!

Remove Illogical Restrictions for IF EXISTS()

QUALIFY for More Concise Queries

Dynamic Pivot

IS [NOT] DISTINCT FROM – no longer a wishlist item

DDL Improvements

Server-Side CLR Compiles

Synonyms for Databases and Linked Servers

Application-Time Periods for Temporal Tables

BEFORE triggers or Enhanced INSTEAD OF Triggers

ALTER TABLE Improvements

Revision History

Robust System Development

SET STRICT_CHECKS ON – Find More Errors at Compile Time

Could you imagine programming in C# and not be told until run-time that you left out a mandatory parameter in a procedure call? Or that you accidently assigned an integer variable a string value and your program crashes when the string is not a number? Or that the compiler does not care if you refer to a non-existent class? The very idea is of course ludicrous. Yet, this is exactly the reality that T‑SQL developers have to work in. This means increased cost of development and testing, because it takes longer time to find silly errors. It also incurs a risk that a critical function in a production system errors out, because of a typo that could have been caught within seconds from when it was made.

I see no reason why we T‑SQL programmers should be content with such a primitive environment that has a such distinct flavour of, um, Fortran. For this reason, a couple of years ago I submitted an item entitled Add optional checks for more robust development where I suggest a new command SET STRICT_CHECKS ON. When strict checks are in force, you would get compile-time errors for a lot of things that does not blow up until run-time today – or even worse produce incorrect results.

To further help Microsoft to get this right, I have a separate article, Ideas for SET STRICT_CHECKS ON, where I discuss what could be included in this command.

Here is a brief summary, with direct links to the various sections in that article:

Add a Virtual Errors Table

You are taught to use set-based statements, because they perform a lot better than cursors. There is just one problem: if there is an error with a single row, the entire statement is rolled back. Sometimes this is what you want, but far from always. In many cases you want to take the rows that passes all constraints and triggers, and divert the rejected rows elsewhere.

Dan Holmes came up with this great suggestion: SQL Server should have a virtual table called errors to go together with inserted and deleted. That is, rows that causes errors – conversion errors, string or binary data would be truncated, constraint errors would be diverted to this table, and provided that they are successfully inserted into that table, the statement as such would pass. That is, you would need to have an OUTPUT clause for this table. If you have not, the query would fail like today.

It shall not be denied that there is some work to get this feature right. And while not covered in Dan's suggestion, I would love to have access to this table inside an INSTEAD OF trigger where I could insert bad rows. The prime area for this feature would be in ETL scenarios, but I can see lot of other situations where this feature would be super-useful.

Dan's item has the title New virtual table: errors. It would analogous to the deleted and inserted tables. I recall that on old Connect, Microsoft attempted to close it at one occasion. The result was the number of votes four-doubled in a matter of days... The number of votes carried over from Connect is 650 which is quite a lot.

TRY-CATCH Should Always Work!

If you try this:

   PRINT 'This prints'  
   SELECT * FROM doesnotexist
   PRINT 'This does not print'

When you run the procedure you find that the output is:

This prints

Msg 208, Level 16, State 1, Line 3

Invalid object name 'doesnotexist'.

That is, the CATCH block is never entered. The root problem here is deferred name resolution, the demise of which I call for in my strict-checks document. SQL Server never reports the table as missing until it actually tries to execute the statement. But it is still a compilation error, and today a compilation error cannot be caught in the same scope as the error occurs. (This is fully documented in Books Online.) You can catch the error in an outer scope, that is:

   EXEC bad_sp
   PRINT 'The error is: ' + error_message()

This will print:

This prints

The error is: Invalid object name 'doesnotexist'.

When TRY-CATCH first appeared I was so happy with the big improvement from SQL 2000 that I was prepared to accept this glitch. But all these many releases later? No way! This is a serious problem. Deferred name resolution itself is bad, because it defers checking of silly typos until run-time. The fact that you cannot handle it appropriately adds insult to injury. Who says that you always have that surrounding stored procedure?

I see posts about this about every week on the forums. It may be people who are trying out TRY-CATCH for the first time, and what impression do they get of SQL Server when they find that it does not seem to work? That Oracle after is all is not such a bad product?

During the course of Connect, there were at least two items submitted. One by Carlos Fatureto and one myself. Microsoft closed both saying Won't fix or By design (!). I have added a new item on Uservoice: TRY-CATCH should always work.

Note: even if STRICT_CHECKS would be implemented and deferred name resolution would be killed, TRY-CATCH still needs to be fixed, since a table may be dropped after a procedure is created.

Permit Recoverable Error Handling – Don't Doom Transactions Willy-nilly

In many situations when an error occurs in your code, all you want to do is to roll back everything and make sure that you exit as soon as possible. Maybe you want to log the error on the way out, but that's that.

But there are also situations where you for one reason or another want to continue the transaction, either by performing some alternate action, or roll back to a savepoint – which is quite different from rolling back the entire transaction. Say that you write a stored procedure that perform some well-defined piece of work on its own, and thus defines a transaction for this work. However you expect your procedure to be used as part of a bigger plot, and if an error occurs in your code you don't want to pull the rug for the caller and roll back its transaction. Rather, you could define a savepoint in the beginning of your procedure, and if there is an error, you roll back to that savepoint.

A specific situation is when you work with Service Broker. In this case you want a transaction that always commits so that the message is taken off the queue. If the normal processing fails, you want to roll back to the point right after you had taken the message off the queue and instead store the message in an error table.

This is in practice impossible to implement today, because when you enter the CATCH block, the transaction may be doomed, meaning that the transaction must be rolled back in its entirety. You cannot do more work in the transaction, nor can you roll back to a savepoint. The concept as such is not strange, because this is exactly the behaviour you expect on a concurrency error like deadlock or a resource error like running out of disk space.

The problem is that in SQL Server it is completely willy-nilly whether an error dooms the transaction or not when the setting XACT_ABORT is OFF. Consider these two statements that have the same error:

SELECT convert(date, '20001131')
SELECT convert(datetime, '20001131')

The first dooms the transaction, the second does not.

When XACT_ABORT is ON, it is more consistent – now all errors doom the transaction.

This is covered in the item Make it possible to have transactions doomed only where there is a reason to.

Note: This item applies to traditional T‑SQL procedures. In natively compiled proceduress, it is all different and transactions are indeed only doomed for a reason.

Savepoints and Distributed Transactions

Even if the previous wish is fulfilled, there is still a problem: you cannot use SAVE TRANSACTION in a distributed transaction. This restriction as such is acceptable, given that distributed savepoints seems like a big challenge to implement. Not all data sources may even support savepoints.

You could argue that it would be OK to set up a savepoint inside a distributed transaction and roll back to as long as you have only made local updates. I can't say how difficult this would be for Microsoft to implement, but the fact that you are not permitted to call natively compiled procedures (which have implicit savepoints) is a clue that it may not be that easy. So I have not submitted any request along these lines.

But there is another problem which appears a lot easier to solve. You write a stored procedure that uses savepoints and everyone is happy. Then two years later someone decides to call your procedure inside a distributed transaction and this will fail because of the savepoint. So you decide that, "OK, if there is a distributed transaction, I skip the savepoint". Problem is – there is no way to find out that you are. For this reason I have submitted There should be a way to tell whether I am in distributed transaction.

Enums in T‑SQL

We probably all have lot of constants in our code. Maybe we use numbers, in which case we will have to know that 1 for order type means "Internet order", 2 means "telephone order" and so on. Or maybe we try to use mnemonic codes. Whatever, there is no safety-net if we mix up the numbers or spell the codes wrongly.

SQL Server MVP Simon Sabin has a long-standing suggestion that T‑SQL should permit us define enums to get away from this, a very interesting idea that has attracted over 200 votes. Allow literals and read only table data to be represented as enums.

Contracts for Stored Procedures

A common question is "how do I find out what result set(s) a stored procedure returns", but there is not any fully satisfactory way to do this. There is the system procedure sp_describe_first_result_set (introduced in SQL 2012), but as the name suggests, it only returns information about the first result set. Furthermore, there are several situations where it gives up, for instance if there is a temp table involved.

It would make more sense if you could expose a contract for the stored procedure which states what you can get back from the procedure. If the actual result set would deviate, that would be an error.

SQL Server MVP Greg Low discusses this in more detail in this blog post, and the comments to this blog post inspired him to submit the item Stored procedures should expose detailed contracts. I think this idea fits very well with my thinking that the programming environment should supply methods to make our systems robust. Contracts are an essential part of object-oriented programming in clients and middle tiers. So why would good virtues stop just because we enter a database?

Note: Microsoft did do some development along these lines in SQL Server 2012 when they added the clause WITH RESULT SETS. But this is a clause of the EXEC statement. Thus, it's the onus of the caller to state what result set it expects the callee to return, which is fairly clunky.

Give us Real Domains!

If you go to you will find this syntax:

CREATE DOMAIN name [AS] data_type   
   [ DEFAULT expression ]      
   [ constraint [ ... ] ]    

where constraint is:    

[ CONSTRAINT constraint_name ]  
{ NOT NULL | NULL | CHECK (expression) }

This syntax is not proprietary to PostgreSQL, but it's drawn from ANSI SQL-92. With this syntax you can for instance say:

CREATE DOMAIN update_user AS nvarchar(128)

And then you can use that domain in a table definition:

CREATE TABLE sometable (col1        int   NOT NULL, 
... whodidit update_user,

You can achieve this in SQL Server, using a different syntax:

CREATE TYPE update_user FROM nvarchar(128) NOT NULL
EXEC sp_bindefault 'update_user_def', 'update_user'
CREATE RULE update_user_rule AS @x = SYSTEM_USER
EXEC sp_bindrule 'update_user_rule', 'update_user'

However, all commands above but CREATE TYPE are deprecated. And not entirely without reason: you can use sp_bindxxx to bind a rule or a default directly to a table column, for which there is little reason since we have constraints. But for types there is nothing better.

No matter whether you call them types or domains, they can help you in several way to implement more robust database systems. With a domain you confine set of rules that apply to all columns and variables of that domain. That is, the length for string/binary columns, scale and precision for decimal data types, as well as defaults and constraints. For string columns, the collation could also be such a property. Furthermore, a good database system would not permit us to mix domains. If we have one domain for order IDs and one for product IDs, a condition like:

FROM OrderDetails OD JOIN Products P ON P.ProductID = OD.OrderID

would result in an error. (Although, this probably goes beyond the ANSI standard. Postgres permits you to mix domains in a join.)

SQL Server supports some of these things I've mentioned but far from all. This is legal:

DECLARE @myvar my_domain
However, no rules or defaults are applied for variables. Also, this is illegal:
SELECT cast(somecol AS my_domain)

For cast and convert you need to use the physical type definition. Furthermore, SQL Server imposes no restrictions of mixing user-defined types. As the alternate names suggests, they are just aliases.

Clearly, there is room for improvement. At a minimum, Microsoft should undeprecate the use of sp_bindxxx with user-defined types and then implement support for types with constraints or defaults where this is missing. (E.g., you cannot use types with rules in table variables; a table cannot be the target for MERGE if the table has a column of a type with a rule.)

There is a very old Connect item from Andrew Mackie which still is active on the new Uservoice site: Add CREATE DOMAIN command. (The number of votes is somewhat unimpressing, but for many years it was not possible to vote for the oldest Connect items. With the migration to Uservoice, this problem is gone.)

Immutable and Auto-Updated Column Values

This section calls for for DRI to define what first seems to be two things in opposition to each other, but in fact supplement each other quite well.

It is not uncommon to have auditing columns like this:

ModifiedDate  datetime2(3) NOT NULL DEFAULT sysdatetime(),
ModifiedBy    datetime2(3) NOT NULL DEFAULT original_login(),

But the default only works automatically on INSERT. On UPDATE you need to say:

ModifiedDate = DEFAULT

explicitly. If the programmer forgets this, the colunm is not updated. Which is bad. You can work around that with triggers, but that's not really a pretty solution, not the least for performance sinces it results in double updates.

So we would need a solution to say that such a column should always be updated automatically with its default or similar solution.

Among your auditing columns you may also have a CreateDate column, and here the opposite applies. It should never be permitted to be update it. In fact, in some tables there are many columns that should be immutable, either from the second they are entered – or when some column in the the row has been set to a certain value. For instance, once an order has been processed, not very many rows on the order should be permitted to change – but the ReturnDate column might need to be updated if the customer returns the goods. And this is the kicker. If the full row is to be frozen, this is easily implemented with an INSTEAD OF UPDATE trigger, but if it only applies to a selection of the columns, the trigger solution is not very palatable, because you have to repeat the statement that fired the trigger. (Although see below in the section BEFORE triggers or Enhanced INSTEAD OF Triggers.)

There are two feedback items in this area. One is an item that Jason Kresowaty originally submitted in 2008. The title is somewhat misleading, Trigger Performance for Simple/Common Scenarios, but if you read it to the end you will find that he asks for DRI for both immutable and auto-updated columns. The second item is a more recent submission from SQL Server MVP Adam Machanic, Provide a mechanism for columns to automatically update themselves, which only discusses auto-updated columns.


This is a request is that looks at robustness from a perspective which is very important in database systems, to wit transaction consistency.

SQL 2005 introduced two ways to use a snapshot where readers and writes do not block each other. Setting your database into READ_COMMITTED_SNAPSHOT is a very quick operation for increasing the concurrency of your application. (And it also saves you from having to use the highly unsafe NOLOCK hint). However, there is a potential problem. Code that reads data to check integrity may read stale data. SQL Server MVP Hugo Kornelis discusses this in a series of blog posts, Snapshot isolation: A threat for integrity? Alex Kuznetsov also discusses this in his book Defensive Database Programming.

A quick example: Assume two processes in the system one that enters orders and one that deregister products. There is a trigger on the OrderDetails table to check that ordered products are still active, and there is a trigger on Products that verifies that there are no open orders for deregistered products. At the same time as an order being entered, a product on that order is deregistered. When the two triggers run, both triggers read from the snapshot and both incorrectly think everything is fine. The end result is a violation of a business rule.

There is a workaround today, you can add the hint READCOMMITTEDLOCK to force reading directly from the data page and not the snapshot. But a hint is not really appropriate, because a hint is for a specific table. When you are in the situation you don't want to read stale data, it is much likely that it applies across the board, for instance in a trigger. Thus you need to use the hint for all tables accessed in your stored procedure or trigger which will litter the code. And there is a risk that you miss to add the hint for some tables.

For this reason I have submitted the feedback item Add SET TRANSACTION ISOLATION LEVEL READ COMMITTED LOCK.

Other General Programming Improvements

Make Table Parameters Read-Write

Table-valued parameters were a nice addition in SQL 2008, but since they are read-only they are not half as exciting as they could have been. With read-write table parameters, it would be possible to pass sets of data between stored procedures in a structured way. To do this today, you can employ various tricks that I discuss in my article How to share data between stored procedures, but all of them have their issues.

During the beta of SQL 2008, I submitted a Connect item where I argued that Microsoft should lift this restriction for calls between stored procedures, but retain it for calls from client APIs. My thinking was that the problems that had caused MS to introduce the restriction lied in the API:s and the communication with them. Unfortunately, the request has not yet been honoured. But the feedback item Relax restriction that table parameters must be readonly when SPs call each other is still active.

I argue further for my position in the article Why Read-only Table Parameters is Not Enough.

Get the Call Stack!

You can get the currently executing module through the @@procid function, and you can pass it as a parameter to a general logging function. But there are many cases, where you would like to retrieve the entire call stack. For instance, say that you have a generic error-handling procedure that you call from your CATCH handlers. Today you can log error_procedure() and also @@procid if you take it as a parameter. But if the procedure that failed is called from several places, you may have very little knowing how you ended up there. Another situation is in a trigger where you would like to bypass a check if the trigger was invoked by a certain procedure that is permitted to break the rules.

This could be implemented as a function that returns the call stack, but it could also be a DMV. A caveat in the latter case is that normally DMVs requires that the user has the server permission VIEW SERVER STATE, but that would be contraproductive here, since this is something you want to run from application code, so it must work for a non-privileged user.

I have two feedback items for this. The original one is called Provide function to retrieve the entire call stack which I submitted in the early days of Connect. There was a technical issue at Connect that the prevented people from voting on very old items, which is why I years later added a new item: Add a means to retrieve the call stack inside a stored procedure. There is a third item from SQL Server MVP Ola Hallengren, Add a function that returns the tsql_stack, which generalises the idea by calling for a function that can return the call stack for any request currently executing.

Autonomous Transactions

Today if you nest BEGIN and COMMIT TRANSACTION, inner COMMIT:s just decrease the transaction count, and only when you come to the outermost COMMIT, the actual commit happens. And most of the time, this is what you want. But there are situations where you want to do things outside the current transaction. A typical example is when you log something to an error table. You want that error message to persist, even if the transaction is rolled back. Another example is given in the feedback item Add support for autonomous transactions, submitted by Anthony Bloesch.

Module-Level Table Expressions

Some time back I found myself writing a stored procedure where I prepared data coming from an incoming source, which included a lot of validations and transformations. I had a bunch of UPDATE statements that went:

UPDATE mytable
SET    some_col = ...
FROM   mytable m
JOIN   ...
WHERE  m.fileid = @fileid
  AND  m.status = 'New'

And it made me a bit nervous. What if I accidently left out the conditions on fileid or status and updated rows I did not want to touch? It occurred to me that it would have been really nice, if I had been able to say:

DECLARE @mynewrows AS
   (SELECT *
    FROM   mytable
    WHERE  fileid = @fileid
      AND  status = 'New')

Then my UPDATE could be written as:

UPDATE @mynewrows
SET    some_col = ...
FROM   @mynewrows
JOIN   ...

Essentially, a Common Table Expression, but on module level. Note that a view would not cut it, due to the reference to the local variable. A inline-table function would work, but I would still have to repeat that @fileid again and again. And I would have to litter the system catalogue with something I need in one single stored procedure only.

That inspired me to this feedback item, Module-level table expressions.

Expose All Error Messages

A problem with the functions error_message, error_number etc is that you can only access one error message if there are several of them. Indeed, in many case there is only one message. But with BACKUP/RESTORE and DDL there are often multiple error messages. Consider this:

   BACKUP DATABASE msdb TO DISK = 'X:\nosuchdisk\dxx.bak'
   PRINT error_message()

The output is:

BACKUP DATABASE is terminating abnormally.

That is, the error message about the invalid path is lost. This is clearly unsatisfactory, as it makes impossible to write good error-logging code for things like BACKUP-RESTORE. (Although see chapter 7 in Part Three in my series Error and Transaction Handling in SQL Server for some feeble attempts to workarounds.)

This could be solved in several ways. A cheap one would be to add an optional parameter so that error_message(2) would return the second error message. But how would you know how many error messages there are? A table-valued function would be a more palatable option – that function would return one component for each component. I have a feedback item for this in Make it possible to retrieve all error messages in a CATCH handler.

And while they are at it, they could add the schema for procedures; this is missing from error_procedure today. This was something SQL Server MVP Louis Davidson pointed out in this old item, Schema not reported in the ERROR_PROCEDURE function.

Improvements for Bulk Load

Bulk load – i.e. BCP, BULK INSERT and OPENROWSET(BULK) - is an area that could benefit from some enhancements.

Deleting Lots of Data Efficiently

TRUNCATE TABLE is an efficient command if you want to empty a table entirely, but there is a restriction: you cannot use it if there are referencing foreign keys. Seen from a general perspective, this is a reasonable rule. Say that I have a table Orders which has a foreign-key column CampaignID which is a reference to Campaigns. If I say TRUNCATE TABLE Campaigns, this is only permissible if CampaignID is NULL for all orders. If this column is not indexed, the Orders table has to be scanned, which can be expensive and in direct conflict with the swiftness promised by TRUNCATE TABLE.

However, there are situations where this is not really an issue, because you want to empty a set of related tables. Let's say that I have a test database with Orders and OrderDetails that I've filled up with ten million orders, and now I want to remove the test data to start over. I can use TRUNCATE TABLE on OrderDetails, but the foreign key from this table prevents me from using the same command on the Orders table.

Here are no less than three feedback items that suggest different solutions to handle this situation. The first is my own, while the other two are from submitters whose names have been lost in the migration from Connect to Uservoice.

  1. Permit TRUNCATE TABLE when referencing tables are empty. That is, if all referencing foreign key(s) are from tables with no rows in them, there can be no constraint violation. With this suggestion you would be able say:
    TRUNCATE TABLE OrderDetails
  2. TRUNCATE TABLE should support truncating multiple tables to make it easier to delete hierarchies with foreign keys. With suggestion, you would use a single command to empty both tables:
    TRUNCATE TABLE OrderDetails, Orders
    This is an idea that I had never thought until I saw this feedback item, but it is definitely an interesting, and may be easier for Microsoft to implement than my own idea.
  3. TRUNCATE TABLE should respect CASCADE foreign keys. What it says. If all referencing tables have their foreign keys defined with ON DELETE CASCADE, they are also truncated. Thus, with this idea you would simply say
    From a logical perspective, this makes perfect sense. After all TRUNCATE TABLE is just DELETE but implemented in a physically different way.

Support for Multi-line Verbatim Strings

In some languages you can define string literals that spills over multiple lines and which have special delimiters, so that you don't have to worry about string delimiters in the string. They are often known as here-strings or similar. They are not the least useful when you want to put program code inside a string literal, for instance because you work with dynamic SQL, where you often end up in a mess of nested quotes.

As an example, every once in a while I find that I want to wrap an SQL statement in sp_executesql for testing purposes. Currently I have to go through the SQL string and double all single quotes which can be tedious. But what if I could say:

SET @sql = $'
FROM sys.objects o
JOIN sys.columns c ON o.object_id = c.object_id
WHERE type = 'V'

That is, all text between $' and '$, not including the lines where these token appear, would be a single string. I've borrowed the syntax from Powershell. (Although Powershell uses @ and not $.)

With here-strings, you could in many cases make define your dynamic SQL string in a single here-document. For things that you cannot pass as parameters, like table or database names, you would use variable-lookalikes which you replace later. For instance,

SET @sql = $'
FROM @db.sys.objects o
JOIN @db.sys.columns c ON o.object_id = c.object_id
WHERE type = 'V'
SET @sql = replace(@sql, '@db', @mydb)

I have submitted item on Uservoice for this, Add support for here-strings in T‑SQL.

Get Position within an XML Document

When you work with an XML document, you often have reason to refer to the position in the document. For instance, consider an incoming document with Order and OrderDetails data in a business-to-business system, and you need to be able to correlate the order and the details. However, there is not really any good key in the document, at least not one that you can feel that you can trust. That is, it is perfectly possible that there are two Order elements that are identical on that level. (The details may differ.) If you look at the document with your eyes, it's still a piece of cake: that's the third order and that's the fourth order and so on. But if you are shredding the documents with .nodes(), there is no reliable way you can find out which node that is the first, the second and so on.

There is a very good feedback item Fully support position() in xquery, submitted by the signature "phe" which discusses this. The Connect item also suggests some workarounds, but as the author points out, they are inefficient with large XML documents. My thanks to SQL Server MVP Alejandro Mesa who pointed me to this item.

Inline Scalar UDFs

Yes, it is happening!

Background: up to SQL 2017, user-defined scalar functions were notorious performance thieves, because they were never inlined, and basically they were a query within the query, which the optimizer did not see.

Therefore, it was a long-standing wish from the community that you should be able to have inline scalar functions just like you can inlne table functions. That is, the function would be a mere macro that is pasted into the query and the optimizer sees the whole thing.

And, finally, in SQL 2019, it is happening. And what way it is happening! There is not really any new syntax, no SQL Server will analyse your UDF and will inline it if possible. You may have lots of statements in your UDF, but the function may still be inlined – with a boost for performance.

Not everything is inlined. Here are some inline blockers:

Most of these are quite logical. The last maybe less so at first sight, but there is a backwards-compatibility issue that explains it. (When inlined, they will return the same value for all rows; but the values can be different if there is no inlining.) It would be nice, though, if you could force inlining in this case, so I have submitted a feedback item entitled Permit UDF with time-dependent functions to be inlined.

Really Small Things

We start with an item that actually has been implemented. Topping the bills on Connect in the past and more recently on Uservoice were requests to fix the awful error message String or binary data would be truncated to report which value that would be truncated and/or the target column for the value. Microsoft has now addressed this, and an improved message is now available in SQL 2019, SQL 2017 starting with CU12 and SQL 2016 SP2 CU6. (And possibly also CU5, which I have not tested.) Note that to get the new and improved message in SQL 2016 and SQL 2017, you need to enable trace flag 460. This flag is not needed on SQL 2019.

SQL 2008: Add sysdate(). A sysdate() function would make things easier when you want query a date column for things that happened today.

In the same vein as the request above to include the schema name with error_procedure, there are also feedback items that calls for the schema being included in the output from SET STATISTICS IO. I have found two: Please include schema / database prefix in STATISTICS IO output which I believe was originally submitted by Aaron Bertrand and SET STATISTICS IO should also show schema name for tables from Michael Søndergaard.

Improvements for Querying

A Built-in Table of Numbers – generate_series

There are quite a few SQL problems where you need a suite of numbers to generate a dimension. For instance: Show me sales per day. Days for which there were no sales at all should appear with a 0. To generate the days missing from the Sales table, you need a calendar table – or a table with numbers from 1 to as many days there are in the period. If you have read my article Arrays and Lists in SQL Server, you have seen several methods to unpack a list into a table that employ a table of numbers. You can easily generate such a table, but if you generate too few numbers, your queries will produce incorrect results when you run out of numbers. Another issue is that the optimizer thinks this that is just another table, and does not understand its special properties, and therefore does not make a very good job with it.

On the other hand, if there was such a built-in table (which of course would not be a physical table, but a logical concept), you would never run out of numbers, and the optimizer could make shortcuts for better performance. For this reason I have submitted the item Add a built-in table of numbers. Other items are Include Generate Series from Manjunathan R and Create a generate_series table value function from an unknown submitter.

It is worth noting that such a feature exists in PostgreSQL, where it is called generate_series. You can look at the Postgres manual page to get some idea of how it works.

Row Constructors

If you want your UPDATE statements to be ANSI-compliant and update several columns from the same source you need to write:

SET    col1 = (SELECT somecol FROM ...),
       col2 = (SELECT someothercol FROM ...),

Not only is this tedious; since SQL Server is not able to recognize that the subqueries are the same, performance can suffer as well. In true ANSI SQL, this is not an issue, because you can use row constructors:

SET    (col1, col2, ...) = (SELECT somecol, someothercol, ... FROM ...)

The regular workaround in SQL Server is to use the proprietary syntax with the FROM clause, and personally I prefer that syntax in most situations, but sometimes it's more natural to use a subquery.

Another situation where row constructors can be handy is for queries like:

SELECT ... FROM tbl 
WHERE (col1, col2, col3) IN (SELECT col1, col2, col3 FROM somewhereelse)

In this case, too, there is a good alternative, the EXISTS clause, and it's not even proprietary. Nevertheless, there are situations where you may be more comfortable with the IN syntax.

You may have also written queries with several CASE expressions where the same WHEN conditions reappearss. What if you could say:

CASE WHEN ... THEN (this, that, orthat)
     WHEN ... THEN (a, b, c)

The most compelling case for row constructors is maybe the SET statement:

SET (@a, @b, @c) = (SELECT a, b, c FROM ...)

The alternative today is to use the SELECT statement, but the behaviour when the SELECT hits anything but exactly one row is not exactly intuitive. With the addition of row constructors, SET would become the prime alternative for assignment.

There are several feedback items in this area. The main item with the most votes is Add support for ANSI standard row value constructors, submitted by SQL Server MVP Hugo Kornelis. Other items are Support row constructors for conditional expressions, submitted by Ahsukal and Row types and constructors (SQL99 features T051, F641) for which I don't know the submitter.

Note: a potential problem with row constructors with many columns is that it can be difficult to keep the lists in the two constructors in sync. This problem exists today already with INSERT-SELECT. In my document on strict checks, I discuss ideas how SQL Server could help us from goofs here.

Regular Expressions

If you have any experience of regular expressions from the Unix world, Perl or the RegEx classes in .Net, you probably don't find LIKE in SQL very powerful. On top of that it uses wildcard characters that are a bit, well, unusual. SQL:1999 has a more powerful operator SIMILAR TO, but it's still not in par with the real McCoy, and it still uses the odd wildcard characters. Microsoft could implement SIMILAR TO while they are at it, but the prime importance is that we get something that draws on the Unix traditions to bring the full power of regular expressions to the SQL Server users.

A full implementation would not only permit us to use a more powerful LIKE, but also an enhanced replace(), so that you can use regular expressions for replacements and refer back to the matches in the replacements.

I'm not including an example of how this would look like, because if you already know about regular expressions you don't need any examples to be convinced. And if you never have seen one, you would only be frightened. :-) Admittedly, regular expressions have a bit of a learning curve, and they are not always pretty. But they are extremely powerful.

SQL Server MVP Simon Sabin has submitted an item entitled Regex functionality in pattern matching. Alternate items are Add built-in support for regular expressions in T‑SQL (Steven Twitchell, Microsoft), Support POSIX-style regular expressions in T‑SQL (yours truly), Introduce more regular expressions functions to T‑SQL (unknown), Support for regular expressions (unknown) and RegEx Column Property for Validation (Eidolon).

Type the Literal!

Today there are three base types of literals in T‑SQL: numbers, strings and binary strings (those starting with 0x). This means that for data types like date, uniqueidentifier etc, there is no way you can achieve a literal of that type. Add to this that it's more or less a science trying to figure out what is the data type of a numeric literal. You can work around this by using cast() or convert(), but it is quite bulky. SQL Server 2012 added a couple of new functions which are specifically geared for the date/time data types, for instance datefromparts(). And while they are welcome additions, you cannot say that datetimeoffsetfromparts() with its nine arguments is short and slender.

ANSI SQL suggests a path where you can prefix the literal with the type, for instance DATE '2008-09-21', a bit in analogy how we do it with Unicode literals today. SQL Server MVP Steve Kass has suggested this in the item Support syntax for typed literal values. While a bit of syntactic sugar, it could help to make our code more concise.

Remove Illogical Restrictions for IF EXISTS()

When you have a query in IN/WHILE EXISTS(), you cannot use CTE:s or query hints, restrictions that appear as illogical. That is, you cannot say:

           SELECT * FROM tab)
   PRINT 'okiedokie'
   PRINT 'not ok';


           OPTION (MAXDOP 1))
    PRINT 'yes';
    PRINT 'no';

The workaround is certainly simple: run a SELECT that assigns the result into a variable, and the test on the variable. But nevertheless the inconsistency sticks out like a sore thumb. And, assume that you are in a stressful situation where you need to address a performance problem with an IF EXISTS query. You realise that the MAXDOP hint is what you need. Certainly you get far less nervous if you could only insert the query hint, rather than having to actually edit the code to move the query, add a variable and all that.

SQL Server MVPs Aaron Bertrand and Hugo Kornelis have submitted the items Query hints not allowed after or inside IF EXISTS() and Allow Common Table Expressions in all query expressions.

QUALIFY for More Concise Queries

It is very common to use row_number (or rank or dense_rank) to number set of rows and then filter on the numbering to get the most recent row(s). To do this, you always need to use a CTE or a derived table, and put the row-number filter in the outer query. The reason for this is that an SQL query is evaluated in this order:


Since the ranking is defined in the SELECT list, you cannot refer to it in WHERE, only in SELECT and ORDER BY.

While this is not very difficult to write, it still makes the code more tedious. Teradata implements a fourth filtering clause beside ON, WHERE and HAVING, which is evaluated after SELECT and before ORDER BY. They call it QUALIFY. While QUALIFY does not bring any new query powers, it opens the possibility to make queries more concise, since the need to use nested queries is reduced. Therefore it would be a very welcome addition to SQL Server.

SQL Server MVP Itzik Ben-Gan has submitted the item QUALIFY – Request for a New Filtering Clause. His item also includes a more detailed discussion on QUALIFY as well as examples.

Dynamic Pivot

This is something I do very rarely myself, but there are questions about dynamic pivot about every day on the forums, so I don't need to be convinced that there is a business case. What it is all about is that you want to make rows into columns for a report. This can easily be done with this pattern:

SELECT group_item,
       MAX(CASE WHEN month = 1 THEN qty END) AS JanQty,
       MAX(CASE WHEN month = 2 THEN qty END) AS FebQty,
       MAX(CASE WHEN month = 12 THEN qty END) AS DecQty
FROM   tbl
GROUP  BY group_item

You can also use the PIVOT operator, but I find it more difficult to use without any actual benefit. Whichever method you use there is an important restriction: you need to know in advance which columns – their names and how many there are of them – your query will produce. But in many cases, this is information that is not known until run-time. If there are sales numbers for five products, you want one column for each product. If there are 14 products, you want 14 columns.

One thing is important to observe: this can never be a regular SELECT, because a SELECT statement returns a table, and a table has a known set of columns. This makes it possible to reuse a SELECT as the inner part of a bigger query. Thus SQL Server would need to use a different keyword, for instance TRANSFORM (which is the command for dynamic pivot in in Access), and this command would only be able to return data to the client.

In the early days of Connect, Pich submitted an item simply entitled PIVOT. Other item are MSFT: MADDOG – Pivot should allow dynamic column generation DCR and Pivot function.

IS [NOT] DISTINCT FROM – no longer a wishlist item

This has been in the product since SQL 2005, but I was not aware of it. So this is no longer really a wishlist item. However, the section still remains as you may want to learn the trick.

Say that you want to compare the same tables in two databases with the same schema and find all rows for which there are differences. You may write something like this:

FROM   db1..tbl a
JOIN   db2..tbl b ON a.keycol = b.keycol
WHERE  (a.col1 <> b.col1 OR
        a.col1 IS NULL AND b.col1 IS NOT NULL OR
        a.col1 IS NOT NULL AND b.col1 IS NULL)
   OR  ... -- The same expression for all other columns.

You write it this tediously, because the plain <> condition will not return rows where only one of the columns are NULL, since NULL is never equal or unequal to another NULL, but NULL <> NULL yields UNKNOWN.

I only recently learnt from Diethard Michaelis that you can instead write the above as:

FROM   db1.tbl a
JOIN   db2.tbl b ON a.keycol = b.keycol
                   SELECT b.col1, ...)

Which certainly is compact enough. Furthermore, as SQL Server MVP Paul White discusses this more closely in this blog post, the optimizer is able to use indexes efficiently.

SQL:1999 defines the operators IS DISTINCT FROM and IS NOT DISTINCT FROM which are the same as <> and = respectively, except that they also apply to NULL values. With this operator, the above becomes:

FROM   db1.tbl a
JOIN   db2.tbl b ON a.keycol = b.keycol
   OR  a.col2 IS DISTINCT FROM b.col2
   OR  ... 

Personally, I think [NOT] EXISTS (SELECT ... INTERSECT ... SELECT) is good enough, and I think there are bigger fishes to fry than IS [NOT] DISTINCT FROM, so this no longer a wishlist item as such.

DDL Improvements

Server-Side CLR Compiles

The ability to create stored procedures and functions in .Net languages was one of the most spectacular additions to SQL 2005. And used correctly, this is also very powerful. One problem, though, is that it's fairly bulky to create a CLR module. You first have to compile your C#/VB code, load it into SQL Server with CREATE ASSEMBLY, and finally run CREATE PROCEDURE/FUNCTION/etc to create the object. Visual Studio can perform these steps for you, and if you create a larger assembly from which you define a whole number of procedures and functions that may be alright. But if all you want to do is to write a 50-line utility function, using Visual Studio is even bulkier than using the command line. Not the least if you are a DBA sort of person, who don't have Visual Studio installed at all.

It would be so much more convenient if you could say:

CREATE PROCEDURE my_proc @par1 int,
                         @par2 int WITH
using System;
using System.Data;
using System.Data.SqlClient;

SQL Server MVP Greg Low has submitted the very compelling item Add server-side compilation ability to SQL CLR about this.

Note: in the appendix to my article Packaging Permissions in Stored Procedures, I have a chapter Loading CLR Objects where I present a way to do this in current versions. However it is a far cry from what Greg suggests. Not the least are practical problems to employ my idea in real life, as I discuss in the chapter.

Synonyms for Databases and Linked Servers

Synonyms were added in SQL 2005, and they can be useful. However, maybe the most common use for synonyms is when your application is split up on over several databases, so get your cross-databases queries running you can set up a synonym for every table, and a setup script to run with that. This is a bit roundabout, it would be simpler if all you had to do was to set up a synonym for the other database, and then you would be done. SQL Server MVP Aaron Bertrand has submitted an item with the title Expand synonym to other entities (database, linked server).

Application-Time Periods for Temporal Tables

One of the hot things in SQL 2016 was temporal tables, but exactly how hot were they? Well, not as hot as they could have been.

To wit, the current tables are good for auditing, when you want to save the every state a row has been in. But beyond that? Not much, because what says that they actual time you want to track agrees with the time the data was entered in SQL Server? Say for instance a table that holds the interest rate for a loan. If a change in interest rate is decided in advance, it is very impractical if an operator needs to remember to enter the new interest rate on the very day. And I remember once when my bank had applied an interest-rate increase retroactively and I got a little upset. I was able negotiate the bank to move the change date from the past to the current date. You cannot do that (easily) with a temporal table in SQL Server today. And this is true for most situations where you want to capture a period beyond plain auditing. If this is what you want, you will need to keep on rolling your own and give temporal tables a miss.

However, it does not really have to be that way. Temporal tables is nothing that Microsoft have invented on their own, but it all comes from the SQL:2011 standard, which also specifies something known as application time. Here is a quick example of how to define table with application time:

   EStart DATE,
   EEnd DATE,
   PERIOD FOR EPeriod (EStart, EEnd)

This table gives us the different departments an employee belonged to in different points in time. The PERIOD FOR is the application-defined period. To find out which department an employee was on a certain day, we can use this query:

SELECT Ename, Edept
WHERE ENo = 22217 AND 
      EPeriod CONTAINS DATE '2011-01-02'

Note: this query is taken from the paper I refer to below and which was written by DB2 people. The appearence of DATE here is a typed literal as discussed above.

There are a whole lot more rules and syntax for UPDATE, DELETE, constraints etc that I don't cover here, but if you are curious there is a paper Temporal features in SQL:2011 which discusses this in details. The same paper also discusses system-time and bitemporal tables that have both application-time and system-time periods.

There is a feedback item for this matter, submitted on the old Connect site, but the name/alias of the submitter have been lost, Add Application-period temporal tables and Bitemporal tables to SQL Server 2016.

BEFORE triggers or Enhanced INSTEAD OF Triggers

SQL Server has AFTER and INSTEAD OF triggers, the latter for DML only. Other products have BEFORE triggers which is a more natural thing. INSTEAD OF triggers, mainly introduced to make more views updatable, are a little funky as they require you to redo the command. This can introduce a maintenance problem and sometimes other difficulties as well. With BEFORE triggers there are no such problems. Then again, maybe INSTEAD OF triggers could be enhanced?

Let's first look at why you would want anything else than plain AFTER triggers:

  1. You want to prevent the operation altogether, without having to take the cost of an update+rollback of millions of rows. This is even more significant for DDL, for instance if you want to prevent index creation on large tables cxuring business hours.
  2. You want to perform validations before any updates are physically performed to avoid cost for update+rollback.
  3. You want to modify the inserted/updated rows, for instance to set a last_updated column without having to take the cost of an extra update in an AFTER trigger. (This applies even if the suggstion about auto-updated columns above is implemented, since the update may be such that it cannot be expressed declaratively.)
  4. You want to do cascading deletes that cannot be done with FOR CASCADE DELETE, due to the many restrictions with cascading FKs in SQL Server.
  5. You want to extend the statement to rows not directly affected, or remove some rows to leave them unaffected – or something even wilder.
  6. For DDL, you may want to alter the statement to add some extra clause, for instance WITH ONLINE for index creation.

For a long time I was sure that BEFORE triggers was the solution, but I'm starting to change my mind. If Microsoft adds BEFORE triggers, they will also have determine in which order BEFORE triggers, INSTEAD OF triggers and constraints fire. Some of these rules can be taken from the SQL standards, but these rules may not be intuitive. Or more to the point, whatever the rules are, they will unintuitive to some people, because they have different expectations. So I'm leaning towards that it is better to enhance the existing INSTEAD OF triggers. Let's look at the list above:

  1. You can do this with INSTEAD OF triggers today for DML, but we don't have INSTEAD OF triggers for DDL. This could be done with BEFORE triggers as well, would they be implemented.
  2. This can be done with INSTEAD OF triggers, but we need to redo the statement in the trigger. This as such is not that bad, but the problem for INSERT and UPDATE is that you need to include all columns, which means that if you add more columns to the table, you need to remember the change the trigger. If there was a way to specify the redo without having enumerate the columns, this would not be an issue. A BEFORE trigger will always be smoother, since you don't need to say redo at all – it's done for you.
  3. This can be done with INSTEAD OF triggers today, but as noted there is a maintenance problem. For this to be possible with BEFORE triggers, you would need to be able to modify the inserted table, which is likely to have a performance penalty compared to INSTEAD OF triggers where you produce a direct operation on the target table.
  4. This can be done with INSTEAD OF triggers today. If it can be done with BEFORE triggers, depends on when the BEFORE trigger fires – if it fires after the constraints, you lose.
  5. Yes, you can do this with INSTEAD OF triggers, and this is sort of their raison d'être, as they are intended to permit operation on views that are not directly updatable. For this to be possible from BEFORE triggers, you need to be able to delete or insert from inserted and deleted, which sounds very wild.
  6. If there was INSTEAD OF DDL triggers, we could easily do this. But with BEFORE DDL triggers, it is not likely that we would be able to modify the statement executed.

So I think INSTEAD OF comes out as the winner, but if Microsoft thinks it is easier for them to achieve this with BEFORE triggers, that would be alright.

There are a number of feedback items in this area.:

ALTER TABLE Improvements

In the very early days of Connect, in fact in the system that pre-dated Connect, an item that rose to the top was an item calling for ALTER TABLE syntax to change the column order. This item does not seem to have made it to Uservoice. However "Steve" has submitted a new item on this theme entitled T‑SQL for changing table column order. This is a request that some people frown upon, because they think column order should not matter, something I am in complete disagreement with. When I document a table, I want to describe the columns in the order they appear in the table, so you easily can map to what you see with SELECT *. While you should not have SELECT * in your application code, SELECT * is still very practical when you run ad-hoc queries from SSMS. Today, if you want to change the colunm order of a table, this is a tedious and slow operation, since you need to copy the table to a new table. And remember to re-add triggers, constrtains, indexes and referencing foreign keys. When it could be a simple metadata operation.

What if you realise that you have a table with the IDENTITY property, and that was a mistake? Or the other way around, you should have made that column an IDENTITY from the beginning? Strangely enough, you can use ALTER TABLE to change this in SQL Server Compact Edition. But in the "big" SQL Server your only option is to drop or reload the table. SQL Server MVP Adam Machanic has submitted the item Allow enabling and disabling of a column's IDENTITY property to call for a change. You may think that with the introduction of sequences in SQL 2012, this request is no longer equally important, but I say it's the other way round. You may want to move over to use sequences, but you may find that recreating and reloading your big table with IDENTITY is a roadblock. A second item with the same idea is ALTER COLUMN IDENTITY from someone whose alias was lost when moving to Uservoice.

Note: There is a workaround that permits you to add/remove IDENTITY without having copying all data to a new table. Create a new table with exactly the same schema, save the addition/removal of IDENTITY. Then you can use ALTER TABLE SWITCH to swap the tables with each other. This command is normally used with partitioned tables, but both tables can be unpartitioned. You will need to drop and re-add referencing foreign keys.

A small thing, but nevertheless. When you change or drop a column you say ALTER TABLE tbl DROP/ALTER COLUMN, but when you add a column, COLUMN is not even permitted in the syntax! The item Add COLUMN keyword to the ADD COLUMN SYNTAX (as alternate), submitted by SQL Server MVP Louis Davidson calls for a rectification for this.

Also, it is difficult not be sympathetic with the suggestion Alter table – Nullability of column [Remove requirement to specify datatype]. That is, if all you want to do is to change NULL/NOT NULL for a column, you should not have to specify the data type with the risk of getting it wrong.

Revision History

Updated the section Really Small Things with the recent status in SQL 2019 and SQL 2016 for the replacement message for String or binary data would be truncated.
Updated the text to reflect that SQL 2019 offers inline scalar UDF, and the awful message String or binary data would be truncated has been improved in SQL 2019 and SQL 2017 and with a fix in SQL 2016 under way.
This update was sparked by the move from the old Connect to the new feedback site on Uservoice, which required me to update all links in the article. Since Microsoft only migrated active items, some Connect items were lost. For some I submitted new items, but in other cases I simply dropped the item from the list. While I was at it, I also added some new items. Here are these changes in summary:
Added one more Connect item in the section Get the Call Stack.
Added one more Connect item to the section on BEFORE triggers.
Added a note to the section on table of numbers that PostgreSQL has a function for this task. Rewrote the section on immutable column values to incorporate a Connect item from SQL Server MVP Adam Machanic.
Updated the wishlist after the release of SQL 2016. Mainly an update of status of some items and some rearrangement of order. One new item calls for make it possible to get all error messages in a CATCH handler. The wish for IS [NOT] DISTINCT FROM has been withdrawn, since there is a good workaround as explained in the article. For some items, I have addded new Connect entries.
Added a new item Add a Virtual Errors Table.
Updated the list with regards to the old bugs with numbers starting in the 12xxxxx range. They are now visible for everyone – but voting is not enabled for them.
Released the SQL 2012 version of the wishlist.

Back to my home page.