Where Is that Table Used?

Finding References in Your Database

An SQL text by Erland Sommarskog, SQL Server MVP. Last revision: 2020-05-24.
Copyright applies to this text. See here for font conventions used in this article.

Introduction

It is common to see people in SQL Server forums asking how to find references to a certain table or column, or what tables and columns that are accessed by a stored procedure, or some variation of these questions. In this article, I will explore the available options, including some utilities that I have written myself over the years. None of the alternatives I present is good for all situations, but you will often have to work with more than one of them.

In summary, here are the options we will look at:

This article is applicable to all SQL Server versions from SQL 2008 and up. We will work with a very simple non-sensical demo database:

USE tempdb
go
IF db_id('dependdemo') IS NOT NULL
   DROP DATABASE dependdemo
CREATE DATABASE dependdemo 
go
USE dependdemo
go
CREATE TABLE tableone   (colone int NOT NULL, coltwo int NULL, colthree int NULL)
CREATE TABLE tabletwo   (colone int NOT NULL, coltwo int NULL, colthree int NULL)
CREATE TABLE tablethree (colone int NOT NULL, coltwo int NULL, colthree int NULL)
go
CREATE VIEW aview AS
   SELECT colone + colthree AS col31 FROM tableone
go
CREATE PROCEDURE firstproc AS 
CREATE TABLE #temp (a int NOT NULL)

UPDATE tableone
SET    colone = 14
WHERE  coltwo = 19

SELECT two.colone, two.coltwo
FROM   tabletwo two
JOIN   #temp t ON two.colone = t.a

EXEC('SELECT colone, coltwo FROM tablethree')
go
CREATE PROCEDURE otherproc AS
   SELECT * FROM aview
   EXEC dbo.firstproc
go
EXEC otherproc

The Built-in Dependency Views

SQL Server offers one stored procedure five catalog views and DMVs to track dependencies:

sp_depends
sys.sysdepends
The only option up to SQL 2000. Now obsolete, and I will not discuss them furthter.
sys.sql_dependencies Introduced in SQL 2005, but now officially deprecated.
sys.sql_expression_dependencies
sys.dm_sql_referenced_entities
sys.dm_sql_referencing_entities
Introduced in SQL 2008 and the official recommendation from Microsoft.

While sys.sql_dependencies is deprecated, I still find it worthwhile to explore what it can offer. The query below gives you all dependencies in the database. In practice, you would add a WHERE clause to filter for the table, column or stored procedure you are interested in for the moment. (And this is true for most queries we will look at in this article.)

SELECT o.name AS procname, o2.name AS tbl, c.name AS col, d.* 
FROM   sys.sql_dependencies d
JOIN   sys.objects o  ON d.object_id           = o.object_id
JOIN   sys.objects o2 ON d.referenced_major_id = o2.object_id
LEFT   JOIN  sys.columns c ON c.object_id = o2.object_id
                          AND c.column_id = d.referenced_minor_id

Here is the result set. For space reasons, I have split up the result set into two sections and left out one column class_desc, for which the value is always OBJECT_OR_COLUMN_REFERENCE_NON_SCHEMA_BOUND.

procname  tbl       col      class object_id column_id

--------- --------- -------- ----- --------- ---------

aview     tableone  colone   0     949578421 0

aview     tableone  colthree 0     949578421 0

firstproc tableone  colone   0     965578478 0

firstproc tableone  coltwo   0     965578478 0

otherproc aview     col31    0     981578535 0

otherproc firstproc NULL     0     981578535 0

 

referenced_major_id referenced_minor_id is_selected is_updated is_select_all

------------------- ------------------- ----------- ---------- -------------

901578250           1                   1           0          0

901578250           3                   1           0          0

901578250           1                   0           1          0

901578250           2                   1           0          0

949578421           1                   0           0          1

965578478           0                   0           0          0

You can see that the dependency from the view aview to tableone is fully recorded. Next you can see that firstproc refers to two of the columns in tableone, and thanks to the column is_updated you can see that the column colone is updated by the procedure.

However, the other two tables are not listed. The reason that the dependency on tabletwo is missing is the temp table. When SQL Server creates the procedure, the temp table does not exist, and therefore SQL Server abandons analysis of the query with #temp and tabletwo. The unfortunate consequence of this is that the dependency to tabletwo is not recorded. It is not really surprising that the reference to tablethree is missing, since it is hidden in a string literal which executed through dynamic SQL. Technically, the dynamic SQL is not part of the procedure, but it is a nameless stored procedure on its own. Nevertheless, from a source-code maintenance perspective, we are likely to count firstproc as referring to tablethree, and in an ideal world, we would like to see this reference listed.

Note: the limitation with temp tables does not apply to table variables, as they are declared entities. If you replace #temp above with a table variable, you will find that the dependency on tabletwo is fully recorded.

Finally, you can see the references from otherproc to aview and firstproc. Note that for the reference to aview, the column is_select_all is 1 which reflects that the column reference is by SELECT *.

There are a few more limitations with sys.sql_dependencies. Cross-database and cross-server dependencies are not recorded at all so you cannot tell if a procedure is referencing things outside the database. Another issue is illustrated if you run this batch:

DROP TABLE tableone
CREATE TABLE tableone   (colone int NOT NULL, coltwo int NULL, colthree int NULL)

and then run the query above again. You will find that the result set consists of two rows only: the references from otherproc to aview and firstproc. Since sys.sql_dependencies records references by object id, all references were lost when tableone was dropped, and they were not magically restored when the table was recreated. To restore the information, you would also need to recreate firstproc and aview or run sp_refreshsqlmodule and sp_refreshview on them.

It was to address these shortcomings that Microsoft introduced sys.sql_expression_dependencies and the two DMVs in SQL 2008. Let's take a look of what they return, starting with sys.sql_expression_dependencies:

SELECT o.name AS procname, ed.* 
FROM   sys.sql_expression_dependencies ed
JOIN   sys.objects o ON ed.referencing_id = o.object_id

This query returns quite a few more columns than the previous one. As I am not going to cover all of them, I have left out several columns:

procname  referencing_id referencing_class_desc referenced_class_desc

--------- -------------- ---------------------- ---------------------

aview     949578421      OBJECT_OR_COLUMN       OBJECT_OR_COLUMN

firstproc 965578478      OBJECT_OR_COLUMN       OBJECT_OR_COLUMN

firstproc 965578478      OBJECT_OR_COLUMN       OBJECT_OR_COLUMN

otherproc 981578535      OBJECT_OR_COLUMN       OBJECT_OR_COLUMN

otherproc 981578535      OBJECT_OR_COLUMN       OBJECT_OR_COLUMN

 

referenced_server_name referenced_database_name referenced_schema_name

---------------------- ------------------------ ----------------------

NULL                   NULL                     NULL

NULL                   NULL                     NULL

NULL                   NULL                     NULL

NULL                   NULL                     NULL

NULL                   NULL                     dbo

 

referenced_entity_name referenced_id referenced_minor_id

---------------------- ------------- -------------------

tableone               997578592     0

tableone               997578592     0

tabletwo               917578307     0

aview                  949578421     0

firstproc              965578478     0

This view records dependencies by name, so therefore it does not matter that we dropped and recreated tableone. The dependency is still recorded. You may also note that the dependency from firstproc to tabletwo is included, so the temp table is not a limitation here. The reference to tablethree is still missing, but that is more or less to be expected.

Pay attention to the columns referenced_server_name and referenced_database_name. They are all NULL here, since we do not have any cross-server or cross-database references, but this view is perfectly capable to record them. You may also note that the column referenced_schema_name is only populated for the reference from otherproc to firstproc – the only place in the database creation script where the dbo schema is specified explicitly.

An interesting property of this view is that it can help you to find orphaned references. Test dropping tableone again and run the query above before creating the table again. The reference is still there, but the column referenced_id is now NULL, indicating that this is a reference to a non-existing object.

In difference to sys.sql_dependencies, there is no information about column references here, but we only see references on view/table level. (There is a referenced_minor_id in this view, but it appears to be non-zero only for references like CHECK constraints and indexes.)

To find column references, you need to use the DMV sys.dm_sql_referenced_entities, which accepts a schema-qualified name as input and gives you a list of objects that the input references. Here is a query that runs the DMV for all objects in the database. (Make sure that you have recreated tableone before you run this query):

SELECT s.name + '.' + o.name AS referrer, re.*
FROM   sys.objects o
JOIN   sys.schemas s ON o.schema_id = s.schema_id
CROSS  APPLY sys.dm_sql_referenced_entities(s.name + '.' + o.name, 'OBJECT') re

I have removed some columns from the output, including referenced_server_name and referenced_database_name, since they are all NULL in this example, but this DMV is good for finding cross-server and cross-database references.

referrer       referenced_schema_name  referenced_entity_name referenced_minor_name

-------------- ----------------------- --------------------------------------------

dbo.aview      NULL                    tableone               NULL

dbo.aview      NULL                    tableone               colone

dbo.aview      NULL                    tableone               colthree

dbo.firstproc  NULL                    tabletwo               NULL

dbo.firstproc  NULL                    tableone               NULL

dbo.firstproc  NULL                    tableone               colone

dbo.firstproc  NULL                    tableone               coltwo

dbo.otherproc  dbo                     firstproc              NULL

dbo.otherproc  NULL                    aview                  NULL

dbo.otherproc  NULL                    aview                  col31

 

referenced_id is_caller_dependent is_ambiguous is_selected is_updated

------------- ------------------- ------------ ----------- ----------

901578250     0                   0            1           0

901578250     0                   0            1           0

901578250     0                   0            1           0

917578307     0                   0            0           0

901578250     0                   0            0           1

901578250     0                   0            0           1

901578250     0                   0            1           0

965578478     0                   0            0           0

949578421     0                   0            0           0

949578421     0                   0            0           0

is_select_all is_all_columns_found is_insert_all is_incomplete

------------- -------------------- ------------- -------------

0             1                    0             0

0             1                    0             0

0             1                    0             0

0             0                    0             0

0             1                    0             0

0             1                    0             0

0             1                    0             0

0             0                    0             0

1             1                    0             0

1             1                    0             0

If you look at the lines for aview, you see that there is first a row for the reference to tableone as such, and then there are two more rows for the two columns in tableone included in the view. The same is true for the reference from firstproc to tableone and the reference from otherproc to aview. However, for the reference from firstproc to tabletwo, we only have the reference to the table itself, but the references to the columns are not included. Just like with sys.sql_dependencies, this is due to the temp table and deferred name resolution. So while the new features in SQL 2008 resolved some of the limitations with the older view, this issue still remains. As does the limitation with dynamic SQL – tablethree is still not listed.

There are quite a few columns that gives us information about the nature of the reference. I will only discuss some of them. The triplet is_selected, is_updated and is_select_all are the same as in sys.sql_dependencies. You also get is_insert_all which is 1 if the table appears in an INSERT statement without a column list. (Something which is generally considered bad practice.) The column is_all_columns_found relates to the problem with the temp table. That is, SQL Server will set this column to 0 when a table appears in a query with a temp table (or some other missing table), preventing full analysis of that query. For instance, you can see that this column is 0 for the reference from firstproc to tabletwo. (It is also zero for the reference between the two stored procedures, since there are no columns at all in this case.)

Note: not all columns above are present in all versions of SQL Server. On SQL 2008, the last column to appear in the result set is is_ambiguous. The next four – is_selected, is_updated, is_select_all and is_all_columns_found – were added in SQL 2012. is_insert_all was added with SQL 2016. Books Online says that is_incomplete was added in SQL 2016 SP2, but I also see it on SQL 2014 SP3, so it appears to have been backported.

sys.dm_sql_referenced_entities does not rely on persisted information, but instead it runs binding on the object you pass to it. If you only want to know what objects a certain stored procedure refers to that is alright. But if you want to use this DMV to find all references to a certain table column, you need invoke it for all modules in the database as in the query above and then filter for the column. If you have several thousands of stored procedures, this can take quite some time. You may also get error messages, because the DMV may fail for some modules. This fact in combination that it fails to return complete information for queries with temp tables makes sys.dm_sql_referenced_entities less attractive to me.

The other DMV, sys.dm_sql_referencing_entities, gives you all reference to an object. Here is a query:

SELECT s.name, o.name, re.*
FROM   sys.objects o
JOIN   sys.schemas s ON o.schema_id = s.schema_id
CROSS  APPLY sys.dm_sql_referencing_entities(s.name + '.' + o.name, 'OBJECT') re

This is the full result set:

referred_to   referencing_schema_name referencing_entity_name referencing_id

------------- ----------------------- ----------------------- --------------

dbo.tableone  dbo                     aview                   949578421

dbo.tableone  dbo                     firstproc               965578478

dbo.tabletwo  dbo                     firstproc               965578478

dbo.aview     dbo                     otherproc               981578535

dbo.firstproc dbo                     otherproc               981578535

referencing_class referencing_class_desc is_caller_dependent

----------------- ---------------------- -------------------

1                 OBJECT_OR_COLUMN       0

1                 OBJECT_OR_COLUMN       0

1                 OBJECT_OR_COLUMN       0

1                 OBJECT_OR_COLUMN       0

1                 OBJECT_OR_COLUMN       0

I will have to admit that I have not been able to figure out if this DMV provides any information which is not present is sys.sql_expression_dependencies.

Rather than querying these views directly, you can use View Depenencies from the context menu of an object in Object Explorer in SSMS. This will result to queries against the views discussed in this cahapter.

Entering Full-text

An alternative to use the built-in views is to search the procedure definitions in sys.sql_modules directly with LIKE queries. This can be a tad slow if you have thousands of stored procedures, and it can be difficult to make the searches precise. A more sophisticated approach is to build a full-text index on the code, which make these searches to be really quick.

Full-text is an optional component when you install SQL Server, so it may not be present on your instance. Run this query to find out:

SELECT serverproperty('IsFullTextInstalled')

If it returns 0, full-text is missing and you need to run SQL Server Setup to add it. Note that if you are using Express Edition, you need to have Express with Advanced Features to install full-text.

The full script for this chapter is in file SearchCode.sql. The script creates a database SearchCode and in this database the script creates two tables Objects and Lines. Objects holds the name and type of the object (stored procedure, trigger etc), while Lines holds the SQL code split up into lines. The script loads the two tables with the SQL code in the dependdemo database and then creates a full-text index on the Lines table. The file also includes two sample queries. If you want to use full-text to search your database, you can use the script as-is. You only have to replace dependdemo with the name of your database. Below I discuss the script in a little more detail.

The script starts off with creating two auxiliary objects: a Numbers table and a table-valued function tblnum_split_mstmt, that I don't discuss further here. If you are curious, you can read about them in my article Arrays and Lists in SQL Server. These are the two tables that actually hold the data:

CREATE TABLE Objects (object_id int     NOT NULL,
                      schema_   sysname NOT NULL,
                      name      sysname NOT NULL,
                      type      char(2) NOT NULL,
                      CONSTRAINT pk_Objects PRIMARY KEY (object_id),
                      CONSTRAINT u_Objects UNIQUE (schema_, name)
)
go
CREATE TABLE Lines (ident       int IDENTITY   NOT NULL,
                    object_id   int            NOT NULL,
                    linenum     int            NOT NULL,
                    linetext    nvarchar(MAX)  NOT NULL,
                    CONSTRAINT pk_Lines PRIMARY KEY (object_id, linenum),
                    CONSTRAINT u_Lines UNIQUE(ident),
                    CONSTRAINT fk_Lines_Objects FOREIGN KEY(object_id)
                        REFERENCES Objects(object_id)
)

The object_id is the object id in the source database, would you need it. schema_ has an underscore since SCHEMA is a reserved keyword in T‑SQL (and I don't like typing brackets). type is the same as in sys.objects. The purpose of the ident column in Lines will appear later. linenum is, yes, the line number, and linetext is the actual code on that line.

The Objects table is loaded this way:

INSERT Objects (object_id, schema_, name, type)
   SELECT o.object_id, s.name, o.name, o.type
   FROM   dependdemo.sys.objects o
   JOIN   dependdemo.sys.schemas s ON o.schema_id = s.schema_id
   WHERE  EXISTS (SELECT *
                  FROM   dependdemo.sys.sql_modules sm
                  WHERE  o.object_id = sm.object_id)

Keep in mind that when you load SearchCode for your own database, you should replace dependdemo with name of your own database. The same applies to the statement that loads Lines:

INSERT Lines (object_id, linenum, linetext)
   SELECT sm.object_id, ism.Pos, ism.Value
   FROM   dependdemo.sys.sql_modules sm 
   CROSS  APPLY tblnum_split_mstmt(replace(sm.definition COLLATE Latin1_General_BIN2, 
                                           char(13), ''), 
                                   char(10)) ism
   WHERE  EXISTS (SELECT *
                  FROM   dependdemo.sys.objects o
                  WHERE  sm.object_id = o.object_id)

tblnum_split_mstmt splits the input into rows on the given separator which is char(10), i.e., line-feed. Commonly, lines in the Windows world are terminated by CR-LF, but I also want to support code that was created from Unix or elsewhere and which only has LF as the line terminator. Whence, I replace carriage return with nothing at all. The reason I force a binary collation is performance – on a larger database, this makes a considerable difference. You may be puzzled by the EXISTS clause, but sys.sql_modules can include code for modules that are not in sys.objects. The one such example I know of is DDL triggers. Obviously, I could have extended the INSERT into Objects to include DDL triggers, but I decided to leave that as an exercise for the reader. Also, a DDL trigger may be there for administrative reasons and not really be part of the application, so it may not be of interest anyway.

Finally, we set up the full-text index:

CREATE FULLTEXT CATALOG fulltext WITH ACCENT_SENSITIVITY = ON AS DEFAULT
CREATE FULLTEXT INDEX ON Lines(linetext LANGUAGE 0x0) KEY INDEX u_Lines

All full-text indexes must belong to a full-text catalog, so we need to start by creating one, and since we mark it as default, we don't have to actually mention it when we create the index. There are three things we need to specify with the index, though:

In difference to a regular index, a full-text index is populated asynchronously, so it can take some time before you can use the index. For the small dependdemo database it is only a matter of seconds but for a database with several thousands of store procedures it may take up to an hour. To see whether the index build has completed, you can use the query below. Once it returns 1, the index has been fully populated.

SELECT has_crawl_completed FROM sys.fulltext_indexes

So how to you use this? Here is a sample query:

SELECT O.schema_, O.name, O.type, L.linenum, L.linetext
FROM   Objects O
JOIN   Lines L ON O.object_id = L.object_id
WHERE  CONTAINS(L.linetext, 'tablethree')
ORDER  BY O.name, L.linenum

This is the result set:

schema_  name       type linenum linetext

-------- ---------- ---- ------- ---------------------------------------------

dbo      firstproc  P    11      EXEC('SELECT colone, coltwo FROM tablethree')

This is the reference that none of the built-in views were able to find, because it was hidden in dynamic SQL.

CONTAINS is a predicate that is special for full-text. Here the search-term was a single word. But you can also search on a sequence:

SELECT O.schema_, O.name, O.type, L.linenum, L.linetext
FROM   Objects O
JOIN   Lines L ON O.object_id = L.object_id
WHERE  CONTAINS(L.linetext, '"colone coltwo"')
ORDER  BY O.name, L.linenum

This returns the same output as above. Observe that the search term has to be enclosed in double-quotes when it is more than one word in it. Note that in the code above, there is actually a comma between the two column names, but the search string does not include the comma. Full-text cares about words, not interpunctuation, so it finds the sequence anyway. One situation where it can be useful to search for sequences is to find where a table is updated; in this case you would search for UPDATE tablename.

You can also use OR:

SELECT O.schema_, O.name, O.type, L.linenum, L.linetext
FROM   Objects O
JOIN   Lines L ON O.object_id = L.object_id
WHERE  CONTAINS(L.linetext, '"colone" OR "coltwo"')
ORDER  BY O.name, L.linenum

Note that the OR should not be included in double quotes; they apply to the search terms only.

You can use * as a wild-card, but only at the end of the search-term. This finds both tabletwo and tablethree:

SELECT O.schema_, O.name, O.type, L.linenum, L.linetext
FROM   Objects O
JOIN   Lines L ON O.object_id = L.object_id
WHERE  CONTAINS(L.linetext, '"tablet*"')
ORDER  BY O.name, L.linenum

To learn more about the possibilities, look at the topic for CONTAINS in Books Online. I should add that for my own part, I very rarely find the need to search for anything else but a single search term. (Typically, the name of table, column or stored procedure.)

Full-text queries are always case-insensitive. If you have a database with a case-sensitive collation, and you only want to find hits on customer, but not Customer, you can add a traditional SQL condition to the query, for instance:

AND L.linetext LIKE '%customer%'
Here is one thing you should be aware of. You might expect this query to return several rows:
SELECT O.schema_, O.name, O.type, L.linenum, L.linetext
FROM   Objects O
JOIN   Lines L ON O.object_id = L.object_id
WHERE  CONTAINS(L.linetext, 'FROM')
ORDER  BY O.name, L.linenum

But it returns no rows at all. This is because FROM is defined as a stopword. (Also known as a noise word.) Stopwords are common words like conjunctions, prepositions, etc that would take up a lot of space in the index, but be of little benefit, since they would return so many hits. (Recall that full-text is designed for searching human-written text, not programming code.) Thus, would you have a table or a column of which the name is considered to be a stopword by full-text you would not be able to find it this way. You can review the stopwords for language 0 with this query:

SELECT *FROM sys.fulltext_system_stopwords WHERE language_id = 0

Note: it is possible to manipulate the stoplists. I have never engaged in this myself, but if you have the need, start with the topic Configure and Manage Stopwords and Stoplists for Full-Text Search.

After some time, your SearchCode database may be inaccurate with regards to the contents in your database. The easiest way is to re-run the script. That is, drop the database and start over.

Let's now look at the properties of this solution. The main advantage is that since full-text extracts words, it does not matter where a name appears. That is, references are never hidden because of dynamic SQL (as long as you don't construct the names themselves dynamically!) nor are they hidden due to deferred name resolution with temp tables.

But there are also drawbacks:

In the end, full-text is a not replacement for the built-in capabilities in SQL Server, but rather a supplement.

Note: you may find that running the script takes about ten seconds. What takes time is the loading of the Numbers table, which I have not bother to optimise. When you load SearchCode for a real production database, it will take a lot more time to load Objects and Lines anyway.

Using the Plan Cache

This is an approach that Paul McMillan made me aware of. The idea is that you find references by interrogating the plan cache in SQL Server. Here is a query, adapted from a post on StackOverflow by Stephen Turner that Paul pointed me to. Before you run it, recreate dependdemo from the script to be sure to have it fresh.

WITH XMLNAMESPACES (
    'http://schemas.microsoft.com/sqlserver/2004/07/showplan' as ns1
), CompiledPlan AS (
    SELECT qp.query_plan, est.objectid
    FROM   sys.dm_exec_cached_plans cp
    CROSS  APPLY sys.dm_exec_plan_attributes(cp.plan_handle) pa
    CROSS  APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
    CROSS  APPLY sys.dm_exec_sql_text(cp.plan_handle) est
    WHERE  pa.attribute = 'dbid'
      AND  pa.value = convert(sql_variant, db_id())
), ColumnReferences AS (
    SELECT DISTINCT
        objectid,
        p.query('.').value('./ns1:ColumnReference[1]/@Database', 'sysname') AS [Database],
        p.query('.').value('./ns1:ColumnReference[1]/@Schema', 'sysname') AS [Schema],
        p.query('.').value('./ns1:ColumnReference[1]/@Table', 'sysname') AS [Table],
        p.query('.').value('./ns1:ColumnReference[1]/@Column', 'sysname') AS [Column]
    FROM CompiledPlan
        CROSS APPLY query_plan.nodes('//ns1:ColumnReference') t(p)
)
SELECT s.name + '.' + o.name AS Module, 
       [Database], [Schema], [Table], [Column]
FROM   ColumnReferences CR
LEFT   JOIN  (sys.objects o 
              JOIN  sys.schemas s ON o.schema_id = s.schema_id)     
     ON CR.objectid = o.object_id
WHERE  [Database] IS NOT NULL
  AND  [Schema] <> '[sys]'

Before I show the result, first a quick discussion about the query. The query starts with sys.dm_exec_cached_plans and uses the plan_handle to lookup data in other DMVs. From sys.dm_exec_plan_attributes, we find the originating database for the plan, so that we can filter out plans from queries in other databases. From sys.dm_exec_query_plan we get the plan itself, which we query for references in the next CTE. Finally, from sys.dm_exec_sql_text, we get the object id which we translate to an object name in the final query. We use LEFT JOIN here, since object id is NULL for an ad-hoc query. The condition on Database serves to filter out references to things like Expr1003 that are temporary values inside the plan and which are of no interest in this context. This filter also removes references the temp table #temp in firstproc. The filter on Schema removes reference to system views, which you may not be interested in, but that can add quite some noise.

This is the output from the query:

Module         Database      Schema  Table         Column

-------------- ------------- ------- ------------- ---------

NULL           [dependdemo]  [dbo]   [tablethree]  colone

dbo.otherproc  [dependdemo]  [dbo]   [tableone]    colthree

NULL           [dependdemo]  [dbo]   [tablethree]  coltwo

dbo.otherproc  [dependdemo]  [dbo]   [tableone]    colone

dbo.firstproc  [dependdemo]  [dbo]   [tableone]    coltwo

dbo.firstproc  [dependdemo]  [dbo]   [tableone]    colone

dbo.firstproc  [dependdemo]  [dbo]   [tabletwo]    coltwo

dbo.firstproc  [dependdemo]  [dbo]   [tabletwo]    colone

For firstproc, you get the references to tableone and tabletwo, including the columns. The temp table does not pose a problem here. The references to tablethree are also there, but we don't see a procedure name. As I mentioned previously, a batch of a dynamic SQL is technically a nameless stored procedure of its own.

Note: to actually see the reference to tablethree, you may need to run otherproc a second time. When the configuration option optimize for ad hoc workloads is set to 1 (which is generally seen as best practice), the plan for an unparameterised query outside a real stored procedure is not cached until it is executed for a second time.

When it comes to the references from otherproc, notice that they are listed against tableone and not aview. A view in SQL Server is just a macro which is replaced with its definition before optimisation, so you will generally not see view references with this query. (The one exception is if the optimizer has matched a query against an indexed view.) Since there are no plans for calls to stored procedures, we do not see the dependency from otherproc to firstproc here.

In the original question on Stack Overflow, the poster wanted to know which tables and columns a certain stored procedure touches. And as long as you are not concerned by columns in views being replaced by the underlying table columns, the query above serves that purpose well. Run the procedure (so that you know that there is a plan for it in the cache) and add a filter for it in the query above. If you want to see all references made by the procedure and everything it invokes, directly or indirectly, clear the plan cache with DBCC FLUSHPROCINDB (you need to pass the id for the database as parameter), run your procedure, and use the query above without any filter. Just be sure that you are on a test/dev server, so that you don't flush the cache for a production database!

On the other hand, if you want to know where a certain table or column is used, this is not a very reliable method, since the plan for a stored procedure may not be in the cache for a number of reasons:

But there is a flip side to this as well, as the plan cache will tell you that the references you see are active references. Say that you are looking for references to table_that_smells_funny, and the other methods show you that dusty_old_procedure refers to this table. But when you look in the plan cache, you don't see this reference. This may indicate that you can drop both the table and the procedure. (Although, you probably need to make more investigations to be certain.)

Also, it is worth pointing out that this is the only method so far that can show you references from client code that submits SQL queries directly. You cannot see where the references come from, and they could originate from someone playing around in SSMS. But at least you see that a certain table and column is used in a query somewhere. (You could extend the query above by adding the text column from sys.dm_exec_sql_text to see the actual query text.)

Yet an interesting property of this method is that if you want to find references to a table from other databases, you can do so if you tweak the query a little bit. Remove the line

 AND  pa.value = convert(sql_variant, db_id())

from the first CTE, and instead bring this value down to the SELECT list of the final query:

db_name(convert(smallint, dbid))

to display the name of the originating database. You probably what to filter the column Database against db_name() in this case.

Finally, you should be aware of that this query is not exactly lean on resources, and it is not really apt to run on a busy production system. At the same time, on an idle test system the plan cache may be thin, so the production system is where to go to get useful data. The best alternative may be to dump the first CTE to a table and then copy that table to an instance where you can run heavy queries without disturbing production.

Using Query Store

An alternative to use the plan cache on SQL 2016 or later is Query Store. Inspired by Stephen Turner's query, I wrote this script:

DROP TABLE IF EXISTS #plans

SELECT TRY_CAST (p.query_plan AS xml) AS query_plan, q.object_id
INTO   #plans
FROM   sys.query_store_plan p
JOIN   sys.query_store_query q ON q.query_id = p.query_id
WHERE  try_cast(p.query_plan AS xml) is not null

; WITH XMLNAMESPACES (
    'http://schemas.microsoft.com/sqlserver/2004/07/showplan' as ns1
), ColumnReferences AS (
    SELECT DISTINCT
        p.object_id,
        CR.p.query('.').value('./ns1:ColumnReference[1]/@Database', 'sysname') AS [Database],
        CR.p.query('.').value('./ns1:ColumnReference[1]/@Schema', 'sysname') AS [Schema],
        CR.p.query('.').value('./ns1:ColumnReference[1]/@Table', 'sysname') AS [Table],
        CR.p.query('.').value('./ns1:ColumnReference[1]/@Column', 'sysname') AS [Column]
    FROM  #plans p
    CROSS APPLY p.query_plan.nodes('//ns1:ColumnReference') CR(p)
    WHERE p.query_plan IS NOT NULL
)
SELECT s.name + '.' + o.name AS Module, 
       CR.[Database], CR.[Schema], CR.[Table], CR.[Column]
FROM   ColumnReferences CR
LEFT   JOIN  (sys.objects o 
              JOIN  sys.schemas s ON o.schema_id = s.schema_id)     
     ON CR.object_id = o.object_id
WHERE  CR.[Database] IS NOT NULL
  AND  CR.[Schema] <> '[sys]'

To test it, you first need to enable Query Store and run otherproc again before you run the query above.

ALTER DATABASE CURRENT SET QUERY_STORE = ON
EXEC otherproc

If you are on SQL 2016 or SQL 2017, the output should be the same as for the query against the plan cache. If you are on SQL 2019 or later, you will probably only see the references from firstproc. This is because there were some changes in SQL 2019 to what Query Store captures by default. Rather than capturing exactly everything, it now skips very trivial queries that only run occasionally. But if you execute otherproc repeatedly, you will eventually get eight rows back from the query above on SQL 2019 as well.

The advantage with using Query Store over the plan cache is that the data is persisted and survives a restart of SQL Server. This does not mean that Query Store is perfectly reliable for finding references. It is after all designed for a different purpose: to be an aid for performance tuning. Here are a couple of things to consider:

Performance is also a consideration. You may note that rather than using CTEs all the way, I have introduced an intermediate temp table. In my first version I did not have this temp table, but I found that even for the very small dependdemo database, the query could run for more than ten seconds. If your system has many different stored procedures or ad-hoc queries, there are a lot of plans to get through, much more than in the plan cache, since there can be many plans for the same query. So don't run the query above in production, but restore a backup to an idle test system. Or even better, use DBCC CLONEDATABASE which gives you a database with metadata and Query Store but no sensitive table data and copy that small database to your laptop or wherever.

What AbaPerls Can Do for You

In this section I will briefly discuss two possibilities that comes with my AbaPerls toolset. AbaPerls is not exactly to get started with, but you can use the first tool that I discuss, VCDBLOAD, without embracing the rest of AbaPerls. This is not exactly true for the second option.

VCDBLOAD gives you the same thing as SearchCode: a full-text indexed database where you can search your source code. But instead of loading the code from sys.sql_modules in the database, VCDBLOAD loads the source code from your version-control system (which must be TFS or SourceSafe, the only two I support). This is not restricted to SQL code, but VCDBLOAD loads all non-binary files it finds to the full-text indexed database. Furthermore, for languages it understands (SQL, C++, C#, Visual Basic, JavaScript), it stores the source code with the comments retained as well as with the comments stripped out, so that you can search without getting false hits in comments. This full-text database can be a very powerful tool to find references in your system, both server-side and client-side, and this is a very important tool at the client where I developed this solution.

The other tool is DBBUILD which builds an empty database from a source control, which must be organised according to a structure mandated by AbaPerls. (VCDBLOAD does not have any such restrictions). When AbaPerls loads a stored procedure, it first extracts all temp-table definitions in the procedure and creates them prior to creating the procedure. This lures SQL Server to perform full binding of these queries. The main objective with the exercise is to get SQL Server to report errors in such queries early. But a side-effect is that sys.sql_dependencies is populated with complete information for all such queries, and thus becomes fully reliable except for dynamic SQL and cross-database and cross-server queries. Note that it is only in this older view where you get the complete information. sys.sql_expression_dependencies does not hold information on column level, and since sys.dm_sql_referenced_entities starts over, it has no benefit of the tricks that AbaPerls played when the procedure was created.

Conclusion

We have now looked at a number of ways to find references in a database. Is this the full story, or are there more options? There are a few more alternatives I can think of, but which I did not find worthwhile to explore further:

As always, I appreciate feedback from my readers on my articles. The feedback could be of any size from pointing out a major omission to pointing out a spelling or grammar error or two. And, of course, if you think that there is something that is unclear and you have any questions directly related to the article, you are more than welcome to contact me. The mail address is esquel@sommarskog.se. (On the other hand, if you have a more general SQL question, I recommend you to ask in a public SQL forum, as you may get help faster that way.)

Revisions

2020-05-24
The article has been reworked more or less entirely and given a new title. This was triggered by Paul McMillan who pointed me to a query against the plan cache to find references, so I felt obliged to add a chapter on this method. Then I adapted this query to work against Query Store for yet a new chapter. In the Conclusion I have added a brief discussion of yet some more possible methods, and I have reworked the original text to quite some extent. I have also added output for all queries in the article, to make it easier to read without an SQL Server instance next to you.
2019-04-24
2018-12-01
First version.

Back to my home page.