A Complex Database Project in Visual Studio SSDT

This article describes one approach to creating a complex database project in Visual Studio with SQL Server Data Tools (SSDT). The group of databases involved had defied more than one previous attempt at being converted into working SSDT database projects, by senior DBAs and developers proficient in their craft.


We have a web product that is serviced by databases with a layout that looks something like this (databases renamed for illustration/security):

The system was probably not designed this way from the start. It evolved into this complex structure over more than a decade of conversions from one line of business to another and provisioning/appropriating of modules from other products to fulfill a functional need in the current product. Many of these components could probably be defined as legacy. The main database DB1 looks up (red arrows) objects in other databases via stored procedures, functions and views. Most of those databases also have similar references (blue arrows) into other databases, including back into DB1. Thus, there are numerous cross dependencies, ranging from dozens to hundreds of references from any given database into other databases.

Previous Attempts

For any database project to build and compile successfully in SSDT, all its references must exist and must be valid. For instance, if a stored procedure in DB1 looks up table dbo.TTT in DB2, the SSDT project must be able to resolve the table DB2.dbo.TTT. Without it, the build will fail with an error on the missing object. You could choose to suppress specific types of errors, but that is bad practice: if a genuine error of that type occurs, the project will not alert you to it. Previous attempts at creating a deployable database project for our product revolved around importing all these databases into SSDT. This never succeeded due to the sheer size and scope of the task. There were too many databases, references and dependencies. Visual Studio could never finish the import; it would churn for hours and eventually freeze or crash. I suspect that even if it had managed to import all the objects, development, build and deployment on a project of that size would have been too slow and cumbersome to be of practical use.
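As a hypothetical illustration (the procedure and table names here are made up for this example), a DB1 procedure like the following will not build unless the project can resolve the three-part name, typically failing with an unresolved-reference error such as SQL71502:

```sql
-- Hypothetical procedure in the DB1 project. The three-part reference
-- DB2.dbo.TTT must be resolvable at build time; otherwise SSDT reports
-- an unresolved reference and the build fails.
create procedure dbo.GetTTT
as
begin
    select t.*
    from DB2.dbo.TTT as t;
end
```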

Our established company standard for database development and deployment is to use SSDT projects. Until this point, we had been unable to use SSDT for this product, and had to resort to a combination of Red Gate tools (SQL/Schema Compare, Data Compare, and Source Control) instead. However, we really wanted to take advantage of SSDT because, among other benefits, Visual Studio is an environment our application developers are more comfortable with than SSMS or Red Gate. Also, automating product deployment for our teams was far easier with Visual Studio than with anything else. So how did we accomplish this?


Let's take a closer look at just two databases in our layout: DB1 and DB2. The relationship between them looks like this:

There are references from DB1 to DB2 (red), and references from DB2 to other databases (blue). The condition that all references must exist had so far pushed an SSDT project out of reach. However, I had an epiphany one day. What if I created a project only to deploy DB1? And what if I gave it only those objects from DB2 that it needed to resolve references, and nothing more? From the above image, DB1 only needs Table1, Table2, Table3, StoredProc1 and Function1 from DB2; it has no need for Table4, Table5, StoredProc2 and Function2. This would greatly reduce the number of objects and references in the project. I reasoned that the number of references would still be large enough to cause performance issues in the DB1 project, so I created a separate solution and project for DB2. This project would contain only DB2 objects, and only those objects needed by DB1, as follows:

Although the DB2 project was technically sound, in that it built and deployed without error, it only had table definitions without any table data. Also, any stored procedures and functions in DB2 were empty shells. There was no need for them to contain the real code, as long as the signature met the requirements of the reference from DB1 and was syntactically valid. So my DB2 objects looked like this:
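As a sketch (object names follow the illustration above; real definitions will differ), the stub objects in the DB2 project might look like this — full table definitions, but shell procedures and functions whose signatures match what DB1 expects:

```sql
-- Real table definition: column names and data types must match
-- what the DB1 code references.
create table dbo.Table1
(
    ID   int          not null,
    Name varchar(100) null
);
go

-- Shell stored procedure: correct name and parameters, dummy body.
create procedure dbo.StoredProc1
    @ID int
as
begin
    select 1;
end
go

-- Shell function: correct signature and return type, dummy body.
create function dbo.Function1 (@ID int)
returns int
as
begin
    return 0;
end
```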

Remember, this project is not intended to deploy DB2 from. It is only meant to be a reference resolution instrument for DB1. I built the DB2 project successfully, thus creating a dacpac file from it. Once I had this dacpac, I copied it at the OS file level into a folder in my main project, DB1. Then, I added a database reference (right-click "References") inside the DB1 project to DB2.dacpac. Alternatively, I could have added a database reference from DB1 directly into the DB2 project location, but I wanted to avoid that dependency. I wanted very loose coupling between DB1 and DB2. At this point, my DB1 project looked conceptually like this:

 Next, I repeated my process of building the DB2 dacpac file for each of the other databases, and similarly imported and referenced all those dacpacs from DB1. I ended up with a DB1 project that had the following layout:
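Once a dacpac reference is in place with the "different database, same server" option, SSDT exposes the referenced database through a SQLCMD variable (I am assuming the default variable name here, which matches the dacpac name), so code in DB1 refers to the stub objects like this:

```sql
-- Inside the DB1 project, the cross-database reference goes through
-- the SQLCMD variable created by the database reference ([$(DB2)]).
-- At deployment time the variable resolves to the actual database name.
create procedure dbo.LookupFromDB2
as
begin
    select t.ID, t.Name
    from [$(DB2)].dbo.Table1 as t;
end
```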

Note that the references are now only one-way, going from DB1 into other databases. There are no references from those other databases going anywhere else. The solution explorer looks like this:

And finally, the goal of this exercise, a DB1 project that successfully builds and allows deployment:

The majority of our database development is done on DB1, so this is really the only project we need for the time being. If the developers want to push changes to any of the other databases, I will just repeat the above process for that database and create a separate deployable project. Granted, this requires several projects instead of one, but it lets us return to company standards and use SSDT for this complex database layout.


T-SQL Throwdown at SQL Saturday Oklahoma City

This was the third public running of T-SQL Throwdown, this time at SQL Saturday Oklahoma City.

Third time's the charm, the saying goes. Not to imply that the first two runnings of this event at the SQL community level weren't successful. They were, as evidenced by the feedback forms. However, this time it went even better. Just like a resume is a living document that is never finished, this event is a living work in progress, and has benefited from some updates that were made for this go-around.

The biggest change was how the questions and answers were sent and received. Until the last running, the questions were emailed out to the participants, one question at a time. When teams felt they had the answer, they emailed the answer back. This method was heavily dependent on WiFi connectivity at the conferences. Also, it hid teams' answers from other teams, so the opportunity to learn by exposure was limited. For this running, I distributed all the questions on USB sticks at the beginning of the session, and asked the teams to come up to the main display screen and type up their answers. This achieved some desired and expected positives:
- It increased the instructional value of the event by showing the audience what a team's answer was
- It eliminated dependence on Internet connectivity
- It reduced some anxiety because the teams weren't left wondering if their emailed answers were received
- It made the competition more fun, because if a team attempted an answer and got it wrong, the next team attempting that answer could build on the previous team's entry

There was also the unexpected benefit that this format allowed more questions to be asked and answered. At previous runnings of the same duration (1 hour), we generally managed to fit in 4-5 questions. This session had 10 questions, and all were used.

The feedback was very positive. The feedback forms (below) indicate mostly 5s and some 4s. Another thing happened at SQL Saturday Oklahoma City for the first time: after the session was done and the room vacated, participants kept finding me and telling me that they loved the event or had a lot of fun or learned something, and would love to see it again. Very grateful for this feedback, and humbled by the positive response from #SQLFamily.

One area that I need to improve is the expert rental. In past events, teams have been reluctant to rent an expert when they were stuck. Whether this is due to the cost of the rental, or because teams are not comfortable admitting they need help, or some other reason, I'm not sure. This time, the experts were almost not rented at all. I like the idea of expert rental because it helps participants learn new things, and also helps competitive teams attempt a question they may not know the answer to, rather than giving up.

I will work on making expert rental easier/cheaper for the teams, and continue thinking up ways to improve the event. If you participated/observed, and have any suggestions, please feel free to email me.


T-SQL Throwdown at NTSSUG

The second public running of T-SQL Throwdown happened at the North Texas SQL Server User Group in Dallas on 2014/01/16.

One of the lessons I learned from the first running at SQL Saturday was that almost nobody reads the rules ahead of time. At that running, I had listed three things as required: a laptop, SQL Server installed on the laptop, and a copy of the AdventureWorks database. I came somewhat prepared for people who had not read the requirements: I brought several USB sticks with AdventureWorks loaded. However, it takes time to pass USB sticks around and for everyone to load the database and get up and running.

To reduce setup time for this running, I decided to ditch the requirement for the AdventureWorks database. All the questions were based on tables for which I provided the create and insert statements. It worked, and the event started a little bit more smoothly, despite there being more teams and participants. Still, there may be room for improvement, and I have feedback with some good ideas that I will play with for the next running.

There were no evaluation forms for the audience to fill this time. Thus, I do not have objective documentation of participants' experience. Anecdotal feedback based on hallway talk was generally positive: people had fun and networked, meeting at least two of my three objectives for the event. These were: learn some SQL, make a new friend, and have fun. I am hoping that the learning of SQL happened when team members discussed solutions to the questions presented.

Overall, more encouraging feedback leaves me with the desire to keep running this event and to improve its scalability. T-SQL Throwdown will be back...


T-SQL Throwdown at SQL Saturday Dallas (#255)

The first open-to-public running of my SQL game, T-SQL Throwdown, happened at SQL Saturday Dallas on 2013/11/02, at the University of Texas in Arlington. Thanks to the NTSSUG Board, and especially Ryan, Sri and Ganesh for making it possible for me to run the event at SQL Saturday.

This being my first presentation at the community level, I was a little afraid that nobody would show up. I had run it at my workplace a couple of times and it was generally liked, but at those times I had access to the audience to explain beforehand what this was about. But now, I could not predict how people would take to this idea based on only a small paragraph in the schedule. I was hoping for the best.

Turns out people did show up. My team and I had to explain the game a few times, and a few people even left the room after hearing us because they weren't interested in playing a SQL game. But fortunately, most people stayed and participated, and I dare say even enjoyed the experience. I owe a huge debt of gratitude to my team of judges/experts/technical support who helped me run the event and improve it well beyond my original idea: Gabe F, Ed S, Patrick B, Adam P, John G, John S, Mark R, Kathy K.

The winning team (Brad, Ranjith, Marc) received a $100 gift card, runners up (Gonzalo, Ryan, Sal) got $50, sponsored by Think Finance. I received some very nice feedback via the session evals:

A couple of user groups even expressed interest in running the event at their gatherings. Looking forward to more throwdown fun.


Find Stored Procedures and Functions That Reference Certain Tables

During the data discovery process on a database that I have inherited from another designer, I often need to determine all the stored procedures that interact with a certain table. For instance, I may need to look up all the stored procedures that insert records into a certain table, or all the stored procedures that read a certain view. Here is a code snippet that lets me plug in the name of the table/view and returns all the code objects (stored procedures, functions, triggers, etc.) that interact with that table/view. It returns the object type, the name of the object, and the code within that object, broken into more than one Code Sequence row if the code is lengthy enough (longer than 4000 characters). You could, of course, also plug in the name of a code object to return other code objects that look up your searched object, for instance to find all the stored procedures that use a certain function.

Just replace the "search_term" in the code snippet with the name of the table/view/stored procedure/function etc that you want to search on.

select distinct
ObjectType = o.type_desc
, ObjectName = o.name
, CodeSequence = c_display.colid
, Code = c_display.[text]
from sys.objects o
inner join sys.syscomments c_search
on c_search.id = o.[object_id]
inner join sys.syscomments c_display
on c_display.id = o.[object_id]
where o.type_desc not in ('INTERNAL_TABLE','SERVICE_QUEUE','SYSTEM_TABLE')
and c_search.[text] like '%search_term%'
order by 1, 2, 3
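On SQL Server 2005 and later, a leaner alternative (a sketch, not part of the original snippet) is to query sys.sql_modules, which stores each object's definition in a single nvarchar(max) column, so there is no 4000-character splitting into multiple rows:

```sql
-- Same search, using sys.sql_modules: one row per object,
-- with the full definition in a single column.
select  ObjectType = o.type_desc
     ,  ObjectName = o.name
     ,  Code       = m.[definition]
from    sys.objects o
inner join sys.sql_modules m
        on m.[object_id] = o.[object_id]
where   m.[definition] like '%search_term%'
order by 1, 2
```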


Delete All Database User Accounts for a Given Server Login

Ever had the need to remove all database-level user accounts from SQL Server for a given server-level login? Say an employee leaves your organization, and you need to remove her account from all databases. Do you do this manually? Do you use the GUI in SSMS? Do you script it out? What if the database-level accounts do not have the same name as the login? What if the account name is different in every database? What if there are hundreds of databases on your server, making this a tedious exercise?

Fear not, here is a script that you can run on SQL Server that will drop all database-level user accounts for a specified login. Thanks to Jason Strate for this article which inspired me to create this script. Just set the variable @LoginName to the login for which you want accounts deleted, and execute.

/*
        2013/07/15 hakim.ali@SQLzen.com

        SQL Server 2005 and higher.

        Script to delete (drop) all database level user accounts for a given server login from all
        databases on a database server. This will drop users even if they have a different name from
        the server login, so long as the two are associated. This will not drop users from a database
        where there are schemas/roles owned by the user.
*/
use [master]

-- Set the login name here for which you want to delete all database user accounts.
declare @LoginName nvarchar(200); set @LoginName = 'LOGIN_NAME_HERE'

declare @counter int
declare @sql nvarchar(1000)
declare @dbname nvarchar(200)

-- To allow for repeated running of this script in one session (for separate logins).
begin try drop table #DBUsers end try begin catch end catch

-- Temp table to hold database user names for the login.
create table #DBUsers
(                       ID          int identity(1,1)
                ,       LoginName   varchar(200)
                ,       DB          varchar(200)
                ,       UserName    varchar(200)
                ,       Deleted     bit
)

-- Add all user databases.
insert into #DBUsers
(                       LoginName
                ,       DB
                ,       Deleted
)
select                  @LoginName
                ,       name
                ,       1
from            sys.databases
where           name not in ('master','tempdb','model','msdb')
and             is_read_only = 0
and             [state] = 0 -- online
order by        name

-- Add database level users (if they exist) for the login.
set @counter = (select min(ID) from #DBUsers)

while exists (select 1 from #DBUsers where ID >= @counter)
begin
        set @dbname = (select DB from #DBUsers where ID = @counter)
        set @sql = '
        update          temp
        set             temp.UserName = users.name
        from            sys.server_principals                      as logins
        inner join      [' + @dbname + '].sys.database_principals  as users
                        on users.sid = logins.sid
                        and logins.name = ''' + @LoginName + '''
        inner join      #DBUsers                                   as temp
                                on temp.DB = ''' + @dbname + ''''

        exec sp_executesql @sql
        set @counter = @counter + 1
end

-- Don't need databases where a login-corresponding user was not found.
delete          #DBUsers
where           UserName is null

-- Now drop the users.
set @counter = (select min(ID) from #DBUsers)

while exists (select 1 from #DBUsers where ID >= @counter)
begin
        select  @sql = 'use [' + DB + ']; drop user [' + UserName + ']'
        from    #DBUsers
        where   ID = @counter

        --select @sql
        begin try exec sp_executesql @sql end try begin catch end catch
        set @counter = @counter + 1
end

-- Report on which users were/were not dropped.
set @counter = (select min(ID) from #DBUsers)

while exists (select 1 from #DBUsers where ID >= @counter)
begin
        set @dbname = (select DB from #DBUsers where ID = @counter)
        set @sql = '
        update          temp
        set             temp.Deleted = 0
        from            sys.server_principals                       as logins
        inner join      [' + @dbname + '].sys.database_principals   as users
                        on users.sid = logins.sid
                        and logins.name = ''' + @LoginName + '''
        inner join      #DBUsers                                    as temp
                                on temp.DB = ''' + @dbname + ''''

        exec sp_executesql @sql
        set @counter = @counter + 1
end

-- This shows the users that were/were not dropped, and the database they belong to.
if exists (select 1 from #DBUsers)
begin
        select                  LoginName
                        ,       [Database]     = DB
                        ,       UserName       = UserName
                        ,       Deleted        = case Deleted when 1 then 'Yes' else 'No !!!!!!' end
        from            #DBUsers
        order by        DB
end
else
        select [No Users Found] = 'No database-level users found on any database for the login "' + @LoginName + '".'

/*
Not automatically dropping the login. If there are database level users that were not dropped,
dropping the login will create orphaned users. Enable at your discretion.
set @sql = 'drop login [' + @LoginName + ']'
exec sp_executesql @sql
*/


Tables and Clustered Indexes Part 2

A little while ago I wrote a piece on why a table does not so much have a clustered index as a table is a clustered index*. Here's a follow up with some SQL code that further strengthens that assertion, at least the way it is implemented in SQL Server.

Let's start by creating a very simple table and inserting one record into it:

create table dbo.test (id int)

insert into dbo.test (id) select 1

Note that this table is currently a heap; it does not have any indexes of any type. Verify that the table has one record:

select * from dbo.test

Here is the execution plan for this select:

As expected, we have a table scan on the heap. Now let's add a non-clustered index to this table on the id column: 

create nonclustered index NCIndex_test on dbo.test (id)

Run the select statement again, and you will see the same result again, as expected. The execution plan, of course, is different. Now the query uses the non-clustered index instead of the table scan (I won't get into seeks vs scans here, different conversation).

Now let's try disabling (not dropping) the non-clustered index:

alter index NCIndex_test on dbo.test disable

Once again, run the select statement. We now see the same result and execution plan as in our first select statement above. The index is ignored, and the table is treated like a heap, resulting in a table scan. Again, so far all of this is as expected.
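You can confirm in the catalog that the index is disabled rather than dropped (is_disabled is a column in sys.indexes):

```sql
-- One row per index on the table, plus the heap entry (type_desc = HEAP).
-- is_disabled = 1 shows the non-clustered index still exists but is disabled.
select  i.name, i.type_desc, i.is_disabled
from    sys.indexes as i
where   i.[object_id] = object_id('dbo.test');
```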

Now, let's repeat the above exercise with a clustered index. To keep everything else the same between the two exercises, I will drop and recreate the table.

drop table dbo.test

create table dbo.test (id int)

insert into dbo.test (id) select 1

Now, running our select will again show us the table scan:

select * from dbo.test

Let us now create the clustered index:

create clustered index CIndex_test on dbo.test (id)

Running our select statement again, here's what our execution plan looks like with the clustered index:

select * from dbo.test

Now let's disable (not drop) the clustered index:

alter index CIndex_test on dbo.test disable

Now if you run the select statement, what do you expect to see? I would have expected to see the table treated as a heap, and a table scan in the execution plan. It seems reasonable to assume that clustered table minus clustered index equals heap, right?

select * from dbo.test

Apparently not so. The query fails with an error instead of returning results. Disabling the non-clustered index causes the query to treat the table like a heap, but disabling the clustered index does not have the same effect: disabling the clustered index disables the table itself.
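To make the data accessible again without dropping the index, rebuild it; a disabled clustered index blocks DML on the table until it is rebuilt or dropped:

```sql
-- Rebuilding re-enables the clustered index and makes the table's
-- data available to queries and DML again.
alter index CIndex_test on dbo.test rebuild;
```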

Thus, when you add a clustered index to a table, the table becomes the clustered index*. A table does not so much have a clustered index as a table is a clustered index.

* There is much more to a clustered index, including nodes, leaves, a root, a b-tree structure etc. This is a simplification to help illustrate the point. As an aside, if you drop the clustered index (drop index CIndex_test on dbo.test), the table goes back to being a heap and returns results from the select statement. Here is what Books Online/MSDN has to say on the subject: "Disabling an index prevents user access to the index, and for clustered indexes to the underlying table data... Disabling a clustered index on a table prevents access to the data; the data still remains in the table, but is unavailable for data manipulation language (DML) operations until the index is dropped or rebuilt."