DBA Repository – Find the Databases, Part 2

In a previous entry I demonstrated how to find all the databases on all the SQL Server instances on the network, update the repository with some properties of the databases and notify the DBA team when a new database is found.  The last thing I need to do is delete all the DBs in the DBA Repository that are no longer in a SQL Server instance.

The last step of this SSIS package, Remove Deleted DBs is what this post is all about.

Remove Deleted DBs Data Flow

Remove Deleted DBs contains all the transformations we need to determine which databases were deleted from a SQL instance but still have a record in the DBA Repository.  Logically, it is a simple process to find this information: outer join the repository database list with the source server list, when we don’t find a match in the source server set we delete those records in the DBA Repository set.

In order to complete this in SSIS I query two the source server and the DBA Repository, sort the data sets by database name, use a Merge Join task to find our mis-matches, use a conditional split to handle only the deleted DBs then delete the records from DBA Repository that are no longer needed.  Here’s a screen shot of those steps and a little detail of some interesting highlights from those steps.

DBs in DBARepository OLE DB Source

This is an OLE DB source that uses a parameterized query to get all the databases on record in the DBA Repository for the server currently in the loop, the parameter is the User::SQLServer variable we use in the ForEach server loop:

SELECT  d.ServerName
      , d.DatabaseName
FROM    dbo.Databases AS d
WHERE   ServerName = ?

All Current DBs on the Server OLEDB Source

This OLEDB Source runs a query to find all the databases on the current server in the loop.  This OLEDB Source’s the Connection Manager ServerName property is updated with the same User:SQLServer variable from the ForEach server loop.  Here’s the query for this OLEDB Source:

DECLARE @ServerVersion varchar(18)
SET @ServerVersion = CAST(SERVERPROPERTY('ProductVersion') AS varchar(20))
SET @ServerVersion = LEFT(@ServerVersion, CHARINDEX('.', @ServerVersion) - 1)
 
IF CAST(@ServerVersion AS tinyint) <= 8
    SELECT  ServerName = @@SERVERNAME
          , DatabaseName = d.[name]
          , CompatibilityLevel = d.cmptlevel
          , lf.LastFullBackup
          , ll.LastTransactionLogBackup
          , ld.LastDifferentialBackup
    FROM    [master].dbo.sysdatabases AS d
            LEFT JOIN (SELECT   DatabaseName = b.database_name
                              , LastFullBackup = MAX(b.backup_finish_date)
                       FROM     msdb.dbo.backupset AS b
                       WHERE    b.[type] = 'D'
                       GROUP BY b.database_name) AS lf ON d.[name] = lf.DatabaseName
            LEFT JOIN (SELECT   DatabaseName = b.database_name
                              , LastTransactionLogBackup = MAX(b.backup_finish_date)
                       FROM     msdb.dbo.backupset AS b
                       WHERE    b.[type] = 'L'
                       GROUP BY b.database_name) AS ll ON d.[name] = ll.DatabaseName
            LEFT JOIN (SELECT   DatabaseName = b.database_name
                              , LastDifferentialBackup = MAX(b.backup_finish_date)
                       FROM     msdb.dbo.backupset AS b
                       WHERE    b.[type] = 'I'
                       GROUP BY b.database_name) AS ld ON d.[name] = ld.DatabaseName
IF CAST(@ServerVersion AS tinyint) > 8
    SELECT  ServerName = @@SERVERNAME
          , DatabaseName = d.[name]
          , CompatibilityLevel = d.[compatibility_level]
          , lf.LastFullBackup
          , ll.LastTransactionLogBackup
          , ld.LastDifferentialBackup
    FROM    [master].sys.databases AS d
            LEFT JOIN (SELECT   DatabaseName = b.database_name
                              , LastFullBackup = MAX(b.backup_finish_date)
                       FROM     msdb.dbo.backupset AS b
                       WHERE    b.[type] = 'D'
                       GROUP BY b.database_name) AS lf ON d.[name] = lf.DatabaseName
            LEFT JOIN (SELECT   DatabaseName = b.database_name
                              , LastTransactionLogBackup = MAX(b.backup_finish_date)
                       FROM     msdb.dbo.backupset AS b
                       WHERE    b.[type] = 'L'
                       GROUP BY b.database_name) AS ll ON d.[name] = ll.DatabaseName
            LEFT JOIN (SELECT   DatabaseName = b.database_name
                              , LastDifferentialBackup = MAX(b.backup_finish_date)
                       FROM     msdb.dbo.backupset AS b
                       WHERE    b.[type] = 'I'
                       GROUP BY b.database_name) AS ld ON d.[name] = ld.DatabaseName

Sort Transforms

The next steps in the Remove Deleted DBs flow simply sorts each OLEDB dataset by database name because the subsequent Merge Join requires a sorted set to function.

Merge the Repository with the Current Server Database List Merge Join Transform

A merge join task allows two data sets to merge side-by-side based on a common key field.  In this Merge the Repository join I perform a left outer join meaning I’m keeping all the records on the left side of the join and non-matching records on the right side will be NULL.  Since I’ve configured the left side to be the DBA Repository records and the right to be the Source Server, I know all mis-matches will result in a NULL for DBs that were deleted at the source server but still have a record in the DBA Repository.  Here’s the configuration for the Merge Join Transform:

Split Out the Deleted DBs Conditional Split

Once the two data sets are merged we can use a Conditional Split transform to find any DBs that have a null record in the source server dataset and output them to be handled by an OLE DB Command transform.  A Conditional Split transform allows you to run an expression on the data and split any matching rows in the data set to a separate output.  The transform also has a default output so all records that don’t match a condition have are sent to the default output.  I only need to find the mis-matched database names so I check for null on the source server side of the merge and let the matching records go to the default output:

Perform Delete of Repository Database Records OLE DB Command

Last step, we only need to issue a delete statement against the DBA Repository for the DBs we’ve found to delete.  Here’s the command we’ll issue:

DELETE dbo.Databases
WHERE ServerName = ?
AND DatabaseName = ?

And here’s a screen shot of the parameter mapping configuration:

Summary

All we had to do to automatically remove deleted databases from the DBA Repository was gather two datasets; first, what we had in the repository and, second, what is currently on a server.  Then, merge the two sets in a left outer join to find those DBs that exist in the repository but not at the server.  Next, split only the missing databases into a separate record set.  Finally, delete each database record in the repository in so it matches what exists on the server.

One Response to DBA Repository – Find the Databases, Part 2

  1. Abu Dina says:

    David, this is great stuff! I’m building this from scratch and it’s proving to be a very useful exercise!

    Thanks very much for all your efforts with this project.

Leave a Reply

Your email address will not be published. Required fields are marked *