SQL Server : compare two tables with UNION and Select * plus additional label column

Question

I've been playing around with the sample on Jeff' Server blog to compare two tables to find the differences.

In my case the tables are a backup and the current data. I can get what I want with this SQL statement (simplified by removing most of the columns). I can then see the rows from each table that don't have an exact match and I can see from which table they come.

SELECT 
   MIN(TableName) as TableName
   ,[strCustomer]
   ,[strAddress1]
   ,[strCity]
   ,[strPostalCode]
FROM 
   (SELECT 
       'Old' as TableName
       ,[JAS001].[dbo].[AR_CustomerAddresses].[strCustomer]
       ,[JAS001].[dbo].[AR_CustomerAddresses].[strAddress1]
       ,[JAS001].[dbo].[AR_CustomerAddresses].[strCity]
       ,[JAS001].[dbo].[AR_CustomerAddresses].[strPostalCode]
    FROM  
       [JAS001].[dbo].[AR_CustomerAddresses]
    UNION ALL
    SELECT 
       'New' as TableName
       ,[JAS001new].[dbo].[AR_CustomerAddresses].[strCustomer]
       ,[JAS001new].[dbo].[AR_CustomerAddresses].[strAddress1]
       ,[JAS001new].[dbo].[AR_CustomerAddresses].[strCity]
       ,[JAS001new].[dbo].[AR_CustomerAddresses].[strPostalCode]
    FROM 
       [JAS001new].[dbo].[AR_CustomerAddresses]) tmp
  GROUP BY 
     [strCustomer]
     ,[strAddress1]
     ,[strCity]
     ,[strPostalCode]
  HAVING 
     COUNT(*) = 1

This Stack Overflow Answer gives me a much cleaner SQL query but does not tell me from which table the rows come.

    SELECT * FROM [JAS001new].[dbo].[AR_CustomerAddresses]
    UNION 
    SELECT * FROM [JAS001].[dbo].[AR_CustomerAddresses]
    EXCEPT 
    SELECT * FROM [JAS001new].[dbo].[AR_CustomerAddresses]
    INTERSECT
    SELECT * FROM [JAS001].[dbo].[AR_CustomerAddresses]

I could use the first version but I have many tables that I need to compare and I think that there has to be an easy way to add the source table column to the second query. I've tried several things and googled to no avail. I suspect that maybe I'm just not searching for the correct thing since I'm sure it's been answered before.

Maybe I'm going down the wrong trail and there is a better way to compare the databases?

score 3 · Accepted Answer · answered Aug 27 '14 at 17:17

Could you use the following setup to accomplish your goal?

SELECT 'New not in Old' Descriptor, *
FROM 
  ( 
    SELECT * FROM [JAS001new].[dbo].[AR_CustomerAddresses]
    EXCEPT
    SELECT * FROM [JAS001].[dbo].[AR_CustomerAddresses]
  ) a

UNION 

SELECT 'Old not in New' Descriptor, *
FROM 
  (
    SELECT * FROM [JAS001].[dbo].[AR_CustomerAddresses]
    EXCEPT
    SELECT * FROM [JAS001new].[dbo].[AR_CustomerAddresses]
  ) b

score 0 · Answer 2 · answered Aug 27 '14 at 17:06

You can't add the table name there because union, except, and intersection all compare all columns. This means you can't differentiate between them by adding the table name to the query. A group by gives you control over what columns are considered in finding duplicates so you can exclude the table name.

To help you with the large number of tables you need to compare you could write a sql query off the metadata tables that hold table names and columns and generate the sql commands dynamically off those values.

score 0 · Answer 3 · edited Aug 27 '14 at 17:23

Derive one column using table names like below

    SELECT MIN(TableName) as TableName
              ,[strCustomer]
              ,[strAddress1]
              ,[strCity]
              ,[strPostalCode]
          ,table_name_came
          FROM 
          (SELECT 'Old' as TableName
              ,[JAS001].[dbo].[AR_CustomerAddresses].[strCustomer]
              ,[JAS001].[dbo].[AR_CustomerAddresses].[strAddress1]
              ,[JAS001].[dbo].[AR_CustomerAddresses].[strCity]
              ,[JAS001].[dbo].[AR_CustomerAddresses].[strPostalCode]
              ,'[JAS001].[dbo].[AR_CustomerAddresses]' as table_name_came
          FROM [JAS001].[dbo].[AR_CustomerAddresses]
          UNION ALL
          SELECT 'New' as TableName
              ,[JAS001new].[dbo].[AR_CustomerAddresses].[strCustomer]
              ,[JAS001new].[dbo].[AR_CustomerAddresses].[strAddress1]
              ,[JAS001new].[dbo].[AR_CustomerAddresses].[strCity]
              ,[JAS001new].[dbo].[AR_CustomerAddresses].[strPostalCode]
          ,'[JAS001new].[dbo].[AR_CustomerAddresses]' as table_name_came
          FROM [JAS001new].[dbo].[AR_CustomerAddresses]
          ) tmp
          GROUP BY [strCustomer]
              ,[strAddress1]
              ,[strCity]
              ,[strPostalCode]
        ,table_name_came
            HAVING COUNT(*) = 1

this doesn't address his question at all. What does adding the table name (incorrectly, as it's only in the first UNION) do that the New/Old doesn't? — AHiggins, Aug 27 '14 at 17:14
Suggested an edit; the `format as code` button is for formatting code, not sentences. Also put quotes around your table name column in the second UNION... — AHiggins, Aug 27 '14 at 17:20
Right, but he already has a satisfactory table name method: Old/New. His question isn't how to get the first query working, but instead how to make it simpler OR to alter the second query to include the table name. — AHiggins, Aug 27 '14 at 17:21

SQL Server : compare two tables with UNION and Select * plus additional label column

3 Answers3